I am working on a project where I need to upload documents to a website
and scan their contents. I found this excellent library, Docsplit:
http://documentcloud.github.com/docsplit/
The problem is that I am using Heroku. Since Docsplit is a Ruby wrapper
around other tools, I need to install many tools on the server for it to work.
In short: can I get 100% of Docsplit's functionality on Heroku?
Highly unlikely. Heroku's filesystem is read-only, so any files you upload have to be stored elsewhere, such as S3. Any process that scans the files needs to read them as a stream, and for performance that's best done on the same hardware or close to it (EC2, for example). I'm also fairly certain that you can't install arbitrary binaries on Heroku; they have it locked down pretty tightly.
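
If you want to see how far you would get on a given host, one rough approach is to check which of the command-line tools Docsplit shells out to are actually on the PATH. The snippet below is only a sketch: the executable names (`gm`, `pdftotext`, `gs`, `tesseract`, `pdftk`) are my assumption based on Docsplit's install notes and may differ on your platform, and `tool_available?` and `missing_tools` are hypothetical helpers, not part of Docsplit.

```ruby
# Executable names are assumptions taken from Docsplit's install notes;
# adjust the list for your platform.
DOCSPLIT_TOOLS = %w[gm pdftotext gs tesseract pdftk].freeze

# Directories listed in the PATH environment variable.
def path_dirs
  ENV.fetch("PATH", "").split(File::PATH_SEPARATOR)
end

# True if an executable file called `name` exists in one of `dirs`.
def tool_available?(name, dirs = path_dirs)
  dirs.any? do |dir|
    candidate = File.join(dir, name)
    File.file?(candidate) && File.executable?(candidate)
  end
end

# Hypothetical helper: the subset of `tools` that cannot be found in `dirs`.
def missing_tools(tools = DOCSPLIT_TOOLS, dirs = path_dirs)
  tools.reject { |tool| tool_available?(tool, dirs) }
end

if __FILE__ == $PROGRAM_NAME
  missing = missing_tools
  if missing.empty?
    puts "All listed tools were found on the PATH."
  else
    puts "Not found on the PATH: #{missing.join(', ')}"
  end
end
```

Running this in a console on the target host tells you which pieces are absent there. It only checks that the binaries exist, not that their versions work with Docsplit.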