OCR project in GSoC

Hi all,

I am planning to do an OCR (Optical Character Recognition) project for GSoC.

The idea is this: when a full page of text is scanned from a book, it should open in OpenOffice as an editable word-processor document, so that the scanned page can be edited. I plan to convert scanned letters to words for Tamil and English, and I will try to support a few more languages as well. I think this OCR project can be done using RMagick, and I am confident I can complete it.

That is my idea. I would appreciate suggestions and guidance from any of you on how to do this.

Thanks,

Arulalan.

There are many ways to accomplish this, and none of them is easy...

There's ai4r's backpropagation neural network implementation, with a simple OCR example at http://ai4r.rubyforge.org/neuralNetworks.html

There's also GNU Ocrad, which I've never used, and I just found http://gtamilocr.sourceforge.net/, which does OCR for Tamil characters as well.

I'd be glad to hear other suggestions...

Hi,

What about Google's Tesseract?

http://code.google.com/p/tesseract-ocr/
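
A minimal sketch of how Tesseract might be driven from Ruby, assuming the `tesseract` command-line tool and its English (`eng`) and Tamil (`tam`) language data are installed. The helper names `tesseract_command` and `ocr_page` are made up for illustration and are not part of any library; this only shells out to the CLI and is not a complete solution for the OpenOffice workflow described above.

```ruby
require "open3"

# Hypothetical helper: build the argument list for the tesseract CLI.
# Passing "stdout" as the output base makes tesseract print the
# recognized text instead of writing it to a file.
def tesseract_command(image_path, languages: ["eng"])
  ["tesseract", image_path, "stdout", "-l", languages.join("+")]
end

# Hypothetical helper: run tesseract on one scanned page and return the text.
# Assumes the tesseract binary is on the PATH.
def ocr_page(image_path, languages: ["eng"])
  text, err, status = Open3.capture3(*tesseract_command(image_path, languages: languages))
  raise "tesseract failed: #{err}" unless status.success?
  text
end
```

With a scanned page saved as, say, `page.png`, calling `ocr_page("page.png", languages: %w[eng tam])` would return the recognized text as a string, which could then be written to a file and opened in a word processor for editing.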

Harold wrote:

I'm not a developer; I always use this free online-code.net OCR service.