OCR project in GSoC

Hi to all,

I am planning to do a GSoC project on OCR (Optical Character
Recognition).

That is:

If we scan a full page of text from a book, it will open in
OpenOffice as an editable word-processor document, so that we can edit
the text from the scanned page. I plan to convert scanned letters to
words for Tamil and English, and I will try to support a few more
languages as well. The project can be done using RMagick, and I am
confident I can complete it successfully.

This is my idea. I would appreciate it if any of you could offer
suggestions and guide me on this.



There are many ways to accomplish this, but none of them is easy.

There's ai4r's backpropagation neural network implementation, with a
simple OCR example at http://ai4r.rubyforge.org/neuralNetworks.html
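
To give a feel for what "recognizing a character" involves, here is a
toy sketch in plain Ruby. It is not ai4r's API and it is not a neural
net: it swaps in a much simpler technique, nearest-template matching by
Hamming distance, on hand-drawn 5x5 bitmaps. The names TEMPLATES,
pixels, distance, and recognize are made up for this example, and a real
OCR system would also need page segmentation, noise handling, and a far
richer model.

```ruby
# Toy character recognizer: nearest-template matching on 5x5 bitmaps.
# All glyph shapes and method names are hypothetical, for illustration only.

TEMPLATES = {
  "T" => ['#####',
          '..#..',
          '..#..',
          '..#..',
          '..#..'],
  "L" => ['#....',
          '#....',
          '#....',
          '#....',
          '#####'],
  "O" => ['.###.',
          '#...#',
          '#...#',
          '#...#',
          '.###.']
}.freeze

# Flatten a glyph (array of row strings) into an array of 0/1 pixels.
def pixels(glyph)
  glyph.join.chars.map { |c| c == '#' ? 1 : 0 }
end

# Hamming distance: the number of pixel positions where two glyphs differ.
def distance(a, b)
  pixels(a).zip(pixels(b)).count { |x, y| x != y }
end

# Return the label of the template closest to the given glyph.
def recognize(glyph)
  TEMPLATES.min_by { |_label, template| distance(glyph, template) }.first
end

# A "scanned" T with one stray pixel, standing in for scanner noise.
noisy_t = ['#####',
           '..#..',
           '..#.#',
           '..#..',
           '..#..']

puts recognize(noisy_t)
```

A neural net such as the one in ai4r would replace the fixed templates
with weights learned from many labelled samples, which is what lets it
cope with different fonts and distortions.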

There's also GNU Ocrad, which I've never used: http://www.gnu.org/software/ocrad/,
and I just found http://gtamilocr.sourceforge.net/, which does OCR for
Tamil characters as well.

I'd be glad to hear other suggestions...


What about Google's Tesseract?


Harold wrote:

