OCR project in GSoC

Arulalan · March 24, 2009, 3:36pm

Hi to all,

I am planning to do a GSoC project on OCR (Optical Character Recognition).

That is,

If we scan a full page of text from a book, it should open in OpenOffice as an editable word-processing document, so that we can edit the text of the scanned page. I plan to convert scanned letters to words for Tamil and English, and I will try to support a few more languages as well. This OCR project can be done using RMagick, and I am confident I can complete it.

This is my idea. If any of you have suggestions or can guide me on this, please let me know.

Thanks,

Arulalan.

Harold · March 24, 2009, 7:20pm

There are many ways to accomplish this, but none of them is easy...

There's ai4r's backpropagation neural network implementation, with a simple OCR example at http://ai4r.rubyforge.org/neuralNetworks.html
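
To make the character-classification step a bit more concrete, here is a minimal sketch in plain Ruby. It is not ai4r's API and it is not a neural network: it swaps in nearest-template matching (Hamming distance between bitmaps), which is much simpler but has the same shape as the problem, pixel grid in and character label out. The 5x5 bitmaps and the names TEMPLATES, pixel_distance and classify are made up for illustration.

```ruby
# Hypothetical 5x5 bitmaps: '#' is ink, '.' is background.
TEMPLATES = {
  "I" => %w[.###. ..#.. ..#.. ..#.. .###.],
  "L" => %w[#.... #.... #.... #.... #####],
  "T" => %w[##### ..#.. ..#.. ..#.. ..#..]
}.freeze

# Count the pixels where two equally sized bitmaps differ (Hamming distance).
def pixel_distance(a, b)
  a.join.chars.zip(b.join.chars).count { |x, y| x != y }
end

# Return the label of the template closest to the given glyph.
def classify(glyph)
  TEMPLATES.min_by { |_label, bitmap| pixel_distance(glyph, bitmap) }.first
end

# A "T" with one stray pixel of scanner noise in the third row.
noisy_t = %w[##### ..#.. ..#.# ..#.. ..#..]
puts classify(noisy_t) # prints T
```

A real scan would first need to be binarized and cut into individual glyphs (something RMagick could probably help with), and a trained network should tolerate noise and font variation far better than fixed templates do.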

There's also GNU Ocrad (from the GNU Project / Free Software Foundation), which I've never used, and I just found http://gtamilocr.sourceforge.net/, which does OCR for Tamil characters as well.

I'd be glad to hear other suggestions...

Juan_Jose_Vidal · March 24, 2009, 9:52pm

Hi,

What about Google Tesseract???

http://code.google.com/p/tesseract-ocr/
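
If you want to drive it from Ruby, a rough sketch could just shell out to the tesseract command-line tool. This assumes the tesseract binary is on your PATH, that the Tamil ("tam") and English ("eng") language data are installed, and that your version accepts several languages joined with "+". The helper names and file names below are hypothetical.

```ruby
# Build the argument list for the tesseract command-line tool.
# General form: tesseract <image> <output base> -l <languages>
def tesseract_command(image_path, output_base, languages)
  ["tesseract", image_path, output_base, "-l", languages.join("+")]
end

# Run OCR. Returns true if tesseract exits successfully, false if it
# reports an error, and nil if the binary could not be started.
# The recognized text is written to <output_base>.txt.
def run_ocr(image_path, output_base, languages = %w[eng])
  system(*tesseract_command(image_path, output_base, languages))
end

# Hypothetical file names, for illustration only.
p tesseract_command("page.png", "page", %w[tam eng])
```

Calling run_ocr("page.png", "page", %w[tam eng]) should then leave the recognized text in page.txt, which can be opened and edited in OpenOffice.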

11155 · September 8, 2015, 2:27am

I'm not a developer; I always use the free OCR service at online-code.net.