Cleaning up noise around text
Posted: 2015-01-28T19:07:08-07:00
Hi!
I've been trying to clear the background of the kind of image that you can see below.



My current process is this: first I run a simple hand-made filter to remove some of the noise (it keeps only the black pixels that are surrounded by 8 other black pixels): https://github.com/vkruoso/receita-tool ... aFilter.py. After that I just run tesseract and hope the result is good.
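To make the filter step concrete, here is a minimal sketch of the idea. It is not the code from the repository linked above. It assumes the image has already been thresholded into a 2D list where 1 means black and 0 means white. The function name keep_surrounded_black is made up for this illustration, and treating border pixels as white is also an assumption, since they do not have 8 neighbours.

```python
def keep_surrounded_black(pixels):
    """Return a new grid where a pixel is black (1) only if it and
    all 8 of its neighbours are black in the input grid.

    `pixels` is assumed to be a rectangular list of rows of 0/1 values.
    Border pixels lack a full neighbourhood, so they are set to white (0).
    """
    height = len(pixels)
    width = len(pixels[0]) if height else 0
    result = [[0] * width for _ in range(height)]

    for y in range(1, height - 1):
        for x in range(1, width - 1):
            # Check the 3x3 window centred on (x, y), centre included.
            if all(
                pixels[y + dy][x + dx] == 1
                for dy in (-1, 0, 1)
                for dx in (-1, 0, 1)
            ):
                result[y][x] = 1
    return result


if __name__ == "__main__":
    # A solid 3x3 block survives only at its centre; an isolated dot disappears.
    sample = [
        [0, 0, 0, 0, 0, 0, 0],
        [0, 1, 1, 1, 0, 0, 0],
        [0, 1, 1, 1, 0, 1, 0],
        [0, 1, 1, 1, 0, 0, 0],
        [0, 0, 0, 0, 0, 0, 0],
    ]
    for row in keep_surrounded_black(sample):
        print("".join("#" if v else "." for v in row))
```

Note that a filter like this also thins the letters themselves, since any stroke narrower than 3 pixels is removed along with the noise.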
I'm providing a free web service that gets information from a government site to make that information easier to access (this really should be provided by the government). With this process I manage to decode the text successfully 25% of the time, but that's not good enough to provide a good service.
I have very little background in image processing, so I'm hoping someone here can give me some hints on how to approach this particular kind of image.
--
Thanks a lot.