gbpgmdiff

This is a small program written in PHP which reads a PNG file and calculates the average red, green and blue values across the whole image. It then goes through every pixel, calculates how different that pixel is from the average, and generates a greyscale PNG file.
There is also a wrapper script which accepts a pbm image on standard input and generates a pgm image on standard output. The wrapper also runs pamthreshold to produce a black and white image.
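The averaging and per-pixel comparison described above can be sketched roughly as follows. This is an illustrative Python sketch, not the PHP code in the archive. The page does not say how "different" is measured, so the Euclidean distance in RGB space, the scaling to 0-255, and the choice to make the background white and differing pixels darker are all assumptions. The function names are hypothetical, and the image is represented as a non-empty list of rows of (r, g, b) tuples rather than a real PNG.

```python
import math

# Largest possible distance between two RGB colours (black to white).
MAX_DISTANCE = math.sqrt(3 * 255 ** 2)


def average_colour(pixels):
    """Return the mean (r, g, b) over a list of rows of (r, g, b) tuples."""
    count = sum(len(row) for row in pixels)
    totals = [0, 0, 0]
    for row in pixels:
        for r, g, b in row:
            totals[0] += r
            totals[1] += g
            totals[2] += b
    return tuple(total / count for total in totals)


def difference_map(pixels):
    """Return rows of grey values (0-255), one per input pixel.

    Assumption: pixels equal to the average colour map to 255 (white)
    and pixels far from the average map towards 0 (black).
    """
    avg = average_colour(pixels)
    grey = []
    for row in pixels:
        grey_row = []
        for pixel in row:
            distance = math.dist(pixel, avg)  # Euclidean distance (assumed metric)
            grey_row.append(255 - round(255 * distance / MAX_DISTANCE))
        grey.append(grey_row)
    return grey
```

With a mostly uniform background the average colour sits close to the background colour, so background pixels come out near white and text of any colour comes out darker.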
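To illustrate what the final black and white step does, here is a minimal Python sketch of simple global thresholding. It is not pamthreshold itself and not the wrapper script. The function name, the use of 255 for white and 0 for black, and the rule that a pixel at or above the cutoff becomes white are assumptions made for illustration.

```python
def simple_threshold(grey, fraction, maxval=255):
    """Turn rows of grey values into black (0) and white (maxval).

    `fraction` is the threshold expressed as a fraction of maxval,
    for example 0.95. Assumption: values at or above the cutoff
    become white, everything else becomes black.
    """
    cutoff = fraction * maxval
    return [[maxval if value >= cutoff else 0 for value in row] for row in grey]
```

For example, `simple_threshold([[255, 200, 243]], 0.95)` uses a cutoff of 242.25, so only the 200 pixel becomes black. Lowering the fraction lets more of the darker pixels through as white.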

The purpose of this program is to enhance the readability of images which tend to have a constant background colour but where the text may be multicoloured. OCR applications often have trouble where the text can be lighter or darker than the background, and this program helps with such images.

The program is intended to be usable in a scanset for FuzzyOcr. The archive includes a couple of images and a shell script showing how to call the wrapper.

Note that this software is still in testing. Version 0.2 of the wrapper now takes the threshold as a parameter. A threshold of 0.95 worked well for the first two images, but for the third example the threshold needed to be lowered to 0.85. Unfortunately, even though the converted pictures look much clearer, they still don't appear to be easily recognisable by the OCR software.
Download Version 0.2 alpha

Here are a couple of examples of it in operation.

Source spam image -

After conversion -

 

Source spam image -

After conversion -

 

Source spam image -

After conversion -