Document Processing for Automatic Color Form Dropout
12-10-2010, 04:12 PM

This article is presented by:
Andreas E. Savakis
Chris R. Brow

Color dropout refers to the process of converting color form documents to black and white by removing the colors that are part of the blank form and keeping only the information entered in the form. In this paper, no prior knowledge of the form type is assumed. Color dropout is performed by associating darker non-dropout colors with the information that is entered in the form and needs to be preserved. The color dropout filter parameters include the color values of the non-dropout colors (e.g., black and blue), the distance metric (e.g., Euclidean), and the tolerances allowed around these colors. Color dropout is accomplished by converting pixels whose color lies within the tolerance sphere of a non-dropout color to black and all other pixels to white. This approach lends itself to high-speed hardware implementation with low memory requirements, such as on an FPGA platform. Processing may be performed in RGB or in a luminance-chrominance space, such as YC. The color space transformation from RGB to YC involves a matrix multiplication, and the dropout filter implementation is similar in both cases. Results for color dropout processing in both RGB and YC space are presented.
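The filter described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the non-dropout colors, the tolerance values, and the function names are all hypothetical, and the BT.601 full-range YCbCr matrix is used as one concrete example of a luminance-chrominance transform, since the text does not specify the coefficients.

```python
import math

# BT.601 full-range RGB -> YCbCr coefficients. This is an assumed choice;
# the source only says "a luminance-chrominance space".
RGB_TO_YCBCR = (
    (0.299, 0.587, 0.114),
    (-0.168736, -0.331264, 0.5),
    (0.5, -0.418688, -0.081312),
)
YCBCR_OFFSET = (0.0, 128.0, 128.0)


def rgb_to_ycbcr(rgb):
    """Apply the 3x3 matrix to one RGB triple and add the chroma offsets."""
    return tuple(
        sum(m * c for m, c in zip(row, rgb)) + offset
        for row, offset in zip(RGB_TO_YCBCR, YCBCR_OFFSET)
    )


def dropout_pixel(pixel, keep_colors, transform=None):
    """Return 0 (black) if the pixel lies inside the tolerance sphere of any
    non-dropout color, otherwise 1 (white)."""
    p = transform(pixel) if transform else pixel
    for color, tolerance in keep_colors:
        c = transform(color) if transform else color
        if math.dist(p, c) <= tolerance:  # Euclidean distance
            return 0
    return 1


def dropout_image(image, keep_colors, transform=None):
    """Apply the per-pixel filter to a list of rows of RGB triples."""
    return [[dropout_pixel(px, keep_colors, transform) for px in row]
            for row in image]


# Hypothetical parameters: keep black and blue ink, with made-up tolerances.
KEEP_COLORS = [((0, 0, 0), 100.0), ((0, 0, 255), 120.0)]

# Tiny 2x2 "scan": dark ink, pink form background, blue ink, white paper.
image = [
    [(10, 10, 10), (255, 200, 200)],
    [(20, 30, 230), (255, 255, 255)],
]

binary_rgb = dropout_image(image, KEEP_COLORS)
binary_ycc = dropout_image(image, KEEP_COLORS, transform=rgb_to_ycbcr)
```

Each pixel is processed independently with a handful of multiplications and comparisons, which is consistent with the low memory requirements mentioned in the text. In practice the tolerances would likely need separate tuning for each color space, because distances are scaled differently after the transform.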

Color forms constitute a large number of the documents that are scanned using high-speed scanners. In color forms, the information of interest is the text that has been entered, while the document background and lines, originally placed in the document to facilitate data entry, are of no practical use. Representative documents of this type are medical forms, insurance forms, census forms, etc. When performing character recognition on these forms, it is desirable to eliminate the color background and lines that are part of the form and keep only the relevant textual information. Color dropout is the image processing function whose purpose is to convert the scanned color document to a binary image in which the form background colors are turned to white and the text colors are turned to black. To accomplish this, we need to distinguish between the colors of the background and the colors of the entered text. Color dropout may be viewed as a form of color image rendering, since the image is converted from a full-color form to black and white.

There are several advantages to performing color dropout. First, the textual information of interest is enhanced, because it is rendered black, while the background color, which may reduce the text contrast, is suppressed. In addition, the removal of the form lines minimizes interference with the text characters and may reduce errors during character recognition. Another advantage is that the uncompressed file size is reduced by a factor of 24, since the color image consisting of 24 bits per pixel is converted to a binary image with only one bit per pixel. This significantly reduces the storage requirements for the resulting document files.

Color dropout may be accomplished using optical or digital methods. Optical filters have been used when the document form involves a single dropout color. However, optical filters cannot be used with multiple dropout colors, and it is difficult to adjust their parameters to match nonstandard colors.
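The factor-of-24 reduction in uncompressed size cited above follows directly from the bit depths. The sketch below works through the arithmetic for a hypothetical letter-size page scanned at 300 dpi (the page dimensions and function names are assumptions for illustration) and shows one common way to pack a row of binary pixels into bytes.

```python
def uncompressed_bits(width, height, bits_per_pixel):
    """Raw image size in bits, ignoring headers and row padding."""
    return width * height * bits_per_pixel


def pack_row(bits):
    """Pack a row of 0/1 pixel values into bytes, most significant bit first.
    The final byte is padded with zero bits if the row length is not a
    multiple of 8."""
    out = bytearray()
    for i in range(0, len(bits), 8):
        chunk = bits[i:i + 8]
        byte = 0
        for b in chunk:
            byte = (byte << 1) | b
        byte <<= 8 - len(chunk)  # left-align a short final chunk
        out.append(byte)
    return bytes(out)


# Hypothetical page: 8.5 x 11 inches at 300 dpi.
WIDTH, HEIGHT = 2550, 3300

color_bits = uncompressed_bits(WIDTH, HEIGHT, 24)   # 24-bit RGB
binary_bits = uncompressed_bits(WIDTH, HEIGHT, 1)   # 1-bit black and white
ratio = color_bits // binary_bits

print(f"color:  {color_bits // 8:,} bytes")
print(f"binary: {binary_bits / 8:,.0f} bytes")
print(f"ratio:  {ratio}")
```

If each row is padded to a whole number of bytes, as `pack_row` does, the stored binary image is slightly larger than the ideal figure, so the ratio in a real file can fall marginally below 24.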
Color dropout methods based on digital processing sometimes attempt to remove the form lines and background information from the scanned grayscale image by postprocessing. Examples of this approach include [1], where form frames are identified for the purpose of form line removal, and [2], where the distance transformation and its gradient flow are employed to remove form lines. Such approaches may work for specific cases, but they require significant computational effort and are very expensive to implement in the real-time hardware that is used in high-speed scanners. Another approach to color dropout, originally developed in the context of optical character recognition, was introduced by Rudak [3]. In this work, the average RGB dropout colors in color patches are determined and used in a dropout filter that can be implemented using electronic hardware. The filter bandwidth is adjusted to accommodate color variations between forms. The advantage of this approach is that the presence of noise, e.g. black specks, does not significantly affect the average color in the color patch considered, and consequently does not affect the final color dropout result. Another approach, presented in [4], proposes scanning a blank form, extracting the dropout colors from the blank form, and using them to perform color dropout when scanning other forms.
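The robustness of a patch average to isolated specks can be illustrated with a short sketch. This is only a toy model of the idea, not the method of [3]: the patch contents, the bandwidth value, and the function names are hypothetical.

```python
import math


def mean_color(patch):
    """Average RGB color over a list of pixels taken from one color patch."""
    n = len(patch)
    return tuple(sum(px[ch] for px in patch) / n for ch in range(3))


def is_dropout(pixel, dropout_color, bandwidth):
    """True if the pixel is close enough to the dropout color to be
    removed (rendered white)."""
    return math.dist(pixel, dropout_color) <= bandwidth


# Hypothetical light-green form background.
FORM_GREEN = (200, 230, 200)

clean_patch = [FORM_GREEN] * 100
noisy_patch = [FORM_GREEN] * 98 + [(0, 0, 0)] * 2  # two black specks

clean_mean = mean_color(clean_patch)
noisy_mean = mean_color(noisy_patch)

# How far the specks move the estimated dropout color.
shift = math.dist(clean_mean, noisy_mean)
print(f"mean shift caused by specks: {shift:.2f}")
```

With 2% of the patch covered by black specks, the estimated dropout color moves only a few units in RGB space, which is small compared with any bandwidth wide enough to absorb normal color variation between forms.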
