[0001] The present invention relates to an image processing method and apparatus for detecting the direction (upward/downward/leftward/rightward) of an original image inputted using a scanner or the like.
[0002] Conventionally, the direction of an original image inputted into a computer using a scanner or the like is detected by the following methods:
[0003] (1) Detection of Original Image Direction by Software
[0004]
[0005] The above processing will be described in more detail.
[0006] Accordingly, in the binarization processing procedure
[0007]
[0008] As described above, in the OCR processing procedure
[0009] In image direction detection by software processing, a value obtained by adding up the OCR results of all the characters existing in the original (characters existing in the character areas resulting from the area division) is outputted as the final result.
[0010] (2) Detection of Original Image Direction by Hardware
[0011] Next, the outline of hardware construction for the conventional image direction detection processing will be described.
[0012]
[0013] In
[0014] First, the scanner sequentially reads the originals placed on the ADF, and VIDEO in
[0015] The character extraction unit
[0016] Next, the CPU
[0017] However, the above-described conventional methods have the following problems.
[0018] (1) Problems in Detection of Original Image Direction by Software
[0019] (1-1) Much Processing Time is Required.
[0020] The following is a result of measurement of an A4 size image processed by a personal computer having a 266 MHz Pentium (registered trademark) II. First, it takes 1.8 seconds to create a histogram and calculate an optimum binarization point. Next, it takes 0.3 to 1.0 seconds to perform the area division processing, although the processing time varies with the image (depending on the number of connected black pixels). Then it takes 2 to 3 seconds to perform the OCR processing on a document original mainly including characters, although the processing time varies with the number of characters. Accordingly, a total of 4 to 5 seconds is required.
[0021] (1-2) A Large Amount of Work Memory is Required.
[0022] As the entire color image is referred to in order to obtain an optimum binary image, 24 MBytes of memory are required for an A4 size image.
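The order of magnitude of this figure can be checked with a short calculation. The sketch below assumes a 300 dpi scan and 24-bit RGB pixels; neither value is stated for the conventional method, so both are illustrative assumptions, and the function name is hypothetical.

```python
# Assumed scan parameters (not stated in the text): A4 at 300 dpi, 24-bit RGB.
MM_PER_INCH = 25.4

def image_bytes(width_mm, height_mm, dpi, bits_per_pixel):
    """Return the uncompressed size in bytes of a scanned page."""
    width_px = round(width_mm / MM_PER_INCH * dpi)
    height_px = round(height_mm / MM_PER_INCH * dpi)
    return width_px * height_px * bits_per_pixel // 8

color_bytes = image_bytes(210, 297, 300, 24)   # full-color A4 page
binary_bytes = image_bytes(210, 297, 300, 1)   # 1-bit A4 page, for comparison
print(color_bytes / 2**20)   # size of the color page in MiB
print(binary_bytes / 2**20)  # size of the binary page in MiB
```

Under these assumptions the color page occupies roughly 25 MiB, which is of the same order as the 24 MBytes quoted above, while a 1-bit image of the same page is about 1 MiB.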
[0023] (2) Problems in Detection of Original Image Direction by Hardware
[0024] (2-1) Cost is High.
[0025] As the specialized board is utilized and the CPU, the RAM, the ROM, the character extraction GA, a control GA (not shown) and the like are necessary only for the direction determination processing, the cost is high.
[0026] (2-2) Version Updating is Difficult.
[0027] As the character extraction unit is comprised of a specialized GA, the version of the character extraction algorithm cannot be updated easily.
[0028] (3) Problem Common to Both Detection Methods
[0029] In both methods, it is impossible to perform the OCR processing on an inverted character portion. In recent years, color office documents are widely used as well as printed documents, and the color images often include more inverted character portions in comparison with monochrome originals. Accordingly, in both methods, the accuracy of recognition is low in color images having inverted character portions.
[0030] The present invention has been proposed to solve the conventional problems, and has as its object to provide an image processing method and apparatus which efficiently detect the input direction of images, ranging from an image having much differential information to an image having little differential information.
[0031] According to the present invention, the foregoing object is attained by providing an image processing method for detecting a direction of an image including a character area, inputted into a computer, the method comprising: a binary image generation step of generating a binary image of the image; a tile image generation step of generating a tile image by applying a predetermined value to tiles, each corresponding to a predetermined size area in the binary image; a character area extraction step of extracting an area in the binary image, corresponding to an area in a circumscribed rectangle surrounding connected pixels having the same value in the tile image, as a character area; and a direction detection step of recognizing a direction of characters included in the character area and thereby detecting the direction of the image.
[0032] Further, in the image processing method, at the binary image generation step, the binary image is generated with image area flags having a value 1 corresponding to a pixel equal to or greater than a predetermined value or a value 0 corresponding to a pixel less than the predetermined value, and at the tile image generation step, the tile image is generated with a tile having a value 1 where the number of image area flags having the value 1 is equal to or greater than a predetermined threshold value, and a tile having a value 0 where the number of image area flags having the value 1 is less than the predetermined threshold value.
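The two steps above can be sketched as follows. This is a minimal illustration only: the function names, the tile size, and the threshold values are assumptions chosen for the example and are not taken from the specification.

```python
def make_flags(pixels, pixel_threshold):
    """Binary image: flag 1 where the pixel value >= pixel_threshold, else 0."""
    return [[1 if v >= pixel_threshold else 0 for v in row] for row in pixels]

def make_tile_image(flags, tile_size, count_threshold):
    """Tile image: tile 1 where the count of 1-flags >= count_threshold, else 0."""
    height, width = len(flags), len(flags[0])
    tiles = []
    for ty in range(0, height, tile_size):
        tile_row = []
        for tx in range(0, width, tile_size):
            # Count the 1-flags inside this tile (clipped at the image border).
            count = sum(
                flags[y][x]
                for y in range(ty, min(ty + tile_size, height))
                for x in range(tx, min(tx + tile_size, width))
            )
            tile_row.append(1 if count >= count_threshold else 0)
        tiles.append(tile_row)
    return tiles

pixels = [
    [200, 210, 10, 20],
    [220, 30, 15, 25],
    [5, 8, 12, 240],
    [7, 9, 11, 13],
]
flags = make_flags(pixels, pixel_threshold=128)
tiles = make_tile_image(flags, tile_size=2, count_threshold=2)
print(tiles)
```

In this example only the upper-left tile contains enough flags to be set to 1; the isolated flag in the lower-right tile falls below the count threshold and is suppressed.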
[0033] Further, the image processing method further comprises: a character extraction step of extracting the respective characters included in the character area extracted at the character area extraction step; and a character recognition step of recognizing a direction of the characters extracted at the character extraction step, and at the direction detection step, the direction of the character area is detected based on the result of recognition of the direction of the characters included in the character area.
[0034] Further, the image processing method further comprises: a determination step of determining whether or not the character area is an inverted image based on the binary image of the image; and an inversion processing step of inverting black and white components of the binary image if it is determined at the determination step that the character area is an inverted image.
[0035] Further, in the image processing method, at the tile image generation step, plural tile images are generated using plural different threshold values, and at the character area extraction step, the plural tile images are compared with each other and the character area included in the image is extracted.
[0036] Further, in the image processing method, the tile image is a low resolution binary image generated by counting, for each small area, the pixels of a binary image obtained by differentiating the image.
[0037] Further, in the image processing method, the tile image is a low resolution differential image generated by counting differential information of the image for each small area.
[0038] Further, in the image processing method, at the character area extraction step, an area in the image, corresponding to the connected pixels extracted from the low resolution image, is extracted as a character area.
[0039] Further, in the image processing method, at the tile image generation step, plural low resolution images are generated using plural different threshold values.
[0040] Further, in the image processing method, at the character area extraction step, connected pixels extracted from the plural low resolution images are compared with the plural low resolution images and the character area is extracted.
[0041] Further, in the image processing method, at the character area extraction step, the low resolution image is divided into meshes, and the character area is extracted based on distribution of pixels within each mesh area.
[0042] Further, in the image processing method, the character area extraction step includes a selection output step of selectively outputting a character area extracted using connected pixels extracted from the low resolution image and a character area determined based on the distribution of pixels within each mesh area.
[0043] According to the present invention, as a character area can be detected utilizing plural low-resolution images comprised of differential information of a color image, an image input direction can be efficiently detected in images, from an image with much differential information to an image with little differential information.
[0044] Further, according to the present invention, as an inversion determination unit is provided, character recognition can be performed on an inverted character portion. Accordingly, even in a color image having many inverted character portions, the input direction can be detected.
[0045] Further, according to the present invention, as high-speed software processing can be performed with a small amount of work memory, no cost of parts is incurred when the version of the direction detection processing is updated.
[0046] Other features and advantages of the present invention will be apparent from the following description taken in conjunction with the accompanying drawings, in which like reference characters designate the same or similar parts throughout the figures thereof.
[0047] The accompanying drawings, which are incorporated in and constitute a part of the specification, illustrate embodiments of the invention and, together with the description, serve to explain the principles of the invention.
[0048]
[0049]
[0050]
[0051]
[0052]
[0053]
[0054]
[0055]
[0056]
[0057]
[0058]
[0059]
[0060]
[0061]
[0062]
[0063]
[0064]
[0065]
[0066]
[0067]
[0068]
[0069]
[0070]
[0071]
[0072]
[0073]
[0074]
[0075]
[0076]
[0077]
[0078]
[0079]
[0080]
[0081]
[0082]
[0083]
[0084]
[0085]
[0086]
[0087] Hereinbelow, operations of the image processing method and apparatus according to the present invention will be described in detail with reference to the drawings.
[0088] <First Embodiment>
[0089]
[0090] Numeral
[0091] The respective parts in a direction detection module
[0092] In
[0093] Further, numeral
[0094] Next, as an example of the general image processing system
[0095]
[0096]
[0097] In
[0098] The system bus bridge
[0099] The sub bus switch
[0100] A RAM
[0101] The image DMA
[0102] The image DMA
[0103] The font expansion unit
[0104] The sort circuit
[0105] The IO bus
[0106] An operation interface
[0107] The IO bus
[0108] An external storage unit (HDD)
[0109] The image ring interface
[0110] The image ring
[0111] The tile expansion unit
[0112] The command processor
[0113] The status processor
[0114] The rendering unit interface
[0115] The image input interface
[0116] The image output interface
[0117] The image rotation unit
[0118] The external bus interface unit
[0119] A memory controller
[0120] The rendering unit
[0121] Next, the flow of the direction determination processing of the present invention will be described for a case where the above-described digital color copier is employed as the image processing system
[0122]
[0123] Further, as in the case of
[0124] Hereinbelow, generation of a tile header
[0125] First, plural color images are placed on an ADF of the scanner
[0126] The scanner image processor
[0127] The image area flag generation hardware can select the signal to be processed by setting R, G, B or a×R+b×G+c×B, together with the coefficients, in an internal register. Generally, as the default signal is the G signal, it is assumed in the present embodiment that the G signal is processed and the image area flag is generated.
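The signal selection can be pictured with the following sketch. The mode names and the function are hypothetical stand-ins for the register setting; only the choice among R, G, B and the weighted sum a×R+b×G+c×B, with G as the default, comes from the description.

```python
def select_signal(r, g, b, mode="G", coeffs=(0.0, 1.0, 0.0)):
    """Return the single-channel value that flag generation would process.

    mode is "R", "G", "B", or "MIX" (a*R + b*G + c*B using coeffs).
    """
    if mode == "R":
        return r
    if mode == "G":
        return g
    if mode == "B":
        return b
    if mode == "MIX":
        a, b_coef, c = coeffs
        return a * r + b_coef * g + c * b
    raise ValueError(f"unknown mode: {mode}")

default_value = select_signal(10, 20, 30)                       # G signal
mixed_value = select_signal(10, 20, 30, "MIX", (0.5, 0.25, 0.25))
print(default_value, mixed_value)
```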
[0128] On the other hand, the scanner image processor
[0129] The image input interface
[0130] That is, in the image processing method according to the present invention, the tile image is a low resolution binary image generated by counting, for each small area, the pixels of a binary image obtained by differentiating an input image.
[0131] As the generation of the image area flag information tile header, “if an image area flag equal to or greater than a threshold value T1 exists in a tile, the tile is set to 1”. As the order of image input in the image input interface
[0132] That is, in the image processing method according to the present invention, a binary image of image area flags having a value “1” if a corresponding pixel is equal to or greater than a predetermined value, or a value “0” if a corresponding pixel is less than the predetermined value, is generated, and a tile image is generated such that in a tile header, a tile where the number of “1” image area flags is equal to or greater than a predetermined threshold value is set to “1”, and a tile where the number of “1” image area flags is less than the predetermined threshold value is set to “0”.
[0133] Further, in this embodiment, the image area flag information tile header is generated with 2 bits. That is, 1 bit is generated as “if an image area flag equal to or greater than a threshold value T1 exists in at least one of the 32 lines of the tile, the tile is set to 1”, and the other bit is generated as “if an image area flag equal to or greater than a threshold value T2 exists in at least one of the 32 lines of the tile, the tile is set to 1” (T1>T2).
[0134] That is, in the image processing method according to the present invention, plural tile images are generated using plural different threshold values, then a comparison is made among the plural tile images, and a character area included in the input image is extracted. Further, the feature of the image processing method according to the present invention is that plural low resolution images are generated using plural different threshold values. A further feature of the image processing method according to the present invention is that a group of connected pixels extracted from plural low resolution images is compared with the plural low resolution images, and a character area is extracted.
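A 2-bit tile header of this kind can be sketched as follows. The sketch assumes that each tile is summarized by a single per-tile quantity (here called a flag count) that is compared against T1 and T2; the bit layout, the function names, and the sample values are illustrative assumptions.

```python
def tile_header_bits(flag_counts, t1, t2):
    """Pack two 1-bit tile images into one 2-bit value per tile.

    Bit 1 is set when the value reaches t1, bit 0 when it reaches t2 (t1 > t2).
    """
    if t1 <= t2:
        raise ValueError("expected t1 > t2")
    return [
        [((1 if c >= t1 else 0) << 1) | (1 if c >= t2 else 0) for c in row]
        for row in flag_counts
    ]

def split_bits(header):
    """Recover the two low resolution images (T1 image, T2 image)."""
    img_t1 = [[(v >> 1) & 1 for v in row] for row in header]
    img_t2 = [[v & 1 for v in row] for row in header]
    return img_t1, img_t2

counts = [[0, 3, 9], [12, 5, 1]]
header = tile_header_bits(counts, t1=8, t2=3)
img_t1, img_t2 = split_bits(header)
print(header, img_t1, img_t2)
```

Because T1>T2, every tile set in the T1 image is also set in the T2 image; tiles set only in the T2 image correspond to regions with comparatively little differential information.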
[0135] Note that as the image area flag
[0136] That is, in the image processing method according to the present invention, the tile image is a low resolution differential image generated by counting differential information of the input image for each small area.
[0137] The color image data (RGB), a channel data including the image area flag data and the tile header information are compressed in the image processor
[0138] Next, a procedure for generating the 300 dpi partial binary image
[0139] The image spooled in the RAM
[0140] As described above, in the image processing system (color digital copier), the tile header
[0141]
[0142] Further, numeral
[0143] The software processing
[0144] Note that the primary character extraction processing means limiting a portion including characters and selecting the portion as a small rectangular area from the entire image.
[0145] In
[0146]
[0147] Next, an average size of all the rectangles is obtained except rectangles having an extremely large/small area and a high oblateness among the set of rectangles {Tc1} (step S
[0148] Further, all the connected black pixels are extracted from the tile header
[0149] That is, the feature of the image processing method according to the present invention is extracting an area in the image, corresponding to connected pixels extracted from a low-resolution image, as a character area.
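The extraction of connected pixels and their circumscribed rectangles can be sketched as below. The use of 8-connectivity, a breadth-first search, and a 32×32 pixel tile are assumptions made for the illustration; the specification does not fix the labeling method.

```python
from collections import deque

def connected_rectangles(tiles):
    """Return bounding boxes (x0, y0, x1, y1), inclusive, of 8-connected 1-tiles."""
    height, width = len(tiles), len(tiles[0])
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for sy in range(height):
        for sx in range(width):
            if tiles[sy][sx] != 1 or seen[sy][sx]:
                continue
            # Breadth-first search over one connected group of tiles.
            seen[sy][sx] = True
            queue = deque([(sx, sy)])
            x0 = x1 = sx
            y0 = y1 = sy
            while queue:
                x, y = queue.popleft()
                x0, x1 = min(x0, x), max(x1, x)
                y0, y1 = min(y0, y), max(y1, y)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        nx, ny = x + dx, y + dy
                        if (0 <= nx < width and 0 <= ny < height
                                and tiles[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((nx, ny))
            boxes.append((x0, y0, x1, y1))
    return boxes

def to_image_coords(box, tile_size):
    """Map a tile-space box to pixel coordinates in the full resolution image."""
    x0, y0, x1, y1 = box
    return (x0 * tile_size, y0 * tile_size,
            (x1 + 1) * tile_size - 1, (y1 + 1) * tile_size - 1)

tiles = [
    [1, 1, 0, 0, 0],
    [0, 1, 0, 0, 1],
    [0, 0, 0, 0, 1],
]
boxes = connected_rectangles(tiles)
areas = [to_image_coords(b, tile_size=32) for b in boxes]
print(boxes, areas)
```

Each rectangle found in the low resolution image is scaled by the tile size to give the corresponding candidate character area in the full resolution image.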
[0150] Further, in the set of rectangles {Tc2}, a set of rectangles having an area equal to or smaller than a predetermined area is determined as a set of non-text rectangles {Tc2-nt}, and a set of rectangles having an area larger than the predetermined area, as text rectangles {Tc2-t} (step S
[0151] Further, regarding each rectangle of the set of rectangles {Tc2-t}, the total sum of areas among the set of rectangles {Tc1-t} having coordinates overlapped with the rectangle is divided by the area of the rectangle, and the obtained value is set as a score of the rectangle (Step S
[0152] The set of rectangles {Tc2-t} are sorted in ascending order of the scores (step S
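The scoring and sorting of the {Tc2-t} rectangles can be sketched as follows. The sketch reads the score as the total area of the {Tc1-t} rectangles that overlap a given {Tc2-t} rectangle, divided by the area of that rectangle; half-open (x0, y0, x1, y1) coordinates and the sample rectangles are assumptions for illustration.

```python
def area(rect):
    """Area of a half-open rectangle (x0, y0, x1, y1)."""
    x0, y0, x1, y1 = rect
    return (x1 - x0) * (y1 - y0)

def overlaps(a, b):
    """True when two rectangles share a region of nonzero area."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]

def score(rect, tc1_text):
    """Total area of the overlapping {Tc1-t} rectangles over the area of rect."""
    total = sum(area(r) for r in tc1_text if overlaps(rect, r))
    return total / area(rect)

tc1_text = [(0, 0, 2, 2), (2, 0, 4, 2), (10, 10, 12, 12)]
tc2_text = [(0, 0, 4, 2), (9, 9, 13, 13)]

# Sort the {Tc2-t} rectangles in ascending order of score.
ranked = sorted(tc2_text, key=lambda r: score(r, tc1_text))
print([(r, score(r, tc1_text)) for r in ranked])
```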
[0153] On the other hand,
[0154] Next, in each mesh, the total sum Sm of areas of the {Tc2} rectangles included in the mesh is obtained (step S
[0155] Next, the set of all the Tr rectangles, {Tr} is sorted in ascending order of the scores (step S
[0156] Finally, the output unit
[0157] That is, the feature of the image processing method according to the present invention is that a low resolution image is divided into meshes, and a character area is extracted based on distribution of pixels within each mesh area. Further, another feature of the image processing method according to the present invention is that a character area extracted using connected pixels extracted from a low resolution image and a character area determined based on distribution of pixels within each mesh area are selectively outputted.
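The mesh-based determination can be pictured with the sketch below. It divides a low resolution image into square meshes, counts the set pixels in each mesh, and keeps the meshes whose density lies in a given band. The density band and the function names are assumptions introduced only for this example; the embodiment itself evaluates the rectangles included in each mesh.

```python
def mesh_counts(tiles, mesh_size):
    """Count 1-pixels inside each mesh_size x mesh_size mesh of the image."""
    height, width = len(tiles), len(tiles[0])
    counts = []
    for my in range(0, height, mesh_size):
        row = []
        for mx in range(0, width, mesh_size):
            row.append(sum(
                tiles[y][x]
                for y in range(my, min(my + mesh_size, height))
                for x in range(mx, min(mx + mesh_size, width))
            ))
        counts.append(row)
    return counts

def candidate_meshes(counts, mesh_size, low, high):
    """Return (col, row) of meshes whose density of 1-pixels lies in [low, high]."""
    cell = mesh_size * mesh_size
    return [
        (cx, cy)
        for cy, row in enumerate(counts)
        for cx, c in enumerate(row)
        if low <= c / cell <= high
    ]

tiles = [
    [1, 0, 1, 1],
    [0, 1, 1, 1],
    [0, 0, 0, 0],
    [0, 0, 1, 0],
]
counts = mesh_counts(tiles, mesh_size=2)
picked = candidate_meshes(counts, mesh_size=2, low=0.4, high=0.8)
print(counts, picked)
```

In this example the fully filled mesh is rejected as too dense and the nearly empty meshes as too sparse, leaving one mesh as a character area candidate.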
[0158] Plural character area coordinates are obtained by the above-described primary character extraction. An image reading table is generated on the RAM in accordance with the result of processing, and when the CPU makes a program kick, the 300 dpi partial binary images
[0159] In parallel to the BSOCR processing on the area
[0160] Next, a particular example of the above-described direction determination processing will be described with reference to actual
[0161] The primary character extraction unit outputs rectangular areas arranged in the order of scores of probability of character area by the processing procedure shown in the flowchart of
[0162] In
[0163] Upon completion of reading from the area
[0164]
[0165] Note that it is preferable that in the OCR processing, the result of direction determination is the total of processing results in plural areas. For example, in a case where 200 characters exist in 1 area, to avoid time out of processing time while
[0166] When the areas
[0167] That is, the feature of the image processing method according to the present invention is that it is determined whether or not a character area is an inverted image based on a binary image of an input image, and if it is determined that the character area is an inverted image, black and white components of the binary image are inverted.
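The inversion determination and inversion processing can be sketched as follows. The criterion used here, that black pixels outnumber white pixels in the area, is an assumption chosen for illustration; the specification states only that the determination is based on the binary image.

```python
def is_inverted(binary, ratio_threshold=0.5):
    """Guess inversion: a majority of black (1) pixels suggests light text on dark."""
    total = sum(len(row) for row in binary)
    black = sum(sum(row) for row in binary)
    return black / total > ratio_threshold

def invert(binary):
    """Swap the black and white components of a binary image."""
    return [[1 - v for v in row] for row in binary]

def normalize(binary):
    """Return the area with dark-on-light polarity for the recognition stage."""
    return invert(binary) if is_inverted(binary) else binary

area = [
    [1, 1, 1, 1],
    [1, 0, 0, 1],
    [1, 1, 1, 1],
]
fixed = normalize(area)
print(fixed)
```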
[0168]
[0169] In the primary character extraction, the processing shown in the flowchart of
[0170] Image reading tables for areas
[0171] Upon completion of reading from the area
[0172] When the areas
[0173] As described above, the feature of the image processing method according to the present invention is that the direction of an image including a character area inputted into a computer is detected by generating a binary image, then generating a tile image by applying a predetermined value to tiles each defined as a predetermined size area in the binary image, then extracting, from the binary image, an area corresponding to an area within a circumscribed rectangle surrounding connected pixels having the same value in the tile image as a character area, recognizing the directions of the characters included in the character area, and detecting the direction of the image.
[0174] Further, the feature of the image processing method according to the present invention is that respective characters included in a character area are extracted, and the direction of the extracted characters is recognized, and the direction of the character area is detected based on the result of recognition of the direction of the characters included in the character area.
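The step from per-character recognition results to a direction for the whole area can be sketched as a simple accumulation. The representation of a recognition result as a (direction, confidence) pair and the four direction values in degrees are assumptions for this example; the recognition engine itself is outside the sketch.

```python
DIRECTIONS = (0, 90, 180, 270)

def detect_direction(per_character_results):
    """Add up per-character results and return the winning direction and totals.

    per_character_results: iterable of (direction, confidence) pairs, one per
    recognized character, where direction is one of DIRECTIONS.
    """
    totals = {d: 0.0 for d in DIRECTIONS}
    for direction, confidence in per_character_results:
        totals[direction] += confidence
    # Ties are resolved in favor of the earlier entry in DIRECTIONS.
    best = max(DIRECTIONS, key=lambda d: totals[d])
    return best, totals

results = [(0, 0.9), (0, 0.8), (180, 0.6), (0, 0.7), (90, 0.2)]
direction, totals = detect_direction(results)
print(direction, totals)
```

Accumulating over many characters makes the result tolerant of individual misrecognitions, such as the single character voted as 180 degrees in the sample data.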
[0175] <Second Embodiment>
[0176] In the above-described first embodiment, the binarization threshold value A is calculated from the whole histogram provided by the image processing system; however, in the present invention, the binarization threshold value may be a fixed value of, e.g., “128”. With a fixed threshold value, the binarization can be performed even if the image processing system side lacks the whole histogram operation unit.
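Binarization with a fixed threshold value can be sketched in a few lines. The polarity, in which pixels darker than the threshold become black (1), and the 8-bit value range are assumptions for this illustration.

```python
def binarize(gray, threshold=128):
    """Return 1 (black) where the pixel is darker than threshold, else 0 (white)."""
    return [[1 if v < threshold else 0 for v in row] for row in gray]

gray = [[0, 127, 128], [200, 64, 255]]
binary = binarize(gray)
print(binary)
```

No histogram of the whole image is needed, so each pixel can be binarized as soon as it is read.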
[0177] Further, in the above-described first embodiment, the area division processing is performed on the 300 dpi partial image; however, the processing may be omitted. In this case, the time necessary for the area division processing can be saved, and the OCR processing can be performed on more areas. However, if characters cannot be cut out directly from the images detected by the primary character extraction processing, the result of direction detection becomes poor. Therefore, the primary character extraction processing may be arranged such that an area to be subjected to the subsequent area division processing and an area not to be subjected to the area division processing are designated. In this case, the accuracy of direction detection can be maintained and the processing speed is increased.
[0178] Further, in the first embodiment, the partial image is a 300 dpi binary image; however, the present invention is not limited to this image. It may be arranged such that a partial multivalue image is read. In this case, the direction detection side requires a function of binarizing the partial multivalue image. In this arrangement, optimum binarization can be performed on the partial area.
[0179] Further, considering data size, the time for reading a multivalue image is longer than that for reading a binary image. Accordingly, it is effective that in the primary character extraction processing, it is determined whether binary image reading or multivalue image reading is to be performed based on the distribution of differential information, and the image processing system is instructed to select binary image reading or multivalue image reading in accordance with the determination.
[0180] <Other Embodiment>
[0181] The present invention can be applied to a system constituted by a plurality of devices (e.g., a host computer, an interface, a reader and a printer) or to an apparatus comprising a single device (e.g., a copy machine or a facsimile apparatus).
[0182] Further, the object of the present invention can also be achieved by providing a storage medium (or recording medium) holding software program code for performing the aforesaid processes to a system or an apparatus, reading the program code from the storage medium with a computer (e.g., CPU, MPU) of the system or apparatus, and then executing the program. In this case, the program code read from the storage medium realizes the functions according to the embodiments, and the storage medium holding the program code constitutes the invention. Furthermore, besides the case where the aforesaid functions according to the above embodiments are realized by executing the program code read by a computer, the present invention includes a case where an OS (operating system) or the like working on the computer performs a part or all of the actual processing in accordance with designations of the program code and realizes the functions according to the above embodiments.
[0183] Furthermore, the present invention also includes a case where, after the program code read from the storage medium is written in a function expansion card which is inserted into the computer or in a memory provided in a function expansion unit which is connected to the computer, a CPU or the like contained in the function expansion card or unit performs a part or all of the processing in accordance with designations of the program code and realizes the functions of the above embodiments.
[0184] In a case where the present invention is applied to the aforesaid storage medium, the storage medium stores program code corresponding to the flowcharts described in the embodiments.
[0185] As described above, according to the present invention, as a character area is detected utilizing plural low resolution images comprised of differential information of a color image, an image input direction can be efficiently detected, from an image with much differential information to an image with little differential information. Further, according to the present invention, as the system has the inversion determination unit, the character recognition can be performed in an inverted character portion, and the input direction can be detected even in a color image having many inverted character portions. Further, according to the present invention, as high-speed software processing can be performed with a small work memory, the cost of parts necessary for version updating of the direction detection processing can be reduced.
[0186] The present invention is not limited to the above embodiments and various changes and modifications can be made within the spirit and scope of the present invention. Therefore, to apprise the public of the scope of the present invention, the following claims are made.