CN103024246A - Documentary archive image compressing method - Google Patents

Info

Publication number
CN103024246A
Authority
CN
China
Prior art keywords
image
color
documentary archive
pixel
values
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201210496941XA
Other languages
Chinese (zh)
Other versions
CN103024246B (en)
Inventor
吕岳
刘丽
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
East China Normal University
Original Assignee
East China Normal University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by East China Normal University
Priority to CN201210496941.XA
Publication of CN103024246A
Application granted
Publication of CN103024246B
Expired - Fee Related (current legal status)
Anticipated expiration

Landscapes

  • Compression Of Band Width Or Redundancy In Fax (AREA)
  • Image Processing (AREA)

Abstract

The invention discloses a documentary archive image compression method. The method comprises the following steps: preprocessing, in which the original image is processed to remove noise introduced during scanning; three-valued processing, in which the preprocessed image is processed so that it is represented by three kinds of color information; and LL (lossless) compression coding, in which lossless compression coding is applied to the image after three-valued processing. The method retains all useful information in the documentary archive image while eliminating redundancy and achieving a high compression ratio.

Description

A documentary archive image compression method
Technical field
The invention belongs to the field of document image processing, and in particular relates to a documentary archive image compression method.
Background art
To meet the demands of informatization, large numbers of paper documentary archives are scanned and stored in electronic form. On the one hand, this avoids damage caused by improper storage; on the other hand, existing information technology makes electronic archives easy to manage and search, which greatly saves manpower and material resources. Scanned paper archives are usually stored on a computer as color images, so compressing these images to save storage space is an urgent problem.
In recent years, many researchers have studied image compression and proposed a variety of compression coding methods, which fall mainly into lossless and lossy compression coding. Lossless compression coding means that decoding the compressed data yields data identical to the original. Commonly used lossless algorithms include the Huffman algorithm and the LZW (Lempel-Ziv-Welch) algorithm. Lossy compression coding means that the data obtained after compression and decompression differ from the original data but are very close to it; the widely used JPEG method is an example. JPEG compression is based on the discrete cosine transform (DCT): the image is first divided into non-overlapping regions, and the DCT is applied to each region. The transform coefficients are quantized according to a quantization table, reordered by zigzag scanning, and then coded using run-length coding, arithmetic coding, or Huffman coding.
Documentary archives are generally used to provide important historical information and have legal significance. This characteristic means that a lossless compression method should be used when choosing a coding scheme.
Summary of the invention
The present invention overcomes defects of the prior art, such as the large storage space required for images and the image distortion caused by lossy compression, and proposes a documentary archive image compression method.
The documentary archive image compression method proposed by the present invention comprises the following steps:
Preprocessing: the original image is processed to remove noise introduced during scanning;
Three-valued processing: the preprocessed image is processed so that it is represented using three kinds of color information; and
LL compression coding: compression coding is applied to the image after three-valued processing.
The preprocessing step comprises the following sub-steps:
Step A1: convert the RGB color model to the HSI color model;
Step A2: in the HSI model, enhance the I component using a grayscale image enhancement method;
Step A3: convert the result back to the RGB model.
In the three-valued processing step, the three kinds of color information are red, white, and black.
In the three-valued processing step:
Each pixel (i, j) in the image is scanned in turn. Let its RGB value be $(R_{ij}, G_{ij}, B_{ij})$; then:
If $R_{ij} < T_R$, $G_{ij} < T_G$, and $B_{ij} < T_B$, the current pixel is black, and $I_{ij} = 0$;
If $R_{ij} > \hat{T}_R$, $G_{ij} > \hat{T}_G$, and $B_{ij} > \hat{T}_B$, the current pixel is white, and $I_{ij} = 1$;
Otherwise, the current pixel is red, and $I_{ij} = 2$;
where $(T_R, T_G, T_B)$ and $(\hat{T}_R, \hat{T}_G, \hat{T}_B)$ are empirical thresholds, and $I_{ij}$ is the color code of each pixel after three-valued processing, which takes one of three values: {0, 1, 2}.
The empirical thresholds are $(T_R, T_G, T_B) = (20, 15, 30)$ and $(\hat{T}_R, \hat{T}_G, \hat{T}_B) = (210, 225, 220)$.
In the LL compression coding step, the image is described using runs, where a run is a sequence of consecutive pixels of the same color within one row of the image.
In the LL compression coding step, an LL compressed file is generated, comprising: an LL file header, an image data start flag, row compression data, and an image data end flag.
Based on the correlation between pixels in a three-valued image, the technical solution of the present invention proposes a run-based LL lossless compression coding method for images.
The present invention applies lossless compression coding to the image, which retains all useful information in the image and avoids losing required information through distortion after the image is restored.
Because documentary archive images mainly contain three colors, the present invention applies three-valued processing to the scanned image, so that the processed image content better conforms to the specification for documentary archives.
The present invention adopts a run-based lossless compression coding method, which eliminates redundancy, achieves a high compression ratio, and reduces the storage space occupied, which facilitates the preservation of documentary archive images.
Brief description of the drawings
Fig. 1 shows the documentary archive image compression flow;
Fig. 2 shows a documentary archive image before and after preprocessing;
Fig. 3 shows a documentary archive image before and after three-valued processing;
Fig. 4 shows the compression structure CS of a run;
Fig. 5 shows the control structure BS;
Fig. 6 shows the row compression result for a documentary archive image.
Detailed description
The present invention is described in further detail below with reference to specific embodiments and the drawings. Except for the content specifically mentioned below, the processes, conditions, and experimental methods for implementing the invention are general and common knowledge in the art, and the invention places no particular limitation on them.
The documentary archive image compression method of the present invention comprises preprocessing, three-valued processing, and LL compression coding, as shown in Fig. 1.
The purpose of preprocessing is to enhance the contrast of the original image and filter out the noise produced by scanning, making the image clearer.
Three-valued processing converts the type of the preprocessed image. Because documentary archive images consist mostly of red, white, and black, three-valued processing assigns every other color in the image to one of these three, converting the image to the red, white, and black three-color specification.
LL compression coding applies run-length coding to the image after three-valued processing. Run-length coding is lossless, so all information in the image is retained. It also eliminates redundancy, achieves a very high compression ratio, and reduces the storage space of the image file.
Embodiment:
Documentary archives are often affected by noise during scanning, which interferes with subsequent processing, so the image must be preprocessed. Preferably, the present invention uses color image enhancement to improve the visual quality of the image. Processing a color image first requires choosing a suitable color model. Commonly used color models include RGB, YUV, and HSI. The most common, RGB, is tied to the display system: computer monitors display color using RGB. It is a mixing-type color model in which colors are obtained by mixing the three primaries red (R), green (G), and blue (B) in certain proportions. The RGB model is based on a Cartesian coordinate system whose three axes are R, G, and B. The HSI color model describes color by hue H, saturation S, and intensity I; hue and saturation mainly describe the color information, and intensity represents the light intensity. This model has two properties: (1) the I component is independent of the color information of the image; (2) the H and S components are closely related to the way people perceive color, which matches human visual characteristics. In this embodiment, preprocessing is performed in the HSI color model.
The conversion between the RGB model and the HSI model is as follows, with H expressed in degrees:
(1) RGB to HSI
$$I = \frac{1}{3}(R + G + B)$$
$$S = 1 - \frac{3}{R + G + B}\min(R, G, B)$$
$$H = \arccos\left\{ \frac{[(R - G) + (R - B)]/2}{[(R - G)^2 + (R - B)(G - B)]^{1/2}} \right\}$$
(2) HSI to RGB
When $H \in [0^\circ, 120^\circ]$:
$$B = I(1 - S)$$
$$R = I\left[1 + \frac{S\cos H}{\cos(60^\circ - H)}\right]$$
$$G = 3I - (B + R)$$
When $H \in [120^\circ, 240^\circ]$:
$$R = I(1 - S)$$
$$G = I\left[1 + \frac{S\cos(H - 120^\circ)}{\cos(180^\circ - H)}\right]$$
$$B = 3I - (R + G)$$
When $H \in [240^\circ, 360^\circ]$:
$$G = I(1 - S)$$
$$B = I\left[1 + \frac{S\cos(H - 240^\circ)}{\cos(300^\circ - H)}\right]$$
$$R = 3I - (G + B)$$
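The conversion formulas can be sketched in Python as follows. This is an illustrative sketch rather than the patent's implementation: the function names `rgb_to_hsi` and `hsi_to_rgb` are hypothetical, components are assumed to lie in [0, 1], and two conventions that the text does not state are assumed here: the hue is taken as 360° minus the arccos value when B > G, and the hue of a gray pixel (where the formula is undefined) is set to 0.

```python
import math

def rgb_to_hsi(r, g, b):
    """Convert RGB components in [0, 1] to (H in degrees, S, I)."""
    total = r + g + b
    i = total / 3.0
    if total == 0:
        return 0.0, 0.0, 0.0          # pure black: S and H are undefined, use 0
    s = 1.0 - 3.0 * min(r, g, b) / total
    num = ((r - g) + (r - b)) / 2.0
    den = math.sqrt(max(0.0, (r - g) ** 2 + (r - b) * (g - b)))
    if den == 0:
        return 0.0, s, i              # gray: hue is undefined, use 0 (assumption)
    theta = math.degrees(math.acos(max(-1.0, min(1.0, num / den))))
    # Assumed convention: arccos only covers 0..180 degrees, so mirror when B > G.
    h = 360.0 - theta if b > g else theta
    return h, s, i

def hsi_to_rgb(h, s, i):
    """Convert (H in degrees, S, I) back to RGB using the three 120-degree sectors."""
    h = h % 360.0

    def sector(hh):
        # hh is the hue measured from the start of the current sector.
        x = i * (1.0 - s)
        y = i * (1.0 + s * math.cos(math.radians(hh)) / math.cos(math.radians(60.0 - hh)))
        z = 3.0 * i - (x + y)
        return x, y, z

    if h < 120.0:
        b, r, g = sector(h)
    elif h < 240.0:
        r, g, b = sector(h - 120.0)
    else:
        g, b, r = sector(h - 240.0)
    return r, g, b
```

Converting a color to HSI and back should reproduce the original components up to floating-point error, which gives a quick consistency check of the two sets of formulas.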
The specific preprocessing steps are as follows:
Step A1: convert the RGB color model to the HSI color model, using the conversion formulas above.
Step A2: in the HSI model, enhance the I component using a grayscale image enhancement method.
Step A3: convert the result back to the RGB model. Compared with the original RGB image, the converted RGB image has higher contrast, is clearer and more eye-catching, and has more vivid colors.
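A minimal sketch of steps A1 to A3 follows. The text does not say which grayscale enhancement method is applied to the I component, so a linear contrast stretch is assumed here purely for illustration, and the name `stretch_intensity` is hypothetical. The sketch relies on a property of the HSI formulas: H and S do not change when R, G, and B are multiplied by the same positive factor, so replacing I by I' with H and S fixed is equivalent to scaling the RGB triple by I'/I (before clipping to the 0..255 range).

```python
def stretch_intensity(pixels):
    """Linearly stretch I = (R + G + B) / 3 to the range 0..255, keeping H and S.

    pixels is a list of (R, G, B) tuples with components in 0..255.
    """
    intensities = [(r + g + b) / 3.0 for r, g, b in pixels]
    lo, hi = min(intensities), max(intensities)
    if hi <= lo:
        return list(pixels)           # flat image: nothing to stretch
    out = []
    for (r, g, b), i in zip(pixels, intensities):
        new_i = 255.0 * (i - lo) / (hi - lo)
        k = new_i / i if i > 0 else 0.0   # scale factor I'/I
        # Clipping at 255 can slightly change S for very saturated pixels.
        out.append(tuple(min(255, round(c * k)) for c in (r, g, b)))
    return out
```

For example, three pixels with intensities 60, 100, and 180 are mapped to intensities 0, 85, and 255, which widens the gap between dark text and light background.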
Fig. 2 shows a documentary archive image before preprocessing (left) and after preprocessing (right). Preprocessing increases the contrast of the image, which becomes clearer and more vivid in color.
In a color image, a pixel is generally represented with 24 bits, i.e. each pixel has $2^{24}$ possible colors. However, documentary archive images consist mainly of red, white, and black, so representing each pixel with 24 bits involves a great deal of redundancy. A three-valued processing method for documentary archive images is therefore proposed, which represents the image using three kinds of color information.
In this process, an image represented in the RGB model is converted into an image composed of red, white, and black. The color of each pixel in the RGB image is examined and converted to one of red, white, or black.
For example, for each pixel (i, j) in the image, define its color code $I_{ij}$, where:
$I_{ij} = 0$ means pixel (i, j) is black
$I_{ij} = 1$ means pixel (i, j) is white
$I_{ij} = 2$ means pixel (i, j) is red
Unlike the original representation, which uses 24 bits per pixel, the color code $I_{ij}$ has only three possible values {0, 1, 2}, so at most 2 bits are needed to describe a pixel.
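To make the 2-bit figure concrete, the sketch below packs four color codes into each byte, which is a 12-fold reduction from 24 bits per pixel before any run-length coding is applied. This plain packing is only an illustration of the bit count; it is not the LL format described later, and the function names are hypothetical.

```python
def pack_codes(codes):
    """Pack color codes (each 0, 1, or 2) into bytes, four codes per byte."""
    out = bytearray()
    for start in range(0, len(codes), 4):
        byte = 0
        for pos, code in enumerate(codes[start:start + 4]):
            byte |= code << (6 - 2 * pos)   # first code goes in the top two bits
        out.append(byte)
    return bytes(out)

def unpack_codes(data, count):
    """Recover the first `count` color codes from packed bytes."""
    codes = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            codes.append((byte >> shift) & 0b11)
    return codes[:count]
```

The pixel count must be stored separately, because the last byte may contain padding bits.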
The specific steps of three-valued processing for a documentary archive image are as follows:
Each pixel (i, j) in the image is scanned in turn. Let its RGB value be $(R_{ij}, G_{ij}, B_{ij})$; then:
(1) If $R_{ij} < T_R$, $G_{ij} < T_G$, and $B_{ij} < T_B$, the current pixel is black, and $I_{ij} = 0$.
(2) If $R_{ij} > \hat{T}_R$, $G_{ij} > \hat{T}_G$, and $B_{ij} > \hat{T}_B$, the current pixel is white, and $I_{ij} = 1$.
(3) If neither (1) nor (2) is satisfied, the current pixel is red, and $I_{ij} = 2$.
Here $(T_R, T_G, T_B)$ and $(\hat{T}_R, \hat{T}_G, \hat{T}_B)$ are empirical thresholds; the values used in the present invention are $(T_R, T_G, T_B) = (20, 15, 30)$ and $(\hat{T}_R, \hat{T}_G, \hat{T}_B) = (210, 225, 220)$.
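The thresholding rule can be sketched as follows, using the empirical threshold values given above. The names are hypothetical, and the white test is assumed to be a strict "greater than" comparison against the upper thresholds, mirroring the strict "less than" used for black.

```python
BLACK, WHITE, RED = 0, 1, 2

T_LOW = (20, 15, 30)      # (T_R, T_G, T_B): below all three means black
T_HIGH = (210, 225, 220)  # upper thresholds: above all three means white (assumed strict)

def ternarize_pixel(r, g, b):
    """Map one RGB pixel (components 0..255) to a color code in {0, 1, 2}."""
    if r < T_LOW[0] and g < T_LOW[1] and b < T_LOW[2]:
        return BLACK
    if r > T_HIGH[0] and g > T_HIGH[1] and b > T_HIGH[2]:
        return WHITE
    return RED  # everything that is neither black nor white

def ternarize_image(rows):
    """Apply the rule to an image given as a list of rows of (R, G, B) tuples."""
    return [[ternarize_pixel(*pixel) for pixel in row] for row in rows]
```

Note that under this rule any pixel that is neither dark enough nor bright enough, including a mid-gray such as (128, 128, 128), is coded as red. This is one reason the contrast enhancement in the preprocessing step matters.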
In Fig. 3, the left image is the documentary archive image before three-valued processing and the right image is the image after three-valued processing. After three-valued processing, every pixel in the image is one of red, black, or white.
Statistics over a large number of three-valued documentary archive images show that pixels in an image are strongly correlated. Suppose the previous pixel in the same row is red, i.e. $I_{i(j-1)} = 2$; then the conditional probability that the current pixel is red, $P(I_{ij} = 2 \mid I_{i(j-1)} = 2)$, satisfies the following inequalities:
$$P(I_{ij} = 2 \mid I_{i(j-1)} = 2) > P(I_{ij} = 1 \mid I_{i(j-1)} = 2) \quad\text{and}\quad P(I_{ij} = 2 \mid I_{i(j-1)} = 2) > P(I_{ij} = 0 \mid I_{i(j-1)} = 2)$$
Similarly,
$$P(I_{ij} = 1 \mid I_{i(j-1)} = 1) > P(I_{ij} = 2 \mid I_{i(j-1)} = 1) \quad\text{and}\quad P(I_{ij} = 1 \mid I_{i(j-1)} = 1) > P(I_{ij} = 0 \mid I_{i(j-1)} = 1)$$
$$P(I_{ij} = 0 \mid I_{i(j-1)} = 0) > P(I_{ij} = 1 \mid I_{i(j-1)} = 0) \quad\text{and}\quad P(I_{ij} = 0 \mid I_{i(j-1)} = 0) > P(I_{ij} = 2 \mid I_{i(j-1)} = 0)$$
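One way to check these inequalities on a given three-valued image is to count horizontal neighbor pairs and normalize the counts. The sketch below is illustrative only; the function name is hypothetical and the sample row is made up for the example, not taken from the patent's data.

```python
from collections import Counter

def transition_probabilities(rows):
    """Estimate P(current = c | previous = p) from horizontally adjacent pixels.

    rows is a list of rows of color codes (0 = black, 1 = white, 2 = red).
    The result maps (c, p) to the estimated conditional probability.
    """
    counts = Counter()
    for row in rows:
        for prev, cur in zip(row, row[1:]):
            counts[(prev, cur)] += 1
    probs = {}
    for prev in (0, 1, 2):
        total = sum(counts[(prev, cur)] for cur in (0, 1, 2))
        if total:
            for cur in (0, 1, 2):
                probs[(cur, prev)] = counts[(prev, cur)] / total
    return probs

# Made-up sample row: long stretches of one color, as in scanned text.
sample = [[1, 1, 1, 0, 0, 1, 1, 2, 2, 2, 1]]
probs = transition_probabilities(sample)
```

For this sample row, `probs[(1, 1)]` is 0.6 while `probs[(0, 1)]` and `probs[(2, 1)]` are 0.2 each, so a white pixel is most likely to be followed by another white pixel.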
Based on this property, an LL compression coding method is proposed. The basic idea is to represent the image by its runs, i.e. sequences of consecutive pixels of the same color within one row, rather than by isolated pixels one at a time. For documentary archive images, the number of runs is far smaller than the number of pixels, so describing the image by runs completely retains all information in the image while greatly reducing redundancy.
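The run idea can be sketched as follows: each row of color codes is turned into (color, length) pairs, and expanding the pairs restores the row exactly, which is what makes the representation lossless. The function names are hypothetical, and the bit-level layout of a run is not modeled here.

```python
def encode_row(row):
    """Convert a row of color codes into a list of (color, run_length) pairs."""
    runs = []
    for code in row:
        if runs and runs[-1][0] == code:
            runs[-1][1] += 1          # extend the current run
        else:
            runs.append([code, 1])    # start a new run
    return [tuple(run) for run in runs]

def decode_row(runs):
    """Expand (color, run_length) pairs back into the original row."""
    row = []
    for code, length in runs:
        row.extend([code] * length)
    return row
```

For example, the 10-pixel row `[1, 1, 1, 1, 0, 0, 2, 2, 2, 1]` becomes the four runs `[(1, 4), (0, 2), (2, 3), (1, 1)]`.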
To describe runs efficiently, a statistical analysis of a large number of three-valued documentary archive images is required, covering quantities such as the number of runs per row and the mean and variance of the run length. On this basis, the compression structure CS of a run is defined, as shown in Fig. 4, where each cell represents one bit. Dashed boxes indicate optional bits, whose presence depends on the run length flag (LF). The meaning of each field in CS is explained in Table 1.
The control structure BS is defined as shown in Fig. 5, where:
$D_6D_5D_4D_3D_2D_1 = 000000$ marks the start of the image data, denoted BSIMGSTART.
$D_6D_5D_4D_3D_2D_1 = 111111$ marks the end of the image data, denoted BSIMGEND.
$D_6D_5D_4D_3D_2D_1 = 000001$ marks the start of a row of data in the image, denoted BSROWSTART.
$D_6D_5D_4D_3D_2D_1 = 000010$ marks the end of a row of data in the image, denoted BSROWEND.
Table 1
Using the run compression structure CS and the control structure BS, each row of data in the image is compressed in turn. The row compression result is shown in Fig. 6, where N denotes the number of runs in a row.
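The framing of rows by control markers can be sketched at the token level as follows. The bit layout of the CS structure (Fig. 4 and Table 1) is not available in this text, so runs are represented here as symbolic `("CS", color, length)` tuples, which is an assumption made for illustration. The marker values are the six-bit codes listed above; the image-start marker is named BSIMGSTART in this sketch, following the BS prefix of the other markers.

```python
from itertools import groupby

# Six-bit control codes D6..D1 as listed in the text.
BSIMGSTART = 0b000000
BSROWSTART = 0b000001
BSROWEND = 0b000010
BSIMGEND = 0b111111

def encode_image(rows):
    """Turn rows of color codes into a flat token stream framed by BS markers."""
    tokens = [("BS", BSIMGSTART)]
    for row in rows:
        tokens.append(("BS", BSROWSTART))
        for color, group in groupby(row):
            tokens.append(("CS", color, len(list(group))))  # one token per run
        tokens.append(("BS", BSROWEND))
    tokens.append(("BS", BSIMGEND))
    return tokens

def decode_image(tokens):
    """Rebuild the rows of color codes from the token stream."""
    rows, current = [], None
    for token in tokens:
        if token[0] == "CS":
            current.extend([token[1]] * token[2])
        elif token[1] == BSROWSTART:
            current = []
        elif token[1] == BSROWEND:
            rows.append(current)
    return rows
```

A row with N runs therefore produces N run tokens between one row-start and one row-end marker, and decoding the stream returns the original rows unchanged.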
Combining statistical information such as the width and height of the documentary archive image, a new compressed file format, LL, is defined. Its composition is given in Table 2.
Table 2
[Table 2: composition of the LL compressed file; the table image is not reproduced here]
After run-length coding, the storage space occupied by a documentary archive image composed of red, white, and black is reduced further. Moreover, because LL compression coding (run-length coding) is lossless, it retains all information in the image while eliminating redundancy, and achieves a high compression ratio.
1000 documentary archive images were selected at random and compressed with the described method; the average compression ratio was 1:99. The storage space before and after compression and the compression ratio for some of the images are shown in Table 3.
Table 3
Image No.    Size before compression (KB)    Size after compression (KB)    Compression ratio
1            2489                            29                             1:86
2            3135                            39                             1:80
3            2697                            30                             1:90
4            3730                            36                             1:104
5            1447                            16                             1:90
6            1441                            10                             1:144
7            1640                            17                             1:96
8            3730                            34                             1:110
9            1387                            16                             1:87
10           2911                            29                             1:100
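The ratios in Table 3 can be reproduced from the two size columns by dividing the size before compression by the size after compression and rounding to the nearest integer. The short check below covers only the ten listed rows; the variable names are hypothetical.

```python
# (image number, size before in KB, size after in KB, reported N in "1:N") from Table 3
TABLE_3 = [
    (1, 2489, 29, 86),
    (2, 3135, 39, 80),
    (3, 2697, 30, 90),
    (4, 3730, 36, 104),
    (5, 1447, 16, 90),
    (6, 1441, 10, 144),
    (7, 1640, 17, 96),
    (8, 3730, 34, 110),
    (9, 1387, 16, 87),
    (10, 2911, 29, 100),
]

# Ratio 1:N where N is the original size divided by the compressed size, rounded.
computed = [round(before / after) for _, before, after, _ in TABLE_3]
reported = [n for _, _, _, n in TABLE_3]
mean_reported = sum(reported) / len(reported)
```

For all ten rows, `computed` equals `reported`. The mean of the ten listed ratios is 98.7; the 1:99 average quoted in the text refers to the full set of 1000 images, which is not listed.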
The scope of protection of the present invention is not limited to the above embodiments. Variations and advantages that those skilled in the art can conceive without departing from the spirit and scope of the inventive concept are all included in the present invention, and the scope of protection is defined by the appended claims.

Claims (7)

1. A documentary archive image compression method, characterized by comprising the following steps:
Preprocessing: the original image is processed to remove noise introduced during scanning;
Three-valued processing: the preprocessed image is processed so that it is represented using three kinds of color information; and
LL compression coding: lossless compression coding is applied to the image after three-valued processing.
2. The documentary archive image compression method according to claim 1, characterized in that the preprocessing step comprises the following sub-steps:
Step A1: convert the RGB color model to the HSI color model;
Step A2: in the HSI model, enhance the I component using a grayscale image enhancement method;
Step A3: convert the result back to the RGB model.
3. The documentary archive image compression method according to claim 1, characterized in that, in the three-valued processing step, the three kinds of color information are red, white, and black.
4. The documentary archive image compression method according to claim 1, characterized in that, in the three-valued processing step:
Each pixel (i, j) in the image is scanned in turn. Let its RGB value be $(R_{ij}, G_{ij}, B_{ij})$; then:
If $R_{ij} < T_R$, $G_{ij} < T_G$, and $B_{ij} < T_B$, the current pixel is black, and $I_{ij} = 0$;
If $R_{ij} > \hat{T}_R$, $G_{ij} > \hat{T}_G$, and $B_{ij} > \hat{T}_B$, the current pixel is white, and $I_{ij} = 1$;
Otherwise, the current pixel is red, and $I_{ij} = 2$;
where $(T_R, T_G, T_B)$ and $(\hat{T}_R, \hat{T}_G, \hat{T}_B)$ are empirical thresholds, and $I_{ij}$ is the color code of each pixel after three-valued processing, which takes one of three values: {0, 1, 2}.
5. The documentary archive image compression method according to claim 4, characterized in that the empirical thresholds are $(T_R, T_G, T_B) = (20, 15, 30)$ and $(\hat{T}_R, \hat{T}_G, \hat{T}_B) = (210, 225, 220)$.
6. The documentary archive image compression method according to claim 1, characterized in that, in the LL compression coding step, the image is described using runs, where a run is a sequence of consecutive pixels of the same color within one row of the image.
7. The documentary archive image compression method according to claim 6, characterized in that an LL compressed file is generated in the LL compression coding step, the LL compressed file comprising: an LL file header, an image data start flag, row compression data, and an image data end flag.
CN201210496941.XA 2012-11-29 2012-11-29 Documentary archive image compressing method Expired - Fee Related CN103024246B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201210496941.XA CN103024246B (en) 2012-11-29 2012-11-29 Documentary archive image compressing method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201210496941.XA CN103024246B (en) 2012-11-29 2012-11-29 Documentary archive image compressing method

Publications (2)

Publication Number Publication Date
CN103024246A true CN103024246A (en) 2013-04-03
CN103024246B CN103024246B (en) 2015-06-24

Family

ID=47972346

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201210496941.XA Expired - Fee Related CN103024246B (en) 2012-11-29 2012-11-29 Documentary archive image compressing method

Country Status (1)

Country Link
CN (1) CN103024246B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1696977A (en) * 2005-05-26 2005-11-16 无敌科技(西安)有限公司 Method for compressing image
EP2144432A1 (en) * 2008-07-08 2010-01-13 Panasonic Corporation Adaptive color format conversion and deconversion
CN101835045A (en) * 2010-05-05 2010-09-15 哈尔滨工业大学 Hi-fidelity remote sensing image compression and resolution ratio enhancement joint treatment method

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
段崇雯等: "一种基于二值化和亚采样的文本图像压缩方法", 《计算机应用》 *
郑江云等: "基于RGB 灰度值缩放的彩色图像增强", 《计算机功能》 *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104113757A (en) * 2013-04-22 2014-10-22 英特尔公司 Color Buffer Compression
CN106791866A (en) * 2015-11-24 2017-05-31 潘晓虹 A kind of four are worth image compress processing method
CN109684895A (en) * 2018-12-06 2019-04-26 苏州易泰勒电子科技有限公司 A kind of three value image processing methods for electronic display tag
CN109684895B (en) * 2018-12-06 2022-03-18 苏州易泰勒电子科技有限公司 Ternary image processing method for electronic display label

Also Published As

Publication number Publication date
CN103024246B (en) 2015-06-24

Similar Documents

Publication Publication Date Title
Douak et al. Color image compression algorithm based on the DCT transform combined to an adaptive block scanning
CN101371583B (en) Method and device of high dynamic range coding / decoding
CN102523367B (en) Real time imaging based on many palettes compresses and method of reducing
CN103458242B (en) Method for compressing image based on color classification Yu cluster
US8780996B2 (en) System and method for encoding and decoding video data
DE202012013410U1 (en) Image compression with SUB resolution images
EP0833519B1 (en) Segmentation and background suppression in JPEG-compressed images using encoding cost data
DE102018118362A1 (en) SYSTEMS AND METHOD FOR THE EFFICIENT AND LOSS-FREE COMPRESSION OF COLLECTED RAW PICTURE DATA
CN107301194B (en) Compressed storage and release method of tile type grid map
DE112012006541B4 (en) Video compression method
CN103957426A (en) RGB565 true color image lossy compression and decompression method
CN103763558A (en) Texture image compression method based on image similarities
US9153017B1 (en) System and method for optimized chroma subsampling
CN103024246A (en) Documentary archive image compressing method
Rajagukguk et al. Compression of Color Image Using Quantization Method
CN102724381B (en) Bill image compression method based on JPEG (joint photographic experts group) compression principle
CN102256126A (en) Method for coding mixed image
CN107682699A (en) A kind of nearly Lossless Image Compression method
Zheng et al. A novel gray image representation using overlapping rectangular NAM and extended shading approach
CN103761753B (en) Decompression method based on texture image similarity
Schilling et al. Feature-preserving image coding for very low bit rates
Poolakkachalil et al. Comparative analysis of lossless compression techniques in efficient DCT-based image compression system based on Laplacian Transparent Composite Model and An Innovative Lossless Compression Method for Discrete-Color Images
CN108184113B (en) Image compression coding method and system based on inter-image reference
CN106296754B (en) Show data compression method and display data processing system
El-Omari et al. Text-image segmentation and compression using adaptive statistical block based approach

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20150624

Termination date: 20171129
