CN103024246B - Documentary archive image compressing method - Google Patents

Documentary archive image compressing method

Info

Publication number
CN103024246B
CN103024246B CN201210496941.XA CN201210496941A CN103024246B CN 103024246 B CN103024246 B CN 103024246B CN 201210496941 A CN201210496941 A CN 201210496941A CN 103024246 B CN103024246 B CN 103024246B
Authority
CN
China
Prior art keywords
image
color
pixel
valued
run
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201210496941.XA
Other languages
Chinese (zh)
Other versions
CN103024246A (en)
Inventor
吕岳
刘丽
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
East China Normal University
Original Assignee
East China Normal University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by East China Normal University
Priority to CN201210496941.XA
Publication of CN103024246A
Application granted
Publication of CN103024246B
Expired - Fee Related
Anticipated expiration

Landscapes

  • Compression Of Band Width Or Redundancy In Fax (AREA)
  • Image Processing (AREA)

Abstract

The invention discloses a documentary archive image compressing method. The method comprises the following steps: pre-processing, i.e., processing an original image to remove noise introduced during scanning; three-valued processing, i.e., processing the pre-processed image so that it is represented by three types of color information; and LL (lossless) compression coding processing, i.e., applying lossless compression coding to the image obtained from three-valued processing. The method keeps all useful information in the documentary archive image while eliminating redundancy and obtaining a high compression ratio.

Description

A documentary archive image compression method
Technical field
The invention belongs to the field of document testing, and in particular relates to a documentary archive image compression method.
Background art
To meet the demands of informatization, large numbers of paper documentary archives are scanned and stored electronically. On the one hand, this avoids damage caused by improper storage; on the other hand, existing information technology makes electronic archives easy to manage and search, greatly saving manpower and material resources. Scanned paper archives are usually stored on a computer in a color image format, and how to compress these images to save storage space is a pressing problem.
In recent years, many researchers have studied the image compression problem and proposed a variety of compression coding methods, which fall mainly into lossless compression coding and lossy compression coding. In lossless compression coding, decoding the compressed data yields data identical to the original. Commonly used lossless algorithms include the Huffman algorithm and the LZW (Lempel-Ziv-Welch) algorithm. In lossy compression coding, the data obtained after compression and decompression differ from the original data but remain close to them; the widely used JPEG compression coding method is an example. JPEG compression is based on the discrete cosine transform (DCT): the image is first divided into non-overlapping blocks, and a DCT is applied to each block. The transformed coefficients are quantized according to a quantization table, the quantized coefficients are reordered by zigzag scanning, and run-length coding, arithmetic coding, or Huffman coding is then applied.
Documentary archives are generally used to provide important historical information and have legal significance. This characteristic means that a lossless compression method should be chosen as the coding scheme.
Summary of the invention
The present invention overcomes defects of the prior art, such as the large storage space required for images and the image distortion caused by lossy compression, and proposes a documentary archive image compression method.
The documentary archive image compression method proposed by the present invention comprises the following steps:
Preprocessing, which processes the original image to remove noise introduced during scanning;
Three-valued processing, which processes the preprocessed image so that it is represented using three kinds of color information; and
LL compression coding, which applies compression coding to the image obtained from three-valued processing.
The preprocessing step comprises the following sub-steps:
Step A1: convert the RGB color model to the HSI color model;
Step A2: enhance the I component of the HSI model using a grayscale image enhancement method;
Step A3: convert the result back to the RGB model.
In the three-valued processing step, the three kinds of color information are red information, white information, and black information.
In the three-valued processing step:
Each pixel $(i, j)$ in the image is scanned in turn. Let its RGB value be $(R_{ij}, G_{ij}, B_{ij})$; then:
If $R_{ij} < T_R$, $G_{ij} < T_G$, and $B_{ij} < T_B$, the current pixel is black, and $I_{ij} = 0$;
If $R_{ij} > \hat{T}_R$, $G_{ij} > \hat{T}_G$, and $B_{ij} > \hat{T}_B$, the current pixel is white, and $I_{ij} = 1$;
Otherwise, the current pixel is red, and $I_{ij} = 2$;
Here $(T_R, T_G, T_B)$ and $(\hat{T}_R, \hat{T}_G, \hat{T}_B)$ are empirical thresholds, and $I_{ij}$ is the color label of each pixel after three-valued processing, which takes one of three values: $\{0, 1, 2\}$.
The empirical thresholds are $(T_R, T_G, T_B) = (20, 15, 30)$ and $(\hat{T}_R, \hat{T}_G, \hat{T}_B) = (210, 225, 220)$.
In the LL compression coding step, the image is described by runs, where a run is a sequence of consecutive pixels of the same color within one row of the image.
In the LL compression coding step, an LL compressed file is generated, which comprises an LL file header, an image data start flag, row compressed data, and an image data end flag.
Based on the correlation between pixels in the three-valued image, the technical solution of the present invention proposes a run-based LL lossless compression coding method for images.
The present invention applies lossless compression coding to the image, retaining all useful information in it and avoiding the loss of needed information through distortion when the image is restored.
Because documentary archive images consist mainly of three colors, the present invention applies three-valued processing to the scanned image so that the processed image content better conforms to the conventions of documentary archives.
The present invention adopts a run-based lossless compression coding method that eliminates redundancy, gives the compressed image a high compression ratio, and reduces the storage space occupied, which facilitates the preservation of documentary archive images.
Brief description of the drawings
Fig. 1 shows the documentary archive image compression flow;
Fig. 2 shows a documentary archive image before and after preprocessing;
Fig. 3 shows a documentary archive image before and after three-valued processing;
Fig. 4 shows the run compression structure CS;
Fig. 5 shows the control structure BS;
Fig. 6 shows the row compression result for a documentary archive image.
Detailed description
The present invention is described in further detail below with reference to specific embodiments and the drawings. Except for the content specifically mentioned below, the processes, conditions, experimental methods, and the like used to implement the invention are general knowledge and common practice in the art, and the invention places no particular limitation on them.
The documentary archive image compression method of the present invention comprises preprocessing, three-valued processing, and LL compression coding, as shown in Fig. 1.
The purpose of preprocessing is to process the original image: by enhancing the contrast of the picture, it filters out the noise introduced by scanning and makes the picture clearer.
Three-valued processing converts the type of the preprocessed image. Because documentary archive images are mostly red, white, and black, three-valued processing assigns every other color in the image to one of these three, converting the image to a red, white, and black three-color format.
LL compression coding applies run-length coding to the image obtained from three-valued processing. Run-length coding is lossless and therefore retains all information in the image. It also eliminates redundancy and achieves a very high compression ratio, reducing the storage space of the image file.
Embodiment:
Documentary archives are often affected by noise during scanning, which interferes with subsequent processing, so the image needs to be preprocessed. Preferably, the present invention uses a color image enhancement method to improve the visual quality of the image. To process a color image, a suitable color model must first be selected. Commonly used color models include RGB, YUV, and HSI. The most commonly used of these, the RGB color model, is tied to display systems: computer displays use RGB to show color. It is a mixing color model in which colors are obtained by mixing the three primaries red (R), green (G), and blue (B) in given proportions. The RGB model is based on a Cartesian coordinate system whose three axes are R, G, and B. The HSI color model describes color by hue H, saturation S, and intensity I, where hue and saturation mainly describe the color information and intensity represents the strength of the light. This model has two features: (1) the I component is independent of the color information of the image; (2) the H and S components are closely tied to the way people perceive color, in keeping with the characteristics of human vision. In the present embodiment, preprocessing is carried out in the HSI color model.
The conversions between the RGB model and the HSI model are as follows:
(1) RGB to HSI
$$I = \frac{1}{3}(R + G + B)$$
$$S = 1 - \frac{3}{R + G + B}\min(R, G, B)$$
$$H = \arccos\left\{ \frac{[(R - G) + (R - B)]/2}{[(R - G)^2 + (R - B)(G - B)]^{1/2}} \right\}$$
(2) HSI to RGB
When $H \in [0^\circ, 120^\circ]$,
$$B = I(1 - S)$$
$$R = I\left[1 + \frac{S\cos H}{\cos(60^\circ - H)}\right]$$
$$G = 3I - (B + R)$$
When $H \in [120^\circ, 240^\circ]$,
$$R = I(1 - S)$$
$$G = I\left[1 + \frac{S\cos(H - 120^\circ)}{\cos(180^\circ - H)}\right]$$
$$B = 3I - (R + G)$$
When $H \in [240^\circ, 360^\circ]$,
$$G = I(1 - S)$$
$$B = I\left[1 + \frac{S\cos(H - 240^\circ)}{\cos(300^\circ - H)}\right]$$
$$R = 3I - (G + B)$$
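As an illustration only, the conversion formulas above can be sketched in Python. The function names `rgb_to_hsi` and `hsi_to_rgb` are hypothetical, the components are assumed to be normalized to [0, 1], and H is expressed in degrees. The saturation is computed as 1 - min(R, G, B) / I, which is what makes B = I(1 - S) invert cleanly. The rule H = 360 - theta when B > G is the usual textbook convention and is an assumption here, since the formulas above do not state it.

```python
import math


def rgb_to_hsi(r, g, b):
    """Convert RGB components in [0, 1] to (H in degrees, S, I)."""
    total = r + g + b
    i = total / 3.0
    if total == 0:
        return 0.0, 0.0, 0.0  # pure black: hue and saturation undefined, use 0
    s = 1.0 - 3.0 * min(r, g, b) / total
    num = 0.5 * ((r - g) + (r - b))
    den = math.sqrt((r - g) ** 2 + (r - b) * (g - b))
    if den == 0:
        return 0.0, s, i  # gray pixel: hue undefined, use 0
    ratio = max(-1.0, min(1.0, num / den))  # guard against rounding error
    theta = math.degrees(math.acos(ratio))
    # Assumption: standard convention for hues past 180 degrees.
    h = 360.0 - theta if b > g else theta
    return h, s, i


def hsi_to_rgb(h, s, i):
    """Convert (H in degrees, S, I) back to RGB using the three sectors."""
    h = h % 360.0

    def sector(offset):
        # Returns (smallest channel, channel given by the cosine formula, remainder).
        low = i * (1.0 - s)
        high = i * (1.0 + s * math.cos(math.radians(offset))
                    / math.cos(math.radians(60.0 - offset)))
        return low, high, 3.0 * i - (low + high)

    if h < 120.0:
        b, r, g = sector(h)
    elif h < 240.0:
        r, g, b = sector(h - 120.0)
    else:
        g, b, r = sector(h - 240.0)
    return r, g, b
```

Converting a color to HSI and back should reproduce the original RGB triple up to floating-point rounding, which is a convenient sanity check on the three sector formulas.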
The specific preprocessing steps are as follows:
Step A1: convert the RGB color model to the HSI color model, using the conversion formulas above.
Step A2: enhance the I component of the HSI model using a grayscale image enhancement method.
Step A3: convert the result back to the RGB model. The converted RGB image has higher contrast than the original RGB image, is clearer and more striking, and renders colors more vividly.
Fig. 2 shows a documentary archive image before and after preprocessing, which improves its contrast. The left image in Fig. 2 is the image before preprocessing, and the right image is the image after preprocessing.
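Step A2 does not name a specific grayscale enhancement method. As one possible stand-in, the sketch below applies a simple min-max contrast stretch to the I component; this choice, and the function name `stretch_intensity`, are assumptions made for illustration and are not taken from the patent.

```python
def stretch_intensity(intensities):
    """Linearly rescale a list of I values so they span the full range [0, 1].

    This is a generic contrast stretch, used here only as an example of a
    grayscale enhancement that could be applied to the I component.
    """
    lo, hi = min(intensities), max(intensities)
    if hi == lo:
        return list(intensities)  # flat image: nothing to stretch
    return [(v - lo) / (hi - lo) for v in intensities]
```

After the I values are changed, converting back to RGB can produce components slightly outside the valid range, so an implementation would probably need to clip the result.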
In a color image, a pixel is generally represented with 24 bits, so each pixel has $2^{24}$ possible colors. Documentary archive images, however, consist mainly of three colors (red, white, and black), so representing each pixel with 24 bits carries a great deal of redundancy. A three-valued processing method for documentary archive images is therefore proposed, which represents the image using three kinds of color information.
In this process, the image represented in the RGB model is converted to an image composed of the three colors red, white, and black. The color of each pixel in the RGB image is examined and converted to one of red, white, or black.
For each pixel $(i, j)$ in the image, a color label $I_{ij}$ is defined, where:
$I_{ij} = 0$ indicates that pixel $(i, j)$ is black
$I_{ij} = 1$ indicates that pixel $(i, j)$ is white
$I_{ij} = 2$ indicates that pixel $(i, j)$ is red
Whereas a pixel was originally represented with 24 bits, the color label $I_{ij}$ has only three possible values $\{0, 1, 2\}$, so at most 2 bits are needed to describe the pixel.
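To make the 2-bit figure concrete, the sketch below packs four color labels into each byte, a twelvefold reduction relative to 24 bits per pixel before any run-length coding. The packing layout and the names `pack_codes` and `unpack_codes` are hypothetical and are not the LL format described in this patent.

```python
def pack_codes(codes):
    """Pack color labels (each 0, 1, or 2) into bytes, four labels per byte."""
    out = bytearray()
    for start in range(0, len(codes), 4):
        byte = 0
        for k, code in enumerate(codes[start:start + 4]):
            if code not in (0, 1, 2):
                raise ValueError("color label must be 0, 1, or 2")
            byte |= code << (6 - 2 * k)  # first label goes in the top two bits
        out.append(byte)
    return bytes(out)


def unpack_codes(data, count):
    """Recover the first `count` color labels from packed bytes."""
    codes = []
    for byte in data:
        for k in range(4):
            codes.append((byte >> (6 - 2 * k)) & 0b11)
    return codes[:count]
```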
The specific steps of three-valued processing of a documentary archive image are as follows:
Each pixel $(i, j)$ in the image is scanned in turn. Let its RGB value be $(R_{ij}, G_{ij}, B_{ij})$; then:
(1) If $R_{ij} < T_R$, $G_{ij} < T_G$, and $B_{ij} < T_B$, the current pixel is black, and $I_{ij} = 0$.
(2) If $R_{ij} > \hat{T}_R$, $G_{ij} > \hat{T}_G$, and $B_{ij} > \hat{T}_B$, the current pixel is white, and $I_{ij} = 1$.
(3) If neither (1) nor (2) holds, the current pixel is red, and $I_{ij} = 2$.
Here $(T_R, T_G, T_B)$ and $(\hat{T}_R, \hat{T}_G, \hat{T}_B)$ are empirical thresholds; the values used in the present invention are $(T_R, T_G, T_B) = (20, 15, 30)$ and $(\hat{T}_R, \hat{T}_G, \hat{T}_B) = (210, 225, 220)$.
The left image in Fig. 3 is the documentary archive image before three-valued processing, and the right image is the image after three-valued processing. After three-valued processing, the color of every pixel in the image is one of red, black, or white.
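The three-valued rule can be sketched as follows, using the threshold values given above for 8-bit RGB channels. The names `classify_pixel` and `ternarize` are hypothetical, and the white test is written as a strict "greater than" comparison against the upper thresholds, which is an assumption about the direction and strictness of that comparison.

```python
BLACK, WHITE, RED = 0, 1, 2

# Empirical thresholds quoted in the text, for 8-bit RGB channels.
T_LOW = (20, 15, 30)
T_HIGH = (210, 225, 220)


def classify_pixel(r, g, b, low=T_LOW, high=T_HIGH):
    """Return the color label (0 black, 1 white, 2 red) for one RGB pixel."""
    if r < low[0] and g < low[1] and b < low[2]:
        return BLACK
    # Assumption: white means every channel exceeds its upper threshold.
    if r > high[0] and g > high[1] and b > high[2]:
        return WHITE
    return RED  # everything that is neither black nor white


def ternarize(image):
    """Apply classify_pixel to an image given as rows of (R, G, B) tuples."""
    return [[classify_pixel(*pixel) for pixel in row] for row in image]
```

Note that under this rule any pixel that is neither dark enough nor bright enough, including mid-gray, is labeled red, so the quality of the result depends on the preprocessing step having pushed text and background toward the extremes.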
Statistics gathered over a large number of three-valued documentary archive images show a strong correlation between pixels in an image. Suppose the previous pixel in the same row is red, i.e. $I_{i(j-1)} = 2$; then the conditional probability that the current pixel is red, $P(I_{ij} = 2 \mid I_{i(j-1)} = 2)$, satisfies the following inequalities:
$$P(I_{ij}=2 \mid I_{i(j-1)}=2) > P(I_{ij}=1 \mid I_{i(j-1)}=2) \quad\text{and}\quad P(I_{ij}=2 \mid I_{i(j-1)}=2) > P(I_{ij}=0 \mid I_{i(j-1)}=2)$$
Similarly,
$$P(I_{ij}=1 \mid I_{i(j-1)}=1) > P(I_{ij}=2 \mid I_{i(j-1)}=1) \quad\text{and}\quad P(I_{ij}=1 \mid I_{i(j-1)}=1) > P(I_{ij}=0 \mid I_{i(j-1)}=1)$$
$$P(I_{ij}=0 \mid I_{i(j-1)}=0) > P(I_{ij}=1 \mid I_{i(j-1)}=0) \quad\text{and}\quad P(I_{ij}=0 \mid I_{i(j-1)}=0) > P(I_{ij}=2 \mid I_{i(j-1)}=0)$$
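One way such conditional probabilities might be estimated from three-valued images is to count horizontally adjacent label pairs. The sketch below is illustrative only; the function name `transition_probabilities` and the dictionary layout keyed by (current, previous) are assumptions.

```python
from collections import Counter


def transition_probabilities(rows):
    """Estimate P(current | previous) for horizontally adjacent color labels.

    `rows` is an iterable of rows, each a list of labels in {0, 1, 2}.
    Returns a dict mapping (current, previous) to the estimated probability.
    """
    counts = Counter()
    for row in rows:
        for prev, cur in zip(row, row[1:]):
            counts[(prev, cur)] += 1
    probs = {}
    for prev in (0, 1, 2):
        total = sum(counts[(prev, cur)] for cur in (0, 1, 2))
        if total == 0:
            continue  # this label never appears as a left neighbor
        for cur in (0, 1, 2):
            probs[(cur, prev)] = counts[(prev, cur)] / total
    return probs
```

If the inequalities above hold for an image, the entry for (c, c) is the largest among the three entries sharing the same previous label c.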
Based on this property, an LL compression coding method is proposed. The basic idea is to represent the image using runs, i.e. sequences of consecutive pixels of the same color within a row, rather than isolated individual pixels. For a documentary archive image, the number of runs is far smaller than the number of pixels, so describing the image by runs retains all information in the image while greatly reducing redundancy.
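The idea of describing a row by its runs can be sketched as follows. This shows only the run decomposition and its exact inverse, not the bit-level CS structure of the LL format; the names `row_to_runs` and `runs_to_row` are hypothetical.

```python
from itertools import groupby


def row_to_runs(row):
    """Split a row of color labels into (label, run length) pairs."""
    return [(label, len(list(group))) for label, group in groupby(row)]


def runs_to_row(runs):
    """Rebuild the original row from (label, run length) pairs."""
    row = []
    for label, length in runs:
        row.extend([label] * length)
    return row
```

Because `runs_to_row(row_to_runs(row))` returns the original row exactly, the representation is lossless; the saving comes from a row of mostly white pixels collapsing into a handful of pairs.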
To describe runs effectively, a large number of three-valued documentary archive images must be analyzed statistically, for example the number of runs per row and the mean and variance of the run length. On this basis, the run compression structure CS is defined,
as shown in Fig. 4, where each cell represents one bit. A dashed box indicates an optional field whose presence depends on the run-length flag bit (LF). The meaning of each field in CS is explained in Table 1.
The control structure BS is defined as shown in Fig. 5, where:
$D_6D_5D_4D_3D_2D_1 = 000000$ marks the start of the image data and is denoted BSIMGSTART.
$D_6D_5D_4D_3D_2D_1 = 111111$ marks the end of the image data and is denoted BSIMGEND.
$D_6D_5D_4D_3D_2D_1 = 000001$ marks the start of a row of data in the image and is denoted BSROWSTART.
$D_6D_5D_4D_3D_2D_1 = 000010$ marks the end of a row of data in the image and is denoted BSROWEND.
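The four control codes can be written down as constants, and the way they bracket row data can be sketched at the token level. The 6-bit values and the names (BSIMGSTART as spelled in claim 1) come from the text; the `frame_rows` helper and the representation of the stream as a Python list of tokens are hypothetical, since the bit layout that distinguishes BS from CS is defined in Fig. 5, which is not reproduced here.

```python
from enum import IntEnum


class BS(IntEnum):
    """Six-bit control codes D6..D1 of the control structure."""
    BSIMGSTART = 0b000000  # start of image data
    BSIMGEND = 0b111111    # end of image data
    BSROWSTART = 0b000001  # start of one row of data
    BSROWEND = 0b000010    # end of one row of data


def frame_rows(encoded_rows):
    """Wrap already-encoded rows with row and image markers (token level only)."""
    stream = [BS.BSIMGSTART]
    for row in encoded_rows:
        stream.append(BS.BSROWSTART)
        stream.extend(row)
        stream.append(BS.BSROWEND)
    stream.append(BS.BSIMGEND)
    return stream
```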
Table 1
Using the run compression structure CS and the control structure BS, each row of data in the image is compressed in turn. The row compression result is shown in Fig. 6, where N denotes the number of runs in a row.
Taking into account statistical information such as the width and height of the documentary archive image, a new compressed file format, LL, is defined; its composition is given in the following table:
Table 2
After run-length coding, the storage space occupied by a documentary archive image composed of red, white, and black is further reduced. Moreover, because LL compression coding (run-length coding) is a lossless compression coding method, it retains all information in the image while eliminating redundancy and obtaining a high compression ratio.
1000 randomly selected documentary archive images were compressed using the described method, and the average compression factor was 1:99. The storage space occupied before and after compression and the compression factor for some of the images are shown in Table 3.
Table 3
Image no.   Size before compression (KB)   Size after compression (KB)   Compression factor
1           2489                           29                            1:86
2           3135                           39                            1:80
3           2697                           30                            1:90
4           3730                           36                            1:104
5           1447                           16                            1:90
6           1441                           10                            1:144
7           1640                           17                            1:96
8           3730                           34                            1:110
9           1387                           16                            1:87
10          2911                           29                            1:100
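The compression factors in Table 3 are consistent with dividing the size before compression by the size after compression and rounding to the nearest integer. The short check below reproduces them from the listed sizes; the helper name `compression_factor` is hypothetical, and the rounding rule is inferred from the table rather than stated in the text.

```python
# (image number, size before in KB, size after in KB), copied from Table 3.
TABLE_3 = [
    (1, 2489, 29), (2, 3135, 39), (3, 2697, 30), (4, 3730, 36), (5, 1447, 16),
    (6, 1441, 10), (7, 1640, 17), (8, 3730, 34), (9, 1387, 16), (10, 2911, 29),
]


def compression_factor(before_kb, after_kb):
    """Return n such that the compression factor is written 1:n."""
    return round(before_kb / after_kb)


factors = [compression_factor(before, after) for _, before, after in TABLE_3]
total_before = sum(before for _, before, _ in TABLE_3)
total_after = sum(after for _, _, after in TABLE_3)
```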
The scope of protection of the present invention is not limited to the above embodiments. Variations and advantages that those skilled in the art can conceive of without departing from the spirit and scope of the inventive concept are all included in the present invention, and the scope of protection is defined by the appended claims.

Claims (5)

1. A documentary archive image compression method, characterized in that it comprises the following steps:
preprocessing, which processes the original image to remove noise introduced during scanning;
three-valued processing, which processes the preprocessed image so that it is represented using three kinds of color information; and
LL compression coding, which applies lossless compression coding to the image obtained from three-valued processing;
wherein,
in the three-valued processing step:
each pixel $(i, j)$ in the image is scanned in turn, and with its RGB value denoted $(R_{ij}, G_{ij}, B_{ij})$:
if $R_{ij} < T_R$, $G_{ij} < T_G$, and $B_{ij} < T_B$, the current pixel is black, and $I_{ij} = 0$;
if $R_{ij} > \hat{T}_R$, $G_{ij} > \hat{T}_G$, and $B_{ij} > \hat{T}_B$, the current pixel is white, and $I_{ij} = 1$;
otherwise, the current pixel is red, and $I_{ij} = 2$;
where $(T_R, T_G, T_B)$ and $(\hat{T}_R, \hat{T}_G, \hat{T}_B)$ are empirical thresholds, and $I_{ij}$ is the color label of each pixel after three-valued processing, which takes one of three values: $\{0, 1, 2\}$;
in the LL compression coding step: the image is described by runs, where a run is a sequence of consecutive pixels of the same color within one row of the image; in describing the image by runs, statistical analysis is performed on three-valued documentary archive images, and a run compression structure and a control structure are defined; using the run compression structure and the control structure, each row of data in the image is compressed in turn;
wherein the compression structure comprises a pixel flag, a run-length flag, and a run length; and the control structure comprises the following:
$D_6D_5D_4D_3D_2D_1 = 000000$ marks the start of the image data and is denoted BSIMGSTART;
$D_6D_5D_4D_3D_2D_1 = 111111$ marks the end of the image data and is denoted BSIMGEND;
$D_6D_5D_4D_3D_2D_1 = 000001$ marks the start of a row of data in the image and is denoted BSROWSTART;
$D_6D_5D_4D_3D_2D_1 = 000010$ marks the end of a row of data in the image and is denoted BSROWEND.
2. The documentary archive image compression method according to claim 1, characterized in that the preprocessing step comprises the following sub-steps:
Step A1: convert the RGB color model to the HSI color model;
Step A2: enhance the I component of the HSI model using a grayscale image enhancement method;
Step A3: convert the result back to the RGB model.
3. The documentary archive image compression method according to claim 1, characterized in that, in the three-valued processing step, the three kinds of color information are red information, white information, and black information.
4. The documentary archive image compression method according to claim 1, characterized in that the empirical thresholds are $(T_R, T_G, T_B) = (20, 15, 30)$ and $(\hat{T}_R, \hat{T}_G, \hat{T}_B) = (210, 225, 220)$.
5. The documentary archive image compression method according to claim 1, characterized in that, in the LL compression coding step, an LL compressed file is generated, which comprises an LL file header, an image data start flag, row compressed data, and an image data end flag.
CN201210496941.XA 2012-11-29 2012-11-29 Documentary archive image compressing method Expired - Fee Related CN103024246B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201210496941.XA CN103024246B (en) 2012-11-29 2012-11-29 Documentary archive image compressing method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201210496941.XA CN103024246B (en) 2012-11-29 2012-11-29 Documentary archive image compressing method

Publications (2)

Publication Number Publication Date
CN103024246A CN103024246A (en) 2013-04-03
CN103024246B true CN103024246B (en) 2015-06-24

Family

ID=47972346

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201210496941.XA Expired - Fee Related CN103024246B (en) 2012-11-29 2012-11-29 Documentary archive image compressing method

Country Status (1)

Country Link
CN (1) CN103024246B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9582847B2 (en) * 2013-04-22 2017-02-28 Intel Corporation Color buffer compression
CN106791866A (en) * 2015-11-24 2017-05-31 潘晓虹 A kind of four are worth image compress processing method
CN109684895B (en) * 2018-12-06 2022-03-18 苏州易泰勒电子科技有限公司 Ternary image processing method for electronic display label

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1696977A (en) * 2005-05-26 2005-11-16 无敌科技(西安)有限公司 Method for compressing image
EP2144432A1 (en) * 2008-07-08 2010-01-13 Panasonic Corporation Adaptive color format conversion and deconversion
CN101835045A (en) * 2010-05-05 2010-09-15 哈尔滨工业大学 Hi-fidelity remote sensing image compression and resolution ratio enhancement joint treatment method

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
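A text image compression method based on binarization and subsampling; Duan Chongwen et al.; 《计算机应用》; 20050128; Vol. 25 (No. 01); pp. 93-95 *
Color image enhancement based on RGB gray-value scaling; Zheng Jiangyun et al.; 《计算机功能》; 20120131; Vol. 38 (No. 2); pp. 226-228 *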

Also Published As

Publication number Publication date
CN103024246A (en) 2013-04-03

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20150624

Termination date: 20171129