CN107491730A - A laboratory test report recognition method based on image processing - Google Patents

Info

Publication number
CN107491730A
Authority
CN
China
Prior art keywords
test report
laboratory test
image
lab work
single image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201710575858.4A
Other languages
Chinese (zh)
Inventor
尹建伟
岑超
赵景晨
邓水光
李莹
吴健
吴朝晖
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang University ZJU
Original Assignee
Zhejiang University ZJU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang University ZJU
Priority to CN201710575858.4A
Publication of CN107491730A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/40 Document-oriented image-based pattern recognition
    • G06V30/41 Analysis of document content
    • G06V30/412 Layout analysis of documents structured with printed lines or input boxes, e.g. business forms or tables
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/28 Determining representative reference patterns, e.g. by averaging or distorting; Generating dictionaries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/24 Aligning, centring, orientation detection or correction of the image
    • G06V10/243 Aligning, centring, orientation detection or correction of the image by compensating for image skew or non-uniform image deformations
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/255 Detecting or recognising potential candidate objects based on visual cues, e.g. shapes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/26 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V10/267 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/44 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/48 Extraction of image or video features by mapping characteristic values of the pattern into a parameter space, e.g. Hough transformation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computational Linguistics (AREA)
  • Databases & Information Systems (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a laboratory test report recognition method based on image processing. Based on an investigation and analysis of the structure of laboratory test reports, it designs a set of algorithms that accurately segment each region of a report and clean it up effectively, and it standardizes and subdivides the step-by-step processing that turns a report photo taken with a mobile phone into a clear image, which is then recognized with a mature open-source OCR engine. Every stage of the image processing flow has been carefully considered and its performance optimized, improving the efficiency of image processing. After recognition, a test item dictionary built from a test item information database is used to select the recognition engine model automatically and to correct the recognition results intelligently, improving the accuracy of the recognition results.

Description

A laboratory test report recognition method based on image processing
Technical field
The invention belongs to the field of medical OCR technology, and in particular relates to a laboratory test report recognition method based on image processing.
Background
OCR (Optical Character Recognition) refers to analyzing and recognizing image files of textual material to obtain the text and layout information. It uses optical and computer technology to read the text in an image and convert it into a form a computer can understand. OCR is a key technology for high-speed text entry.
China's research on OCR started relatively late: work on recognizing digits, English letters and symbols began in the 1970s, and research on Chinese character recognition began in the late 1970s. Since the 1970s, OCR has been widely applied in news, printing, publishing, libraries, office automation and other industries, greatly improving the efficiency and accuracy of form and document processing and saving manpower, material and financial resources.
Printed-text OCR has now reached a high level: the recognition rate for printed Chinese characters exceeds 98%, and even for poorly printed text it exceeds 95%. With the spread of smartphones and the continuing development of the mobile app ecosystem, OCR is also used in mobile apps, for example document recognition, bank card recognition, receipt recognition, business card recognition, passport recognition and identity card recognition.
Electronic storage of documents not only avoids the trouble of carrying and losing paper documents but also makes data analysis more convenient. Today, hospitals, clinics, pharmaceutical companies and health institutions, as well as scholars researching health, increasingly rely on data to make decisions and judgments. Big data is bringing fundamental change to healthcare, and all of it depends on data storage. Although hospitals have their own structured data, the data cannot easily be transferred between departments; moreover, a patient usually cannot be guaranteed to have all examinations done at the same hospital. In short, under current conditions, entering laboratory test report data remains an unavoidable prerequisite.
OCR is an efficient and feasible technical solution. Delivered through a mobile app, it greatly lowers the barrier to use, and having users enter their own data saves substantial labor cost. Users, however, will not enter their own report information for no reason, so a corresponding service must be provided as an incentive. After a routine examination at a hospital, people usually want to understand what the measured indicators say about their physical condition, but have no authoritative doctor or medical team to interpret the report. Report recognition as the tool and report interpretation as the service therefore complement each other and meet an urgent market demand.
Summary of the invention
In view of the above, the invention provides a laboratory test report recognition method based on image processing that can automatically recognize the content of a laboratory test report efficiently and accurately.
A laboratory test report recognition method based on image processing comprises the following steps:
(1) Perform edge recognition on the laboratory test report photo taken with a mobile phone to obtain the quadrilateral contour of the report;
(2) Crop and rectify the photo by perspective transform based on the quadrilateral contour to obtain the laboratory test report image;
(3) Correct the skew of the laboratory test report image based on the probabilistic Hough transform;
(4) Extract the dividing lines in the corrected image and use them to divide the image into upper, middle and lower regions, which correspond to patient information, test item information, and doctor and review information respectively;
(5) Further segment the test item information region of the corrected image by row and column, and binarize the segmented image;
(6) Use the LSTM (Long Short-Term Memory) model in the open-source OCR engine Tesseract to classify and recognize the segmented binary images, and intelligently correct the results.
The detailed procedure for edge recognition of the laboratory test report photo in step (1) is as follows:
1.1 Resample the laboratory test report photo to obtain a thumbnail;
1.2 Preprocess the thumbnail by applying, in order, dilation, fast edge detection based on structured forests, erosion and binarization, to obtain an edge image;
1.3 Detect lines in the edge image using the Hough transform, while screening lines by local maximum and adaptive threshold and merging similar lines;
1.4 Compute the intersections between lines using the vector method, and find all quadrilaterals enclosed by the lines by traversing four-tuples of intersections. The quadrilateral whose four sides have the largest weight sum is taken as the quadrilateral contour of the laboratory test report, where the weight of a side is the number of points lying on the line containing that side.
The detailed procedure for skew correction of the laboratory test report image based on the probabilistic Hough transform in step (3) is as follows:
3.1 Scale the laboratory test report image to a width of 1200 pixels;
3.2 Convert the scaled image to grayscale and apply illumination correction and binarization: the illumination distribution is corrected by subtracting the mean-filtered image from the original image and adding a mean offset, the contrast is then enhanced with contrast-limited adaptive histogram equalization, and finally the image is binarized;
3.3 Erode the binary image and extract the edge pixels of the eroded image with the Canny operator to obtain the corresponding edge image;
3.4 Detect lines in the edge image using the probabilistic Hough transform, and correct the skew of the laboratory test report image according to the average of the inclination angles of all detected lines.
In step (4), the dividing lines in the corrected laboratory test report image are extracted using the LSD (Line Segment Detector) line segment detection algorithm.
The specific implementation of step (5) is as follows:
5.1 Scale the corrected laboratory test report image to obtain a reduced image, and binarize it;
5.2 Count the number of black pixels in each row of the test item information region of the binary image. The counts alternate between peaks and troughs, and each trough is the blank space between lines of text. Divide the test item information region into rows according to these counts, and delete the rows whose black pixel count is below a threshold;
5.3 Using the row information obtained in step 5.2, measure the widths of the blank gaps within each row and sort them. The sorted widths show that the gaps between characters within a row are small and account for the vast majority, while the gaps between the columns of test items are larger and there are only a few of them. The character spacing is determined from this, and a threshold larger than the character spacing is set as the minimum column spacing;
5.4 Count the number of black pixels in each column of the test item information region of the binary image. The counts alternate between peaks and troughs, and each trough is the blank space between columns. Divide the test item information region into columns according to these counts and the minimum column spacing;
5.5 Using the row and column information obtained in the steps above, further segment the test item information region of the corrected image, and binarize the segmented image with the optimal global threshold selected by OTSU (the maximum between-class variance method).
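The projection-based splitting in steps 5.2 to 5.4 can be sketched in plain Python as follows. This is a minimal illustration rather than the patented implementation: the function names are hypothetical, the image is assumed to be a 2-D list with 1 for black pixels and 0 for white, and a trough is assumed to be a run of entries whose count is exactly zero.

```python
def row_counts(binary):
    """Number of black pixels in each row (horizontal projection)."""
    return [sum(row) for row in binary]

def col_counts(binary):
    """Number of black pixels in each column (vertical projection)."""
    return [sum(col) for col in zip(*binary)]

def split_by_projection(counts, min_gap=1):
    """Split a projection profile into (start, end) runs, end exclusive.

    Two runs are separated only by at least min_gap consecutive zeros,
    so gaps narrower than min_gap (character spacing) are bridged.
    """
    runs, start, gap = [], None, 0
    for i, c in enumerate(counts):
        if c > 0:
            if start is None:
                start = i          # entering a peak
            gap = 0
        elif start is not None:
            gap += 1
            if gap >= min_gap:     # trough wide enough: close the run
                runs.append((start, i - gap + 1))
                start, gap = None, 0
    if start is not None:
        runs.append((start, len(counts) - gap))
    return runs

def drop_sparse(runs, counts, min_black):
    """Delete runs whose total black pixel count is below min_black."""
    return [(a, b) for a, b in runs if sum(counts[a:b]) >= min_black]
```

Rows would use `min_gap=1`, while columns would pass the minimum column spacing from step 5.3 as `min_gap`, so that the blank spaces between characters do not split a column.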
The specific implementation of step (6) is as follows:
6.1 Recognize the patient information region using the document mode of Tesseract to obtain a result string. Split the string on whitespace and use regular expression matching to find the character blocks that contain an information item name; the character block that follows is the value of that item. The parsed patient information is then cleaned and structured to obtain the final patient information object;
6.2 Use the block mode of Tesseract to recognize the segmented test item information regions in batches, obtaining the recognition results of the test items;
6.3 Build a test item information database containing test item names, test item aliases and measurement value types (text, numeric, etc.), and use it to intelligently correct the test item recognition results;
6.4 For any test item, look up its measurement value type in the test item information database and recognize its measured value with the corresponding Tesseract engine configuration, obtaining the recognition result of the measured value.
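The parsing in step 6.1 can be illustrated with a short sketch. The field labels `Name`, `Sex` and `Age` and the function name are hypothetical placeholders (the source does not list the actual item names), and the OCR result is assumed to be already available as a string.

```python
import re

# Hypothetical information item names; a real report would use its own labels.
FIELD_LABELS = ["Name", "Sex", "Age"]

def parse_patient_info(text, labels=FIELD_LABELS):
    """Split OCR text on whitespace and pair each label with its value."""
    pattern = re.compile(r"^(%s)[::]?(.*)$" % "|".join(map(re.escape, labels)))
    tokens = text.split()
    info = {}
    for i, tok in enumerate(tokens):
        m = pattern.match(tok)
        if not m:
            continue
        label, rest = m.group(1), m.group(2)
        if rest:                       # value glued to the label, e.g. "Sex:M"
            info[label] = rest
        elif i + 1 < len(tokens):      # otherwise the next block is the value
            info[label] = tokens[i + 1]
    return info
```

For example, `parse_patient_info("Name: Zhang Sex:M Age 42")` pairs each label with the block that follows it, whether or not a colon or a space separates them.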
The detailed procedure for intelligently correcting the test item recognition results with the test item information database in step 6.3 is as follows:
6.3.1 First extract the test item names and aliases from the test item information database to obtain a dictionary of test item names;
6.3.2 For the recognition result of any test item, if the result exists in the dictionary, error correction ends. Otherwise, compute the normalized edit distance between the result and every entry in the dictionary, and form an error correction candidate list from the entries whose edit distance is less than 0.8, sorted in ascending order of edit distance;
6.3.3 Take the entry with the smallest edit distance in the candidate list as the correction candidate. If the candidate exists in the dictionary, replace the recognition result with it directly; if it does not, remove the candidate from the candidate list and repeat step 6.3.3.
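The dictionary-based correction in steps 6.3.1 to 6.3.3 can be sketched as below. The source does not say how the edit distance is normalized; this sketch assumes Levenshtein distance divided by the length of the longer string, and the function names and sample dictionary entries are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance computed with a rolling row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]

def correct(word, dictionary, max_dist=0.8):
    """Return word unchanged if known, else the closest dictionary entry."""
    if word in dictionary:
        return word
    scored = sorted(
        (edit_distance(word, e) / max(len(word), len(e), 1), e)
        for e in dictionary
    )
    candidates = [e for d, e in scored if d < max_dist]
    # If nothing is close enough, the recognition result is left as it is.
    return candidates[0] if candidates else word
```

With a dictionary such as `{"hemoglobin", "glucose", "platelet"}`, an OCR result like `"hemog1obin"` is one substitution away from `"hemoglobin"` and would be replaced by it, while a string with no close entry is returned unchanged.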
The beneficial technical effects of the invention are as follows:
(1) The invention provides a highly reliable laboratory test report region detection method that combines fast edge detection based on structured forests, an improved Hough transform and quadrilateral detection.
(2) The invention introduces skew correction of the laboratory test report image based on line segment detection, and uses the dividing lines in the image to divide the report into regions. Different processing methods are applied to different regions, maximizing the accuracy of laboratory test report recognition.
(3) The invention uses a test item information database to build a test item dictionary, and on that basis automatically selects the recognition engine model and intelligently corrects the recognition results, improving recognition accuracy.
Brief description of the drawings
Fig. 1 is a schematic diagram of the system implementation of the method of the invention.
Fig. 2 is a flow chart of the method of the invention.
Detailed description of the embodiments
To describe the invention more specifically, the technical solution is described in detail below with reference to the accompanying drawings and an embodiment.
As shown in Fig. 1 and Fig. 2, the laboratory test report recognition method based on image processing of the invention comprises the following steps:
(1) Edge recognition of the laboratory test report.
After the photo taken with the mobile phone is obtained, the body of the laboratory test report must first be identified and separated from the background, which depends on accurately detecting the border of the report. This embodiment completes the detection of the four borders of the report through resampling, preprocessing and edge detection, line detection, intersection computation and quadrilateral detection, as follows:
1.1 Image resampling. The resolution of photos taken by different phone cameras varies greatly; processing them directly would make the algorithm's performance unstable, and high-resolution pictures require more computation. Resampling unifies the picture width to 640 pixels, which keeps all input picture sizes similar and reduces the required computation. All subsequent processing is performed on the resampled thumbnail.
1.2 Preprocessing and edge detection. This step extracts the key contours in the image using several image processing methods and the fast edge detection method based on structured forests. The main steps are as follows:
1.2.1 Dilation. The image is first dilated to reduce the interference of fine detail with edge detection while retaining the main features and the boundary contour of the report. Dilation is an important morphological image processing method. For sets A and B in the two-dimensional integer space Z², the dilation of A by B is defined as:
$$A \oplus B = \{\, z \mid (\hat{B})_z \cap A \neq \varnothing \,\}$$
where $\hat{B}$ is the reflection of B, B is a structuring element (kernel), and A is the set being dilated. The formula reflects B about its origin and translates the reflection by z. In this embodiment B is a 3 × 3 rectangular kernel.
1.2.2 Fast edge detection based on structured forests. Compared with traditional edge detectors such as the Canny and Sobel operators, the edge detection method based on structured forests [Dollar2013] uses structured learning to exploit the inherent structure of edges. It highlights the important edges in the picture and reduces the influence of picture detail on contour recognition; contours at different levels appear as different gray levels. This reduces the difficulty and computation of line detection and improves accuracy.
1.2.3 Erosion. The detected edge image is eroded to thin the contours, remove details and minor contours, and retain the contour of the main object. Erosion is the opposite of dilation. For sets A and B in the two-dimensional integer space Z², the erosion of A by B is defined as:
$$A \ominus B = \{\, z \mid (B)_z \subseteq A \,\}$$
That is, the erosion of A by B is the set of all points z such that B translated by z is contained in A.
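The two morphological operations can be sketched in plain Python for a binary image. This is an illustrative sketch only: the image is assumed to be a 2-D list with 1 for foreground, the kernel is a k × k rectangle (so it equals its own reflection), and pixels outside the image are assumed to be background.

```python
def dilate(img, k=3):
    """Binary dilation with a k x k rectangular kernel (k odd)."""
    h, w, r = len(img), len(img[0]), k // 2
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Set the pixel if the kernel centred here overlaps any foreground.
            out[y][x] = int(any(
                img[j][i]
                for j in range(max(0, y - r), min(h, y + r + 1))
                for i in range(max(0, x - r), min(w, x + r + 1))
            ))
    return out

def erode(img, k=3):
    """Binary erosion: keep a pixel only if the whole kernel fits inside."""
    h, w, r = len(img), len(img[0]), k // 2
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            out[y][x] = int(all(
                0 <= j < h and 0 <= i < w and img[j][i]
                for j in range(y - r, y + r + 1)
                for i in range(x - r, x + r + 1)
            ))
    return out
```

Dilating a single foreground pixel with the 3 × 3 kernel grows it into a 3 × 3 block, and eroding that block shrinks it back to the single pixel.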
1.2.4 Binarization. The eroded edge image is thresholded with the optimal global threshold found by the maximum between-class variance (OTSU) algorithm, giving a binary image containing only black and white, which is the input to line detection. The algorithm follows a clustering idea: it assumes the image has a bimodal histogram with two classes of pixels (foreground and background) and computes the threshold that best separates the two classes, so that their within-class variance is minimal or, equivalently, their between-class variance is maximal. The steps are as follows:
1.2.4.1 Compute the normalized histogram of the input image;
1.2.4.2 Traverse all possible thresholds t = 1, ..., 255 and compute the between-class variance
$$\sigma_b^2(t) = \omega_1(t)\,\omega_2(t)\,[\mu_1(t) - \mu_2(t)]^2$$
where $\omega_i$ is the class probability and $\mu_i$ is the class mean, both computed from the histogram;
1.2.4.3 The threshold $t_{opt}$ at which the between-class variance is maximal is the optimal threshold;
1.2.4.4 With $t_{opt}$ as the global threshold, set pixels whose gray value is less than $t_{opt}$ to the minimum gray level 0 and pixels whose gray value is greater than $t_{opt}$ to the maximum gray level 255, which binarizes the image.
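Steps 1.2.4.1 to 1.2.4.4 can be sketched directly from the description. The sketch assumes an 8-bit grayscale image flattened into a list of integers, uses the common form ω₁ω₂(μ₁ − μ₂)² for the between-class variance, and assumes that pixels equal to the threshold go to the bright class, which the source leaves unspecified.

```python
def otsu_threshold(pixels):
    """Return the threshold t in 1..255 that maximises between-class variance."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    prob = [c / total for c in hist]          # normalised histogram

    best_t, best_var = 0, -1.0
    for t in range(1, 256):
        w0 = sum(prob[:t])                    # probability of the class below t
        w1 = 1.0 - w0
        if w0 == 0 or w1 == 0:
            continue                          # one class is empty
        mu0 = sum(i * prob[i] for i in range(t)) / w0
        mu1 = sum(i * prob[i] for i in range(t, 256)) / w1
        var_b = w0 * w1 * (mu0 - mu1) ** 2
        if var_b > best_var:
            best_t, best_var = t, var_b
    return best_t

def binarize(pixels, t):
    """Pixels below t become 0, all others 255 (tie handling is an assumption)."""
    return [0 if p < t else 255 for p in pixels]
```

For an image with two well separated gray levels, any threshold between them gives the same between-class variance, and the sketch returns the first such threshold.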
1.3 Line detection based on the Hough transform. The Hough transform uses the global characteristics of the image to connect edge pixels of a given shape into continuous, smooth edges. It maps points in the two-dimensional image coordinates into polar coordinate space and accumulates them there, which allows curves with an analytic expression to be recognized. Because it uses global image characteristics, it is little affected by noise and boundary breaks and is robust, making it a commonly used line detection method.
This embodiment improves the Hough transform by introducing line screening based on local maxima and a threshold. While preserving line detection accuracy, it merges similar lines so that the detected lines describe the picture structure better. The main steps are as follows:
1.3.1 Map every white point (x, y) in the two-dimensional coordinates to the lines through it in polar coordinate space:
$$r = x\cos\theta + y\sin\theta$$
where θ ∈ [0, π], r ∈ [0, (w + h) × 2 + 1], and w and h are the width and height of the picture.
1.3.2 For each point (θ, r) in polar coordinate space, accumulate the number of mapped lines passing through it, $v_{\theta,r}$, which is its weight;
1.3.3 Filter all points, keeping only the points (θ, r) that satisfy the following requirements:
① $v_{\theta,r} > 0.15\,v_{max}$, where $v_{max}$ is the maximum accumulated count over all points;
② $v_{\theta,r} = \max\{\, v_{x,y} \mid (x, y) \in S_{\theta,r,k} \,\}$, where $S_{\theta,r,k}$ is the square region of side length k centered on (θ, r); that is, $v_{\theta,r}$ is a local maximum;
③ $v_{\theta-1,r} > 0.15\,v_{max} \wedge v_{\theta+1,r} > 0.15\,v_{max}$, where $v_{max}$ is the maximum accumulated count over all points; this condition removes isolated points.
1.3.4 Sort all remaining points of polar coordinate space by weight $v_{\theta,r}$. Each point (θ, r) represents the line x cos θ + y sin θ = r in the two-dimensional image space, and the ten lines with the highest weight are taken as candidate lines.
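The voting and the three screening conditions of step 1.3.3 can be sketched as follows. This is a simplified illustration under several assumptions: θ is sampled in whole degrees, r is rounded to the nearest integer and shifted by w + h so that the index is non-negative, the window is clipped at the borders of the accumulator, and a missing θ neighbour at the border is treated as passing condition ③. Tied maxima are all kept.

```python
import math

def hough_candidates(points, w, h, k=5, ratio=0.15, top=10):
    """Return up to `top` (theta_deg, r, votes) peaks, strongest first."""
    n_theta = 180                      # theta in whole degrees over [0, 180)
    offset = w + h                     # shift so the r index is non-negative
    n_r = 2 * (w + h) + 1
    acc = [[0] * n_r for _ in range(n_theta)]
    for x, y in points:
        for t in range(n_theta):
            a = math.radians(t)
            r = round(x * math.cos(a) + y * math.sin(a))
            acc[t][r + offset] += 1

    v_max = max(max(row) for row in acc)
    floor = ratio * v_max
    half = k // 2
    peaks = []
    for t in range(n_theta):
        for ri in range(n_r):
            v = acc[t][ri]
            if v <= floor:                        # condition 1: global threshold
                continue
            window = [acc[i][j]
                      for i in range(max(0, t - half), min(n_theta, t + half + 1))
                      for j in range(max(0, ri - half), min(n_r, ri + half + 1))]
            if v < max(window):                   # condition 2: local maximum
                continue
            left = acc[t - 1][ri] if t > 0 else v
            right = acc[t + 1][ri] if t < n_theta - 1 else v
            if left <= floor or right <= floor:   # condition 3: not isolated
                continue
            peaks.append((t, ri - offset, v))
    peaks.sort(key=lambda p: -p[2])
    return peaks[:top]
```

For twenty white points on the horizontal line y = 5, the strongest peaks carry all twenty votes and include θ = 90°, r = 5.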
1.4 Intersection computation and quadrilateral detection. Once the main lines in the picture have been detected, and given that the laboratory test report appears as a quadrilateral in the picture, all quadrilaterals enclosed by the lines are traversed to find the one most likely to be the report region. The steps are as follows:
1.4.1 Find the line intersections within the image. The pairwise intersections of all detected lines are computed with the vector method, where (x1, y1) and (x2, y2) are points on line 1 and (x3, y3) and (x4, y4) are points on line 2, and all intersections that fall within the image are retained.
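One common way to compute such an intersection with vectors is shown below. It is a sketch under the assumption that each line is given by two distinct points; the function names are hypothetical, and the exact formula used in the patent is not reproduced here.

```python
def intersect(p1, p2, p3, p4, eps=1e-9):
    """Intersection of the line through p1, p2 with the line through p3, p4.

    Returns None when the lines are parallel (cross product close to zero).
    """
    (x1, y1), (x2, y2), (x3, y3), (x4, y4) = p1, p2, p3, p4
    d1 = (x2 - x1, y2 - y1)            # direction of line 1
    d2 = (x4 - x3, y4 - y3)            # direction of line 2
    cross = d1[0] * d2[1] - d1[1] * d2[0]
    if abs(cross) < eps:
        return None
    # Solve p1 + s * d1 = p3 + u * d2 for s by crossing both sides with d2.
    s = ((x3 - x1) * d2[1] - (y3 - y1) * d2[0]) / cross
    return (x1 + s * d1[0], y1 + s * d1[1])

def inside_image(pt, w, h):
    """True if the point lies within a w x h image."""
    return pt is not None and 0 <= pt[0] < w and 0 <= pt[1] < h
```

The two diagonals of a 2 × 2 square, for instance, meet at (1, 1), while two horizontal lines yield no intersection.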
1.4.2 Find all candidate quadrilaterals. Traverse all possible four-tuples of intersections to find every quadrilateral enclosed by the lines. Compute the area of each quadrilateral and discard those whose area is less than 10% of the picture area.
1.4.3 Weight ranking. Rank all quadrilaterals by the sum of the weights of their four sides; the quadrilateral with the largest weight sum is taken as the border of the laboratory test report.
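The area filter and weight ranking of steps 1.4.2 and 1.4.3 can be sketched as follows. The data layout is an assumption made for illustration: each candidate is a pair of ordered vertices and the four side weights taken from the Hough accumulator, and the area is computed with the shoelace formula.

```python
def polygon_area(pts):
    """Shoelace formula for a simple polygon given as ordered (x, y) vertices."""
    total = 0.0
    for i in range(len(pts)):
        x1, y1 = pts[i]
        x2, y2 = pts[(i + 1) % len(pts)]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0

def pick_report_quad(candidates, img_w, img_h, min_ratio=0.10):
    """candidates: list of (vertices, side_weights). Returns the best or None."""
    min_area = min_ratio * img_w * img_h
    # Discard quadrilaterals smaller than 10% of the picture area.
    kept = [c for c in candidates if polygon_area(c[0]) >= min_area]
    # Among the rest, take the one whose four side weights sum highest.
    return max(kept, key=lambda c: sum(c[1]), default=None)
```

A tiny quadrilateral is rejected by the area filter even when its sides carry large weights, so strong lines that enclose only a small region cannot win.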
(2) Image cropping and correction based on perspective transform.
From the line information of the four sides of the laboratory test report obtained by the edge recognition of step (1), the coordinates of the four vertices of the report are obtained by computing the pairwise intersections. A transformation matrix is built from the vertex coordinates and the picture size, and a perspective transform is applied to the image.
A perspective transform uses the condition that the center of perspective, the image point and the target point are collinear to project the picture onto a new view plane. Its general transformation formula is:
$$[x', y', w'] = [u, v, w] \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}$$
where u, v are the coordinates in the original image, and the image coordinates after the transform are:
$$x = \frac{x'}{w'}, \qquad y = \frac{y'}{w'}$$
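Applying a perspective transform to a single point can be sketched as below. The sketch assumes the row-vector convention [x', y', w'] = [u, v, 1] · A with a 3 × 3 matrix A given as nested lists, followed by division by w'; the function name and the sample matrices are hypothetical, and estimating A from the four vertices is not shown.

```python
def warp_point(u, v, a):
    """Map (u, v) through the 3x3 matrix a using [x', y', w'] = [u, v, 1] . a."""
    xp = u * a[0][0] + v * a[1][0] + a[2][0]
    yp = u * a[0][1] + v * a[1][1] + a[2][1]
    wp = u * a[0][2] + v * a[1][2] + a[2][2]
    # Dividing by w' turns homogeneous coordinates back into image coordinates.
    return (xp / wp, yp / wp)
```

With the third column equal to (0, 0, 1) the mapping is affine, for example a pure translation; non-zero entries in the first two rows of that column introduce the perspective foreshortening.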
(3) Skew correction based on the probabilistic Hough transform.
After the perspective transform, the edges of the laboratory test report are essentially horizontal and vertical, but the content of the report may still not be parallel to its edges. Further skew correction is therefore needed, which also largely compensates for slight errors in the cropping and correction by perspective transform. This embodiment uses a skew correction method based on the probabilistic Hough transform. The detailed procedure is:
3.1 Scale the original laboratory test report image. Let the height of the original image be H and its width W. The image is scaled with its aspect ratio preserved, so the reduced image has width 1200 and height $\frac{1200 \times H}{W}$. The subsequent image processing is carried out on the reduced image to obtain the position information for image segmentation; as long as the important information in the picture is not lost, a smaller picture is processed more efficiently.
3.2 Photos taken with a mobile phone are easily affected by the lighting angle, so brightness is unevenly distributed across regions of the picture, which significantly affects global binarization. To solve this problem, this embodiment uses mean filtering and image superposition, combined with contrast-limited adaptive histogram equalization (CLAHE), to correct the uneven brightness distribution before binarization and ensure a uniform brightness distribution. The steps are as follows:
3.2.1 Arithmetic mean filtering. The picture is first filtered with an arithmetic mean filter to obtain a rough sample of the brightness distribution. The arithmetic mean filter is a spatial filter that replaces the value of a pixel with the average gray level of its neighborhood, i.e.:
$$f_2(x, y) = \frac{1}{|S|} \sum_{(s, t) \in S} f_1(s, t)$$
where (x, y) is the current pixel coordinate, $f_1$ is the original image, $f_2$ is the filtered image, and S is the neighborhood of (x, y); this embodiment uses a 10 × 10 neighborhood.
3.2.2 Image superposition. Compute the average brightness L of the filtered picture, subtract the filtered brightness from the original brightness and add L, obtaining the image with equalized brightness, i.e.:
$$F(x, y) = f_1(x, y) - f_2(x, y) + L$$
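The correction F(x, y) = f1(x, y) − f2(x, y) + L can be sketched in plain Python. The sketch makes a few assumptions: the image is a 2-D list of gray values, the neighbourhood is an odd-sized square clipped at the borders (the embodiment's 10 × 10 window is not reproduced exactly), the default window is small so the example stays readable, and the result is clamped to [0, 255].

```python
def mean_filter(img, size=3):
    """Arithmetic mean over a size x size neighbourhood, clipped at borders."""
    h, w, r = len(img), len(img[0]), size // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            vals = [img[j][i]
                    for j in range(max(0, y - r), min(h, y + r + 1))
                    for i in range(max(0, x - r), min(w, x + r + 1))]
            out[y][x] = sum(vals) / len(vals)
    return out

def correct_illumination(img, size=3):
    """F = f1 - f2 + L, with f2 the filtered image and L its mean brightness."""
    f2 = mean_filter(img, size)
    h, w = len(img), len(img[0])
    mean_l = sum(map(sum, f2)) / (h * w)
    return [[min(255.0, max(0.0, img[y][x] - f2[y][x] + mean_l))
             for x in range(w)]
            for y in range(h)]
```

On an image whose brightness rises linearly from left to right, the local mean follows the ramp, so the interior of the corrected image becomes flat at the average brightness L.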
3.2.3 the adaptive histogram equalization of contrast-limited;After completing the procedure, using contrast-limited Adaptive histogram equalization strengthens picture contrast.
Common histogram equalization algorithm be usually used in strengthen picture contrast, but if image include substantially than image its The dark or bright part in its region, the contrast in these parts cannot effectively strengthen;Adaptive histogram equalization Algorithm performs the histogram equalization responded to change above mentioned problem by localized region.In CLAHE, for each zonule Contrast amplitude limit must be all used, can overcome and avoid noise from excessively being amplified.
3.2.4 OTSU binarization is performed.
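The mean filtering and image superposition of steps 3.2.1 and 3.2.2 can be sketched in plain Python as follows. This is only an illustrative sketch, not the embodiment's implementation: the function names `mean_filter` and `homogenize_brightness` are hypothetical, the image is assumed to be a list of rows of gray levels, and clipping the window at the image border and clamping the result to 0-255 are assumptions that the text does not specify.

```python
def mean_filter(image, size=10):
    """Replace each pixel with the mean gray level of its size x size neighbourhood.

    `image` is a list of rows of gray levels (0-255). Windows are clipped at
    the image border (an assumption; the source does not describe borders).
    """
    height, width = len(image), len(image[0])
    # Integral image: integral[y][x] holds the sum of image[0:y][0:x].
    integral = [[0] * (width + 1) for _ in range(height + 1)]
    for y in range(height):
        row_sum = 0
        for x in range(width):
            row_sum += image[y][x]
            integral[y + 1][x + 1] = integral[y][x + 1] + row_sum
    half = size // 2
    filtered = [[0.0] * width for _ in range(height)]
    for y in range(height):
        y0, y1 = max(0, y - half), min(height, y - half + size)
        for x in range(width):
            x0, x1 = max(0, x - half), min(width, x - half + size)
            total = (integral[y1][x1] - integral[y0][x1]
                     - integral[y1][x0] + integral[y0][x0])
            filtered[y][x] = total / ((y1 - y0) * (x1 - x0))
    return filtered


def homogenize_brightness(image, size=10):
    """Compute F = f1 - f2 + L for every pixel, clamped to 0-255 (assumed)."""
    filtered = mean_filter(image, size)                            # f2
    pixel_count = len(image) * len(image[0])
    mean_level = sum(sum(row) for row in filtered) / pixel_count   # L
    return [
        [min(255, max(0, round(f1 - f2 + mean_level)))
         for f1, f2 in zip(image_row, filtered_row)]
        for image_row, filtered_row in zip(image, filtered)
    ]
```

On a picture whose left half is darker than its right half but which contains no text, the two halves come out at the same gray level, which is the effect the superposition step aims for.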
3.3 An erosion operation with a 7 × 7 rectangular kernel is applied to the binary image obtained in the previous step. This operation joins adjacent text regions in the picture, so that the information extracted by edge detection in the next step better meets the requirements.
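As a rough illustration of why erosion joins neighbouring text, the sketch below implements grayscale erosion (a minimum filter) with a square kernel in plain Python. It assumes dark text (low values) on a light background, so taking the minimum grows the dark regions; the function name `erode` and the border handling (windows clipped at the edge) are assumptions made for this example.

```python
def erode(image, kernel_size=7):
    """Minimum filter with a kernel_size x kernel_size rectangular kernel.

    `image` is a list of rows of gray levels. With dark text on a light
    background (assumed), the minimum spreads dark pixels outwards.
    """
    radius = kernel_size // 2
    height, width = len(image), len(image[0])
    # A rectangular kernel is separable: row-wise minimum, then column-wise.
    horizontal = [
        [min(row[max(0, x - radius): x + radius + 1]) for x in range(width)]
        for row in image
    ]
    return [
        [min(horizontal[yy][x]
             for yy in range(max(0, y - radius), min(height, y + radius + 1)))
         for x in range(width)]
        for y in range(height)
    ]
```

Two dark pixels separated by four light pixels end up in one connected dark run after this operation, which is how characters in the same text line merge into a single block.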
3.4 The Canny operator is used to extract the edge-information image from the eroded binary image. The Canny algorithm first convolves the input image with a Gaussian smoothing template to obtain a slightly blurred image, so that single-pixel noise has as little influence as possible. It then uses 4 masks to detect edges in the horizontal, vertical and diagonal directions: the input image is convolved with each mask to give 4 images, and the maximum response and its direction are stored for each pixel. Edges are then determined with hysteresis thresholding: starting from a larger threshold, the edges that are reasonably certain are marked, and the whole edge is then traced using the direction information, this time with a smaller threshold. The final result is a binary image in which each point indicates whether it is an edge.
3.5 Probabilistic Hough transform line detection is performed on the edge-information image. Points in the edge image are selected at random and mapped into the polar coordinate system, and the votes of the corresponding points are accumulated. When the accumulated votes of a point in the polar coordinate system reach the threshold, the two endpoints of the corresponding line are found; if the segment is longer than a set threshold, it is added to the result set. This is repeated until all qualifying segments have been found.
3.6 The average of the inclination angles of all detected lines is taken as the tilt angle, rather than the angle of any single line. This avoids the influence of individual outlier lines and makes the correction more stable. Once the average tilt angle is obtained, the image is rotated by the opposite angle to correct it.
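The averaging in step 3.6 can be sketched as below, assuming the line detector returns segments as `(x1, y1, x2, y2)` tuples and that the detected lines are mostly the near-horizontal text blocks produced by the erosion step. The function names and the folding of angles into (-90, 90] degrees are choices made for this sketch, not details given in the text.

```python
import math


def segment_angle(x1, y1, x2, y2):
    """Inclination of a segment in degrees, folded into (-90, 90]."""
    angle = math.degrees(math.atan2(y2 - y1, x2 - x1))
    if angle > 90:
        angle -= 180
    elif angle <= -90:
        angle += 180
    return angle


def deskew_rotation(segments):
    """Return the rotation (degrees) that cancels the mean tilt of all segments."""
    angles = [segment_angle(*segment) for segment in segments]
    if not angles:
        return 0.0  # nothing detected: leave the image as it is
    mean_tilt = sum(angles) / len(angles)
    return -mean_tilt  # rotate by the opposite angle
```

Folding the angle makes the result independent of the order of a segment's endpoints, so a line detected right-to-left contributes the same tilt as one detected left-to-right.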
(4) Extract the dividing lines in the laboratory test report and cut the report.
The dividing lines in the test report image are extracted with the LSD line segment detection algorithm; these lines divide the report into different regions. A laboratory test report is usually divided by horizontal lines into three main parts: the top part holds the hospital name and patient information, the middle part holds the test item information, and the bottom part holds information such as the submitting doctor, the examiner and the reviewer. The report can therefore be divided into several parts along the dividing lines so that each part can be processed separately. The reduced image is split for further extraction of row and column information, and the original unscaled image is split at the proportionally corresponding positions, to be cut and cleaned once the row and column information has been obtained.
(5) Each part is segmented into rows and columns and cleaned.
Because of the personal shooting angle, the lighting and similar factors, many things in a test report photo can affect picture quality. The most common is uneven light and shade, which can make the blank paper in one part of a picture darker than the text in another part. A global image processing method therefore easily damages the text, whereas within a small area a single well-chosen binarization threshold removes the great majority of interference that is lighter than the text. For this reason the row and column information is extracted first and the content of each cell is then binarized separately, which gives better results. The specific method is as follows:
5.1 For the test item part of the reduced image, the number of black pixels in each row is counted. The counts alternate between peaks and troughs, and each trough is the blank space between text rows. The test item part is divided into multiple rows according to these statistics, and rows containing fewer black pixels than a threshold are discarded.
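A minimal sketch of the row segmentation in step 5.1 is shown below. It assumes the binarized region is a list of rows in which 1 marks a black pixel and 0 a white one; the name `split_rows` and the parameter `min_black` (the threshold below which a row band is discarded) are hypothetical.

```python
def split_rows(binary, min_black=1):
    """Split a binary region into text rows using the horizontal projection.

    `binary` is a list of pixel rows with 1 = black, 0 = white (assumed).
    Returns (top, bottom) pairs with `bottom` exclusive. Bands containing
    fewer than `min_black` black pixels in total are dropped as noise.
    """
    profile = [sum(row) for row in binary]   # black pixels per pixel row
    bands, start = [], None
    for y, count in enumerate(profile):
        if count > 0 and start is None:
            start = y                        # a peak begins
        elif count == 0 and start is not None:
            bands.append((start, y))         # a trough ends the band
            start = None
    if start is not None:
        bands.append((start, len(profile)))
    return [(top, bottom) for top, bottom in bands
            if sum(profile[top:bottom]) >= min_black]
```

For example, a region with two text lines and one isolated speck between them yields two bands when `min_black` is set above the size of the speck.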
5.2 Using the row information from the previous step, the widths of the blank runs in each row are counted and sorted. The blank gaps between characters within a row are usually similar in width and account for the vast majority of gaps, whereas the gaps between the columns of the test items are larger but number only 4 to 5. The typical character spacing is therefore easy to obtain, and a threshold somewhat larger than the character spacing is then taken as the minimum column spacing.
5.3 For the test item part of the reduced image, the number of black pixels in each column is counted. The counts alternate between peaks and troughs, and each trough is blank space between columns. According to these statistics, the part is divided into multiple columns at the blanks whose width exceeds the threshold.
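Steps 5.2 and 5.3 can be sketched together as follows. The text does not say how the typical character spacing or the threshold is computed, so this sketch makes two labelled assumptions: the median gap width stands in for the character spacing, and the minimum column spacing is that median multiplied by a hypothetical `factor`. Images are lists of rows with 1 = black and 0 = white, and all function names are illustrative.

```python
from statistics import median


def interior_gaps(profile):
    """Return (start, width) for each run of zeros between the first and last inked position."""
    inked = [i for i, value in enumerate(profile) if value > 0]
    if not inked:
        return []
    gaps, start = [], None
    for i in range(inked[0], inked[-1] + 1):
        if profile[i] == 0:
            if start is None:
                start = i
        elif start is not None:
            gaps.append((start, i - start))
            start = None
    return gaps


def min_column_gap(row_images, factor=3.0):
    """Estimate the minimum column spacing from the gap widths of every text row."""
    widths = []
    for row_image in row_images:
        profile = [sum(column) for column in zip(*row_image)]
        widths.extend(width for _, width in interior_gaps(profile))
    widths.sort()
    # Assumption: the median gap approximates the character spacing.
    return factor * median(widths) if widths else 0


def split_columns(binary, min_gap):
    """Split a region into columns at blank runs at least `min_gap` wide."""
    profile = [sum(column) for column in zip(*binary)]
    inked = [i for i, value in enumerate(profile) if value > 0]
    if not inked:
        return []
    columns, left = [], inked[0]
    for start, width in interior_gaps(profile):
        if width >= min_gap:
            columns.append((left, start))
            left = start + width
    columns.append((left, inked[-1] + 1))
    return columns
```

Because character gaps greatly outnumber column gaps, the median is not pulled upwards by the few wide column gaps, which is the property the sorting in step 5.2 relies on.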
5.4 The row and column positions in the original test report image are obtained by scaling the row and column information of the reduced image back in proportion. The test item part of the original image is split accordingly, and each resulting image is binarized with the optimal global threshold selected by the maximum between-class variance (OTSU) algorithm, giving clear character images.
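The two operations in step 5.4 can be sketched as below: mapping a span found on the reduced image back to the original image, and choosing a per-cell threshold by maximizing the between-class variance. The names `scale_span`, `otsu_threshold` and `binarize_cell` are hypothetical, gray levels are assumed to be integers in 0-255, and marking pixels at or below the threshold as text is an assumption about polarity.

```python
def scale_span(span, ratio):
    """Map a (start, end) span from the reduced image to the original image."""
    start, end = span
    return (round(start * ratio), round(end * ratio))


def otsu_threshold(pixels):
    """Return the gray level that maximizes the between-class variance."""
    histogram = [0] * 256
    for value in pixels:
        histogram[value] += 1
    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(histogram))
    best_threshold, best_variance = 0, -1.0
    weight_dark, sum_dark = 0, 0
    for level in range(256):
        weight_dark += histogram[level]
        sum_dark += level * histogram[level]
        if weight_dark == 0:
            continue                      # no pixels in the dark class yet
        weight_light = total - weight_dark
        if weight_light == 0:
            break                         # every pixel is already in the dark class
        mean_dark = sum_dark / weight_dark
        mean_light = (sum_all - sum_dark) / weight_light
        variance = weight_dark * weight_light * (mean_dark - mean_light) ** 2
        if variance > best_variance:
            best_variance, best_threshold = variance, level
    return best_threshold


def binarize_cell(cell):
    """Binarize one table cell: 1 = text (at or below the threshold), 0 = background."""
    threshold = otsu_threshold([value for row in cell for value in row])
    return [[1 if value <= threshold else 0 for value in row] for row in cell]
```

Computing the threshold per cell rather than for the whole page is what lets a dark cell and a bright cell each keep their text while dropping their own background.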
(6) Classified recognition and intelligent error correction of the test report based on the Tesseract engine and a test item information database.
Tesseract 4 uses a long short-term memory (LSTM) network for OCR. LSTM is a recurrent neural network that is widely used in fields such as handwriting recognition, speech recognition and machine translation; compared with traditional OCR methods, it can greatly improve the accuracy and speed of OCR.
To make better use of prior knowledge when correcting recognition results, and to apply different recognition schemes to different types of values so as to raise recognition accuracy, this embodiment builds a test item information database. The main fields relevant to recognition are: test item name, test item alias, and test item measured-value type.
After the segmented binary image fragments have been obtained, the patient information, the test item names and the test item measured values are recognized in turn, each with a different engine configuration. The specific steps are as follows:
6.1 Patient information recognition and parsing. The patient information region is recognized with Tesseract's document mode to obtain a result string, which is split into character blocks at whitespace. Regular expression matching is applied to each character block to find the blocks that contain an information item name (such as name, age, sex, diagnosis, card number, case number, outpatient number, inpatient number, or sample type); the character block that follows is the value of that item. The parsed patient information is then cleaned and structured to give the final patient information object.
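The parsing in step 6.1 could look roughly like the sketch below. The field labels are English placeholders for whatever labels appear on the actual reports, and the pattern, the function name `parse_patient_info` and the dictionary output are assumptions for illustration; the cleaning step is not shown.

```python
import re

# Placeholder labels; a real deployment would list the labels printed on the reports.
FIELD_PATTERN = re.compile(r"^(Name|Age|Sex|Diagnosis)[:：]?$", re.IGNORECASE)


def parse_patient_info(recognized_text):
    """Split OCR output at whitespace and pair each label block with the next block."""
    blocks = recognized_text.split()
    info = {}
    for index, block in enumerate(blocks[:-1]):
        match = FIELD_PATTERN.match(block)
        if match:
            # The block that follows a label is taken as its value.
            info[match.group(1).lower()] = blocks[index + 1]
    return info
```

A call such as `parse_patient_info("Name: Zhang Age: 45 Sex: M")` pairs each label with the block after it. Labels fused with their value by the OCR (for example `Name:Zhang`) are not handled by this sketch.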
6.2 Test item name recognition. The test item name image blocks that were cut out are recognized in batch with Tesseract's block mode, giving a list of test item recognition results.
6.3 Intelligent correction of test item names based on the test item information database. A dictionary of test item names is built from the item names and aliases in the database. For each item name recognized in step 6.2, the following is performed:
6.3.1 If the recognition result is in the dictionary, correction ends, and the item is removed from the dictionary used for this round of correction.
6.3.2 If the recognition result is not in the dictionary, the normalized edit distance (Normalized Levenshtein Distance) between the recognition result and every entry in the dictionary is calculated, and the entries whose distance is less than 0.8 form the correction candidate list.
6.3.3 The candidate entries are sorted in ascending order of edit distance to the recognition result, and the entry with the smallest distance in the candidate list is taken as the correction candidate. If this candidate is in the dictionary, correction of the item ends: the candidate is used as the correction result and is removed from the dictionary.
6.3.4 If the candidate is not in the dictionary, it is removed from the correction candidate list and step 6.3.3 is repeated.
The edit distance between two character strings is the number of edit operations needed to change one into the other, where the edit operations are substitution, insertion and deletion of a character. The normalized edit distance is the edit distance divided by the length of the longer string. Specifically:
Define two strings s1 and s2 with lengths len1 and len2. Let dp[i][j] denote the minimum edit distance between s1[0..i] and s2[0..j], where for a string s, s[0..i] denotes the substring of s that starts at index 0 and has length i. The procedure is as follows:
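a. Initialize dp[i][j]: if i = 0, dp[i][j] = j; if j = 0, dp[i][j] = i;
b. State transition equation, for i > 0 and j > 0:

$$\mathrm{dp}[i][j] = \min\bigl(\mathrm{dp}[i-1][j] + 1,\ \mathrm{dp}[i][j-1] + 1,\ \mathrm{dp}[i-1][j-1] + c_{ij}\bigr)$$

where $c_{ij} = 0$ if the i-th character of s1 equals the j-th character of s2, and $c_{ij} = 1$ otherwise;
c. Compute dp[i][j] for i = 1 → len1, j = 1 → len2;
d. The minimum edit distance between s1 and s2 is dp[len1][len2];
e. The normalized edit distance is

$$\frac{\mathrm{dp}[len1][len2]}{\max(len1,\ len2)}$$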
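The distance defined in steps a to e and the correction procedure of 6.3.1 to 6.3.4 can be sketched together as follows. This is one reading of the text rather than the authors' code: it assumes candidates are drawn from the full dictionary, while a separate set of entries not yet assigned on the current report plays the role of "the dictionary used for this round of correction". The names `correct_item_name`, `full_dictionary` and `remaining` are hypothetical, and the item names in the usage note are English placeholders.

```python
def edit_distance(s1, s2):
    """Levenshtein distance via the dp table described in steps a to d."""
    len1, len2 = len(s1), len(s2)
    dp = [[0] * (len2 + 1) for _ in range(len1 + 1)]
    for i in range(len1 + 1):
        dp[i][0] = i                      # delete i characters
    for j in range(len2 + 1):
        dp[0][j] = j                      # insert j characters
    for i in range(1, len1 + 1):
        for j in range(1, len2 + 1):
            cost = 0 if s1[i - 1] == s2[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,          # deletion
                           dp[i][j - 1] + 1,          # insertion
                           dp[i - 1][j - 1] + cost)   # substitution or match
    return dp[len1][len2]


def normalized_edit_distance(s1, s2):
    """Edit distance divided by the length of the longer string (step e)."""
    longest = max(len(s1), len(s2))
    return edit_distance(s1, s2) / longest if longest else 0.0


def correct_item_name(recognized, full_dictionary, remaining, max_distance=0.8):
    """Return the corrected item name, or None if no candidate qualifies.

    `remaining` is a set of entries not yet used on this report; it is
    updated in place when an entry is consumed.
    """
    if recognized in remaining:                       # 6.3.1
        remaining.discard(recognized)
        return recognized
    candidates = sorted(                              # 6.3.2, ascending distance
        (normalized_edit_distance(recognized, entry), entry)
        for entry in full_dictionary
    )
    for distance, entry in candidates:                # 6.3.3 and 6.3.4
        if distance >= max_distance:
            break                                     # the rest are further away
        if entry in remaining:
            remaining.discard(entry)
            return entry
    return None
```

With `names = ["hemoglobin", "hematocrit", "platelets"]` and `remaining = set(names)`, the misread string `"hemoglobln"` is one substitution away from `"hemoglobin"` and is corrected to it, after which that entry is no longer available for other rows of the same report.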
6.4 Measured-value type determination and measured-value recognition based on the test item information database. After test item correction is complete, the type of each item's measured value (text, number, etc.) is looked up in the test item database. Measured values of different types are recognized with the corresponding Tesseract engine configuration, giving the final recognition results.
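One way to organize the type-dependent configuration of step 6.4 is a lookup from measured-value type to a Tesseract argument string, as sketched below. The text does not give the actual configurations, so the mapping is purely illustrative. `--oem 1` (LSTM engine), `--psm 7` (single text line) and the `tessedit_char_whitelist` variable are standard Tesseract options, but whether the whitelist is honoured by the LSTM engine may depend on the Tesseract version; the type names and the function `config_for` are hypothetical.

```python
# Illustrative configurations only; the patent does not list the real ones.
ENGINE_CONFIGS = {
    "numeric": "--oem 1 --psm 7 -c tessedit_char_whitelist=0123456789.<>+-",
    "text": "--oem 1 --psm 7",
}


def config_for(item_name, item_types, default_type="text"):
    """Pick the Tesseract arguments for an item from its measured-value type.

    `item_types` maps a corrected item name to its type, as it would be
    read from the test item information database.
    """
    value_type = item_types.get(item_name, default_type)
    return ENGINE_CONFIGS[value_type]
```

Restricting numeric cells to digits and a few symbols removes a whole class of confusions (such as the letter O read for the digit 0) that a general-purpose configuration would allow.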
The above description of the embodiments is provided so that those of ordinary skill in the art can understand and apply the present invention. Those skilled in the art can obviously make various modifications to the above embodiments and apply the general principles described here to other embodiments without creative effort. The present invention is therefore not limited to the above embodiments; improvements and modifications made by those skilled in the art on the basis of this disclosure all fall within the protection scope of the present invention.

Claims (7)

1. A laboratory test report recognition method based on image processing, comprising the following steps:
(1) performing edge recognition on a laboratory test report photo taken with a mobile phone to obtain the quadrilateral outline of the test report;
(2) cropping and rectifying the test report photo by perspective transformation based on the quadrilateral outline to obtain a test report image;
(3) performing tilt correction on the test report image based on the probabilistic Hough transform;
(4) extracting the dividing lines in the corrected test report image, and dividing the test report image along the dividing lines into upper, middle and lower regions, corresponding respectively to the patient information, the test item information, and the doctor and verification/review information;
(5) further segmenting the test item information region of the corrected test report image according to row and column information, and binarizing the segmented test report image;
(6) performing classified recognition and intelligent error correction on the segmented binary images using the LSTM model in the open-source OCR engine Tesseract.
2. The laboratory test report recognition method according to claim 1, characterized in that the edge recognition in step (1) proceeds as follows:
1.1 the test report photo is resampled to obtain its thumbnail;
1.2 the thumbnail is preprocessed by, in order, dilation, fast edge detection based on structured forests, erosion, and binarization, giving an edge-information image;
1.3 line detection is performed on the edge-information image with the Hough transform, together with line screening based on local maxima and an adaptive threshold and merging of similar lines;
1.4 the intersections between the lines are computed by a vector method, all quadrilaterals enclosed by the lines are found by traversing 4-tuples of intersections, and the quadrilateral whose four sides have the largest total weight is taken as the quadrilateral outline of the test report; the weight of a side is the number of points on the line on which the side lies.
3. The laboratory test report recognition method according to claim 1, characterized in that the tilt correction of the test report image based on the probabilistic Hough transform in step (3) proceeds as follows:
3.1 the test report image is scaled to a width of 1200 pixels;
3.2 the scaled test report image is converted to a grayscale image and subjected to illumination correction and binarization: the illumination distribution is corrected by subtracting the mean-filtered image from the image before filtering and adding the mean offset, the contrast of the image is then enhanced by contrast-limited adaptive histogram equalization, and finally the image is binarized;
3.3 the binary image is eroded, and the edge pixels of the eroded binary image are extracted with the Canny operator to give the corresponding edge-information image;
3.4 line detection based on the probabilistic Hough transform is performed on the edge-information image, and the test report image is tilt-corrected according to the average inclination angle of all lines.
4. The laboratory test report recognition method according to claim 1, characterized in that in step (4) the dividing lines in the corrected test report image are extracted with the LSD line segment detection algorithm.
5. The laboratory test report recognition method according to claim 1, characterized in that step (5) is implemented as follows:
5.1 the corrected test report image is scaled to obtain the corresponding reduced image, which is binarized;
5.2 the number of black pixels in each row of the test item information region of the binary image is counted; the counts alternate between peaks and troughs, each trough being the blank space between rows; the test item information region is divided into multiple rows according to these statistics, and the rows with fewer black pixels than a threshold are deleted;
5.3 using the row information obtained in step 5.2, the widths of the blank runs in each row are counted and sorted; the statistics show that the gaps between characters within a row are small and account for the vast majority, whereas the gaps between the columns of the test items are large and only a few in number; the character spacing is determined from this, and a threshold larger than the character spacing is set as the minimum column spacing;
5.4 the number of black pixels in each column of the test item information region of the binary image is counted; the counts alternate between peaks and troughs, each trough being blank space between columns; the test item information region is divided into multiple columns according to these statistics and the minimum column spacing;
5.5 according to the row and column information obtained in the above steps, the test item information region of the corrected test report image is further segmented, and each segmented image is binarized with the optimal global threshold selected by OTSU.
6. The laboratory test report recognition method according to claim 1, characterized in that step (6) is implemented as follows:
6.1 the patient information region is recognized with the document mode of Tesseract to obtain a result string; the result string is split at whitespace, and regular expression matching is used to find the character blocks that contain an information item name, the character block that follows being the value of that information item; the parsed patient information is then cleaned and structured to give the final patient information object;
6.2 the segmented test item information region is recognized in batch with the block mode of Tesseract to obtain the recognition results of the test items;
6.3 a test item information database containing the test item name, the test item alias and the test item measured-value type is established, and intelligent error correction is performed on the test item recognition results using this database;
6.4 for any test item, its measured-value type is obtained from the test item information database, and the measured value of the test item is recognized with the corresponding Tesseract engine configuration according to that type, giving the recognition result of the measured value.
7. The laboratory test report recognition method according to claim 6, characterized in that the intelligent error correction of the test item recognition results using the test item information database in step 6.3 proceeds as follows:
6.3.1 a dictionary of test item names is first built from the test item names and aliases in the test item information database;
6.3.2 for the recognition result of any test item, if the recognition result is in the dictionary, correction ends; if the recognition result is not in the dictionary, the normalized edit distance between the recognition result and every entry in the dictionary is calculated, and the entries with an edit distance less than 0.8 form the correction candidate list, arranged in ascending order of edit distance;
6.3.3 the entry with the smallest edit distance in the correction candidate list is taken as the correction candidate; if the candidate is in the dictionary, it is used directly to replace the recognition result; if the candidate is not in the dictionary, it is removed from the correction candidate list and step 6.3.3 is repeated.
CN201710575858.4A 2017-07-14 2017-07-14 A kind of laboratory test report recognition methods based on image procossing Pending CN107491730A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710575858.4A CN107491730A (en) 2017-07-14 2017-07-14 A kind of laboratory test report recognition methods based on image procossing

Publications (1)

Publication Number Publication Date
CN107491730A true CN107491730A (en) 2017-12-19

Family

ID=60643812

Country Status (1)

Country Link
CN (1) CN107491730A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102999483A (en) * 2011-09-16 2013-03-27 北京百度网讯科技有限公司 Method and device for correcting text
CN104933430A (en) * 2015-06-03 2015-09-23 北京好运到信息科技有限公司 Interactive image processing method and interactive image processing system for mobile terminal
CN105701835A (en) * 2016-02-26 2016-06-22 华北电力大学 Image edge detection method and system facing electric power facilities
CN106446881A (en) * 2016-07-29 2017-02-22 北京交通大学 Method for extracting lab test result from medical lab sheet image

Non-Patent Citations (6)

* Cited by examiner, † Cited by third party
Title
PIOTR DOLLAR,C. LAWRENCE ZITNICK: ""Structured Forests for Fast Edge Detection"", 《2013 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION》 *
乔寅骐,肖健华,黄银和,尹奎英: ""基于最小二乘修正的随机 Hough 变换直线检测"", 《计算机应用》 *
付文秀,丁明,郭燚: ""基于边缘组合算法的似物性采样研究"", 《北京交通大学学报》 *
梅亚敏: ""融合先验知识的场景文本识别应用研究"", 《中国优秀硕士学位论文全文数据库 信息科技辑》 *
段汝娇,赵 伟,黄松岭,陈建业: ""一种基于改进Hough变换的直线快速检测算法"", 《仪器仪表学报》 *
郭芹: ""复杂结构的名片识别中的版面分析方法研究"", 《中国优秀硕士学位论文全文数据库 信息科技辑》 *

Cited By (55)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109977723A (en) * 2017-12-22 2019-07-05 苏宁云商集团股份有限公司 Big bill picture character recognition methods
CN109977723B (en) * 2017-12-22 2021-10-22 苏宁云商集团股份有限公司 Large bill picture character recognition method
CN108133214A (en) * 2017-12-25 2018-06-08 广东小天才科技有限公司 A kind of information search method and mobile terminal corrected based on picture
CN110321760A (en) * 2018-03-29 2019-10-11 北京和缓医疗科技有限公司 A kind of medical document recognition methods and device
CN108596066B (en) * 2018-04-13 2020-05-26 武汉大学 Character recognition method based on convolutional neural network
CN108596066A (en) * 2018-04-13 2018-09-28 武汉大学 A kind of character identifying method based on convolutional neural networks
CN110399867A (en) * 2018-04-24 2019-11-01 深信服科技股份有限公司 A kind of recognition methods, system and the relevant apparatus of text class image-region
CN110399867B (en) * 2018-04-24 2023-05-12 深信服科技股份有限公司 Text image area identification method, system and related device
CN108805076A (en) * 2018-06-07 2018-11-13 浙江大学 The extracting method and system of environmental impact assessment report table word
CN108805076B (en) * 2018-06-07 2021-01-08 浙江大学 Method and system for extracting table characters of environmental impact evaluation report
CN108569606A (en) * 2018-06-15 2018-09-25 西安理工大学 Construction elevator safety door angle identification method based on bounding box features
CN109190629A (en) * 2018-08-28 2019-01-11 传化智联股份有限公司 A kind of electronics waybill generation method and device
CN109214387A (en) * 2018-09-14 2019-01-15 辽宁奇辉电子***工程有限公司 A kind of railway operation detection system based on character recognition technology
CN109446345A (en) * 2018-09-26 2019-03-08 深圳中广核工程设计有限公司 Nuclear power file verification processing method and system
CN109523568A (en) * 2018-10-12 2019-03-26 广东绿康源美环境科技有限公司 A kind of gross specimen camera system based on Canny algorithm
CN109460387A (en) * 2018-11-05 2019-03-12 帝麦克斯(苏州)医疗科技有限公司 Filename generation method and device
CN109583358A (en) * 2018-11-26 2019-04-05 广东智源信息技术有限公司 A kind of Medical Surveillance fast accurate enforcement approach
CN109558848A (en) * 2018-11-30 2019-04-02 湖南华诺星空电子技术有限公司 A kind of unmanned plane life detection method based on Multi-source Information Fusion
CN109584165A (en) * 2018-11-30 2019-04-05 泰康保险集团股份有限公司 A kind of antidote of digital picture, device, medium and electronic equipment
CN109636815A (en) * 2018-12-19 2019-04-16 东北大学 A kind of metal plate and belt Product labelling information identifying method based on computer vision
CN109784341A (en) * 2018-12-25 2019-05-21 华南理工大学 A kind of medical document recognition methods based on LSTM neural network
CN109766893A (en) * 2019-01-09 2019-05-17 北京数衍科技有限公司 Picture character recognition methods suitable for receipt of doing shopping
CN109815958A (en) * 2019-02-01 2019-05-28 杭州睿琪软件有限公司 A kind of laboratory test report recognition methods, device, electronic equipment and storage medium
CN109993160B (en) * 2019-02-18 2022-02-25 北京联合大学 Image correction and text and position identification method and system
CN109993160A (en) * 2019-02-18 2019-07-09 北京联合大学 A kind of image flame detection and text and location recognition method and system
CN110135412A (en) * 2019-04-30 2019-08-16 北京邮电大学 Business card identification method and device
CN110135414B (en) * 2019-05-16 2021-07-09 京北方信息技术股份有限公司 Corpus updating method, apparatus, storage medium and terminal
CN110135414A (en) * 2019-05-16 2019-08-16 京北方信息技术股份有限公司 Corpus update method, device, storage medium and terminal
CN110188649B (en) * 2019-05-23 2021-11-23 成都火石创造科技有限公司 Pdf file analysis method based on tesseract-ocr
CN110287793A (en) * 2019-05-23 2019-09-27 北京爱诺斯科技有限公司 A kind of image analysis method of recognizable eyesight prescription
CN110188649A (en) * 2019-05-23 2019-08-30 成都火石创造科技有限公司 Pdf document analysis method based on tesseract-ocr
CN112016553A (en) * 2019-05-28 2020-12-01 创新先进技术有限公司 Optical Character Recognition (OCR) system, automatic OCR correction system, method
US11023766B2 (en) 2019-05-28 2021-06-01 Advanced New Technologies Co., Ltd. Automatic optical character recognition (OCR) correction
CN110298282A (en) * 2019-06-21 2019-10-01 华南师范大学 Document image processing method, storage medium and computing device
CN110298282B (en) * 2019-06-21 2021-07-23 华南师范大学 Document image processing method, storage medium and computing device
CN110335280A (en) * 2019-07-05 2019-10-15 湖南联信科技有限公司 Financial document image segmentation and correction method based on mobile terminal
CN112347831A (en) * 2019-08-09 2021-02-09 株式会社日立制作所 Information processing apparatus and table identification method
CN110704687A (en) * 2019-09-02 2020-01-17 平安科技(深圳)有限公司 Character layout method, device and computer readable storage medium
CN110704687B (en) * 2019-09-02 2023-08-11 平安科技(深圳)有限公司 Text layout method, text layout device and computer readable storage medium
CN110554991A (en) * 2019-09-03 2019-12-10 浙江传媒学院 Method for correcting and managing text picture
CN110781898A (en) * 2019-10-21 2020-02-11 南京大学 Unsupervised learning method for Chinese character OCR post-processing
CN111369554A (en) * 2020-03-18 2020-07-03 山西安数智能科技有限公司 Optimization and pretreatment method of belt damage sample in low-brightness multi-angle environment
WO2021212614A1 (en) * 2020-04-23 2021-10-28 平安科技(深圳)有限公司 Text error correction method and apparatus, computer-readable storage medium and system
CN111627511A (en) * 2020-05-29 2020-09-04 北京大恒普信医疗技术有限公司 Ophthalmologic report content identification method and device and readable storage medium
CN112036232B (en) * 2020-07-10 2023-07-18 中科院成都信息技术股份有限公司 Image table structure identification method, system, terminal and storage medium
CN112036232A (en) * 2020-07-10 2020-12-04 中科院成都信息技术股份有限公司 Image table structure identification method, system, terminal and storage medium
CN111985491A (en) * 2020-09-03 2020-11-24 深圳壹账通智能科技有限公司 Similar information merging method, device, equipment and medium based on deep learning
CN112418204A (en) * 2020-11-18 2021-02-26 杭州未名信科科技有限公司 Text recognition method, system and computer medium based on paper document
CN112434699A (en) * 2020-11-25 2021-03-02 杭州六品文化创意有限公司 Automatic extraction and intelligent scoring system for handwritten Chinese characters or components and strokes
CN112507080A (en) * 2020-12-16 2021-03-16 北京信息科技大学 Character recognition and correction method
WO2022165692A1 (en) * 2021-02-04 2022-08-11 深圳迈瑞生物医疗电子股份有限公司 Reagent management method and related device
CN113012060A (en) * 2021-02-07 2021-06-22 深圳柔果信息科技有限公司 Image processing method, image processing system and electronic equipment
CN113920589A (en) * 2021-10-28 2022-01-11 平安银行股份有限公司 Signature identification method, device, equipment and medium based on artificial intelligence
CN116542979A (en) * 2023-07-06 2023-08-04 金钱猫科技股份有限公司 Image measurement-based prediction correction method and terminal
CN116542979B (en) * 2023-07-06 2023-10-03 金钱猫科技股份有限公司 Image measurement-based prediction correction method and terminal

Similar Documents

Publication Publication Date Title
CN107491730A (en) Laboratory test report recognition method based on image processing
CN108647681B (en) English text detection method with text orientation correction
Zhang et al. Text extraction from natural scene image: A survey
CN111401372B (en) Method for extracting and identifying image-text information of scanned document
Qureshi et al. A bibliography of pixel-based blind image forgery detection techniques
Epshtein et al. Detecting text in natural scenes with stroke width transform
CN104751142B (en) Natural scene text detection method based on stroke features
EP1271403B1 (en) Method and device for character location in images from digital camera
CN109409355B (en) Novel transformer nameplate identification method and device
CN104408449B (en) Intelligent mobile terminal scene text processing method
Fabrizio et al. Text detection in street level images
CN111650220A (en) Vision-based image-text defect detection method
CN101122952A (en) Image text detection method
CN111915704A (en) Apple hierarchical identification method based on deep learning
CN101122953A (en) Image text segmentation method
CN106446750A (en) Bar code reading method and device
CN105046200B (en) Electronic paper marking method based on straight line detection
CN107195069A (en) Automatic RMB serial number identification method
CN113033558B (en) Text detection method and device for natural scene and storage medium
Breuel Robust, simple page segmentation using hybrid convolutional mdlstm networks
Rath et al. Indexing for a digital library of George Washington’s manuscripts: a study of word matching techniques
CN116071763A (en) Teaching book intelligent correction system based on character recognition
Forczmański et al. Stamps detection and classification using simple features ensemble
CN115082776A (en) Electric energy meter automatic detection system and method based on image recognition
Khan et al. Car Number Plate Recognition (CNPR) system using multiple template matching

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20171219