CN106778732A - Text information feature extraction and recognition method based on Gabor filter - Google Patents

Info

Publication number
CN106778732A
Authority
CN
China
Prior art keywords
image
gabor filter
text
feature extraction
dbn
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201710027704.1A
Other languages
Chinese (zh)
Inventor
刘明珠
李文静
郑云非
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Harbin University of Science and Technology
Original Assignee
Harbin University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Harbin University of Science and Technology
Priority to CN201710027704.1A
Publication of CN106778732A

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00: Arrangements for image or video recognition or understanding
    • G06V10/20: Image preprocessing
    • G06V10/22: Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00: Arrangements for image or video recognition or understanding
    • G06V10/40: Extraction of image or video features
    • G06V10/44: Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G06V10/443: Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components by matching or filtering
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00: Scenes; Scene-specific elements
    • G06V20/60: Type of objects
    • G06V20/62: Text, e.g. of license plates, overlay texts or captions on TV images
    • G06V20/635: Overlay text, e.g. embedded captions in a TV program

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

A text information feature extraction and recognition method based on a Gabor filter. It addresses the low efficiency of existing techniques for extracting text information from video and images. The method is realized by the following steps: design a Gabor filter; design and train a DBN classification network; use morphological methods to denoise the localized image, fill hole regions, and remove isolated points, so that the localized text image is more accurate, and map the final denoised binary text localization image onto the original video frame to obtain an accurate text localization region; perform text enhancement, binarization, normalization, and feature extraction on the localized and processed text region; and recognize the text processed in step four with OCR technology. The invention can extract text information from video and images more accurately.

Description

Text information feature extraction and recognition method based on Gabor filter
Technical field:
The present invention relates to a text information feature extraction and recognition method based on a Gabor filter.
Background art:
In recent years, with rising living standards and the continued development of multimedia information technology, images and video have become an indispensable information medium in daily life and an important channel for spreading information on the internet. Large volumes of multimedia content, such as news, films, television dramas, and self-shot videos, are produced every day. With so much video and image data on the internet, managing and using video data and retrieving important video content have become extremely important.
The text in a video usually carries high-level semantic information: it supplements and explains the video content and helps people understand and retrieve the video, so text information is closely tied to the video itself. If the text in a video can be recognized effectively, the recognized text can be used to summarize the video content automatically, which makes retrieving, understanding, and analyzing video more convenient. Effectively localizing and recognizing text in video is therefore very meaningful. Extraction of text information from video and images can be applied to video content retrieval, intelligent transportation, visual recognition systems, digital libraries, and other fields.
Summary of the invention:
The purpose of the invention is to solve the low efficiency of existing techniques for extracting text information from video and images. To that end it proposes a text information feature extraction and recognition method based on a Gabor filter.
The above purpose is achieved by the following technical scheme:
Step one: design a Gabor filter;
Step two: design and train a DBN classification network;
Step three: use morphological methods to denoise the localized image, fill hole regions, and remove isolated points, so that the localized text image is more accurate, and map the final denoised binary text localization image onto the original video frame to obtain an accurate text localization region;
Step four: perform text enhancement, binarization, normalization, and feature extraction on the localized and processed text localization region;
Step five: recognize the text processed in step four with OCR technology.
Beneficial effects:
The invention builds on the characteristics of the Gabor filter and its response to the texture features of text. It studies the properties of the sinusoidal plane wave and of the Gaussian function, gives a method for extracting text features with a Gabor filter, and gives the filter's response to text texture features in four directions. Using deep learning, it constructs a deep belief network (DBN). The constructed network processes the texture feature images that the Gabor filter outputs in the four directions and thereby localizes the text. Morphological processing is then applied to the text regions localized in the video: erosion, dilation, opening, closing, and combinations of these remove noise and isolated points and fill hole regions, making the localized text image more accurate. Finally, the morphologically processed text regions undergo image binarization, character segmentation, normalization, and feature extraction, so that the processed text can be recognized effectively by OCR, which improves the recognition rate for text in video.
Specific embodiments:
Specific embodiment one:
In the text information feature extraction and recognition method based on a Gabor filter of this embodiment, the method is realized by the following steps:
Step one: design a Gabor filter;
Step two: design and train a DBN classification network;
Step three: use morphological methods to denoise the localized image, fill hole regions, and remove isolated points, so that the localized text image is more accurate, and map the final denoised binary text localization image onto the original video frame to obtain an accurate text localization region;
Step four: perform text enhancement, binarization, normalization, and feature extraction on the localized and processed text localization region;
Step five: recognize the text processed in step four with OCR technology.
Specific embodiment two:
This embodiment differs from specific embodiment one in that designing the Gabor filter in step one means selecting suitable parameters and processing the texture features of the characters in the video frame along four directions: 0°, 45°, 90°, and 135°. This yields four texture feature images, one per direction, suppresses the background region, and preserves the text texture features in the four directions. Specifically:
In the spatial domain, the Gabor filter is regarded as a sinusoidal plane wave that is modulated by a Gaussian function. The Gabor filter is determined by 7 parameters: the center point, the angle, two mean square deviations, and two further parameters [parameter symbols missing in the source]. The Gabor filter function is simplified under the following assumptions:
(1) The direction of the sinusoidal plane wave is the same as the rotation angle of the Gaussian kernel function.
(2) The center of the Gaussian kernel function is at (0, 0), and the two mean square deviations are equal.
(3) After the sinusoidal plane wave is modulated by the Gaussian function, its cosine and sine components behave differently: a term must be subtracted from the cosine component so that the plane wave as a whole keeps the zero-mean property. The simplified two-dimensional Gabor filter can then be defined as: [formula missing in the source]
Here the position variable denotes the pixel position, ω the frequency, θ the filtering direction, and σ the mean square deviation. The frequency ω and the mean square deviation σ are related by: [formula missing in the source]
where φ is the bandwidth in octaves, with value 1.
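The filter formula itself does not survive in the text above. As a hedged illustration only, the sketch below builds a four-direction filter bank from a commonly used DC-compensated Gabor form that is consistent with the three simplifying assumptions: G(x, y) = 1/(2πσ²) · exp(−(x² + y²)/(2σ²)) · [exp(jω(x cos θ + y sin θ)) − exp(−ω²σ²/2)], with σ·ω = sqrt(2 ln 2) · (2^φ + 1)/(2^φ − 1). Both expressions, the function name gabor_kernel, the frequency ω = π/4, and the 3σ kernel radius are assumptions made for this example and are not taken from the patent.

```python
import cmath
import math


def gabor_kernel(omega, theta_deg, phi=1.0, half_size=None):
    """Return a complex 2-D Gabor kernel (list of rows) with DC compensation.

    Assumed form, not confirmed by the source: isotropic Gaussian envelope
    centred at (0, 0), carrier direction equal to theta, and a subtracted
    constant so that the real (cosine) part is approximately zero-mean.
    """
    # Assumed relation between sigma and omega for a bandwidth of phi octaves.
    kappa = math.sqrt(2.0 * math.log(2.0)) * (2.0 ** phi + 1.0) / (2.0 ** phi - 1.0)
    sigma = kappa / omega
    if half_size is None:
        half_size = int(math.ceil(3.0 * sigma))  # cover roughly +/- 3 sigma
    theta = math.radians(theta_deg)
    dc = math.exp(-0.5 * (omega * sigma) ** 2)  # term subtracted from the carrier
    norm = 1.0 / (2.0 * math.pi * sigma ** 2)
    kernel = []
    for y in range(-half_size, half_size + 1):
        row = []
        for x in range(-half_size, half_size + 1):
            envelope = norm * math.exp(-(x * x + y * y) / (2.0 * sigma ** 2))
            phase = omega * (x * math.cos(theta) + y * math.sin(theta))
            row.append(envelope * (cmath.exp(1j * phase) - dc))
        kernel.append(row)
    return kernel


# One kernel per direction named in the text: 0, 45, 90 and 135 degrees.
bank = {angle: gabor_kernel(omega=math.pi / 4, theta_deg=angle)
        for angle in (0, 45, 90, 135)}
```

Convolving a grayscale frame with each kernel and taking the magnitude of the response would give one texture feature image per direction; the convolution step is left out here to keep the sketch short.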
Specific embodiment three:
This embodiment differs from specific embodiment one or two in that designing and training the DBN classification network in step two means building the DBN classification network from RBM (restricted Boltzmann machine) network structures. Stacking different numbers of RBM layers gives DBN classification networks of different depths. The network structure, complexity, and localization performance at different depths are compared, and a DBN classification network of suitable depth is selected to process the video frames and localize the text regions. Specifically:
A DBN is a probabilistic model composed of a series of restricted Boltzmann machines. It can be described as follows. Suppose a system S has n layers, takes an input, and produces an output, so that the general learning process passes the input through each layer in turn to the output [symbols and expression missing in the source]. If the output of the system equals its input, that is, the input passes through the system with no information loss, or with a loss so small that it can be regarded as essentially unchanged, then the input has passed through every layer S_i with almost no loss of information. In other words, the output of any layer S_i is another representation of the original information, that is, of the input.
The pre-training of each layer is carried out with unsupervised learning: only one layer of the network is trained at a time, and its training result serves as the input to the next higher layer. A top-down supervised algorithm is then used to adjust all layers.
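The layer-by-layer unsupervised pre-training described above can be sketched as follows. This is a minimal illustration, assuming binary-valued inputs and one-step contrastive divergence (CD-1) as the RBM training rule; the patent does not name the training rule, and the class RBM, the functions pretrain_dbn and encode, the learning rate, and the epoch count are all hypothetical. The top-down supervised fine-tuning stage is not shown.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class RBM:
    """Tiny restricted Boltzmann machine trained with CD-1 (assumed rule)."""

    def __init__(self, n_visible, n_hidden, rng):
        self.rng = rng
        self.w = [[rng.gauss(0.0, 0.1) for _ in range(n_hidden)]
                  for _ in range(n_visible)]
        self.b_vis = [0.0] * n_visible
        self.b_hid = [0.0] * n_hidden

    def hidden_probs(self, v):
        return [sigmoid(self.b_hid[j] + sum(v[i] * self.w[i][j] for i in range(len(v))))
                for j in range(len(self.b_hid))]

    def visible_probs(self, h):
        return [sigmoid(self.b_vis[i] + sum(h[j] * self.w[i][j] for j in range(len(h))))
                for i in range(len(self.b_vis))]

    def cd1_step(self, v0, lr=0.1):
        # Positive phase, one Gibbs step, then the negative phase.
        h0 = self.hidden_probs(v0)
        h_sample = [1.0 if self.rng.random() < p else 0.0 for p in h0]
        v1 = self.visible_probs(h_sample)
        h1 = self.hidden_probs(v1)
        for i in range(len(v0)):
            for j in range(len(h0)):
                self.w[i][j] += lr * (v0[i] * h0[j] - v1[i] * h1[j])
            self.b_vis[i] += lr * (v0[i] - v1[i])
        for j in range(len(h0)):
            self.b_hid[j] += lr * (h0[j] - h1[j])


def pretrain_dbn(data, layer_sizes, epochs=20, seed=0):
    """Greedy pre-training: train one RBM at a time, bottom to top."""
    rng = random.Random(seed)
    layers, inputs = [], data
    for n_hidden in layer_sizes:
        rbm = RBM(len(inputs[0]), n_hidden, rng)
        for _ in range(epochs):
            for v in inputs:
                rbm.cd1_step(v)
        layers.append(rbm)
        # The trained layer's output becomes the input of the next layer.
        inputs = [rbm.hidden_probs(v) for v in inputs]
    return layers


def encode(layers, v):
    """Pass one input vector through every pre-trained layer."""
    for rbm in layers:
        v = rbm.hidden_probs(v)
    return v


toy_data = [[1, 1, 0, 0], [0, 0, 1, 1], [1, 1, 0, 0], [0, 0, 1, 1]]
dbn = pretrain_dbn(toy_data, layer_sizes=[3, 2])
```

Changing layer_sizes changes the depth of the stack, which is how networks of different depth could be compared before one is selected.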
Specific embodiment four:
This embodiment differs from specific embodiment three in that the morphological denoising, hole filling, and isolated-point removal applied to the localized image in step three proceed as follows:
First, erosion and dilation are each applied to the binary image produced by DBN classification. Second, opening and closing, which are formed by combining erosion and dilation, are each applied on their own to the binary image produced by DBN classification. Third, opening followed by closing and closing followed by opening are applied to the binary image produced by DBN classification.
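The operations named above can be sketched on a small binary image. This is an illustrative implementation only, assuming a 3x3 square structuring element and a border rule that ignores out-of-image neighbours; the patent specifies neither, and all function names are hypothetical.

```python
def _filter3x3(img, reducer):
    """Apply min or max over each pixel's in-bounds 3x3 neighbourhood."""
    rows, cols = len(img), len(img[0])
    out = []
    for r in range(rows):
        row = []
        for c in range(cols):
            window = [img[rr][cc]
                      for rr in range(max(0, r - 1), min(rows, r + 2))
                      for cc in range(max(0, c - 1), min(cols, c + 2))]
            row.append(reducer(window))
        out.append(row)
    return out


def erode(img):
    return _filter3x3(img, min)   # a pixel survives only if all neighbours are 1


def dilate(img):
    return _filter3x3(img, max)   # a pixel is set if any neighbour is 1


def opening(img):
    return dilate(erode(img))     # removes isolated points and thin noise


def closing(img):
    return erode(dilate(img))     # fills small holes


def open_then_close(img):
    return closing(opening(img))


def close_then_open(img):
    return opening(closing(img))


# 9x9 test image: a 5x5 block with a one-pixel hole, plus an isolated point.
image = [[1 if 2 <= r <= 6 and 2 <= c <= 6 else 0 for c in range(9)]
         for r in range(9)]
image[4][4] = 0   # hole inside the block
image[0][8] = 1   # isolated noise point
cleaned = close_then_open(image)
```

On this image, closing first fills the hole and the following opening removes the isolated point. Applying opening first would erase the whole block, because every interior pixel touches the hole, which suggests why the order of the combined operations matters.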
Specific embodiment five:
This embodiment differs from specific embodiment one, two, or four in that, for the localized and processed text localization region in step four, text enhancement and binarization are carried out as follows:
(1) Take an initial threshold g_0 [formula missing in the source],
where g_max is the maximum gray value and g_min is the minimum gray value;
(2) Using the initial threshold g_0, divide the pixels of the video image into two parts: those greater than g_0 and those less than g_0;
(3) Compute the expected value of the part greater than the threshold g_0 and of the part less than the threshold g_0 from step (2), then take the average of these two expected values;
(4) Iterate until the value of [expression missing in the source] is sufficiently small, then take t = [expression missing in the source]; this value t is the threshold.
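Steps (1) to (4) resemble the classic iterative threshold selection scheme, sketched below. Because the formulas are missing from the text, the sketch assumes that the initial threshold is the midpoint of g_max and g_min and that iteration stops when successive thresholds differ by less than a tolerance eps; the function names, eps, and max_iter are hypothetical.

```python
def iterative_threshold(pixels, eps=0.5, max_iter=100):
    """Return a global threshold for a flat list of gray values."""
    # (1) Assumed initial threshold: midpoint of the gray-value range.
    g = (max(pixels) + min(pixels)) / 2.0
    for _ in range(max_iter):
        # (2) Split the pixels into two parts around the current threshold.
        high = [p for p in pixels if p > g]
        low = [p for p in pixels if p <= g]
        if not high or not low:
            break  # uniform image: nothing to separate
        # (3) Average the mean gray values of the two parts.
        new_g = (sum(high) / len(high) + sum(low) / len(low)) / 2.0
        # (4) Assumed stopping rule: the threshold has stopped changing.
        converged = abs(new_g - g) < eps
        g = new_g
        if converged:
            break
    return g


def binarize(pixels, threshold):
    return [1 if p > threshold else 0 for p in pixels]


gray = [10, 12, 11, 13, 200, 205, 210]
t = iterative_threshold(gray)
mask = binarize(gray, t)
```

For this toy input the two parts have means 11.5 and 205, so the threshold settles at 108.25 and separates the dark pixels from the bright ones.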
Specific embodiment six:
This embodiment differs from specific embodiment five in that, for the localized and processed text localization region in step four, the normalization operation is carried out as follows:
Let the original image have a given size and let it have a given target size after normalization [size symbols missing in the source].
Splitting: each pixel of the original image is enlarged, that is, the pixel value at that point is copied from the array of the original image into the enlarged array, which gives a new, larger array.
Merging: the enlarged array is divided into equal-sized sub-arrays [counts and sizes missing in the source].
The pixel values in each sub-array are then averaged, and the sub-array is replaced by its average pixel value. Through this split-and-merge process the original image is normalized to the new image.
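One plausible reading of the split-and-merge normalization is sketched below. The size symbols are missing from the text, so the sketch assumes an original image of in_rows x in_cols pixels and a target of out_rows x out_cols pixels: splitting replicates every pixel into an out_rows x out_cols block, and merging averages each in_rows x in_cols block of the enlarged array. The function and variable names are hypothetical.

```python
def normalize_size(img, out_rows, out_cols):
    """Resize a 2-D list of gray values by pixel replication and block averaging."""
    in_rows, in_cols = len(img), len(img[0])
    # Splitting: copy each original pixel into an out_rows x out_cols block,
    # giving an array of (in_rows * out_rows) x (in_cols * out_cols) values.
    big = [[img[r // out_rows][c // out_cols]
            for c in range(in_cols * out_cols)]
           for r in range(in_rows * out_rows)]
    # Merging: cut the enlarged array into out_rows x out_cols blocks of
    # in_rows x in_cols values each, and replace each block by its mean.
    out = []
    for i in range(out_rows):
        row = []
        for j in range(out_cols):
            block = [big[r][c]
                     for r in range(i * in_rows, (i + 1) * in_rows)
                     for c in range(j * in_cols, (j + 1) * in_cols)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


shrunk = normalize_size([[0, 4], [8, 12]], 1, 1)    # 2x2 -> 1x1
grown = normalize_size([[1, 3]], 2, 4)              # 1x2 -> 2x4
uneven = normalize_size([[0, 3, 6]], 1, 2)          # 1x3 -> 1x2
```

Because the enlarged array holds in_rows * out_rows * in_cols * out_cols values, this direct form is only practical for small character images; the same averages could be computed without materializing the intermediate array.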
Specific embodiment seven:
This embodiment differs from specific embodiment six in that recognizing the text processed in step four with OCR technology, as described in step five, specifically means using Hanwang (Han Wang) OCR recognition software to recognize the binary image processed in step four.

Claims (7)

1. A text information feature extraction and recognition method based on a Gabor filter, characterized in that the method is realized by the following steps:
Step one: design a Gabor filter;
Step two: design and train a DBN classification network;
Step three: use morphological methods to denoise the localized image, fill hole regions, and remove isolated points, so that the localized text image is more accurate, and map the final denoised binary text localization image onto the original video frame to obtain an accurate text localization region;
Step four: perform text enhancement, binarization, normalization, and feature extraction on the localized and processed text localization region;
Step five: recognize the text processed in step four with OCR technology.
2. The text information feature extraction and recognition method based on a Gabor filter according to claim 1, characterized in that designing the Gabor filter in step one means selecting suitable parameters and processing the texture features of the characters in the video frame along four directions, 0°, 45°, 90°, and 135°, to obtain four texture feature images, one per direction, suppressing the background region and preserving the text texture features in the four directions, specifically:
In the spatial domain, the Gabor filter is regarded as a sinusoidal plane wave that is modulated by a Gaussian function. The Gabor filter is determined by 7 parameters: the center point, the angle, two mean square deviations, and two further parameters [parameter symbols missing in the source]. The Gabor filter function is simplified under the following assumptions:
(1) The direction of the sinusoidal plane wave is the same as the rotation angle of the Gaussian kernel function.
(2) The center of the Gaussian kernel function is at (0, 0), and the two mean square deviations are equal.
(3) After the sinusoidal plane wave is modulated by the Gaussian function, its cosine and sine components behave differently: a term must be subtracted from the cosine component so that the plane wave as a whole keeps the zero-mean property. The simplified two-dimensional Gabor filter can then be defined as: [formula missing in the source]
Here the position variable denotes the pixel position, ω the frequency, θ the filtering direction, and σ the mean square deviation. The frequency ω and the mean square deviation σ are related by: [formula missing in the source]
where φ is the bandwidth in octaves, with value 1.
3. The text information feature extraction and recognition method based on a Gabor filter according to claim 1 or 2, characterized in that designing and training the DBN classification network in step two means building the DBN classification network from RBM network structures, obtaining DBN classification networks of different depths from different numbers of RBM layers, comparing the network structure, complexity, and localization performance at different depths, and selecting a DBN classification network of suitable depth to process the video frames and localize the text regions, specifically:
A DBN is a probabilistic model composed of a series of restricted Boltzmann machines. It can be described as follows. Suppose a system S has n layers, takes an input, and produces an output, so that the general learning process passes the input through each layer in turn to the output [symbols and expression missing in the source]. If the output of the system equals its input, that is, the input passes through the system with no information loss, or with a loss so small that it can be regarded as essentially unchanged, then the input has passed through every layer S_i with almost no loss of information. In other words, the output of any layer S_i is another representation of the original information, that is, of the input.
The pre-training of each layer is carried out with unsupervised learning: only one layer of the network is trained at a time, and its training result serves as the input to the next higher layer. A top-down supervised algorithm is then used to adjust all layers.
4. The text information feature extraction and recognition method based on a Gabor filter according to claim 3, characterized in that the morphological denoising, hole filling, and isolated-point removal applied to the localized image in step three proceed as follows:
First, erosion and dilation are each applied to the binary image produced by DBN classification. Second, opening and closing, which are formed by combining erosion and dilation, are each applied on their own to the binary image produced by DBN classification. Third, opening followed by closing and closing followed by opening are applied to the binary image produced by DBN classification.
5. The text information feature extraction and recognition method based on a Gabor filter according to claim 1, 2 or 4, characterized in that, for the localized and processed text localization region in step four, text enhancement and binarization are carried out as follows:
(1) Take an initial threshold g_0 [formula missing in the source],
where g_max is the maximum gray value and g_min is the minimum gray value;
(2) Using the initial threshold g_0, divide the pixels of the video image into two parts: those greater than g_0 and those less than g_0;
(3) Compute the expected value of the part greater than the threshold g_0 and of the part less than the threshold g_0 from step (2), then take the average of these two expected values;
(4) Iterate until the value of [expression missing in the source] is sufficiently small, then take t = [expression missing in the source]; this value t is the threshold.
6. The text information feature extraction and recognition method based on a Gabor filter according to claim 5, characterized in that, for the localized and processed text localization region in step four, the normalization operation is carried out as follows:
Let the original image have a given size and let it have a given target size after normalization [size symbols missing in the source].
Splitting: each pixel of the original image is enlarged, that is, the pixel value at that point is copied from the array of the original image into the enlarged array, which gives a new, larger array.
Merging: the enlarged array is divided into equal-sized sub-arrays [counts and sizes missing in the source].
The pixel values in each sub-array are then averaged, and the sub-array is replaced by its average pixel value. Through this split-and-merge process the original image is normalized to the new image.
7. The text information feature extraction and recognition method based on a Gabor filter according to claim 6, characterized in that recognizing the text processed in step four with OCR technology, as described in step five, specifically means using Hanwang (Han Wang) OCR recognition software to recognize the binary image processed in step four.
CN201710027704.1A 2017-01-16 2017-01-16 Text information feature extraction and recognition method based on Gabor filter Pending CN106778732A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710027704.1A CN106778732A (en) 2017-01-16 2017-01-16 Text information feature extraction and recognition method based on Gabor filter

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710027704.1A CN106778732A (en) 2017-01-16 2017-01-16 Text information feature extraction and recognition method based on Gabor filter

Publications (1)

Publication Number Publication Date
CN106778732A true CN106778732A (en) 2017-05-31

Family

ID=58946908

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710027704.1A Pending CN106778732A (en) 2017-01-16 2017-01-16 Text information feature extraction and recognition method based on Gabor filter

Country Status (1)

Country Link
CN (1) CN106778732A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108805102A (en) * 2018-06-28 2018-11-13 中译语通科技股份有限公司 A kind of video caption detection and recognition methods and system based on deep learning
CN110766014A (en) * 2018-09-06 2020-02-07 邬国锐 Bill information positioning method, system and computer readable storage medium
CN110969154A (en) * 2019-11-29 2020-04-07 上海眼控科技股份有限公司 Text recognition method and device, computer equipment and storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102289668A (en) * 2011-09-07 2011-12-21 谭洪舟 Binaryzation processing method of self-adaption word image based on pixel neighborhood feature
CN102968637A (en) * 2012-12-20 2013-03-13 山东科技大学 Complicated background image and character division method
CN103605991A (en) * 2013-10-28 2014-02-26 复旦大学 Automatic video advertisement detection method
CN104361336A (en) * 2014-11-26 2015-02-18 河海大学 Character recognition method for underwater video images

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
刘明珠等: ""基于深度学习法的视频文本区域定位与识别"", 《哈尔滨理工大学学报》 *
刘晓敏: "《基于虹膜识别的商务会馆管理***的实现》", 31 August 2015 *

Similar Documents

Publication Publication Date Title
Yan et al. A fast uyghur text detector for complex background images
Sun et al. A robust approach for text detection from natural scene images
Dubey et al. Infected fruit part detection using K-means clustering segmentation technique
Lukic et al. Leaf recognition algorithm using support vector machine with Hu moments and local binary patterns
Zamberletti et al. Text localization based on fast feature pyramids and multi-resolution maximally stable extremal regions
CN104050471B (en) Natural scene character detection method and system
Wu et al. Improving leaf classification rate via background removal and ROI extraction
CN111598001B (en) Identification method for apple tree diseases and insect pests based on image processing
Mahdikhanlou et al. Plant leaf classification using centroid distance and axis of least inertia method
CN112464942B (en) Computer vision-based overlapped tobacco leaf intelligent grading method
Najjar et al. Flower image segmentation based on color analysis and a supervised evaluation
CN109034094A (en) A kind of articles seeking method and apparatus
CN106778732A (en) Text information feature extraction and recognition method based on Gabor filter
Afakh et al. Aksara jawa text detection in scene images using convolutional neural network
CN110516533A (en) A kind of pedestrian based on depth measure discrimination method again
CN107633264B (en) Linear consensus integrated fusion classification method based on space spectrum multi-feature extreme learning
CN106203448A (en) A kind of scene classification method based on Nonlinear Scale Space Theory
Chen et al. Offline handwritten digits recognition using machine learning
Xu et al. A system to localize and recognize texts in Oriented ID card images
Wu et al. A robust symmetry-based method for scene/video text detection through neural network
Zhang et al. A novel approach for binarization of overlay text
Kabir et al. Discriminant feature extraction using disease segmentation for automatic leaf disease diagnosis
Lamba et al. Handwriting Recognition System-A Review
Jameel et al. A REVIEW ON RECOGNITION OF HANDWRITTEN URDU CHARACTERS USING NEURAL NETWORKS.
CN112070009B (en) Convolutional neural network expression recognition method based on improved LBP operator

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20170531