CN105654129A - Optical character sequence recognition method - Google Patents

Optical character sequence recognition method

Info

Publication number
CN105654129A
CN105654129A (application CN201511020570.8A)
Authority
CN
China
Prior art keywords
neural network
recurrence
word
data
neural networks
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201511020570.8A
Other languages
Chinese (zh)
Inventor
刘世林
何宏靖
陈炳章
吴雨浓
姚佳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chengdu Business Big Data Technology Co Ltd
Original Assignee
Chengdu Business Big Data Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chengdu Business Big Data Technology Co Ltd filed Critical Chengdu Business Big Data Technology Co Ltd
Priority to CN201511020570.8A
Publication of CN105654129A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/21: Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214: Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/21: Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/211: Selection of the most significant subset of features
    • G06F18/2111: Selection of the most significant subset of features by using evolutionary computational techniques, e.g. genetic algorithms
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/08: Learning methods
    • G06N3/088: Non-supervised learning, e.g. competitive learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Physiology (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Molecular Biology (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Character Discrimination (AREA)

Abstract

The invention belongs to the field of image character recognition and relates to an optical character sequence recognition method. The method combines CNN (convolutional neural network) and RNN (recurrent neural network) techniques: a CNN extracts features from a whole picture containing multiple characters, and the same features are fed to an RNN and reused recursively at every time step, so that the characters can be predicted continuously. This removes the need to segment the picture before OCR (optical character recognition), simplifies the preprocessing stage of image character recognition, and significantly improves recognition efficiency. Because the RNN recursively consumes the output of the previous time step, model training also learns a language model of the dependencies between characters and words, so no separate language model has to be built for post-processing after individual characters are recognized. The method therefore improves the recognition accuracy of character and word sequences while further improving the processing efficiency of character recognition.

Description

An optical character sequence recognition method
Technical field
The present invention relates to the field of image character recognition, and in particular to an optical character sequence recognition method.
Background technology
With the development of society, a large demand has arisen for digitizing paper media such as ancient books, documents, bills, and business cards. Digitization here is not limited to "photographing" with a scanner or camera; more importantly, it means converting these paper documents into readable, editable electronic documents, which requires performing image character recognition on the scanned pictures. Traditional image character recognition is optical character recognition (OCR): characters are recognized from the electronic image produced by scanning the paper document to be recognized. Given the variability of scanning quality, of the paper document itself (printing quality, font clarity, font regularity, and so on), and of content and layout (the arrangement of the text, e.g. plain text versus table text and bills), the actual performance of OCR is not always satisfactory. Moreover, different paper documents impose different accuracy requirements: bill recognition, for example, demands very high accuracy, because a single misrecognized digit can have fatal consequences. Traditional OCR cannot meet such high-precision recognition requirements.
A conventional OCR pipeline includes picture segmentation, feature extraction, single-character recognition, and similar processing steps. Picture segmentation alone involves a large amount of image preprocessing, such as slant rectification, background denoising, and single-character extraction. These steps are not only tedious and time-consuming but may also discard much of the useful information in the picture. When the picture to be recognized contains a sequence of multiple characters, a traditional OCR method must cut the original sequence into small pictures each containing a single character and recognize them separately. This approach has two major problems. First, single-character segmentation is difficult, especially for Chinese characters composed of left and right radicals, for mixtures of letters, digits, and symbols, or in the presence of background noise, character distortion, or touching characters; once segmentation goes wrong, an accurate recognition result is hard to obtain. Second, recognizing a character sequence cut into single-character sub-pictures fails to exploit the dependencies between the characters and words of natural language. Although an extra language model can be used to refine the recognition result, the language model and the recognizer are built independently, so the improvement this kind of refinement brings is limited.
Facing this huge recognition demand, a fast and efficient image character recognition method is urgently needed.
Summary of the invention
The object of the present invention is to overcome the above deficiencies of the prior art and to provide an optical character sequence recognition method. The invention applies convolutional neural network (CNN) and recurrent neural network (RNN) techniques: the CNN extracts features from the whole picture containing multiple characters, and the same features are then fed to the RNN and reused recursively, so that multiple characters can be predicted continuously. The optical character sequence recognition realized by this method overcomes the drawback that a picture must be segmented before OCR, which greatly improves the efficiency of image character recognition. Furthermore, because the RNN recursively consumes the recognition result and output data of the previous round during both training and application, a language model of the dependencies between characters and words is learned at the same time, so the method improves the recognition accuracy of character and word sequences while further improving recognition efficiency.
To achieve the above object, the present invention provides the following technical scheme:
An optical character sequence recognition method, comprising the following steps:
(1) Build a convolutional neural network and a recurrent neural network model, where the input signal of the recurrent neural network at each time step comprises: the sample feature data extracted by the convolutional neural network, the output data of the recurrent neural network at the previous time step, and the vector data converted from the character or word recognized by the recurrent neural network at the previous time step;
(2) Train the convolutional neural network and the recurrent neural network model with a training sample set;
(3) Input the image character sequence to be recognized into the trained convolutional neural network and recurrent neural network; the convolutional neural network extracts the feature data of the picture to be recognized, which is input into the recurrent neural network; through the step-by-step iteration of the recurrent neural network, the complete recognition result of the image character sequence is output.
Specifically, the forward-pass formulas of the recurrent neural network used in this method are as follows:
$$a_h^t = \sum_{i=1}^{I} w_{ih} x_i^t + \sum_{l=1}^{V} w_{lh} v_l^{t-1} + \sum_{h'=1}^{H} w_{h'h} b_{h'}^{t-1}$$
$$b_h^t = \theta(a_h^t)$$
$$a_k^t = \sum_{h=1}^{H} w_{hk} b_h^t$$
$$y_k^t = \frac{\exp(a_k^t)}{\sum_{k'=1}^{K} \exp(a_{k'}^t)}$$
where $I$ is the dimensionality of the input vector, $V$ is the dimensionality of the vectorized character or word, $H$ is the number of hidden-layer neurons, and $K$ is the number of output-layer neurons; $x$ is the feature data extracted by the convolutional neural network, and $v$ is the vector data into which the character or word recognized by the RNN is converted through the dictionary mapping table. $a_h^t$ is the input of hidden-layer neuron $h$ of the recurrent neural network at the current time step, and $b_h^t$ is its output; $w_{ih}$, $w_{lh}$, $w_{h'h}$ are the weight parameters corresponding to $a_h^t$. $a_k^t$ is the input of output-layer neuron $k$ of the recurrent neural network at the current time step; $w_{hk}$ is the weight corresponding to each output-layer neuron; $y_k^t$ is the output of output-layer neuron $k$ at the current time step. $y_k^t$ is a probability value: the ratio of the output value of the corresponding neuron to the sum of the output values of all output-layer neurons at the current time step.
As the formulas show, the input data of a hidden-layer neuron in the recurrent neural network used by this method has three components: the sample features extracted by the CNN, the output data of the RNN hidden layer at the previous time step, and the vectorization (through the dictionary mapping table) of the RNN prediction result, i.e. the character or word recognized at the previous time step. Therefore, when predicting the character (or word) at the current time step, the recurrent neural network relies both on the features of the image and on the features of the previous output (the language model).
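As an illustration, the following minimal sketch implements one such time step with NumPy. The dimensions, the random initialization, and the choice of tanh for the activation $\theta(\cdot)$ are assumptions made for the example; the patent fixes none of them.

```python
import numpy as np

I, V, H, K = 256, 64, 128, 4000   # feature dim, word-vector dim, hidden, output
rng = np.random.default_rng(0)
W_ih = rng.normal(0.0, 0.01, (I, H))   # CNN feature -> hidden (w_ih)
W_lh = rng.normal(0.0, 0.01, (V, H))   # previous word vector -> hidden (w_lh)
W_hh = rng.normal(0.0, 0.01, (H, H))   # previous hidden -> hidden (w_h'h)
W_hk = rng.normal(0.0, 0.01, (H, K))   # hidden -> output (w_hk)

def rnn_step(x, v_prev, b_prev):
    """Compute a_h^t, b_h^t, a_k^t and y_k^t for one time step."""
    a_h = x @ W_ih + v_prev @ W_lh + b_prev @ W_hh
    b_h = np.tanh(a_h)                      # b_h^t = theta(a_h^t), tanh assumed
    a_k = b_h @ W_hk
    e = np.exp(a_k - a_k.max())             # numerically stable softmax
    return b_h, e / e.sum()                 # y_k^t sums to 1 over the output layer

x = rng.normal(size=I)     # whole-picture feature vector from the CNN
b, y = rnn_step(x, np.zeros(V), np.zeros(H))   # v^0 = 0, b^0 = 0 at t = 1
print(int(y.argmax()))     # class index with the largest probability
```

Note how the three input terms of $a_h^t$ correspond exactly to the three components listed above.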
Further, in this method the parameters $w_{ih}$, $w_{lh}$, $w_{h'h}$ used during forward signal propagation are all shared across time steps, which avoids a linear growth of model complexity and the overfitting it could cause.
Further, the present invention uses the above forward pass to propagate the data layer by layer through the convolutional and recurrent neural networks and obtains the recognition (prediction) data at the output layer; when the prediction deviates from the annotation of the training sample, every weight in the network is adjusted by the classical error backpropagation algorithm.
Further, during training, the training result of the neural network is checked against a development set so that the training direction can be adjusted in time and overfitting prevented; throughout training, only the model with the highest recognition accuracy on the development set is retained.
Further, the neural network training process of this optical character sequence recognition method comprises the following steps:
(2-1) Input the manually annotated training samples into the convolutional neural network;
(2-2) Perform feature extraction on the input training samples with the convolutional neural network;
(2-3) Input the feature data extracted by the convolutional neural network, as the first data, into the recurrent neural network at the first time step;
(2-4) Output the first prediction data from the computation of the recurrent neural network at the first time step; obtain from the first prediction data the character or word recognized by the recurrent neural network at this time step, defined as the first recognition result;
(2-5) Convert the first recognition result into its corresponding vector data;
(2-6) Use the first data, the first prediction data, and the vectorized first recognition result as the input of the recurrent neural network at the second time step; output the second prediction data from the computation of the recurrent neural network, and obtain the corresponding second recognition result from it;
(2-7) Convert the second recognition result into its corresponding vector data;
(2-8) Use the first data, the second prediction data, and the vectorized second recognition result as the input of the recurrent neural network at the third time step;
Recur step by step in this way until the preset number of recursions is reached or a null value is output, then end the recognition; recording the character (or word) predicted by the RNN at each time step in order yields the complete string content (see the sketch below).
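Continuing the previous sketch, the step-by-step recursion above can be written as a loop. The mapping table E, the value MAX_STEPS = 20, and the use of row 0 for the padding/null word <SP> are assumptions for the example.

```python
E = rng.normal(0.0, 0.01, (K, V))   # dictionary mapping table: one row per word
MAX_STEPS = 20                      # preset recursion count (max sentence length)
SP = 0                              # assumed index of the <SP>/null word

def recognize(x):
    """Decode one character/word sequence from a single CNN feature vector."""
    b_prev, v_prev, result = np.zeros(H), np.zeros(V), []
    for _ in range(MAX_STEPS):
        b_prev, y = rnn_step(x, v_prev, b_prev)
        k = int(y.argmax())         # this step's recognition result
        if k == SP:                 # a null output ends the recognition
            break
        result.append(k)
        v_prev = E[k]               # vectorize the result for the next step
    return result

print(recognize(x))                 # indices of the recognized sequence
```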
Specifically, in steps (2-5) and (2-7) the vectorization is performed through a dictionary mapping table. The dictionary mapping table is a two-dimensional matrix: its number of rows is the size of the dictionary, and its number of columns (the dimensionality of the row vectors) is set according to the size of the dictionary and the scale of the data. The purpose of the table is to featurize, i.e. vectorize, characters (or words). Put simply, the dictionary mapping table is just a two-dimensional matrix in which each row vector corresponds to one character or one word, and this correspondence between row vectors and characters/words is fixed when the table is built.
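A toy, self-contained sketch of such a table follows; the vocabulary, the row width of 8, and the random initialization are all illustrative assumptions.

```python
import numpy as np

vocab = ["<SP>", "0", "1", "2", "A", "B", "word"]   # assumed toy dictionary
row = {w: i for i, w in enumerate(vocab)}           # word -> row number
table = np.random.default_rng(0).normal(0.0, 0.01, (len(vocab), 8))

def vectorize(word):
    """Map a recognized character/word to its fixed row vector."""
    return table[row[word]]

print(vectorize("A"))   # the row vector corresponding to "A"
```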
Further, when building the dictionary mapping table, if the unit to be recognized is the word, the natural-language text can first be segmented into words, e.g. splitting the character sequence of a sentence such as "this thing is very good" into its constituent words.
Further, model training includes normalizing the training sample pictures and annotating them manually, where normalization includes setting the maximum possible number of characters (or words) in a picture sentence, e.g. setting the sentence length to 20.
Further, during normalization, to avoid distorting the data, resizing is done with equal proportions, and any region missing from the target size is padded with the background colour.
Further, the normalized pictures are annotated manually; if an annotated sentence has fewer than 20 characters, a special word <SP> is used to pad it to length 20. Then 75% of the data is chosen at random as the training set and 25% as the development set.
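A sketch of this sample preparation under the stated assumptions: pad each annotation to length 20 with <SP>, then split 75% / 25% into training and development sets. The file names, labels, and seed are made up for the example.

```python
import random

MAX_LEN = 20

def pad(tokens):
    """Pad an annotated sentence to MAX_LEN tokens with the special word <SP>."""
    return tokens + ["<SP>"] * (MAX_LEN - len(tokens))

samples = [("img_%04d.png" % i, pad(["x"] * (i % 15 + 1))) for i in range(1000)]
random.seed(0)
random.shuffle(samples)
cut = int(0.75 * len(samples))
train_set, dev_set = samples[:cut], samples[cut:]
print(len(train_set), len(dev_set))   # 750 250
```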
Compared with the prior art, the beneficial effects of the present invention are as follows. The invention provides an optical character sequence recognition method that uses a convolutional neural network to extract holistic features from the character-sequence picture to be recognized and inputs the extracted feature data, as the first data, into the recurrent neural network at every time step. The image character sequence recognition realized by this method extracts the overall features of the picture with the convolutional neural network and recognizes the whole character sequence without single-character segmentation or noise filtering. Compared with traditional OCR, the invention avoids the irreversible recognition errors that inaccurate character segmentation can cause, greatly simplifies the preprocessing stage of image character recognition, and significantly improves the efficiency of character recognition.
In addition, the method recognizes the characters of a sequence continuously through the recurrent neural network. When the recurrent neural network recognizes characters, its input signal at each time step also comprises the output data of the previous time step and the vector data converted from the character or word recognized at the previous time step. Thus, when recognizing the corresponding character, the recurrent neural network at each time step relies not only on the overall picture features extracted by the convolutional neural network but also on the output data and recognition result of the previous time step. In this way, while the individual characters are recognized, the language model of the dependencies between characters and words is learned along with them. Compared with OCR methods, there is no longer any need to build an extra language model to refine the single-character recognition results, which simplifies the post-processing of recognized text; recognition is more efficient, and the results are more accurate and reliable.
In short, the method simplifies the processing pipeline of image character sequence recognition and significantly improves recognition efficiency and accuracy, letting developers focus on model tuning and data preparation and thereby improving development efficiency. The method has very high practical value and broad application prospects in the field of image character recognition.
Brief description of the drawings:
Fig. 1 is a schematic diagram of the implementation process of the method.
Fig. 2 is a schematic diagram of the convolutional neural network structure.
Fig. 3 is a schematic diagram of the signal flow in the character sequence recognition process of the method.
Embodiment
The present invention is described in further detail below with reference to test examples and embodiments. This should not be interpreted as limiting the scope of the above subject matter of the invention to the following embodiments; all techniques realized based on the content of the present invention belong to the scope of the present invention.
The present invention provides an optical character sequence recognition method. The invention applies convolutional neural network (CNN) and recurrent neural network (RNN) techniques: the CNN extracts features from the whole picture containing multiple characters, and the same features are then fed to the RNN and reused recursively, so that multiple characters can be predicted continuously. The optical character sequence recognition realized by this method overcomes the drawback that a picture must be segmented before OCR, greatly improves the efficiency of image character recognition, and lets developers focus on model tuning and data preparation, improving development efficiency. Furthermore, because the RNN recursively consumes the recognition result and output data of the previous round during both training and application, the language model of the dependencies between characters and words is learned at the same time, so the method improves the recognition accuracy of character and word sequences while further improving recognition efficiency.
To achieve the above object, the present invention provides the following technical scheme: an optical character sequence recognition method, as shown in Fig. 1, comprising the following steps:
(1) Build a convolutional neural network and a recurrent neural network model, where the input signal of the recurrent neural network at each time step comprises: the sample feature data extracted by the convolutional neural network, the output data of the recurrent neural network at the previous time step, and the vector data converted from the character or word recognized by the recurrent neural network at the previous time step. As shown in Fig. 2, the convolutional neural network is mainly used for the automatic learning of picture features. Each feature map (the vertical rectangles in the figure) is generated by its own convolution kernel (the small rectangular box in Fig. 2, shared within its designated feature map) performing a preliminary feature extraction, and the subsampling layers sample the features extracted by the convolutional layers, mainly to remove the redundancy in those features. In brief, the convolutional neural network extracts the different features of the picture with its convolutional layers and samples them with its subsampling layers to remove redundant information (one convolutional neural network can contain multiple convolutional layers, subsampling layers, and fully connected layers); finally, a fully connected layer concatenates the different feature maps into the final feature of the full picture. The method uses one convolutional neural network to extract features from the whole picture in a single pass, completely avoiding the irreversible recognition errors that picture segmentation may cause.
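A minimal sketch of such a feature extractor, written with PyTorch (an assumed framework; the patent names none). The layer sizes and the 32x200 input are illustrative: convolution layers produce feature maps, pooling layers subsample them, and a fully connected layer concatenates the maps into one feature vector for the whole picture.

```python
import torch
import torch.nn as nn

class CNNFeatureExtractor(nn.Module):
    def __init__(self, feature_dim=256):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                     # subsampling layer
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                     # subsampling layer
        )
        self.fc = nn.Linear(64 * 8 * 50, feature_dim)  # full connection

    def forward(self, img):                      # img: (batch, 1, 32, 200)
        maps = self.features(img)                # stacked feature maps
        return self.fc(maps.flatten(1))          # one vector per whole picture

feats = CNNFeatureExtractor()(torch.randn(1, 1, 32, 200))
print(feats.shape)                               # torch.Size([1, 256])
```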
(2) Train the convolutional neural network and the recurrent neural network model with a training sample set.
(3) Input the image character sequence to be recognized into the trained convolutional neural network and recurrent neural network; the convolutional neural network extracts the feature data of the picture to be recognized, which is input into the recurrent neural network; through the step-by-step iteration of the recurrent neural network, the complete recognition result of the image character sequence is output.
Specifically, the forward-pass formulas of the recurrent neural network used in this method are as follows:
$$a_h^t = \sum_{i=1}^{I} w_{ih} x_i^t + \sum_{l=1}^{V} w_{lh} v_l^{t-1} + \sum_{h'=1}^{H} w_{h'h} b_{h'}^{t-1}$$
$$b_h^t = \theta(a_h^t)$$
$$a_k^t = \sum_{h=1}^{H} w_{hk} b_h^t$$
$$y_k^t = \frac{\exp(a_k^t)}{\sum_{k'=1}^{K} \exp(a_{k'}^t)}$$
where $I$ is the dimensionality of the input vector, $V$ is the dimensionality of the vectorized character or word, $H$ is the number of hidden-layer neurons, and $K$ is the number of output-layer neurons; $x$ is the feature data extracted by the convolutional neural network, and $v$ is the vector data into which the character or word recognized by the RNN is converted through the dictionary mapping table (in particular $v^0 = 0$). $a_h^t$ is the input of hidden-layer neuron $h$ of the recurrent neural network at the current time step, $b_h^t$ is its output ($b^0 = 0$), and $\theta(\cdot)$ is the activation function mapping $a_h^t$ to $b_h^t$. $w_{ih}$, $w_{lh}$, $w_{h'h}$ are the weight parameters corresponding to $a_h^t$; in the forward pass these parameters are all shared across time steps. Sharing across time steps means that during forward signal propagation the values of $w_{ih}$, $w_{lh}$, $w_{h'h}$ are the same at every time step (not that $w_{ih} = w_{lh} = w_{h'h}$); the RNN at different time steps uses identical $w_{ih}$, $w_{lh}$, $w_{h'h}$ values, which reduces the complexity of the model parameters and avoids the linear growth of model complexity that could cause overfitting. $a_k^t$ is the input of output-layer neuron $k$ of the recurrent neural network at the current time step; $w_{hk}$ is the weight corresponding to each output-layer neuron; $y_k^t$ is the output of output-layer neuron $k$ at the current time step. $y_k^t$ is a probability value: the ratio of the output value of the corresponding neuron to the sum of the output values of all output-layer neurons at the current time step. Generally, the class corresponding to the output neuron with the largest $y_k^t$ is directly selected as the recognition result of the recurrent neural network at this time step.
As the formulas show, the input data of a hidden-layer neuron in the recurrent neural network used by this method has three components: the sample features extracted by the CNN, the output data of the RNN hidden layer at the previous time step, and the vectorization (through the dictionary mapping table) of the RNN prediction result, i.e. the character or word recognized at the previous time step. Therefore, when predicting the character (or word) at the current time step, the recurrent neural network relies both on the features of the image and on the features of the previous output (the language model).
Further, the present invention uses the above forward pass to propagate the data layer by layer through the convolutional and recurrent neural networks and obtains the recognition (prediction) data at the output layer. When the prediction deviates from the annotation of the training sample, every weight in the network is adjusted by the classical error backpropagation algorithm: the error is propagated backwards layer by layer and apportioned to all the neurons of each layer, yielding the error signal of each neuron, which is then used to revise each neuron's weight. Propagating the data layer by layer with the forward pass and gradually revising the neuron weights with the backward pass is precisely the training process of the neural network; this process is repeated until the prediction accuracy reaches the set threshold, at which point training stops and the neural network model can be considered trained.
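A sketch of one such training step. PyTorch autograd (an assumed framework choice) stands in for the hand-derived backpropagation; cnn and rnn are assumed to be modules like the sketches above, with rnn returning per-time-step scores of shape (batch, MAX_LEN, K), and cross-entropy is an assumed choice for measuring the deviation from the annotation.

```python
import torch
import torch.nn as nn

def train_step(cnn, rnn, img, target, optimizer):
    """One forward pass plus one round of error backpropagation."""
    loss_fn = nn.CrossEntropyLoss()
    feats = cnn(img)                          # one feature vector per picture
    logits = rnn(feats)                       # scores for every time step
    loss = loss_fn(logits.flatten(0, 1),      # deviation from the annotation
                   target.flatten())
    optimizer.zero_grad()
    loss.backward()                           # propagate the error backwards
    optimizer.step()                          # revise every neuron's weights
    return float(loss)
```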
Further, during training, the training result of the neural network is checked against a development set so that the training direction can be adjusted in time and overfitting prevented; throughout training, only the model with the highest recognition accuracy on the development set is retained.
Further, as shown in Fig. 3, the neural network training process of this optical character sequence recognition method comprises the following steps:
(2-1) Input the manually annotated training samples into the convolutional neural network;
(2-2) Perform feature extraction on the input training samples with the convolutional neural network;
(2-3) Input the feature data extracted by the convolutional neural network, as the first data, into the recurrent neural network at the first time step;
(2-4) Output the first prediction data from the computation of the recurrent neural network at the first time step; obtain from the first prediction data the character or word recognized by the recurrent neural network at this time step, defined as the first recognition result;
(2-5) Convert the first recognition result into its corresponding vector data;
(2-6) Use the first data, the first prediction data, and the vectorized first recognition result as the input of the recurrent neural network at the second time step; output the second prediction data from the computation of the recurrent neural network, and obtain the corresponding second recognition result from it;
(2-7) Convert the second recognition result into its corresponding vector data;
(2-8) Use the first data, the second prediction data, and the vectorized second recognition result as the input of the recurrent neural network at the third time step;
Recur step by step in this way: the feature data extracted by the CNN (the first data), the output data (prediction data) of the RNN at the previous time step, and the vector corresponding to the character or word (recognition result) recognized by the RNN at the previous time step together form the input data of the RNN at the current time step, and the RNN's prediction outputs one character (or word). The recognition ends when the preset number of recursions is reached; recording the character (or word) predicted by the RNN at each time step in order yields the complete string content.
Specifically, in steps (2-5) and (2-7) the vectorization is performed through a dictionary mapping table. The dictionary mapping table is a two-dimensional matrix: its number of rows is the size of the dictionary, and its number of columns is set according to the size of the dictionary and the scale of the data. The purpose of the table is to featurize, i.e. vectorize, characters (or words). Put simply, the dictionary mapping table is just a two-dimensional matrix in which each row vector corresponds to one character or one word, and this correspondence between row vectors and characters/words is fixed when the table is built.
Further, when building the dictionary mapping table, the natural-language text can first be segmented into words, e.g. splitting the character sequence of a sentence such as "this thing is very good" into its constituent words.
Further, model training includes normalizing the training sample pictures and annotating them manually. Normalizing the samples makes their basic parameters uniform, reducing irrelevant data complexity during model training and helping to simplify the training process. Normalization includes setting the maximum possible number of characters (or words) in a picture sentence, e.g. setting the sentence length to 20. The length of the character sequence to be recognized corresponds to the maximum number of recursions of the recurrent neural network: the maximum character count set when preparing the training samples can correspond to the preset maximum recursion count of the recurrent neural network, which increases the stability and predictability of the model.
Further, during normalization, to avoid distorting the data, resizing is done with equal proportions, and any region missing from the target size is padded with the background colour.
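A sketch of this picture normalization under stated assumptions: Pillow as the imaging library, a 200x32 target size, and white (255) as the background colour are all illustrative choices.

```python
from PIL import Image

def normalize(img, target=(200, 32), bg=255):
    """Equal-proportion resize, then pad to the target size with background."""
    img = img.convert("L")                         # grayscale, assumed
    w, h = img.size
    scale = min(target[0] / w, target[1] / h)      # equal-proportion zoom
    img = img.resize((max(1, int(w * scale)), max(1, int(h * scale))))
    canvas = Image.new("L", target, bg)            # background-colour canvas
    canvas.paste(img, ((target[0] - img.size[0]) // 2,
                       (target[1] - img.size[1]) // 2))
    return canvas
```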
Further, the normalized pictures are annotated manually; if the number of characters (or words) in an annotated sentence is smaller than the set maximum (less than 20), a special word is used to pad it, e.g. padding sample pictures with fewer than 20 characters (or words) with "<SP>" up to a length of 20 characters (or words).
Further, after the above normalization and manual annotation, 75% of the data is chosen at random as the training sample set and 25% as the development sample set. During training, the neural network only retains the model with the highest recognition accuracy on the development set, and the uniform format of development and training samples helps improve the training efficiency of the neural network.

Claims (9)

1. An optical character sequence recognition method, characterized by comprising the following steps:
(1) building a convolutional neural network and a recurrent neural network model, where the input signal of the recurrent neural network at each time step comprises: the sample feature data extracted by the convolutional neural network, the output data of the recurrent neural network at the previous time step, and the vector data converted from the character or word recognized by the recurrent neural network at the previous time step;
(2) training the convolutional neural network and the recurrent neural network model with a training sample set;
(3) inputting the image character sequence to be recognized into the trained convolutional neural network and recurrent neural network, and outputting the complete recognition result of the image character sequence to be recognized.
2. the method for claim 1, it is characterised in that: the recurrent neural networks model used in present method adopts following forward algorithm formula:
$$a_h^t = \sum_{i=1}^{I} w_{ih} x_i^t + \sum_{l=1}^{V} w_{lh} v_l^{t-1} + \sum_{h'=1}^{H} w_{h'h} b_{h'}^{t-1}$$
$$b_h^t = \theta(a_h^t)$$
$$a_k^t = \sum_{h=1}^{H} w_{hk} b_h^t$$
$$y_k^t = \frac{\exp(a_k^t)}{\sum_{k'=1}^{K} \exp(a_{k'}^t)}$$
where $I$ is the dimensionality of the input vector, $V$ is the dimensionality of the vectorized character or word, $H$ is the number of hidden-layer neurons, and $K$ is the number of output-layer neurons; $x$ is the feature data extracted by the convolutional neural network, and $v$ is the vector data converted from the character or word recognized by the recurrent neural network; $a_h^t$ is the input of hidden-layer neuron $h$ of the recurrent neural network at the current time step, and $b_h^t$ is its output; $a_k^t$ is the input of output-layer neuron $k$ of the recurrent neural network at the current time step, and $y_k^t$ is its output; $y_k^t$ is a probability value, namely the ratio of the output value of the corresponding neuron to the sum of the output values of all output-layer neurons at the current time step.
3. The method of claim 2, characterized in that the values of $w_{ih}$, $w_{lh}$, $w_{h'h}$ used at each time step during forward signal propagation are identical.
4. The method of claim 3, characterized in that during training the training result of the neural network is checked against a development set, and only the convolutional neural network and recurrent neural network model with the highest recognition accuracy on the development set is retained.
5. The method of any one of claims 1 to 3, characterized by comprising the following steps:
(2-1) inputting the manually annotated training samples into the convolutional neural network;
(2-2) performing feature extraction on the input training samples with the convolutional neural network;
(2-3) inputting the feature data extracted by the convolutional neural network, as the first data, into the recurrent neural network at the first time step;
(2-4) outputting the first prediction data from the computation of the recurrent neural network at the first time step, and obtaining from the first prediction data the character or word recognized by the recurrent neural network at this time step, defined as the first recognition result;
(2-5) converting the first recognition result into its corresponding vector data;
(2-6) using the first data, the first prediction data, and the vectorized first recognition result as the input of the recurrent neural network at the second time step, outputting the second prediction data from the computation of the recurrent neural network, and obtaining the corresponding second recognition result;
(2-7) converting the second recognition result into its corresponding vector data;
(2-8) using the first data, the second prediction data, and the vectorized second recognition result as the input of the recurrent neural network at the third time step;
recurring step by step in this way until the preset number of recursions is reached or a null value is output, then ending the computation.
6. The method of claim 5, characterized in that in steps (2-5) and (2-7) the vectorization is performed through a dictionary mapping table, the dictionary mapping table being a two-dimensional matrix in which each row vector corresponds to one character or one word, the correspondence between row vectors and characters/words being fixed when the table is built.
7. The method of claim 6, characterized in that when building the dictionary mapping table, if the basic unit is the word, the natural-language text is segmented into words.
8. The method of claim 7, characterized in that when preparing the training and development samples, the sample pictures are normalized, the normalization comprising: setting the maximum number of characters or words allowed in a picture to be recognized.
9. The method of claim 8, characterized in that when manually annotating the normalized samples, if the number of characters in a sample picture is smaller than the set maximum, the characters in the sample picture are padded with the <SP> mark.
CN201511020570.8A 2015-12-30 2015-12-30 Optical character sequence recognition method Pending CN105654129A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201511020570.8A CN105654129A (en) 2015-12-30 2015-12-30 Optical character sequence recognition method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201511020570.8A CN105654129A (en) 2015-12-30 2015-12-30 Optical character sequence recognition method

Publications (1)

Publication Number Publication Date
CN105654129A true CN105654129A (en) 2016-06-08

Family

ID=56478266

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201511020570.8A Pending CN105654129A (en) 2015-12-30 2015-12-30 Optical character sequence recognition method

Country Status (1)

Country Link
CN (1) CN105654129A (en)


Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080152217A1 (en) * 2006-05-16 2008-06-26 Greer Douglas S System and method for modeling the neocortex and uses therefor
CN104794501A (en) * 2015-05-14 2015-07-22 清华大学 Mode identification method and device

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
丛爽: "前向递归神经网络" [Forward Recurrent Neural Networks], 《智能控制系统及其应用》 [Intelligent Control Systems and Their Applications] *
宣森炎等: "基于联合卷积和递归神经网络的交通标志识别" [Traffic Sign Recognition Based on Joint Convolutional and Recursive Neural Networks], 《传感器与微系统》 [Transducer and Microsystem Technologies] *

Cited By (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107516096A (en) * 2016-06-15 2017-12-26 阿里巴巴集团控股有限公司 A kind of character identifying method and device
CN107844794A (en) * 2016-09-21 2018-03-27 北京旷视科技有限公司 Image-recognizing method and device
CN106570521A (en) * 2016-10-24 2017-04-19 中国科学院自动化研究所 Multi-language scene character recognition method and recognition system
CN106570521B (en) * 2016-10-24 2020-04-28 中国科学院自动化研究所 Multilingual scene character recognition method and recognition system
CN107085730A (en) * 2017-03-24 2017-08-22 深圳爱拼信息科技有限公司 A kind of deep learning method and device of character identifying code identification
CN109409392A (en) * 2017-08-18 2019-03-01 广州极飞科技有限公司 The method and device of picture recognition
CN107992941A (en) * 2017-12-28 2018-05-04 武汉璞华大数据技术有限公司 A kind of contract terms sorting technique
CN108647310A (en) * 2018-05-09 2018-10-12 四川高原之宝牦牛网络技术有限公司 Identification model method for building up and device, character recognition method and device
CN109242796A (en) * 2018-09-05 2019-01-18 北京旷视科技有限公司 Character image processing method, device, electronic equipment and computer storage medium
CN109214386A (en) * 2018-09-14 2019-01-15 北京京东金融科技控股有限公司 Method and apparatus for generating image recognition model
CN111178495A (en) * 2018-11-10 2020-05-19 杭州凝眸智能科技有限公司 Lightweight convolutional neural network for detecting very small objects in images
CN109753966A (en) * 2018-12-16 2019-05-14 初速度(苏州)科技有限公司 A kind of Text region training system and method
CN109582972A (en) * 2018-12-27 2019-04-05 信雅达***工程股份有限公司 A kind of optical character identification error correction method based on natural language recognition
CN109582972B (en) * 2018-12-27 2023-05-16 信雅达科技股份有限公司 Optical character recognition error correction method based on natural language recognition
WO2020248471A1 (en) * 2019-06-14 2020-12-17 华南理工大学 Aggregation cross-entropy loss function-based sequence recognition method
CN110598703A (en) * 2019-09-24 2019-12-20 深圳大学 OCR (optical character recognition) method and device based on deep neural network
CN110598703B (en) * 2019-09-24 2022-12-20 深圳大学 OCR (optical character recognition) method and device based on deep neural network
CN112801085A (en) * 2021-02-09 2021-05-14 沈阳麟龙科技股份有限公司 Method, device, medium and electronic equipment for recognizing characters in image

Similar Documents

Publication Publication Date Title
CN105654129A (en) Optical character sequence recognition method
CN105654135A (en) Image character sequence recognition system based on recurrent neural network
CN105654130A (en) Recurrent neural network-based complex image character sequence recognition system
CN105678293A (en) Complex image and text sequence identification method based on CNN-RNN
Mathew et al. Docvqa: A dataset for vqa on document images
CN105654127A (en) End-to-end-based picture character sequence continuous recognition method
CN105678292A (en) Complex optical text sequence identification system based on convolution and recurrent neural network
CN110807328B (en) Named entity identification method and system for legal document multi-strategy fusion
CN105678300A (en) Complex image and text sequence identification method
Kafle et al. Answering questions about data visualizations using efficient bimodal fusion
CN104966097A (en) Complex character recognition method based on deep learning
CN111259897B (en) Knowledge-aware text recognition method and system
CN109492099A (en) It is a kind of based on field to the cross-domain texts sensibility classification method of anti-adaptive
CN107220506A (en) Breast cancer risk assessment analysis system based on deep convolutional neural network
CN106446954A (en) Character recognition method based on depth learning
CN110866388A (en) Publishing PDF layout analysis and identification method based on mixing of multiple neural networks
Calvo-Zaragoza et al. End-to-end optical music recognition using neural networks
CN110276069A (en) A kind of Chinese braille mistake automatic testing method, system and storage medium
CN114781392A (en) Text emotion analysis method based on BERT improved model
CN110674777A (en) Optical character recognition method in patent text scene
CN111966812A (en) Automatic question answering method based on dynamic word vector and storage medium
CN114579746A (en) Optimized high-precision text classification method and device
CN117011638A (en) End-to-end image mask pre-training method and device
Engin et al. Multimodal deep neural networks for banking document classification
CN112164040A (en) Steel surface defect identification method based on semi-supervised deep learning algorithm

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20160608

RJ01 Rejection of invention patent application after publication