CN107704859A - A kind of character recognition method based on deep learning training framework - Google Patents
- Publication number
- CN107704859A CN107704859A CN201711057406.3A CN201711057406A CN107704859A CN 107704859 A CN107704859 A CN 107704859A CN 201711057406 A CN201711057406 A CN 201711057406A CN 107704859 A CN107704859 A CN 107704859A
- Authority
- CN
- China
- Prior art keywords
- layer
- weights
- deep learning
- network
- feature
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/60—Type of objects
- G06V20/62—Text, e.g. of license plates, overlay texts or captions on TV images
- G06V20/63—Scene text, e.g. street names
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2413—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
- G06F18/24133—Distances to prototypes
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
Abstract
The invention provides a character recognition method based on a deep learning training framework, comprising the following steps: S1, capturing an input picture with a camera; S2, feeding the picture into a text recognition model built by deep learning to obtain the corresponding text content. The beneficial effects of the invention are: recognition accuracy is improved, recognition conditions are simplified, and recognition of Chinese characters is facilitated.
Description
Technical field
The present invention relates to character recognition methods, and more particularly to a character recognition method based on a deep learning training framework.
Background art
Character recognition is a key field in the application of pattern recognition. Research into methods for recognizing standard printed characters began in the 1950s, leading to the development of optical character readers. In the 1960s, practical machines using magnetic ink and stylized fonts appeared. In the late 1960s, multi-font and handwritten character recognizers emerged, but both recognition accuracy and machine performance were unsatisfactory. In the 1970s, work focused on the basic theory of character recognition and on developing high-performance recognition machines, with emphasis on Chinese character recognition. Character recognition technology has improved greatly since then. Nonetheless, the recognition rate for mixed text and graphics is still not very high, and garbled or wrongly recognized characters still occur frequently. This project therefore uses deep learning neural networks to achieve high-precision recognition of printed Chinese, English, and digits, as well as efficient recognition of handwritten English and digits.
Existing text recognition commonly uses the following algorithms:
Strokelets: A Learned Multi-scale Representation for Scene Text Recognition (CVPR 2014) learns mid-level stroke features by clustering image patches, then detects characters with a Hough voting algorithm. On the basis of the stroke features and HOG features, character classification is performed with a random forest classifier.
End-to-end scene text recognition (2011) borrows generic object detection methods from computer vision and proposes a new text recognition system. Using character confidence scores and the spatial constraints between characters, it outputs the most probable detection and recognition results. However, the algorithm is only applicable to the detection and recognition of horizontally arranged text.
End-to-End Text Recognition with Hybrid HMM Maxout Models (2013) and PhotoOCR: Reading Text in Uncontrolled Conditions (2013) segment word images into candidate character regions using either binarization techniques or a supervised classifier.
End-to-End Text Recognition with Hybrid HMM Maxout Models (2013) uses a complex CNN that integrates segmentation, correction, and character recognition, combined with a fixed-lexicon hidden Markov model (HMM), to generate the final recognition result.
The PhotoOCR system uses a neural network classifier based on HOG features to score the candidate results produced by segmentation, and a beam search combined with an N-gram language model to obtain a set of candidate characters. Finally, the candidate character combinations are re-ranked using a language model and shape information.
Deep Features for Text Spotting (2014) combines a text/non-text classifier, a character classifier, and a bigram classifier, and performs a dense sliding-window scan over the whole image. Finally, a fixed lexicon is used to analyze the text in the picture.
Existing character recognition technology has the following shortcomings:
(1) The recognition algorithms are not very intelligent: recognition efficiency for printed fonts is low, and handwritten characters cannot be recognized effectively at all. The reason is that these algorithms are based on manually extracted features and are therefore limited by the accuracy of human feature engineering, which makes a fundamental breakthrough difficult.
(2) The recognition process is cumbersome and completely disconnected from real life. For example, the text recognition technology of the Hanwang company requires specific scanning equipment: text is scanned page by page, entered into dedicated computer software, recognized by machine, and then laboriously proofread by hand. Such a workflow is entirely at odds with the interactive applications of today's mobile devices.
(3) Recognition of Chinese characters is the difficult point of the whole industry. American universities and high-tech enterprises such as Microsoft lead the field technically, with extensive research on printed and handwritten English; domestically, however, research on printed and handwritten Chinese characters has not yet achieved a major breakthrough, and there are significant gaps compared with American and European researchers in technical schemes, database scale, and application products.
Summary of the invention
In order to solve the problems in the prior art, the invention provides a character recognition method based on a deep learning training framework.
The invention provides a character recognition method based on a deep learning training framework, comprising the following steps:
S1, capturing an input picture with a camera;
S2, feeding the picture into a text recognition model built by deep learning to obtain the corresponding text content.
As a further improvement of the present invention, the deep learning process of the text recognition model includes constructing a convolutional neural network and solving it. Solving the convolutional neural network includes the following procedure:
(1) Select the training group: randomly draw N samples from the sample set as the training group.
(2) Set each weight and threshold to a small random value close to 0, and initialize the accuracy control parameter and the learning rate.
(3) Take an input pattern from the training group, feed it to the convolutional neural network, and provide its target output vector.
(4) Compute the output vector of the intermediate layer, then compute the actual output vector of the network.
(5) Compare the elements of the output vector with the elements of the target vector to compute the output error; errors are also computed for the hidden units of the intermediate layer.
(6) Compute, in turn, the adjustment of each weight and the adjustment of each threshold.
(7) Adjust the weights and thresholds.
(8) After M iterations, judge whether the index meets the accuracy requirement; if not, return to (3) and continue iterating; if so, go to the next step.
(9) Training ends, and the weights and thresholds are saved in a file. At this point the weights have become stable and the classifier has been formed. For further training, the weights and thresholds are loaded directly from the file, and no initialization is required.
The beneficial effects of the invention are: with the above scheme, recognition accuracy is improved, recognition conditions are simplified, and recognition of Chinese characters is facilitated.
Brief description of the drawings
Fig. 1 shows the basic flow of the convolutional neural network.
Embodiments
The invention is further described below with reference to the accompanying drawings and embodiments.
Text recognition can follow either the classical pattern or the deep learning pattern. In classical pattern recognition, features are usually extracted in advance: after extracting many features, correlation analysis is performed on them to find the features that best represent the characters, removing those that are irrelevant to classification or autocorrelated. However, this feature extraction depends too heavily on human experience and subjective judgment; the choice of extracted features has a large influence on classification performance, and even the order of the extracted features can affect the final result. Meanwhile, the quality of image preprocessing also affects the extracted features. The question, then, is how to turn feature extraction into an adaptive, self-learning process in which the features with the best classification performance are found by machine learning.
Each hidden-layer unit of the convolutional neural network extracts local features of the image and maps them onto a plane. The feature mapping uses the sigmoid function as the activation function of the convolutional network, so that the feature maps are shift-invariant. Each neuron is connected to a local receptive field of the preceding layer. Note that, as stated above, it is not the locally connected neurons that share identical weights, but the neurons within the same planar layer; this yields a corresponding degree of shift and rotation invariance. Each feature extraction layer is followed by a subsampling layer used for local averaging and secondary extraction. This distinctive two-stage feature extraction structure gives the network a high tolerance to distortion in the input samples. In other words, convolutional neural networks use local receptive fields, shared weights, and subsampling to ensure robustness to image shift, scaling, and distortion. Fig. 1 shows the basic flow of the convolutional neural network.
The invention provides a character recognition method based on a deep learning training framework, including:
(1) Deep learning: convolutional neural networks
The CNN was proposed by Yann LeCun of New York University in 1998. A CNN is essentially a multi-layer perceptron; the key to its success is its use of local connections and shared weights. On the one hand, the reduced number of weights makes the network easy to optimize; on the other hand, it reduces the risk of overfitting. The CNN is a type of neural network whose weight-sharing structure makes it more similar to a biological neural network, reducing the complexity of the network model and the number of weights. This advantage is most apparent when the input of the network is a multi-dimensional image: the image can be fed directly into the network, avoiding the complex feature extraction and data reconstruction of traditional recognition algorithms. It has numerous advantages in two-dimensional image processing; for example, the network can extract image features by itself, including color, texture, shape, and the topological structure of the image.
Overall architecture of the convolutional neural network: the convolutional neural network is a multi-layer supervised learning neural network, in which the convolutional layers and pooling (subsampling) layers of the hidden part are the core modules that implement its feature extraction function. The network model minimizes the loss function by gradient descent, adjusting the weight parameters layer by layer in the backward direction, and improves its precision through frequent iterative training. The lower hidden layers of the convolutional neural network are composed of alternating convolutional layers and max-pooling layers; the higher layers correspond to the fully connected hidden layers and logistic regression classifier of a conventional multi-layer perceptron. The input of the first fully connected layer is the feature images obtained by feature extraction in the convolutional and subsampling layers. The final output layer is a classifier, which may use logistic regression, softmax regression, or even a support vector machine to classify the input image.
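As a minimal sketch of the classifier stage just described, the following NumPy snippet scores a feature vector (as would be produced by the convolutional and subsampling layers) with a fully connected layer followed by softmax regression. The feature dimension and the number of character classes are illustrative assumptions, not values from the patent.

```python
import numpy as np

def softmax(z):
    # Subtract the max before exponentiating for numerical stability.
    e = np.exp(z - np.max(z))
    return e / e.sum()

rng = np.random.default_rng(0)
features = rng.normal(size=4)      # hypothetical output of the conv/pool stack
W = rng.normal(size=(10, 4))       # 10 illustrative character classes
b = np.zeros(10)

probs = softmax(W @ features + b)  # class probabilities, summing to 1
pred = int(np.argmax(probs))       # index of the most probable class
```

Replacing the softmax with a hinge loss over the same scores would give the support-vector-machine variant the text mentions.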
The structure of a convolutional neural network includes convolutional layers, downsampling layers, and fully connected layers. Each layer has multiple feature maps; each feature map extracts one kind of feature from the input through one convolution filter, and each feature map contains multiple neurons.
Convolutional layer: convolutional layers are used because of an important property of the convolution operation: through convolution, the features of the original signal are enhanced and noise is reduced.
Downsampling layer: downsampling is used because, according to the principle of local image correlation, subsampling an image reduces the amount of computation while maintaining a degree of rotation invariance.
The main purpose of subsampling is to blur the exact location of a feature: once a feature has been found, its exact position is unimportant; only the feature itself and its position relative to other features are needed. For example, for the character "8", once an "o" has been found in the upper part, we do not need to know its exact position in the image; we only need to know that there is another "o" below it to know the character is an "8", because whether the "8" sits slightly to the left or to the right in the picture does not affect recognizing it. This strategy of blurring exact locations allows deformed and distorted pictures to be recognized.
Fully connected layer: the full connection uses softmax, and the resulting activation values are the picture features extracted by the convolutional neural network.
The number of feature maps of a convolutional layer is specified at network initialization, while the size of each map is determined by the convolution kernel and the size of the input maps of the previous layer: assuming the maps of the previous layer are of size n*n and the convolution kernel is of size k*k, the maps of this layer are of size (n-k+1)*(n-k+1).
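The (n-k+1) map-size rule can be checked with a one-line helper; the sample sizes below are illustrative.

```python
def conv_map_size(n, k):
    """Size of a feature map after a 'valid' convolution:
    an n*n input map with a k*k kernel gives an (n-k+1)*(n-k+1) map."""
    return n - k + 1

# e.g. a 28*28 map convolved with a 5*5 kernel yields a 24*24 map
size = conv_map_size(28, 5)
```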
The subsampling layer samples the maps of the previous layer by computing aggregate statistics over adjacent small regions of size scale*scale. Some implementations take the maximum of each small region; the implementation in ToolBox takes the average over 2*2 regions. Note that convolution windows overlap, whereas sampling windows do not; inside ToolBox, sampling is in fact computed with a convolution (conv2(A, K, 'valid')) whose kernel is 2*2 with every element equal to 1/4, after which the overlapping parts of the convolution result are discarded.
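The convolution-based averaging trick described above (a 'valid' convolution with a 2*2 kernel of 1/4, then discarding the overlapping windows) can be sketched in NumPy; the input array is illustrative.

```python
import numpy as np

def avg_pool_2x2(a):
    """2*2 average pooling implemented as in the text: a 'valid'
    convolution with a 2*2 kernel whose elements are all 1/4,
    followed by discarding the overlapping rows/columns (stride 2)."""
    h, w = a.shape
    # 'valid' convolution with the all-1/4 kernel, written as a shifted sum
    conv = (a[:h-1, :w-1] + a[:h-1, 1:] + a[1:, :w-1] + a[1:, 1:]) / 4.0
    # keep every other window so the pooling regions do not overlap
    return conv[::2, ::2]

a = np.arange(16, dtype=float).reshape(4, 4)
pooled = avg_pool_2x2(a)   # each entry is the mean of one 2*2 block
```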
The basic structure of a CNN includes two special kinds of neuron layers. The first is the convolutional layer, in which the input of each neuron is locally connected to the previous layer and extracts the corresponding local feature. The second is the pooling layer, a computation layer for local sensitivity and secondary feature extraction. This two-stage feature extraction structure reduces the feature resolution and the number of parameters to be optimized.
A CNN is a partially connected network: its bottom layers are feature extraction (convolutional) layers, followed by pooling layers, after which further convolutional, pooling, or fully connected layers may be added. For pattern classification, a CNN generally uses softmax in the final layer.
In general, the structure of a CNN is: input layer --> Conv layer --> Pooling layer --> (repeated Conv, Pooling layers) ... --> FC (fully connected) layer --> output result. The input layer size is usually a multiple of 2, such as 32, 64, 96, 224, or 384. Convolutional layers usually use small filters, such as 3*3 and at most 5*5. Pooling layers reduce the dimensionality of the convolution results; for example, selecting 2*2 regions of the convolutional layer and taking the maximum of each region as output halves each spatial dimension of the convolutional layer.
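The Input --> Conv --> Pooling --> ... pipeline above can be traced with a small helper that applies the (n-k+1) rule for convolutions and integer division for pooling; the 32-pixel input and 5*5 filters below are illustrative, not from the patent.

```python
def trace_spatial_sizes(n, layers):
    """Follow the spatial size of the maps through a conv/pool stack.
    `layers` is a list of ('conv', k) or ('pool', p) entries."""
    sizes = [n]
    for kind, param in layers:
        if kind == 'conv':
            n = n - param + 1   # valid convolution with a param*param filter
        else:                   # 'pool'
            n = n // param      # param*param pooling; 2*2 halves each side
        sizes.append(n)
    return sizes

# 32 -> 28 -> 14 -> 10 -> 5, after which an FC layer takes over
sizes = trace_spatial_sizes(32, [('conv', 5), ('pool', 2),
                                 ('conv', 5), ('pool', 2)])
```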
In general, the basic structure of a CNN includes two parts. The first is the feature extraction layer: the input of each neuron is connected to a local receptive field of the previous layer, and the corresponding local feature is extracted; once that local feature is extracted, its positional relationship with the other features is also determined. The second is the feature mapping layer: each computation layer of the network consists of multiple feature maps, each feature map is a plane, and all neurons within a plane share equal weights. The feature mapping structure uses the sigmoid function, whose influence-function kernel is small, as the activation function of the convolutional network, so that the feature maps are shift-invariant. Furthermore, since the neurons on one mapping plane share weights, the number of free parameters of the network is reduced. In a convolutional neural network, each convolutional layer is followed by a computation layer for local averaging and secondary extraction; this distinctive two-stage feature extraction structure reduces the feature resolution.
The input layer reads in an image that has been simply regularized (unified in size). The units in each layer take a small group of local neighboring units of the previous layer as input. This locally connected viewpoint derives from the early perceptron and is consistent with the locally sensitive, orientation-selective neurons that Hubel and Wiesel found in the visual system of cats. Through local receptive fields, neurons can extract basic visual features such as oriented edges, end points, and corners; these features are then used by higher-level neurons. Moreover, a feature extractor suitable for one local region also tends to be suitable for the whole image. Exploiting this, the convolutional neural network uses a set of units distributed at different positions of the image but sharing the same weight vector to obtain features of the image and form a feature map. At each position, units from different feature maps obtain different types of features. The different units within one feature map are constrained to perform the same operation on the local data at each position of the input image; this operation is equivalent to convolving the input image with a small kernel. A convolutional layer usually contains multiple feature maps with different weight vectors, so that multiple different features can be obtained at the same position. For example, the first hidden layer may contain 4 feature maps, each formed from 5*5 local receptive fields. Once a feature is detected, as long as its position relative to the other features does not change, its absolute position in the image is no longer particularly important. Therefore, each convolutional layer is followed by a downsampling layer. The downsampling layer performs local averaging and downsampling, reducing the resolution of the feature maps and reducing the sensitivity of the network output to shift and deformation. The second hidden layer performs 2*2 averaging and downsampling. Subsequent convolutional and downsampling layers alternate, forming a "double pyramid" structure: the number of feature maps increases while the resolution of the feature maps gradually decreases.
Generally, in a convolutional neural network, convolutional layers and downsampling layers are linked together alternately to reduce computation time and to progressively build up invariance to spatial and structural variation; the small downsampling coefficients preserve these properties.
The difference between the CNN classification model and conventional models is that a two-dimensional image can be fed directly into the model, which then produces the classification result at the output. Its advantage is that no complex preprocessing is required: feature extraction and pattern classification are placed entirely inside a black box, the parameters required by the network are obtained through continuous optimization, and the required classification is produced at the output layer. The core of the network is its structural design and its solution; with this structure, it outperforms many previous algorithms.
A CNN is a multi-layer neural network in which every layer is composed of multiple two-dimensional planes, and each plane is composed of multiple independent neurons. The network contains simple units (S-units) and complex units (C-units): S-units gather to form S-planes, and S-planes gather to form S-layers; the same relationships hold between C-units, C-planes, and C-layers. The middle part of the network is formed by S-layers and C-layers connected in series, while the input stage contains only one layer, which directly receives the two-dimensional visual pattern. The sample feature extraction step is thus built into the interconnection structure of the convolutional neural network model.
Typically, S denotes a feature extraction layer: the input of each neuron is connected to a local receptive field of the previous layer, from which the local feature is extracted; once the local feature is extracted, its positional relationship to the other features is determined. C denotes a feature mapping layer: each computation layer of the network consists of multiple feature maps, each feature map is a plane, and all neurons within a plane share identical weights. The feature mapping structure uses the sigmoid function, whose influence-function kernel is small, as the activation function of the convolutional network, so that the feature maps are shift-invariant. Because the neuron weights on each mapping plane are shared, the number of free parameters of the network is reduced, which reduces the complexity of network parameter selection. Each feature extraction layer (S-layer) in the CNN is followed by a computation layer for local averaging and secondary extraction (C-layer); this distinctive two-stage feature extraction structure gives the recognizing network a high tolerance to distortion in the input samples.
Besides the input and output layers, a CNN network also has intermediate convolutional layers, sampling layers, and fully connected layers. The original image is fed directly to the input layer, and its size determines the size of the input vector. The neurons extract local features of the image, and each neuron is connected to a local receptive field of the previous layer. Passing through the alternating sampling layers (S) and convolutional layers (C) and the final fully connected layer, the network output is produced at the output layer. The convolutional and sampling layers contain several feature maps; each layer has multiple planes, and in each layer the neurons of each plane extract the local features of a specific region of the image, such as edge features and orientation features. The weights of the S-layer neurons are continually corrected during training. Neurons within the same plane share identical weights, giving a corresponding degree of shift and rotation invariance. Because the weights are shared, the mapping from one plane to the next can be regarded as a convolution operation, and the S-layers can be regarded as blur filters that perform secondary feature extraction. From hidden layer to hidden layer, the spatial resolution decreases while the number of planes per layer increases, which allows more feature information to be detected.
In a convolutional layer, the feature maps of the previous layer are convolved with learnable kernels, and the result of the convolution, after passing through the activation function, forms the neurons of this layer and thus this layer's feature maps. Convolutional layers alternate with sampling layers, and each output feature map of a convolutional layer may be related, via convolution, to several feature maps of the previous layer. Each feature map can have a different convolution kernel. The main task of the convolutional layer is to extract the features of the previous layer's feature maps from different angles and make them shift-invariant; the essence of convolution is to process the feature maps of the previous layer to obtain the feature maps of this layer. The main function of the sampling layer is to reduce the spatial resolution of the network, thereby eliminating offsets and image distortion.
The number of parameters of a hidden layer is unrelated to the number of its neurons; it depends only on the size of the filters and the number of filter types. The number of neurons of a hidden layer depends on the size of the original (input) image, the size of the filters, and the sliding stride of the filters over the image.
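The two counts just described can be made concrete with a short sketch; the 6-filter, 5*5, 32-pixel example is illustrative and matches no particular network in the patent.

```python
def conv_layer_weights(k, maps_in, maps_out, bias=True):
    """Parameter count of a convolutional layer: it depends only on the
    filter size k and the number of filter types (maps_in input maps,
    maps_out output maps), never on how many neurons the layer has."""
    per_filter = maps_in * k * k + (1 if bias else 0)
    return maps_out * per_filter

def conv_layer_neurons(n, k, stride=1):
    """Neuron count per output map: this one does depend on the input
    image size n, the filter size k, and the sliding stride."""
    m = (n - k) // stride + 1
    return m * m

w = conv_layer_weights(5, 1, 6)    # 6 filters of 5*5 on one input map: 156
m = conv_layer_neurons(32, 5)      # a 28*28 map, i.e. 784 neurons per map
```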
(2) Solving the convolutional neural network
The CNN achieves invariance of the recognized image to shift, scaling, and distortion through three mechanisms: local receptive fields, weight sharing, and subsampling. Local receptive fields mean that the neurons of each layer are connected only to neurons in a small neighborhood of the previous layer; through local receptive fields, each neuron can extract elementary visual features such as oriented line segments, end points, and corners. Weight sharing gives the CNN fewer parameters, so that relatively little training data is needed. Subsampling reduces the feature resolution and realizes invariance to shift, scaling, and other forms of distortion. A convolutional layer is usually followed by a subsampling layer, which reduces computation time and builds up spatial and structural invariance.
After the network has been constructed, it must be solved. If parameters were allocated as in a traditional neural network, every connection would have its own unknown parameter. The CNN instead uses weight sharing: because the neurons within one feature map share the same weights, the number of free parameters is greatly reduced, and the shared weights can be used to detect the same feature at different positions. In network design, sampling layers generally alternate with convolutional layers, and the layer preceding the fully connected layers is usually a convolutional layer. In the CNN, the weight update is based on the back-propagation algorithm.
A CNN is essentially a mapping from input to output. It can learn a large number of mapping relationships between inputs and outputs without any precise mathematical expression relating them; as long as the network is trained with known patterns, it acquires the mapping ability between input-output pairs. A convolutional network performs supervised training, so its sample set consists of pairs of the form (input vector, ideal output vector). All these vector pairs should come from the actual running results of the system the network is to simulate; they can be collected from the system in actual operation. Before training starts, all weights should be initialized with different small random numbers. "Small random numbers" ensure that the network does not enter a saturated state because of excessively large weights, which would cause training to fail; "different" ensures that the network can learn normally. In fact, if the weight matrix is initialized with identical numbers, the network is unable to learn.
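The effect of identical versus small random initialization can be demonstrated directly; the layer sizes below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(42)
x = rng.normal(size=4)                      # one input pattern

# Identical initialization: every hidden unit computes the same value
# and would receive the same gradient, so the units never differentiate.
W_same = np.full((6, 4), 0.05)
h_same = W_same @ x

# Small random initialization: units start distinct, and the small
# magnitude keeps sigmoid-style activations away from saturation.
W_rand = rng.uniform(-0.05, 0.05, size=(6, 4))
h_rand = W_rand @ x

distinct_same = len(np.unique(np.round(h_same, 9)))   # collapses to 1 value
distinct_rand = len(np.unique(np.round(h_rand, 9)))   # units differ
```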
The training algorithm mainly comprises four steps, divided into two phases.
First phase, the forward propagation phase:
(1) take a sample from the sample set and input it into the network;
(2) compute the corresponding actual output. In this phase, information is transformed stage by stage and propagated from the input layer to the output layer. This is also the process the network carries out in normal operation after training is complete.
Second phase, the back-propagation phase:
(1) compute the difference between the actual output and the corresponding ideal output;
(2) adjust the weight matrix by the method of minimizing the error.
The work of these two phases should generally be governed by the required accuracy.
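Purely as an illustration of the two phases (not the patent's own code), one forward/back-propagation cycle on a single-layer sigmoid network could look like the following; the dimensions, learning rate, and squared-error measure are assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
W = rng.uniform(-0.05, 0.05, size=(4, 2))   # small random initial weights
x = np.array([0.2, 0.4, 0.6, 0.8])          # input vector
t = np.array([0.0, 1.0])                    # ideal (target) output vector
lr = 0.5                                    # learning rate

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Phase 1: forward propagation -- information flows input -> output.
y = sigmoid(x @ W)

# Phase 2: back-propagation -- compare with the ideal output and
# adjust the weight matrix so as to minimize the error.
err_before = 0.5 * np.sum((t - y) ** 2)
delta = (y - t) * y * (1.0 - y)      # error gradient at the output
W -= lr * np.outer(x, delta)         # weight-matrix adjustment

err_after = 0.5 * np.sum((t - sigmoid(x @ W)) ** 2)
assert err_after < err_before        # one cycle reduced the error
```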
The training process of the network is as follows:
(1) select a training group: randomly draw N samples from the sample set as the training group;
(2) set each weight and threshold to a small random value close to 0, and initialize the accuracy-control parameter and the learning rate;
(3) take an input pattern from the training group, feed it to the network, and provide its target output vector;
(4) compute the output vector of the intermediate layer, then compute the actual output vector of the network;
(5) compare the elements of the output vector with the elements of the target vector and compute the output error; errors also need to be computed for the hidden units of the intermediate layer;
(6) compute, in turn, the adjustment of each weight and the adjustment of each threshold;
(7) adjust the weights and the thresholds;
(8) after every M iterations, judge whether the index meets the accuracy requirement; if not, return to (3) and continue iterating; if it does, proceed to the next step;
(9) training ends, and the weights and thresholds are saved in a file. At this point the weights are considered to have stabilized and the classifier has been formed. For further training, the weights and thresholds are loaded directly from the file; no initialization is needed.
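Steps (1)–(9) above can be sketched as a minimal runnable loop, assuming a one-layer sigmoid network, a squared-error accuracy index, and NumPy's `.npz` format for the saved weight/threshold file; N, M, the error target, the learning rate, and the toy data are all illustrative.

```python
import os
import tempfile
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(42)

# (1) Training group: N samples drawn for a toy, realizable task
#     (targets produced by a known reference network).
N = 8
X = rng.uniform(0.0, 1.0, size=(N, 4))
W_true = np.array([[1.0, -1.0], [0.5, 0.5], [-1.0, 1.0], [0.2, -0.2]])
b_true = np.array([0.1, -0.1])
T = sigmoid(X @ W_true + b_true)

# (2) Weights and thresholds (biases) start as small random values
#     near 0; accuracy-control parameter and learning rate follow.
W = rng.uniform(-0.05, 0.05, size=(4, 2))
b = np.zeros(2)
target_error, lr, M = 0.01, 0.5, 10

def total_error(W, b):
    # Squared-error "index" checked against the accuracy requirement.
    return 0.5 * np.mean(np.sum((T - sigmoid(X @ W + b)) ** 2, axis=1))

# (3)-(8): present each pattern, propagate forward, back-propagate
# the error, and adjust the weights and thresholds; every M passes
# the index is tested, and iteration stops once the requirement is met.
for epoch in range(10_000):
    for x, t in zip(X, T):
        y = sigmoid(x @ W + b)            # (4) actual output vector
        delta = (y - t) * y * (1.0 - y)   # (5)-(6) output error term
        W -= lr * np.outer(x, delta)      # (7) adjust the weights
        b -= lr * delta                   #     and the thresholds
    if (epoch + 1) % M == 0 and total_error(W, b) < target_error:
        break                             # (8) requirement satisfied

# (9) Save the stabilized weights and thresholds to a file; a later
# session can load them directly instead of re-initializing.
path = os.path.join(tempfile.mkdtemp(), "classifier.npz")
np.savez(path, W=W, b=b)
saved = np.load(path)
assert np.allclose(saved["W"], W) and np.allclose(saved["b"], b)
assert total_error(W, b) < target_error
```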
(3): The application realized by the recognition system and its principle
On the basis of the above theory, applications that are rich in content and diverse in form can be further realized.
1. Development of a mobile-terminal APP based on text recognition
Putting the technology described above into practice requires carrying it onto a suitable platform. Given the current market situation, developing a mobile-terminal APP for text recognition is undoubtedly the best choice.
Through the mobile-terminal application, users can input pictures quickly and conveniently with the phone camera, dispensing with cumbersome scanner input. After a picture is taken, it is imported into the corresponding APP software, and the text recognition model formed by deep learning yields the corresponding text content. This is the most efficient practical application scheme of the present invention.
The character recognition method based on a deep learning training framework provided by the present invention has the following features:
(1) the deep learning algorithm is combined with the idea of text recognition, covering the model calculation method, the parameters, the generated model, and the application interface;
(2) a comprehensive set of recognition techniques is provided for printed English, handwritten English, printed digits, handwritten digits, printed Chinese, and handwritten Chinese, aiming to achieve breakthroughs in both the text recognition algorithm and its application on mobile terminals.
The invention provides a character recognition method based on a deep learning training framework. Technically, it performs text recognition by means of deep learning, a front-line and advanced method: the manual feature extraction of conventional methods is replaced by features extracted autonomously by a model trained with deep learning, which is an adaptive, self-learning process.
The invention provides a character recognition method based on a deep learning training framework: on the basis of character recognition technology and the neural network techniques of deep learning, the text in an image is recognized and converted. Through optical input means such as photographing, the text of various newspapers, periodicals, books, manuscripts, and other printed matter is converted into image information, and character recognition technology then converts the image information into usable computer input. It can be widely applied in the future to the entry and processing of large volumes of written historical materials, archives, and official correspondence.
The above content is a further detailed description of the present invention in combination with specific preferred embodiments, and it cannot be concluded that the specific implementation of the present invention is confined to these descriptions. For persons of ordinary skill in the technical field of the invention, a number of simple deductions or substitutions may also be made without departing from the inventive concept, and all of these should be considered as falling within the protection scope of the present invention.
Claims (2)
1. A character recognition method based on a deep learning training framework, characterized by comprising the following steps:
S1, capturing an input picture with a camera;
S2, inputting the picture into a text recognition model formed by deep learning, and obtaining the corresponding text content.
2. The character recognition method based on a deep learning training framework according to claim 1, characterized in that the deep learning process of the text recognition model comprises: constructing a convolutional neural network and solving it, the solving of the convolutional neural network comprising the following procedure:
(1) select a training group: randomly draw N samples from the sample set as the training group;
(2) set each weight and threshold to a small random value close to 0, and initialize the accuracy-control parameter and the learning rate;
(3) take an input pattern from the training group, feed it to the convolutional neural network, and provide its target output vector;
(4) compute the output vector of the intermediate layer, then compute the actual output vector of the network;
(5) compare the elements of the output vector with the elements of the target vector and compute the output error; errors are also computed for the hidden units of the intermediate layer;
(6) compute, in turn, the adjustment of each weight and the adjustment of each threshold;
(7) adjust the weights and the thresholds;
(8) after every M iterations, judge whether the index meets the accuracy requirement; if not, return to (3) and continue iterating; if it does, proceed to the next step;
(9) training ends, and the weights and thresholds are saved in a file; at this point the weights have stabilized and the classifier has been formed; for further training, the weights and thresholds are loaded directly from the file, and no initialization is needed.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711057406.3A CN107704859A (en) | 2017-11-01 | 2017-11-01 | A kind of character recognition method based on deep learning training framework |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711057406.3A CN107704859A (en) | 2017-11-01 | 2017-11-01 | A kind of character recognition method based on deep learning training framework |
Publications (1)
Publication Number | Publication Date |
---|---|
CN107704859A true CN107704859A (en) | 2018-02-16 |
Family
ID=61178158
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201711057406.3A Pending CN107704859A (en) | 2017-11-01 | 2017-11-01 | A kind of character recognition method based on deep learning training framework |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107704859A (en) |
- 2017-11-01: CN CN201711057406.3A patent/CN107704859A/en active Pending
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104537393A (en) * | 2015-01-04 | 2015-04-22 | 大连理工大学 | Traffic sign recognizing method based on multi-resolution convolution neural networks |
CN105184312A (en) * | 2015-08-24 | 2015-12-23 | 中国科学院自动化研究所 | Character detection method and device based on deep learning |
CN106650748A (en) * | 2016-11-16 | 2017-05-10 | 武汉工程大学 | Chinese character recognition method based on convolution neural network |
CN107145885A (en) * | 2017-05-03 | 2017-09-08 | 金蝶软件(中国)有限公司 | A kind of individual character figure character recognition method and device based on convolutional neural networks |
CN107273897A (en) * | 2017-07-04 | 2017-10-20 | 华中科技大学 | A kind of character recognition method based on deep learning |
Cited By (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108595410A (en) * | 2018-03-19 | 2018-09-28 | 小船出海教育科技(北京)有限公司 | The automatic of hand-written composition corrects method and device |
CN108509934B (en) * | 2018-04-12 | 2021-12-21 | 南京烽火天地通信科技有限公司 | Vietnamese picture identification method based on deep learning |
CN108509934A (en) * | 2018-04-12 | 2018-09-07 | 南京烽火天地通信科技有限公司 | A Vietnamese picture identification method based on deep learning |
CN109753929A (en) * | 2019-01-03 | 2019-05-14 | 华东交通大学 | A kind of united high-speed rail insulator inspection image-recognizing method of picture library |
CN111598079A (en) * | 2019-02-21 | 2020-08-28 | 北京京东尚科信息技术有限公司 | Character recognition method and device |
CN110598737B (en) * | 2019-08-06 | 2023-02-24 | 深圳大学 | Online learning method, device, equipment and medium of deep learning model |
CN110598737A (en) * | 2019-08-06 | 2019-12-20 | 深圳大学 | Online learning method, device, equipment and medium of deep learning model |
CN113673706A (en) * | 2020-05-15 | 2021-11-19 | 富泰华工业(深圳)有限公司 | Machine learning model training method and device and electronic equipment |
CN112308058A (en) * | 2020-10-25 | 2021-02-02 | 北京信息科技大学 | Method for recognizing handwritten characters |
CN112308058B (en) * | 2020-10-25 | 2023-10-24 | 北京信息科技大学 | Method for recognizing handwritten characters |
CN113887282A (en) * | 2021-08-30 | 2022-01-04 | 中国科学院信息工程研究所 | Detection system and method for any-shape adjacent text in scene image |
CN115797952A (en) * | 2023-02-09 | 2023-03-14 | 山东山大鸥玛软件股份有限公司 | Handwritten English line recognition method and system based on deep learning |
CN116824597A (en) * | 2023-07-03 | 2023-09-29 | 金陵科技学院 | Dynamic image segmentation and parallel learning hand-written identity card number and identity recognition method |
CN116824597B (en) * | 2023-07-03 | 2024-05-24 | 金陵科技学院 | Dynamic image segmentation and parallel learning hand-written identity card number and identity recognition method |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107704859A (en) | A kind of character recognition method based on deep learning training framework | |
Rahman et al. | A new benchmark on american sign language recognition using convolutional neural network | |
Pastor-Pellicer et al. | Insights on the use of convolutional neural networks for document image binarization | |
Islalm et al. | Recognition bangla sign language using convolutional neural network | |
Bhowmik et al. | Recognition of Bangla handwritten characters using an MLP classifier based on stroke features | |
CN111652332B (en) | Deep learning handwritten Chinese character recognition method and system based on two classifications | |
CN108985217A (en) | A kind of traffic sign recognition method and system based on deep space network | |
Balaha et al. | Automatic recognition of handwritten Arabic characters: a comprehensive review | |
CN110866530A (en) | Character image recognition method and device and electronic equipment | |
Akhand et al. | Convolutional neural network training incorporating rotation-based generated patterns and handwritten numeral recognition of major Indian scripts | |
Jana et al. | Handwritten digit recognition using convolutional neural networks | |
CN112069900A (en) | Bill character recognition method and system based on convolutional neural network | |
CN115880704B (en) | Automatic cataloging method, system, equipment and storage medium for cases | |
Pratama et al. | Deep convolutional neural network for hand sign language recognition using model E | |
Nongmeikapam et al. | Handwritten Manipuri Meetei-Mayek classification using convolutional neural network | |
Jadhav et al. | Recognition of handwritten bengali characters using low cost convolutional neural network | |
Zhou et al. | Morphological Feature Aware Multi-CNN Model for Multilingual Text Recognition. | |
Liu et al. | Multi-digit recognition with convolutional neural network and long short-term memory | |
Ahmed et al. | Cursive Script Text Recognition in Natural Scene Images | |
Hijam et al. | Convolutional neural network based Meitei Mayek handwritten character recognition | |
Ahmed et al. | Sub-sampling approach for unconstrained Arabic scene text analysis by implicit segmentation based deep learning classifier | |
Thakar et al. | Sign Language to Text Conversion in Real Time using Transfer Learning | |
Shinde et al. | Automatic Data Collection from Forms using Optical Character Recognition | |
Fedorovici et al. | Improved neural network OCR based on preprocessed blob classes | |
Hasan et al. | A new state of art deep learning approach for Bangla handwritten digit recognition using SVM classifier |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication |
Application publication date: 20180216 |
|