CN110008961A - Text real-time identification method, device, computer equipment and storage medium - Google Patents

Text real-time identification method, device, computer equipment and storage medium

Info

Publication number
CN110008961A
CN110008961A
Authority
CN
China
Prior art keywords
output result
convolution
carried out
result
text
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910256927.4A
Other languages
Chinese (zh)
Other versions
CN110008961B (en)
Inventor
张欢
李爱林
张仕洋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Huafu Information Technology Co Ltd
Original Assignee
Shenzhen Huafu Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Huafu Information Technology Co Ltd
Priority to CN201910256927.4A
Publication of CN110008961A
Application granted
Publication of CN110008961B
Legal status: Active
Anticipated expiration

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/044Recurrent networks, e.g. Hopfield networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/26Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V10/267Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10Character recognition
    • G06V30/14Image acquisition
    • G06V30/148Segmentation of character regions
    • G06V30/153Segmentation of character regions using recognition of characters or words


Abstract

The present invention relates to a text real-time identification method, device, computer equipment, and storage medium. The method includes: obtaining image data to be recognized; inputting the image data to be recognized into a text recognition model to perform text recognition, to obtain a recognition result; and aligning the recognition result using a CTC loss function, to obtain a character sequence. The text recognition model is obtained by training a convolutional neural network with labeled image data as sample data. By inputting the image data to be recognized into the text recognition model, and, during training of the text model, combining convolution calculations with pooling-based downsampling, batch normalization layers, and dropout layers to accelerate convergence, improve stability, and prevent overfitting, and by modifying the convolution kernels to reduce the amount of computation, the invention both guarantees low-power text recognition and improves the speed of text recognition.

Description

Text real-time identification method, device, computer equipment and storage medium
Technical field
The present invention relates to character recognition methods, and more specifically to a text real-time identification method, device, computer equipment, and storage medium.
Background art
Text detection comprises text localization and text recognition. Most existing character recognition systems use traditional computer vision algorithms rather than neural networks, so their accuracy is low, and most require character segmentation in advance, whose errors further degrade recognition. The typical scheme is to segment the characters, classify each segmented character separately, and then post-process the results by concatenating all recognized characters into the final recognition result. Such algorithms split recognition into two steps; the error produced by the first step, which is only an intermediate step whose segmentation result is not itself required, propagates to the next step and severely affects the accuracy of single-character classification, thereby degrading the final recognition result.
In addition, there are newer recognition methods that train a text recognition model with currently effective neural networks and use that model to recognize text. In general, text-line recognition is a sequence-to-sequence problem: the input is picture information, i.e., a pixel sequence, and the output is a text sequence. RNN models based on LSTM can solve such sequence problems very well thanks to their good sequence-modeling ability; however, in terms of power consumption and speed, LSTM is, compared with convolution, very unfavorable for deployment on mobile terminals. Moreover, a picture sequence is born without temporal dependence, so modeling it with a heavy LSTM is not the only or best choice, and most neural-network text recognition consumes a large amount of computing resources and cannot be detached from a cloud environment.
Therefore, it is necessary to design a new method that can both guarantee low-power text recognition and improve the speed of text recognition.
Summary of the invention
It is an object of the invention to overcome the deficiencies of the prior art and to provide a text real-time identification method, device, computer equipment, and storage medium.
To achieve the above object, the invention adopts the following technical scheme. The text real-time identification method includes:
Obtaining image data to be recognized;
Inputting the image data to be recognized into a text recognition model to perform text recognition, to obtain a recognition result;
Aligning the recognition result using a CTC loss function, to obtain a character sequence;
Wherein the text recognition model is obtained by training a convolutional neural network with labeled image data as sample data.
In a further technical solution, obtaining the text recognition model by training a convolutional neural network with labeled image data as sample data comprises:
Constructing a loss function and a convolutional neural network;
Obtaining labeled image data, to obtain sample data;
Inputting the sample data into the convolutional neural network and performing convolution calculations, to obtain a sample output result;
Feeding the sample output result and the labeled image data into the loss function, to obtain a loss value;
Adjusting the parameters of the convolutional neural network according to the loss value;
Training the convolutional neural network with the sample data using a deep learning framework, to obtain the text recognition model.
In a further technical solution, inputting the sample data into the convolutional neural network and performing convolution calculations, to obtain the sample output result, comprises:
Performing convolution with a 3*3 kernel on the sample data, to obtain a first output result;
Performing max pooling on the first output result, to obtain a second output result;
Performing cross convolution on the second output result, to obtain a third output result;
Performing mean pooling on the third output result, to obtain a fourth output result;
Performing convolution with a 3*3 kernel and cross convolution on the third output result, to obtain a fifth output result;
Concatenating the fourth output result and the fifth output result, to obtain a sixth output result;
Performing cross convolution on the sixth output result, to obtain a seventh output result;
Concatenating the seventh output result and the fourth output result, to obtain the sixth output result;
Performing cross convolution on the sixth output result, to obtain an eighth output result;
Performing max pooling on the eighth output result, to obtain a ninth output result;
Performing adjacent-area feature-map dropout on the ninth output result, to obtain a tenth output result;
Performing mean pooling on the seventh output result, to obtain an eleventh output result;
Concatenating the tenth output result and the eleventh output result, to obtain a twelfth output result;
Performing cross convolution on the twelfth output result, to obtain a thirteenth output result;
Performing convolution with a 3*3 kernel on the thirteenth output result, to obtain a fourteenth output result;
Performing adjacent-area feature-map dropout on the fourteenth output result, to obtain a fifteenth output result;
Performing convolution with a 3*3 kernel on the fifteenth output result, to obtain a sixteenth output result;
Performing global pooling on the sixteenth output result, to obtain a seventeenth output result;
Fully connecting the seventeenth output result, to obtain an eighteenth output result;
Performing tiling on the eighteenth output result, to obtain a nineteenth output result;
Concatenating the nineteenth output result and the sixteenth output result, to obtain a twentieth output result;
Performing convolutions with 1*8 and 8*1 kernels on the twentieth output result, to obtain the sample output result.
In a further technical solution, performing cross convolution on the second output result, to obtain the third output result, comprises:
Performing convolution with a 1*1 kernel on the second output result, to obtain a preliminary result;
Performing convolution with a 1*3 kernel on the preliminary result, to obtain a secondary result;
Performing convolution with a 3*1 kernel on the secondary result, to obtain a tertiary result;
Performing convolution with a 1*1 kernel on the tertiary result, to obtain the third output result.
In a further technical solution, performing mean pooling on the third output result, to obtain the fourth output result, comprises:
Averaging adjacent pixels in the third output result, to obtain the fourth output result.
In a further technical solution, after aligning the recognition result using the CTC loss function, to obtain the character sequence, the method further comprises:
Outputting the character sequence.
The present invention also provides a text real-time identification apparatus, comprising:
A data acquisition unit, configured to obtain image data to be recognized;
A recognition unit, configured to input the image data to be recognized into a text recognition model to perform text recognition, to obtain a recognition result;
An alignment unit, configured to align the recognition result using a CTC loss function, to obtain a character sequence.
In a further technical solution, the apparatus further comprises:
A training unit, configured to train a convolutional neural network with labeled image data as sample data, to obtain the text recognition model.
The present invention also provides a computer device. The computer device includes a memory and a processor; the memory stores a computer program, and the processor implements the above method when executing the computer program.
The present invention also provides a storage medium. The storage medium stores a computer program which, when executed by a processor, implements the above method.
Compared with the prior art, the invention has the following advantages: the invention performs text recognition by inputting the image data to be recognized into a text recognition model. During training of the text model, convolution calculations are combined with pooling-based downsampling, and batch normalization layers and dropout layers accelerate convergence, improve stability, and prevent overfitting; the convolution kernels are modified to reduce the amount of computation, so that text can be recognized at low power while the speed of text recognition is improved.
The invention will be further described below with reference to the drawings and specific embodiments.
Detailed description of the invention
To illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present invention; for those of ordinary skill in the art, other drawings can be obtained from these drawings without creative effort.
Fig. 1 is a schematic diagram of an application scenario of the text real-time identification method provided by an embodiment of the present invention;
Fig. 2 is a flow diagram of the text real-time identification method provided by an embodiment of the present invention;
Fig. 3 is a sub-flow diagram of the text real-time identification method provided by an embodiment of the present invention;
Fig. 4 is a sub-flow diagram of the text real-time identification method provided by an embodiment of the present invention;
Fig. 5 is a sub-flow diagram of the text real-time identification method provided by an embodiment of the present invention;
Fig. 6 is a schematic diagram of the cross convolution provided by an embodiment of the present invention;
Fig. 7 is a schematic diagram of the mean-pooling (averaging) processing provided by an embodiment of the present invention;
Fig. 8 is a flow diagram of a text real-time identification method provided by another embodiment of the present invention;
Fig. 9 is a schematic block diagram of a text real-time identification apparatus provided by an embodiment of the present invention;
Fig. 10 is a schematic block diagram of a text real-time identification apparatus provided by another embodiment of the present invention;
Fig. 11 is a schematic block diagram of a computer device provided by an embodiment of the present invention.
Specific embodiment
The technical solutions in the embodiments of the present invention will be described clearly and completely below with reference to the drawings in the embodiments. Obviously, the described embodiments are some, not all, of the embodiments of the present invention. Based on the embodiments of the present invention, all other embodiments obtained by those of ordinary skill in the art without creative effort shall fall within the protection scope of the present invention.
It should be appreciated that, when used in this specification and the appended claims, the terms "comprises" and "comprising" indicate the presence of the described features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or sets thereof.
It should also be understood that the terms used in this description of the invention are for the purpose of describing specific embodiments only and are not intended to limit the invention. As used in the description of the invention and the appended claims, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
It should be further understood that the term "and/or" used in the description of the invention and the appended claims refers to any and all possible combinations of one or more of the associated listed items, and includes these combinations.
Please refer to Fig. 1 and Fig. 2. Fig. 1 is a schematic diagram of an application scenario of the text real-time identification method provided by an embodiment of the present invention, and Fig. 2 is a schematic flow chart of the method. The method is applied in a server that exchanges data with a terminal: the terminal captures image data to be recognized and transmits it to the server, the text recognition model in the server performs text recognition on the data, and the recognition result is aligned to obtain the true character sequence, i.e., the text information. The text information can be transmitted back to the terminal, or used to control the terminal to make a corresponding response.
As shown in Fig. 2, the method includes the following steps S110 to S130.
S110: obtain image data to be recognized.
In this embodiment, the image data to be recognized refers to image data captured by a terminal, or image data obtained by scanning or other means.
S120: input the image data to be recognized into a text recognition model to perform text recognition, to obtain a recognition result.
In this embodiment, the recognition result refers to a probability sequence over characters whose length is about 50 to 200.
Wherein the text recognition model is obtained by training a convolutional neural network with labeled image data as sample data.
In one embodiment, referring to Fig. 3, the training of the above text recognition model may include steps S121 to S126.
S121: construct a loss function and a convolutional neural network.
In this embodiment, the convolutional neural network is constructed to perform convolution calculations on image data, so as to achieve classification and target localization. During training, every network needs the loss function to compute a loss value; the loss value represents the gap between the output result and the actual result, and the smaller the loss value, the smaller the gap and the better the network is trained, and vice versa. Convolutional neural networks are widely used in computer vision tasks such as object detection, semantic segmentation, and object classification, where they have achieved very good results, showing their good fitness for visual tasks.
S122: obtain labeled image data, to obtain sample data.
In this embodiment, the sample data refers to image data labeled with text. The sample data can be divided into several training sets and a small test set; the convolutional neural network is trained on the training sets so as to select a network with a small loss value, which is then tested on the test set.
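As a rough illustration only (not part of the patent), a minimal sketch of dividing labeled samples into a large training portion and a small test portion; the 10% test fraction and fixed seed are assumptions:

```python
import random

def split_samples(samples, test_fraction=0.1, seed=0):
    """Divide labeled image data into a training set and a small test set."""
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * (1.0 - test_fraction))
    return shuffled[:cut], shuffled[cut:]  # train, test
```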
S123: input the sample data into the convolutional neural network and perform convolution calculations, to obtain a sample output result.
In this embodiment, the sample output result refers to a probability sequence, i.e., the text sequence predicted from the sample data.
In one embodiment, referring to Fig. 4, the above step S123 may include steps S123a to S123v.
S123a: perform convolution with a 3*3 kernel on the sample data, to obtain a first output result.
S123b: perform max pooling on the first output result, to obtain a second output result.
In this embodiment, max pooling reads the maximum pixel value of the image.
S123c: perform cross convolution on the second output result, to obtain a third output result.
In this embodiment, referring to Fig. 5, the above step S123c may include steps S123c1 to S123c4.
S123c1: perform convolution with a 1*1 kernel on the second output result, to obtain a preliminary result;
S123c2: perform convolution with a 1*3 kernel on the preliminary result, to obtain a secondary result;
S123c3: perform convolution with a 3*1 kernel on the secondary result, to obtain a tertiary result;
S123c4: perform convolution with a 1*1 kernel on the tertiary result, to obtain the third output result.
As shown in Fig. 6, the middle section of the figure is the convolution kernel. The commonly used kernel is 3*3, and such a middle-layer kernel multiplies together the mutual relations of the preceding features, so the computation is large. Here, the superposition of 1*3 and 3*1 convolutions replaces the 3*3 convolution, with 1*1 convolutions before and after forming a bottleneck, which reduces the amount of computation. A sketch of this block is given below.
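A minimal Keras (TensorFlow) sketch of this cross-convolution bottleneck, assuming batch normalization and ReLU after each convolution; the filter counts are illustrative, not taken from the patent:

```python
import tensorflow as tf
from tensorflow.keras import layers

def cross_conv(x, filters, bottleneck_filters):
    """Cross convolution: 1*1 -> 1*3 -> 3*1 -> 1*1 bottleneck.

    The 1*3 and 3*1 pair replaces a 3*3 kernel, and the surrounding
    1*1 convolutions shrink and restore the channels to cut computation.
    """
    y = layers.Conv2D(bottleneck_filters, (1, 1), padding='same')(x)  # shrink channels
    y = layers.BatchNormalization()(y)
    y = layers.ReLU()(y)
    y = layers.Conv2D(bottleneck_filters, (1, 3), padding='same')(y)  # horizontal half of 3*3
    y = layers.BatchNormalization()(y)
    y = layers.ReLU()(y)
    y = layers.Conv2D(bottleneck_filters, (3, 1), padding='same')(y)  # vertical half of 3*3
    y = layers.BatchNormalization()(y)
    y = layers.ReLU()(y)
    y = layers.Conv2D(filters, (1, 1), padding='same')(y)             # restore channels
    y = layers.BatchNormalization()(y)
    return layers.ReLU()(y)
```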
S123d: perform mean pooling on the third output result, to obtain a fourth output result.
Specifically, adjacent pixels in the third output result are averaged, to obtain the fourth output result.
Since the features to be concatenated have different resolutions, the picture information of the larger resolution is aligned by means of mean pooling. So-called mean pooling averages adjacent pixels, reducing them to their average value, as shown in Fig. 7.
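A minimal sketch of this resolution alignment before concatenation, assuming a 2*2 pooling window and illustrative tensor shapes:

```python
import tensorflow as tf
from tensorflow.keras import layers

big = tf.random.normal([1, 32, 200, 64])    # higher-resolution feature map
deep = tf.random.normal([1, 16, 100, 128])  # deeper feature map at half resolution

# Average every 2*2 neighbourhood so both maps share one resolution,
# then concatenate along the channel dimension.
pooled = layers.AveragePooling2D(pool_size=(2, 2))(big)  # -> (1, 16, 100, 64)
merged = layers.Concatenate(axis=-1)([pooled, deep])     # -> (1, 16, 100, 192)
```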
S123e: perform convolution with a 3*3 kernel and cross convolution on the third output result, to obtain a fifth output result;
S123f: concatenate the fourth output result and the fifth output result, to obtain a sixth output result;
S123g: perform cross convolution on the sixth output result, to obtain a seventh output result;
S123h: concatenate the seventh output result and the fourth output result, to obtain the sixth output result;
S123i: perform cross convolution on the sixth output result, to obtain an eighth output result;
S123j: perform max pooling on the eighth output result, to obtain a ninth output result;
S123k: perform adjacent-area feature-map dropout on the ninth output result, to obtain a tenth output result;
S123l: perform mean pooling on the seventh output result, to obtain an eleventh output result;
S123m: concatenate the tenth output result and the eleventh output result, to obtain a twelfth output result;
S123n: perform cross convolution on the twelfth output result, to obtain a thirteenth output result;
S123o: perform convolution with a 3*3 kernel on the thirteenth output result, to obtain a fourteenth output result;
S123p: perform adjacent-area feature-map dropout on the fourteenth output result, to obtain a fifteenth output result;
S123q: perform convolution with a 3*3 kernel on the fifteenth output result, to obtain a sixteenth output result;
S123r: perform global pooling on the sixteenth output result, to obtain a seventeenth output result;
S123s: fully connect the seventeenth output result, to obtain an eighteenth output result;
S123t: perform tiling on the eighteenth output result, to obtain a nineteenth output result;
S123u: concatenate the nineteenth output result and the sixteenth output result, to obtain a twentieth output result;
S123v: perform convolutions with 1*8 and 8*1 kernels on the twentieth output result, to obtain the sample output result.
Shallow and deep features are connected repeatedly to extract the features of the image sequence. Through concatenation, the features extracted early in the network (the shallow features) are continually joined, along the channel dimension, with the features subsequently extracted from them by cross convolution or by 3*3 convolution, so as to obtain a feature map whose height is the number of classes of characters to be recognized (taking the number of commonly used Chinese characters, 8500) and whose width is W (a number that can be set between 50 and 200). The feature map is then cut along its width to obtain a feature sequence of length W, i.e., the probability sequence.
The above cross convolution first performs a 1*1 convolution, then a 1*3 convolution, then a 3*1 convolution, and finally a 1*1 convolution. Convolution calculations are combined with pooling-based downsampling, and batch normalization layers and dropout layers accelerate convergence, improve stability, and prevent overfitting. Randomly dropping features is effective for fully connected layers, but experiments show it is not as effective for convolutional layers; therefore the latest dropout scheme for convolutional layers is used, enhancing the robustness of the network.
At the end of the convolutional network, large horizontal and vertical convolution kernels (1*8 and 8*1) are used. While keeping the computation small, these kernels are long in the horizontal and vertical directions (both 8), so they make good use of the related information between horizontal positions and between vertical positions, replacing LSTM's ability to process the horizontal-position features and character-sequence features of the image and compensating for the absence of LSTM. LSTM was originally used mainly in speech processing, natural language processing, and similar fields, where it handles sequence-input-to-sequence-output problems well. In the text recognition task, the picture can be split into a picture sequence and the output is a word sequence, so a sequence-to-sequence architecture can be used; unlike speech, however, a picture naturally has only a left-right structure, and the left-to-right sequence of a text picture has no dependence like speech. Therefore long-kernel convolution over the text picture can well substitute for an LSTM network. A condensed sketch of this tail of the network is given below.
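A condensed sketch of the tail of the network, under assumptions not fixed by the text: the input feature map is taken to have height 8 (matching the 8-long kernels), the global feature is tiled back over the map before the long convolutions, and the filter counts are illustrative:

```python
import tensorflow as tf
from tensorflow.keras import layers

NUM_CLASSES = 8500   # number of commonly used Chinese characters (from the text)
W = 50               # sequence length, settable between 50 and 200

def recognition_head(feat):
    """Global context plus long 1*8 / 8*1 kernels in place of an LSTM.

    `feat` is assumed to be the sixteenth output result with static
    shape (batch, 8, W, C).
    """
    c = feat.shape[-1]
    # Global pooling -> fully connected -> tiled back over the map,
    # so every horizontal position sees a summary of the whole line.
    g = layers.GlobalAveragePooling2D()(feat)                   # (batch, C)
    g = layers.Dense(c, activation='relu')(g)
    g = layers.Reshape((1, 1, c))(g)
    g = tf.tile(g, [1, feat.shape[1], feat.shape[2], 1])        # (batch, 8, W, C)
    x = layers.Concatenate(axis=-1)([g, feat])                  # (batch, 8, W, 2C)
    # Long horizontal then vertical kernels exploit left-right and
    # top-bottom context along the text line at small cost.
    x = layers.Conv2D(c, (1, 8), padding='same', activation='relu')(x)
    x = layers.Conv2D(NUM_CLASSES, (8, 1), padding='valid')(x)  # (batch, 1, W, classes)
    # Cut along the width: one class distribution per horizontal step.
    return tf.squeeze(x, axis=1)                                # (batch, W, NUM_CLASSES)

# Example: feat = tf.random.normal([2, 8, W, 128]); seq = recognition_head(feat)
```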
S124: feed the sample output result and the labeled image data into the loss function, to obtain a loss value;
S125: adjust the parameters of the convolutional neural network according to the loss value;
S126: train the convolutional neural network with the sample data using a deep learning framework, to obtain the text recognition model.
By continuously adjusting the parameters of the convolutional neural network and training it repeatedly, a convolutional neural network that meets the requirements is obtained. Specifically, training uses TensorFlow; after conversion, the corresponding text recognition model is very conveniently deployed on a server or terminal through TensorFlow tflite and TensorFlow mace. It not only supports running on common controllers, but can also be accelerated on the controllers of relevant devices through OpenCL (Open Computing Language), as sketched below.
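A minimal TensorFlow training-step and TF Lite conversion sketch, consistent with the description; the optimizer, learning rate, blank index, and the `model`, `images`, `labels`, `label_lengths` placeholders are assumptions rather than details from the patent:

```python
import tensorflow as tf

optimizer = tf.keras.optimizers.Adam(1e-3)  # assumed optimizer and learning rate

@tf.function
def train_step(model, images, labels, label_lengths):
    """One training iteration: forward pass, CTC loss, parameter update."""
    with tf.GradientTape() as tape:
        logits = model(images, training=True)              # (batch, W, classes)
        logit_lengths = tf.fill([tf.shape(images)[0]], logits.shape[1])
        loss = tf.reduce_mean(tf.nn.ctc_loss(
            labels=labels, logits=logits,
            label_length=label_lengths, logit_length=logit_lengths,
            logits_time_major=False, blank_index=-1))      # blank = last class
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    return loss

def export_tflite(model, path='text_recognizer.tflite'):
    """Convert the trained Keras model for on-device deployment."""
    converter = tf.lite.TFLiteConverter.from_keras_model(model)
    with open(path, 'wb') as f:
        f.write(converter.convert())
```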
The obtained text recognition model requires only about 0.22 GFLOPs for a single forward pass, so the forward computation can handle a large number of text recognition tasks in real time. Eliminating the complicated RNN (recurrent neural network) removes a large amount of the computing-power and memory demand on embedded devices. In addition, a text recognition algorithm put into actual use must face a series of problems such as blurred pictures, bad illumination, and physical deformation; through fine and extensive text augmentation and generation, these problems are handled carefully, so that the algorithm achieves very good results in real scenes and in actual service tests.
S130: align the recognition result using a CTC loss function, to obtain a character sequence.
The text recognition model outputs a probability sequence whose length is about 50 to 200. Since the final purpose is to obtain the true character sequence, i.e., the actual characters in the image data to be recognized (for example, a license plate is usually a 7-character sequence), the two need to be aligned. The CTC loss function, widely used in speech recognition, is used to perform this alignment and obtain the character sequence.
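For illustration, a minimal greedy CTC decoding sketch (collapse repeated classes, then drop blanks); the blank index (last class, matching the training sketch above) and the `charset` lookup table are assumptions:

```python
import tensorflow as tf

def ctc_decode(logits, charset):
    """Greedy CTC alignment of a (W, num_classes) probability sequence."""
    blank = logits.shape[-1] - 1                 # blank assumed to be the last class
    best = tf.argmax(logits, axis=-1).numpy()    # best class per horizontal step
    chars, prev = [], blank
    for idx in best:
        if idx != prev and idx != blank:         # collapse repeats, strip blanks
            chars.append(charset[idx])
        prev = idx
    return ''.join(chars)
```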
The method runs on the Android device RK3399 for several typical text recognition tasks: recognizing 8 digits, the accuracy on a private test set is about 99.1% at a speed of about 20 milliseconds; recognizing 14 Chinese characters, the accuracy on a private test set is about 98.8% at a speed of about 46 milliseconds.
The above text real-time identification method performs text recognition by inputting the image data to be recognized into a text recognition model. During training of the text model, convolution calculations are combined with pooling-based downsampling, and batch normalization layers and dropout layers accelerate convergence, improve stability, and prevent overfitting; the convolution kernels are modified to reduce the amount of computation, so that text can be recognized at low power while the speed of text recognition is improved.
Fig. 8 is a flow diagram of a text real-time identification method provided by another embodiment of the present invention. As shown in Fig. 8, the method of this embodiment includes steps S210 to S240. Steps S210 to S230 are similar to steps S110 to S130 in the above embodiment and are not described again here. Step S240, added in this embodiment, is described in detail below.
S240: output the character sequence.
The recognized character sequence is output to the terminal for display, or a corresponding response is made according to the output character sequence, for example retrieving corresponding data.
Fig. 9 is a schematic block diagram of a text real-time identification apparatus 300 provided by an embodiment of the present invention. As shown in Fig. 9, corresponding to the above text real-time identification method, the present invention also provides a text real-time identification apparatus 300. The apparatus 300 includes units for executing the above method, and can be configured in a server or a terminal.
Specifically, referring to Fig. 9, the text real-time identification apparatus 300 includes:
A data acquisition unit 301, configured to obtain image data to be recognized;
A recognition unit 302, configured to input the image data to be recognized into a text recognition model to perform text recognition, to obtain a recognition result;
An alignment unit 303, configured to align the recognition result using a CTC loss function, to obtain a character sequence.
In one embodiment, the apparatus further includes:
A training unit, configured to train a convolutional neural network with labeled image data as sample data, to obtain the text recognition model.
In one embodiment, the training unit includes:
A construction subunit, configured to construct a loss function and a convolutional neural network;
A sample data forming subunit, configured to obtain labeled image data, to obtain sample data;
A computation subunit, configured to input the sample data into the convolutional neural network and perform convolution calculations, to obtain a sample output result;
A loss value obtaining subunit, configured to feed the sample output result and the labeled image data into the loss function, to obtain a loss value;
A parameter adjusting subunit, configured to adjust the parameters of the convolutional neural network according to the loss value;
A learning subunit, configured to train the convolutional neural network with the sample data using a deep learning framework, to obtain the text recognition model.
In one embodiment, the computation subunit includes:
A first convolution processing module, configured to perform convolution with a 3*3 kernel on the sample data, to obtain a first output result;
A first max pooling module, configured to perform max pooling on the first output result, to obtain a second output result;
A second convolution processing module, configured to perform cross convolution on the second output result, to obtain a third output result;
A first mean pooling module, configured to perform mean pooling on the third output result, to obtain a fourth output result;
A third convolution processing module, configured to perform convolution with a 3*3 kernel and cross convolution on the third output result, to obtain a fifth output result;
A first concatenation module, configured to concatenate the fourth output result and the fifth output result, to obtain a sixth output result;
A fourth convolution processing module, configured to perform cross convolution on the sixth output result, to obtain a seventh output result;
A second concatenation module, configured to concatenate the seventh output result and the fourth output result, to obtain the sixth output result;
A fifth convolution processing module, configured to perform cross convolution on the sixth output result, to obtain an eighth output result;
A second max pooling module, configured to perform max pooling on the eighth output result, to obtain a ninth output result;
A first dropout module, configured to perform adjacent-area feature-map dropout on the ninth output result, to obtain a tenth output result;
A second mean pooling module, configured to perform mean pooling on the seventh output result, to obtain an eleventh output result;
A third concatenation module, configured to concatenate the tenth output result and the eleventh output result, to obtain a twelfth output result;
A sixth convolution processing module, configured to perform cross convolution on the twelfth output result, to obtain a thirteenth output result;
A seventh convolution processing module, configured to perform convolution with a 3*3 kernel on the thirteenth output result, to obtain a fourteenth output result;
A second dropout module, configured to perform adjacent-area feature-map dropout on the fourteenth output result, to obtain a fifteenth output result;
An eighth convolution processing module, configured to perform convolution with a 3*3 kernel on the fifteenth output result, to obtain a sixteenth output result;
A global pooling module, configured to perform global pooling on the sixteenth output result, to obtain a seventeenth output result;
A full connection module, configured to fully connect the seventeenth output result, to obtain an eighteenth output result;
A tiling module, configured to perform tiling on the eighteenth output result, to obtain a nineteenth output result;
A fourth concatenation module, configured to concatenate the nineteenth output result and the sixteenth output result, to obtain a twentieth output result;
A ninth convolution processing module, configured to perform convolutions with 1*8 and 8*1 kernels on the twentieth output result, to obtain the sample output result.
In one embodiment, the second convolution processing module includes:
A preliminary convolution submodule, configured to perform convolution with a 1*1 kernel on the second output result, to obtain a preliminary result;
A secondary convolution submodule, configured to perform convolution with a 1*3 kernel on the preliminary result, to obtain a secondary result;
A tertiary convolution submodule, configured to perform convolution with a 3*1 kernel on the secondary result, to obtain a tertiary result;
A fourth convolution submodule, configured to perform convolution with a 1*1 kernel on the tertiary result, to obtain the third output result.
Figure 10 is a schematic block diagram of a text real-time identification apparatus 300 provided by another embodiment of the present invention. As shown in Figure 10, the apparatus 300 of this embodiment adds an output unit 304 on the basis of the above embodiment.
The output unit 304 is configured to output the character sequence.
It should be noted that, as is clear to those skilled in the art, for the specific implementation process of the above text real-time identification apparatus 300 and each of its units, reference can be made to the corresponding description in the foregoing method embodiments; for convenience and brevity of description, details are not repeated here.
The above text real-time identification apparatus 300 can be implemented in the form of a computer program that can run on a computer device as shown in Figure 11.
Please refer to Figure 11, which is a schematic block diagram of a computer device provided by an embodiment of the present application. The computer device 500 can be a terminal or a server; the terminal can be an electronic device with a communication function such as a smart phone, tablet computer, notebook computer, desktop computer, personal digital assistant, or wearable device, and the server can be an independent server or a server cluster composed of multiple servers.
Referring to Figure 11, the computer device 500 includes a processor 502, a memory, and a network interface 505 connected by a system bus 501, where the memory may include a non-volatile storage medium 503 and an internal memory 504.
The non-volatile storage medium 503 can store an operating system 5031 and a computer program 5032. The computer program 5032 includes program instructions that, when executed, can cause the processor 502 to execute a text real-time identification method.
The processor 502 is used to provide computing and control capability to support the operation of the entire computer device 500.
The internal memory 504 provides an environment for running the computer program 5032 in the non-volatile storage medium 503; when the computer program 5032 is executed by the processor 502, the processor 502 can be caused to execute a text real-time identification method.
The network interface 505 is used for network communication with other devices. Those skilled in the art will understand that the structure shown in Figure 11 is only a block diagram of the part of the structure relevant to the scheme of the present application and does not constitute a limitation on the computer device 500 to which the scheme is applied; a specific computer device 500 may include more or fewer components than shown, combine certain components, or have a different arrangement of components.
The processor 502 is used to run the computer program 5032 stored in the memory, to implement the following steps:
Obtaining image data to be recognized;
Inputting the image data to be recognized into a text recognition model to perform text recognition, to obtain a recognition result;
Aligning the recognition result using a CTC loss function, to obtain a character sequence;
Wherein the text recognition model is obtained by training a convolutional neural network with labeled image data as sample data.
In one embodiment, when implementing the step in which the text recognition model is obtained by training a convolutional neural network with labeled image data as sample data, the processor 502 specifically implements the following steps:
Constructing a loss function and a convolutional neural network;
Obtaining labeled image data, to obtain sample data;
Inputting the sample data into the convolutional neural network and performing convolution calculations, to obtain a sample output result;
Feeding the sample output result and the labeled image data into the loss function, to obtain a loss value;
Adjusting the parameters of the convolutional neural network according to the loss value;
Training the convolutional neural network with the sample data using a deep learning framework, to obtain the text recognition model.
In one embodiment, when implementing the step of inputting the sample data into the convolutional neural network and performing convolution calculations, to obtain the sample output result, the processor 502 specifically implements the following steps:
Performing convolution with a 3*3 kernel on the sample data, to obtain a first output result;
Performing max pooling on the first output result, to obtain a second output result;
Performing cross convolution on the second output result, to obtain a third output result;
Performing mean pooling on the third output result, to obtain a fourth output result;
Performing convolution with a 3*3 kernel and cross convolution on the third output result, to obtain a fifth output result;
Concatenating the fourth output result and the fifth output result, to obtain a sixth output result;
Performing cross convolution on the sixth output result, to obtain a seventh output result;
Concatenating the seventh output result and the fourth output result, to obtain the sixth output result;
Performing cross convolution on the sixth output result, to obtain an eighth output result;
Performing max pooling on the eighth output result, to obtain a ninth output result;
Performing adjacent-area feature-map dropout on the ninth output result, to obtain a tenth output result;
Performing mean pooling on the seventh output result, to obtain an eleventh output result;
Concatenating the tenth output result and the eleventh output result, to obtain a twelfth output result;
Performing cross convolution on the twelfth output result, to obtain a thirteenth output result;
Performing convolution with a 3*3 kernel on the thirteenth output result, to obtain a fourteenth output result;
Performing adjacent-area feature-map dropout on the fourteenth output result, to obtain a fifteenth output result;
Performing convolution with a 3*3 kernel on the fifteenth output result, to obtain a sixteenth output result;
Performing global pooling on the sixteenth output result, to obtain a seventeenth output result;
Fully connecting the seventeenth output result, to obtain an eighteenth output result;
Performing tiling on the eighteenth output result, to obtain a nineteenth output result;
Concatenating the nineteenth output result and the sixteenth output result, to obtain a twentieth output result;
Performing convolutions with 1*8 and 8*1 kernels on the twentieth output result, to obtain the sample output result.
In one embodiment, when implementing the step of performing cross convolution on the second output result, to obtain the third output result, the processor 502 specifically implements the following steps:
Performing convolution with a 1*1 kernel on the second output result, to obtain a preliminary result;
Performing convolution with a 1*3 kernel on the preliminary result, to obtain a secondary result;
Performing convolution with a 3*1 kernel on the secondary result, to obtain a tertiary result;
Performing convolution with a 1*1 kernel on the tertiary result, to obtain the third output result.
In one embodiment, when implementing the step of performing mean pooling on the third output result, to obtain the fourth output result, the processor 502 specifically implements the following step:
Averaging adjacent pixels in the third output result, to obtain the fourth output result.
In one embodiment, after implementing the step of aligning the recognition result using the CTC loss function, to obtain the character sequence, the processor 502 further implements the following step:
Outputting the character sequence.
It should be appreciated that, in the embodiments of the present application, the processor 502 may be a central processing unit (CPU), or another general-purpose processor, a digital signal processor (DSP), an application specific integrated circuit (ASIC), a field-programmable gate array (FPGA), or another programmable logic device, discrete gate or transistor logic device, discrete hardware component, etc. The general-purpose processor may be a microprocessor, or any conventional processor.
Those of ordinary skill in the art will appreciate that all or part of the processes in the methods of the above embodiments can be completed by instructing relevant hardware through a computer program. The computer program includes program instructions and can be stored in a storage medium, which is a computer-readable storage medium. The program instructions are executed by at least one processor in the computer system to implement the process steps of the above method embodiments.
Therefore, the present invention also provides a storage medium. The storage medium can be a computer-readable storage medium. The storage medium stores a computer program which, when executed by a processor, causes the processor to execute the following steps:
Obtaining image data to be recognized;
Inputting the image data to be recognized into a text recognition model to perform text recognition, to obtain a recognition result;
Aligning the recognition result using a CTC loss function, to obtain a character sequence;
Wherein the text recognition model is obtained by training a convolutional neural network with labeled image data as sample data.
In one embodiment, when executing the computer program to implement the step in which the text recognition model is obtained by training a convolutional neural network with labeled image data as sample data, the processor specifically implements the following steps:
Constructing a loss function and a convolutional neural network;
Obtaining labeled image data, to obtain sample data;
Inputting the sample data into the convolutional neural network and performing convolution calculations, to obtain a sample output result;
Feeding the sample output result and the labeled image data into the loss function, to obtain a loss value;
Adjusting the parameters of the convolutional neural network according to the loss value;
Training the convolutional neural network with the sample data using a deep learning framework, to obtain the text recognition model.
In one embodiment, when executing the computer program to implement the step of inputting the sample data into the convolutional neural network and performing convolution calculations, to obtain the sample output result, the processor specifically implements the following steps:
Performing convolution with a 3*3 kernel on the sample data, to obtain a first output result;
Performing max pooling on the first output result, to obtain a second output result;
Performing cross convolution on the second output result, to obtain a third output result;
Performing mean pooling on the third output result, to obtain a fourth output result;
Performing convolution with a 3*3 kernel and cross convolution on the third output result, to obtain a fifth output result;
Concatenating the fourth output result and the fifth output result, to obtain a sixth output result;
Performing cross convolution on the sixth output result, to obtain a seventh output result;
Concatenating the seventh output result and the fourth output result, to obtain the sixth output result;
Performing cross convolution on the sixth output result, to obtain an eighth output result;
Performing max pooling on the eighth output result, to obtain a ninth output result;
Performing adjacent-area feature-map dropout on the ninth output result, to obtain a tenth output result;
Performing mean pooling on the seventh output result, to obtain an eleventh output result;
Concatenating the tenth output result and the eleventh output result, to obtain a twelfth output result;
Performing cross convolution on the twelfth output result, to obtain a thirteenth output result;
Performing convolution with a 3*3 kernel on the thirteenth output result, to obtain a fourteenth output result;
Performing adjacent-area feature-map dropout on the fourteenth output result, to obtain a fifteenth output result;
Performing convolution with a 3*3 kernel on the fifteenth output result, to obtain a sixteenth output result;
Performing global pooling on the sixteenth output result, to obtain a seventeenth output result;
Fully connecting the seventeenth output result, to obtain an eighteenth output result;
Performing tiling on the eighteenth output result, to obtain a nineteenth output result;
Concatenating the nineteenth output result and the sixteenth output result, to obtain a twentieth output result;
Performing convolutions with 1*8 and 8*1 kernels on the twentieth output result, to obtain the sample output result.
In one embodiment, when executing the computer program to implement the step of performing cross convolution on the second output result, to obtain the third output result, the processor specifically implements the following steps:
Performing convolution with a 1*1 kernel on the second output result, to obtain a preliminary result;
Performing convolution with a 1*3 kernel on the preliminary result, to obtain a secondary result;
Performing convolution with a 3*1 kernel on the secondary result, to obtain a tertiary result;
Performing convolution with a 1*1 kernel on the tertiary result, to obtain the third output result.
In one embodiment, when executing the computer program to implement the step of performing mean pooling on the third output result, to obtain the fourth output result, the processor specifically implements the following step:
Averaging adjacent pixels in the third output result, to obtain the fourth output result.
In one embodiment, after executing the computer program to implement the step of aligning the recognition result using the CTC loss function, to obtain the character sequence, the processor further implements the following step:
Outputting the character sequence.
The storage medium can be a USB flash disk, a mobile hard disk, a read-only memory (ROM), a magnetic disk, an optical disk, or any of various other computer-readable storage media that can store program code.
Those of ordinary skill in the art may realize that the units and algorithm steps described in connection with the embodiments disclosed herein can be implemented by electronic hardware, computer software, or a combination of the two. To clearly demonstrate the interchangeability of hardware and software, the composition and steps of each example have been described generally by function in the above description. Whether these functions are implemented in hardware or software depends on the specific application and design constraints of the technical solution. Skilled artisans may use different methods to implement the described functions for each specific application, but such implementations should not be considered beyond the scope of the present invention.
In the several embodiments provided by the present invention, it should be understood that the disclosed apparatus and method can be implemented in other ways. For example, the apparatus embodiments described above are merely exemplary: the division into units is only a logical functional division, and there may be other division manners in actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed.
The steps in the embodiments of the present invention can be reordered, merged, and deleted according to actual needs. The units in the apparatus of the embodiments of the present invention can be combined, divided, and deleted according to actual needs. In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, may exist separately physically, or two or more units may be integrated into one unit.
If the integrated unit is realized in the form of a software functional unit and sold or used as an independent product, it can be stored in a storage medium. Based on this understanding, the technical solution of the present invention, in essence, or the part that contributes to the prior art, or all or part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a terminal, a network device, etc.) to perform all or part of the steps of the methods described in the embodiments of the present invention.
The above description is merely specific embodiments of the present invention, but the protection scope of the present invention is not limited thereto. Any person familiar with the technical field can readily conceive of various equivalent modifications or replacements within the technical scope disclosed by the present invention, and these modifications or replacements shall be covered by the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (10)

1. A real-time text recognition method, characterized by comprising:
acquiring image data to be recognized;
inputting the image data to be recognized into a text recognition model for text recognition, to obtain a recognition result;
aligning the recognition result using a CTC loss function, to obtain a character sequence;
wherein the text recognition model is obtained by training a convolutional neural network with labeled image data as sample data.
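Purely as an illustration and not part of the claims: the inference flow of claim 1 (acquire an image, run the text recognition model, align the result with CTC) might be sketched in Python with PyTorch as follows. The recognizer model and its output shape (T time steps by num_classes scores) are assumptions, and decode_greedy is the hypothetical helper sketched earlier in this document.

import torch

def recognize(image, recognizer, charset):
    """Run one image tensor (C, H, W) through a trained recognizer and decode."""
    with torch.no_grad():
        logits = recognizer(image.unsqueeze(0))          # (1, T, num_classes)
        frame_labels = logits.squeeze(0).argmax(dim=-1)  # best label per time step
    return decode_greedy(frame_labels.tolist(), charset)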
2. The real-time text recognition method according to claim 1, characterized in that obtaining the text recognition model by training a convolutional neural network with labeled image data as sample data comprises:
constructing a loss function and a convolutional neural network;
acquiring labeled image data, to obtain sample data;
inputting the sample data into the convolutional neural network for convolutional calculation, to obtain a sample output result;
inputting the sample output result and the labeled image data into the loss function, to obtain a loss value;
adjusting the parameters of the convolutional neural network according to the loss value;
training the convolutional neural network with the sample data using a deep learning framework, to obtain the text recognition model.
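Not part of the claims; a minimal sketch of the training step of claim 2, assuming PyTorch and its built-in nn.CTCLoss as the constructed loss function. The network, optimizer, and batch variables are hypothetical placeholders for the claimed convolutional neural network and sample data, and the targets are assumed to be prepared in the layout nn.CTCLoss expects.

import torch
import torch.nn as nn

def train_step(net, optimizer, images, targets, target_lengths):
    """One parameter update: forward pass, CTC loss, backward pass."""
    ctc = nn.CTCLoss(blank=0)
    logits = net(images)                                 # sample output, (N, T, C)
    log_probs = logits.log_softmax(-1).permute(1, 0, 2)  # CTCLoss expects (T, N, C)
    T, N = log_probs.shape[0], log_probs.shape[1]
    input_lengths = torch.full((N,), T, dtype=torch.long)
    loss = ctc(log_probs, targets, input_lengths, target_lengths)  # loss value
    optimizer.zero_grad()
    loss.backward()           # gradients used to adjust the network parameters
    optimizer.step()
    return loss.item()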
3. The real-time text recognition method according to claim 2, characterized in that inputting the sample data into the convolutional neural network for convolutional calculation to obtain a sample output result comprises:
performing convolution with a 3*3 convolution kernel on the sample data, to obtain a first output result;
performing max pooling on the first output result, to obtain a second output result;
performing cross convolution on the second output result, to obtain a third output result;
performing mean pooling on the third output result, to obtain a fourth output result;
performing convolution with a 3*3 convolution kernel and cross convolution on the third output result, to obtain a fifth output result;
splicing the fourth output result and the fifth output result, to obtain a sixth output result;
performing cross convolution on the sixth output result, to obtain a seventh output result;
splicing the seventh output result and the fourth output result, to obtain the sixth output result;
performing cross convolution on the sixth output result, to obtain an eighth output result;
performing max pooling on the eighth output result, to obtain a ninth output result;
performing dropout of adjacent regions of the feature map on the ninth output result, to obtain a tenth output result;
performing mean pooling on the seventh output result, to obtain an eleventh output result;
splicing the tenth output result and the eleventh output result, to obtain a twelfth output result;
performing cross convolution on the twelfth output result, to obtain a thirteenth output result;
performing convolution with a 3*3 convolution kernel on the thirteenth output result, to obtain a fourteenth output result;
performing dropout of adjacent regions of the feature map on the fourteenth output result, to obtain a fifteenth output result;
performing convolution with a 3*3 convolution kernel on the fifteenth output result, to obtain a sixteenth output result;
performing global pooling on the sixteenth output result, to obtain a seventeenth output result;
performing full connection on the seventeenth output result, to obtain an eighteenth output result;
performing tiling on the eighteenth output result, to obtain a nineteenth output result;
splicing the nineteenth output result and the sixteenth output result, to obtain a twentieth output result;
performing convolution with 1*8 and 8*1 convolution kernels on the twentieth output result, to obtain the sample output result.
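Not part of the claims; as one possible reading of the tail of claim 3 (the seventeenth output result through the sample output result), the following PyTorch sketch applies global pooling, a full connection, tiling back to the spatial size, splicing with the sixteenth output result along channels, and the final 1*8 and 8*1 convolutions. All channel and class counts are invented for illustration, and the input is assumed to be at least 8 pixels in each spatial dimension.

import torch
import torch.nn as nn
import torch.nn.functional as F

class ClaimThreeTail(nn.Module):
    def __init__(self, channels=128, num_classes=37):
        super().__init__()
        self.fc = nn.Linear(channels, channels)               # full connection
        self.conv_1x8 = nn.Conv2d(2 * channels, channels, kernel_size=(1, 8))
        self.conv_8x1 = nn.Conv2d(channels, num_classes, kernel_size=(8, 1))

    def forward(self, x16):                     # x16: the sixteenth output result
        pooled = F.adaptive_avg_pool2d(x16, 1)  # global pooling -> seventeenth
        fc_out = self.fc(pooled.flatten(1))     # eighteenth
        h, w = x16.shape[2:]
        tiled = fc_out[:, :, None, None].expand(-1, -1, h, w)  # tiling -> nineteenth
        fused = torch.cat([tiled, x16], dim=1)  # splice -> twentieth
        return self.conv_8x1(self.conv_1x8(fused))  # 1*8 then 8*1 convolution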
4. The real-time text recognition method according to claim 3, characterized in that performing cross convolution on the second output result to obtain a third output result comprises:
performing convolution with a 1*1 convolution kernel on the second output result, to obtain a preliminary result;
performing convolution with a 1*3 convolution kernel on the preliminary result, to obtain a secondary result;
performing convolution with a 3*1 convolution kernel on the secondary result, to obtain a tertiary result;
performing convolution with a 1*1 convolution kernel on the tertiary result, to obtain the third output result.
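Not part of the claims; the cross convolution of claim 4 reads as a factorized convolution block (1*1, then 1*3, then 3*1, then 1*1), which might be sketched as a PyTorch module as below. The channel counts and the padding that preserves spatial size are assumptions.

import torch.nn as nn

class CrossConv(nn.Module):
    def __init__(self, in_ch, mid_ch, out_ch):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv2d(in_ch, mid_ch, kernel_size=1),                        # preliminary result
            nn.Conv2d(mid_ch, mid_ch, kernel_size=(1, 3), padding=(0, 1)),  # secondary result
            nn.Conv2d(mid_ch, mid_ch, kernel_size=(3, 1), padding=(1, 0)),  # tertiary result
            nn.Conv2d(mid_ch, out_ch, kernel_size=1),                       # third output result
        )

    def forward(self, x):
        return self.block(x)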
5. The real-time text recognition method according to claim 3, characterized in that performing mean pooling on the third output result to obtain a fourth output result comprises:
averaging adjacent pixels in the third output result, to obtain the fourth output result.
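Not part of the claims; averaging adjacent pixels as in claim 5 corresponds to ordinary average pooling. A one-line PyTorch sketch, where the 2*2 window and stride of 2 are assumptions:

import torch.nn.functional as F

def mean_pool(x):
    return F.avg_pool2d(x, kernel_size=2, stride=2)  # average each 2*2 neighborhood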
6. The real-time text recognition method according to any one of claims 1 to 5, characterized in that after aligning the recognition result using the CTC loss function to obtain a character sequence, the method further comprises:
outputting the character sequence.
7. A real-time text recognition device, characterized by comprising:
a data acquisition unit, configured to acquire image data to be recognized;
a recognition unit, configured to input the image data to be recognized into a text recognition model for text recognition, to obtain a recognition result;
an alignment unit, configured to align the recognition result using a CTC loss function, to obtain a character sequence.
8. The real-time text recognition device according to claim 7, characterized in that the device further comprises:
a training unit, configured to train a convolutional neural network with labeled image data as sample data, to obtain the text recognition model.
9. A computer device, characterized in that the computer device comprises a memory and a processor, the memory storing a computer program, and the processor, when executing the computer program, implementing the method according to any one of claims 1 to 7.
10. A storage medium, characterized in that the storage medium stores a computer program which, when executed by a processor, implements the method according to any one of claims 1 to 7.
CN201910256927.4A 2019-04-01 2019-04-01 Text real-time identification method, text real-time identification device, computer equipment and storage medium Active CN110008961B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910256927.4A CN110008961B (en) 2019-04-01 2019-04-01 Text real-time identification method, text real-time identification device, computer equipment and storage medium


Publications (2)

Publication Number Publication Date
CN110008961A (en) 2019-07-12
CN110008961B CN110008961B (en) 2023-05-12

Family

ID=67169203

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910256927.4A Active CN110008961B (en) 2019-04-01 2019-04-01 Text real-time identification method, text real-time identification device, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN110008961B (en)



Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105335754A (en) * 2015-10-29 2016-02-17 Xiaomi Technology Co., Ltd. Character recognition method and device
CN106354701A (en) * 2016-08-30 2017-01-25 Tencent Technology (Shenzhen) Co., Ltd. Chinese character processing method and device
CN106570509A (en) * 2016-11-04 2017-04-19 Tianjin University Dictionary learning and coding method for extracting digital image features
CN108182455A (en) * 2018-01-18 2018-06-19 Qilu University of Technology Method, apparatus and intelligent garbage bin for intelligent classification of garbage images
CN108427953A (en) * 2018-02-26 2018-08-21 Beijing Yida Turing Technology Co., Ltd. Character recognition method and device
CN108875904A (en) * 2018-04-04 2018-11-23 Beijing Megvii Technology Co., Ltd. Image processing method, image processing apparatus and computer-readable storage medium

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110688411A (en) * 2019-09-25 2020-01-14 Beijing Horizon Robotics Technology R&D Co., Ltd. Text recognition method and device
CN112668600A (en) * 2019-10-16 2021-04-16 SenseTime International Pte. Ltd. Text recognition method and device
CN112668600B (en) * 2019-10-16 2024-05-21 SenseTime International Pte. Ltd. Text recognition method and device
CN111428656A (en) * 2020-03-27 2020-07-17 Sunyard System Engineering Co., Ltd. Mobile terminal identity card recognition method based on deep learning, and mobile device
CN112215229A (en) * 2020-08-27 2021-01-12 Beijing Yingtaizhi Technology Co., Ltd. License plate recognition method and device based on a lightweight end-to-end network
CN112215229B (en) * 2020-08-27 2023-07-18 Beijing Yingtaizhi Technology Co., Ltd. License plate recognition method and device based on a lightweight end-to-end network
CN112116001A (en) * 2020-09-17 2020-12-22 Suzhou Inspur Intelligent Technology Co., Ltd. Image recognition method, image recognition device and computer-readable storage medium
CN112116001B (en) * 2020-09-17 2022-06-07 Suzhou Inspur Intelligent Technology Co., Ltd. Image recognition method, image recognition device and computer-readable storage medium
CN113283427A (en) * 2021-07-20 2021-08-20 Beijing Century TAL Education Technology Co., Ltd. Text recognition method, device, equipment and medium
CN113283427B (en) * 2021-07-20 2021-10-01 Beijing Century TAL Education Technology Co., Ltd. Text recognition method, device, equipment and medium
WO2024088269A1 (en) * 2022-10-26 2024-05-02 Vivo Mobile Communication Co., Ltd. Character recognition method and apparatus, electronic device and storage medium

Also Published As

Publication number Publication date
CN110008961B (en) 2023-05-12

Similar Documents

Publication Publication Date Title
CN110008961A (en) Text real-time identification method, device, computer equipment and storage medium
WO2020199931A1 (en) Face key point detection method and apparatus, and storage medium and electronic device
CN109685819B (en) Three-dimensional medical image segmentation method based on feature enhancement
KR20210073569A (en) Method, apparatus, device and storage medium for training image semantic segmentation network
CN110188331A (en) Model training method, conversational system evaluation method, device, equipment and storage medium
CN109902546A (en) Face identification method, device and computer-readable medium
CN109522967A Commodity attribute recognition method, device, equipment and storage medium
CN108288075A Lightweight small-target detection method improving SSD
CN110378348A Video instance segmentation method, device and computer-readable storage medium
CN106203363A Activity recognition method for human skeleton motion sequences
CN106599800A Face micro-expression recognition method based on deep learning
CN108776983A Face reconstruction method and device, equipment, medium and product based on a reconstruction network
CN109766840A Facial expression recognition method, device, terminal and storage medium
CN109902548A Object attribute recognition method, device, computing equipment and system
CN111160350A Portrait segmentation method, model training method, device, medium and electronic equipment
CN110232373A Face clustering method, apparatus, equipment and storage medium
CN109145717A Face recognition method based on online learning
CN109086768A Semantic image segmentation method using convolutional neural networks
CN107491729B Handwritten digit recognition method based on a cosine-similarity-activated convolutional neural network
CN109344920A Customer attribute prediction method, storage medium, system and equipment
CN110826462A Human behavior recognition method using a non-local two-stream convolutional neural network model
CN108509833A Face recognition method, device and equipment based on a structured analysis dictionary
CN110097090A Fine-grained image recognition method based on multi-scale feature fusion
CN109086653A Handwriting model training method, handwritten character recognition method, device, equipment and medium
CN110378208A Activity recognition method based on a deep residual network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 518000 Room 201, building A, No. 1, Qian Wan Road, Qianhai Shenzhen Hong Kong cooperation zone, Shenzhen, Guangdong (Shenzhen Qianhai business secretary Co., Ltd.)

Applicant after: Shenzhen Huafu Technology Co.,Ltd.

Address before: 518000 Room 201, building A, No. 1, Qian Wan Road, Qianhai Shenzhen Hong Kong cooperation zone, Shenzhen, Guangdong (Shenzhen Qianhai business secretary Co., Ltd.)

Applicant before: SHENZHEN HUAFU INFORMATION TECHNOLOGY Co.,Ltd.

GR01 Patent grant