CN106650725A - Full convolutional neural network-based candidate text box generation and text detection method - Google Patents
- Publication number
- CN106650725A (application number CN201611070587.9A)
- Authority
- CN
- China
- Prior art keywords
- text
- candidate
- detection
- network
- inception
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/60—Type of objects
- G06V20/62—Text, e.g. of license plates, overlay texts or captions on TV images
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- General Health & Medical Sciences (AREA)
- Molecular Biology (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Artificial Intelligence (AREA)
- Biomedical Technology (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- Life Sciences & Earth Sciences (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Health & Medical Sciences (AREA)
- Multimedia (AREA)
- Image Analysis (AREA)
Abstract
The invention discloses a candidate text box generation and text detection method based on a fully convolutional neural network. The method comprises the following steps: generating text region candidate boxes, in which an inception-RPN takes a natural scene picture and a set of ground-truth bounding boxes marking text regions as input and generates a controllable number of word region candidate boxes by sliding an inception network over the convolutional feature response map of a VGG16 model, assisted at each sliding position by a set of text-characteristic prior boxes; incorporating ambiguous text category supervision information, fusing multi-level region down-sampling information, and performing text detection; training the inception candidate box generation network and the text detection network end to end through back-propagation and stochastic gradient descent; and applying iterative candidate box voting to obtain a higher text recall rate in a complementary way, then removing redundant detection boxes with a candidate box filtering algorithm. The method achieves accuracy rates of 0.83 and 0.85 on the ICDAR 2011 and ICDAR 2013 robust text detection benchmark databases, respectively, which is superior to the previous best results.
Description
Technical field
The present invention relates to techniques for generating text candidate boxes and detecting text in natural scene pictures, and more particularly to a candidate text box generation and text detection method based on a fully convolutional neural network.
Background
Text in images provides rich and accurate high-level semantic information, which is important for a large number of potential applications such as scene understanding, image and video retrieval, and content-based recommendation systems. Text detection in natural scene pictures has therefore attracted considerable attention in the computer vision and image understanding communities. However, it remains a challenging and unsolved problem. First, the background of a text image is very complex, and regions such as symbols, signs, bricks and grass are very difficult to distinguish from text. In addition, confounding factors such as uneven illumination, strong exposure, low contrast, blur and low resolution add further challenges to the text detection task.
Summary of the invention
To overcome the deficiencies of the prior art, the present invention proposes a candidate text box generation and text detection method based on a fully convolutional neural network.
The technical solution of the present invention is realized as follows:
A candidate text box generation and text detection method based on a fully convolutional neural network, comprising the steps of:
S1: generating text region candidate boxes: an inception-RPN takes a natural scene picture and a set of ground-truth bounding boxes marking text regions as input and produces a controllable number of word region candidate boxes by sliding an inception network over the convolutional feature response map of a VGG16 model, assisted at each sliding position by a set of text-characteristic prior boxes;
S2: incorporating ambiguous text category supervision information, fusing multi-level region down-sampling information, and performing text detection;
S3: training the inception candidate box generation network and the text detection network end to end through back-propagation and stochastic gradient descent;
S4: applying iterative candidate box voting to obtain a higher text recall rate in a complementary way, and removing redundant detection boxes with a candidate box filtering algorithm.
Further, step S1 comprises the steps of:
S11: designing the text-characteristic prior boxes;
S12: building the inception candidate box generation network.
Further, there are 24 kinds of text-characteristic prior boxes in step S11: at each sliding position the sliding-window width is set to 32, 48, 64 and 80, and the aspect ratio is set to 0.2, 0.5, 0.8, 1.0, 1.2 and 1.5.
Further, in step S12 the inception candidate box generation network consists of a 3*3 convolutional layer, a 5*5 convolutional layer and a 3*3 max pooling layer, which are connected to the corresponding spatial receptive field of the Conv5_3 feature response map that serves as the input.
Further, the text category supervision information in step S2 is as follows: a candidate box whose IoU overlap is greater than or equal to 0.5 is labeled as containing text; a candidate box whose IoU overlap is greater than or equal to 0.2 and less than 0.5 is labeled as "ambiguous text"; all other candidate boxes are labeled as containing no text information.
Further, the multi-level region down-sampling information in step S2 is obtained as follows: multi-level region down-sampling is performed on the convolutional feature response maps of Conv4_3 and Conv5_3 of the VGG16 network, yielding two sampled features of size 512*H*W, and the concatenated feature is then decoded by a 512*1*1 convolutional layer.
The beneficial effects of the present invention, compared with the prior art, are as follows. The present invention proposes an inception candidate box generation network, which applies sliding windows of different sizes to the convolutional feature map and, assisted by a set of text-characteristic prior boxes at each sliding position, generates word region candidate boxes. Sliding windows of different sizes retain local information at the corresponding position while also taking contextual information into account, which helps filter out candidate boxes that contain no text; the inception candidate box generation network of the present invention achieves a very high recall rate using only a few hundred word candidate boxes. The present invention also introduces additional ambiguous text category supervision information into the text detection network and fuses multi-level region down-sampling information; this information helps the text detection network learn more discriminative features for distinguishing text from complex backgrounds. In addition, to make better use of the models produced during training, the present invention proposes an iterative candidate box voting scheme that obtains a higher word recall rate in a complementary way, and the filtering algorithm used by the present invention retains the best candidate boxes and removes redundant ones.
Brief description of the drawings
Fig. 1 is a flow chart of the candidate text box generation and text detection method based on a fully convolutional neural network according to the present invention.
Fig. 2 shows examples of word region candidate boxes of one embodiment of the present invention whose IoU overlap falls within specific intervals.
Detailed description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by a person of ordinary skill in the art on the basis of the embodiments of the present invention without creative work fall within the scope of protection of the present invention.
Referring to Fig. 1, the candidate text box generation and text detection method based on a fully convolutional neural network according to the present invention comprises four steps: S1, text region candidate box generation; S2, text detection; S3, end-to-end learning and optimization; S4, heuristic processing.
The role of part S1 is as follows. The inception-RPN takes a natural scene picture and a set of ground-truth bounding boxes marking text regions as input and produces a controllable number of word region candidate boxes. To search for word region candidate boxes, an inception network is slid over the convolutional feature response map of the VGG16 model, assisted at each sliding position by a set of text-characteristic prior boxes. This can be divided into two steps: (1) designing the text-characteristic prior boxes; (2) building the inception candidate box generation network.
Each sliding position is assigned four different scales (32, 48, 64 and 80) and six different aspect ratios (0.2, 0.5, 0.8, 1.0, 1.2 and 1.5), giving k = 24 prior sliding windows in total. During training, a prior box whose intersection over union (IoU) with a ground-truth text box is greater than 0.5 is assigned a text label, and a prior box whose IoU is less than 0.3 is assigned a background label.
The inception candidate box generation network is designed so that a 3*3 convolutional layer, a 5*5 convolutional layer and a 3*3 max pooling layer are connected to the corresponding spatial receptive field of the Conv5_3 feature response map, which serves as the input. In addition, to reduce dimensionality, a 1*1 convolution is applied on top of the 3*3 max pooling layer. The features of the individual branches are then concatenated along the channel axis, and the resulting 640-dimensional concatenated feature vector is fed to two output layers: a classification layer that predicts a score for whether a region contains text, and a regression layer that refines the text region location for each of the prior windows at each sliding position.
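The prior-box design and the training labels described above can be sketched in a few lines of Python. This is a minimal illustration, not the patented implementation: the function names are hypothetical, the aspect ratio is assumed to mean height divided by width, and prior boxes whose IoU falls between 0.3 and 0.5 are assumed to be ignored, since the text does not say how they are treated.

```python
from itertools import product

SCALES = (32, 48, 64, 80)                        # window widths given in the text
ASPECT_RATIOS = (0.2, 0.5, 0.8, 1.0, 1.2, 1.5)   # aspect ratios given in the text


def prior_boxes(cx, cy):
    """Return the k = 24 prior boxes (x1, y1, x2, y2) centred on (cx, cy).

    Assumption: aspect ratio = height / width, so height = width * ratio.
    """
    boxes = []
    for width, ratio in product(SCALES, ASPECT_RATIOS):
        height = width * ratio
        boxes.append((cx - width / 2, cy - height / 2,
                      cx + width / 2, cy + height / 2))
    return boxes


def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def rpn_label(box, gt_boxes):
    """1 = text (IoU > 0.5), 0 = background (IoU < 0.3), None = ignored (assumed)."""
    best = max((iou(box, gt) for gt in gt_boxes), default=0.0)
    if best > 0.5:
        return 1
    if best < 0.3:
        return 0
    return None
```

Calling `prior_boxes` once per sliding position yields the 24 candidates that the classification and regression layers then score and refine.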
Step S2 comprises: (1) Incorporating ambiguous text category supervision information. This adds more reasonable supervision information, which helps the classifier learn more discriminative features, identify text regions against complex and varied backgrounds, and filter out candidate boxes that contain no text. (2) Fusing multi-level region down-sampling information. This makes better use of multi-level convolutional features and enriches the discriminative information of each sliding window.
Much previous work on detection networks labels a candidate box as containing text when its IoU overlap is greater than 0.5 and as containing no text otherwise. This way of deciding whether a candidate box contains text is unreasonable, because a candidate box whose IoU overlap lies between 0.2 and 0.5 may still contain partial or extended text information, as shown in Fig. 2. Such mixed label information confuses the learning of the text versus non-text classification. For this reason, the present invention labels a candidate box whose IoU overlap is greater than or equal to 0.5 as containing text, a candidate box whose IoU overlap is greater than or equal to 0.2 and less than 0.5 as "ambiguous text", and all other candidate boxes as containing no text information. This strategy provides more reasonable supervision information and helps the classifier learn more discriminative features, so that text can be identified against complex and varied backgrounds and candidate boxes without text can be filtered out.
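The three-way labeling rule can be written directly from the thresholds given above. In this small sketch the class indices and the function name are illustrative assumptions; only the 0.5 and 0.2 thresholds come from the text.

```python
# Class indices are an arbitrary choice for illustration.
BACKGROUND, AMBIGUOUS_TEXT, TEXT = 0, 1, 2


def detection_label(best_iou):
    """Map a candidate box's best IoU with any ground-truth text box to a class.

    IoU >= 0.5        -> text
    0.2 <= IoU < 0.5  -> "ambiguous text"
    otherwise         -> no text (background)
    """
    if best_iou >= 0.5:
        return TEXT
    if best_iou >= 0.2:
        return AMBIGUOUS_TEXT
    return BACKGROUND


# Example: label a batch of candidate boxes from their best IoU values.
labels = [detection_label(v) for v in (0.75, 0.5, 0.35, 0.2, 0.1)]
```

Compared with a two-class rule, the boxes in the middle interval are neither forced into the text class nor lumped in with the background.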
To make better use of multi-level convolutional features and enrich the discriminative information of each candidate box, the present invention performs multi-level region down-sampling on the convolutional feature response maps of Conv4_3 and Conv5_3 of the VGG16 network, obtaining two sampled features of size 512*H*W. The concatenated feature is then decoded by a 512*1*1 convolutional layer. This 1*1 convolutional layer serves two purposes: (1) it combines the multi-level sampled features and learns their fusion weights during training; (2) it reduces the dimensionality to match the first fully connected layer of VGG16.
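The fusion step amounts to a channel-wise concatenation followed by a 1*1 convolution, which the following pure-Python sketch illustrates on toy sizes. The helper names are hypothetical and the nested-list layout (channels, then rows, then columns) is an assumption; in the setting described above the two inputs would each have 512 channels, so the concatenation would have 1024 channels and the 1*1 convolution would presumably map it back to 512.

```python
def concat_channels(a, b):
    """Concatenate two C*H*W feature maps (nested lists) along the channel axis."""
    assert len(a[0]) == len(b[0]) and len(a[0][0]) == len(b[0][0])
    return a + b


def conv1x1(features, weights, bias):
    """1*1 convolution: a per-position linear mix of the input channels.

    features: C_in*H*W nested lists, weights: C_out*C_in, bias: C_out.
    """
    height, width = len(features[0]), len(features[0][0])
    return [[[bias[o] + sum(w * features[c][y][x]
                            for c, w in enumerate(weights[o]))
              for x in range(width)]
             for y in range(height)]
            for o in range(len(weights))]


# Toy example: two 2-channel 2*2 maps stand in for the two 512*H*W features.
pooled_conv4 = [[[1, 2], [3, 4]], [[0, 0], [0, 0]]]
pooled_conv5 = [[[10, 20], [30, 40]], [[1, 1], [1, 1]]]
stacked = concat_channels(pooled_conv4, pooled_conv5)   # 4 channels
fusion_weights = [[1, 0, 1, 0], [0, 1, 0, 1]]            # learned in practice
fused = conv1x1(stacked, fusion_weights, bias=[0, 0])    # back to 2 channels
```

Because the 1*1 kernel has no spatial extent, the learned weights act purely as fusion coefficients between the two feature levels, which is the first role the text assigns to this layer.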
In part S3, unlike the previously proposed four-step training strategy for combining RPN and Fast R-CNN, the present invention trains the inception candidate box generation network and the text detection network end to end by back-propagation and stochastic gradient descent. The shared convolutional layers are initialized from an ImageNet classification network trained in advance. The weights of the new layers are initialized from a Gaussian distribution with mean 0 and standard deviation 0.01. The base learning rate is 0.001 and is reduced to one tenth of its previous value every 40,000 iterations. The momentum and weight decay are set to 0.9 and 0.0005, respectively.
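The hyperparameters above translate into a step learning-rate schedule and a momentum update. The sketch below uses only the values stated in the text; the update rule itself (weight decay added to the gradient, velocity accumulated with momentum) is a common convention assumed here, since the source does not spell it out, and the function names are hypothetical.

```python
import random

BASE_LR = 0.001        # base learning rate from the text
STEP_SIZE = 40000      # iterations between learning-rate reductions
MOMENTUM = 0.9
WEIGHT_DECAY = 0.0005


def learning_rate(iteration):
    """Step schedule: divide the rate by 10 every STEP_SIZE iterations."""
    return BASE_LR * 0.1 ** (iteration // STEP_SIZE)


def init_new_layer(n_weights, seed=0):
    """Draw new-layer weights from a Gaussian with mean 0 and std 0.01."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 0.01) for _ in range(n_weights)]


def sgd_step(weight, grad, velocity, iteration):
    """One momentum-SGD update for a single scalar weight (assumed rule)."""
    lr = learning_rate(iteration)
    velocity = MOMENTUM * velocity - lr * (grad + WEIGHT_DECAY * weight)
    return weight + velocity, velocity
```

For example, `learning_rate(0)` gives 0.001 and `learning_rate(40000)` gives roughly 0.0001, matching the tenfold reduction described above.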
The inception candidate box generation network and the text detection network each have two sibling output layers: a classification layer and a regression layer. Their output layers differ as follows. (1) In the inception candidate box generation network, each prior box is parameterized independently, so k = 24 prior candidate boxes must be predicted simultaneously. The classification layer outputs 2k scores indicating whether each candidate box contains text, and the regression layer outputs 4k values giving the offsets of the refined candidate boxes relative to the original ones. (2) The text detection network outputs three scores for each candidate box, corresponding to background, ambiguous text and text, respectively. Its regression layer outputs 4 regression offset values for each text candidate box. During training, the following multi-task loss function is minimized:
$$L(p, p^*, t, t^*) = L_{cls}(p, p^*) + \lambda L_{reg}(t, t^*)$$
The classification loss $L_{cls}$ is the softmax loss, where $p$ and $p^*$ are the predicted label and the ground-truth label, respectively. The regression loss $L_{reg}$ is the smooth-L1 loss. In addition, $t = \{t_x, t_y, t_w, t_h\}$ and $t^* = \{t^*_x, t^*_y, t^*_w, t^*_h\}$ denote the predicted and ground-truth regression offset vectors, respectively, where $t^*$ is obtained from the following formula:
$$t^*_x = \frac{G_x - P_x}{P_w}, \quad t^*_y = \frac{G_y - P_y}{P_h}, \quad t^*_w = \log\frac{G_w}{P_w}, \quad t^*_h = \log\frac{G_h}{P_h}$$
Here, $P = \{P_x, P_y, P_w, P_h\}$ and $G = \{G_x, G_y, G_w, G_h\}$ denote the center coordinates, width and height of the candidate box P and of the ground-truth text box G, respectively. $\lambda$ is the loss balancing parameter: $\lambda = 3$ is used in the inception candidate box generation network so that training favors better candidate box localization, and $\lambda = 1$ is used in the text detection network.
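The multi-task loss can be checked numerically with a short sketch. It assumes the standard box parameterization for the regression targets (center offsets normalized by the candidate box size, log ratios for width and height) and applies the regression term to every box exactly as the formula is written; the function names are hypothetical.

```python
import math


def regression_targets(P, G):
    """Targets t* for candidate box P and ground-truth box G, each (x, y, w, h)."""
    px, py, pw, ph = P
    gx, gy, gw, gh = G
    return ((gx - px) / pw, (gy - py) / ph,
            math.log(gw / pw), math.log(gh / ph))


def smooth_l1(x):
    """Smooth-L1: quadratic near zero, linear for |x| >= 1."""
    ax = abs(x)
    return 0.5 * x * x if ax < 1.0 else ax - 0.5


def softmax_loss(scores, true_class):
    """Negative log-probability of the true class under a softmax."""
    m = max(scores)  # subtract the max for numerical stability
    log_sum = m + math.log(sum(math.exp(s - m) for s in scores))
    return log_sum - scores[true_class]


def multitask_loss(scores, true_class, t, t_star, lam):
    """L = L_cls + lambda * L_reg, with L_reg summed over the four offsets."""
    l_cls = softmax_loss(scores, true_class)
    l_reg = sum(smooth_l1(a - b) for a, b in zip(t, t_star))
    return l_cls + lam * l_reg
```

With `lam=3.0` a localization error weighs three times as much as it does with `lam=1.0`, which is how the candidate box generation network is pushed toward better box positions.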
Part S4 comprises an iterative candidate box voting mechanism and a filtering algorithm. The iterative candidate box voting mechanism allows the present invention to obtain a higher text recall rate in a complementary way and improves the performance of the text detection system. The filtering algorithm allows the present invention to remove redundant detection boxes and thereby improve accuracy.
The present invention first feeds a natural scene picture and a set of ground-truth text boxes into the inception candidate box generation network, which produces a number of word region candidate boxes. The resulting word region candidate boxes are then sent to a text detection network for text/non-text classification and text localization; during training, this network adds the ambiguous text category supervision information and fuses the multi-level region down-sampling information. The whole system is trained end to end by back-propagation and gradient descent. To make full use of the intermediate models produced during training, the present invention uses the iterative candidate box voting mechanism to obtain a high recall of text instances in a complementary way, improving the performance of the whole text detection system. Finally, the present invention applies a filtering algorithm that uses coordinate positions to find the inner and outer candidate boxes of each text instance, retains the high-scoring candidate boxes and removes the low-scoring ones.
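The text does not spell out the filtering procedure, so the following is only one possible reading of it: whenever one detection box lies entirely inside another, the pair is treated as an inner/outer pair for the same text instance and the lower-scoring box is dropped. The function names and the tie-breaking rule are assumptions made for illustration.

```python
def contains(outer, inner):
    """True if box `inner` lies entirely within box `outer` (x1, y1, x2, y2)."""
    return (outer[0] <= inner[0] and outer[1] <= inner[1]
            and outer[2] >= inner[2] and outer[3] >= inner[3])


def filter_nested(detections):
    """Keep the higher-scoring box of every inner/outer pair.

    detections: list of (box, score) tuples. Ties keep the earlier box.
    """
    kept = []
    for i, (box_i, score_i) in enumerate(detections):
        suppressed = False
        for j, (box_j, score_j) in enumerate(detections):
            if i == j:
                continue
            nested = contains(box_i, box_j) or contains(box_j, box_i)
            if nested and (score_j > score_i
                           or (score_j == score_i and j < i)):
                suppressed = True
                break
        if not suppressed:
            kept.append((box_i, score_i))
    return kept


detections = [((0, 0, 100, 40), 0.9),       # outer box, high score
              ((10, 5, 60, 35), 0.6),       # inner box, lower score
              ((200, 200, 260, 230), 0.7)]  # unrelated box
result = filter_nested(detections)
```

In the example the inner box is removed while the outer box and the unrelated box survive; boxes that merely overlap without containment are left untouched by this sketch.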
The above is the preferred embodiment of the present invention. It should be noted that a person of ordinary skill in the art may make improvements and modifications without departing from the principles of the present invention, and such improvements and modifications are also regarded as falling within the scope of protection of the present invention.
Claims (6)
1. A candidate text box generation and text detection method based on a fully convolutional neural network, characterized by comprising the steps of:
S1: generating text region candidate boxes: an inception-RPN takes a natural scene picture and a set of ground-truth bounding boxes marking text regions as input and produces a controllable number of word region candidate boxes by sliding an inception network over the convolutional feature response map of a VGG16 model, assisted at each sliding position by a set of text-characteristic prior boxes;
S2: incorporating ambiguous text category supervision information, fusing multi-level region down-sampling information, and performing text detection;
S3: training the inception candidate box generation network and the text detection network end to end through back-propagation and stochastic gradient descent;
S4: applying iterative candidate box voting to obtain a higher text recall rate in a complementary way, and removing redundant detection boxes with a candidate box filtering algorithm.
2. The candidate text box generation and text detection method based on a fully convolutional neural network according to claim 1, characterized in that step S1 comprises the steps of:
S11: designing the text-characteristic prior boxes;
S12: building the inception candidate box generation network.
3. The candidate text box generation and text detection method based on a fully convolutional neural network according to claim 2, characterized in that there are 24 kinds of text-characteristic prior boxes in step S11, wherein at each sliding position the sliding-window width is set to 32, 48, 64 and 80, and the aspect ratio is set to 0.2, 0.5, 0.8, 1.0, 1.2 and 1.5.
4. The candidate text box generation and text detection method based on a fully convolutional neural network according to claim 2, characterized in that in step S12 the inception candidate box generation network consists of a 3*3 convolutional layer, a 5*5 convolutional layer and a 3*3 max pooling layer, which are connected to the corresponding spatial receptive field of the Conv5_3 feature response map that serves as the input.
5. The candidate text box generation and text detection method based on a fully convolutional neural network according to claim 1, characterized in that the text category supervision information in step S2 is as follows: a candidate box whose IoU overlap is greater than or equal to 0.5 is labeled as containing text; a candidate box whose IoU overlap is greater than or equal to 0.2 and less than 0.5 is labeled as "ambiguous text"; all other candidate boxes are labeled as containing no text information.
6. The candidate text box generation and text detection method based on a fully convolutional neural network according to claim 1, characterized in that the multi-level region down-sampling information in step S2 is obtained as follows: multi-level region down-sampling is performed on the convolutional feature response maps of Conv4_3 and Conv5_3 of the VGG16 network, yielding two sampled features of size 512*H*W, and the concatenated feature is then decoded by a 512*1*1 convolutional layer.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201611070587.9A CN106650725B (en) | 2016-11-29 | 2016-11-29 | Candidate text box generation and text detection method based on full convolution neural network |
Publications (2)
Publication Number | Publication Date |
---|---|
CN106650725A (en) | 2017-05-10 |
CN106650725B (en) | 2020-06-26 |
Family
ID=58813359
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201611070587.9A Active CN106650725B (en) | 2016-11-29 | 2016-11-29 | Candidate text box generation and text detection method based on full convolution neural network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106650725B (en) |
Cited By (26)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107316058A (en) * | 2017-06-15 | 2017-11-03 | 国家新闻出版广电总局广播科学研究院 | Improve the method for target detection performance by improving target classification and positional accuracy |
CN107397658A (en) * | 2017-07-26 | 2017-11-28 | 成都快眼科技有限公司 | A kind of multiple dimensioned full convolutional network and vision blind-guiding method and device |
CN107480649A (en) * | 2017-08-24 | 2017-12-15 | 浙江工业大学 | Fingerprint sweat pore extraction method based on full convolution neural network |
CN108090443A (en) * | 2017-12-15 | 2018-05-29 | 华南理工大学 | Scene text detection method and system based on deeply study |
CN108154145A (en) * | 2018-01-24 | 2018-06-12 | 北京地平线机器人技术研发有限公司 | The method and apparatus for detecting the position of the text in natural scene image |
CN108288088A (en) * | 2018-01-17 | 2018-07-17 | 浙江大学 | A kind of scene text detection method based on end-to-end full convolutional neural networks |
CN108647681A (en) * | 2018-05-08 | 2018-10-12 | 重庆邮电大学 | A kind of English text detection method with text orientation correction |
CN108764228A (en) * | 2018-05-28 | 2018-11-06 | 嘉兴善索智能科技有限公司 | Word object detection method in a kind of image |
CN109165697A (en) * | 2018-10-12 | 2019-01-08 | 福州大学 | A kind of natural scene character detecting method based on attention mechanism convolutional neural networks |
CN109190458A (en) * | 2018-07-20 | 2019-01-11 | 华南理工大学 | A kind of person of low position's head inspecting method based on deep learning |
CN109299274A (en) * | 2018-11-07 | 2019-02-01 | 南京大学 | A kind of natural scene Method for text detection based on full convolutional neural networks |
CN109376658A (en) * | 2018-10-26 | 2019-02-22 | 信雅达***工程股份有限公司 | A kind of OCR method based on deep learning |
CN109389114A (en) * | 2017-08-08 | 2019-02-26 | 富士通株式会社 | Line of text acquisition device and method |
CN109492630A (en) * | 2018-10-26 | 2019-03-19 | 信雅达***工程股份有限公司 | A method of the word area detection positioning in the financial industry image based on deep learning |
CN109598290A (en) * | 2018-11-22 | 2019-04-09 | 上海交通大学 | A kind of image small target detecting method combined based on hierarchical detection |
CN109800756A (en) * | 2018-12-14 | 2019-05-24 | 华南理工大学 | A kind of text detection recognition methods for the intensive text of Chinese historical document |
CN109918987A (en) * | 2018-12-29 | 2019-06-21 | 中国电子科技集团公司信息科学研究院 | A kind of video caption keyword recognition method and device |
CN110135248A (en) * | 2019-04-03 | 2019-08-16 | 华南理工大学 | A kind of natural scene Method for text detection based on deep learning |
CN110135424A (en) * | 2019-05-23 | 2019-08-16 | 阳光保险集团股份有限公司 | Tilt text detection model training method and ticket image Method for text detection |
CN110135408A (en) * | 2019-03-26 | 2019-08-16 | 北京捷通华声科技股份有限公司 | Text image detection method, network and equipment |
CN110619325A (en) * | 2018-06-20 | 2019-12-27 | 北京搜狗科技发展有限公司 | Text recognition method and device |
CN112418207A (en) * | 2020-11-23 | 2021-02-26 | 南京审计大学 | Weak supervision character detection method based on self-attention distillation |
CN112765353A (en) * | 2021-01-22 | 2021-05-07 | 重庆邮电大学 | Scientific research text-based biomedical subject classification method and device |
CN113454638A (en) * | 2018-12-19 | 2021-09-28 | 艾奎菲股份有限公司 | System and method for joint learning of complex visual inspection tasks using computer vision |
CN117275005A (en) * | 2023-09-21 | 2023-12-22 | 北京百度网讯科技有限公司 | Text detection, text detection model optimization and data annotation method and device |
CN117496130A (en) * | 2023-11-22 | 2024-02-02 | 中国科学院空天信息创新研究院 | Basic model weak supervision target detection method based on context awareness self-training |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2015132665A2 (en) * | 2014-03-07 | 2015-09-11 | Wolf, Lior | System and method for the detection and counting of repetitions of repetitive activity via a trained network |
CN104915386A (en) * | 2015-05-25 | 2015-09-16 | 中国科学院自动化研究所 | Short text clustering method based on deep semantic feature learning |
CN105740892A (en) * | 2016-01-27 | 2016-07-06 | 北京工业大学 | High-accuracy human body multi-position identification method based on convolutional neural network |
CN105912611A (en) * | 2016-04-05 | 2016-08-31 | 中国科学技术大学 | CNN based quick image search method |
-
2016
- 2016-11-29 CN CN201611070587.9A patent/CN106650725B/en active Active
Non-Patent Citations (2)
Title |
---|
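KEZE WANG et al.: "Dictionary Pair Classifier Driven Convolutional Neural Networks for Object Detection", 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) * |
Lianwen Jin (金连文) et al.: "A survey of the application of deep learning to handwritten Chinese character recognition", Acta Automatica Sinica (自动化学报) * |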
Cited By (40)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107316058A (en) * | 2017-06-15 | 2017-11-03 | 国家新闻出版广电总局广播科学研究院 | Improve the method for target detection performance by improving target classification and positional accuracy |
CN107397658B (en) * | 2017-07-26 | 2020-06-19 | 成都快眼科技有限公司 | Multi-scale full-convolution network and visual blind guiding method and device |
CN107397658A (en) * | 2017-07-26 | 2017-11-28 | 成都快眼科技有限公司 | A kind of multiple dimensioned full convolutional network and vision blind-guiding method and device |
CN109389114B (en) * | 2017-08-08 | 2021-12-03 | 富士通株式会社 | Text line acquisition device and method |
CN109389114A (en) * | 2017-08-08 | 2019-02-26 | 富士通株式会社 | Line of text acquisition device and method |
CN107480649A (en) * | 2017-08-24 | 2017-12-15 | 浙江工业大学 | Fingerprint sweat pore extraction method based on full convolution neural network |
CN108090443A (en) * | 2017-12-15 | 2018-05-29 | 华南理工大学 | Scene text detection method and system based on deeply study |
CN108090443B (en) * | 2017-12-15 | 2020-09-22 | 华南理工大学 | Scene text detection method and system based on deep reinforcement learning |
CN108288088B (en) * | 2018-01-17 | 2020-02-28 | 浙江大学 | Scene text detection method based on end-to-end full convolution neural network |
CN108288088A (en) * | 2018-01-17 | 2018-07-17 | 浙江大学 | A kind of scene text detection method based on end-to-end full convolutional neural networks |
CN108154145B (en) * | 2018-01-24 | 2020-05-19 | 北京地平线机器人技术研发有限公司 | Method and device for detecting position of text in natural scene image |
CN108154145A (en) * | 2018-01-24 | 2018-06-12 | 北京地平线机器人技术研发有限公司 | The method and apparatus for detecting the position of the text in natural scene image |
CN108647681A (en) * | 2018-05-08 | 2018-10-12 | 重庆邮电大学 | A kind of English text detection method with text orientation correction |
CN108647681B (en) * | 2018-05-08 | 2019-06-14 | 重庆邮电大学 | A kind of English text detection method with text orientation correction |
CN108764228A (en) * | 2018-05-28 | 2018-11-06 | 嘉兴善索智能科技有限公司 | Word object detection method in a kind of image |
CN110619325A (en) * | 2018-06-20 | 2019-12-27 | 北京搜狗科技发展有限公司 | Text recognition method and device |
CN110619325B (en) * | 2018-06-20 | 2024-03-08 | 北京搜狗科技发展有限公司 | Text recognition method and device |
CN109190458A (en) * | 2018-07-20 | 2019-01-11 | 华南理工大学 | A kind of person of low position's head inspecting method based on deep learning |
CN109165697B (en) * | 2018-10-12 | 2021-11-30 | 福州大学 | Natural scene character detection method based on attention mechanism convolutional neural network |
CN109165697A (en) * | 2018-10-12 | 2019-01-08 | 福州大学 | A kind of natural scene character detecting method based on attention mechanism convolutional neural networks |
CN109492630A (en) * | 2018-10-26 | 2019-03-19 | 信雅达***工程股份有限公司 | A method of the word area detection positioning in the financial industry image based on deep learning |
CN109376658A (en) * | 2018-10-26 | 2019-02-22 | 信雅达***工程股份有限公司 | A kind of OCR method based on deep learning |
CN109299274B (en) * | 2018-11-07 | 2021-12-17 | 南京大学 | Natural scene text detection method based on full convolution neural network |
CN109299274A (en) * | 2018-11-07 | 2019-02-01 | 南京大学 | A kind of natural scene Method for text detection based on full convolutional neural networks |
CN109598290A (en) * | 2018-11-22 | 2019-04-09 | 上海交通大学 | A kind of image small target detecting method combined based on hierarchical detection |
CN109800756A (en) * | 2018-12-14 | 2019-05-24 | 华南理工大学 | A kind of text detection recognition methods for the intensive text of Chinese historical document |
CN109800756B (en) * | 2018-12-14 | 2021-02-12 | 华南理工大学 | Character detection and identification method for dense text of Chinese historical literature |
CN113454638A (en) * | 2018-12-19 | 2021-09-28 | 艾奎菲股份有限公司 | System and method for joint learning of complex visual inspection tasks using computer vision |
CN109918987B (en) * | 2018-12-29 | 2021-05-14 | 中国电子科技集团公司信息科学研究院 | Video subtitle keyword identification method and device |
CN109918987A (en) * | 2018-12-29 | 2019-06-21 | 中国电子科技集团公司信息科学研究院 | Video subtitle keyword identification method and device |
CN110135408B (en) * | 2019-03-26 | 2021-02-19 | 北京捷通华声科技股份有限公司 | Text image detection method, network and equipment |
CN110135408A (en) * | 2019-03-26 | 2019-08-16 | 北京捷通华声科技股份有限公司 | Text image detection method, network and equipment |
CN110135248A (en) * | 2019-04-03 | 2019-08-16 | 华南理工大学 | Natural scene text detection method based on deep learning |
CN110135424B (en) * | 2019-05-23 | 2021-06-11 | 阳光保险集团股份有限公司 | Inclined text detection model training method and ticket image text detection method |
CN110135424A (en) * | 2019-05-23 | 2019-08-16 | 阳光保险集团股份有限公司 | Inclined text detection model training method and ticket image text detection method |
CN112418207A (en) * | 2020-11-23 | 2021-02-26 | 南京审计大学 | Weak supervision character detection method based on self-attention distillation |
CN112418207B (en) * | 2020-11-23 | 2024-03-19 | 南京审计大学 | Weak supervision character detection method based on self-attention distillation |
CN112765353A (en) * | 2021-01-22 | 2021-05-07 | 重庆邮电大学 | Scientific research text-based biomedical subject classification method and device |
CN117275005A (en) * | 2023-09-21 | 2023-12-22 | 北京百度网讯科技有限公司 | Text detection, text detection model optimization and data annotation method and device |
CN117496130A (en) * | 2023-11-22 | 2024-02-02 | 中国科学院空天信息创新研究院 | Basic model weak supervision target detection method based on context awareness self-training |
Also Published As
Publication number | Publication date |
---|---|
CN106650725B (en) | 2020-06-26 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN106650725A (en) | Full convolutional neural network-based candidate text box generation and text detection method | |
CN113254648B (en) | Text emotion analysis method based on multilevel graph pooling | |
CN107066445B (en) | Deep learning method for attribute sentiment word vectors | |
CN104217214B (en) | RGB-D human activity recognition method based on configurable convolutional neural networks | |
CN103631859B (en) | Intelligent review expert recommending method for science and technology projects | |
CN110298037A (en) | Text recognition method based on convolutional neural network matching with an enhanced attention mechanism | |
CN109461157A (en) | Image semantic segmentation method based on multi-level feature fusion and Gaussian conditional random field | |
CN106354710A (en) | Neural network relation extracting method | |
CN110083700A (en) | Enterprise public opinion sentiment classification method and system based on convolutional neural networks | |
CN109961132A (en) | System and method for learning the structure of deep convolutional neural networks | |
CN107092596A (en) | Text emotion analysis method based on attention CNNs and CCR | |
CN110516539A (en) | Remote sensing image building extraction method, system, storage medium and equipment based on adversarial network | |
CN106844442A (en) | Multi-modal recurrent neural network image description method based on FCN feature extraction | |
CN106845499A (en) | Image object detection method based on natural language semantics | |
CN109492666A (en) | Image recognition model training method, device and storage medium | |
CN108038205A (en) | Opinion analysis prototype system for Chinese microblogs | |
CN108197294A (en) | Automatic text generation method based on deep learning | |
CN113378047B (en) | Multi-aspect enhancement-based graph neural network recommendation method | |
CN110222634A (en) | Human posture recognition method based on convolutional neural networks | |
CN107657056A (en) | Artificial-intelligence-based method and apparatus for displaying comment information | |
CN112925908A (en) | Attention-based text classification method and system for graph attention network | |
CN109063719A (en) | Image classification method combining structural similarity and category information | |
CN110110063A (en) | Question answering system construction method based on hash learning | |
CN113780002A (en) | Knowledge reasoning method and device based on graph representation learning and deep reinforcement learning | |
CN113255895A (en) | Graph neural network representation learning-based structure graph alignment method and multi-graph joint data mining method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||