CN107908642A - Industry text entities extracting method based on distributed platform - Google Patents
- Publication number: CN107908642A (application CN201710902720.0A)
- Authority
- CN
- China
- Prior art keywords
- text
- extraction
- model
- feature
- training
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses an industry text entity extraction method based on a distributed platform, comprising: training a text data set with a deep-learning neural network to obtain a relation feature model; generating multiple resilient distributed datasets (RDDs) from the extracted relation features; extracting class features from the data sets in the RDDs with a class feature model trained by an improved nonlinear SVM classification algorithm; finding the context entity model corresponding to each extracted class feature, and extracting the entity data in the texts of the corresponding class with the trained entity model; and judging whether the number of texts in the corresponding context exceeds a set threshold — if it does, retraining that context entity model and extracting the entity data in the texts of the corresponding class with the retrained model; otherwise, saving the text entity features and text data. The method can handle text entity features under different contexts and effectively improves both the efficiency of entity extraction and the accuracy of the extracted entities.
Description
Technical field
The present invention relates to a method for extracting text entities, and more particularly to an industry text entity extraction method based on a distributed platform.
Background technology
Traditional text extraction techniques include pattern-matching relation extraction, dictionary-driven relation extraction, machine-learning-based relation extraction, and so on. Most of these methods first segment the text into words and then extract the words with the highest frequency as candidate entities. They are suitable for scenes where the entities in the text are relatively simple, but under different contexts they cannot effectively tell entities apart, and they wrongly split entities that should not be split or wrongly merge entities that should stay separate.
Meanwhile word of the traditional detection method to the mistake for not having to occur in former text, it is difficult to be carried out by segmenting method
Extraction.
Many entity extraction methods based on deep learning have appeared recently. Their entity extraction algorithms fall into two kinds of model: those with good computational performance but lower extraction accuracy, and those with higher extraction accuracy but slow computation. For example, fast linear entity extraction models and convolutional neural networks belong to the fast models, while nonlinear entity extraction models and deep neural network models belong to the more accurate ones.
Chinese patent document CN2017100036859 discloses a deep-learning-based method for recognizing named entities in online traditional Chinese medicine texts. That method builds a rich text training sample set with a web crawler and extracts text features with a neural network, which improves the accuracy of entity extraction to a certain extent; however, as the training samples grow, the corresponding extraction entity model also grows, and both the training time and the feature extraction time gradually increase.
Summary of the invention
In view of the above technical problems, the object of the present invention is to provide an industry text entity extraction method based on a distributed platform that uses multiple resilient distributed entity extraction models on the Spark platform to handle text entity features under different contexts, which effectively improves the efficiency of entity extraction as well as the accuracy of the extracted entities. In addition, the weights in the support vector machine classification algorithm are improved, which enhances the generalization ability over the texts and further improves accuracy.
The technical scheme of the invention is as follows:
An industry text entity extraction method based on a distributed platform comprises the following steps:
S01: Train a text data set with a deep-learning neural network to obtain a relation feature model, and extract the relation features in the target text with the relation feature model;
S02: Generate multiple resilient distributed relation-feature datasets (RDDs) from the extracted relation features;
S03: Extract class features from the data sets in the RDDs with a class feature model trained by an improved nonlinear SVM classification algorithm;
S04: Find the context entity model corresponding to each extracted class feature, and extract the entity data in the texts of the corresponding class with the trained entity model;
S05: Judge whether the number of texts in the corresponding context exceeds a set threshold T; if it does, retrain that context entity model and extract the entity data in the texts of the corresponding class with the retrained entity model; otherwise, save the text entity features and text data.
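The control flow of steps S01 to S05 can be sketched as follows. This is a minimal illustration with toy stand-ins for the three trained models; all function names and the sample data here are hypothetical, not taken from the patent:

```python
def extract_entities(text, relation_model, class_model, entity_models,
                     counts, threshold):
    """One pass of the S01-S05 pipeline for a single text."""
    relations = relation_model(text)           # S01: relation features
    label = class_model(relations)             # S03: context class feature
    counts[label] = counts.get(label, 0) + 1
    retrain_due = counts[label] > threshold    # S05: retrain trigger
    entities = entity_models[label](text)      # S04: per-context extraction
    return entities, retrain_due

# Toy stand-ins for the trained models (illustrative only).
relation_model = lambda text: text.lower().split()
class_model = lambda rels: "finance" if "bank" in rels else "general"
entity_models = {
    "finance": lambda text: [w for w in text.split() if w.istitle()],
    "general": lambda text: [],
}

counts = {}
entities, due = extract_entities("Acme bank reports", relation_model,
                                 class_model, entity_models, counts,
                                 threshold=1)
print(entities, due)
```

In the patent's setting the three callables would be the trained relation feature model, class feature model, and per-context entity model respectively.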
Preferably, step S01 specifically comprises:
S11: Segment the text with the open-source ansj word segmenter; count each word's frequency in all texts and in the current text; remove common auxiliary words, stop words and overly frequent words; extract N words according to the relation between the word frequency in the current text and the word frequency in all texts; and place each class in the same folder;
S12: Randomly initialize each of the N words as an A-dimensional data feature, so that each text forms N*A-dimensional data;
S13: Take each word feature as an input neuron of the deep-learning neural network; perform convolution in the first hidden layer, subsampling and local averaging in the second hidden layer, a second convolution in the third hidden layer, and a second round of subsampling and local averaging in the fourth hidden layer; then convert the text into B-dimensional data through a fully connected layer, and obtain the relation feature model through repeated testing and accuracy tuning.
Preferably, step S03 specifically comprises:
S31: Adjust the weights and offsets in the nonlinear SVM classification algorithm so that the error between the input relation features and the features of the labelled samples falls within a set range, and save the class feature model of the text;
S32: The selected classification model is the improved nonlinear SVM classification algorithm. Its training objective function is $\min_{w,b,\varepsilon}\ \tfrac{1}{2}\lVert w\rVert^{2}+C\sum_{i=1}^{n}s_{i}\varepsilon_{i}^{2}$, subject to the prediction condition $y_{i}=w'\varphi(x_{i})+b+\varepsilon_{i}$, which yields the discriminant function $f(x)=\sum_{i=1}^{n}\alpha_{i}y_{i}\varphi(x_{i})'\varphi(x)+b$ with weights $w=\sum_{i=1}^{n}\alpha_{i}y_{i}\varphi(x_{i})$. Here C is the penalty factor, an empirical parameter; i is the RDD index; w is the weight vector; $s_{i}$ is the Euclidean distance between the positive and negative samples in the relation features; b is the classification threshold; $\varepsilon_{i}$ is the error; and $\varphi(x_{i})$ is the nonlinear kernel function;
S33: Gradually adjust the penalty factor and select the optimal one by testing, where the nonlinear kernel function is $\varphi(x_{i})=\min(x(i),x_{s}(i))$, with $x(i)$ and $x_{s}(i)$ being the feature vectors extracted from any two relation-feature text samples. The label of each class of relation feature samples is the corresponding class number; the $\alpha_{i}$ and b of the discriminant function are obtained through repeated offline training, and the discriminant function $f(x)=\sum_{i=1}^{n}\alpha_{i}y_{i}\varphi(x_{i})'\varphi(x)+b$ is the corresponding class feature model.
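The min(x(i), x_s(i)) kernel named in step S03 is the histogram-intersection kernel. A small sketch of an SVM classifier using it, built with scikit-learn's `SVC` (assuming scikit-learn is available; the feature vectors below are invented toy data, not the patent's):

```python
import numpy as np
from sklearn.svm import SVC

def intersection_kernel(X, Y):
    """Histogram-intersection kernel: K(x, y) = sum_i min(x_i, y_i)."""
    return np.array([[float(np.minimum(x, y).sum()) for y in Y] for x in X])

# Invented relation-feature vectors for two context classes.
X = np.array([[0.9, 0.1, 0.0], [0.8, 0.2, 0.1],
              [0.1, 0.9, 0.8], [0.0, 0.8, 0.9]])
y = np.array([1, 1, 2, 2])

# C is the penalty factor tuned in S33 (set between 1 and 100
# in the embodiment).
clf = SVC(kernel=intersection_kernel, C=10)
clf.fit(X, y)
pred = clf.predict(np.array([[0.85, 0.15, 0.05]]))
print(pred)
```

Passing a callable as `kernel` makes `SVC` compute the Gram matrix with it at fit and predict time, so no precomputed kernel matrix is needed.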
Preferably, in step S03, sample texts that are extracted poorly or contain obvious errors are put into a new class, and the test samples are adjusted step by step until the test sample classes are optimal.
Compared with the prior art, the advantages of the invention are:
The invention improves the classification algorithm model, mainly by adding a weighting coefficient for the penalty factor to the training objective function, which enhances the generalization ability of the trained classification model, and by employing the nonlinear kernel function min(x(i), x_s(i)), so that the corresponding class of a text can be found accurately. At the same time, the text entity extraction model is split on the distributed Spark platform into extraction models for multiple scenes, which solves the heavy training and computation load of traditional text entity extraction; entities can be extracted rapidly from each text, and text entities can be extracted more accurately.
Brief description of the drawings
The invention is further described below with reference to the accompanying drawings and embodiments:
Fig. 1 is a flow chart of the industry text entity extraction method based on a distributed platform according to the invention.
Embodiment
To make the objects, technical solutions and advantages of the present invention clearer, the invention is described in more detail below in conjunction with specific embodiments and with reference to the accompanying drawings. It should be understood that these descriptions are merely exemplary and are not intended to limit the scope of the invention. In addition, descriptions of well-known structures and techniques are omitted below to avoid unnecessarily obscuring the concepts of the invention.
Embodiment:
As shown in Fig. 1, the industry text entity extraction method based on a distributed platform comprises the following steps:
(1) During text collection, the text data of each industry is obtained through the akka communication module of the open-source Spark platform, and the text data collected by the monitoring devices, from which entities are to be extracted, is transmitted to the distributed Spark platform.
(2) Build a Spark platform cluster, with one server as the management node and four servers as service nodes. The management node records the dependencies between data flows and is responsible for task scheduling and for generating new RDDs; the service nodes mainly run the analysis algorithms and store the data.
(3) Train the existing text data set with the deep-learning neural network method to obtain the relation feature model, then extract the relation features in new texts with the relation feature model.
The generation of the relation feature model specifically comprises:
S21: First segment the text with the open-source ansj word segmenter, then compute by statistics each word's frequency in all texts and in the current text; remove common auxiliary words, stop words and overly frequent words; then, according to the relation between the word frequency in the current text and in all texts, extract the N primary words, and put each class into the same folder.
S22: Then randomly initialize each word as a 200-dimensional data feature, so that each text sample forms N*200-dimensional data.
S23: Take the relation feature of each word as an input neuron of the deep-learning neural network; the first hidden layer performs convolution, the second hidden layer performs subsampling and local averaging, the third hidden layer performs a second convolution, and the fourth hidden layer performs a second round of subsampling and local averaging; a fully connected layer then converts the N*200-dimensional data into 1000-dimensional data. 70% of the data is used for training and 30% for testing. The model generated by the deep network is adjusted gradually through repeated accuracy tests, and the final optimal network model is the relation model of the texts.
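The S23 layer chain (convolution, then subsampling with local averaging, twice) can be illustrated on a toy one-dimensional signal without any deep-learning framework; the sizes and the uniform kernel below are illustrative, not the patent's 200/1000 dimensions:

```python
import numpy as np

def conv1d(x, k):
    """Valid-mode 1-D convolution (one convolutional hidden layer)."""
    n = len(x) - len(k) + 1
    return np.array([np.dot(x[i:i + len(k)], k) for i in range(n)])

def subsample_avg(x, stride=2):
    """Subsampling with local averaging over non-overlapping windows."""
    trimmed = x[: len(x) // stride * stride]
    return trimmed.reshape(-1, stride).mean(axis=1)

# A flattened word-feature signal (toy size).
x = np.arange(12, dtype=float)
k = np.ones(3) / 3.0           # uniform smoothing kernel (assumption)

h1 = conv1d(x, k)              # first convolution
h2 = subsample_avg(h1)         # first subsample + local average
h3 = conv1d(h2, k)             # second convolution
h4 = subsample_avg(h3)         # second subsample + local average
print(h4.shape)
```

Each stage shrinks the signal, mirroring how the four hidden layers reduce the N*200-dimensional input before the fully connected layer produces the final 1000-dimensional representation.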
(4) Convert the extracted relation-feature text data into resilient distributed (RDD) relation-feature text data, then split it into multiple RDDs according to the contextual feature streams of the texts for sharded processing.
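The split in step (4) can be illustrated without a Spark installation; on Spark itself this would be an `rdd.groupBy` (or a `partitionBy`) on a context key. The record layout below is an assumption for illustration:

```python
from collections import defaultdict

def shard_by_context(records):
    """Group (context_label, feature_vector) records into per-context
    shards, mimicking the per-RDD split of step (4)."""
    shards = defaultdict(list)
    for context, features in records:
        shards[context].append(features)
    return dict(shards)

records = [("finance", [0.2, 0.8]),
           ("medical", [0.9, 0.1]),
           ("finance", [0.3, 0.7])]
shards = shard_by_context(records)
print(shards)
```

Each resulting shard would then be handed to the entity model of its own context in step (6).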
(5) Convert the resilient distributed (RDD) feature text data into class features with the class feature model trained by the improved nonlinear SVM classification algorithm. The training data set is an existing, already-classified industry text data set; meanwhile, taking advantage of the fast computation of the Spark distributed platform, a corrected industry text data set can be retrained quickly to obtain a new class feature model.
Sample texts that are extracted poorly or contain obvious errors are put into a new class, and the test samples are adjusted step by step until the test sample classes are optimal. New text sets can form different classes, which the Spark platform distributes by feature, so the corresponding entities can be extracted rapidly from all samples by the entity model of the corresponding type. As the number of classes increases, the robustness of the corresponding multi-class entity models becomes stronger and the entity extraction accuracy improves.
The class feature model trained by the improved nonlinear SVM classification algorithm is obtained through the following steps:
The improved support vector machine model is chosen as the training classification model. Its training objective function is $\min_{w,b,\varepsilon}\ \tfrac{1}{2}\lVert w\rVert^{2}+C\sum_{i=1}^{n}s_{i}\varepsilon_{i}^{2}$, and its corresponding constraint is $y_{i}=w'\varphi(x_{i})+b+\varepsilon_{i}$. From the objective function and the constraint, the discriminant function $f(x)=\sum_{i=1}^{n}\alpha_{i}y_{i}\varphi(x_{i})'\varphi(x)+b$ is derived, with weights $w=\sum_{i=1}^{n}\alpha_{i}y_{i}\varphi(x_{i})$. Here C is the penalty factor, an adjustable parameter; i runs over the 1 to n training text samples; w is the weight vector; $s_{i}$ is the Euclidean distance between positive and negative samples and serves as the weighting coefficient of the penalty factor in the objective function; b is the threshold; $\varepsilon_{i}$ is the error; and $\varphi(x_{i})$ is the nonlinear kernel function.
The penalty factor is set between 1 and 100, and features are extracted from the positive and negative samples prepared in advance. The corresponding kernel function is $\varphi(x_{i})=\min(x(i),x_{s}(i))$, where $x(i)$ and $x_{s}(i)$ are the feature vectors extracted from any two positive or negative samples. The label value of a positive sample is 1 and that of a negative sample is -1. Offline training yields the $\alpha_{i}$ and b of the discriminant function, and the discriminant function $f(x)=\sum_{i=1}^{n}\alpha_{i}y_{i}\varphi(x_{i})'\varphi(x)+b$ is the corresponding nonlinear SVM detection model.
The class of text context corresponding to each value is output according to the result $y_{i}$ of the detection model.
(6) Find the corresponding context entity model according to the class feature of the text, and extract the entity data in the texts of the corresponding type with the trained entity model, where the context entity model is the text entity model trained on the existing industry text data sets with the open-source word2vec tool.
(7) When the number of texts of some scene exceeds the threshold T, the scene entity model is retrained with the word2vec tool; when the threshold is not exceeded, the data is first saved on the distributed platform. In general, T is a sample quantity of more than 10,000.
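The bookkeeping of step (7) can be sketched with a stub standing in for the word2vec retraining; the class and method names here are illustrative, and the toy threshold replaces the patent's 10,000-sample default:

```python
class ContextEntityModel:
    """Per-context entity model with threshold-triggered retraining,
    as in step (7). retrain() stands in for word2vec retraining."""

    def __init__(self, threshold=10000):
        self.threshold = threshold
        self.buffer = []       # texts saved since the last training
        self.trained_on = 0    # corpus size the current model was built from

    def add_text(self, text):
        self.buffer.append(text)
        if len(self.buffer) > self.threshold:
            self.retrain()

    def retrain(self):
        # A real implementation would retrain the word2vec entity
        # model here; we only record the corpus growth.
        self.trained_on += len(self.buffer)
        self.buffer.clear()

model = ContextEntityModel(threshold=2)
for t in ["t1", "t2", "t3"]:
    model.add_text(t)
print(model.trained_on, len(model.buffer))
```

Texts below the threshold stay buffered on the platform, matching the "first save the data" branch of step (7).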
It should be understood that the above embodiments of the present invention are only used to exemplify or explain the principle of the invention and are not to be construed as limiting it. Therefore, any modification, equivalent substitution, improvement and the like made without departing from the spirit and scope of the invention shall be included in the protection scope of the invention. Furthermore, the appended claims are intended to cover all changes and modifications that fall within the scope and boundary of the claims, or the equivalents of such scope and boundary.
Claims (4)
1. An industry text entity extraction method based on a distributed platform, characterized by comprising the following steps:
S01: training a text data set with a deep-learning neural network to obtain a relation feature model, and extracting the relation features in the target text with the relation feature model;
S02: generating multiple resilient distributed relation-feature datasets (RDDs) from the extracted relation features;
S03: extracting class features from the data sets in the RDDs with a class feature model trained by an improved nonlinear SVM classification algorithm;
S04: finding the context entity model corresponding to each extracted class feature, and extracting the entity data in the texts of the corresponding class with the trained entity model;
S05: judging whether the number of texts in the corresponding context exceeds a set threshold T; if it does, retraining the context entity model and extracting the entity data in the texts of the corresponding class with the retrained entity model; otherwise, saving the text entity features and text data.
2. The industry text entity extraction method based on a distributed platform according to claim 1, characterized in that step S01 specifically comprises:
S11: segmenting the text with the open-source ansj word segmenter; counting each word's frequency in all texts and in the current text; removing common auxiliary words, stop words and overly frequent words; extracting N words according to the relation between the word frequency in the current text and the word frequency in all texts; and placing each class in the same folder;
S12: randomly initializing each of the N words as an A-dimensional data feature, so that each text forms N*A-dimensional data;
S13: taking each word feature as an input neuron of the deep-learning neural network; performing convolution in the first hidden layer, subsampling and local averaging in the second hidden layer, a second convolution in the third hidden layer, and a second round of subsampling and local averaging in the fourth hidden layer; then converting the text into B-dimensional data through a fully connected layer, and obtaining the relation feature model through repeated testing and accuracy tuning.
3. The industry text entity extraction method based on a distributed platform according to claim 1, characterized in that step S03 specifically comprises:
S31: adjusting the weights and offsets in the nonlinear SVM classification algorithm so that the error between the input relation features and the features of the labelled samples falls within a set range, and saving the class feature model of the text;
S32: the selected classification model is the improved nonlinear SVM classification algorithm, whose training objective function is $\min_{w,b,\varepsilon}\ \tfrac{1}{2}\lVert w\rVert^{2}+C\sum_{i=1}^{n}s_{i}\varepsilon_{i}^{2}$, subject to the prediction condition $y_{i}=w'\varphi(x_{i})+b+\varepsilon_{i}$, which yields the discriminant function $f(x)=\sum_{i=1}^{n}\alpha_{i}y_{i}\varphi(x_{i})'\varphi(x)+b$ with weights $w=\sum_{i=1}^{n}\alpha_{i}y_{i}\varphi(x_{i})$, wherein C is the penalty factor, an empirical parameter, i is the RDD index, w is the weight vector, $s_{i}$ is the Euclidean distance between the positive and negative samples in the relation features, b is the classification threshold, $\varepsilon_{i}$ is the error, and $\varphi(x_{i})$ is the nonlinear kernel function;
S33: gradually adjusting the penalty factor and selecting the optimal one by testing, wherein the nonlinear kernel function is $\varphi(x_{i})=\min(x(i),x_{s}(i))$, with $x(i)$ and $x_{s}(i)$ being the feature vectors extracted from any two relation-feature text samples; the label of each class of relation feature samples is the corresponding class number, and the $\alpha_{i}$ and b of the discriminant function are obtained through repeated offline training, the discriminant function $f(x)=\sum_{i=1}^{n}\alpha_{i}y_{i}\varphi(x_{i})'\varphi(x)+b$ being the corresponding class feature model.
4. The industry text entity extraction method based on a distributed platform according to claim 1, characterized in that in step S03, sample texts that are extracted poorly or contain obvious errors are put into a new class, and the test samples are adjusted step by step until the test sample classes are optimal.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710902720.0A CN107908642B (en) | 2017-09-29 | 2017-09-29 | Industry text entity extraction method based on distributed platform |
Publications (2)
Publication Number | Publication Date |
---|---|
CN107908642A true CN107908642A (en) | 2018-04-13 |
CN107908642B CN107908642B (en) | 2021-11-12 |
Family
ID=61840291
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710902720.0A Active CN107908642B (en) | 2017-09-29 | 2017-09-29 | Industry text entity extraction method based on distributed platform |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107908642B (en) |
Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20100250547A1 (en) * | 2001-08-13 | 2010-09-30 | Xerox Corporation | System for Automatically Generating Queries |
CN104933164A (en) * | 2015-06-26 | 2015-09-23 | 华南理工大学 | Method for extracting relations among named entities in Internet massive data and system thereof |
CN105389378A (en) * | 2015-11-19 | 2016-03-09 | 广州精标信息科技有限公司 | System for integrating separate data |
CN106168965A (en) * | 2016-07-01 | 2016-11-30 | 竹间智能科技(上海)有限公司 | Knowledge mapping constructing system |
CN106599032A (en) * | 2016-10-27 | 2017-04-26 | 浙江大学 | Text event extraction method in combination of sparse coding and structural perceptron |
CN106599041A (en) * | 2016-11-07 | 2017-04-26 | 中国电子科技集团公司第三十二研究所 | Text processing and retrieval system based on big data platform |
US20170124181A1 (en) * | 2015-10-30 | 2017-05-04 | Oracle International Corporation | Automatic fuzzy matching of entities in context |
CN106682220A (en) * | 2017-01-04 | 2017-05-17 | 华南理工大学 | Online traditional Chinese medicine text named entity identifying method based on deep learning |
US20170169094A1 (en) * | 2015-12-15 | 2017-06-15 | International Business Machines Corporation | Statistical Clustering Inferred From Natural Language to Drive Relevant Analysis and Conversation With Users |
Non-Patent Citations (2)
Title |
---|
REN, Yuwei et al., "Named Entity Recognition in Search Logs", New Technology of Library and Information Service |
ZHANG, Fan et al., "Medical Named Entity Recognition Based on Deep Learning", Computing Technology and Automation |
Cited By (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109508757A (en) * | 2018-10-30 | 2019-03-22 | 北京陌上花科技有限公司 | Data processing method and device for Text region |
CN111274348A (en) * | 2018-12-04 | 2020-06-12 | 北京嘀嘀无限科技发展有限公司 | Service feature data extraction method and device and electronic equipment |
CN111274348B (en) * | 2018-12-04 | 2023-05-12 | 北京嘀嘀无限科技发展有限公司 | Service feature data extraction method and device and electronic equipment |
CN111382570A (en) * | 2018-12-28 | 2020-07-07 | 深圳市优必选科技有限公司 | Text entity recognition method and device, computer equipment and storage medium |
CN111382570B (en) * | 2018-12-28 | 2024-05-03 | 深圳市优必选科技有限公司 | Text entity recognition method, device, computer equipment and storage medium |
CN109754014A (en) * | 2018-12-29 | 2019-05-14 | 北京航天数据股份有限公司 | Industry pattern training method, device, equipment and medium |
CN109754014B (en) * | 2018-12-29 | 2021-04-27 | 北京航天数据股份有限公司 | Industrial model training method, device, equipment and medium |
CN111950279A (en) * | 2019-05-17 | 2020-11-17 | 百度在线网络技术(北京)有限公司 | Entity relationship processing method, device, equipment and computer readable storage medium |
CN112052646A (en) * | 2020-08-27 | 2020-12-08 | 安徽聚戎科技信息咨询有限公司 | Text data labeling method |
CN112052646B (en) * | 2020-08-27 | 2024-03-29 | 安徽聚戎科技信息咨询有限公司 | Text data labeling method |
CN114756385A (en) * | 2022-06-16 | 2022-07-15 | 合肥中科类脑智能技术有限公司 | Elastic distributed training method in deep learning scene |
Also Published As
Publication number | Publication date |
---|---|
CN107908642B (en) | 2021-11-12 |
Legal Events
Date | Code | Title | Description
---|---|---|---
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| GR01 | Patent grant | |