CN106815194A - Model training method and device and keyword recognition method and device - Google Patents
Model training method and device and keyword recognition method and device
- Publication number
- CN106815194A, CN201510850285.2A
- Authority
- CN
- China
- Prior art keywords
- word
- sentence
- term vector
- text message
- neural network
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/31—Indexing; Data structures therefor; Storage structures
- G06F16/313—Selection or weighting of terms for indexing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Physics & Mathematics (AREA)
- Artificial Intelligence (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- Databases & Information Systems (AREA)
- Data Mining & Analysis (AREA)
- Software Systems (AREA)
- Machine Translation (AREA)
Abstract
This application discloses a model training method and device and a keyword recognition method and device. The model training method includes: obtaining text information with part-of-speech tags, where the text information includes multiple sentences and each word in every sentence carries a part-of-speech tag of the corresponding part-of-speech type; determining a word vector for each word in every sentence, the word vector being a multidimensional array that uniquely represents the corresponding word; and, sentence by sentence, inputting the part-of-speech tag and the word vector of each word in every sentence into a recurrent neural network and training it to obtain a neural network model, where the neural network model is used to tag the words in a sentence. The application addresses the technical problem in the prior art of poor accuracy in recognizing keywords in sentences.
Description
Technical field
The application relates to the field of text processing, and in particular to a model training method and device and a keyword recognition method and device.
Background
The sentences of a text generally contain the keywords that the sentence is meant to express. For example, in the user statement "I've been a bit tired lately, and I want to go to Yonghe Palace to play", the place "Yonghe Palace" is the keyword. A computer system, however, cannot pick out such keywords as accurately as a person can. Existing computer systems usually recognize keywords based on the part of speech of the words or on the sentence structure: the sentence is segmented into words, and the words with the target part of speech are taken as keywords. This approach depends heavily on the word segmentation tool. It is effective for extracting a single part of speech, but its recognition accuracy is poor for multiple parts of speech and for the complex sentence patterns, new sentence patterns, unstructured sentence patterns, and new vocabulary that occur in natural language.
No effective solution to this problem has yet been proposed.
Summary of the invention
The embodiments of the application provide a model training method and device and a keyword recognition method and device, to at least solve the technical problem in the prior art of poor accuracy in recognizing keywords in sentences.
According to one aspect of the embodiments of the application, a model training method is provided, including: obtaining text information with part-of-speech tags, where the text information includes multiple sentences and each word in every sentence carries a part-of-speech tag of the corresponding part-of-speech type; determining a word vector for each word in every sentence, the word vector being a multidimensional array that uniquely represents the corresponding word; and, sentence by sentence, inputting the part-of-speech tag and the word vector of each word in every sentence into a recurrent neural network and training it to obtain a neural network model, where the neural network model is used to tag the words in a sentence.
Further, determining the word vector of each word in every sentence includes: segmenting every sentence in the text information into words to obtain the word set of the text information; and looking up the word vector of each word in the word set.
Further, before the word vector of each word in every sentence is determined, the model training method also includes: obtaining text information of a preset data amount to obtain a text information set; and generating the word vector of each word in the text information set by machine learning to obtain a word vector set. Looking up the word vector of each word in the word set then includes: looking up the word vector of each word in the word set from the word vector set.
Further, the keywords in every sentence of the text information are tagged with a first preset tag and the other words with a second preset tag, so that when the neural network model is used to recognize words, it tags the keywords with the first preset tag.
According to another aspect of the embodiments of the application, a keyword recognition method is also provided, including: segmenting the text to be tested into words and determining the word vector of each word; and, sentence by sentence, inputting the word vector of each word in every sentence of the text to be tested into the neural network model and using the neural network model to tag the keywords in the text to be tested.
According to another aspect of the embodiments of the application, a model training device is also provided, including: a first acquisition unit, for obtaining text information with part-of-speech tags, where the text information includes multiple sentences and each word in every sentence carries a part-of-speech tag of the corresponding part-of-speech type; a determining unit, for determining a word vector for each word in every sentence, the word vector being a multidimensional array that uniquely represents the corresponding word; and a training unit, for inputting, sentence by sentence, the part-of-speech tag and the word vector of each word in every sentence into a recurrent neural network and training it to obtain a neural network model, where the neural network model is used to tag the words in a sentence.
Further, the training unit includes: a word segmentation module, for segmenting every sentence in the text information into words to obtain the word set of the text information; and a query module, for looking up the word vector of each word in the word set.
Further, the model training device also includes: a second acquisition unit, for obtaining text information of a preset data amount, before the word vector of each word in every sentence is determined, to obtain a text information set; and a generation unit, for generating the word vector of each word in the text information set by machine learning to obtain a word vector set. The query module is specifically used to look up the word vector of each word in the word set from the word vector set.
Further, the keywords in every sentence of the text information are tagged with a first preset tag and the other words with a second preset tag, so that when the neural network model is used to recognize words, it tags the keywords with the first preset tag.
According to another aspect of the embodiments of the application, a keyword recognition device is also provided, including: a vector determining unit, for segmenting the text to be tested into words and determining the word vector of each word; and a tagging unit, for inputting, sentence by sentence, the word vector of each word in every sentence of the text to be tested into the neural network model and using the neural network model to tag the keywords in the text to be tested.
According to the embodiments of the application, text information with part-of-speech tags is obtained, where the text information includes multiple sentences and each word in every sentence carries a part-of-speech tag of the corresponding part-of-speech type; a word vector is determined for each word in every sentence, the word vector being a multidimensional array that uniquely represents the corresponding word; and, sentence by sentence, the part-of-speech tag and the word vector of each word in every sentence are input into a recurrent neural network, which is trained to obtain a neural network model. The neural network model can then be used to tag the words in a sentence and thereby identify the keywords in it. This solves the technical problem in the prior art of poor accuracy in recognizing keywords in sentences and improves the accuracy of keyword recognition.
Brief description of the drawings
The drawings described here provide a further understanding of the application and form part of it. The illustrative embodiments of the application and their descriptions are used to explain the application and do not unduly limit it. In the drawings:
Fig. 1 is the flow chart of the model training method according to the embodiment of the present application;
Fig. 2 is the flow chart of the keyword recognition method according to the embodiment of the present application;
Fig. 3 is the schematic diagram of the model training apparatus according to the embodiment of the present application;
Fig. 4 is the schematic diagram of the keyword identifying device according to the embodiment of the present application.
Detailed description of the embodiments
To help those skilled in the art better understand the solution of the application, the technical solutions in the embodiments of the application are described clearly and completely below with reference to the accompanying drawings. The described embodiments are obviously only some, not all, of the embodiments of the application. All other embodiments obtained by a person of ordinary skill in the art from the embodiments of the application without creative effort shall fall within the scope of protection of the application.
It should be noted that the terms "first", "second", and the like in the description, the claims, and the above drawings are used to distinguish similar objects and not to describe a specific order or sequence. It should be understood that data so used are interchangeable where appropriate, so that the embodiments described here can be implemented in orders other than those illustrated or described here. In addition, the terms "comprise" and "have" and any variants of them are intended to cover non-exclusive inclusion: for example, a process, method, system, product, or device that contains a series of steps or units is not necessarily limited to the steps or units expressly listed, but may include other steps or units that are not expressly listed or that are inherent to the process, method, product, or device.
According to an embodiment of the application, a method embodiment of a model training method is provided. It should be noted that the steps shown in the flow chart of the drawings can be executed in a computer system, such as one running a set of computer-executable instructions, and that, although a logical order is shown in the flow chart, in some cases the steps shown or described can be performed in an order different from the one given here.
Fig. 1 is a flow chart of the model training method according to an embodiment of the application. As shown in Fig. 1, the method includes the following steps:
Step S102: obtain text information with part-of-speech tags, where the text information includes multiple sentences and each word in every sentence carries a part-of-speech tag of the corresponding part-of-speech type.
The text information with part-of-speech tags in this embodiment can be a sample of text information collected in advance, in which the words of interest in every sentence have been annotated manually with their part of speech. There can be one category of words of interest or several, for example place nouns and person nouns. One possible annotation scheme tags a place of interest as PLACE and an invalid word as NUL (empty). For example, the sentence "I want to go to Yonghe Palace to play" might be segmented into "I want / go / Yonghe / Palace / play" and, after manual annotation, become "I want NUL, go NUL, Yonghe PLACE, Palace PLACE, play NUL".
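The annotated sample can be held in memory as a list of (word, tag) pairs. The following is a minimal sketch of that representation; the pair layout, the integer tag ids, and the helper name `split_annotation` are assumptions made for illustration and are not specified by the source.

```python
from typing import Dict, List, Tuple

# Tag inventory from the example: PLACE marks a place of interest, NUL an invalid word.
# The integer ids are an assumption made for this sketch.
TAG_IDS: Dict[str, int] = {"NUL": 0, "PLACE": 1}


def split_annotation(sentence: List[Tuple[str, str]]) -> Tuple[List[str], List[int]]:
    """Separate an annotated sentence into its words and integer tag ids."""
    words = [word for word, _ in sentence]
    tag_ids = [TAG_IDS[tag] for _, tag in sentence]  # raises KeyError on an unknown tag
    return words, tag_ids


# The annotated example sentence, one (word, tag) pair per segmented word.
annotated = [
    ("I want", "NUL"),
    ("go", "NUL"),
    ("Yonghe", "PLACE"),
    ("Palace", "PLACE"),
    ("play", "NUL"),
]
words, tag_ids = split_annotation(annotated)
print(words)
print(tag_ids)
```

Keeping the words and tags aligned by position makes it straightforward to replace each word with its word vector later while leaving the tags untouched.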
Step S104: determine a word vector for each word in every sentence, the word vector being a multidimensional array that uniquely represents the corresponding word.
After the text information with part-of-speech tags is obtained, the word vector of each word in every sentence of the text is determined. Each word's vector is a multidimensional array, and different words have different word vectors. The word vectors can be defined in advance, so that once the text information has been extracted, the vector of each word is looked up among the predefined word vectors. Alternatively, the word vector of each word can be generated according to a preset word-vector generation rule. Because each word in the text information carries a part-of-speech tag, the word vector of each word is associated with the same part-of-speech tag as the word.
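Looking up predefined word vectors can be sketched as a dictionary lookup. The vectors below are hand-written toy values, and falling back to a zero vector for an unknown word is an assumption of this sketch; the source does not say how missing words are handled.

```python
from typing import Dict, List

Vector = List[float]

# Toy, hand-written vectors; real ones would come from a trained word vector set.
WORD_VECTORS: Dict[str, Vector] = {
    "go": [0.1, 0.3, -0.2],
    "Yonghe": [0.7, -0.1, 0.4],
    "Palace": [0.6, 0.0, 0.5],
    "play": [-0.3, 0.2, 0.1],
}


def lookup_vectors(words: List[str], table: Dict[str, Vector], dim: int = 3) -> List[Vector]:
    """Replace each word with its vector; unknown words get a zero vector (assumption)."""
    return [table.get(word, [0.0] * dim) for word in words]


vectors = lookup_vectors(["go", "Yonghe", "unknown"], WORD_VECTORS)
print(vectors)
```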
Step S106: sentence by sentence, input the part-of-speech tag and the word vector of each word in every sentence of the text information into a recurrent neural network and train it to obtain a neural network model, where the neural network model is used to tag the words in a sentence.
In this embodiment, after the word vector of each word in the text information has been determined, the sentences of the text information are fed one after another into the recurrent neural network for training. Each sentence fed to the network is replaced by the word vectors of its words; that is, the word vector of each word in the sentence is what is input to the recurrent neural network. Training the recurrent neural network on the extracted text information yields the neural network model.
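Sentence-by-sentence training can be sketched with a very small recurrent tagger in pure Python. The source only says that a recurrent neural network is trained; the Elman-style recurrence, softmax output, cross-entropy loss, plain gradient descent with backpropagation through time, and every name and hyperparameter below are assumptions chosen to keep the sketch short.

```python
import math
import random
from typing import List, Tuple

Vector = List[float]
Matrix = List[Vector]


def zeros(rows: int, cols: int) -> Matrix:
    return [[0.0] * cols for _ in range(rows)]


def matvec(m: Matrix, v: Vector) -> Vector:
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def softmax(z: Vector) -> Vector:
    top = max(z)
    exps = [math.exp(x - top) for x in z]
    total = sum(exps)
    return [e / total for e in exps]


class TinyRNNTagger:
    """h_t = tanh(Wx x_t + Wh h_{t-1} + b), p_t = softmax(Wy h_t + c)."""

    def __init__(self, in_dim: int, hid_dim: int, n_tags: int, seed: int = 0) -> None:
        rng = random.Random(seed)

        def rand(rows: int, cols: int) -> Matrix:
            return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

        self.Wx, self.Wh, self.Wy = rand(hid_dim, in_dim), rand(hid_dim, hid_dim), rand(n_tags, hid_dim)
        self.b, self.c = [0.0] * hid_dim, [0.0] * n_tags

    def forward(self, xs: List[Vector]) -> Tuple[List[Vector], List[Vector]]:
        """Return hidden states (initial zero state first) and per-word tag probabilities."""
        hs = [[0.0] * len(self.b)]  # the hidden state is reset for every sentence
        probs = []
        for x in xs:
            pre = [u + v + k for u, v, k in zip(matvec(self.Wx, x), matvec(self.Wh, hs[-1]), self.b)]
            hs.append([math.tanh(a) for a in pre])
            probs.append(softmax([s + k for s, k in zip(matvec(self.Wy, hs[-1]), self.c)]))
        return hs, probs

    def train_sentence(self, xs: List[Vector], ys: List[int], lr: float = 0.1) -> float:
        """One gradient step on one sentence; returns the loss before the update."""
        hs, probs = self.forward(xs)
        loss = -sum(math.log(p[y]) for p, y in zip(probs, ys))
        H = len(self.b)
        dWx, dWh, dWy = zeros(H, len(xs[0])), zeros(H, H), zeros(len(self.c), H)
        db, dc = [0.0] * H, [0.0] * len(self.c)
        dh_next = [0.0] * H
        for t in reversed(range(len(xs))):  # backpropagation through time
            h, h_prev = hs[t + 1], hs[t]
            dz = [p - (1.0 if k == ys[t] else 0.0) for k, p in enumerate(probs[t])]
            for k, g in enumerate(dz):
                dc[k] += g
                for j in range(H):
                    dWy[k][j] += g * h[j]
            dh = [sum(self.Wy[k][j] * dz[k] for k in range(len(dz))) + dh_next[j] for j in range(H)]
            da = [dh[j] * (1.0 - h[j] ** 2) for j in range(H)]  # derivative of tanh
            for j, g in enumerate(da):
                db[j] += g
                for i, xi in enumerate(xs[t]):
                    dWx[j][i] += g * xi
                for i, hi in enumerate(h_prev):
                    dWh[j][i] += g * hi
            dh_next = [sum(self.Wh[j][i] * da[j] for j in range(H)) for i in range(H)]
        for W, dW in ((self.Wx, dWx), (self.Wh, dWh), (self.Wy, dWy)):
            for row, drow in zip(W, dW):
                for i, g in enumerate(drow):
                    row[i] -= lr * g
        for vec, dvec in ((self.b, db), (self.c, dc)):
            for i, g in enumerate(dvec):
                vec[i] -= lr * g
        return loss


# One toy sentence of four word vectors with tag ids 0 (NUL) and 1 (PLACE).
sentence = [[0.1, 0.3, -0.2], [0.7, -0.1, 0.4], [0.6, 0.0, 0.5], [-0.3, 0.2, 0.1]]
tags = [0, 1, 1, 0]
model = TinyRNNTagger(in_dim=3, hid_dim=4, n_tags=2)
losses = [model.train_sentence(sentence, tags) for _ in range(200)]
print(losses[0], losses[-1])
```

A practical system would more likely rely on an established deep learning library and train over many sentences; the sketch is only meant to show how one sentence of word vectors and tags drives one update of the matrices that make up the model.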
Because the word vectors of the words are fed into the memory neural network sentence by sentence, the machine can memorize the words in a sentence, their part-of-speech tags, and the way they combine, and it stores them in the parameters of the neural network model (once determined, the parameters of the neural network model are mostly matrices). Compared with the prior-art approach, which relies on the part of speech of the words or on the sentence structure and takes the words with the target part of speech as keywords after segmenting the sentence, this embodiment recognizes the keywords in a text with the trained neural network model, can accurately identify keywords in sentences of various structures, and recognizes keywords with high accuracy.
According to this embodiment, text information with part-of-speech tags is obtained, where the text information includes multiple sentences and each word in every sentence carries a part-of-speech tag of the corresponding part-of-speech type; a word vector is determined for each word in every sentence, the word vector being a multidimensional array that uniquely represents the corresponding word; and, sentence by sentence, the part-of-speech tag and the word vector of each word in every sentence are input into a recurrent neural network, which is trained to obtain a neural network model. The neural network model can then be used to tag the words in a sentence and thereby identify the keywords in it. This solves the technical problem in the prior art of poor accuracy in recognizing keywords in sentences and improves the accuracy of keyword recognition.
Preferably, determining the word vector of each word in every sentence includes: segmenting every sentence in the text information into words to obtain the word set of the text information; and looking up the word vector of each word in the word set.
In this embodiment, the word vectors are generated in advance to form a word vector set. After the text information serving as the sample has been collected, the word vector of each word in every sentence of the text information is looked up in the previously generated word vector set. Every sentence of the text information can be segmented with a word segmentation tool according to certain rules; for example, "I want to go to Yonghe Palace to play" might be segmented into "I want / go / Yonghe / Palace / play".
Further, before the word vector of each word in every sentence is determined, the model training method also includes: obtaining text information of a preset data amount to obtain a text information set; and generating the word vector of each word in the text information set by machine learning to obtain a word vector set. Looking up the word vector of each word in the word set then includes: looking up the word vector of each word in the word set from the word vector set.
In this embodiment, the word vector set is generated before the word vectors of the words are determined. Specifically, a large amount of text information is obtained first, where the preset data amount can be a preconfigured, relatively large amount of data. The obtained text information of the preset data amount serves as the text information set for training word vectors, and the word vector of each word in it is then generated by machine learning, yielding the word vector set. When the word vectors of the words in the sample text information are later determined, they can be looked up directly in this word vector set.
The machine learning approach can use Google word2vec to train the word vectors: from the input text it generates, for each word, a unique vector of the same dimension, that is, a multidimensional array whose dimension can be user-defined. For example, "happy" might be represented as [0, 1, 0, ...].
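The source names Google word2vec for this step. As a dependency-free stand-in, the sketch below builds co-occurrence count vectors instead, which is a different and much cruder technique; it only illustrates that every word in the corpus receives a vector of the same dimension. The function name `cooccurrence_vectors`, the `window` parameter, and the toy corpus are assumptions for illustration.

```python
from typing import Dict, List


def cooccurrence_vectors(sentences: List[List[str]], window: int = 1) -> Dict[str, List[float]]:
    """Give each word a vector counting how often each vocabulary word appears near it."""
    vocab = sorted({word for sentence in sentences for word in sentence})
    index = {word: i for i, word in enumerate(vocab)}
    vectors = {word: [0.0] * len(vocab) for word in vocab}
    for sentence in sentences:
        for i, word in enumerate(sentence):
            lo, hi = max(0, i - window), min(len(sentence), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][index[sentence[j]]] += 1.0
    return vectors


# Toy segmented corpus standing in for the large text information set.
corpus = [["go", "Yonghe", "Palace"], ["Yonghe", "Palace", "rating"]]
vecs = cooccurrence_vectors(corpus)
for word, vec in vecs.items():
    print(word, vec)
```

Unlike word2vec, these vectors grow with the vocabulary and are not guaranteed to be unique per word, so the sketch should be read as a placeholder for the trained word vector set rather than as a replacement for it.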
Preferably, the keywords in every sentence of the text information are tagged with a first preset tag and the other words with a second preset tag, so that when the neural network model is used to recognize words, it tags the keywords with the first preset tag.
In this embodiment, in the sentences of the text information used as training samples, the keywords of interest are tagged with the first preset tag and the other, invalid words with the second preset tag. During model training, the resulting neural network model memorizes these tags. Therefore, when the trained neural network model is used to recognize the keywords in a sentence, its output tags the keywords with the first preset tag and the other, invalid words with the second preset tag.
For example, if the words of interest are words denoting places, a place is tagged PLACE and an invalid word is tagged NUL (empty). The sentence "I want to go to Yonghe Palace to play" is segmented and then tagged as "I want NUL, go NUL, Yonghe PLACE, Palace PLACE, play NUL".
An optional implementation of the model training method of this embodiment includes:
Step one: collect a large amount of text information as word-vector training text set 1, used for training word vectors.
Step two: segment text set 1 into words and generate word vectors by machine learning to obtain a word vector set. The machine learning can use Google word2vec to train the word vectors: from the input text it generates, for each word, a unique vector of the same dimension, that is, a multidimensional array whose dimension can be user-defined. For example, "happy" might be represented as [0, 1, 0, ...].
Step three: collect business-related text information, segment every sentence into words, and manually annotate each word with a part-of-speech tag, to form training set 2. The parts of speech are the categories of interest; there can be one category or several, for example place nouns and person nouns. One possible annotation scheme tags a place of interest as PLACE and an invalid word as NUL (empty). For example, "I want to go to Yonghe Palace to play" might be segmented into "I want / go / Yonghe / Palace / play" and, after manual annotation, become "I want NUL, go NUL, Yonghe PLACE, Palace PLACE, play NUL".
Step four: represent the words in training set 2 with the word vectors generated in step two and, sentence by sentence, input the word vectors of training set 2 into an RNN (recurrent neural network) for training, obtaining the trained RNN model. Because the recurrent neural network takes sentences as its input, the machine can memorize the words in a sentence, their part-of-speech tags, and the way they combine, and it stores them in the parameters of the model.
According to this embodiment, the model is trained by combining word vectors with a recurrent neural network, so that keyword extraction depends little on the accuracy of the word segmentation tool and is fairly robust (for example, for a word that did not appear in the training set, a part of speech can still be obtained at test time, and whether it is a keyword can be determined).
An embodiment of the application also provides a keyword recognition method, which can recognize keywords with the neural network model trained by the model training method of the above embodiment. As shown in Fig. 2, the keyword recognition method includes:
Step S202: segment the text to be tested into words and determine the word vector of each word.
In this embodiment, the text to be tested is segmented and its word vectors are determined in the same way as in the model training method of the above embodiment, which is not repeated here.
Step S204: sentence by sentence, input the word vector of each word in every sentence of the text to be tested into the neural network model, and use the neural network model to tag the keywords in the text to be tested.
The neural network model in this embodiment is the one obtained by training with the model training method of the above embodiment.
Sentence by sentence, the word vectors of the words in the text to be tested are input into the neural network model, which identifies and tags the keywords in the text. Specifically, the text to be tested is obtained and segmented into words, each word is represented by its word vector, and the word vectors are input into the neural network model one sentence at a time. The model outputs a part-of-speech tag for each word, from which the words with the part of speech of interest are obtained.
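Turning the model output into one tag per word can be sketched as follows, assuming the model yields a probability for each tag at each word position. That output format, the tag order, and the helper name `decode_tags` are assumptions of this sketch; the source only states that a tag is obtained for each word.

```python
from typing import List, Sequence


def decode_tags(probs: List[List[float]], tags: Sequence[str] = ("NUL", "PLACE")) -> List[str]:
    """Pick the most probable tag for each word position."""
    return [tags[max(range(len(p)), key=p.__getitem__)] for p in probs]


# Hypothetical per-word tag probabilities for a four-word sentence.
probs = [[0.1, 0.9], [0.2, 0.8], [0.7, 0.3], [0.95, 0.05]]
predicted = decode_tags(probs)
print(predicted)
```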
Because the word vectors of the words are fed into the memory neural network sentence by sentence, the machine can memorize the words in a sentence, their part-of-speech tags, and the way they combine, and it stores them in the parameters of the neural network model (once determined, the parameters of the neural network model are mostly matrices). Compared with the prior-art approach, which relies on the part of speech of the words or on the sentence structure and takes the words with the target part of speech as keywords after segmenting the sentence, this embodiment recognizes the keywords in a text with the trained neural network model, working sentence by sentence; it can accurately identify keywords in sentences of various structures, and it recognizes keywords with high accuracy.
For example, "How is Yonghe Palace rated" might be segmented into "Yonghe / Palace / rating / how", and the result computed by the neural network model is "Yonghe PLACE, Palace PLACE, rating NUL, how NUL". After filtering, the place noun of interest is obtained: Yonghe Palace.
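The filtering step can be sketched as collecting the words that carry the tag of interest. Joining adjacent tagged words into a single keyword is an assumption of this sketch (the source only says the result is filtered), as are the helper name `extract_keywords` and its parameters.

```python
from typing import List, Tuple


def extract_keywords(tagged: List[Tuple[str, str]], target: str = "PLACE", joiner: str = " ") -> List[str]:
    """Collect runs of consecutive words carrying the target tag as keywords."""
    keywords: List[str] = []
    current: List[str] = []
    for word, tag in tagged:
        if tag == target:
            current.append(word)
        elif current:
            keywords.append(joiner.join(current))
            current = []
    if current:  # flush a run that reaches the end of the sentence
        keywords.append(joiner.join(current))
    return keywords


# The tagged output from the example above.
tagged = [("Yonghe", "PLACE"), ("Palace", "PLACE"), ("rating", "NUL"), ("how", "NUL")]
keywords = extract_keywords(tagged)
print(keywords)
```

For unsegmented languages such as Chinese, an empty `joiner` would reassemble the original place name from its segments.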
An embodiment of the application also provides a model training device, which can be used to perform the model training method of the embodiment. As shown in Fig. 3, the device includes a first acquisition unit 301, a determining unit 303, and a training unit 305.
The first acquisition unit 301 is used to obtain text information with part-of-speech tags, where the text information includes multiple sentences and each word in every sentence carries a part-of-speech tag of the corresponding part-of-speech type.
The text information with part-of-speech tags in this embodiment can be a sample of text information collected in advance, in which the words of interest in every sentence have been annotated manually with their part of speech. There can be one category of words of interest or several, for example place nouns and person nouns. One possible annotation scheme tags a place of interest as PLACE and an invalid word as NUL (empty). For example, the sentence "I want to go to Yonghe Palace to play" might be segmented into "I want / go / Yonghe / Palace / play" and, after manual annotation, become "I want NUL, go NUL, Yonghe PLACE, Palace PLACE, play NUL".
The determining unit 303 is used to determine a word vector for each word in every sentence, the word vector being a multidimensional array that uniquely represents the corresponding word.
After the text information with part-of-speech tags is obtained, the word vector of each word in every sentence of the text is determined. Each word's vector is a multidimensional array, and different words have different word vectors. The word vectors can be defined in advance, so that once the text information has been extracted, the vector of each word is looked up among the predefined word vectors. Alternatively, the word vector of each word can be generated according to a preset word-vector generation rule. Because each word in the text information carries a part-of-speech tag, the word vector of each word is associated with the same part-of-speech tag as the word.
The training unit 305 is used to input, sentence by sentence, the part-of-speech tag and the word vector of each word in every sentence of the text information into a recurrent neural network and train it to obtain a neural network model, where the neural network model is used to tag the words in a sentence.
In this embodiment, after the word vector of each word in the text information has been determined, the sentences of the text information are fed one after another into the recurrent neural network for training. Each sentence fed to the network is replaced by the word vectors of its words; that is, the word vector of each word in the sentence is what is input to the recurrent neural network. Training the recurrent neural network on the extracted text information yields the neural network model.
Because the word vectors of the words are fed into the memory neural network sentence by sentence, the machine can memorize the words in a sentence, their part-of-speech tags, and the way they combine, and it stores them in the parameters of the neural network model (once determined, the parameters of the neural network model are mostly matrices). Compared with the prior-art approach, which relies on the part of speech of the words or on the sentence structure and takes the words with the target part of speech as keywords after segmenting the sentence, this embodiment recognizes the keywords in a text with the trained neural network model, can accurately identify keywords in sentences of various structures, and recognizes keywords with high accuracy.
According to the embodiment of the present application, a text message with part-of-speech marks is obtained, where the text message includes a plurality of sentences and each word in every sentence carries a part-of-speech mark of the corresponding part-of-speech type; the term vector of each word in every sentence is determined, the term vector being a multi-dimensional array that uniquely represents the corresponding word; and, sentence by sentence, the part-of-speech mark and the term vector corresponding to each word in every sentence are input into the recurrent neural network, and training yields the neural network model. The neural network model can then be used to mark the words in a sentence and thereby identify the keywords in it. This solves the technical problem in the prior art that keywords in sentences are recognized with poor accuracy, and achieves the effect of improving the accuracy of keyword recognition.
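A minimal sketch of such training is given below. The application does not specify the network variant, so this assumes a simple Elman-style recurrent network trained by backpropagation through time, with the marks used as per-word training targets. The class name `TinyRNNTagger`, the layer sizes, the learning rate and the one-hot sample data are all hypothetical.

```python
import math
import random


def matvec(M, v):
    """Multiply matrix M (a list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def softmax(z):
    top = max(z)
    exps = [math.exp(x - top) for x in z]
    total = sum(exps)
    return [e / total for e in exps]


class TinyRNNTagger:
    """Elman-style recurrent tagger; every parameter is a plain matrix."""

    def __init__(self, dim, hidden, n_tags, seed=0):
        rng = random.Random(seed)

        def init(rows, cols):
            return [[rng.uniform(-0.3, 0.3) for _ in range(cols)] for _ in range(rows)]

        self.hidden = hidden
        self.Wx = init(hidden, dim)     # input -> hidden
        self.Wh = init(hidden, hidden)  # hidden -> hidden, carries sentence context
        self.Wy = init(n_tags, hidden)  # hidden -> mark scores

    def forward(self, xs):
        """Return hidden states (including the initial one) and per-word mark probabilities."""
        hs, probs = [[0.0] * self.hidden], []
        for x in xs:
            pre = [a + b for a, b in zip(matvec(self.Wx, x), matvec(self.Wh, hs[-1]))]
            hs.append([math.tanh(p) for p in pre])
            probs.append(softmax(matvec(self.Wy, hs[-1])))
        return hs, probs

    def train_step(self, xs, tags, lr=0.05):
        """One gradient-descent step on one sentence; returns the loss before the update."""
        hs, probs = self.forward(xs)
        loss = -sum(math.log(p[t]) for p, t in zip(probs, tags))
        H = self.hidden
        gWx = [[0.0] * len(xs[0]) for _ in range(H)]
        gWh = [[0.0] * H for _ in range(H)]
        gWy = [[0.0] * H for _ in range(len(self.Wy))]
        dh_next = [0.0] * H
        for t in reversed(range(len(xs))):
            dy = list(probs[t])
            dy[tags[t]] -= 1.0  # gradient of softmax + cross-entropy
            h, h_prev = hs[t + 1], hs[t]
            for k, d in enumerate(dy):
                for j in range(H):
                    gWy[k][j] += d * h[j]
            dh = [sum(self.Wy[k][j] * dy[k] for k in range(len(dy))) + dh_next[j]
                  for j in range(H)]
            dz = [(1.0 - h[j] ** 2) * dh[j] for j in range(H)]  # through tanh
            for i in range(H):
                for j, xj in enumerate(xs[t]):
                    gWx[i][j] += dz[i] * xj
                for j in range(H):
                    gWh[i][j] += dz[i] * h_prev[j]
            dh_next = [sum(self.Wh[i][j] * dz[i] for i in range(H)) for j in range(H)]
        # Apply the accumulated gradients only after the backward pass is complete.
        for W, g in ((self.Wx, gWx), (self.Wh, gWh), (self.Wy, gWy)):
            for row, grow in zip(W, g):
                for j in range(len(row)):
                    row[j] -= lr * grow[j]
        return loss

    def predict(self, xs):
        """Return the index of the most probable mark for each word."""
        return [max(range(len(p)), key=p.__getitem__) for p in self.forward(xs)[1]]


# Hypothetical training sentence: five one-hot term vectors; marks 0 = NUL, 1 = PLACE.
xs = [[1.0 if i == j else 0.0 for j in range(5)] for i in range(5)]
tags = [0, 0, 1, 1, 0]
model = TinyRNNTagger(dim=5, hidden=6, n_tags=2)
losses = [model.train_step(xs, tags) for _ in range(300)]
```

The three matrices `Wx`, `Wh` and `Wy` are the only state the model keeps, which matches the description of the parameters as mostly matrices. A practical system would train on many sentences and would likely use an established deep learning library instead of hand-written gradients.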
Preferably, the training unit includes: a word segmentation module, configured to perform word segmentation on every sentence in the text message to obtain the word set of the text message; and a query module, configured to look up the term vector corresponding to each word in the word set.
In the present embodiment, the term vectors of words are generated in advance to form a term vector set. After the text message serving as the sample has been collected, the term vector corresponding to each word in every sentence of the text message is looked up in the pre-generated term vector set. The word segmentation of every sentence of the text message can be performed with a word segmentation tool according to certain rules. For example, the sentence "I want to go to Yonghe Palace to play" is split by segmentation into the words "I want", "go", "Yonghe", "Palace" and "play".
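The application does not name a particular word segmentation tool or rule set. As one possible illustration, the sketch below uses dictionary-based forward maximum matching, a common rule for text written without spaces. The dictionary and the unspaced sample string are hypothetical.

```python
def forward_max_match(text, dictionary, max_len=None):
    """Greedy longest-match segmentation.

    At each position, take the longest dictionary word that starts there;
    a character that starts no dictionary word becomes a one-character token.
    """
    if max_len is None:
        max_len = max((len(w) for w in dictionary), default=1)
    tokens, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in dictionary:
                tokens.append(piece)
                i += size
                break
    return tokens


# Hypothetical dictionary and unspaced input, for illustration only.
dictionary = {"go", "to", "yonghe", "palace", "play"}
tokens = forward_max_match("gotoyonghepalace", dictionary)
unknown = forward_max_match("xgo", dictionary)
```

Here `tokens` is `["go", "to", "yonghe", "palace"]`, and the unknown character in `"xgo"` is emitted as its own token.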
Preferably, the model training apparatus further includes: a second acquisition unit, configured to obtain text messages of a preset data amount, before the term vector of each word in every sentence is determined, to obtain a text message set; and a generation unit, configured to generate, by machine learning, the term vector corresponding to each word in the text message set to obtain a term vector set; where the query module is specifically configured to look up, in the term vector set, the term vector corresponding to each word in the word set.
In the present embodiment, the term vector set is generated before the term vectors of words are determined. Specifically, a large number of text messages are first obtained; the preset data amount can be a relatively large amount set in advance. The obtained text messages of the preset data amount serve as the text message set for training term vectors, and the term vector corresponding to each word in the set is then generated by machine learning to obtain the term vector set. In this way, when the term vectors of the words in the sample text message are to be determined, they can be looked up directly in the term vector set.
The machine learning approach can use Google's word2vec to train the term vectors: from the input text, it generates for each word a unique vector of the same dimensionality, i.e. a multi-dimensional array, and the dimensionality of the array can be user-defined. For example, "happiness" might be represented as [0, 1, 0, ...].
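To make the idea of a term vector set concrete, the sketch below assigns each distinct word a unique vector of equal dimension. It deliberately uses one-hot vectors as a stand-in; word2vec itself learns dense vectors from context, and this code does not reproduce it. The function name and the corpus are hypothetical.

```python
def build_term_vector_set(corpus):
    """Assign every distinct word a unique one-hot vector of equal dimension.

    One-hot encoding is a simple stand-in here, not the word2vec algorithm.
    """
    vocab = sorted({word for sentence in corpus for word in sentence})
    dim = len(vocab)
    return {
        word: [1 if i == j else 0 for j in range(dim)]
        for i, word in enumerate(vocab)
    }


# Hypothetical segmented corpus, for illustration only.
corpus = [["happiness", "is", "good"], ["good", "day"]]
term_vector_set = build_term_vector_set(corpus)
```

With this corpus the sorted vocabulary is `day`, `good`, `happiness`, `is`, so `term_vector_set["happiness"]` is `[0, 0, 1, 0]`. Once built, the set can be queried repeatedly without retraining.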
Preferably, in every sentence of the text message the keywords are marked with a first preset mark and the other words are marked with a second preset mark, so that when the neural network model is used to recognize words, it marks the keywords with the first preset mark.
In the present embodiment, in the sentences of the text message used as training samples, the keywords of interest are marked with the first preset mark and the other, invalid words are marked with the second preset mark. During model training, the neural network model memorizes these marks. Therefore, when the neural network model obtained by training is used to recognize the keywords in a sentence, its output marks the keywords with the first preset mark and the other, invalid words with the second preset mark.
For example, suppose the words of interest are words that denote a place. A place is marked PLACE, and an invalid word is marked NUL (null). After the sentence "I want to go to Yonghe Palace to play" has been segmented and marked, it becomes "I want NUL go NUL Yonghe PLACE Palace PLACE play NUL".
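The marking of a training sentence can be sketched as below. The PLACE and NUL marks come from the example above; the function name and the English renderings of the segmented words are assumptions made for illustration.

```python
PLACE, NUL = "PLACE", "NUL"


def label_sentence(words, keywords):
    """Pair each word with PLACE if it is a keyword of interest, otherwise NUL."""
    return [(word, PLACE if word in keywords else NUL) for word in words]


# English renderings of the segmented example sentence (assumed tokens).
words = ["I want", "go", "Yonghe", "Palace", "play"]
labeled = label_sentence(words, keywords={"Yonghe", "Palace"})
flat = " ".join(f"{word} {mark}" for word, mark in labeled)
```

`flat` reproduces the marked form of the sentence, with each word followed by its mark.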
The model training apparatus includes a processor and a memory. The above first acquisition unit 301, determining unit 303, training unit 305 and so on are stored in the memory as program units, and the processor executes these program units stored in the memory.
The processor contains a kernel, and the kernel retrieves the corresponding program unit from the memory. One or more kernels can be provided. The neural network model, which is used to identify keywords in sentences, is obtained through training by adjusting kernel parameters.
The memory may include forms of computer-readable media such as volatile memory, random access memory (RAM) and/or non-volatile memory, for example read-only memory (ROM) or flash memory (flash RAM). The memory includes at least one memory chip.
The present application also provides an embodiment of a computer program product which, when executed on a data processing device, is adapted to execute program code that initializes the following method steps: obtaining a text message with part-of-speech marks, where the text message includes a plurality of sentences and each word in every sentence carries a part-of-speech mark of the corresponding part-of-speech type; determining the term vector of each word in every sentence, the term vector being a multi-dimensional array that uniquely represents the corresponding word; and, sentence by sentence in the text message, inputting the part-of-speech mark and the term vector corresponding to each word in every sentence into the recurrent neural network and training to obtain the neural network model, where the neural network model is used to mark the words in a sentence.
An embodiment of the present application also provides a keyword identifying device, which can be used to perform the keyword recognition method of the embodiments of the present application. As shown in Fig. 4, the device includes a vector determination unit 401 and an indexing unit 403.
The vector determination unit 401 is configured to perform word segmentation on the text to be measured and to determine the term vector corresponding to each word.
In the present embodiment, the word segmentation of the text to be measured and the determination of the term vectors are performed in the same way as described for the model training method in the above embodiments of the present application, and are not repeated here.
The indexing unit 403 is configured to input, sentence by sentence in the text to be measured, the term vector corresponding to each word in every sentence into the neural network model, and to mark the keywords in the text to be measured using the neural network model.
The neural network model in the present embodiment is the neural network model obtained by training with the model training method of the above embodiments of the present application.
Sentence by sentence, the term vectors corresponding to the words of the text to be measured are input into the neural network model, which identifies the keywords in the text to be measured and marks them. Specifically, the text to be measured is obtained and segmented, each word is represented by its term vector, and the term vectors are input into the neural network model sentence by sentence. This yields a part-of-speech mark for each word, from which the words carrying the part-of-speech mark of interest can be obtained.
Because the term vectors of the words are input into the recurrent neural network sentence by sentence, the machine can memorize the words in a sentence, their part-of-speech marks and the ways in which they combine, and it stores these in the parameters of the neural network model (the parameters are determined by the model, and most of them are matrices). In the prior art, a sentence is segmented and the words of a target part of speech are then picked out as keywords on the basis of the part of speech of each word or of the sentence structure. By contrast, the present embodiment recognizes the keywords in a text with the neural network model obtained by training and works sentence by sentence, so keywords can be identified accurately in sentences of various structures, and the accuracy of keyword recognition is high.
For example, the sentence "How are the reviews of Yonghe Palace" is segmented into words, and the result computed by the neural network model is "Yonghe PLACE Palace PLACE reviews NUL how NUL". By screening this result, the place noun of interest can be obtained: Yonghe Palace.
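The screening step can be sketched as follows. The application only says that the result is screened; joining adjacent words that carry the same mark into one keyword is an assumption made here so that a place name split across two words is recovered whole. The function name and the English tokens are hypothetical.

```python
def extract_keywords(tagged, target="PLACE", joiner=" "):
    """Collect runs of adjacent words carrying the target mark as single keywords."""
    keywords, current = [], []
    for word, mark in tagged:
        if mark == target:
            current.append(word)
        elif current:
            keywords.append(joiner.join(current))
            current = []
    if current:  # a run that reaches the end of the sentence
        keywords.append(joiner.join(current))
    return keywords


# Model output for the example sentence, using assumed English tokens.
tagged = [("Yonghe", "PLACE"), ("Palace", "PLACE"), ("reviews", "NUL"), ("how", "NUL")]
places = extract_keywords(tagged)
```

For text written without spaces, such as Chinese, `joiner=""` would rejoin the pieces directly.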
The keyword identifying device includes a processor and a memory. The above vector determination unit 401, indexing unit 403 and so on are stored in the memory as program units, and the processor executes these program units stored in the memory. All of the above can be stored in the memory.
The processor contains a kernel, and the kernel retrieves the corresponding program unit from the memory. One or more kernels can be provided. By adjusting kernel parameters, the neural network model is used to identify the keywords in the text to be measured.
The memory may include forms of computer-readable media such as volatile memory, random access memory (RAM) and/or non-volatile memory, for example read-only memory (ROM) or flash memory (flash RAM). The memory includes at least one memory chip.
The present application also provides an embodiment of a computer program product which, when executed on a data processing device, is adapted to execute program code that initializes the following method steps: performing word segmentation on the text to be measured and determining the term vector corresponding to each word; and, sentence by sentence in the text to be measured, inputting the term vector corresponding to each word in every sentence into the neural network model, and marking the keywords in the text to be measured using the neural network model.
The serial numbers of the above embodiments of the present application are for description only and do not indicate the relative merits of the embodiments.
In the above embodiments of the present application, the description of each embodiment has its own emphasis. For a part that is not described in detail in one embodiment, refer to the related description of the other embodiments.
In the several embodiments provided in the present application, it should be understood that the disclosed technical content can be implemented in other ways. The device embodiments described above are merely illustrative. For example, the division into units may be a division by logical function, and other divisions are possible in an actual implementation: multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the mutual coupling, direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, units or modules, and may be electrical or take other forms.
The units described as separate parts may or may not be physically separate, and the parts shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, the functional units in the embodiments of the present application may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit can be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and is sold or used as an independent product, it can be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application in essence, or the part that contributes to the prior art, or all or part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device or the like) to perform all or some of the steps of the methods described in the embodiments of the present application. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, read-only memory (ROM, Read-Only Memory), random access memory (RAM, Random Access Memory), a removable hard disk, a magnetic disk or an optical disc.
The above are only preferred embodiments of the present application. It should be noted that a person of ordinary skill in the art can make several improvements and refinements without departing from the principle of the present application, and these improvements and refinements shall also be regarded as falling within the protection scope of the present application.
Claims (10)
1. A model training method, characterized by comprising:
obtaining a text message with part-of-speech marks, wherein the text message includes a plurality of sentences, and each word in every sentence carries a part-of-speech mark of the corresponding part-of-speech type;
determining the term vector of each word in every sentence, the term vector being a multi-dimensional array that uniquely represents the corresponding word; and
sentence by sentence in the text message, inputting the part-of-speech mark and the term vector corresponding to each word in every sentence into a recurrent neural network, and training to obtain a neural network model, wherein the neural network model is used to mark the words in a sentence.
2. The model training method according to claim 1, characterized in that determining the term vector of each word in every sentence comprises:
performing word segmentation on every sentence in the text message to obtain the word set of the text message; and
looking up the term vector corresponding to each word in the word set.
3. The model training method according to claim 2, characterized in that, before the term vector of each word in every sentence is determined, the model training method further comprises:
obtaining text messages of a preset data amount to obtain a text message set; and
generating, by machine learning, the term vector corresponding to each word in the text message set to obtain a term vector set;
wherein looking up the term vector corresponding to each word in the word set comprises: looking up, in the term vector set, the term vector corresponding to each word in the word set.
4. The model training method according to any one of claims 1 to 3, characterized in that, in every sentence of the text message, the keywords are marked with a first preset mark and the other words are marked with a second preset mark, so that when the neural network model is used to recognize words, the keywords are marked with the first preset mark.
5. A keyword recognition method, characterized by comprising:
performing word segmentation on a text to be measured and determining the term vector corresponding to each word; and
sentence by sentence in the text to be measured, inputting the term vector corresponding to each word in every sentence into the neural network model obtained by training with the model training method according to any one of claims 1 to 4, and marking the keywords in the text to be measured using the neural network model.
6. A model training apparatus, characterized by comprising:
a first acquisition unit, configured to obtain a text message with part-of-speech marks, wherein the text message includes a plurality of sentences, and each word in every sentence carries a part-of-speech mark of the corresponding part-of-speech type;
a determining unit, configured to determine the term vector of each word in every sentence, the term vector being a multi-dimensional array that uniquely represents the corresponding word; and
a training unit, configured to input, sentence by sentence in the text message, the part-of-speech mark and the term vector corresponding to each word in every sentence into a recurrent neural network, and to train to obtain a neural network model, wherein the neural network model is used to mark the words in a sentence.
7. The model training apparatus according to claim 6, characterized in that the training unit comprises:
a word segmentation module, configured to perform word segmentation on every sentence in the text message to obtain the word set of the text message; and
a query module, configured to look up the term vector corresponding to each word in the word set.
8. The model training apparatus according to claim 7, characterized in that the model training apparatus further comprises:
a second acquisition unit, configured to obtain text messages of a preset data amount, before the term vector of each word in every sentence is determined, to obtain a text message set; and
a generation unit, configured to generate, by machine learning, the term vector corresponding to each word in the text message set to obtain a term vector set;
wherein the query module is specifically configured to look up, in the term vector set, the term vector corresponding to each word in the word set.
9. The model training apparatus according to any one of claims 6 to 8, characterized in that, in every sentence of the text message, the keywords are marked with a first preset mark and the other words are marked with a second preset mark, so that when the neural network model is used to recognize words, the keywords are marked with the first preset mark.
10. A keyword identifying device, characterized by comprising:
a vector determination unit, configured to perform word segmentation on a text to be measured and to determine the term vector corresponding to each word; and
an indexing unit, configured to input, sentence by sentence in the text to be measured, the term vector corresponding to each word in every sentence into the neural network model obtained by training with the model training method according to any one of claims 1 to 4, and to mark the keywords in the text to be measured using the neural network model.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510850285.2A CN106815194A (en) | 2015-11-27 | 2015-11-27 | Model training method and device and keyword recognition method and device |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510850285.2A CN106815194A (en) | 2015-11-27 | 2015-11-27 | Model training method and device and keyword recognition method and device |
Publications (1)
Publication Number | Publication Date |
---|---|
CN106815194A true CN106815194A (en) | 2017-06-09 |
Family
ID=59155406
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201510850285.2A Pending CN106815194A (en) | 2015-11-27 | 2015-11-27 | Model training method and device and keyword recognition method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106815194A (en) |
Cited By (23)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107291693A (en) * | 2017-06-15 | 2017-10-24 | 广州赫炎大数据科技有限公司 | A kind of semantic computation method for improving term vector model |
CN107608970A (en) * | 2017-09-29 | 2018-01-19 | 百度在线网络技术(北京)有限公司 | part-of-speech tagging model generating method and device |
CN107943525A (en) * | 2017-11-17 | 2018-04-20 | 魏茨怡 | A kind of mobile phone app interactive modes based on Recognition with Recurrent Neural Network |
CN108268443A (en) * | 2017-12-21 | 2018-07-10 | 北京百度网讯科技有限公司 | It determines the transfer of topic point and obtains the method, apparatus for replying text |
CN109145282A (en) * | 2017-06-16 | 2019-01-04 | 贵州小爱机器人科技有限公司 | Punctuate model training method, punctuate method, apparatus and computer equipment |
CN109241330A (en) * | 2018-08-20 | 2019-01-18 | 北京百度网讯科技有限公司 | The method, apparatus, equipment and medium of key phrase in audio for identification |
CN109325226A (en) * | 2018-09-10 | 2019-02-12 | 广州杰赛科技股份有限公司 | Term extraction method, apparatus and storage medium based on deep learning network |
CN109344246A (en) * | 2018-09-25 | 2019-02-15 | 平安科技(深圳)有限公司 | A kind of electric questionnaire generation method, computer readable storage medium and terminal device |
CN109344830A (en) * | 2018-08-17 | 2019-02-15 | 平安科技(深圳)有限公司 | Sentence output, model training method, device, computer equipment and storage medium |
CN109783603A (en) * | 2018-12-13 | 2019-05-21 | 平安科技(深圳)有限公司 | Based on document creation method, device, terminal and the medium from coding neural network |
CN109857847A (en) * | 2019-01-15 | 2019-06-07 | 北京搜狗科技发展有限公司 | A kind of data processing method, device and the device for data processing |
CN110019831A (en) * | 2017-09-29 | 2019-07-16 | 北京国双科技有限公司 | A kind of analysis method and device of product attribute |
CN110472198A (en) * | 2018-05-10 | 2019-11-19 | 腾讯科技(深圳)有限公司 | A kind of determination method of keyword, the method for text-processing and server |
WO2019228016A1 (en) * | 2018-05-31 | 2019-12-05 | 阿里巴巴集团控股有限公司 | Intelligent writing method and apparatus |
CN110969018A (en) * | 2018-09-30 | 2020-04-07 | 北京国双科技有限公司 | Case description element extraction method, machine learning model acquisition method and device |
CN111126066A (en) * | 2019-12-13 | 2020-05-08 | 智慧神州(北京)科技有限公司 | Method and device for determining Chinese retrieval method based on neural network |
CN111178067A (en) * | 2019-12-19 | 2020-05-19 | 北京明略软件***有限公司 | Information acquisition model generation method and device and information acquisition method and device |
CN111291570A (en) * | 2018-12-07 | 2020-06-16 | 北京国双科技有限公司 | Method and device for realizing element identification in judicial documents |
CN111813896A (en) * | 2020-07-13 | 2020-10-23 | 重庆紫光华山智安科技有限公司 | Text triple relation identification method and device, training method and electronic equipment |
WO2020232898A1 (en) * | 2019-05-23 | 2020-11-26 | 平安科技(深圳)有限公司 | Text classification method and apparatus, electronic device and computer non-volatile readable storage medium |
CN112035660A (en) * | 2020-08-14 | 2020-12-04 | 海尔优家智能科技(北京)有限公司 | Object class determination method and device based on network model |
CN112735413A (en) * | 2020-12-25 | 2021-04-30 | 浙江大华技术股份有限公司 | Instruction analysis method based on camera device, electronic equipment and storage medium |
CN116610804A (en) * | 2023-07-19 | 2023-08-18 | 深圳须弥云图空间科技有限公司 | Text recall method and system for improving recognition of small sample category |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104573046A (en) * | 2015-01-20 | 2015-04-29 | 成都品果科技有限公司 | Comment analyzing method and system based on term vector |
CN104615589A (en) * | 2015-02-15 | 2015-05-13 | 百度在线网络技术(北京)有限公司 | Named-entity recognition model training method and named-entity recognition method and device |
CN104899304A (en) * | 2015-06-12 | 2015-09-09 | 北京京东尚科信息技术有限公司 | Named entity identification method and device |
Non-Patent Citations (1)
Title |
---|
张建保,等: "《中国生物医学工程进展 下》", 30 April 2007 * |
Cited By (36)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107291693A (en) * | 2017-06-15 | 2017-10-24 | 广州赫炎大数据科技有限公司 | A kind of semantic computation method for improving term vector model |
CN109145282A (en) * | 2017-06-16 | 2019-01-04 | 贵州小爱机器人科技有限公司 | Punctuate model training method, punctuate method, apparatus and computer equipment |
CN109145282B (en) * | 2017-06-16 | 2023-11-07 | 贵州小爱机器人科技有限公司 | Sentence-breaking model training method, sentence-breaking device and computer equipment |
CN110019831B (en) * | 2017-09-29 | 2021-09-07 | 北京国双科技有限公司 | Product attribute analysis method and device |
CN107608970A (en) * | 2017-09-29 | 2018-01-19 | 百度在线网络技术(北京)有限公司 | part-of-speech tagging model generating method and device |
CN107608970B (en) * | 2017-09-29 | 2024-04-26 | 百度在线网络技术(北京)有限公司 | Part-of-speech tagging model generation method and device |
CN110019831A (en) * | 2017-09-29 | 2019-07-16 | 北京国双科技有限公司 | A kind of analysis method and device of product attribute |
CN107943525A (en) * | 2017-11-17 | 2018-04-20 | 魏茨怡 | A kind of mobile phone app interactive modes based on Recognition with Recurrent Neural Network |
CN108268443A (en) * | 2017-12-21 | 2018-07-10 | 北京百度网讯科技有限公司 | It determines the transfer of topic point and obtains the method, apparatus for replying text |
CN110472198A (en) * | 2018-05-10 | 2019-11-19 | 腾讯科技(深圳)有限公司 | A kind of determination method of keyword, the method for text-processing and server |
WO2019228016A1 (en) * | 2018-05-31 | 2019-12-05 | 阿里巴巴集团控股有限公司 | Intelligent writing method and apparatus |
CN109344830A (en) * | 2018-08-17 | 2019-02-15 | 平安科技(深圳)有限公司 | Sentence output, model training method, device, computer equipment and storage medium |
KR102316063B1 (en) | 2018-08-20 | 2021-10-22 | 베이징 바이두 넷컴 사이언스 앤 테크놀로지 코., 엘티디. | Method and apparatus for identifying key phrase in audio data, device and medium |
US11308937B2 (en) * | 2018-08-20 | 2022-04-19 | Beijing Baidu Netcom Science And Technology Co., Ltd. | Method and apparatus for identifying key phrase in audio, device and medium |
EP3614378A1 (en) * | 2018-08-20 | 2020-02-26 | Beijing Baidu Netcom Science and Technology Co., Ltd. | Method and apparatus for identifying key phrase in audio, device and medium |
KR20200021429A (en) * | 2018-08-20 | 2020-02-28 | 베이징 바이두 넷컴 사이언스 앤 테크놀로지 코., 엘티디. | Method and apparatus for identifying key phrase in audio data, device and medium |
CN109241330A (en) * | 2018-08-20 | 2019-01-18 | 北京百度网讯科技有限公司 | The method, apparatus, equipment and medium of key phrase in audio for identification |
CN109325226A (en) * | 2018-09-10 | 2019-02-12 | 广州杰赛科技股份有限公司 | Term extraction method, apparatus and storage medium based on deep learning network |
CN109344246B (en) * | 2018-09-25 | 2024-01-05 | 平安科技(深圳)有限公司 | Electronic questionnaire generating method, computer readable storage medium and terminal device |
CN109344246A (en) * | 2018-09-25 | 2019-02-15 | 平安科技(深圳)有限公司 | A kind of electric questionnaire generation method, computer readable storage medium and terminal device |
CN110969018A (en) * | 2018-09-30 | 2020-04-07 | 北京国双科技有限公司 | Case description element extraction method, machine learning model acquisition method and device |
CN111291570A (en) * | 2018-12-07 | 2020-06-16 | 北京国双科技有限公司 | Method and device for realizing element identification in judicial documents |
CN109783603B (en) * | 2018-12-13 | 2023-05-26 | 平安科技(深圳)有限公司 | Text generation method, device, terminal and medium based on self-coding neural network |
CN109783603A (en) * | 2018-12-13 | 2019-05-21 | 平安科技(深圳)有限公司 | Based on document creation method, device, terminal and the medium from coding neural network |
CN109857847A (en) * | 2019-01-15 | 2019-06-07 | 北京搜狗科技发展有限公司 | A kind of data processing method, device and the device for data processing |
WO2020232898A1 (en) * | 2019-05-23 | 2020-11-26 | 平安科技(深圳)有限公司 | Text classification method and apparatus, electronic device and computer non-volatile readable storage medium |
CN111126066B (en) * | 2019-12-13 | 2023-05-02 | 北京因特睿软件有限公司 | Method and device for determining Chinese congratulation technique based on neural network |
CN111126066A (en) * | 2019-12-13 | 2020-05-08 | 智慧神州(北京)科技有限公司 | Method and device for determining Chinese retrieval method based on neural network |
CN111178067B (en) * | 2019-12-19 | 2023-05-26 | 北京明略软件***有限公司 | Information acquisition model generation method and device and information acquisition method and device |
CN111178067A (en) * | 2019-12-19 | 2020-05-19 | 北京明略软件***有限公司 | Information acquisition model generation method and device and information acquisition method and device |
CN111813896A (en) * | 2020-07-13 | 2020-10-23 | 重庆紫光华山智安科技有限公司 | Text triple relation identification method and device, training method and electronic equipment |
CN112035660A (en) * | 2020-08-14 | 2020-12-04 | 海尔优家智能科技(北京)有限公司 | Object class determination method and device based on network model |
CN112735413A (en) * | 2020-12-25 | 2021-04-30 | 浙江大华技术股份有限公司 | Instruction analysis method based on camera device, electronic equipment and storage medium |
CN112735413B (en) * | 2020-12-25 | 2024-05-31 | 浙江大华技术股份有限公司 | Instruction analysis method based on camera device, electronic equipment and storage medium |
CN116610804A (en) * | 2023-07-19 | 2023-08-18 | 深圳须弥云图空间科技有限公司 | Text recall method and system for improving recognition of small sample category |
CN116610804B (en) * | 2023-07-19 | 2024-01-05 | 深圳须弥云图空间科技有限公司 | Text recall method and system for improving recognition of small sample category |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN106815194A (en) | Model training method and device and keyword recognition method and device | |
CN106815192B (en) | Model training method and device and sentence emotion recognition method and device | |
CN110175325B (en) | Comment analysis method based on word vector and syntactic characteristics and visual interaction interface | |
CN107633007B (en) | Commodity comment data tagging system and method based on hierarchical AP clustering | |
CN109299258B (en) | Public opinion event detection method, device and equipment | |
CN109829155A (en) | Determination method, automatic scoring method, apparatus, equipment and the medium of keyword | |
CN106815198A (en) | The recognition methods of model training method and device and sentence type of service and device | |
CN106202030B (en) | Rapid sequence labeling method and device based on heterogeneous labeling data | |
CN104809142A (en) | Trademark inquiring system and method | |
CN110674312B (en) | Method, device and medium for constructing knowledge graph and electronic equipment | |
CN106815193A (en) | Model training method and device and wrong word recognition methods and device | |
CN109299481A (en) | MT engine recommended method, device and electronic equipment | |
CN112035675A (en) | Medical text labeling method, device, equipment and storage medium | |
CN110610193A (en) | Method and device for processing labeled data | |
CN108549723B (en) | Text concept classification method and device and server | |
CN107301411B (en) | Mathematical formula identification method and device | |
CN106557463A (en) | Sentiment analysis method and device | |
CN105095196B (en) | The method and apparatus of new word discovery in text | |
JP4600045B2 (en) | Opinion extraction learning device and opinion extraction classification device | |
CN112256845A (en) | Intention recognition method, device, electronic equipment and computer readable storage medium | |
CN110321549B (en) | New concept mining method based on sequential learning, relation mining and time sequence analysis | |
US20160350264A1 (en) | Server and method for extracting content for commodity | |
CN111522901A (en) | Method and device for processing address information in text | |
CN107203558A (en) | Object recommendation method and apparatus, recommendation information treating method and apparatus | |
CN112613321A (en) | Method and system for extracting entity attribute information in text |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |
| | CB02 | Change of applicant information | Address after: 100083 No. 401, 4th Floor, Haitai Building, 229 North Fourth Ring Road, Haidian District, Beijing; Applicant after: Beijing Guoshuang Technology Co.,Ltd.; Address before: 100086 Cuigong Hotel, 76 Zhichun Road, Shuangyushu District, Haidian District, Beijing; Applicant before: Beijing Guoshuang Technology Co.,Ltd. |
| | RJ01 | Rejection of invention patent application after publication | Application publication date: 20170609 |