CN106815194A - Model training method and device and keyword recognition method and device - Google Patents

Model training method and device and keyword recognition method and device

Info

Publication number
CN106815194A
CN106815194A (application number CN201510850285.2A)
Authority
CN
China
Prior art keywords
word
sentence
word vector
text information
neural network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201510850285.2A
Other languages
Chinese (zh)
Inventor
刘粉香
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Gridsum Technology Co Ltd
Original Assignee
Beijing Gridsum Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Gridsum Technology Co Ltd
Priority to CN201510850285.2A
Publication of CN106815194A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00: Handling natural language data
    • G06F40/20: Natural language analysis
    • G06F40/279: Recognition of textual entities
    • G06F40/289: Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/30: Semantic analysis
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30: Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/31: Indexing; Data structures therefor; Storage structures
    • G06F16/313: Selection or weighting of terms for indexing

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • Machine Translation (AREA)

Abstract

This application discloses a model training method and device and a keyword recognition method and device. The model training method includes: obtaining text information carrying part-of-speech tags, where the text information includes a plurality of sentences and each word in each sentence carries a part-of-speech tag of the corresponding part-of-speech type; determining a word vector for each word in each sentence, the word vector being a multidimensional array that uniquely represents the corresponding word; and, taking each sentence in the text information as a unit, inputting the part-of-speech tag and the corresponding word vector of every word in the sentence into a recurrent neural network and training to obtain a neural network model, where the neural network model is used to tag the words in a sentence. The application addresses the prior-art technical problem of poor accuracy in recognizing keywords in sentences.

Description

Model training method and device and keyword recognition method and device
Technical field
The present application relates to the field of text processing, and in particular to a model training method and device and a keyword recognition method and device.
Background technology
A sentence in a text usually contains the keywords that the sentence is meant to express. For example, in the user statement "I have been a bit tired recently and want to go to Yonghe Palace Temple to play", the place name "Yonghe Palace Temple" is the keyword it contains. A computer system, however, cannot pick out such keywords as accurately as a person can. Existing computer systems usually identify keywords based on the part of speech or the sentence structure of the words in a sentence: the sentence is segmented into words and the words of a target part of speech are taken as keywords. This approach depends heavily on the segmentation tool. Although it is effective for extracting words of a single part of speech, its recognition accuracy is poor when keywords appear across multiple parts of speech and when natural language contains complicated, novel or unstructured sentence patterns and new vocabulary.
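For illustration, the conventional pipeline criticized above (segment the sentence with an off-the-shelf tool, then keep the words of a target part of speech) can be sketched as follows. This is only an assumed example: the patent names no specific segmenter, and jieba, its part-of-speech flag "ns" (place noun) and the back-translated Chinese sentence are illustrative choices.

```python
# Sketch of the prior-art approach: segment, then filter by part-of-speech tag.
# jieba stands in for the unnamed segmentation tool; "ns" is jieba's flag for
# place nouns. The Chinese sentence is a back-translation of the example above.
import jieba.posseg as pseg

def keywords_by_pos(sentence, target_flags=("ns",)):
    """Return the words whose part-of-speech flag is in target_flags."""
    return [word for word, flag in pseg.cut(sentence) if flag in target_flags]

print(keywords_by_pos("我想去雍和宫玩"))  # expected to yield something like ['雍和宫']
```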
No effective solution to the above problem has yet been proposed.
Summary of the invention
The embodiments of the present application provide a model training method and device and a keyword recognition method and device, so as to at least solve the prior-art technical problem of poor accuracy in recognizing keywords in sentences.
According to one aspect of the embodiments of the present application, a model training method is provided, including: obtaining text information carrying part-of-speech tags, where the text information includes a plurality of sentences and each word in each sentence carries a part-of-speech tag of the corresponding part-of-speech type; determining a word vector for each word in each sentence, the word vector being a multidimensional array that uniquely represents the corresponding word; and, taking each sentence in the text information as a unit, inputting the part-of-speech tag and the corresponding word vector of every word in the sentence into a recurrent neural network and training to obtain a neural network model, where the neural network model is used to tag the words in a sentence.
Further, determining the word vector of each word in each sentence includes: performing word segmentation on each sentence in the text information to obtain the word set of the text information; and looking up the word vector corresponding to each word in the word set.
Further, before the word vector of each word in each sentence is determined, the model training method also includes: obtaining text information of a preset data amount to obtain a text information set; and generating, by machine learning, the word vector corresponding to each word in the text information set to obtain a word vector set. Looking up the word vector corresponding to each word in the word set then includes: looking up, in the word vector set, the word vector corresponding to each word in the word set.
Further, the keywords in each sentence of the text information are labeled with a first preset tag and the other words are labeled with a second preset tag, so that when words are recognized with the neural network model, the keywords are tagged with the first preset tag.
According to another aspect of the embodiments of the present application, a keyword recognition method is also provided, including: performing word segmentation on the text to be recognized and determining the word vector corresponding to each word; and, taking each sentence in the text to be recognized as a unit, inputting the word vector corresponding to every word in the sentence into a neural network model and marking the keywords in the text to be recognized with the neural network model.
According to another aspect of the embodiments of the present application, a model training device is also provided, including: a first obtaining unit, configured to obtain text information carrying part-of-speech tags, where the text information includes a plurality of sentences and each word in each sentence carries a part-of-speech tag of the corresponding part-of-speech type; a determining unit, configured to determine the word vector of each word in each sentence, the word vector being a multidimensional array that uniquely represents the corresponding word; and a training unit, configured to take each sentence in the text information as a unit, input the part-of-speech tag and the corresponding word vector of every word in the sentence into a recurrent neural network, and train to obtain a neural network model, where the neural network model is used to tag the words in a sentence.
Further, the training unit includes: a word segmentation module, configured to perform word segmentation on each sentence in the text information to obtain the word set of the text information; and a query module, configured to look up the word vector corresponding to each word in the word set.
Further, the model training device also includes: a second obtaining unit, configured to obtain text information of a preset data amount before the word vector of each word in each sentence is determined, to obtain a text information set; and a generating unit, configured to generate, by machine learning, the word vector corresponding to each word in the text information set, to obtain a word vector set. The query module is specifically configured to look up, in the word vector set, the word vector corresponding to each word in the word set.
Further, the keywords in each sentence of the text information are labeled with a first preset tag and the other words are labeled with a second preset tag, so that when words are recognized with the neural network model, the keywords are tagged with the first preset tag.
According to another aspect of the embodiments of the present application, a keyword recognition device is also provided, including: a vector determining unit, configured to perform word segmentation on the text to be recognized and determine the word vector corresponding to each word; and a tagging unit, configured to take each sentence in the text to be recognized as a unit, input the word vector corresponding to every word in the sentence into a neural network model, and mark the keywords in the text to be recognized with the neural network model.
In the embodiments of the present application, text information carrying part-of-speech tags is obtained, where the text information includes a plurality of sentences and each word in each sentence carries a part-of-speech tag of the corresponding part-of-speech type; the word vector of each word in each sentence is determined, the word vector being a multidimensional array that uniquely represents the corresponding word; and, taking each sentence in the text information as a unit, the part-of-speech tag and the corresponding word vector of every word in the sentence are input into a recurrent neural network and a neural network model is obtained by training. The neural network model can then be used to tag the words in a sentence and thereby identify the keywords in it. This solves the prior-art technical problem of poor accuracy in recognizing keywords in sentences and improves the accuracy of keyword recognition.
Brief description of the drawings
The drawings described here are provided for a further understanding of the present application and constitute a part of the application. The schematic embodiments of the application and their description are used to explain the application and do not constitute an improper limitation of it. In the drawings:
Fig. 1 is a flow chart of a model training method according to an embodiment of the present application;
Fig. 2 is a flow chart of a keyword recognition method according to an embodiment of the present application;
Fig. 3 is a schematic diagram of a model training device according to an embodiment of the present application;
Fig. 4 is a schematic diagram of a keyword recognition device according to an embodiment of the present application.
Detailed description of the embodiments
To enable those skilled in the art to better understand the solution of the present application, the technical solutions in the embodiments of the application are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, rather than all, of the embodiments of the application. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the application without creative work shall fall within the scope of protection of the application.
It should be noted that the terms "first", "second" and the like in the description, claims and drawings of the present application are used to distinguish similar objects and are not necessarily used to describe a specific order or sequence. It should be understood that data so used may be interchanged where appropriate, so that the embodiments described here can be implemented in orders other than those illustrated or described. In addition, the terms "comprising" and "having" and any variations of them are intended to cover non-exclusive inclusion; for example, a process, method, system, product or device that contains a series of steps or units is not necessarily limited to the steps or units expressly listed, but may include other steps or units that are not expressly listed or that are inherent to the process, method, product or device.
According to the embodiments of the present application, an embodiment of a model training method is provided. It should be noted that the steps shown in the flow chart of the accompanying drawing may be executed in a computer system such as a set of computer-executable instructions, and, although a logical order is shown in the flow chart, in some cases the steps shown or described may be performed in an order different from the one given here.
Fig. 1 is a flow chart of a model training method according to an embodiment of the present application. As shown in Fig. 1, the method includes the following steps.
Step S102: obtain text information carrying part-of-speech tags, where the text information includes a plurality of sentences and each word in each sentence carries a part-of-speech tag of the corresponding part-of-speech type.
The text information carrying part-of-speech tags in this embodiment may be a sample of text information collected in advance, in which the words of interest in every sentence have been labeled manually with their part of speech. There may be one or more part-of-speech categories of interest, for example place nouns, person nouns and so on. One labeling scheme is to mark a place of interest as PLACE and an invalid (non-keyword) word as NUL. For example, the sentence "I want to go to Yonghe Palace Temple to play" may, after segmentation and manual labeling, become "I want/NUL go/NUL Yonghe/PLACE Palace/PLACE play/NUL".
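For concreteness, one labeled training sentence from the example above can be held as a list of (word, tag) pairs. The PLACE/NUL tag names follow the patent's example; the Python layout and the back-translated Chinese tokens are assumptions made only for illustration.

```python
# One manually labeled sentence from the example: words of interest carry
# PLACE, all other words carry NUL. The list-of-pairs layout is an assumed
# convention, not something mandated by the patent.
labeled_sentence = [
    ("我想", "NUL"),    # "I want"
    ("去", "NUL"),      # "go"
    ("雍和", "PLACE"),  # "Yonghe"
    ("宫", "PLACE"),    # "Palace"
    ("玩", "NUL"),      # "play"
]
```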
Step S104: determine the word vector of each word in each sentence, the word vector being a multidimensional array that uniquely represents the corresponding word.
After the text information carrying part-of-speech tags is obtained, the word vector corresponding to each word in every sentence of the text is determined. The word vector of each word is represented by a multidimensional array, and different words have different word vectors. The word vectors may be defined in advance, in which case the vector of each word in the extracted text information is looked up among the predefined word vectors; alternatively, the word vector of each word may be generated according to a preset word-vector generation rule. Since every word in the text information carries a part-of-speech tag, the word vector corresponding to a word is associated with the same part-of-speech tag as the word.
Step S106: taking each sentence in the text information as a unit, input the part-of-speech tag and the corresponding word vector of every word in the sentence into a recurrent neural network and train to obtain a neural network model, where the neural network model is used to tag the words in a sentence.
In this embodiment, after the word vector of every word contained in the text information is determined, the sentences in the text information are input, sentence by sentence, into the recurrent neural network for training. A sentence input into the recurrent neural network is replaced by the word vectors of its words, that is, the word vector corresponding to each word in the sentence is input into the recurrent neural network. The extracted text information is trained by the recurrent neural network to obtain the neural network model.
Because the word vectors of a sentence are input into the recurrent neural network sentence by sentence, the machine can memorize the words in the sentence, their part-of-speech tags and the combinations they form, and store them in the parameters of the neural network model (the parameters are determined by the model and are mostly matrices). Compared with the prior-art approach of segmenting a sentence and then taking the words of a target part of speech as keywords based on the part of speech or sentence structure of the words, this embodiment recognizes the keywords in a text with the trained neural network model, can accurately identify keywords in sentences of various structural types, and achieves high keyword-recognition accuracy.
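A minimal sketch of step S106 is given below, under assumptions the patent does not fix: PyTorch as the framework, a plain single-layer RNN, cross-entropy loss, arbitrary vector and hidden dimensions, and word vectors that have already been looked up for each sentence. The word vectors are the network input and the part-of-speech tags (PLACE/NUL) serve as the per-word training targets.

```python
# Sketch of sentence-by-sentence RNN training for per-word tagging (step S106).
# Framework, layer type, sizes and loss are illustrative assumptions.
import torch
import torch.nn as nn

TAGS = {"NUL": 0, "PLACE": 1}   # second/first preset tags from the text
VEC_DIM, HIDDEN = 100, 64       # illustrative dimensions

class KeywordTagger(nn.Module):
    def __init__(self):
        super().__init__()
        self.rnn = nn.RNN(VEC_DIM, HIDDEN, batch_first=True)
        self.out = nn.Linear(HIDDEN, len(TAGS))

    def forward(self, word_vectors):          # shape (1, seq_len, VEC_DIM)
        hidden_states, _ = self.rnn(word_vectors)
        return self.out(hidden_states)        # shape (1, seq_len, num_tags)

model = KeywordTagger()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

def train(training_data, epochs=5):
    """training_data: list of (word-vector sequence, tag-id sequence) pairs,
    one pair per sentence, as prepared in the optional-mode sketch below."""
    for _ in range(epochs):
        for vectors, tags in training_data:   # the sentence is the unit
            logits = model(vectors.unsqueeze(0)).squeeze(0)
            loss = loss_fn(logits, tags)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
```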
According to this embodiment of the present application, text information carrying part-of-speech tags is obtained, where the text information includes a plurality of sentences and each word in each sentence carries a part-of-speech tag of the corresponding part-of-speech type; the word vector of each word in each sentence is determined, the word vector being a multidimensional array that uniquely represents the corresponding word; and, taking each sentence in the text information as a unit, the part-of-speech tag and the corresponding word vector of every word in the sentence are input into a recurrent neural network and a neural network model is obtained by training. The neural network model can then be used to tag the words in a sentence and thereby identify the keywords, which solves the prior-art technical problem of poor accuracy in recognizing keywords in sentences and improves the accuracy of keyword recognition.
Preferably, determining the word vector of each word in each sentence includes: performing word segmentation on each sentence in the text information to obtain the word set of the text information; and looking up the word vector corresponding to each word in the word set.
In this embodiment, the word vectors of words are generated in advance to form a word vector set. After the text information serving as the sample is collected, the word vector corresponding to each word in every sentence of the text information is looked up in the previously generated word vector set. The word segmentation of each sentence of the text information can be performed with a segmentation tool according to certain rules; for example, "I want to go to Yonghe Palace Temple to play" is split into its constituent words, such as "I want / go / Yonghe / Palace / play".
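The segmentation step can be illustrated as follows; jieba is only a stand-in for the unnamed segmentation tool, and the Chinese sentence is a back-translation of the example.

```python
# Segment a sentence into words; jieba stands in for the segmentation tool
# mentioned (but not named) in the text.
import jieba

tokens = jieba.lcut("我想去雍和宫玩")   # "I want to go to Yonghe Palace Temple to play"
print(tokens)                           # e.g. ['我', '想', '去', '雍和宫', '玩']
```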
Further, before the word vector of each word in each sentence is determined, the model training method also includes: obtaining text information of a preset data amount to obtain a text information set; and generating, by machine learning, the word vector corresponding to each word in the text information set to obtain a word vector set. Looking up the word vector corresponding to each word in the word set then includes: looking up, in the word vector set, the word vector corresponding to each word in the word set.
In this embodiment, the word vector set is generated before the word vectors corresponding to the words are determined. Specifically, a large amount of text information is first obtained, where the preset data amount may be a relatively large, preconfigured amount of data; the obtained text information of the preset data amount is taken as the text information set for training the word vectors, and the word vector corresponding to each word in it is then generated by machine learning to obtain the word vector set. In this way, when the word vector corresponding to a word in the sample text information is determined, it can be looked up directly in the word vector set.
The machine-learning step can train the word vectors with Google's word2vec: given the input text, a unique vector of identical dimension, that is, a multidimensional array, is generated for each word. The dimension of the array can be customized; for example, "happy" might be mapped to [0, 1, 0, ...].
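A word-vector training sketch under assumptions: gensim's Word2Vec (4.x API) stands in for Google's word2vec tool, the corpus is a toy list of segmented sentences, and the 100-dimensional size and window are arbitrary.

```python
# Train word vectors on the segmented corpus so that every word maps to a
# fixed-dimension vector. gensim's Word2Vec is used here as a stand-in for
# Google's word2vec; dimensionality and window size are arbitrary choices.
from gensim.models import Word2Vec

segmented_corpus = [
    ["我想", "去", "雍和", "宫", "玩"],
    ["雍和", "宫", "评价", "怎么样"],
    # ... many more segmented sentences from the large text set
]

w2v = Word2Vec(segmented_corpus, vector_size=100, window=5, min_count=1)
vector = w2v.wv["雍和"]    # the 100-dimensional word vector for "Yonghe"
```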
Preferably, the keywords in each sentence of the text information are labeled with a first preset tag and the other words are labeled with a second preset tag, so that when words are recognized with the neural network model, the keywords are tagged with the first preset tag.
In this embodiment, in the sentences of the text information used as training samples, the keywords of interest are labeled with the first preset tag and the other, invalid words are labeled with the second preset tag. During model training, the trained neural network model memorizes these tags; therefore, when the trained neural network model is used to recognize the keywords in a sentence, its output tags the keywords with the first preset tag and the other, invalid words with the second preset tag.
For example, if the words of interest are those denoting places, a place is marked as PLACE and an invalid word as NUL. The sentence "I want to go to Yonghe Palace Temple to play" is segmented and labeled as "I want/NUL go/NUL Yonghe/PLACE Palace/PLACE play/NUL".
An optional implementation of the model training method of this embodiment of the application includes the following steps.
Step 1: collect a large amount of text information as word-vector training text set 1, used for training the word vectors.
Step 2: segment text set 1 and generate the word vectors by machine learning to obtain the word vector set. The machine learning can train the word vectors with Google's word2vec: given the input text, a unique vector of identical dimension, that is, a multidimensional array, is generated for each word; the dimension of the array can be customized, for example "happy" might be mapped to [0, 1, 0, ...].
Step 3: crawl business-related text information, segment every sentence, and manually attach a part-of-speech tag to each word to form training set 2, the parts of speech being the categories of interest. There may be one or more categories of interest, for example place nouns and person nouns. One labeling scheme is to mark a place of interest as PLACE and an invalid word as NUL; for example, "I want to go to Yonghe Palace Temple to play" becomes, after segmentation and manual labeling, "I want/NUL go/NUL Yonghe/PLACE Palace/PLACE play/NUL".
Step 4: represent the words in training set 2 with the word vectors generated in step 2 and, taking a sentence as a unit, input the word vectors of training set 2 into the RNN (recurrent neural network) for training to obtain the trained RNN model. Because a whole sentence is input into the recurrent neural network, the machine can memorize the words in the sentence, their part-of-speech tags and the combinations they form, and store them in the parameters of the model.
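Step 4 can be made concrete as follows, continuing the assumptions of the earlier sketches (the gensim model w2v, the TAGS mapping and the train function defined above). Each manually labeled sentence in training set 2 is converted into a sequence of word vectors and a parallel sequence of tag ids and then fed to the RNN one sentence at a time; the tensor layout is an illustrative choice.

```python
# Convert training set 2 (lists of (word, tag) pairs) into the per-sentence
# (word-vector sequence, tag-id sequence) pairs consumed by train() above.
# Relies on the w2v model and TAGS mapping from the earlier sketches.
import numpy as np
import torch

training_set_2 = [
    [("我想", "NUL"), ("去", "NUL"), ("雍和", "PLACE"), ("宫", "PLACE"), ("玩", "NUL")],
    # ... many more manually labeled sentences
]

def encode_sentence(labeled_sentence, w2v, tags=TAGS):
    vectors = torch.from_numpy(
        np.stack([w2v.wv[word] for word, _ in labeled_sentence]))
    tag_ids = torch.tensor([tags[tag] for _, tag in labeled_sentence])
    return vectors, tag_ids                  # shapes (seq_len, dim), (seq_len,)

training_data = [encode_sentence(s, w2v) for s in training_set_2]
train(training_data)                         # sentence-by-sentence RNN training
```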
According to this embodiment of the present application, model training is carried out by combining word vectors with a recurrent neural network, so that keyword extraction depends little on the accuracy of the segmentation tool and is relatively robust (for example, a word that does not occur in the training set can still receive a part-of-speech tag at test time, so that whether it is a keyword can be identified).
According to an embodiment of the present application, a keyword recognition method is also provided. The keyword recognition method can use the neural network model trained by the model training method of the above embodiments of the application to recognize keywords. As shown in Fig. 2, the keyword recognition method includes the following steps.
Step S202: perform word segmentation on the text to be recognized and determine the word vector corresponding to each word.
In this embodiment, the word segmentation of the text to be recognized and the determination of the word vectors are performed in the same way as described in the model training method of the above embodiments of the application and are not repeated here.
Step S204: taking each sentence in the text to be recognized as a unit, input the word vector corresponding to every word in the sentence into the neural network model and mark the keywords in the text to be recognized with the neural network model.
The neural network model in this embodiment is the neural network model obtained by training with the model training method of the above embodiments of the application.
Taking each sentence in the text to be recognized as a unit, the word vectors corresponding to its words are input into the neural network model; the neural network model identifies the keywords in the text to be recognized and marks them. Specifically, the text to be recognized is obtained and segmented, each word is represented by its word vector, the word vectors are input into the neural network model sentence by sentence, and the part-of-speech tag of each word is obtained, from which the words of the parts of speech of interest can be extracted.
Because the word vectors of a sentence are input into the recurrent neural network sentence by sentence, the machine can memorize the words in the sentence, their part-of-speech tags and the combinations they form, and store them in the parameters of the neural network model (the parameters are determined by the model and are mostly matrices). Compared with the prior-art approach of segmenting a sentence and then taking the words of a target part of speech as keywords based on the part of speech or sentence structure of the words, this embodiment recognizes the keywords in a text with the trained neural network model, sentence by sentence, can accurately identify keywords in sentences of various structural types, and achieves high keyword-recognition accuracy.
For example, "How is Yonghe Palace Temple evaluated?" is segmented into words, and the result computed by the neural network model is "Yonghe/PLACE Palace/PLACE evaluated/NUL how/NUL"; by filtering, the place noun of interest, Yonghe Palace Temple, is obtained.
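The recognition step can be sketched as follows, reusing the assumptions and objects of the earlier sketches (jieba, the gensim model w2v, the trained model and the TAGS mapping). Out-of-vocabulary handling is not specified by the patent, so a zero-vector fallback is assumed here.

```python
# Sketch of keyword recognition on a test sentence: segment, look up word
# vectors, run the trained tagger, keep the words tagged PLACE.
import jieba
import numpy as np
import torch

def vector_for(word, w2v):
    # Zero-vector fallback for out-of-vocabulary words (an assumption; the
    # patent only states that unseen words can still receive a tag).
    return w2v.wv[word] if word in w2v.wv else np.zeros(w2v.vector_size, dtype=np.float32)

def recognize_keywords(sentence, w2v, model):
    words = jieba.lcut(sentence)
    vectors = torch.from_numpy(np.stack([vector_for(w, w2v) for w in words]))
    with torch.no_grad():
        logits = model(vectors.unsqueeze(0)).squeeze(0)
    id_to_tag = {v: k for k, v in TAGS.items()}
    tags = [id_to_tag[i] for i in logits.argmax(dim=-1).tolist()]
    return [w for w, t in zip(words, tags) if t == "PLACE"]

print(recognize_keywords("雍和宫评价怎么样", w2v, model))  # ideally the place words
```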
An embodiment of the present application also provides a model training device, which can be used to perform the model training method of the embodiments of the application. As shown in Fig. 3, the device includes: a first obtaining unit 301, a determining unit 303 and a training unit 305.
The first obtaining unit 301 is configured to obtain text information carrying part-of-speech tags, where the text information includes a plurality of sentences and each word in each sentence carries a part-of-speech tag of the corresponding part-of-speech type.
The text information carrying part-of-speech tags in this embodiment may be a sample of text information collected in advance, in which the words of interest in every sentence have been labeled manually with their part of speech. There may be one or more part-of-speech categories of interest, for example place nouns, person nouns and so on. One labeling scheme is to mark a place of interest as PLACE and an invalid word as NUL. For example, "I want to go to Yonghe Palace Temple to play" may, after segmentation and manual labeling, become "I want/NUL go/NUL Yonghe/PLACE Palace/PLACE play/NUL".
The determining unit 303 is configured to determine the word vector of each word in each sentence, the word vector being a multidimensional array that uniquely represents the corresponding word.
After the text information carrying part-of-speech tags is obtained, the word vector corresponding to each word in every sentence of the text is determined. The word vector of each word is represented by a multidimensional array, and different words have different word vectors. The word vectors may be defined in advance, in which case the vector of each word in the extracted text information is looked up among the predefined word vectors; alternatively, the word vector of each word may be generated according to a preset word-vector generation rule. Since every word in the text information carries a part-of-speech tag, the word vector corresponding to a word is associated with the same part-of-speech tag as the word.
The training unit 305 is configured to take each sentence in the text information as a unit, input the part-of-speech tag and the corresponding word vector of every word in the sentence into a recurrent neural network, and train to obtain a neural network model, where the neural network model is used to tag the words in a sentence.
In this embodiment, after the word vector of every word contained in the text information is determined, the sentences in the text information are input, sentence by sentence, into the recurrent neural network for training. A sentence input into the recurrent neural network is replaced by the word vectors of its words, that is, the word vector corresponding to each word in the sentence is input into the recurrent neural network. The extracted text information is trained by the recurrent neural network to obtain the neural network model.
Because the word vectors of a sentence are input into the recurrent neural network sentence by sentence, the machine can memorize the words in the sentence, their part-of-speech tags and the combinations they form, and store them in the parameters of the neural network model (the parameters are determined by the model and are mostly matrices). Compared with the prior-art approach of segmenting a sentence and then taking the words of a target part of speech as keywords based on the part of speech or sentence structure of the words, this embodiment recognizes the keywords in a text with the trained neural network model, can accurately identify keywords in sentences of various structural types, and achieves high keyword-recognition accuracy.
According to this embodiment of the present application, text information carrying part-of-speech tags is obtained, where the text information includes a plurality of sentences and each word in each sentence carries a part-of-speech tag of the corresponding part-of-speech type; the word vector of each word in each sentence is determined, the word vector being a multidimensional array that uniquely represents the corresponding word; and, taking each sentence in the text information as a unit, the part-of-speech tag and the corresponding word vector of every word in the sentence are input into a recurrent neural network and a neural network model is obtained by training. The neural network model can then be used to tag the words in a sentence and thereby identify the keywords, which solves the prior-art technical problem of poor accuracy in recognizing keywords in sentences and improves the accuracy of keyword recognition.
Preferably, training unit includes:Word-dividing mode, for carrying out word segmentation processing to every sentence in text message, Obtain the set of words of text message;Enquiry module, for searching the corresponding term vector of each word in set of words.
In this embodiment, the word vectors of words are generated in advance to form a word vector set. After the text information serving as the sample is collected, the word vector corresponding to each word in every sentence of the text information is looked up in the previously generated word vector set. The word segmentation of each sentence of the text information can be performed with a segmentation tool according to certain rules; for example, "I want to go to Yonghe Palace Temple to play" is split into its constituent words, such as "I want / go / Yonghe / Palace / play".
Preferably, the model training device also includes: a second obtaining unit, configured to obtain text information of a preset data amount before the word vector of each word in each sentence is determined, to obtain a text information set; and a generating unit, configured to generate, by machine learning, the word vector corresponding to each word in the text information set, to obtain a word vector set. The query module is specifically configured to look up, in the word vector set, the word vector corresponding to each word in the word set.
In this embodiment, the word vector set is generated before the word vectors corresponding to the words are determined. Specifically, a large amount of text information is first obtained, where the preset data amount may be a relatively large, preconfigured amount of data; the obtained text information of the preset data amount is taken as the text information set for training the word vectors, and the word vector corresponding to each word in it is then generated by machine learning to obtain the word vector set. In this way, when the word vector corresponding to a word in the sample text information is determined, it can be looked up directly in the word vector set.
The machine-learning step can train the word vectors with Google's word2vec: given the input text, a unique vector of identical dimension, that is, a multidimensional array, is generated for each word. The dimension of the array can be customized; for example, "happy" might be mapped to [0, 1, 0, ...].
Preferably, the keywords in each sentence of the text information are labeled with a first preset tag and the other words are labeled with a second preset tag, so that when words are recognized with the neural network model, the keywords are tagged with the first preset tag.
In this embodiment, in the sentences of the text information used as training samples, the keywords of interest are labeled with the first preset tag and the other, invalid words are labeled with the second preset tag. During model training, the trained neural network model memorizes these tags; therefore, when the trained neural network model is used to recognize the keywords in a sentence, its output tags the keywords with the first preset tag and the other, invalid words with the second preset tag.
For example, if the words of interest are those denoting places, a place is marked as PLACE and an invalid word as NUL. The sentence "I want to go to Yonghe Palace Temple to play" is segmented and labeled as "I want/NUL go/NUL Yonghe/PLACE Palace/PLACE play/NUL".
The model training device includes a processor and a memory. The first obtaining unit 301, the determining unit 303, the training unit 305 and the like are stored in the memory as program units, and the processor executes the program units stored in the memory.
The processor contains a kernel, and the kernel fetches the corresponding program unit from the memory. One or more kernels can be configured, and the neural network model for recognizing keywords in sentences is obtained by training through adjusting kernel parameters.
The memory may include volatile memory in a computer-readable medium, random access memory (RAM) and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM), and the memory includes at least one memory chip.
The present application also provides an embodiment of a computer program product which, when executed on a data processing device, is adapted to execute program code that initializes the following method steps: obtaining text information carrying part-of-speech tags, where the text information includes a plurality of sentences and each word in each sentence carries a part-of-speech tag of the corresponding part-of-speech type; determining the word vector of each word in each sentence, the word vector being a multidimensional array that uniquely represents the corresponding word; and, taking each sentence in the text information as a unit, inputting the part-of-speech tag and the corresponding word vector of every word in the sentence into a recurrent neural network and training to obtain a neural network model, where the neural network model is used to tag the words in a sentence.
An embodiment of the present application also provides a keyword recognition device, which can be used to perform the keyword recognition method of the embodiments of the application. As shown in Fig. 4, the device includes: a vector determining unit 401 and a tagging unit 403.
The vector determining unit 401 is configured to perform word segmentation on the text to be recognized and determine the word vector corresponding to each word.
In this embodiment, the word segmentation of the text to be recognized and the determination of the word vectors are performed in the same way as described in the model training method of the above embodiments of the application and are not repeated here.
The tagging unit 403 is configured to take each sentence in the text to be recognized as a unit, input the word vector corresponding to every word in the sentence into the neural network model, and mark the keywords in the text to be recognized with the neural network model.
The neural network model in this embodiment is the neural network model obtained by training with the model training method of the above embodiments of the application.
Taking each sentence in the text to be recognized as a unit, the word vectors corresponding to its words are input into the neural network model; the neural network model identifies the keywords in the text to be recognized and marks them. Specifically, the text to be recognized is obtained and segmented, each word is represented by its word vector, the word vectors are input into the neural network model sentence by sentence, and the part-of-speech tag of each word is obtained, from which the words of the parts of speech of interest can be extracted.
Because the word vectors of a sentence are input into the recurrent neural network sentence by sentence, the machine can memorize the words in the sentence, their part-of-speech tags and the combinations they form, and store them in the parameters of the neural network model (the parameters are determined by the model and are mostly matrices). Compared with the prior-art approach of segmenting a sentence and then taking the words of a target part of speech as keywords based on the part of speech or sentence structure of the words, this embodiment recognizes the keywords in a text with the trained neural network model, sentence by sentence, can accurately identify keywords in sentences of various structural types, and achieves high keyword-recognition accuracy.
For example, "How is Yonghe Palace Temple evaluated?" is segmented into words, and the result computed by the neural network model is "Yonghe/PLACE Palace/PLACE evaluated/NUL how/NUL"; by filtering, the place noun of interest, Yonghe Palace Temple, is obtained.
The keyword recognition device includes a processor and a memory. The vector determining unit 401, the tagging unit 403 and the like are stored in the memory as program units, and the processor executes the program units stored in the memory.
The processor contains a kernel, and the kernel fetches the corresponding program unit from the memory. One or more kernels can be configured, and the neural network model is used to recognize the keywords in the text to be recognized by adjusting kernel parameters.
The memory may include volatile memory in a computer-readable medium, random access memory (RAM) and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM), and the memory includes at least one memory chip.
The present application also provides an embodiment of a computer program product which, when executed on a data processing device, is adapted to execute program code that initializes the following method steps: performing word segmentation on the text to be recognized and determining the word vector corresponding to each word; and, taking each sentence in the text to be recognized as a unit, inputting the word vector corresponding to every word in the sentence into the neural network model and marking the keywords in the text to be recognized with the neural network model.
The above serial numbers of the embodiments of the present application are for description only and do not represent the merits of the embodiments.
In the above embodiments of the present application, the description of each embodiment has its own emphasis. For a part that is not described in detail in one embodiment, reference may be made to the related description of other embodiments.
In the several embodiments provided in the present application, it should be understood that the disclosed technical content can be implemented in other ways. The device embodiments described above are only schematic; for example, the division of the units may be a division of logical functions, and other ways of division are possible in actual implementation. For example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not performed. In addition, the coupling, direct coupling or communication connection shown or discussed between the parts may be indirect coupling or communication connection through some interfaces, units or modules, and may be electrical or take other forms.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; they may be located in one place or distributed over multiple units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present application may be integrated into one processing unit, each unit may exist physically alone, or two or more units may be integrated into one unit. The integrated unit can be implemented either in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application, in essence, or the part that contributes to the prior art, or all or part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device or the like) to perform all or part of the steps of the method described in each embodiment of the application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash disk, a read-only memory (ROM, Read-Only Memory), a random access memory (RAM, Random Access Memory), a removable hard disk, a magnetic disk or an optical disc.
The above is only the preferred embodiments of the present application. It should be noted that those of ordinary skill in the art can also make several improvements and modifications without departing from the principle of the application, and these improvements and modifications should also be regarded as falling within the scope of protection of the application.

Claims (10)

1. A model training method, characterized by comprising:
obtaining text information carrying part-of-speech tags, wherein the text information comprises a plurality of sentences and each word in each sentence carries a part-of-speech tag of a corresponding part-of-speech type;
determining a word vector of each word in each sentence, the word vector being a multidimensional array for uniquely representing the corresponding word; and
taking each sentence in the text information as a unit, inputting the part-of-speech tag corresponding to each word in the sentence and its corresponding word vector into a recurrent neural network, and training to obtain a neural network model, wherein the neural network model is used to tag the words in a sentence.
2. The model training method according to claim 1, characterized in that determining the word vector of each word in each sentence comprises:
performing word segmentation on each sentence in the text information to obtain a word set of the text information; and
looking up the word vector corresponding to each word in the word set.
3. The model training method according to claim 2, characterized in that, before the word vector of each word in each sentence is determined, the model training method further comprises:
obtaining text information of a preset data amount to obtain a text information set; and
generating, by machine learning, the word vector corresponding to each word in the text information set to obtain a word vector set;
wherein looking up the word vector corresponding to each word in the word set comprises: looking up, in the word vector set, the word vector corresponding to each word in the word set.
4. The model training method according to any one of claims 1 to 3, characterized in that keywords in each sentence of the text information are labeled with a first preset tag and the other words are labeled with a second preset tag, so that when words are recognized with the neural network model, the keywords are tagged with the first preset tag.
5. A keyword recognition method, characterized by comprising:
performing word segmentation on a text to be recognized and determining the word vector corresponding to each word; and
taking each sentence in the text to be recognized as a unit, inputting the word vector corresponding to each word in the sentence into a neural network model obtained by training with the model training method according to any one of claims 1 to 4, and marking keywords in the text to be recognized with the neural network model.
6. A model training device, characterized by comprising:
a first obtaining unit, configured to obtain text information carrying part-of-speech tags, wherein the text information comprises a plurality of sentences and each word in each sentence carries a part-of-speech tag of a corresponding part-of-speech type;
a determining unit, configured to determine a word vector of each word in each sentence, the word vector being a multidimensional array for uniquely representing the corresponding word; and
a training unit, configured to take each sentence in the text information as a unit, input the part-of-speech tag corresponding to each word in the sentence and its corresponding word vector into a recurrent neural network, and train to obtain a neural network model, wherein the neural network model is used to tag the words in a sentence.
7. The model training device according to claim 6, characterized in that the training unit comprises:
a word segmentation module, configured to perform word segmentation on each sentence in the text information to obtain a word set of the text information; and
a query module, configured to look up the word vector corresponding to each word in the word set.
8. The model training device according to claim 7, characterized in that the model training device further comprises:
a second obtaining unit, configured to obtain text information of a preset data amount before the word vector of each word in each sentence is determined, to obtain a text information set; and
a generating unit, configured to generate, by machine learning, the word vector corresponding to each word in the text information set, to obtain a word vector set;
wherein the query module is specifically configured to look up, in the word vector set, the word vector corresponding to each word in the word set.
9. The model training device according to any one of claims 6 to 8, characterized in that keywords in each sentence of the text information are labeled with a first preset tag and the other words are labeled with a second preset tag, so that when words are recognized with the neural network model, the keywords are tagged with the first preset tag.
10. A keyword recognition device, characterized by comprising:
a vector determining unit, configured to perform word segmentation on a text to be recognized and determine the word vector corresponding to each word; and
a tagging unit, configured to take each sentence in the text to be recognized as a unit, input the word vector corresponding to each word in the sentence into a neural network model obtained by training with the model training method according to any one of claims 1 to 4, and mark keywords in the text to be recognized with the neural network model.
CN201510850285.2A 2015-11-27 2015-11-27 Model training method and device and keyword recognition method and device Pending CN106815194A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510850285.2A CN106815194A (en) 2015-11-27 2015-11-27 Model training method and device and keyword recognition method and device

Publications (1)

Publication Number Publication Date
CN106815194A true CN106815194A (en) 2017-06-09

Family

ID=59155406

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510850285.2A Pending CN106815194A (en) 2015-11-27 2015-11-27 Model training method and device and keyword recognition method and device

Country Status (1)

Country Link
CN (1) CN106815194A (en)

Cited By (23)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107291693A (en) * 2017-06-15 2017-10-24 广州赫炎大数据科技有限公司 A kind of semantic computation method for improving term vector model
CN107608970A (en) * 2017-09-29 2018-01-19 百度在线网络技术(北京)有限公司 part-of-speech tagging model generating method and device
CN107943525A (en) * 2017-11-17 2018-04-20 魏茨怡 A kind of mobile phone app interactive modes based on Recognition with Recurrent Neural Network
CN108268443A (en) * 2017-12-21 2018-07-10 北京百度网讯科技有限公司 It determines the transfer of topic point and obtains the method, apparatus for replying text
CN109145282A (en) * 2017-06-16 2019-01-04 贵州小爱机器人科技有限公司 Punctuate model training method, punctuate method, apparatus and computer equipment
CN109241330A (en) * 2018-08-20 2019-01-18 北京百度网讯科技有限公司 The method, apparatus, equipment and medium of key phrase in audio for identification
CN109325226A (en) * 2018-09-10 2019-02-12 广州杰赛科技股份有限公司 Term extraction method, apparatus and storage medium based on deep learning network
CN109344246A (en) * 2018-09-25 2019-02-15 平安科技(深圳)有限公司 A kind of electric questionnaire generation method, computer readable storage medium and terminal device
CN109344830A (en) * 2018-08-17 2019-02-15 平安科技(深圳)有限公司 Sentence output, model training method, device, computer equipment and storage medium
CN109783603A (en) * 2018-12-13 2019-05-21 平安科技(深圳)有限公司 Based on document creation method, device, terminal and the medium from coding neural network
CN109857847A (en) * 2019-01-15 2019-06-07 北京搜狗科技发展有限公司 A kind of data processing method, device and the device for data processing
CN110019831A (en) * 2017-09-29 2019-07-16 北京国双科技有限公司 A kind of analysis method and device of product attribute
CN110472198A (en) * 2018-05-10 2019-11-19 腾讯科技(深圳)有限公司 A kind of determination method of keyword, the method for text-processing and server
WO2019228016A1 (en) * 2018-05-31 2019-12-05 阿里巴巴集团控股有限公司 Intelligent writing method and apparatus
CN110969018A (en) * 2018-09-30 2020-04-07 北京国双科技有限公司 Case description element extraction method, machine learning model acquisition method and device
CN111126066A (en) * 2019-12-13 2020-05-08 智慧神州(北京)科技有限公司 Method and device for determining Chinese retrieval method based on neural network
CN111178067A (en) * 2019-12-19 2020-05-19 北京明略软件***有限公司 Information acquisition model generation method and device and information acquisition method and device
CN111291570A (en) * 2018-12-07 2020-06-16 北京国双科技有限公司 Method and device for realizing element identification in judicial documents
CN111813896A (en) * 2020-07-13 2020-10-23 重庆紫光华山智安科技有限公司 Text triple relation identification method and device, training method and electronic equipment
WO2020232898A1 (en) * 2019-05-23 2020-11-26 平安科技(深圳)有限公司 Text classification method and apparatus, electronic device and computer non-volatile readable storage medium
CN112035660A (en) * 2020-08-14 2020-12-04 海尔优家智能科技(北京)有限公司 Object class determination method and device based on network model
CN112735413A (en) * 2020-12-25 2021-04-30 浙江大华技术股份有限公司 Instruction analysis method based on camera device, electronic equipment and storage medium
CN116610804A (en) * 2023-07-19 2023-08-18 深圳须弥云图空间科技有限公司 Text recall method and system for improving recognition of small sample category

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104573046A (en) * 2015-01-20 2015-04-29 成都品果科技有限公司 Comment analyzing method and system based on term vector
CN104615589A (en) * 2015-02-15 2015-05-13 百度在线网络技术(北京)有限公司 Named-entity recognition model training method and named-entity recognition method and device
CN104899304A (en) * 2015-06-12 2015-09-09 北京京东尚科信息技术有限公司 Named entity identification method and device

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
张建保 et al.: 《中国生物医学工程进展 下》 (Advances in Biomedical Engineering in China, Part II), 30 April 2007 *

Cited By (36)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107291693A (en) * 2017-06-15 2017-10-24 广州赫炎大数据科技有限公司 Semantic computation method for improving the term vector model
CN109145282A (en) * 2017-06-16 2019-01-04 贵州小爱机器人科技有限公司 Sentence-breaking model training method, sentence-breaking method, apparatus and computer equipment
CN109145282B (en) * 2017-06-16 2023-11-07 贵州小爱机器人科技有限公司 Sentence-breaking model training method, sentence-breaking device and computer equipment
CN110019831B (en) * 2017-09-29 2021-09-07 北京国双科技有限公司 Product attribute analysis method and device
CN107608970A (en) * 2017-09-29 2018-01-19 百度在线网络技术(北京)有限公司 Part-of-speech tagging model generation method and device
CN107608970B (en) * 2017-09-29 2024-04-26 百度在线网络技术(北京)有限公司 Part-of-speech tagging model generation method and device
CN110019831A (en) * 2017-09-29 2019-07-16 北京国双科技有限公司 Product attribute analysis method and device
CN107943525A (en) * 2017-11-17 2018-04-20 魏茨怡 Mobile phone app interaction mode based on a recurrent neural network
CN108268443A (en) * 2017-12-21 2018-07-10 北京百度网讯科技有限公司 Method and apparatus for determining topic point transfer and obtaining reply text
CN110472198A (en) * 2018-05-10 2019-11-19 腾讯科技(深圳)有限公司 Keyword determination method, text processing method and server
WO2019228016A1 (en) * 2018-05-31 2019-12-05 阿里巴巴集团控股有限公司 Intelligent writing method and apparatus
CN109344830A (en) * 2018-08-17 2019-02-15 平安科技(深圳)有限公司 Sentence output and model training method, device, computer equipment and storage medium
KR102316063B1 (en) 2018-08-20 2021-10-22 베이징 바이두 넷컴 사이언스 앤 테크놀로지 코., 엘티디. Method and apparatus for identifying key phrase in audio data, device and medium
US11308937B2 (en) * 2018-08-20 2022-04-19 Beijing Baidu Netcom Science And Technology Co., Ltd. Method and apparatus for identifying key phrase in audio, device and medium
EP3614378A1 (en) * 2018-08-20 2020-02-26 Beijing Baidu Netcom Science and Technology Co., Ltd. Method and apparatus for identifying key phrase in audio, device and medium
KR20200021429A (en) * 2018-08-20 2020-02-28 베이징 바이두 넷컴 사이언스 앤 테크놀로지 코., 엘티디. Method and apparatus for identifying key phrase in audio data, device and medium
CN109241330A (en) * 2018-08-20 2019-01-18 北京百度网讯科技有限公司 Method, apparatus, device and medium for identifying key phrases in audio
CN109325226A (en) * 2018-09-10 2019-02-12 广州杰赛科技股份有限公司 Term extraction method, apparatus and storage medium based on deep learning network
CN109344246B (en) * 2018-09-25 2024-01-05 平安科技(深圳)有限公司 Electronic questionnaire generation method, computer-readable storage medium and terminal device
CN109344246A (en) * 2018-09-25 2019-02-15 平安科技(深圳)有限公司 Electronic questionnaire generation method, computer-readable storage medium and terminal device
CN110969018A (en) * 2018-09-30 2020-04-07 北京国双科技有限公司 Case description element extraction method, machine learning model acquisition method and device
CN111291570A (en) * 2018-12-07 2020-06-16 北京国双科技有限公司 Method and device for realizing element identification in judicial documents
CN109783603B (en) * 2018-12-13 2023-05-26 平安科技(深圳)有限公司 Text generation method, device, terminal and medium based on self-coding neural network
CN109783603A (en) * 2018-12-13 2019-05-21 平安科技(深圳)有限公司 Text generation method, device, terminal and medium based on self-coding neural network
CN109857847A (en) * 2019-01-15 2019-06-07 北京搜狗科技发展有限公司 Data processing method and device, and device for data processing
WO2020232898A1 (en) * 2019-05-23 2020-11-26 平安科技(深圳)有限公司 Text classification method and apparatus, electronic device and computer non-volatile readable storage medium
CN111126066B (en) * 2019-12-13 2023-05-02 北京因特睿软件有限公司 Method and device for determining Chinese congratulation technique based on neural network
CN111126066A (en) * 2019-12-13 2020-05-08 智慧神州(北京)科技有限公司 Method and device for determining Chinese retrieval method based on neural network
CN111178067B (en) * 2019-12-19 2023-05-26 北京明略软件***有限公司 Information acquisition model generation method and device and information acquisition method and device
CN111178067A (en) * 2019-12-19 2020-05-19 北京明略软件***有限公司 Information acquisition model generation method and device and information acquisition method and device
CN111813896A (en) * 2020-07-13 2020-10-23 重庆紫光华山智安科技有限公司 Text triple relation identification method and device, training method and electronic equipment
CN112035660A (en) * 2020-08-14 2020-12-04 海尔优家智能科技(北京)有限公司 Object class determination method and device based on network model
CN112735413A (en) * 2020-12-25 2021-04-30 浙江大华技术股份有限公司 Instruction analysis method based on camera device, electronic equipment and storage medium
CN112735413B (en) * 2020-12-25 2024-05-31 浙江大华技术股份有限公司 Instruction analysis method based on camera device, electronic equipment and storage medium
CN116610804A (en) * 2023-07-19 2023-08-18 深圳须弥云图空间科技有限公司 Text recall method and system for improving recognition of small sample category
CN116610804B (en) * 2023-07-19 2024-01-05 深圳须弥云图空间科技有限公司 Text recall method and system for improving recognition of small sample category

Similar Documents

Publication Publication Date Title
CN106815194A (en) Model training method and device and keyword recognition method and device
CN106815192B (en) Model training method and device and sentence emotion recognition method and device
CN110175325B (en) Comment analysis method based on word vector and syntactic characteristics and visual interaction interface
CN107633007B (en) Commodity comment data tagging system and method based on hierarchical AP clustering
CN109299258B (en) Public opinion event detection method, device and equipment
CN109829155A (en) Keyword determination method, automatic scoring method, apparatus, device and medium
CN106815198A (en) Model training method and device and sentence service type recognition method and device
CN106202030B (en) Rapid sequence labeling method and device based on heterogeneous labeling data
CN104809142A (en) Trademark query system and method
CN110674312B (en) Method, device and medium for constructing knowledge graph and electronic equipment
CN106815193A (en) Model training method and device and wrong word recognition method and device
CN109299481A (en) Machine translation engine recommendation method, device and electronic equipment
CN112035675A (en) Medical text labeling method, device, equipment and storage medium
CN110610193A (en) Method and device for processing labeled data
CN108549723B (en) Text concept classification method and device and server
CN107301411B (en) Mathematical formula identification method and device
CN106557463A (en) Sentiment analysis method and device
CN105095196A (en) Method and apparatus for new word discovery in text
JP4600045B2 (en) Opinion extraction learning device and opinion extraction classification device
CN112256845A (en) Intention recognition method, device, electronic equipment and computer readable storage medium
CN110321549B (en) New concept mining method based on sequential learning, relation mining and time sequence analysis
US20160350264A1 (en) Server and method for extracting content for commodity
CN111522901A (en) Method and device for processing address information in text
CN107203558A (en) Object recommendation method and apparatus, recommendation information processing method and apparatus
CN112613321A (en) Method and system for extracting entity attribute information in text

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 100083 No. 401, 4th Floor, Haitai Building, 229 North Fourth Ring Road, Haidian District, Beijing
Applicant after: Beijing Guoshuang Technology Co., Ltd.
Address before: 100086 Cuigong Hotel, 76 Zhichun Road, Shuangyushu, Haidian District, Beijing
Applicant before: Beijing Guoshuang Technology Co., Ltd.

RJ01 Rejection of invention patent application after publication

Application publication date: 20170609