CN109165380A - Neural network model training method and device, and text label determination method and device - Google Patents

Neural network model training method and device, and text label determination method and device Download PDF

Info

Publication number
CN109165380A
Authority
CN
China
Prior art keywords
label
word
neural network
text
network model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201810837902.9A
Other languages
Chinese (zh)
Other versions
CN109165380B (en)
Inventor
刘伟伟
史佳慧
骆世顺
黄萍萍
斯凌
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
MIGU Digital Media Co Ltd
Original Assignee
MIGU Digital Media Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by MIGU Digital Media Co Ltd filed Critical MIGU Digital Media Co Ltd
Priority to CN201810837902.9A priority Critical patent/CN109165380B/en
Publication of CN109165380A publication Critical patent/CN109165380A/en
Application granted granted Critical
Publication of CN109165380B publication Critical patent/CN109165380B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/279Recognition of textual entities
    • G06F40/289Phrasal analysis, e.g. finite state techniques or chunking
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/30Semantic analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Evolutionary Computation (AREA)
  • Data Mining & Analysis (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Biophysics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Machine Translation (AREA)

Abstract

The invention discloses a neural network model training method, comprising: obtaining a sample feature set composed of the semantic topic feature vectors of several texts, and a label set composed of several labels that can serve as text labels; and training a neural network model based on the sample feature set and the label set in the following manner: taking the sample feature set as the input of the 1st-level neural network model and the 1st label in the label set as the output of the 1st-level neural network model, training the 1st-level neural network model; taking the training result of the (m-1)th level together with the sample feature set as the input of the mth-level neural network model and the mth label in the label set as the output of the mth-level neural network model, training the mth-level neural network model; where 2≤m≤M, and M is the total number of labels in the label set. The invention also discloses a neural network model training device, and a text label determination method and device.

Description

Neural network model training method and device, and text label determination method and device
Technical field
The present invention relates to the technical field of data processing, and in particular to a neural network model training method and device, and a text label determination method and device.
Background technique
In the related art, texts are labelled by multi-class classification, so that each text corresponds to only one label. As a result, label recognition for a text is incomplete, and its accuracy and robustness are low.
Summary of the invention
In view of this, embodiments of the present invention are intended to provide a neural network model training method and device and a text label determination method and device, which can perform multi-label recognition on a text and improve the accuracy and robustness of text labels.
To achieve the above objective, the technical solutions of the embodiments of the present invention are implemented as follows:
In a first aspect, an embodiment of the present invention provides a neural network model training method, comprising:
obtaining a sample feature set composed of the semantic topic feature vectors of several texts;
obtaining a label set composed of several labels that can serve as text labels;
based on the sample feature set and the label set, training a neural network model in the following manner:
taking the sample feature set as the input of the 1st-level neural network model and the 1st label in the label set as the output of the 1st-level neural network model, training the 1st-level neural network model so that it can predict the corresponding label from the keywords of a text to be labelled;
taking the training result of the (m-1)th level together with the sample feature set as the input of the mth-level neural network model and the mth label in the label set as the output of the mth-level neural network model, training the mth-level neural network model so that it can predict the corresponding label from the keywords; where 2≤m≤M, and M is the total number of labels in the label set.
In a second aspect, an embodiment of the present invention provides a text label determination method based on the above neural network model training method, comprising:
calculating the feature vector of the keywords corresponding to a text;
inputting the feature vector of the keywords corresponding to the text into m levels of neural network models to obtain m corresponding labels, 2≤m;
calculating the distribution probability of each label in the label set under different categories;
weighting the m labels with the distribution probabilities to obtain the label set corresponding to the text.
In a third aspect, an embodiment of the present invention provides a neural network model training device, the device comprising:
an acquiring unit, configured to obtain a sample feature set composed of the semantic topic feature vectors of several texts, and a label set composed of several labels that can serve as text labels;
a training unit, configured to train a neural network model based on the sample feature set and the label set in the following manner:
taking the sample feature set as the input of the 1st-level neural network model and the 1st label in the label set as the output of the 1st-level neural network model, training the 1st-level neural network model so that it can predict the corresponding label from the keywords of a text to be labelled;
taking the training result of the (m-1)th level together with the sample feature set as the input of the mth-level neural network model and the mth label in the label set as the output of the mth-level neural network model, training the mth-level neural network model so that it can predict the corresponding label from the keywords; where 2≤m≤M, and M is the total number of labels in the label set.
In a fourth aspect, an embodiment of the present invention provides a text label determining device, the device comprising:
a first computing unit, configured to calculate the feature vector of the keywords corresponding to a text;
an input unit, configured to input the feature vector of the keywords corresponding to the text into m levels of neural network models to obtain m corresponding labels, 2≤m;
a second computing unit, configured to calculate the distribution probability of each label in the label set under different categories, and weight the m labels with the distribution probabilities to obtain the label set corresponding to the text.
With the neural network model training method and device and the text label determination method and device provided by the embodiments of the present invention, a neural network model is trained on a sample feature set composed of the semantic topic feature vectors of texts and a label set composed of several labels, and the trained neural network model is then used to determine text labels. In this way, multiple labels can be determined for a single text, which improves the accuracy and robustness of text labels.
Detailed description of the invention
Fig. 1 is a schematic diagram of an optional processing flow of the neural network model training method provided in an embodiment of the present invention;
Fig. 2 is a schematic diagram of the processing flow for obtaining a sample feature set composed of the semantic topic feature vectors of several texts, provided in an embodiment of the present invention;
Fig. 3 is a schematic diagram of the processing flow for obtaining the keywords of a text in an embodiment of the present invention;
Fig. 4 is a schematic diagram of the processing flow for calculating the word weight of each word in an embodiment of the present invention;
Fig. 5 is a schematic diagram of the processing flow for training the neural network model based on the sample feature set and the label set in an embodiment of the present invention;
Fig. 6 is a schematic diagram of the specific structure of the chained neural network CMLP in an embodiment of the present invention;
Fig. 7 is a schematic diagram of another optional processing flow of the neural network model training method in an embodiment of the present invention;
Fig. 8 is a schematic diagram of the processing flow of the text label determination method in an embodiment of the present invention;
Fig. 9 is a schematic diagram of the composition of the neural network model training device provided in an embodiment of the present invention;
Fig. 10 is a schematic diagram of the composition of the text label determining device provided in an embodiment of the present invention;
Fig. 11 is a schematic diagram of the hardware composition of the electronic device provided in an embodiment of the present invention.
Specific embodiment
Before the embodiments of the present invention are described in detail, the terms involved in the embodiments are first explained.
1) Stop words: in information retrieval, certain characters or words that are automatically filtered out before or after processing natural language data (or text), in order to save storage space and improve search efficiency.
2) Meaningless words: auxiliary words of mood, adverbs, prepositions, conjunctions and the like that have no specific meaning by themselves and only serve a function when placed in a complete sentence, such as the common words for "coming", "going", "how", and similar particles.
3) Named entity recognition: recognizing entities with specific meaning in a text, mainly including person names, place names, organization names, proper nouns, etc.
In order to understand the characteristics and technical content of the embodiments of the present invention more fully, the implementation of the embodiments of the present invention is described in detail below with reference to the accompanying drawings. The accompanying drawings are for reference and illustration only and are not intended to limit the present invention.
In view of the problems of multi-class classification methods for text labelling, methods have been proposed that train on sample-label characteristics using known-category information, label power set combination, multi-label nearest-neighbour classification, one-versus-rest composite support vector machines, random forest decision tree algorithms, and the like. However, when the sample feature dimension becomes large, these methods still determine text labels with unstable accuracy, and suffer from high training difficulty and poor accuracy.
In view of the above problems, an optional processing flow of the neural network model training method provided in an embodiment of the present invention, as shown in Figure 1, comprises the following steps:
Step S101: obtain a sample feature set composed of the semantic topic feature vectors of several texts.
In some optional embodiments, the neural network model training device processes the texts to obtain multiple semantic topic feature vectors, and these semantic topic feature vectors constitute the sample feature set. The process by which the neural network model training device obtains the sample feature set composed of the semantic topic feature vectors of several texts, as shown in Figure 2, comprises the following steps:
Step S1011: obtain the keywords of a text.
In some embodiments, the processing flow by which the neural network model training device obtains the keywords of a text, as shown in Figure 3, comprises the following steps:
Step S1a: perform word segmentation on the text to obtain multiple words.
In an optional embodiment, the neural network model training device uses the Language Technology Platform (LTP) open-source toolkit to segment the text, performs stop-word filtering and meaningless-word filtering on the segmentation result, and performs named entity recognition on the segmentation result. When named entity recognition is performed on the segmentation result, place names such as Kunshan, Wujiang and Jiangyin are recognized as ns (place name), Huangpu Military Academy is recognized as ni (organization name), and Zhangsun Wuji is recognized as nh (person name). Segmenting the text in this way ensures that the important features in the text are effectively recognized.
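A minimal sketch of this preprocessing step is given below, assuming the pyltp bindings for the LTP toolkit; the model paths and the stop-word list are placeholders, and the exact API may differ between LTP versions:

```python
# Sketch of segmentation, stop-word filtering and named-entity tagging with LTP.
# Assumes the pyltp bindings; model paths and the stop-word list are placeholders.
from pyltp import Segmentor, Postagger, NamedEntityRecognizer

STOPWORDS = set(open("stopwords.txt", encoding="utf-8").read().split())

segmentor = Segmentor()
segmentor.load("ltp_data/cws.model")          # word segmentation model
postagger = Postagger()
postagger.load("ltp_data/pos.model")          # part-of-speech model
recognizer = NamedEntityRecognizer()
recognizer.load("ltp_data/ner.model")         # named-entity model

def preprocess(text):
    words = list(segmentor.segment(text))
    postags = list(postagger.postag(words))
    netags = list(recognizer.recognize(words, postags))   # e.g. S-Ns / S-Ni / S-Nh
    # drop stop words; keep each remaining word with its POS and named-entity tag
    return [(w, pos, ne) for w, pos, ne in zip(words, postags, netags)
            if w not in STOPWORDS]
```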
Step S1b: calculate the word weight of each word.
In an optional embodiment, the processing flow by which the neural network model training device calculates the word weight of each word, as shown in Figure 4, comprises the following steps:
Step S1b1: calculate a first word weight based on the attributes of each word itself.
Here, the attributes of a word itself include at least: part of speech, word position, and degree of relevance to the topic; parts of speech include nouns, verbs, and adjectives.
In actual implementation, the first word weight of a noun is greater than that of a verb or an adjective, and the first word weight of function words such as auxiliary words and referential words is zero. The first word weight of words at the beginning and end of the text is greater than that of words at other positions, and the first word weight of words related or similar to the text title is large. For example, if the title of the text is "A free child is the most conscious", then the first word weight of "free" in the text is large.
Step S1b2: calculate the inter-word weight based on the first word weight of each word and the first weights of the words in its word set.
Here, the inter-word weight is computed over a word set consisting of a preset number of words before and after a word, centered on that word; the first word weight of each word and the first weights of the words in its word set are iteratively weighted to obtain the inter-word weight.
A word is denoted by W, and S(W, n) denotes the word set, where n denotes the number of words in the word set.
In some embodiments, the inter-word weight is computed with a word-feature TEXTRANK graph model. Taking word A as an example, the weight of word A is iteratively updated from the first word weight of word A and the weights of the words in the word set corresponding to word A, as shown in formula (1),
where d is a damping constant, n is the number of words in the word set corresponding to word A, and Weight(W) is the inter-word weight.
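A minimal sketch of such an iterative update is shown below; since formula (1) is not reproduced in the published text, the TextRank-style damped form used here (damping constant d, window set S(W, n), first word weights as initial values) is an assumption:

```python
def interword_weights(first_weight, windows, d=0.85, iterations=20):
    """first_weight: word -> first word weight; windows: word -> words within n positions of it."""
    weight = dict(first_weight)                # initialise with the first word weights
    for _ in range(iterations):
        new_weight = {}
        for w, neighbours in windows.items():
            s = sum(first_weight.get(v, 0.0) * weight.get(v, 0.0) for v in neighbours)
            s = s / len(neighbours) if neighbours else 0.0
            new_weight[w] = (1 - d) + d * s    # TextRank-style damped update (assumed form)
        weight = new_weight
    return weight
```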
Step S1b3: calculate the word weight over the full corpus and the word weight within the category.
In a specific embodiment, the neural network model training device calculates the word weight over the full corpus with the following formula:
TFidf(W) = Tf(W) * idf(W)   (2)
where Tf(W) = f(W) / f(all), f(W) being the frequency of word W over all samples and f(all) the total number of words; and idf(W) = log(C(all) / C(W)), C(all) being the total number of samples and C(W) the number of samples that contain W.
The process by which the neural network model training device calculates the word weight TFidf_C within a category is similar to the process of calculating the word weight over the full corpus, and is not repeated here.
Step S1b4: determine the word weight of each word based on the product of the inter-word weight, the word weight over the full corpus, and the word weight within the category.
The neural network model training device multiplies the inter-word weight, the word weight over the full corpus, and the word weight within the category to obtain the word weight, as shown in the following formula:
WT(W) = Weight(W) * TFidf_F(W) * TFidf_C(W)   (3)
In some embodiments, the word weight may also be normalized, for example by dividing the calculated word weight by the maximum weight value of the single sample, to obtain the normalized word weight.
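As a sketch under these definitions, the full-corpus weight TFidf_F, the per-category weight TFidf_C and the final word weight WT(W) of formula (3) could be computed as follows; the corpus layout (a list of (category, word list) samples) is an assumption:

```python
import math
from collections import Counter

def tfidf_weights(samples):
    """samples: list of (category, [words]) pairs. Returns word -> TF-IDF weight.
    For the per-category weight TFidf_C, call this on the samples of one category only."""
    total_words = sum(len(ws) for _, ws in samples)            # f(all)
    word_freq = Counter(w for _, ws in samples for w in ws)    # f(W)
    doc_freq = Counter()                                       # C(W)
    for _, ws in samples:
        doc_freq.update(set(ws))
    n_docs = len(samples)                                      # C(all)
    return {w: (word_freq[w] / total_words) * math.log(n_docs / doc_freq[w])
            for w in word_freq}

def word_weight(word, interword, tfidf_full, tfidf_cat, max_weight=None):
    wt = interword[word] * tfidf_full[word] * tfidf_cat[word]  # formula (3)
    return wt / max_weight if max_weight else wt               # optional normalisation
```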
Step S1c: sort the word weights by magnitude, and take the N words with the largest word weights as the keywords of the text, where N is a positive integer.
In some embodiments, synonym conversion may also be performed on the obtained keywords based on a synonym conversion dictionary. The synonym dictionary is compiled according to conversions specific to the texts; an example of the synonym dictionary is shown in Table 1 below:
Category | Original word | Converted word
City workplace | Shield beauty | Shield flower
City workplace | To a high-profile | Make widely known
City workplace | Superstar | Star
City workplace | Rich and powerful people | Rich and powerful family
Fantasy | The elixir of life | Red medicine
Fantasy | Indifferent | Grim
Fantasy | Cotton clothes | The common people
Fantasy | Ghosts and spirits haunt | Ghosts and monsters
Science fiction | Refreshing red | Red medicine
Science fiction | Cruel text | Cruel heart
Wuxia/xianxia | Martial arts circles | Martial way
Table 1
Based on the synonym dictionary shown in Table 1, synonymous words among the labels under at least the plot, character, background, character-personality and character-identity dimensions are converted, and at the same time the deactivated labels or mislabelled labels under each dimension are cleaned and filtered, thereby ensuring that the labels under different dimensions are unique. Meanwhile, for the different semantics of the same label across categories, synonymous conversion is performed based on the semantic mapping between the distinguishing labels of each category, in the conversion format: category | original word | converted word. Labels of the tag library that appear in the original word or the converted word are recognized by preliminary label recognition.
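A simple sketch of the synonym conversion based on a "category | original word | converted word" dictionary is shown below; the file format and function names are illustrative assumptions:

```python
def load_synonym_dict(path):
    """Load a dictionary in "category|original word|converted word" format."""
    table = {}                                    # (category, original) -> converted
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            parts = line.strip().split("|")
            if len(parts) == 3:
                category, original, converted = parts
                table[(category, original)] = converted
    return table

def convert_keywords(keywords, category, table):
    # map each keyword to its converted form within the given category, if any
    return [table.get((category, w), w) for w in keywords]
```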
Step S1012: determine the semantic topic feature vector of the text based on the keywords.
In some embodiments, the neural network model training device performs TF-IDF calculation on the sample corpus for the keywords, and then performs feature dimensionality reduction and semantic topic representation using Latent Semantic Indexing (LSI).
In a specific implementation, when performing the TF-IDF calculation on the sample corpus for the keywords, the sample corpus serves as the columns of the matrix and the keywords serve as the rows of the TF-IDF calculation, so that a further abstract representation of the keywords is obtained.
When performing feature dimensionality reduction and semantic topic representation with LSI (Latent Semantic Indexing), the degrees of correlation between documents, topics, word senses and words are obtained based on singular value decomposition (SVD). The matrix A of i documents and j words can be decomposed as follows:
A(i*j) = U(i*k) S(k*k) V(k*j)   (4)
where U indicates the degree of correlation between documents and topics, S indicates the degree of correlation between topics and word senses, and V indicates the degree of correlation between words and word senses.
The semantic topic feature vector of a text is obtained by projecting the word feature vector of the text into the topic space, as shown in formula (5), where X is the semantic topic feature vector of the text and d is the word feature vector of the text.
Step S1013: construct the sample feature set based on the semantic topic feature vectors.
In some embodiments, the neural network model training device builds a set from the semantic topic feature vectors of the several texts and takes this set as the sample feature set.
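A minimal sketch of the TF-IDF plus LSI dimensionality-reduction step using scikit-learn is given below, with TruncatedSVD standing in as the SVD-based LSI implementation; the number of topics k is an assumed parameter:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD

def build_sample_features(keyword_docs, k=100):
    """keyword_docs: list of strings, each the space-joined keywords of one text."""
    vectorizer = TfidfVectorizer()
    tfidf = vectorizer.fit_transform(keyword_docs)   # word feature vectors d (TF-IDF)
    lsi = TruncatedSVD(n_components=k)               # SVD-based LSI topic model
    X = lsi.fit_transform(tfidf)                     # semantic topic feature vectors
    return X, vectorizer, lsi
```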
Step S102: obtain a label set composed of several labels that can serve as text labels.
In some embodiments, the label set may be preset in the neural network model training device, or may be sent by a server to the neural network model training device. The labels include at least: city workplace, fantasy, science fiction, wuxia/xianxia, history, love, romance, war, and the like.
Step S103: train the neural network model based on the sample feature set and the label set.
Texts have complex and diverse features, their modes of expression are also diverse, themes or viewpoints are often not clearly targeted, and content such as personal plot conjectures, experience descriptions and background-culture descriptions is disordered. A neural network model, on the other hand, has strong non-linear mapping capability, strong generalization against noisy data, and the ability to learn sample features by itself, giving it high adaptability and fault tolerance. Therefore, the embodiment of the present invention trains the neural network model based on the sample feature set and the label set, so that the neural network model has the capability of predicting the labels corresponding to a text.
In some embodiments, in the processing flow of training the neural network model based on the sample feature set and the label set, as shown in Figure 5, the sample feature set contains n-dimensional sample features, denoted X(x1, x2, x3, ..., xn); the label set contains m-dimensional labels, denoted Y(y1, y2, y3, ..., ym); and a chained neural network (CMLP) model is constructed by cascading neural network MLPs level by level.
In a specific implementation, the sample feature set X(x1, x2, x3, ..., xn) is taken as the input of the 1st-level neural network model and the 1st label y1 in the label set is taken as the output of the 1st-level neural network model, and the 1st-level neural network model is trained to predict the corresponding label from the keywords of a text to be labelled;
the training result of the 1st level together with the sample feature set X(x1, x2, x3, ..., xn) is taken as the input of the 2nd-level neural network model and the 2nd label y2 in the label set is taken as the output of the 2nd-level neural network model, and the 2nd-level neural network model is trained to predict the corresponding label from the keywords of a text to be labelled;
and so on: the training result of the (m-1)th level together with the sample feature set is taken as the input of the mth-level neural network model and the mth label in the label set is taken as the output of the mth-level neural network model, and the mth-level neural network model is trained to predict the corresponding label from the keywords; where 3≤m≤M, and M is the total number of labels in the label set.
The specific structure of the chained neural network CMLP in the above embodiment is shown in Figure 6. The CMLP is composed of one input layer, multiple hidden layers, and one output layer. The input layer is the semantic topic feature vector X of the text. In order to reduce feature noise and avoid feature sparsity, feature extraction is performed on the keywords of the text in advance, and the sample features are represented by the topic distribution of the text so as to highlight the important implicit features in the sample; the semantic topic vector X of the text is then used as the input of the chained neural network. The semantic topic vector X of the text and a label y1 in the label set Y are first used for training to obtain the first training result; the first training result and the semantic topic vector X of the text are then used as the input for parameter training with the next label y2 in the label set, giving the second training result; and so on, the training result of the previous level of the neural network model combined with the semantic topic vector X of the text enters the hidden layers for training as the feature input. Training is iterated in this way; according to the different inputs of each level of the neural network model, different perceptrons C(C1, C2, ..., Cm) are constructed and trained, and features are passed on through weight computation and an activation function (such as the ReLU function) until all m labels in the label set have been trained.
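The chained training described above is essentially a classifier chain with an MLP base learner. A hedged sketch using scikit-learn is shown below; the hidden-layer sizes and other hyper-parameters are assumptions, each label column is assumed to contain both classes, and scikit-learn's ClassifierChain (which accumulates all previous predictions) could be used instead of the explicit loop:

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

def train_cmlp(X, Y, hidden=(128,)):
    """X: (n_samples, n_features) semantic topic vectors; Y: (n_samples, M) 0/1 label matrix."""
    models, prev = [], None
    for m in range(Y.shape[1]):
        feats = X if prev is None else np.hstack([X, prev])   # level m input: X (+ previous result)
        clf = MLPClassifier(hidden_layer_sizes=hidden, activation="relu", max_iter=300)
        clf.fit(feats, Y[:, m])                                # level m predicts the m-th label
        prev = clf.predict_proba(feats)[:, 1:2]                # training result passed to level m+1
        models.append(clf)
    return models

def predict_cmlp(models, X):
    outputs, prev = [], None
    for clf in models:
        feats = X if prev is None else np.hstack([X, prev])
        prev = clf.predict_proba(feats)[:, 1:2]
        outputs.append(prev)
    return np.hstack(outputs)                                  # (n_samples, M) label probabilities
```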
The training process of the above neural network model is converted into the process of finding the combination of weights and biases between neurons that minimizes the loss between the actual value and the expected value. For the input features propagated forward, the weighted sum f = w*x + b is computed and passed through the ReLU activation function to obtain the output of each layer, and the loss between the actual value and the expected value is then calculated.
The loss function behaves as follows: for a single sample with true label y = 1, the loss is 0 when the predicted probability h = 1, and the loss tends to infinity when h = 0. The parameters such as w, w' and b are continuously updated by stochastic gradient descent using chained differentiation, and after several iterations the loss function reaches its minimum, yielding the optimal parameter model.
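A small numeric sketch of the forward pass and the per-sample loss described above is given below, assuming the standard ReLU activation and binary cross-entropy forms:

```python
import numpy as np

def relu(z):
    return np.maximum(0.0, z)

def forward(x, w, b):
    return relu(w @ x + b)            # weighted sum followed by the ReLU activation

def binary_cross_entropy(y, h, eps=1e-12):
    # loss is 0 when y == 1 and h == 1; it grows without bound as h -> 0
    h = np.clip(h, eps, 1 - eps)
    return -(y * np.log(h) + (1 - y) * np.log(1 - h))
```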
In the embodiment of the present invention, each text is made to correspond to a group of labels through multi-label classification; for example, one film can correspond to multiple labels such as comedy, history and war, which improves the accuracy and robustness of text labels. In contrast, the multi-class classification methods in the related art allow each text to correspond to only one label; the same film can only correspond to one of comedy, history or war, which reduces the accuracy and robustness of text labels.
Based on the above neural network model training method, an embodiment of the present invention trains on the online-article categories shown in Table 2 below and the abstract labels corresponding to each dimension, using a training sample set of books and the cross of sample categories and abstract-label dimensions. For example, 11499 online-article books in the boys' category and 502 target abstract labels are randomly selected for chained network mapping training; the entire training process goes from the book content to the final label set, together with the category to which the book and its labels belong. First, preprocessing such as synonym conversion is performed on the content keywords of the online articles under each category, and the topic vector feature and distribution of each online article are calculated through feature recognition. Second, any label in the target label set is extracted for sample training; for example, if the first training label is "love", the whole training set learns the features for this result and trains a model on it, obtaining the recognition result of the whole training set for that label. Then, the first training result combined with the features of the whole training set is used to train the second label, such as "romance", obtaining the training result of the second label. The iteration continues in this way until all labels in the entire abstract label set have been trained; when the loss reaches its minimum, the model training is complete. Each sample recognized by the trained model obtains the recognition results of the 502 labels; taking a romance novel as an example, the label result set obtained by multi-label recognition through the neural network model is ("youth", "campus", "unrequited love", ...).
Table 2
Finally, the ratio of the number of samples covered by one label in a category to the number of samples of all labels in that category is calculated, giving the distribution probability of each label in the label set under different categories. The distribution probability of each label in the label set under each category is given by formula (8):
P(t) = S_t / Σ S_i   (8)
that is, the ratio of the number of samples S_t covered by label t of the category to the sum of the sample numbers of all labels of that category. The recognition result of each sample is then weighted with the label distribution probabilities under its category to obtain the final label result set, as shown in Table 3 below:
Table 3
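A sketch of the per-category label distribution probability of formula (8) and the weighting of the recognition results is given below; the data layout (label counts per category and a probability vector from the chained model) is assumed:

```python
def label_distribution(category_label_counts):
    """category_label_counts: dict category -> dict label -> number of covered samples."""
    dist = {}
    for category, counts in category_label_counts.items():
        total = sum(counts.values())                 # sum of S_i over all labels in the category
        dist[category] = {t: s / total for t, s in counts.items()}
    return dist

def weighted_label_set(label_probs, category, dist, top_k=10):
    """label_probs: dict label -> probability from the chained model for one text."""
    scores = {t: p * dist[category].get(t, 0.0) for t, p in label_probs.items()}
    return sorted(scores, key=scores.get, reverse=True)[:top_k]
```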
Another optional processing flow of the neural network model training method provided in an embodiment of the present invention, as shown in Figure 7, comprises the following steps:
Step S201: extract keywords based on the book introduction and the text of the book, obtaining the book keywords.
Step S202: perform synonym conversion on the book keywords.
Step S203: judge whether label recognition can be performed directly on the keywords; if the judgment result is yes, the result is taken as the preliminary label result; if the judgment result is no, step S204 is executed.
Step S204: perform TF-IDF calculation on the keywords and determine the topic feature vector with the LSI topic model.
Step S205: perform multi-label classification with the classification model, using the topic feature vectors as the sample set together with the label set, to obtain the book label result.
An embodiment of the present invention also provides a text label determination method based on the above neural network model training method. The processing flow of the text label determination method, as shown in Figure 8, comprises:
Step S301: calculate the semantic topic feature vector of the text.
In the embodiment of the present invention, the process of calculating the semantic topic feature vector of the text is the same as the process described in step S101 above, and is not repeated here.
Step S302: input the semantic topic feature vector of the text into the m levels of neural network models to obtain m corresponding labels, 2≤m.
Step S303: calculate the distribution probability of each label in the label set under different categories.
In the embodiment of the present invention, the distribution probability of each label in the label set under different categories is calculated with formula (8) above.
Step S304: weight the m labels with the distribution probabilities to obtain the label set corresponding to the text.
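Putting the steps together, a sketch of the label determination flow is shown below, reusing the helper functions sketched above; these are illustrative assumptions rather than the patented implementation:

```python
def determine_labels(keywords, category, vectorizer, lsi, models, dist, label_names, top_k=10):
    """keywords: list of extracted (and synonym-converted) keywords for one text."""
    topic_vec = lsi.transform(vectorizer.transform([" ".join(keywords)]))  # step S301
    probs = predict_cmlp(models, topic_vec)[0]                             # step S302
    label_probs = dict(zip(label_names, probs))
    return weighted_label_set(label_probs, category, dist, top_k)          # steps S303-S304
```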
Based on the above neural network model training method, an embodiment of the present invention also provides a neural network model training device. The composition of the neural network model training device 400, as shown in Figure 9, comprises:
an acquiring unit 401, configured to obtain a sample feature set composed of the semantic topic feature vectors of several texts, and a label set composed of several labels that can serve as text labels;
a training unit 402, configured to train a neural network model based on the sample feature set and the label set in the following manner:
taking the sample feature set as the input of the 1st-level neural network model and the 1st label in the label set as the output of the 1st-level neural network model, training the 1st-level neural network model so that it can predict the corresponding label from the keywords of a text to be labelled;
taking the training result of the (m-1)th level together with the sample feature set as the input of the mth-level neural network model and the mth label in the label set as the output of the mth-level neural network model, training the mth-level neural network model so that it can predict the corresponding label from the keywords; where 2≤m≤M, and M is the total number of labels in the label set.
In the embodiment of the present invention, the acquiring unit 401 is further configured to obtain the keywords of a text; determine the semantic topic feature vector of the text based on the keywords; and construct the sample feature set based on the semantic topic feature vectors.
In the embodiment of the present invention, the acquiring unit 401 is further configured to perform word segmentation on the text to obtain multiple words;
calculate the word weight of each word;
sort the word weights by magnitude, and take the N words with the largest word weights as the keywords of the text, where N is a positive integer.
In the embodiment of the present invention, the acquiring unit 401 is further configured to calculate a first word weight based on the attributes of each word itself;
calculate the inter-word weight based on the first word weight of each word and the first weights of the words in its word set, the word set being a set composed of a preset number of words before and after a word, centered on that word;
calculate the word weight over the full corpus and the word weight within the category;
and determine the word weight of each word based on the product of the inter-word weight, the word weight over the full corpus, and the word weight within the category.
In the embodiment of the present invention, the acquiring unit 401 is further configured to iteratively weight the first word weight of each word with the first weights of the words in its word set to obtain the inter-word weight.
In the embodiment of the present invention, the acquiring unit 401 is further configured to calculate the semantic topic feature vector of a text;
input the semantic topic feature vector of the text into the m levels of neural network models to obtain m corresponding labels, 2≤m;
calculate the distribution probability of each label in the label set under different categories;
and weight the m labels with the distribution probabilities to obtain the label set corresponding to the text.
In the embodiment of the present invention, the acquiring unit 401 is further configured to calculate the ratio of the number of samples covered by one label in a category to the number of samples of all labels in that category, to obtain the distribution probability of each label in the label set under different categories.
Based on the above text label determination method, an embodiment of the present invention also provides a text label determining device. The composition of the text label determining device 500, as shown in Figure 10, comprises:
a first computing unit 501, configured to calculate the feature vector of the keywords corresponding to a text;
an input unit 502, configured to input the feature vector of the keywords corresponding to the text into m levels of neural network models to obtain m corresponding labels, 2≤m;
a second computing unit 503, configured to calculate the distribution probability of each label in the label set under different categories, and weight the m labels with the distribution probabilities to obtain the label set corresponding to the text.
In the embodiment of the present invention, the second computing unit 503 is further configured to calculate the ratio of the number of samples covered by one label in a category to the number of samples of all labels in that category, to obtain the distribution probability of each label in the label set under different categories.
Figure 11 is a schematic diagram of the hardware composition of the electronic device (the neural network model training device or the text label determining device) provided in an embodiment of the present invention. The electronic device 700 comprises: at least one processor 701, a memory 702 and at least one network interface 704. The various components in the electronic device 700 are coupled together through a bus system 705. It can be understood that the bus system 705 is used to implement connection and communication between these components. In addition to a data bus, the bus system 705 also includes a power bus, a control bus and a status signal bus. However, for clarity of description, the various buses are all labelled as the bus system 705 in Figure 11.
It can be understood that the memory 702 may be a volatile memory, a non-volatile memory, or a combination of both. The non-volatile memory may be a ROM, a Programmable Read-Only Memory (PROM), an Erasable Programmable Read-Only Memory (EPROM), an Electrically Erasable Programmable Read-Only Memory (EEPROM), a Ferroelectric Random Access Memory (FRAM), a Flash Memory, a magnetic surface memory, an optical disc, or a Compact Disc Read-Only Memory (CD-ROM); the magnetic surface memory may be a magnetic disk memory or a magnetic tape memory. The volatile memory may be a Random Access Memory (RAM), used as an external cache. By way of illustrative but not restrictive description, many forms of RAM are available, such as Static Random Access Memory (SRAM), Synchronous Static Random Access Memory (SSRAM), Dynamic Random Access Memory (DRAM), Synchronous Dynamic Random Access Memory (SDRAM), Double Data Rate Synchronous Dynamic Random Access Memory (DDRSDRAM), Enhanced Synchronous Dynamic Random Access Memory (ESDRAM), SyncLink Dynamic Random Access Memory (SLDRAM), and Direct Rambus Random Access Memory (DRRAM). The memory 702 described in the embodiment of the present invention is intended to include, but is not limited to, these and any other suitable types of memory.
The memory 702 in the embodiment of the present invention is used to store various types of data to support the operation of the electronic device 700. Examples of these data include any computer program for running on the electronic device 700, such as the application program 7022. A program implementing the method of the embodiment of the present invention may be contained in the application program 7022.
The methods disclosed in the embodiments of the present invention may be applied to, or implemented by, the processor 701. The processor 701 may be an integrated circuit chip with signal processing capability. In the implementation process, each step of the above methods may be completed by an integrated logic circuit of hardware in the processor 701 or by instructions in the form of software. The processor 701 may be a general-purpose processor, a Digital Signal Processor (DSP), another programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. The processor 701 may implement or execute the methods, steps and logic diagrams disclosed in the embodiments of the present invention. A general-purpose processor may be a microprocessor, any conventional processor, or the like. The steps of the methods disclosed in the embodiments of the present invention may be directly embodied as being completed by a hardware decoding processor, or completed by a combination of hardware and software modules in a decoding processor. The software module may be located in a storage medium, the storage medium being located in the memory 702; the processor 701 reads the information in the memory 702 and completes the steps of the foregoing methods in combination with its hardware.
In an exemplary embodiment, the electronic device 700 may be implemented by one or more Application Specific Integrated Circuits (ASIC), DSPs, Programmable Logic Devices (PLD), Complex Programmable Logic Devices (CPLD), FPGAs, general-purpose processors, controllers, MCUs, MPUs, or other electronic components, for executing the foregoing methods.
Correspondingly, an embodiment of the present invention also provides a storage medium in which a computer program is stored. When the computer program is run by a processor, it is used to implement the above neural network model training method of the embodiment of the present invention or the above text label determination method of the embodiment of the present invention.
The present invention is described with reference to flowcharts and/or block diagrams of the methods, devices (systems) and computer program products according to the embodiments of the present invention. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, an embedded processor or another programmable data processing device to produce a machine, so that the instructions executed by the processor of the computer or other programmable data processing device produce an apparatus for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or another programmable data processing device to work in a specific manner, so that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction apparatus, the instruction apparatus implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or another programmable data processing device, so that a series of operation steps are executed on the computer or other programmable device to produce computer-implemented processing, and the instructions executed on the computer or other programmable device thus provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
The foregoing is only a preferred embodiment of the present invention and is not intended to limit the protection scope of the present invention. Any modifications, equivalent replacements and improvements made within the spirit and principles of the present invention shall be included within the protection scope of the present invention.

Claims (10)

1. A neural network model training method, characterized in that the method comprises:
obtaining a sample feature set composed of the semantic topic feature vectors of several texts;
obtaining a label set composed of several labels that can serve as text labels;
based on the sample feature set and the label set, training a neural network model in the following manner:
taking the sample feature set as the input of the 1st-level neural network model and the 1st label in the label set as the output of the 1st-level neural network model, training the 1st-level neural network model so that it can predict the corresponding label from the keywords of a text to be labelled;
taking the training result of the (m-1)th level together with the sample feature set as the input of the mth-level neural network model and the mth label in the label set as the output of the mth-level neural network model, training the mth-level neural network model so that it can predict the corresponding label from the keywords; wherein 2≤m≤M, and M is the total number of labels in the label set.
2. The method according to claim 1, characterized in that the obtaining of a sample feature set composed of the semantic topic feature vectors of several texts comprises:
obtaining the keywords of a text;
determining the semantic topic feature vector of the text based on the keywords;
constructing the sample feature set based on the semantic topic feature vectors.
3. The method according to claim 1, characterized in that the obtaining of the keywords of a text comprises:
performing word segmentation on the text to obtain multiple words;
calculating the word weight of each word;
sorting the word weights by magnitude, and taking the N words with the largest word weights as the keywords of the text, N being a positive integer.
4. The method according to claim 2, characterized in that the calculating of the word weight of each word comprises:
calculating a first word weight based on the attributes of each word itself;
calculating the inter-word weight based on the first word weight of each word and the first weights of the words in its word set, the word set being a set composed of a preset number of words before and after a word, centered on that word;
calculating the word weight over the full corpus and the word weight within the category;
determining the word weight of each word based on the product of the inter-word weight, the word weight over the full corpus, and the word weight within the category.
5. The method according to claim 4, characterized in that the calculating of the inter-word weight based on the first word weight of each word and the first weights of the words in the word set comprises:
iteratively weighting the first word weight of each word with the first weights of the words in its word set to obtain the inter-word weight.
6. The method according to any one of claims 1 to 5, characterized in that the method further comprises:
calculating the semantic topic feature vector of a text;
inputting the semantic topic feature vector of the text into the m levels of neural network models to obtain m corresponding labels, 2≤m;
calculating the distribution probability of each label in the label set under different categories;
weighting the m labels with the distribution probabilities to obtain the label set corresponding to the text.
7. A text label determination method based on the neural network model training method according to claim 1, characterized in that the method comprises:
calculating the semantic topic feature vector of a text;
inputting the semantic topic feature vector into the m levels of neural network models to obtain m corresponding labels, 2≤m;
calculating the distribution probability of each label in the label set under different categories;
weighting the m labels with the distribution probabilities to obtain the label set corresponding to the text.
8. The method according to claim 7, characterized in that the calculating of the distribution probability of each label in the label set under different categories comprises:
calculating the ratio of the number of samples covered by one label in a category to the number of samples of all labels in that category, to obtain the distribution probability of each label in the label set under different categories.
9. A neural network model training device, characterized in that the device comprises:
an acquiring unit, configured to obtain a sample feature set composed of the semantic topic feature vectors of several texts, and a label set composed of several labels that can serve as text labels;
a training unit, configured to train a neural network model based on the sample feature set and the label set in the following manner:
taking the sample feature set as the input of the 1st-level neural network model and the 1st label in the label set as the output of the 1st-level neural network model, training the 1st-level neural network model so that it can predict the corresponding label from the keywords of a text to be labelled;
taking the training result of the (m-1)th level together with the sample feature set as the input of the mth-level neural network model and the mth label in the label set as the output of the mth-level neural network model, training the mth-level neural network model so that it can predict the corresponding label from the keywords; wherein 2≤m≤M, and M is the total number of labels in the label set.
10. A text label determining device, characterized in that the device comprises:
a first computing unit, configured to calculate the feature vector of the keywords corresponding to a text;
an input unit, configured to input the feature vector of the keywords corresponding to the text into m levels of neural network models to obtain m corresponding labels, 2≤m;
a second computing unit, configured to calculate the distribution probability of each label in the label set under different categories, and weight the m labels with the distribution probabilities to obtain the label set corresponding to the text.
CN201810837902.9A 2018-07-26 2018-07-26 Neural network model training method and device and text label determining method and device Active CN109165380B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810837902.9A CN109165380B (en) 2018-07-26 2018-07-26 Neural network model training method and device and text label determining method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810837902.9A CN109165380B (en) 2018-07-26 2018-07-26 Neural network model training method and device and text label determining method and device

Publications (2)

Publication Number Publication Date
CN109165380A true CN109165380A (en) 2019-01-08
CN109165380B CN109165380B (en) 2022-07-01

Family

ID=64898322

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810837902.9A Active CN109165380B (en) 2018-07-26 2018-07-26 Neural network model training method and device and text label determining method and device

Country Status (1)

Country Link
CN (1) CN109165380B (en)

Cited By (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109992646A (en) * 2019-03-29 2019-07-09 腾讯科技(深圳)有限公司 The extracting method and device of text label
CN110147499A (en) * 2019-05-21 2019-08-20 智者四海(北京)技术有限公司 Label method, recommended method and recording medium
CN110428052A (en) * 2019-08-01 2019-11-08 江苏满运软件科技有限公司 Construction method, device, medium and the electronic equipment of deep neural network model
CN110472665A (en) * 2019-07-17 2019-11-19 新华三大数据技术有限公司 Model training method, file classification method and relevant apparatus
CN110491374A (en) * 2019-08-27 2019-11-22 北京明日汇科技管理有限公司 Hotel service interactive voice recognition methods neural network based and device
CN111177385A (en) * 2019-12-26 2020-05-19 北京明略软件***有限公司 Multi-level classification model training method, multi-level classification method and device
CN111339301A (en) * 2020-02-28 2020-06-26 创新奇智(青岛)科技有限公司 Label determination method and device, electronic equipment and computer readable storage medium
CN111666769A (en) * 2020-06-11 2020-09-15 暨南大学 Method for extracting financial field event sentences in annual newspaper
CN111695052A (en) * 2020-06-12 2020-09-22 上海智臻智能网络科技股份有限公司 Label classification method, data processing device and readable storage medium
CN111695053A (en) * 2020-06-12 2020-09-22 上海智臻智能网络科技股份有限公司 Sequence labeling method, data processing device and readable storage medium
CN111797325A (en) * 2019-04-09 2020-10-20 Oppo广东移动通信有限公司 Event labeling method and device, storage medium and electronic equipment
WO2020224097A1 (en) * 2019-05-06 2020-11-12 平安科技(深圳)有限公司 Intelligent semantic document recommendation method and device, and computer-readable storage medium
CN113486147A (en) * 2021-07-07 2021-10-08 中国建设银行股份有限公司 Text processing method and device, electronic equipment and computer readable medium
CN113822013A (en) * 2021-03-08 2021-12-21 京东科技控股股份有限公司 Labeling method and device for text data, computer equipment and storage medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104834747A (en) * 2015-05-25 2015-08-12 中国科学院自动化研究所 Short text classification method based on convolution neutral network
CN105046274A (en) * 2015-07-13 2015-11-11 浪潮软件集团有限公司 Automatic labeling method for electronic commerce commodity category
KR20170039951A (en) * 2015-10-02 2017-04-12 네이버 주식회사 Method and system for classifying data consisting of multiple attribues represented by sequences of text words or symbols using deep learning
CN106909654A (en) * 2017-02-24 2017-06-30 北京时间股份有限公司 A kind of multiclass classification system and method based on newsletter archive information
CN107944946A (en) * 2017-11-03 2018-04-20 清华大学 Commercial goods labels generation method and device
US20180157743A1 (en) * 2016-12-07 2018-06-07 Mitsubishi Electric Research Laboratories, Inc. Method and System for Multi-Label Classification

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104834747A (en) * 2015-05-25 2015-08-12 中国科学院自动化研究所 Short text classification method based on convolution neutral network
CN105046274A (en) * 2015-07-13 2015-11-11 浪潮软件集团有限公司 Automatic labeling method for electronic commerce commodity category
KR20170039951A (en) * 2015-10-02 2017-04-12 네이버 주식회사 Method and system for classifying data consisting of multiple attribues represented by sequences of text words or symbols using deep learning
US20180157743A1 (en) * 2016-12-07 2018-06-07 Mitsubishi Electric Research Laboratories, Inc. Method and System for Multi-Label Classification
CN106909654A (en) * 2017-02-24 2017-06-30 北京时间股份有限公司 A kind of multiclass classification system and method based on newsletter archive information
CN107944946A (en) * 2017-11-03 2018-04-20 清华大学 Commercial goods labels generation method and device

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
JESSE READ et al.: "Classifier chains for multi-label classification", MACHINE LEARNING *
张春焰 et al.: "Hierarchical multi-label classification based on path selection", Computer Technology and Development *
王进 et al.: "Multi-label classification method based on combined classifier chains on Spark", Journal of University of Science and Technology of China *

Cited By (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109992646A (en) * 2019-03-29 2019-07-09 腾讯科技(深圳)有限公司 The extracting method and device of text label
CN109992646B (en) * 2019-03-29 2021-03-26 腾讯科技(深圳)有限公司 Text label extraction method and device
CN111797325A (en) * 2019-04-09 2020-10-20 Oppo广东移动通信有限公司 Event labeling method and device, storage medium and electronic equipment
WO2020224097A1 (en) * 2019-05-06 2020-11-12 平安科技(深圳)有限公司 Intelligent semantic document recommendation method and device, and computer-readable storage medium
CN110147499A (en) * 2019-05-21 2019-08-20 智者四海(北京)技术有限公司 Label method, recommended method and recording medium
CN110472665A (en) * 2019-07-17 2019-11-19 新华三大数据技术有限公司 Model training method, file classification method and relevant apparatus
CN110428052A (en) * 2019-08-01 2019-11-08 江苏满运软件科技有限公司 Construction method, device, medium and the electronic equipment of deep neural network model
CN110491374A (en) * 2019-08-27 2019-11-22 北京明日汇科技管理有限公司 Hotel service interactive voice recognition methods neural network based and device
CN111177385A (en) * 2019-12-26 2020-05-19 北京明略软件***有限公司 Multi-level classification model training method, multi-level classification method and device
CN111177385B (en) * 2019-12-26 2023-04-07 北京明略软件***有限公司 Multi-level classification model training method, multi-level classification method and device
CN111339301A (en) * 2020-02-28 2020-06-26 创新奇智(青岛)科技有限公司 Label determination method and device, electronic equipment and computer readable storage medium
CN111339301B (en) * 2020-02-28 2023-11-28 创新奇智(青岛)科技有限公司 Label determining method, label determining device, electronic equipment and computer readable storage medium
CN111666769A (en) * 2020-06-11 2020-09-15 暨南大学 Method for extracting financial field event sentences in annual newspaper
CN111695052A (en) * 2020-06-12 2020-09-22 上海智臻智能网络科技股份有限公司 Label classification method, data processing device and readable storage medium
CN111695053A (en) * 2020-06-12 2020-09-22 上海智臻智能网络科技股份有限公司 Sequence labeling method, data processing device and readable storage medium
CN113822013A (en) * 2021-03-08 2021-12-21 京东科技控股股份有限公司 Labeling method and device for text data, computer equipment and storage medium
CN113822013B (en) * 2021-03-08 2024-04-05 京东科技控股股份有限公司 Labeling method and device for text data, computer equipment and storage medium
CN113486147A (en) * 2021-07-07 2021-10-08 中国建设银行股份有限公司 Text processing method and device, electronic equipment and computer readable medium

Also Published As

Publication number Publication date
CN109165380B (en) 2022-07-01

Similar Documents

Publication Publication Date Title
CN109165380A (en) A kind of neural network model training method and device, text label determine method and device
US10120861B2 (en) Hybrid classifier for assigning natural language processing (NLP) inputs to domains in real-time
CN109146610A (en) It is a kind of intelligently to insure recommended method, device and intelligence insurance robot device
CN106503192A (en) Name entity recognition method and device based on artificial intelligence
CN110728298A (en) Multi-task classification model training method, multi-task classification method and device
CN108304373A (en) Construction method, device, storage medium and the electronic device of semantic dictionary
CN110717038B (en) Object classification method and device
CN110678882A (en) Selecting answer spans from electronic documents using machine learning
Dethlefs et al. Conditional random fields for responsive surface realisation using global features
CN107679225A (en) A kind of reply generation method based on keyword
CN109918627A (en) Document creation method, device, electronic equipment and storage medium
CN108345612A (en) A kind of question processing method and device, a kind of device for issue handling
CN111400584A (en) Association word recommendation method and device, computer equipment and storage medium
Xuanyuan et al. Sentiment classification algorithm based on multi-modal social media text information
CN112528136A (en) Viewpoint label generation method and device, electronic equipment and storage medium
Li et al. LSTM-based deep learning models for answer ranking
Ahmed et al. Conversational ai: An explication of few-shot learning problem in transformers-based chatbot systems
Tomer et al. STV-BEATS: skip thought vector and bi-encoder based automatic text summarizer
Benayas et al. Unified transformer multi-task learning for intent classification with entity recognition
CN116821307B (en) Content interaction method, device, electronic equipment and storage medium
Kurup et al. Evolution of neural text generation: Comparative analysis
CN111767720A (en) Title generation method, computer and readable storage medium
Chakkarwar et al. A Review on BERT and Its Implementation in Various NLP Tasks
Du et al. Hierarchical multi-layer transfer learning model for biomedical question answering
CN114328820A (en) Information searching method and related equipment

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant