CN114049528A - Method and equipment for identifying brand name - Google Patents

Method and equipment for identifying brand name

Info

Publication number
CN114049528A
CN114049528A (application CN202210030450.XA)
Authority
CN
China
Prior art keywords
model
sequence
label
labeling
result
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210030450.XA
Other languages
Chinese (zh)
Other versions
CN114049528B (en)
Inventor
Not disclosed (不公告发明人)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Mido Technology Co ltd
Original Assignee
Shanghai Mdata Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Mdata Information Technology Co ltd
Priority to CN202210030450.XA
Publication of CN114049528A
Application granted
Publication of CN114049528B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/217 Validation; Performance evaluation; Active pattern learning techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Software Systems (AREA)
  • Medical Informatics (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Image Analysis (AREA)

Abstract

The method comprises: performing BMES sequence labeling on each position of each text in the sample data to obtain training data comprising a text sequence and a label sequence; constructing a sequence labeling model with the training data, and constructing a classification model by a machine learning method; recognizing an original text with the sequence labeling model to obtain a sequence labeling result; and correcting the sequence labeling result with the classification model to obtain the brand name recognition result. The accuracy of brand name recognition is thereby improved, and the classification model eliminates false alarms caused by recognition errors.

Description

Method and equipment for identifying brand name
Technical Field
The present application relates to the field of computer technologies, and in particular, to a method and an apparatus for identifying a brand name.
Background
Although machine learning and deep learning models already exist, they still tend to miss brand names, and for easily confused character groups a model trained on an insufficient corpus is prone to mis-prediction; for example, the brand "美的" (Midea) is easily confused when the same characters appear in an ordinary phrase such as "美的风景" (a beautiful landscape). On the other hand, training such a model relies on a well-labeled corpus, and producing one manually is very time-consuming and labor-intensive, so a better labeling method is needed.
Disclosure of Invention
An object of the present application is to provide a method and an apparatus for identifying a brand name, which solve the prior-art problems that brand name identification relies on manual labeling and that brands are easily missed or confused with ordinary words.
According to one aspect of the present application, there is provided a method of brand name identification, the method comprising:
performing sequence label labeling on each position of each text in the sample data by using a BMES label type to obtain training data comprising a text sequence and a label sequence;
constructing a sequence labeling model by using the training data, and constructing a classification model by adopting a machine learning method;
identifying an original text based on the sequence labeling model to obtain an identification result of sequence labeling;
and correcting the recognition result of the sequence label based on the classification model to obtain the recognition result of the brand name.
Optionally, performing BMES sequence labeling on each position of each text in the sample data includes:
performing BMES sequence labeling on each position of each text in the sample data according to a regular expression, wherein the regular expression is written from rules summarized from the business.
Optionally, constructing a sequence annotation model using the training data includes:
inputting the training data into a conditional random field model to obtain a training result;
and performing model effect evaluation on the training result by using test data, and taking the model whose evaluation result meets the requirement as the sequence labeling model.
Optionally, identifying the original text based on the sequence tagging model to obtain an identification result of the sequence tagging, including:
inputting the original text into the sequence labeling model to generate feature functions at each time step, wherein the feature functions comprise state feature functions and transition feature functions;
acquiring the weight corresponding to each feature function from the sequence labeling model, and determining a label network according to the feature functions and the corresponding weights;
and calculating the label network by using a Viterbi decoding algorithm to obtain an optimal label path, and obtaining the recognition result of the sequence labeling according to the optimal label path.
Optionally, constructing a sequence annotation model using the training data includes:
adding a conditional random field model layer into the BERT model to obtain a combined model;
inputting the training data into the combined model to obtain a training result;
and performing model effect evaluation on the training result by using the test data, and taking a combined model with an evaluation result meeting the requirement as a sequence labeling model.
Optionally, identifying the original text based on the sequence tagging model to obtain an identification result of the sequence tagging, including:
extracting features of the training data by using the BERT model, and then computing over the features with the conditional random field model layer to obtain a transition matrix;
and calculating the transition matrix by using a Viterbi decoding algorithm to obtain an optimal label path, and obtaining the recognition result of the sequence labeling according to the optimal label path.
Optionally, the constructing the classification model by using a machine learning method includes:
acquiring a positive sample from the training data, and acquiring a negative sample from the corpus according to the regular expression;
labeling the positive examples with a first label and the negative examples with a second label;
and constructing a classification model according to the marked positive samples and the marked negative samples.
Optionally, correcting the recognition result of the sequence label based on the classification model to obtain a recognition result of a brand name, including:
and inputting the recognition result of the sequence labeling into the classification model for recognition; if the classification result is the first label, a brand name exists in the recognition result of the sequence labeling, and if the classification result is the second label, the recognition result of the sequence labeling is corrected.
According to yet another aspect of the present application, there is also provided a brand name recognition apparatus, including:
one or more processors; and
a memory storing computer readable instructions that, when executed, cause the processor to perform the operations of the method as previously described.
According to yet another aspect of the present application, there is also provided a computer readable medium having computer readable instructions stored thereon, the computer readable instructions being executable by a processor to implement the method as described above.
Compared with the prior art, the present application performs BMES sequence labeling on each position of each text in the sample data to obtain training data comprising text sequences and label sequences; constructs a sequence labeling model with the training data and a classification model by a machine learning method; recognizes an original text with the sequence labeling model to obtain a sequence labeling result; and corrects the sequence labeling result with the classification model to obtain the brand name recognition result. The accuracy of brand name recognition is thereby improved, and the classification model eliminates false alarms caused by recognition errors.
Drawings
Other features, objects and advantages of the present application will become more apparent upon reading of the following detailed description of non-limiting embodiments thereof, made with reference to the accompanying drawings in which:
FIG. 1 illustrates a method flow diagram of brand name identification provided in accordance with an aspect of the present application;
FIG. 2 is a schematic diagram of sequence annotation in an embodiment of the present application;
FIG. 3 is a schematic diagram illustrating a process of obtaining a sequence annotation model according to an embodiment of the present application;
FIG. 4 illustrates an example diagram of a method of brand name identification in an embodiment of the present application.
The same or similar reference numbers in the drawings identify the same or similar elements.
Detailed Description
The present application is described in further detail below with reference to the attached figures.
In a typical configuration of the present application, the terminal, the device serving the network, and the trusted party each include one or more processors (e.g., Central Processing Units (CPUs)), input/output interfaces, network interfaces, and memory.
The memory may include volatile memory in a computer-readable medium, such as random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media include permanent and non-permanent, removable and non-removable media, and may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change RAM (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory media such as modulated data signals and carrier waves.
Fig. 1 shows a flowchart of a method for brand name identification provided according to an aspect of the present application. The method comprises steps S11 to S14. In step S11, BMES sequence labeling is performed on each position of each text in the sample data to obtain training data comprising a text sequence and a label sequence; in step S12, a sequence labeling model is constructed with the training data, and a classification model is constructed by a machine learning method; in step S13, an original text is recognized based on the sequence labeling model to obtain a sequence labeling result; and in step S14, the sequence labeling result is corrected based on the classification model to obtain the brand name recognition result. The accuracy of brand name recognition is thereby improved, and the classification model eliminates false alarms caused by recognition errors.
Specifically, in step S11, BMES sequence labeling is performed on each position of each text in the sample data to obtain training data including a text sequence and a label sequence. The sample data are text corpora containing brand names; each sentence of each sample is labeled character by character, in order, to produce a label sequence, and labeling the sample data in this way yields the training data. As shown in fig. 2, sequence labeling assigns a label to every position according to the BMES scheme, where B denotes the start position of a word, M a middle position of a word, E the end position of a word, and S a single character that forms a word by itself. After every text has been labeled, the training data, consisting of text sequences and label sequences, are obtained and used to train the model, thereby improving its recognition accuracy.
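As a concrete illustration of the BMES scheme described above, the following Python sketch (not part of the patent; the helper name and example spans are assumptions) converts a text with known brand spans into a character sequence and a BMES label sequence:

def bmes_labels(text, spans):
    """Assign a BMES label to every character position.
    text  : the raw text string
    spans : list of (start, end) index pairs marking known brand mentions, end exclusive
    """
    labels = ["S"] * len(text)            # default: a single character stands alone
    for start, end in spans:
        if end - start == 1:
            labels[start] = "S"           # one-character brand
        else:
            labels[start] = "B"           # start of the word
            labels[end - 1] = "E"         # end of the word
            for i in range(start + 1, end - 1):
                labels[i] = "M"           # middle of the word
    return list(text), labels

chars, tags = bmes_labels("鸿星尔克是中国品牌", [(0, 4)])
print(chars)  # ['鸿', '星', '尔', '克', '是', '中', '国', '品', '牌']
print(tags)   # ['B', 'M', 'M', 'E', 'S', 'S', 'S', 'S', 'S']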
Specifically, in step S12, a sequence labeling model is constructed with the training data, and a classification model is constructed by a machine learning method. The sequence labeling model for recognizing text is trained on the labeled training data, taking the text sequence and its label sequence as input, so that it can recognize the words contained in a text. The classification model is likewise built from training data consisting of data tuples and their associated class labels; for example, the feature tuples of a training set may be expressed as [[a1, a2, a3], [b1, b2, b3], [c1, c2, c3]] with class labels ['a', 'b', 'c'], where each field of a tuple is a feature or attribute of the data. The training data can therefore serve as positive samples for the classification model; negative samples are then collected, and the classification model is trained by a machine learning method. The classification model is used to judge whether the paragraph in which a candidate brand appears really is brand text.
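As a toy illustration of the tuple-and-class-label structure mentioned above (a sketch; the values are the placeholders from the text, not real features):

# Each training instance is a tuple of feature values; the class labels are kept
# in a parallel list, as in the description above.
X_train = [
    ["a1", "a2", "a3"],
    ["b1", "b2", "b3"],
    ["c1", "c2", "c3"],
]
y_train = ["a", "b", "c"]                  # the class label of each tuple

for features, label in zip(X_train, y_train):
    print(features, "->", label)           # each field is a feature/attribute of the data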
Specifically, in step S13, an original text is recognized based on the sequence labeling model to obtain a sequence labeling result. Using the trained sequence labeling model, whose recognition performance has been validated, a new text, i.e. an original text in which brand names are to be detected, is input into the model, and the output is the sequence labeling result of that original text.
Specifically, in step S14, the sequence labeling result is corrected based on the classification model to obtain the brand name recognition result. The original text is classified again by the classification model to decide whether it really contains a brand name, and this classification result is compared with the sequence labeling result to judge whether a brand name has truly been recognized. For example, for the text "美的风景" (a beautiful landscape), the sequence labeling result may contain the brand candidate "美的" (Midea); the original text is then fed into the classification model to judge whether the paragraph containing the candidate is brand text, so as to correct the recognition result, and if the candidate is not a brand name it is filtered out.
Following on from the above embodiment, BMES sequence labeling is performed on each position of each text in the sample data according to a regular expression, where the regular expression is written from rules summarized from the business. Labeling with regular expressions avoids the time and labor cost of manual labeling, and the rules accumulated by domain experts over a long period can be encoded as regular expressions, for example for the brand 美的 (Midea): expressions covering product mentions such as "美的空调" (Midea air conditioner) and "美的洗衣机" (Midea washing machine). Performing the sequence labeling through such regular expressions also improves labeling accuracy.
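A minimal sketch of such regular-expression labeling (the pattern, brand, and product words are illustrative assumptions, not the patent's actual rules):

import re

# Illustrative rule: the brand 美的 (Midea) followed by a product word such as
# 空调 (air conditioner), 洗衣机 (washing machine) or 吹风机 (hair dryer).
BRAND_PATTERN = re.compile(r"(美的)(空调|洗衣机|吹风机)")

def label_with_regex(text):
    """Return (characters, BMES labels), using regex matches as brand spans."""
    labels = ["S"] * len(text)
    for match in BRAND_PATTERN.finditer(text):
        start, end = match.span(1)        # span of the brand group only
        labels[start] = "B"
        for i in range(start + 1, end - 1):
            labels[i] = "M"
        labels[end - 1] = "E"
    return list(text), labels

print(label_with_regex("美的空调很好用"))   # 美 -> B, 的 -> E, the rest -> S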
In some embodiments of the present application, in step S12, the training data are input into a conditional random field model to obtain a training result, the training result is evaluated with test data, and the model whose evaluation result meets the requirement is taken as the sequence labeling model. A conditional random field (CRF) is a discriminative probabilistic model used for sequence labeling: the CRF is trained on the training data, and its training result is evaluated with test data so that a model with a good evaluation result can be selected; that is, if the trained CRF model evaluates well, it can be used as the sequence labeling model of this embodiment to recognize sequence labels.
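A hedged sketch of training and evaluating such a CRF sequence labeler with the third-party sklearn-crfsuite package (the feature template, toy data, and hyperparameters are assumptions; the patent does not prescribe a particular toolkit):

import sklearn_crfsuite
from sklearn_crfsuite import metrics

def char_features(sent, i):
    # A deliberately simple per-character feature template
    return {
        "char": sent[i],
        "prev_char": sent[i - 1] if i > 0 else "<BOS>",
        "next_char": sent[i + 1] if i < len(sent) - 1 else "<EOS>",
    }

def sent2features(sent):
    return [char_features(sent, i) for i in range(len(sent))]

# Toy training corpus: character sequences and their BMES label sequences
sents = [list("美的空调很好用"), list("鸿星尔克是中国品牌")]
tags = [["B", "E", "S", "S", "S", "S", "S"],
        ["B", "M", "M", "E", "S", "S", "S", "S", "S"]]

crf = sklearn_crfsuite.CRF(algorithm="lbfgs", c1=0.1, c2=0.1, max_iterations=100)
crf.fit([sent2features(s) for s in sents], tags)

# In practice the evaluation would use held-out test data, not the training set
pred = crf.predict([sent2features(s) for s in sents])
print(metrics.flat_f1_score(tags, pred, average="weighted", labels=["B", "M", "E", "S"]))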
Following on from the above embodiment, in step S13, the original text is input into the sequence labeling model to generate feature functions at each time step, including state feature functions and transition feature functions; the weight corresponding to each feature function is retrieved from the model, and a label network is determined from the feature functions and their weights; the label network is then decoded with the Viterbi algorithm to obtain the optimal label path, and the sequence labeling result is obtained from that path. When a CRF model is used to recognize sequence labels, the text to be recognized (the original text) is processed to generate the state and transition feature functions at each time step, the corresponding weights are taken from the model and summed, and a lattice, i.e. a label network representing the possible label paths, is obtained; the Viterbi decoding algorithm then yields the optimal label path, and the recognition result follows from the label sequence on that path. Taking "鸿星尔克是中国品牌" (Hongxing Erke is a Chinese brand) as an example, features are extracted from the text and the corresponding weights summed to obtain the label network; computing the optimal path through this network gives B->M->M->E->S->S->S->S->S, i.e. the tag sequence B-M-M-E-S-S-S-S-S.
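A minimal, self-contained sketch of Viterbi decoding over such a lattice (the scores here are random stand-ins; a real CRF derives them from the weighted state and transition feature functions described above):

import numpy as np

LABELS = ["B", "M", "E", "S"]

def viterbi(emission, transition):
    """emission: (seq_len, n_labels) per-position scores;
    transition: (n_labels, n_labels) label-to-label scores.
    Returns the highest-scoring label path."""
    seq_len, n_labels = emission.shape
    score = np.zeros((seq_len, n_labels))
    backptr = np.zeros((seq_len, n_labels), dtype=int)
    score[0] = emission[0]
    for t in range(1, seq_len):
        for j in range(n_labels):
            cand = score[t - 1] + transition[:, j] + emission[t, j]
            backptr[t, j] = int(np.argmax(cand))
            score[t, j] = cand[backptr[t, j]]
    path = [int(np.argmax(score[-1]))]
    for t in range(seq_len - 1, 0, -1):
        path.append(int(backptr[t, path[-1]]))
    return [LABELS[i] for i in reversed(path)]

rng = np.random.default_rng(0)
print(viterbi(rng.normal(size=(9, 4)), rng.normal(size=(4, 4))))   # one label per character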
In some embodiments of the present application, in step S12, a conditional random field layer is added on top of the BERT model to obtain a joint model; the training data are input into the joint model to obtain a training result; the training result is evaluated with test data, and the joint model whose evaluation result meets the requirement is taken as the sequence labeling model. That is, a joint BERT-CRF model is used to recognize the sequence labels of the original text: when the joint model is built, the input is processed by the BERT model, its output is fed into the CRF layer added on top of BERT's output layer, and the final recognition result is obtained after the CRF layer. During training, the joint model is trained on the training data and then evaluated on test data so that a well-performing model can be selected; if the trained joint model evaluates well, it can be used as the sequence labeling model of this embodiment to recognize sequence labels.
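A hedged sketch of such a joint model using the Hugging Face transformers library and the third-party pytorch-crf package (the pretrained model name, tag count, and layer layout are assumptions; the patent does not name specific libraries):

import torch.nn as nn
from transformers import BertModel
from torchcrf import CRF          # pip package "pytorch-crf"

class BertCrfTagger(nn.Module):
    def __init__(self, num_tags=4, pretrained="bert-base-chinese"):
        super().__init__()
        self.bert = BertModel.from_pretrained(pretrained)
        self.classifier = nn.Linear(self.bert.config.hidden_size, num_tags)
        self.crf = CRF(num_tags, batch_first=True)

    def forward(self, input_ids, attention_mask, tags=None):
        hidden = self.bert(input_ids=input_ids,
                           attention_mask=attention_mask).last_hidden_state
        emissions = self.classifier(hidden)                 # per-token label scores
        mask = attention_mask.bool()
        if tags is not None:
            return -self.crf(emissions, tags, mask=mask)    # negative log-likelihood loss
        return self.crf.decode(emissions, mask=mask)        # Viterbi-decoded label paths

During training the loss branch is used; at inference time decode returns the optimal label path for each sentence.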
Following on from this embodiment, features of the input are extracted with the BERT model and then processed by the conditional random field layer to obtain a transition matrix; the transition matrix is decoded with the Viterbi algorithm to obtain the optimal label path, and the sequence labeling result is obtained from that path. When the BERT-CRF model is used, the text data input to BERT (the training data during training) is represented as word vectors, BERT extracts the features, and the CRF layer computes over them to obtain the transition matrix, from which the transition probabilities between labels are learned; each value in the matrix is a transition probability. For example, for the text "鸿星尔克是中国品牌" (Hongxing Erke is a Chinese brand) with the label set B, M, E, S, the matrix records scores such as the transition from B to M; in this example its size is 4 × 9, where 4 is the number of labels and 9 is the sentence length.
And calculating the optimal path in the transfer matrix by using a Viterbi decoding algorithm, wherein the label sequence corresponding to the optimal path is the identification result of the sequence label of the text data at this time.
In some embodiments of the present application, as shown in fig. 3, a batch of texts labeled by the regular expressions provides correct labeling results; these are used as training data to train a CRF or BERT-CRF sequence labeling model, test data are used to evaluate the model, and the model that performs well on metrics such as precision, recall and F1 is saved. The test data are labeled data that did not take part in training and are used to evaluate the quality of the trained model. F1 combines precision and recall, F1 = 2 * Precision * Recall / (Precision + Recall), and F1 > 0.8 is considered good.
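For reference, the standard precision/recall/F1 computation referred to above, written out as a small helper (the example counts are made up):

def prf1(tp, fp, fn):
    """tp: true positives, fp: false positives, fn: false negatives."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# e.g. 90 correctly recognized brands, 10 false alarms, 15 missed brands
print(prf1(90, 10, 15))   # precision 0.90, recall ~0.857, F1 ~0.878 (> 0.8, acceptable)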
As shown in fig. 4, the original text "美的的吹风机" (a Midea hair dryer) is input into the sequence labeling model, which may be a CRF model or a BERT-CRF model, and the model recognizes "美的" (Midea). The original text corresponding to this result (i.e. "美的的吹风机") is then input into the classification model; if the classification model marks it "1", the "美的" recognized by the sequence labeling model is a brand name and the brand name "美的" is output, whereas if the classification model marks it "0", the result is corrected and no brand name is output.
In some embodiments of the present application, when the classification model is constructed, positive samples are taken from the training data and negative samples are collected from the corpus according to the regular expressions; the positive samples are given a first label and the negative samples a second label, and the classification model is built from the labeled positive and negative samples. Texts that do contain a brand are given the first label (for example, 1), and the corpus used to train the sequence labeling model can be regarded as such brand text, so the training data can serve as positive samples. When collecting negative samples, some preference is needed: easily confused brand words such as 小米 (Xiaomi), 苹果 (Apple) and 美的 (Midea) are specifically targeted, related corpora are retrieved from the corpus with regular expressions, and the retrieved texts are used as negative samples with the second label (for example, 0). In this way the trained classification model does not treat every occurrence of a brand's characters as a brand mention.
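A hedged sketch of building such a binary classifier with scikit-learn (character n-gram TF-IDF plus logistic regression is one illustrative choice of machine learning method, and the tiny sample texts are made up):

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Positive samples (label 1): sentences in which the candidate word really is a brand.
# Negative samples (label 0): confusable sentences collected with regular expressions.
texts = [
    "美的空调今年卖得很好",      # Midea air conditioners sold well this year -> brand
    "鸿星尔克是中国品牌",        # Hongxing Erke is a Chinese brand -> brand
    "这里的风景是最美的",        # the scenery here is the most beautiful -> not a brand
    "小米粥很有营养",            # millet porridge is nutritious -> not a brand
]
labels = [1, 1, 0, 0]

clf = make_pipeline(
    TfidfVectorizer(analyzer="char", ngram_range=(1, 3)),   # character n-grams suit Chinese
    LogisticRegression(max_iter=1000),
)
clf.fit(texts, labels)
print(clf.predict(["美的洗衣机在打折"]))   # 1 = brand context, 0 = false alarm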
The sequence labeling recognition result is input into the classification model; if the classification result is the first label, a brand name exists in the sequence labeling result, and if it is the second label, the sequence labeling result is corrected. The classification model thus judges the brand recognition result: if the sentence containing the candidate brand word is marked 0, it is not a brand sentence and correction is triggered; for example, "美的风景" (a beautiful landscape) is marked 0 by the classification model, so the "美的" recognized in that sentence must be filtered out. Filtering the output of the sequence labeling model in this way reduces its false alarms and removes easily confused brand words.
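Putting the two models together, a minimal sketch of this correction step (sequence_model and classifier are stand-ins for the trained sequence labeling model and classification model; the toy implementations below exist only so the sketch runs):

def recognize_brands(text, sequence_model, classifier):
    """Run sequence labeling, then keep only candidates the classifier confirms."""
    candidates = sequence_model(text)      # e.g. ["美的"] extracted from the BMES tags
    confirmed = []
    for brand in candidates:
        # first label (1): the sentence really is brand text -> keep the candidate
        # second label (0): false alarm -> the result is corrected (filtered out)
        if classifier(text) == 1:
            confirmed.append(brand)
    return confirmed

demo_sequence_model = lambda t: ["美的"] if "美的" in t else []
demo_classifier = lambda t: 0 if "风景" in t else 1     # scenery sentences are not brand text
print(recognize_brands("美的的吹风机很好用", demo_sequence_model, demo_classifier))  # ['美的']
print(recognize_brands("这里的风景是最美的", demo_sequence_model, demo_classifier))  # []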
Furthermore, the embodiment of the present application also provides a computer readable medium, on which computer readable instructions are stored, the computer readable instructions being executable by a processor to implement the aforementioned method for brand name identification.
In an embodiment of the present application, there is also provided an apparatus for brand name identification, the apparatus including:
one or more processors; and
a memory storing computer readable instructions that, when executed, cause the processor to perform the operations of the method as previously described.
For example, the computer readable instructions, when executed, cause the one or more processors to:
performing BMES sequence labeling on each position of each text in the sample data to obtain training data comprising a text sequence and a label sequence;
constructing a sequence labeling model by using the training data, and constructing a classification model by adopting a machine learning method;
identifying an original text based on the sequence labeling model to obtain an identification result of sequence labeling;
and correcting the recognition result of the sequence label based on the classification model to obtain the recognition result of the brand name.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present application without departing from the spirit and scope of the application. Thus, if such modifications and variations of the present application fall within the scope of the claims of the present application and their equivalents, the present application is intended to include such modifications and variations as well.
It should be noted that the present application may be implemented in software and/or a combination of software and hardware, for example, implemented using Application Specific Integrated Circuits (ASICs), general purpose computers or any other similar hardware devices. In one embodiment, the software programs of the present application may be executed by a processor to implement the steps or functions described above. Likewise, the software programs (including associated data structures) of the present application may be stored in a computer readable recording medium, such as RAM memory, magnetic or optical drive or diskette and the like. Additionally, some of the steps or functions of the present application may be implemented in hardware, for example, as circuitry that cooperates with the processor to perform various steps or functions.
In addition, some of the present application may be implemented as a computer program product, such as computer program instructions, which when executed by a computer, may invoke or provide methods and/or techniques in accordance with the present application through the operation of the computer. Program instructions which invoke the methods of the present application may be stored on a fixed or removable recording medium and/or transmitted via a data stream on a broadcast or other signal-bearing medium and/or stored within a working memory of a computer device operating in accordance with the program instructions. An embodiment according to the present application comprises an apparatus comprising a memory for storing computer program instructions and a processor for executing the program instructions, wherein the computer program instructions, when executed by the processor, trigger the apparatus to perform a method and/or a solution according to the aforementioned embodiments of the present application.
It will be evident to those skilled in the art that the present application is not limited to the details of the foregoing illustrative embodiments, and that the present application may be embodied in other specific forms without departing from the spirit or essential attributes thereof. The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the application being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference sign in a claim should not be construed as limiting the claim concerned. Furthermore, it is obvious that the word "comprising" does not exclude other elements or steps, and the singular does not exclude the plural. The terms first, second, etc. are used to denote names, but not any particular order.

Claims (10)

1. A method of brand name identification, the method comprising:
performing BMES sequence labeling on each position of each text in the sample data to obtain training data comprising a text sequence and a label sequence;
constructing a sequence labeling model by using the training data, and constructing a classification model by adopting a machine learning method;
identifying an original text based on the sequence labeling model to obtain an identification result of sequence labeling;
and correcting the recognition result of the sequence label based on the classification model to obtain the recognition result of the brand name.
2. The method of claim 1, wherein performing BMES sequence labeling on each position of each text in the sample data comprises:
performing BMES sequence labeling on each position of each text in the sample data according to a regular expression, wherein the regular expression is written from rules summarized from the business.
3. The method of claim 1, wherein constructing a sequence annotation model using the training data comprises:
inputting the training data into a conditional random field model to obtain a training result;
and performing model effect evaluation on the training result by using test data, and taking the model whose evaluation result meets the requirement as the sequence labeling model.
4. The method of claim 3, wherein recognizing the original text based on the sequence label model to obtain a recognition result of the sequence label comprises:
inputting the original text into the sequence labeling model to generate feature functions at each time step, wherein the feature functions comprise state feature functions and transition feature functions;
acquiring the weight corresponding to each feature function from the sequence labeling model, and determining a label network according to the feature functions and the corresponding weights;
and calculating the label network by using a Viterbi decoding algorithm to obtain an optimal label path, and obtaining the recognition result of the sequence labeling according to the optimal label path.
5. The method of claim 1, wherein constructing a sequence annotation model using the training data comprises:
adding a conditional random field model layer into the BERT model to obtain a combined model;
inputting the training data into the combined model to obtain a training result;
and performing model effect evaluation on the training result by using the test data, and taking a combined model with an evaluation result meeting the requirement as a sequence labeling model.
6. The method of claim 5, wherein recognizing the original text based on the sequence label model to obtain a recognition result of the sequence label comprises:
extracting features of the training data by using the BERT model, and then computing over the features with the conditional random field model layer to obtain a transition matrix;
and calculating the transition matrix by using a Viterbi decoding algorithm to obtain an optimal label path, and obtaining the recognition result of the sequence labeling according to the optimal label path.
7. The method of claim 2, wherein constructing the classification model using a machine learning approach comprises:
acquiring a positive sample from the training data, and acquiring a negative sample from the corpus according to the regular expression;
labeling the positive examples with a first label and the negative examples with a second label;
and constructing a classification model according to the marked positive samples and the marked negative samples.
8. The method of claim 7, wherein correcting the recognition result of the sequence label based on the classification model to obtain a recognition result of a brand name comprises:
and inputting the recognition result of the sequence labeling into the classification model for recognition; if the classification result is the first label, a brand name exists in the recognition result of the sequence labeling, and if the classification result is the second label, the recognition result of the sequence labeling is corrected.
9. An apparatus for brand name identification, the apparatus comprising:
one or more processors; and
a memory storing computer readable instructions that, when executed, cause the processor to perform the operations of the method of any of claims 1 to 8.
10. A computer readable medium having computer readable instructions stored thereon which are executable by a processor to implement the method of any one of claims 1 to 8.
CN202210030450.XA 2022-01-12 2022-01-12 Brand name identification method and equipment Active CN114049528B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210030450.XA CN114049528B (en) 2022-01-12 2022-01-12 Brand name identification method and equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210030450.XA CN114049528B (en) 2022-01-12 2022-01-12 Brand name identification method and equipment

Publications (2)

Publication Number Publication Date
CN114049528A true CN114049528A (en) 2022-02-15
CN114049528B CN114049528B (en) 2022-06-28

Family

ID=80196295

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210030450.XA Active CN114049528B (en) 2022-01-12 2022-01-12 Brand name identification method and equipment

Country Status (1)

Country Link
CN (1) CN114049528B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102799579A (en) * 2012-07-18 2012-11-28 西安理工大学 Statistical machine translation method with error self-diagnosis and self-correction functions
CN109614614A (en) * 2018-12-03 2019-04-12 焦点科技股份有限公司 A kind of BILSTM-CRF name of product recognition methods based on from attention
US20190354582A1 (en) * 2018-05-21 2019-11-21 LEVERTON GmbH Post-filtering of named entities with machine learning
CN111460827A (en) * 2020-04-01 2020-07-28 北京爱咔咔信息技术有限公司 Text information processing method, system, equipment and computer readable storage medium
CN112906367A (en) * 2021-02-08 2021-06-04 上海宏原信息科技有限公司 Information extraction structure, labeling method and identification method of consumer text
CN113191148A (en) * 2021-04-30 2021-07-30 西安理工大学 Rail transit entity identification method based on semi-supervised learning and clustering

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102799579A (en) * 2012-07-18 2012-11-28 西安理工大学 Statistical machine translation method with error self-diagnosis and self-correction functions
US20190354582A1 (en) * 2018-05-21 2019-11-21 LEVERTON GmbH Post-filtering of named entities with machine learning
CN109614614A (en) * 2018-12-03 2019-04-12 焦点科技股份有限公司 A kind of BILSTM-CRF name of product recognition methods based on from attention
CN111460827A (en) * 2020-04-01 2020-07-28 北京爱咔咔信息技术有限公司 Text information processing method, system, equipment and computer readable storage medium
CN112906367A (en) * 2021-02-08 2021-06-04 上海宏原信息科技有限公司 Information extraction structure, labeling method and identification method of consumer text
CN113191148A (en) * 2021-04-30 2021-07-30 西安理工大学 Rail transit entity identification method based on semi-supervised learning and clustering

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
TANG, Zihui: "Introduction to Medical Artificial Intelligence" (医学人工智能导论), 30 April 2020, Shanghai Scientific & Technical Publishers *
程序员——涤生: "Named Entity Recognition with Deep Learning (VII): An Introduction to CRF", https://zhuanlan.zhihu.com/p/94457579 *

Also Published As

Publication number Publication date
CN114049528B (en) 2022-06-28

Similar Documents

Publication Publication Date Title
CN110427487B (en) Data labeling method and device and storage medium
CN111198948A (en) Text classification correction method, device and equipment and computer readable storage medium
CN113420122B (en) Method, device, equipment and storage medium for analyzing text
CN112989117B (en) Video classification method and device, electronic equipment and computer storage medium
CN111222336A (en) Method and device for identifying unknown entity
CN111274821B (en) Named entity identification data labeling quality assessment method and device
CN107844531B (en) Answer output method and device and computer equipment
US20220335335A1 (en) Method and system for identifying mislabeled data samples using adversarial attacks
CN112732863A (en) Standardized segmentation method for electronic medical records
CN114691907B (en) Cross-modal retrieval method, device and medium
CN111898378A (en) Industry classification method and device for government and enterprise clients, electronic equipment and storage medium
CN114049528B (en) Brand name identification method and equipment
CN113792545B (en) News event activity name extraction method based on deep learning
CN115563253A (en) Multi-task event extraction method and device based on question answering
CN111400606B (en) Multi-label classification method based on global and local information extraction
CN114254138A (en) Multimedia resource classification method and device, electronic equipment and storage medium
CN111488737B (en) Text recognition method, device and equipment
CN114328902A (en) Text labeling model construction method and device
CN113723436A (en) Data processing method and device, computer equipment and storage medium
CN113591857A (en) Character image processing method and device and ancient Chinese book image identification method
CN111461330A (en) Multi-language knowledge base construction method and system based on multi-language resume
CN116308635B (en) Plasticizing industry quotation structuring method, device, equipment and storage medium
CN110717029A (en) Information processing method and system
CN115359495B (en) Test paper information processing method and system
CN113806562B (en) Model training method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
PE01 Entry into force of the registration of the contract for pledge of patent right

Denomination of invention: A method and equipment for brand name recognition

Effective date of registration: 20230215

Granted publication date: 20220628

Pledgee: Shanghai Rural Commercial Bank Co.,Ltd. Pudong branch

Pledgor: SHANGHAI MDATA INFORMATION TECHNOLOGY Co.,Ltd.

Registration number: Y2023310000031

PE01 Entry into force of the registration of the contract for pledge of patent right
CP01 Change in the name or title of a patent holder

Address after: Room 301ab, No.10, Lane 198, zhangheng Road, China (Shanghai) pilot Free Trade Zone, Pudong New Area, Shanghai 201204

Patentee after: Shanghai Mido Technology Co.,Ltd.

Address before: Room 301ab, No.10, Lane 198, zhangheng Road, China (Shanghai) pilot Free Trade Zone, Pudong New Area, Shanghai 201204

Patentee before: SHANGHAI MDATA INFORMATION TECHNOLOGY Co.,Ltd.

CP01 Change in the name or title of a patent holder
PC01 Cancellation of the registration of the contract for pledge of patent right

Granted publication date: 20220628

Pledgee: Shanghai Rural Commercial Bank Co.,Ltd. Pudong branch

Pledgor: SHANGHAI MDATA INFORMATION TECHNOLOGY Co.,Ltd.

Registration number: Y2023310000031

PC01 Cancellation of the registration of the contract for pledge of patent right