CN111553442B - Optimization method and system for classifier chain tag sequence - Google Patents


Info

Publication number
CN111553442B
CN111553442B (application CN202010397834.6A)
Authority
CN
China
Prior art keywords: occurrence, sample, classifier, classifier chain, label
Prior art date
Legal status: Active
Application number
CN202010397834.6A
Other languages
Chinese (zh)
Other versions
CN111553442A (en
Inventor
郑蓉蓉
薛文婷
张强
宋博川
贾全烨
柴博
张闻彬
Current Assignee
State Grid Corp of China SGCC
State Grid Information and Telecommunication Co Ltd
Global Energy Interconnection Research Institute
Information and Telecommunication Branch of State Grid Shandong Electric Power Co Ltd
Original Assignee
State Grid Smart Grid Research Institute Co ltd
State Grid Corp of China SGCC
State Grid Information and Telecommunication Co Ltd
Information and Telecommunication Branch of State Grid Shandong Electric Power Co Ltd
Priority date
Filing date
Publication date
Application filed by State Grid Smart Grid Research Institute Co ltd, State Grid Corp of China SGCC, State Grid Information and Telecommunication Co Ltd, Information and Telecommunication Branch of State Grid Shandong Electric Power Co Ltd filed Critical State Grid Smart Grid Research Institute Co ltd
Priority to CN202010397834.6A priority Critical patent/CN111553442B/en
Publication of CN111553442A publication Critical patent/CN111553442A/en
Application granted granted Critical
Publication of CN111553442B publication Critical patent/CN111553442B/en

Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06F — ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/241 — Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/214 — Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • Y02T10/40 — Engine management systems


Abstract

The invention discloses a method and a system for optimizing a classifier chain tag sequence. The method comprises: acquiring an input sample to be classified; identifying the input sample with a classifier chain model, generating a sample label set and forming a classifier chain; obtaining the co-occurrence matrix corresponding to the sample tag set by co-occurrence analysis; composing a co-occurrence vector from a plurality of co-occurrence branches of the co-occurrence matrix; and acquiring the initial branch of the classifier chain from the co-occurrence vector and generating the sequence of the classifier chain labels based on a greedy strategy. The invention provides a label-sequence generation strategy for the classifier chain: the sequence is generated by an accelerated algorithm, so time consumption is low, the accuracy of the obtained classifier chain label sequence is high, and the performance of the original classifier chain model is optimized.

Description

Optimization method and system for classifier chain tag sequence
Technical Field
The invention relates to the technical field of machine learning, in particular to a method and a system for optimizing a classifier chain tag sequence.
Background
Classification is a very important method in machine learning: it enables a machine to categorize objects of interest and thereby recognize different objects. In practical problems, however, the category to which an object belongs carries a certain inherent uncertainty; for example, in some text classification tasks, news about a sports star may belong both to sports news and to celebrity entertainment news. Since real objects can carry several attributes at once, multi-label classification is the common method for accurately predicting all the attributes of an object. Compared with binary or multi-class classification, the main technical difficulties of multi-label classification are that the dimension of the labels to be processed is too high and that the potential links between labels are hard to explore.
To address this, Read et al. proposed the Classifier Chain (CC) algorithm based on binary relevance, which connects the classifiers in series so that the chain as a whole can exploit the potential associations between labels and output better results. Although the classifier chain algorithm improves on the original binary relevance algorithm, its chain-growing structure increases the dimension of the labels to be processed and therefore the time consumption of the whole algorithm; moreover, the order of the classifier chain is generated randomly, which carries a risk of error propagation.
Disclosure of Invention
Therefore, the optimization method and system for the classifier chain label sequence provided herein overcome the defects of the prior art in multi-label classification: excessively high label dimension, large computation cost, randomly generated classifier chains and a high risk of error propagation.
In order to achieve the above purpose, the present invention provides the following technical solutions:
in a first aspect, an embodiment of the present invention provides a method for optimizing a classifier chain tag sequence, including:
acquiring an input sample to be classified;
identifying input samples by using a classifier chain model, generating a sample label set, and forming a classifier chain;
obtaining a co-occurrence matrix corresponding to the sample tag set by utilizing co-occurrence analysis;
forming a co-occurrence vector by using a plurality of co-occurrence branches of the co-occurrence matrix;
and acquiring the initial branches of the classifier chains according to the co-occurrence vectors, and generating the sequence of the classifier chain labels based on a greedy strategy.
In one embodiment, the elements of the co-occurrence matrix measure, for pairs of elements of the sample tag set, how often the two elements occur together and how often neither occurs.
In an embodiment, the step of forming a co-occurrence vector using a plurality of co-occurrence branches of a co-occurrence matrix includes:
obtaining the co-occurrence rate corresponding to each first sample tag element in the co-occurrence matrix, and obtaining the maximum co-occurrence rate;
acquiring a second sample tag element corresponding to the maximum co-occurrence rate of each first sample tag element;
forming a plurality of co-occurrence branches by each first sample tag element, the maximum co-occurrence rate corresponding to each first sample tag element and the second sample tag element corresponding to the maximum co-occurrence rate of each first sample tag element;
a co-occurrence vector is composed of a plurality of co-occurrence branches.
In one embodiment, the co-occurrence rate is the ratio of each element of the co-occurrence matrix to the number of input samples to be classified.
In one embodiment, the step of obtaining the initial branch of the classifier chain according to the co-occurrence vector and generating the sequence of the classifier chain labels based on the greedy strategy includes:
taking the co-occurrence branch with the largest co-occurrence rate in the co-occurrence vector as the initial branch of the classifier chain;
and searching for the maximum co-occurrence branch of the label at the tail of the classifier chain, growing the whole chain structure step by step until the whole classifier chain label sequence is obtained.
In one embodiment, the step of growing the whole chain structure by searching for the maximum co-occurrence branch of the label at the end of the classifier chain includes:
selecting the third sample tag element at the end of the classifier chain; if the co-occurrence vector contains a co-occurrence branch with the third sample tag element as an endpoint, adding the fourth sample tag element at the other end of that branch to the tail of the classifier chain and removing the fourth sample tag element from the tag set; otherwise, traversing the remaining tag set, acquiring the suboptimal fifth sample tag element, adding the fifth sample tag element to the classifier chain and removing it from the tag set; and so on, so that the whole chain structure is grown continuously.
In a second aspect, an embodiment of the present invention provides an optimization system for a classifier chain tag sequence, including:
the sample acquisition module is used for acquiring the input sample to be classified;
the classifier chain model identification module is used for acquiring a sample label set of a sample to be classified;
the co-occurrence analysis module is used for acquiring a co-occurrence matrix corresponding to the sample label set;
the co-occurrence vector acquisition module is used for acquiring co-occurrence vectors by utilizing a plurality of co-occurrence branches of the co-occurrence matrix;
and the classifier chain label sequence generation module is used for acquiring the initial branches of the classifier chains according to the co-occurrence vector and generating the required sequence of the classifier chains based on a greedy strategy.
In a third aspect, an embodiment of the present invention provides a terminal, including: the system comprises at least one processor and a memory communicatively connected with the at least one processor, wherein the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor to cause the at least one processor to perform the method for optimizing the classifier chain tag sequence according to the first aspect of the embodiment of the present invention.
In a fourth aspect, an embodiment of the present invention provides a computer-readable storage medium storing computer instructions configured to cause a computer to perform the method for optimizing a classifier chain tag sequence according to the first aspect of the embodiment of the present invention.
The technical scheme of the invention has the following advantages:
the invention provides a method and a system for optimizing a classifier chain label sequence, which are characterized in that an input sample to be classified is obtained; identifying input samples by using a classifier chain model, generating a sample label set, and forming a classifier chain; obtaining a co-occurrence matrix corresponding to the sample tag set by utilizing co-occurrence analysis; forming a co-occurrence vector by using a plurality of co-occurrence branches of the co-occurrence matrix; the method comprises the steps of obtaining the initial branches of the classifier chains according to the co-occurrence vector, generating the sequence of the classifier chain labels based on a greedy strategy, providing a new corresponding label sequence generation strategy, generating the label sequence through an acceleration algorithm, and achieving high accuracy of the label sequence of the obtained classifier chains and performance optimization of the original classifier chain model.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings that are needed in the description of the embodiments or the prior art will be briefly described, and it is obvious that the drawings in the description below are some embodiments of the present invention, and other drawings can be obtained according to the drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flowchart of a specific example of a method for optimizing a classifier chain tag sequence according to an embodiment of the present invention;
FIG. 2 is a block diagram of an optimization system for a classifier chain tag sequence according to an embodiment of the present invention;
fig. 3 is a composition diagram of a specific example of a terminal according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be described clearly and completely below with reference to the accompanying drawings, in which some, but not all, embodiments of the invention are shown. All other embodiments obtained by those skilled in the art based on the embodiments of the invention without inventive effort fall within the scope of the invention.
In the description of the present invention, it should be noted that the directions or positional relationships indicated by the terms "center", "upper", "lower", "left", "right", "vertical", "horizontal", "inner", "outer", etc. are based on the directions or positional relationships shown in the drawings, are merely for convenience of describing the present invention and simplifying the description, and do not indicate or imply that the devices or elements referred to must have a specific orientation, be configured and operated in a specific orientation, and thus should not be construed as limiting the present invention. Furthermore, the terms "first," "second," and "third" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance.
In the description of the present invention, it should be noted that, unless explicitly specified and limited otherwise, the terms "mounted," "connected," and "connected" are to be construed broadly, and may be either fixedly connected, detachably connected, or integrally connected, for example; can be mechanically or electrically connected; the two components can be directly connected or indirectly connected through an intermediate medium, or can be communicated inside the two components, or can be connected wirelessly or in a wired way. The specific meaning of the above terms in the present invention will be understood in specific cases by those of ordinary skill in the art.
In addition, the technical features of the different embodiments of the present invention described below may be combined with each other as long as they do not collide with each other.
Example 1
The method for optimizing the classifier chain tag sequence provided by the embodiment of the invention, as shown in fig. 1, comprises the following steps:
step S1: and obtaining an input sample to be classified.
In an embodiment of the invention, text to be classified is input into a classifier chain model.
Step S2: and identifying input samples by using a classifier chain model, generating a sample label set, and forming a classifier chain.
In practice, the multi-label classification problem is solved with a binary relevance transformation strategy: the original multi-label classification problem is converted into a number of corresponding binary classification problems, and the binary results are combined into the final multi-label classification set. For example, let X be the sample input space (in this embodiment, the input samples to be classified) and Y the sample output space (in this embodiment, the space from which the sample tag set is generated; the sample tag set is an important component of the classifier chain). For these two sample spaces there is a corresponding data set D that satisfies:

D = {(x_i, Y_i) | i = 1, 2, ..., n}, x_i ∈ X, Y_i ⊆ Y

where D comprises n training samples x_i (in this embodiment, n is the number of input samples to be classified). If each single attribute of a sample is denoted a, each training sample x_i can be represented by a k-dimensional vector as follows:
x_i = [a_i1, a_i2, ..., a_ik]
For d labels, d classifiers need to be trained. Assuming the result output by each classifier f_j is f_j(x_i), the objective of binary relevance is to use the d outputs f_1(x_i), ..., f_d(x_i) to approximate the corresponding true result Y_i. The core of the binary relevance algorithm is simple, a single traversal suffices and the cost is low, but the lack of any use of inter-label association makes binary relevance perform poorly in actual multi-label classification.
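As an illustration of the binary relevance strategy just described, a minimal Python sketch follows. This is a toy, not the patent's implementation: the nearest-centroid rule and the function names (`train_binary_relevance`, `predict_binary_relevance`) are hypothetical stand-ins for any trained base classifier f.

```python
def train_binary_relevance(samples, label_sets, all_labels):
    """Binary relevance: train one independent binary classifier per label.
    The toy classifier is a nearest-centroid rule; any base learner
    (e.g. an SVM) could be substituted."""
    def centroid(points):
        return [sum(c) / len(points) for c in zip(*points)] if points else None

    models = {}
    for label in all_labels:
        pos = [x for x, ys in zip(samples, label_sets) if label in ys]
        neg = [x for x, ys in zip(samples, label_sets) if label not in ys]
        models[label] = (centroid(pos), centroid(neg))
    return models


def predict_binary_relevance(models, x):
    """The union of the d independent binary decisions is the predicted label set."""
    def dist2(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))

    predicted = set()
    for label, (cp, cn) in models.items():
        if cp is not None and (cn is None or dist2(x, cp) <= dist2(x, cn)):
            predicted.add(label)
    return predicted
```

Because every per-label model is fit in isolation, no label correlation is exploited — exactly the weakness the classifier chain addresses.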
The classifier chain is an optimization algorithm based on binary relevance. Its core is consistent with binary relevance: the multi-label classification problem is converted into several binary classification problems. However, the classifiers are connected in series, so the trained models are no longer isolated from one another and the potential correlations between labels are exploited, optimizing the result of the whole classification task. The classifier chain realizes this serial connection by appending the outputs of earlier classifiers to its input samples. Assume a given input x_i satisfies:

x_i = [a_i1, a_i2, ..., a_ik]

The results of the first q classifiers that have already predicted are:

[f_1(x_i), f_2(x_i), ..., f_q(x_i)]

Each newly obtained classification result updates the corresponding sample, so after each step the updated sample is:

[a_i1, ..., a_ik, f_1(x_i), ..., f_q(x_i)], q = 1, 2, ..., d
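The chain update x_i → [a_i1, ..., a_ik, f_1(x_i), ..., f_q(x_i)] can be sketched as follows. The function name and the toy lambda classifiers used in the test are hypothetical stand-ins for trained models.

```python
def classifier_chain_predict(x, chain):
    """Run a classifier chain: each link sees the original attribute vector
    plus all earlier links' outputs appended, so potential label
    correlations can be exploited.  `chain` is an ordered list of
    (label, f) pairs where f maps the augmented vector to 0 or 1."""
    features = list(x)            # start from [a_i1, ..., a_ik]
    predicted = set()
    for label, f in chain:
        y = f(features)           # predict using the augmented features
        features.append(y)        # append the result for the next link
        if y == 1:
            predicted.add(label)
    return predicted
```

Note that the prediction of every link depends on the links placed before it, which is why the label order matters.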
the method is characterized in that the method comprises the steps of determining a sequence of the classifier, wherein the sequence of the classifier is determined by a sequence determination algorithm, and the sequence determination algorithm is used for determining the sequence of the classifier according to the sequence determination algorithm.
Step S3: and obtaining a co-occurrence matrix corresponding to the sample label set by utilizing co-occurrence analysis.
The classifier chain, as a binary relevance based optimization algorithm, generates a sample label set that is an important component of the chain. The embodiment of the invention quantitatively measures the potential relation between two sample tag set elements by counting their co-occurrences through co-occurrence analysis, thereby effectively measuring the deep relations among the sample tag set elements.
In the embodiment of the invention, the elements of the co-occurrence matrix measure how often pairs of sample tag set elements occur together and how often neither occurs. First, a co-occurrence matrix is generated for the research objects at hand, with the research objects serving as both the rows and the columns of the matrix. For text, for example, a passage can be split into words and the words used as the rows and columns of the corresponding co-occurrence matrix; this is only an example, and in practical applications the matrix is generated according to actual requirements. Likewise, in the multi-label classification task, the corresponding co-occurrence matrix M is constructed by taking all labels as its rows and columns. Because pairs of labels may rarely occur together, the matrix would otherwise be quite sparse, so the invention also brings labels that do not occur at the same time into the statistics: both labels being absent simultaneously also reflects their relevance to a certain extent. Let S_i be the set of samples carrying label i and S_j the set of samples carrying label j; the formula for the co-occurrence matrix element is:

M_ij = |S_i ∩ S_j| + |S̄_i ∩ S̄_j|

where S̄_i denotes the set of samples that do not carry label i.
in the embodiment of the invention, the co-occurrence rate is the ratio of each element in the co-occurrence matrix to the number of the input samples to be classified, and in order to convert the content of the co-occurrence matrix into the corresponding percentage for convenient comparison, the invention defines the concept of the co-occurrence rate, and provides the number of the input samples to be classified corresponding to n training samples, and the formula for calculating the co-occurrence rate is as follows:
in the embodiment of the invention, the corresponding co-occurrence matrix is calculated according to the co-occurrence matrix and the co-occurrence rate, and meanwhile, it should be noted that the co-occurrence matrix is necessarily a symmetrical matrix, so that only half of the co-occurrence relations corresponding to the elements need to be calculated, and a given tag set l= { L is assumed 1 ,l 2 ,l 3 ,l 4 ,l 5 Simulation generated co-occurrence matrix as follows: :
R      l_1     l_2     l_3     l_4     l_5
l_1     -     0.672   0.649   0.644   0.632
l_2     -       -     0.583   0.676   0.630
l_3     -       -       -     0.674   0.619
l_4     -       -       -       -     0.662
l_5     -       -       -       -       -
In this way, a co-occurrence matrix is established for the label set of the samples and the corresponding co-occurrence rates are calculated, expressing the correlation between labels in proportional form.
Step S4: the co-occurrence vector is composed of a plurality of co-occurrence branches of the co-occurrence matrix.
In an embodiment of the present invention, the step of forming a co-occurrence vector by using a plurality of co-occurrence branches of a co-occurrence matrix includes: obtaining the co-occurrence rate corresponding to each first sample tag element in the co-occurrence matrix, and obtaining the maximum co-occurrence rate; acquiring a second sample tag element corresponding to the maximum co-occurrence rate of each first sample tag element; forming a plurality of co-occurrence branches by each first sample tag element, the maximum co-occurrence rate corresponding to each first sample tag element and the second sample tag element corresponding to the maximum co-occurrence rate of each first sample tag element; a co-occurrence vector is composed of a plurality of co-occurrence branches.
In the embodiment of the invention, the co-occurrence matrix occupies d^2 storage, and traversing it frequently would increase the time consumption of the whole algorithm. The invention therefore performs a simple extraction on the matrix: the co-occurrence rates of each first sample tag element are read from the co-occurrence matrix, the maximum co-occurrence rate of each element is obtained, and the maxima of all labels are combined into the co-occurrence vector V. For the first sample tag element l_i, the maximum co-occurrence rate R_i satisfies:

R_i = max{R_ij}, j = 1, 2, ..., d

Once R_i is found, let l_ji be the second sample tag element whose pairing with l_i yields the maximum co-occurrence rate R_i. Then the first sample tag element l_i, the second sample tag element l_ji and the maximum co-occurrence rate R_i constitute one co-occurrence branch of the co-occurrence vector V, the co-occurrence branch b_i satisfying:

b_i = [l_i, l_ji, R_i]

By extracting all co-occurrence branches b_i, the final co-occurrence vector V can be written as:

V = [[l_1, l_j1, R_1], [l_2, l_j2, R_2], ..., [l_d, l_jd, R_d]]
Co-occurrence vector extraction is accomplished in this way: traversing the co-occurrence vector instead of the matrix itself reduces the traversal cost and helps accelerate the sequence-generation algorithm.
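The extraction of V just described can be sketched as follows (hypothetical function name); R is the rate matrix of step S3, indexed here as nested dictionaries.

```python
def co_occurrence_vector(R, all_labels):
    """Collapse the d x d rate matrix R into d co-occurrence branches
    b_i = [l_i, l_ji, R_i], keeping only each label's strongest partner,
    so later traversals cost O(d) instead of O(d^2)."""
    V = []
    for li in all_labels:
        partner = max((lj for lj in all_labels if lj != li),
                      key=lambda lj: R[li][lj])   # argmax over row i
        V.append([li, partner, R[li][partner]])
    return V
```

Each branch records a label, its strongest partner, and the corresponding maximum co-occurrence rate.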
Step S5: and acquiring the initial branches of the classifier chains according to the co-occurrence vectors, and generating the sequence of the classifier chain labels based on a greedy strategy.
In the embodiment of the invention, the step of acquiring the initial branch of the classifier chain from the co-occurrence vector and generating the sequence of the classifier chain labels based on the greedy strategy includes: taking the co-occurrence branch with the largest co-occurrence rate in the co-occurrence vector as the initial branch of the classifier chain, and from there generating the required sequence by growing the chain. Specifically, following the greedy strategy, the chain is only ever extended at its tail, the label appended at the tail being the one with the maximum co-occurrence rate: the maximum co-occurrence branch of the tail label is searched for and the whole chain structure is grown continuously until the complete classifier chain label sequence is obtained.
In the embodiment of the invention, the step of growing the whole chain structure by searching for the maximum co-occurrence branch of the tail label proceeds as follows. Select the third sample tag element l_i at the end of the classifier chain. If the co-occurrence vector contains a co-occurrence branch b_i with l_i as an endpoint, add the fourth sample tag element l_ji at the other end of b_i to the tail of the chain and remove l_ji from the tag set. Otherwise, traverse the remaining tag set, acquire the suboptimal fifth sample tag element l_inext, add l_inext to the classifier chain and remove it from the tag set. Proceeding in the same manner, the whole chain structure is grown continuously.
In an embodiment of the invention, the greedy-based classifier chain growth therefore starts from the branch with the largest co-occurrence rate and repeatedly extends the tail with its best remaining partner until every label has been placed.
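The growth procedure just described can be sketched as below. This is one reading of step S5 with hypothetical names; in particular the "suboptimal" fallback is interpreted here as the best-rated unused branch, a detail the text leaves open.

```python
def greedy_chain_order(V, all_labels):
    """Greedy generation of the classifier-chain label sequence:
    1. seed the chain with the branch of globally largest co-occurrence rate;
    2. repeatedly look up the tail label's maximum co-occurrence branch and
       append its partner if still unused;
    3. otherwise fall back to the best-rated unused label ("suboptimal");
    4. stop when every label has been placed."""
    branch_of = {b[0]: b for b in V}          # branch indexed by its first label
    start = max(V, key=lambda b: b[2])        # initial branch of the chain
    order = [start[0], start[1]]
    remaining = [l for l in all_labels if l not in order]
    while remaining:
        partner = branch_of[order[-1]][1]     # tail's strongest partner
        nxt = partner if partner in remaining else max(
            remaining, key=lambda l: branch_of[l][2])
        order.append(nxt)
        remaining.remove(nxt)
    return order
```

Because only the d-entry vector V is consulted, each extension step is O(d) rather than a scan of the full d x d matrix.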
the embodiment of the invention provides an optimization method of a classifier chain label sequence, which comprises the steps of obtaining an input sample to be classified; identifying input samples by using a classifier chain model, generating a sample label set, and forming a classifier chain; obtaining a co-occurrence matrix corresponding to the sample tag set by utilizing co-occurrence analysis; forming a co-occurrence vector by using a plurality of co-occurrence branches of the co-occurrence matrix; the method comprises the steps of obtaining initial branches of a classifier chain according to co-occurrence vectors, generating sequence of the classifier chain labels based on greedy strategies, providing a new corresponding label sequence generation strategy, accelerating algorithm generation sequences, consuming less time, obtaining high label sequence accuracy of the classifier chain, obviously improving multi-label classification effect, and realizing performance optimization of an original classifier chain model.
In this embodiment, the optimization method for the classifier chain tag sequence is analyzed by experimental comparison as follows.
First, seven data sets are selected: Yeast, Enron, Scene, emotions, Slashdot-F, CAL500 and medical, covering fields such as text, pictures and biology. The greedy-based classifier chain (Greedy Classifier Chain, GCC) algorithm proposed by the invention is validated on these seven data sets and compared with the initial classifier chain (CC) algorithm and the improved local classifier chain (LOCC) algorithm. The parameters of the data sets used are as follows:
Name        Instances  Features  Labels  Cardinality
emotions       593        72       6      1.879
Enron         1702      1001      53      3.378
Scene         2407       294       6      1.074
Yeast         2417       103      14      4.237
Slashdot-F    1460      1079      22      1.18
CAL500         502        68     174     26.044
medical        978      1449      45      1.245
All experiments were performed in Python, with the corresponding development supported by sklearn and some library functions. For the base classifier, an SVM is adopted with a Gaussian kernel function and penalty parameter C = 100; the base classifier of all algorithms uses the same parameters, so that differences in base-classifier performance do not influence the effect of the sequence extraction itself.
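The setup described corresponds to a scikit-learn configuration along these lines; this is a sketch assuming the standard sklearn API, and the variable names are hypothetical.

```python
# Sketch of the experimental base-classifier setup described in the text:
# an SVM with a Gaussian (RBF) kernel and penalty parameter C = 100, shared
# by all compared algorithms, evaluated with five-fold cross-validation.
from sklearn.svm import SVC
from sklearn.model_selection import KFold

base_classifier = SVC(kernel="rbf", C=100)   # Gaussian kernel, C = 100
cv = KFold(n_splits=5, shuffle=True, random_state=0)
```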
As evaluation indexes, the invention selects Accuracy and F1_macro.
(1) Accuracy. It differs from the Accuracy formula used by general classification tasks; the calculation formula is as follows:

Accuracy = (1/n) Σ_{i=1..n} |S_i ∩ Y_i| / |S_i ∪ Y_i|

Because the Accuracy formula used by general classification tasks is too strict, the invention adopts this multi-label variant to better reflect the performance of multi-label classification algorithms. Here S_i represents the true tag set of sample x_i and Y_i its predicted tag set; |S_i ∩ Y_i| denotes the number of correctly predicted labels and |S_i ∪ Y_i| the total number of labels involved. The larger the Accuracy index, the better. Experimental results are verified by five-fold cross-validation; the performance of the different algorithms with respect to Accuracy is compared in the following table:
Dataset       CC      LOCC    GCC
yeast       0.4585   0.4649  0.4802
scene       0.5943   0.5938  0.6114
emotions    0.3851   0.3665  0.3817
enron       0.4034   0.3997  0.4026
Slashdot-F  0.3945   0.4147  0.4050
CAL500      0.2210   0.2233  0.2347
medical     0.6964   0.7068  0.7032
The best result for each index is the maximum of the corresponding values among the CC, LOCC and GCC algorithms. On the Accuracy index, the GCC algorithm achieves the highest accuracy on the Yeast, Scene and CAL500 data sets, the CC algorithm is highest on the Emotions and Enron data sets, and the LOCC algorithm is highest on the Slashdot-F and Medical data sets. Overall, the GCC method offers better multi-label classification accuracy and greatly improves on the performance of the traditional CC algorithm.
(2) F1_macro. Its calculation formula is as follows:

F1_macro = (1/q) * Σ_{i=1..q} 2 * p_i * r_i / (p_i + r_i)

Since Accuracy mainly evaluates the correctly predicted labels, the invention additionally uses the macro-averaged F1_macro index so that both correct and incorrect predictions are taken into account. In the formula, p_i corresponds to the precision and r_i to the recall of the i-th of the q labels, and F1_macro is the harmonic mean of precision and recall averaged over all labels. The larger the index, the better the overall performance of the algorithm.
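A hedged sketch of the macro-averaged F1 described above, computing per-label precision p_i and recall r_i and averaging their harmonic means over labels:

```python
# Macro-averaged F1: per label, F1 = 2*p*r/(p+r); average over labels.
import numpy as np

def f1_macro(Y_true, Y_pred):
    Y_true, Y_pred = np.asarray(Y_true), np.asarray(Y_pred)
    scores = []
    for j in range(Y_true.shape[1]):
        tp = np.sum((Y_true[:, j] == 1) & (Y_pred[:, j] == 1))
        fp = np.sum((Y_true[:, j] == 0) & (Y_pred[:, j] == 1))
        fn = np.sum((Y_true[:, j] == 1) & (Y_pred[:, j] == 0))
        p = tp / (tp + fp) if tp + fp else 0.0   # precision of label j
        r = tp / (tp + fn) if tp + fn else 0.0   # recall of label j
        scores.append(2 * p * r / (p + r) if p + r else 0.0)
    return float(np.mean(scores))

Y_true = [[1, 0], [1, 1], [0, 1]]
Y_pred = [[1, 0], [0, 1], [0, 1]]
print(round(f1_macro(Y_true, Y_pred), 4))  # 0.8333
```

This matches what sklearn's f1_score with average='macro' computes for binary indicator matrices.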
The performance of the different algorithms with respect to F1_macro is compared in the following table:
Dataset CC LOCC GCC
yeast 0.5585 0.5505 0.5637
scene 0.8547 0.8544 0.8578
emotions 0.6563 0.6601 0.6598
enron 0.5834 0.5860 0.5845
Slashdot-F 0.6503 0.6538 0.6508
CAL500 0.5098 0.5104 0.5103
medical 0.6497 0.6477 0.6497
On the F1 index, the GCC algorithm performs well: it achieves the highest F1 on the Yeast and Scene data sets and ties the CC algorithm on the Medical data set, while the LOCC algorithm achieves the highest F1 on the Emotions, Enron, Slashdot-F and CAL500 data sets. The F1 of the traditional CC algorithm otherwise shows no advantage. Overall, the GCC method has better F1 performance, with a considerable improvement over the traditional CC algorithm.
In summary, among the three algorithms, the GCC algorithm performs best overall, the LOCC algorithm second, and the CC algorithm worst. It can therefore be inferred that the optimization method of the classifier chain label sequence provided by the embodiment of the invention markedly improves the multi-label classification effect and realizes the performance optimization of the original classifier chain model.
Example 2
An embodiment of the present invention provides an optimization system for a classifier chain tag sequence, as shown in fig. 2, including:
the sample acquisition module 1 is used for acquiring a label sample to be classified; this module performs the method described in step S1 in embodiment 1, and will not be described here again.
The classifier chain model identification module 2 is used for acquiring a sample label set of a sample to be classified; this module performs the method described in step S2 in embodiment 1, and will not be described here.
The co-occurrence analysis module 3 is used for acquiring a co-occurrence matrix corresponding to the sample label set; this module performs the method described in step S3 in embodiment 1, and will not be described here.
A co-occurrence vector acquisition module 4, configured to acquire co-occurrence vectors by using multiple co-occurrence branches of the co-occurrence matrix; this module performs the method described in step S4 in embodiment 1, and will not be described here.
The classifier chain label sequence generating module 5 is used for acquiring the initial branches of the classifier chains according to the co-occurrence vector and generating the required sequence of the classifier chains based on a greedy strategy; this module performs the method described in step S5 in embodiment 1, and will not be described here.
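As an illustration of the co-occurrence analysis performed by module 3, the following sketch (function and variable names are my own, not the patent's) builds a matrix whose entry (i, j) is the fraction of samples in which labels i and j either both occur or both fail to occur, i.e. the count-over-sample-number ratio described in the claims:

```python
# Co-occurrence matrix sketch: entry (i, j) is the rate at which labels
# i and j agree (both present or both absent) across the samples.
import numpy as np

def co_occurrence_matrix(Y):
    Y = np.asarray(Y)                         # (n_samples, n_labels) of 0/1
    agree = (Y[:, :, None] == Y[:, None, :])  # both present or both absent
    return agree.sum(axis=0) / Y.shape[0]     # normalise by sample count

Y = [[1, 1, 0],
     [1, 1, 1],
     [0, 0, 0]]
M = co_occurrence_matrix(Y)
print(M[0, 1])  # labels 0 and 1 agree in every sample -> 1.0
```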
The embodiment of the invention provides an optimization system for the classifier chain label sequence. It offers a label-sequence selection and generation strategy based on co-occurrence analysis to improve the multi-label classification performance of the classifier chain model: using a greedy strategy, it sequentially extracts information from the co-occurrence matrix so as to keep the co-occurrence rate maximal and generates the corresponding classifier chain sequence, thereby markedly improving the multi-label classification effect and realizing the performance optimization of the original classifier chain model.
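The greedy extraction summarised above can be sketched as follows (a minimal illustration under my own naming assumptions, not the patent's exact implementation): start from the label pair with the highest co-occurrence rate as the initial branch, then repeatedly append the unused label that co-occurs most with the current chain tail.

```python
# Greedy chain-order sketch: pick the best-co-occurring pair as the start,
# then extend from the tail with the label of maximum co-occurrence rate.
import numpy as np

def greedy_chain_order(M):
    M = np.array(M, dtype=float)          # work on a copy of the matrix
    n = M.shape[0]
    np.fill_diagonal(M, -np.inf)          # ignore self co-occurrence
    i, j = np.unravel_index(np.argmax(M), M.shape)  # starting branch
    order = [int(i), int(j)]
    remaining = set(range(n)) - set(order)
    while remaining:
        tail = order[-1]
        nxt = max(remaining, key=lambda k: M[tail, k])  # best branch at the tail
        order.append(nxt)
        remaining.remove(nxt)
    return order

M = [[1.0, 0.9, 0.2, 0.1],
     [0.9, 1.0, 0.6, 0.3],
     [0.2, 0.6, 1.0, 0.5],
     [0.1, 0.3, 0.5, 1.0]]
print(greedy_chain_order(M))  # [0, 1, 2, 3]
```

Each label is removed from the remaining set once appended, so the loop always terminates with a complete chain order.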
Example 3
An embodiment of the present invention provides a terminal, as shown in fig. 3, comprising: at least one processor 401, such as a CPU (Central Processing Unit), at least one communication interface 403, a memory 404, and at least one communication bus 402, wherein the communication bus 402 is used to enable connected communication between these components. The communication interface 403 may include a display screen (Display) and a keyboard (Keyboard), and optionally may further include a standard wired interface and a wireless interface. The memory 404 may be a high-speed RAM (Random Access Memory, a volatile random access memory) or a non-volatile memory (non-volatile memory), such as at least one disk memory; optionally, the memory 404 may also be at least one storage device located remotely from the aforementioned processor 401. A set of program codes is stored in the memory 404, and the processor 401 calls the program codes stored in the memory 404 to execute the optimization method of the classifier chain tag sequence in embodiment 1. The communication bus 402 may be a peripheral component interconnect (PCI) bus or an extended industry standard architecture (EISA) bus, among others, and may be divided into an address bus, a data bus, a control bus, and the like. For ease of illustration, only one line is shown in fig. 3, but this does not mean that there is only one bus or one type of bus.
Wherein the memory 404 may include volatile memory (English) such as random-access memory (RAM); the memory may also include a nonvolatile memory (english: non-volatile memory), such as a flash memory (english: flash memory), a hard disk (english: hard disk drive, abbreviated as HDD) or a solid-state drive (english: SSD); memory 404 may also include a combination of the above types of memory. The processor 401 may be a central processor (English: central processing unit, abbreviated: CPU), a network processor (English: network processor, abbreviated: NP) or a combination of CPU and NP.
The processor 401 may further comprise a hardware chip. The hardware chip may be an application-specific integrated circuit (English: application-specific integrated circuit, abbreviated: ASIC), a programmable logic device (English: programmable logic device, abbreviated: PLD) or a combination thereof. The PLD may be a complex programmable logic device (English: complex programmable logic device, abbreviated: CPLD), a field-programmable gate array (English: field-programmable gate array, abbreviated: FPGA), generic array logic (English: generic array logic, abbreviated: GAL) or any combination thereof.
Optionally, the memory 404 is also used for storing program instructions. The processor 401 may invoke program instructions to implement the optimization method of the classifier chain tag sequence as in embodiment 1 of the present application.
The embodiment of the invention also provides a computer-readable storage medium storing computer-executable instructions, and the computer-executable instructions can execute the optimization method of the classifier chain tag sequence in embodiment 1. The storage medium may be a magnetic disk, an optical disc, a read-only memory (Read-Only Memory, ROM), a random access memory (Random Access Memory, RAM), a flash memory (Flash Memory), a hard disk drive (Hard Disk Drive, HDD) or a solid-state drive (Solid-State Drive, SSD); the storage medium may also comprise a combination of the above kinds of memories.
It is apparent that the above embodiments are merely examples given for clear illustration and are not intended to limit the embodiments. Those of ordinary skill in the art may make other variations or modifications in different forms on the basis of the above description. It is neither necessary nor possible to exhaustively list all embodiments here, and obvious variations or modifications derived therefrom still fall within the protection scope of the present invention.

Claims (6)

1. A multi-label classification method based on classifier chain label sequence optimization, characterized by comprising the following steps:
obtaining an input sample to be classified, wherein the input sample comprises: pictures, texts;
identifying an input sample by using a classifier chain model, generating a sample label set, and forming a classifier chain, wherein the sample label is an attribute corresponding to a picture or an attribute corresponding to a text;
obtaining a co-occurrence matrix corresponding to the sample tag set by utilizing co-occurrence analysis, wherein elements of the co-occurrence matrix are the probability of simultaneous occurrence and simultaneous non-occurrence of sample tag set elements in the sample tag set;
forming a co-occurrence vector by using a plurality of co-occurrence branches of the co-occurrence matrix;
acquiring a starting branch of a classifier chain according to the co-occurrence vector, generating an order of classifier chain labels based on a greedy strategy, and classifying multiple labels of an input sample, wherein the method comprises the following steps:
adding the co-occurrence branch with the largest co-occurrence rate in the co-occurrence vector to the initial branch of the classifier chain;
searching, for the tag at the tail of the classifier chain, the corresponding maximum co-occurrence branch, and continuously growing the whole chain structure until the tag sequence of the whole classifier chain is obtained, which comprises: selecting a third sample tag element at the end of the classifier chain; if a co-occurrence branch taking the third sample tag element as an endpoint exists in the co-occurrence vector, adding the fourth sample tag element at the other end of the co-occurrence branch to the tail of the classifier chain and removing the fourth sample tag element from the tag set; otherwise, traversing the remaining tag set, acquiring the suboptimal fifth sample tag element, adding the fifth sample tag element to the classifier chain and removing the fifth sample tag element from the tag set; and so on, so as to continuously grow the whole chain structure.
2. The method of multi-label classification based on classifier chain label sequence optimization of claim 1 wherein the step of forming a co-occurrence vector using a plurality of co-occurrence branches of a co-occurrence matrix comprises:
obtaining the co-occurrence rate corresponding to each first sample tag element in the co-occurrence matrix, and obtaining the maximum co-occurrence rate;
acquiring a second sample tag element corresponding to the maximum co-occurrence rate of each first sample tag element;
forming a plurality of co-occurrence branches by each first sample tag element, the maximum co-occurrence rate corresponding to each first sample tag element and the second sample tag element corresponding to the maximum co-occurrence rate of each first sample tag element;
a co-occurrence vector is composed of a plurality of co-occurrence branches.
3. The multi-label classification method based on classifier chain label sequence optimization of claim 2 wherein co-occurrence is the ratio of each element in the co-occurrence matrix to the number of input samples to be classified.
4. A multi-tag classification system based on classifier chain tag sequence optimization, comprising:
the sample acquisition module is used for acquiring an input sample to be classified, wherein the input sample comprises: pictures, texts;
the classifier chain model identification module is used for acquiring a sample label set of an input sample to be classified, wherein the sample label is an attribute corresponding to a picture or an attribute corresponding to a text;
the co-occurrence analysis module is used for acquiring a co-occurrence matrix corresponding to the sample tag set, wherein elements of the co-occurrence matrix are the probability that sample tag set elements in the sample tag set occur simultaneously and do not occur simultaneously;
the co-occurrence vector acquisition module is used for acquiring co-occurrence vectors by utilizing a plurality of co-occurrence branches of the co-occurrence matrix;
the classifier chain label sequence generating module is used for acquiring the initial branches of the classifier chains according to the co-occurrence vector, generating the required sequence of the classifier chains based on a greedy strategy, and is used for multi-label classification of input samples, and comprises the following steps:
adding the co-occurrence branch with the largest co-occurrence rate in the co-occurrence vector to the initial branch of the classifier chain;
searching, for the tag at the tail of the classifier chain, the corresponding maximum co-occurrence branch, and continuously growing the whole chain structure until the tag sequence of the whole classifier chain is obtained, which comprises: selecting a third sample tag element at the end of the classifier chain; if a co-occurrence branch taking the third sample tag element as an endpoint exists in the co-occurrence vector, adding the fourth sample tag element at the other end of the co-occurrence branch to the tail of the classifier chain and removing the fourth sample tag element from the tag set; otherwise, traversing the remaining tag set, acquiring the suboptimal fifth sample tag element, adding the fifth sample tag element to the classifier chain and removing the fifth sample tag element from the tag set; and so on, so as to continuously grow the whole chain structure.
5. A terminal, comprising: at least one processor, and a memory communicatively coupled to the at least one processor, wherein the memory stores instructions executable by the at least one processor to cause the at least one processor to perform the multi-label classification method based on classifier chain label sequence optimization of any one of claims 1-3.
6. A computer readable storage medium having stored thereon computer instructions for causing the computer to perform the multi-label classification method based on classifier chain label sequence optimization of any one of claims 1-3.
CN202010397834.6A 2020-05-12 2020-05-12 Optimization method and system for classifier chain tag sequence Active CN111553442B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010397834.6A CN111553442B (en) 2020-05-12 2020-05-12 Optimization method and system for classifier chain tag sequence


Publications (2)

Publication Number Publication Date
CN111553442A CN111553442A (en) 2020-08-18
CN111553442B true CN111553442B (en) 2024-03-12

Family

ID=72000679

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010397834.6A Active CN111553442B (en) 2020-05-12 2020-05-12 Optimization method and system for classifier chain tag sequence

Country Status (1)

Country Link
CN (1) CN111553442B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112800222B (en) * 2021-01-26 2022-07-19 天津科技大学 Multi-task auxiliary limit multi-label short text classification method using co-occurrence information
CN113568738A (en) * 2021-07-02 2021-10-29 上海淇玥信息技术有限公司 Resource allocation method and device based on multi-label classification, electronic equipment and medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109036577A (en) * 2018-07-27 2018-12-18 合肥工业大学 Diabetic complication analysis method and device
CN109783636A (en) * 2018-12-12 2019-05-21 重庆邮电大学 A kind of car review subject distillation method based on classifier chains
CN110442707A (en) * 2019-06-21 2019-11-12 电子科技大学 A kind of multi-tag file classification method based on seq2seq
CN110751188A (en) * 2019-09-26 2020-02-04 华南师范大学 User label prediction method, system and storage medium based on multi-label learning

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9916540B2 (en) * 2015-01-22 2018-03-13 Microsoft Technology Licensing, Llc Scalable-effort classifiers for energy-efficient machine learning
US11086918B2 (en) * 2016-12-07 2021-08-10 Mitsubishi Electric Research Laboratories, Inc. Method and system for multi-label classification
US10862765B2 (en) * 2018-07-31 2020-12-08 EMC IP Holding Company LLC Allocation of shared computing resources using a classifier chain


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Multi-label text classification based on the PLSA topic model; Jiang Mingchu; Pan Zhisong; You Jun; Journal of Data Acquisition and Processing; 2016-05-15 (03); full text *
Multi-label optimal-order selection classification algorithm based on a double-layer structure; Liu Geqiao; Guo Tao; Computer Engineering and Design; 2016-04-16 (04); full text *

Also Published As

Publication number Publication date
CN111553442A (en) 2020-08-18


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20210112

Address after: 102209 18 Riverside Avenue, Changping District science and Technology City, Beijing

Applicant after: GLOBAL ENERGY INTERCONNECTION RESEARCH INSTITUTE Co.,Ltd.

Applicant after: STATE GRID CORPORATION OF CHINA

Applicant after: STATE GRID INFORMATION & TELECOMMUNICATION BRANCH

Applicant after: INFORMATION COMMUNICATION COMPANY OF STATE GRID SHANDONG ELECTRIC POWER Co.

Address before: 102209 18 Riverside Avenue, Changping District science and Technology City, Beijing

Applicant before: GLOBAL ENERGY INTERCONNECTION RESEARCH INSTITUTE Co.,Ltd.

Applicant before: STATE GRID CORPORATION OF CHINA

CB02 Change of applicant information

Address after: 102209 18 Riverside Avenue, Changping District science and Technology City, Beijing

Applicant after: State Grid Smart Grid Research Institute Co.,Ltd.

Applicant after: STATE GRID CORPORATION OF CHINA

Applicant after: STATE GRID INFORMATION & TELECOMMUNICATION BRANCH

Applicant after: INFORMATION COMMUNICATION COMPANY OF STATE GRID SHANDONG ELECTRIC POWER Co.

Address before: 102209 18 Riverside Avenue, Changping District science and Technology City, Beijing

Applicant before: GLOBAL ENERGY INTERCONNECTION RESEARCH INSTITUTE Co.,Ltd.

Applicant before: STATE GRID CORPORATION OF CHINA

Applicant before: STATE GRID INFORMATION & TELECOMMUNICATION BRANCH

Applicant before: INFORMATION COMMUNICATION COMPANY OF STATE GRID SHANDONG ELECTRIC POWER Co.

CB03 Change of inventor or designer information

Inventor after: Zheng Rongrong

Inventor after: Xue Wenting

Inventor after: Zhang Qiang

Inventor after: Song Bochuan

Inventor after: Jia Quanye

Inventor after: Chai Bo

Inventor after: Zhang Wenbin

Inventor before: Zhang Qiang

Inventor before: Song Bochuan

Inventor before: Jia Quanye

Inventor before: Chai Bo

GR01 Patent grant