CN111553442B - Optimization method and system for classifier chain tag sequence - Google Patents


Info

Publication number
CN111553442B
CN111553442B (application CN202010397834.6A)
Authority
CN
China
Prior art keywords: occurrence, sample, classifier, classifier chain, label
Prior art date
Legal status: Active
Application number
CN202010397834.6A
Other languages
Chinese (zh)
Other versions
CN111553442A (en
Inventor
郑蓉蓉
薛文婷
张强
宋博川
贾全烨
柴博
张闻彬
Current Assignee
State Grid Corp of China SGCC
State Grid Information and Telecommunication Co Ltd
Global Energy Interconnection Research Institute
Information and Telecommunication Branch of State Grid Shandong Electric Power Co Ltd
Original Assignee
State Grid Smart Grid Research Institute Co ltd
State Grid Corp of China SGCC
State Grid Information and Telecommunication Co Ltd
Information and Telecommunication Branch of State Grid Shandong Electric Power Co Ltd
Priority date
Filing date
Publication date
Application filed by State Grid Smart Grid Research Institute Co ltd, State Grid Corp of China SGCC, State Grid Information and Telecommunication Co Ltd, Information and Telecommunication Branch of State Grid Shandong Electric Power Co Ltd filed Critical State Grid Smart Grid Research Institute Co ltd
Priority to CN202010397834.6A priority Critical patent/CN111553442B/en
Publication of CN111553442A publication Critical patent/CN111553442A/en
Application granted granted Critical
Publication of CN111553442B publication Critical patent/CN111553442B/en

Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06F — ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/241 — Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/214 — Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • Y02T10/40 — Engine management systems


Abstract

The invention discloses a method and a system for optimizing a classifier chain tag sequence. The method comprises: acquiring an input sample to be classified; identifying the input sample with a classifier chain model, generating a sample label set and forming a classifier chain; obtaining the co-occurrence matrix corresponding to the sample tag set by co-occurrence analysis; composing a co-occurrence vector from a plurality of co-occurrence branches of the co-occurrence matrix; and acquiring the initial branch of the classifier chain from the co-occurrence vector and generating the sequence of the classifier chain labels based on a greedy strategy. The invention provides a label-sequence generation strategy for the classifier chain: the sequence is generated by an accelerated algorithm, so time consumption is low, the accuracy of the obtained classifier chain label sequence is high, and the performance of the original classifier chain model is optimized.

Description

Optimization method and system for classifier chain tag sequence
Technical Field
The invention relates to the technical field of machine learning, in particular to a method and a system for optimizing a classifier chain tag sequence.
Background
Classification is a very important method in machine learning: it enables a machine to categorize objects of interest and thereby recognize different objects. In practical problems, however, the category to which an object belongs carries a certain inherent uncertainty; for example, in some text classification tasks, news about a sports star may belong both to sports news and to celebrity entertainment news. Since real objects can carry several attributes at once, multi-label classification is the common method for accurately predicting all the attributes of an object. Compared with binary or multi-class classification, the main technical difficulties of multi-label classification are that the dimension of the labels to be processed is too high and that the potential links between labels are hard to explore.
To address this, Read et al. proposed the Classifier Chain (CC) algorithm based on binary relevance, which connects the classifiers in series so that the chain as a whole can exploit the potential associations between labels and output better results. Although the classifier chain algorithm improves on the original binary relevance algorithm, its chain-growing structure increases the dimension of the labels to be processed and therefore the time consumption of the whole algorithm; moreover, the order of the classifier chain is generated randomly, which carries a risk of error propagation.
Disclosure of Invention
Therefore, the optimization method and system for the classifier chain label sequence provided herein overcome the defects of the prior art in multi-label classification: excessively high label dimension, large computation cost, randomly generated classifier chains and a high risk of error propagation.
In order to achieve the above purpose, the present invention provides the following technical solutions:
in a first aspect, an embodiment of the present invention provides a method for optimizing a classifier chain tag sequence, including:
acquiring an input sample to be classified;
identifying input samples by using a classifier chain model, generating a sample label set, and forming a classifier chain;
obtaining a co-occurrence matrix corresponding to the sample tag set by utilizing co-occurrence analysis;
forming a co-occurrence vector by using a plurality of co-occurrence branches of the co-occurrence matrix;
and acquiring the initial branches of the classifier chains according to the co-occurrence vectors, and generating the sequence of the classifier chain labels based on a greedy strategy.
In one embodiment, the elements of the co-occurrence matrix measure, for pairs of elements of the sample tag set, how often the two elements occur together and how often neither occurs.
In an embodiment, the step of forming a co-occurrence vector using a plurality of co-occurrence branches of a co-occurrence matrix includes:
obtaining the co-occurrence rate corresponding to each first sample tag element in the co-occurrence matrix, and obtaining the maximum co-occurrence rate;
acquiring a second sample tag element corresponding to the maximum co-occurrence rate of each first sample tag element;
forming a plurality of co-occurrence branches by each first sample tag element, the maximum co-occurrence rate corresponding to each first sample tag element and the second sample tag element corresponding to the maximum co-occurrence rate of each first sample tag element;
a co-occurrence vector is composed of a plurality of co-occurrence branches.
In one embodiment, the co-occurrence rate is the ratio of each element of the co-occurrence matrix to the number of input samples to be classified.
In one embodiment, the step of obtaining the initial branch of the classifier chain according to the co-occurrence vector and generating the sequence of the classifier chain labels based on the greedy strategy includes:
taking the co-occurrence branch with the largest co-occurrence rate in the co-occurrence vector as the initial branch of the classifier chain;
and searching for the maximum co-occurrence branch of the label at the tail of the classifier chain, growing the whole chain structure step by step until the whole classifier chain label sequence is obtained.
In one embodiment, the step of growing the whole chain structure by searching for the maximum co-occurrence branch of the label at the end of the classifier chain includes:
selecting the third sample tag element at the end of the classifier chain; if the co-occurrence vector contains a co-occurrence branch with the third sample tag element as an endpoint, adding the fourth sample tag element at the other end of that branch to the tail of the classifier chain and removing the fourth sample tag element from the tag set; otherwise, traversing the remaining tag set, acquiring the suboptimal fifth sample tag element, adding the fifth sample tag element to the classifier chain and removing it from the tag set; and so on, so that the whole chain structure is grown continuously.
In a second aspect, an embodiment of the present invention provides an optimization system for a classifier chain tag sequence, including:
the sample acquisition module is used for acquiring the input sample to be classified;
the classifier chain model identification module is used for acquiring a sample label set of a sample to be classified;
the co-occurrence analysis module is used for acquiring a co-occurrence matrix corresponding to the sample label set;
the co-occurrence vector acquisition module is used for acquiring co-occurrence vectors by utilizing a plurality of co-occurrence branches of the co-occurrence matrix;
and the classifier chain label sequence generation module is used for acquiring the initial branches of the classifier chains according to the co-occurrence vector and generating the required sequence of the classifier chains based on a greedy strategy.
In a third aspect, an embodiment of the present invention provides a terminal, including: the system comprises at least one processor and a memory communicatively connected with the at least one processor, wherein the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor to cause the at least one processor to perform the method for optimizing the classifier chain tag sequence according to the first aspect of the embodiment of the present invention.
In a fourth aspect, an embodiment of the present invention provides a computer-readable storage medium storing computer instructions configured to cause a computer to perform the method for optimizing a classifier chain tag sequence according to the first aspect of the embodiment of the present invention.
The technical scheme of the invention has the following advantages:
the invention provides a method and a system for optimizing a classifier chain label sequence, which are characterized in that an input sample to be classified is obtained; identifying input samples by using a classifier chain model, generating a sample label set, and forming a classifier chain; obtaining a co-occurrence matrix corresponding to the sample tag set by utilizing co-occurrence analysis; forming a co-occurrence vector by using a plurality of co-occurrence branches of the co-occurrence matrix; the method comprises the steps of obtaining the initial branches of the classifier chains according to the co-occurrence vector, generating the sequence of the classifier chain labels based on a greedy strategy, providing a new corresponding label sequence generation strategy, generating the label sequence through an acceleration algorithm, and achieving high accuracy of the label sequence of the obtained classifier chains and performance optimization of the original classifier chain model.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings that are needed in the description of the embodiments or the prior art will be briefly described, and it is obvious that the drawings in the description below are some embodiments of the present invention, and other drawings can be obtained according to the drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flowchart of a specific example of a method for optimizing a classifier chain tag sequence according to an embodiment of the present invention;
FIG. 2 is a block diagram of an optimization system for a classifier chain tag sequence according to an embodiment of the present invention;
fig. 3 is a composition diagram of a specific example of a terminal according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be described clearly and completely below with reference to the accompanying drawings, in which some, but not all, embodiments of the invention are shown. All other embodiments obtained by those skilled in the art based on the embodiments of the invention without inventive effort fall within the scope of the invention.
In the description of the present invention, it should be noted that the directions or positional relationships indicated by the terms "center", "upper", "lower", "left", "right", "vertical", "horizontal", "inner", "outer", etc. are based on the directions or positional relationships shown in the drawings, are merely for convenience of describing the present invention and simplifying the description, and do not indicate or imply that the devices or elements referred to must have a specific orientation, be configured and operated in a specific orientation, and thus should not be construed as limiting the present invention. Furthermore, the terms "first," "second," and "third" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance.
In the description of the present invention, it should be noted that, unless explicitly specified and limited otherwise, the terms "mounted," "connected," and "connected" are to be construed broadly, and may be either fixedly connected, detachably connected, or integrally connected, for example; can be mechanically or electrically connected; the two components can be directly connected or indirectly connected through an intermediate medium, or can be communicated inside the two components, or can be connected wirelessly or in a wired way. The specific meaning of the above terms in the present invention will be understood in specific cases by those of ordinary skill in the art.
In addition, the technical features of the different embodiments of the present invention described below may be combined with each other as long as they do not collide with each other.
Example 1
The method for optimizing the classifier chain tag sequence provided by the embodiment of the invention, as shown in fig. 1, comprises the following steps:
step S1: and obtaining an input sample to be classified.
In an embodiment of the invention, text to be classified is input into a classifier chain model.
Step S2: and identifying input samples by using a classifier chain model, generating a sample label set, and forming a classifier chain.
In practice, the multi-label classification problem is solved with a binary relevance transformation strategy: the original multi-label classification problem is converted into a number of corresponding binary classification problems, and the binary results are combined into the final multi-label classification set. For example, let X be the sample input space (in this embodiment, the input samples to be classified) and Y the sample output space (in this embodiment, the space from which the sample tag set is generated; the sample tag set is an important component of the classifier chain). For these two sample spaces there is a corresponding data set D that satisfies:

D = {(x_i, Y_i) | i = 1, 2, ..., n}, x_i ∈ X, Y_i ⊆ Y

where D comprises n training samples x_i (in this embodiment, n is the number of input samples to be classified). If each single attribute of a sample is denoted a, each training sample x_i can be represented by a k-dimensional vector as follows:
x_i = [a_i1, a_i2, ..., a_ik]
For d labels, d classifiers need to be trained. Assuming the result output by each classifier f_j is f_j(x_i), the objective of binary relevance is to use the d outputs f_1(x_i), ..., f_d(x_i) to approximate the corresponding true result Y_i. The core of the binary relevance algorithm is simple, a single traversal suffices and the cost is low, but the lack of any use of inter-label association makes binary relevance perform poorly in actual multi-label classification.
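As an illustration of the binary relevance strategy just described, a minimal Python sketch follows. This is a toy, not the patent's implementation: the nearest-centroid rule and the function names (`train_binary_relevance`, `predict_binary_relevance`) are hypothetical stand-ins for any trained base classifier f.

```python
def train_binary_relevance(samples, label_sets, all_labels):
    """Binary relevance: train one independent binary classifier per label.
    The toy classifier is a nearest-centroid rule; any base learner
    (e.g. an SVM) could be substituted."""
    def centroid(points):
        return [sum(c) / len(points) for c in zip(*points)] if points else None

    models = {}
    for label in all_labels:
        pos = [x for x, ys in zip(samples, label_sets) if label in ys]
        neg = [x for x, ys in zip(samples, label_sets) if label not in ys]
        models[label] = (centroid(pos), centroid(neg))
    return models


def predict_binary_relevance(models, x):
    """The union of the d independent binary decisions is the predicted label set."""
    def dist2(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))

    predicted = set()
    for label, (cp, cn) in models.items():
        if cp is not None and (cn is None or dist2(x, cp) <= dist2(x, cn)):
            predicted.add(label)
    return predicted
```

Because every per-label model is fit in isolation, no label correlation is exploited — exactly the weakness the classifier chain addresses.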
The classifier chain is an optimization algorithm based on binary relevance. Its core is consistent with binary relevance: the multi-label classification problem is converted into several binary classification problems. However, the classifiers are connected in series, so the trained models are no longer isolated from one another and the potential correlations between labels are exploited, optimizing the result of the whole classification task. The classifier chain realizes this serial connection by appending the outputs of earlier classifiers to its input samples. Assume a given input x_i satisfies:

x_i = [a_i1, a_i2, ..., a_ik]

The results of the first q classifiers that have already predicted are:

[f_1(x_i), f_2(x_i), ..., f_q(x_i)]

Each newly obtained classification result updates the corresponding sample, so after each step the updated sample is:

[a_i1, ..., a_ik, f_1(x_i), ..., f_q(x_i)], q = 1, 2, ..., d
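The chain update x_i → [a_i1, ..., a_ik, f_1(x_i), ..., f_q(x_i)] can be sketched as follows. The function name and the toy lambda classifiers used in the test are hypothetical stand-ins for trained models.

```python
def classifier_chain_predict(x, chain):
    """Run a classifier chain: each link sees the original attribute vector
    plus all earlier links' outputs appended, so potential label
    correlations can be exploited.  `chain` is an ordered list of
    (label, f) pairs where f maps the augmented vector to 0 or 1."""
    features = list(x)            # start from [a_i1, ..., a_ik]
    predicted = set()
    for label, f in chain:
        y = f(features)           # predict using the augmented features
        features.append(y)        # append the result for the next link
        if y == 1:
            predicted.add(label)
    return predicted
```

Note that the prediction of every link depends on the links placed before it, which is why the label order matters.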
the method is characterized in that the method comprises the steps of determining a sequence of the classifier, wherein the sequence of the classifier is determined by a sequence determination algorithm, and the sequence determination algorithm is used for determining the sequence of the classifier according to the sequence determination algorithm.
Step S3: and obtaining a co-occurrence matrix corresponding to the sample label set by utilizing co-occurrence analysis.
The classifier chain, as a binary relevance based optimization algorithm, generates a sample label set that is an important component of the chain. The embodiment of the invention quantitatively measures the potential relation between two sample tag set elements by counting their co-occurrences through co-occurrence analysis, thereby effectively measuring the deep relations among the sample tag set elements.
In the embodiment of the invention, the elements of the co-occurrence matrix measure how often pairs of sample tag set elements occur together and how often neither occurs. First, a co-occurrence matrix is generated for the research objects at hand, with the research objects serving as both the rows and the columns of the matrix. For text, for example, a passage can be split into words and the words used as the rows and columns of the corresponding co-occurrence matrix; this is only an example, and in practical applications the matrix is generated according to actual requirements. Likewise, in the multi-label classification task, the corresponding co-occurrence matrix M is constructed by taking all labels as its rows and columns. Because pairs of labels may rarely occur together, the matrix would otherwise be quite sparse, so the invention also brings labels that do not occur at the same time into the statistics: both labels being absent simultaneously also reflects their relevance to a certain extent. Let S_i be the set of samples carrying label i and S_j the set of samples carrying label j; the formula for the co-occurrence matrix element is:

M_ij = |S_i ∩ S_j| + |S̄_i ∩ S̄_j|

where S̄_i denotes the set of samples that do not carry label i.
in the embodiment of the invention, the co-occurrence rate is the ratio of each element in the co-occurrence matrix to the number of the input samples to be classified, and in order to convert the content of the co-occurrence matrix into the corresponding percentage for convenient comparison, the invention defines the concept of the co-occurrence rate, and provides the number of the input samples to be classified corresponding to n training samples, and the formula for calculating the co-occurrence rate is as follows:
in the embodiment of the invention, the corresponding co-occurrence matrix is calculated according to the co-occurrence matrix and the co-occurrence rate, and meanwhile, it should be noted that the co-occurrence matrix is necessarily a symmetrical matrix, so that only half of the co-occurrence relations corresponding to the elements need to be calculated, and a given tag set l= { L is assumed 1 ,l 2 ,l 3 ,l 4 ,l 5 Simulation generated co-occurrence matrix as follows: :
R      l_1     l_2     l_3     l_4     l_5
l_1     -     0.672   0.649   0.644   0.632
l_2     -       -     0.583   0.676   0.630
l_3     -       -       -     0.674   0.619
l_4     -       -       -       -     0.662
l_5     -       -       -       -       -
In this way, a co-occurrence matrix is established for the label set of the samples and the corresponding co-occurrence rates are calculated, expressing the correlation between labels in proportional form.
Step S4: the co-occurrence vector is composed of a plurality of co-occurrence branches of the co-occurrence matrix.
In an embodiment of the present invention, the step of forming a co-occurrence vector by using a plurality of co-occurrence branches of a co-occurrence matrix includes: obtaining the co-occurrence rate corresponding to each first sample tag element in the co-occurrence matrix, and obtaining the maximum co-occurrence rate; acquiring a second sample tag element corresponding to the maximum co-occurrence rate of each first sample tag element; forming a plurality of co-occurrence branches by each first sample tag element, the maximum co-occurrence rate corresponding to each first sample tag element and the second sample tag element corresponding to the maximum co-occurrence rate of each first sample tag element; a co-occurrence vector is composed of a plurality of co-occurrence branches.
In the embodiment of the invention, the co-occurrence matrix occupies d^2 storage, and traversing it frequently would increase the time consumption of the whole algorithm. The invention therefore performs a simple extraction on the matrix: the co-occurrence rates of each first sample tag element are read from the co-occurrence matrix, the maximum co-occurrence rate of each element is obtained, and the maxima of all labels are combined into the co-occurrence vector V. For the first sample tag element l_i, the maximum co-occurrence rate R_i satisfies:

R_i = max{R_ij}, j = 1, 2, ..., d

Once R_i is found, let l_ji be the second sample tag element whose pairing with l_i yields the maximum co-occurrence rate R_i. Then the first sample tag element l_i, the second sample tag element l_ji and the maximum co-occurrence rate R_i constitute one co-occurrence branch of the co-occurrence vector V, the co-occurrence branch b_i satisfying:

b_i = [l_i, l_ji, R_i]

By extracting all co-occurrence branches b_i, the final co-occurrence vector V can be written as:

V = [[l_1, l_j1, R_1], [l_2, l_j2, R_2], ..., [l_d, l_jd, R_d]]
Co-occurrence vector extraction is accomplished in this way: traversing the co-occurrence vector instead of the matrix itself reduces the traversal cost and helps accelerate the sequence-generation algorithm.
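The extraction of V just described can be sketched as follows (hypothetical function name); R is the rate matrix of step S3, indexed here as nested dictionaries.

```python
def co_occurrence_vector(R, all_labels):
    """Collapse the d x d rate matrix R into d co-occurrence branches
    b_i = [l_i, l_ji, R_i], keeping only each label's strongest partner,
    so later traversals cost O(d) instead of O(d^2)."""
    V = []
    for li in all_labels:
        partner = max((lj for lj in all_labels if lj != li),
                      key=lambda lj: R[li][lj])   # argmax over row i
        V.append([li, partner, R[li][partner]])
    return V
```

Each branch records a label, its strongest partner, and the corresponding maximum co-occurrence rate.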
Step S5: and acquiring the initial branches of the classifier chains according to the co-occurrence vectors, and generating the sequence of the classifier chain labels based on a greedy strategy.
In the embodiment of the invention, the step of acquiring the initial branch of the classifier chain from the co-occurrence vector and generating the sequence of the classifier chain labels based on the greedy strategy includes: taking the co-occurrence branch with the largest co-occurrence rate in the co-occurrence vector as the initial branch of the classifier chain, and from there generating the required sequence by growing the chain. Specifically, following the greedy strategy, the chain is only ever extended at its tail, the label appended at the tail being the one with the maximum co-occurrence rate: the maximum co-occurrence branch of the tail label is searched for and the whole chain structure is grown continuously until the complete classifier chain label sequence is obtained.
In the embodiment of the invention, the step of growing the whole chain structure by searching for the maximum co-occurrence branch of the tail label proceeds as follows. Select the third sample tag element l_i at the end of the classifier chain. If the co-occurrence vector contains a co-occurrence branch b_i with l_i as an endpoint, add the fourth sample tag element l_ji at the other end of b_i to the tail of the chain and remove l_ji from the tag set. Otherwise, traverse the remaining tag set, acquire the suboptimal fifth sample tag element l_inext, add l_inext to the classifier chain and remove it from the tag set. Proceeding in the same manner, the whole chain structure is grown continuously.
In an embodiment of the invention, the greedy-based classifier chain growth therefore starts from the branch with the largest co-occurrence rate and repeatedly extends the tail with its best remaining partner until every label has been placed.
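The growth procedure just described can be sketched as below. This is one reading of step S5 with hypothetical names; in particular the "suboptimal" fallback is interpreted here as the best-rated unused branch, a detail the text leaves open.

```python
def greedy_chain_order(V, all_labels):
    """Greedy generation of the classifier-chain label sequence:
    1. seed the chain with the branch of globally largest co-occurrence rate;
    2. repeatedly look up the tail label's maximum co-occurrence branch and
       append its partner if still unused;
    3. otherwise fall back to the best-rated unused label ("suboptimal");
    4. stop when every label has been placed."""
    branch_of = {b[0]: b for b in V}          # branch indexed by its first label
    start = max(V, key=lambda b: b[2])        # initial branch of the chain
    order = [start[0], start[1]]
    remaining = [l for l in all_labels if l not in order]
    while remaining:
        partner = branch_of[order[-1]][1]     # tail's strongest partner
        nxt = partner if partner in remaining else max(
            remaining, key=lambda l: branch_of[l][2])
        order.append(nxt)
        remaining.remove(nxt)
    return order
```

Because only the d-entry vector V is consulted, each extension step is O(d) rather than a scan of the full d x d matrix.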
the embodiment of the invention provides an optimization method of a classifier chain label sequence, which comprises the steps of obtaining an input sample to be classified; identifying input samples by using a classifier chain model, generating a sample label set, and forming a classifier chain; obtaining a co-occurrence matrix corresponding to the sample tag set by utilizing co-occurrence analysis; forming a co-occurrence vector by using a plurality of co-occurrence branches of the co-occurrence matrix; the method comprises the steps of obtaining initial branches of a classifier chain according to co-occurrence vectors, generating sequence of the classifier chain labels based on greedy strategies, providing a new corresponding label sequence generation strategy, accelerating algorithm generation sequences, consuming less time, obtaining high label sequence accuracy of the classifier chain, obviously improving multi-label classification effect, and realizing performance optimization of an original classifier chain model.
In this embodiment, the optimization method for the classifier chain tag sequence is analyzed by experimental comparison as follows.
First, seven data sets are selected: Yeast, Enron, Scene, emotions, Slashdot-F, CAL500 and medical, covering fields such as text, pictures and biology. The greedy-based classifier chain (Greedy Classifier Chain, GCC) algorithm proposed by the invention is validated on these seven data sets and compared with the initial classifier chain (CC) algorithm and the improved local classifier chain (LOCC) algorithm. The parameters of the data sets used are as follows:
Name        Instances  Features  Labels  Cardinality
emotions       593        72       6      1.879
Enron         1702      1001      53      3.378
Scene         2407       294       6      1.074
Yeast         2417       103      14      4.237
Slashdot-F    1460      1079      22      1.18
CAL500         502        68     174     26.044
medical        978      1449      45      1.245
All experiments were performed in Python, with the corresponding development supported by sklearn and some library functions. For the base classifier, an SVM is adopted with a Gaussian kernel function and penalty parameter C = 100; the base classifier of all algorithms uses the same parameters, so that differences in base-classifier performance do not influence the effect of the sequence extraction itself.
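The setup described corresponds to a scikit-learn configuration along these lines; this is a sketch assuming the standard sklearn API, and the variable names are hypothetical.

```python
# Sketch of the experimental base-classifier setup described in the text:
# an SVM with a Gaussian (RBF) kernel and penalty parameter C = 100, shared
# by all compared algorithms, evaluated with five-fold cross-validation.
from sklearn.svm import SVC
from sklearn.model_selection import KFold

base_classifier = SVC(kernel="rbf", C=100)   # Gaussian kernel, C = 100
cv = KFold(n_splits=5, shuffle=True, random_state=0)
```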
As evaluation indexes, the invention selects Accuracy and F1_macro.
(1) Accuracy. It differs from the Accuracy formula used by general classification tasks; the calculation formula is as follows:

Accuracy = (1/n) Σ_{i=1..n} |S_i ∩ Y_i| / |S_i ∪ Y_i|

Because the Accuracy formula used by general classification tasks is too strict, the invention adopts this multi-label variant to better reflect the performance of multi-label classification algorithms. Here S_i represents the true tag set of sample x_i and Y_i its predicted tag set; |S_i ∩ Y_i| denotes the number of correctly predicted labels and |S_i ∪ Y_i| the total number of labels involved. The larger the Accuracy index, the better. Experimental results are verified by five-fold cross-validation; the performance of the different algorithms with respect to Accuracy is compared in the following table:
Dataset       CC      LOCC    GCC
yeast       0.4585   0.4649  0.4802
scene       0.5943   0.5938  0.6114
emotions    0.3851   0.3665  0.3817
enron       0.4034   0.3997  0.4026
Slashdot-F  0.3945   0.4147  0.4050
CAL500      0.2210   0.2233  0.2347
medical     0.6964   0.7068  0.7032
The best result for each index is the maximum of the corresponding values among the CC, LOCC and GCC algorithms. On the Accuracy index, the GCC algorithm achieves the highest accuracy on the Yeast, Scene and CAL500 data sets, the CC algorithm is highest on the Emotions and Enron data sets, and the LOCC algorithm is highest on the Slashdot-F and Medical data sets. Overall, the GCC method offers better multi-label classification accuracy and greatly improves on the performance of the traditional CC algorithm.
(2) F1_macro. Its calculation formula is as follows:

F1_macro = (1/q) * Σ_{i=1..q} 2 * p_i * r_i / (p_i + r_i)

Since Accuracy mainly evaluates the correctly predicted labels, the invention additionally uses the macro-averaged F1_macro index so that both correct and incorrect predictions are taken into account. In the formula, p_i corresponds to the precision and r_i to the recall of the i-th of the q labels, and F1_macro is the harmonic mean of precision and recall averaged over all labels. The larger the index, the better the overall performance of the algorithm.
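A hedged sketch of the macro-averaged F1 described above, computing per-label precision p_i and recall r_i and averaging their harmonic means over labels:

```python
# Macro-averaged F1: per label, F1 = 2*p*r/(p+r); average over labels.
import numpy as np

def f1_macro(Y_true, Y_pred):
    Y_true, Y_pred = np.asarray(Y_true), np.asarray(Y_pred)
    scores = []
    for j in range(Y_true.shape[1]):
        tp = np.sum((Y_true[:, j] == 1) & (Y_pred[:, j] == 1))
        fp = np.sum((Y_true[:, j] == 0) & (Y_pred[:, j] == 1))
        fn = np.sum((Y_true[:, j] == 1) & (Y_pred[:, j] == 0))
        p = tp / (tp + fp) if tp + fp else 0.0   # precision of label j
        r = tp / (tp + fn) if tp + fn else 0.0   # recall of label j
        scores.append(2 * p * r / (p + r) if p + r else 0.0)
    return float(np.mean(scores))

Y_true = [[1, 0], [1, 1], [0, 1]]
Y_pred = [[1, 0], [0, 1], [0, 1]]
print(round(f1_macro(Y_true, Y_pred), 4))  # 0.8333
```

This matches what sklearn's f1_score with average='macro' computes for binary indicator matrices.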
The performance of the different algorithms with respect to F1_macro is compared in the following table:
Dataset CC LOCC GCC
yeast 0.5585 0.5505 0.5637
scene 0.8547 0.8544 0.8578
emotions 0.6563 0.6601 0.6598
enron 0.5834 0.5860 0.5845
Slashdot-F 0.6503 0.6538 0.6508
CAL500 0.5098 0.5104 0.5103
medical 0.6497 0.6477 0.6497
On the F1 index, the GCC algorithm performs well: it achieves the highest F1 on the Yeast and Scene data sets and ties the CC algorithm on the Medical data set, while the LOCC algorithm achieves the highest F1 on the Emotions, Enron, Slashdot-F and CAL500 data sets. The F1 of the traditional CC algorithm otherwise shows no advantage. Overall, the GCC method has better F1 performance, with a considerable improvement over the traditional CC algorithm.
In summary, among the three algorithms, the GCC algorithm performs best overall, the LOCC algorithm second, and the CC algorithm worst. It can therefore be inferred that the optimization method of the classifier chain label sequence provided by the embodiment of the invention markedly improves the multi-label classification effect and realizes the performance optimization of the original classifier chain model.
Example 2
An embodiment of the present invention provides an optimization system for a classifier chain tag sequence, as shown in fig. 2, including:
the sample acquisition module 1 is used for acquiring a label sample to be classified; this module performs the method described in step S1 in embodiment 1, and will not be described here again.
The classifier chain model identification module 2 is used for acquiring a sample label set of a sample to be classified; this module performs the method described in step S2 in embodiment 1, and will not be described here.
The co-occurrence analysis module 3 is used for acquiring a co-occurrence matrix corresponding to the sample label set; this module performs the method described in step S3 in embodiment 1, and will not be described here.
A co-occurrence vector acquisition module 4, configured to acquire co-occurrence vectors by using multiple co-occurrence branches of the co-occurrence matrix; this module performs the method described in step S4 in embodiment 1, and will not be described here.
The classifier chain label sequence generating module 5 is used for acquiring the initial branches of the classifier chains according to the co-occurrence vector and generating the required sequence of the classifier chains based on a greedy strategy; this module performs the method described in step S5 in embodiment 1, and will not be described here.
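As an illustration of the co-occurrence analysis performed by module 3, the following sketch (function and variable names are my own, not the patent's) builds a matrix whose entry (i, j) is the fraction of samples in which labels i and j either both occur or both fail to occur, i.e. the count-over-sample-number ratio described in the claims:

```python
# Co-occurrence matrix sketch: entry (i, j) is the rate at which labels
# i and j agree (both present or both absent) across the samples.
import numpy as np

def co_occurrence_matrix(Y):
    Y = np.asarray(Y)                         # (n_samples, n_labels) of 0/1
    agree = (Y[:, :, None] == Y[:, None, :])  # both present or both absent
    return agree.sum(axis=0) / Y.shape[0]     # normalise by sample count

Y = [[1, 1, 0],
     [1, 1, 1],
     [0, 0, 0]]
M = co_occurrence_matrix(Y)
print(M[0, 1])  # labels 0 and 1 agree in every sample -> 1.0
```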
The embodiment of the invention provides an optimization system for the classifier chain label sequence. It offers a label-sequence selection and generation strategy based on co-occurrence analysis to improve the multi-label classification performance of the classifier chain model: using a greedy strategy, it sequentially extracts information from the co-occurrence matrix so as to keep the co-occurrence rate maximal and generates the corresponding classifier chain sequence, thereby markedly improving the multi-label classification effect and realizing the performance optimization of the original classifier chain model.
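The greedy extraction summarised above can be sketched as follows (a minimal illustration under my own naming assumptions, not the patent's exact implementation): start from the label pair with the highest co-occurrence rate as the initial branch, then repeatedly append the unused label that co-occurs most with the current chain tail.

```python
# Greedy chain-order sketch: pick the best-co-occurring pair as the start,
# then extend from the tail with the label of maximum co-occurrence rate.
import numpy as np

def greedy_chain_order(M):
    M = np.array(M, dtype=float)          # work on a copy of the matrix
    n = M.shape[0]
    np.fill_diagonal(M, -np.inf)          # ignore self co-occurrence
    i, j = np.unravel_index(np.argmax(M), M.shape)  # starting branch
    order = [int(i), int(j)]
    remaining = set(range(n)) - set(order)
    while remaining:
        tail = order[-1]
        nxt = max(remaining, key=lambda k: M[tail, k])  # best branch at the tail
        order.append(nxt)
        remaining.remove(nxt)
    return order

M = [[1.0, 0.9, 0.2, 0.1],
     [0.9, 1.0, 0.6, 0.3],
     [0.2, 0.6, 1.0, 0.5],
     [0.1, 0.3, 0.5, 1.0]]
print(greedy_chain_order(M))  # [0, 1, 2, 3]
```

Each label is removed from the remaining set once appended, so the loop always terminates with a complete chain order.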
Example 3
An embodiment of the present invention provides a terminal, as shown in fig. 3, comprising: at least one processor 401, such as a CPU (Central Processing Unit), at least one communication interface 403, a memory 404, and at least one communication bus 402, wherein the communication bus 402 is used to enable connected communication between these components. The communication interface 403 may include a display screen (Display) and a keyboard (Keyboard), and optionally may further include a standard wired interface and a wireless interface. The memory 404 may be a high-speed RAM (Random Access Memory, a volatile random access memory) or a non-volatile memory (non-volatile memory), such as at least one disk memory; optionally, the memory 404 may also be at least one storage device located remotely from the aforementioned processor 401. A set of program codes is stored in the memory 404, and the processor 401 calls the program codes stored in the memory 404 to execute the optimization method of the classifier chain tag sequence in embodiment 1. The communication bus 402 may be a peripheral component interconnect (PCI) bus or an extended industry standard architecture (EISA) bus, among others, and may be divided into an address bus, a data bus, a control bus, and the like. For ease of illustration, only one line is shown in fig. 3, but this does not mean that there is only one bus or one type of bus.
Wherein the memory 404 may include volatile memory (English) such as random-access memory (RAM); the memory may also include a nonvolatile memory (english: non-volatile memory), such as a flash memory (english: flash memory), a hard disk (english: hard disk drive, abbreviated as HDD) or a solid-state drive (english: SSD); memory 404 may also include a combination of the above types of memory. The processor 401 may be a central processor (English: central processing unit, abbreviated: CPU), a network processor (English: network processor, abbreviated: NP) or a combination of CPU and NP.
The processor 401 may further comprise a hardware chip. The hardware chip may be an application-specific integrated circuit (English: application-specific integrated circuit, abbreviated: ASIC), a programmable logic device (English: programmable logic device, abbreviated: PLD) or a combination thereof. The PLD may be a complex programmable logic device (English: complex programmable logic device, abbreviated: CPLD), a field-programmable gate array (English: field-programmable gate array, abbreviated: FPGA), generic array logic (English: generic array logic, abbreviated: GAL) or any combination thereof.
Optionally, the memory 404 is also used for storing program instructions. The processor 401 may invoke program instructions to implement the optimization method of the classifier chain tag sequence as in embodiment 1 of the present application.
The embodiment of the invention also provides a computer-readable storage medium storing computer-executable instructions, and the computer-executable instructions can execute the optimization method of the classifier chain tag sequence in embodiment 1. The storage medium may be a magnetic disk, an optical disc, a read-only memory (Read-Only Memory, ROM), a random access memory (Random Access Memory, RAM), a flash memory (Flash Memory), a hard disk drive (Hard Disk Drive, HDD) or a solid-state drive (Solid-State Drive, SSD); the storage medium may also comprise a combination of the above kinds of memories.
It is apparent that the above embodiments are merely examples given for clear illustration and are not intended to limit the embodiments. Those of ordinary skill in the art may make other variations or modifications in different forms on the basis of the above description. It is neither necessary nor possible to exhaustively list all embodiments here, and obvious variations or modifications derived therefrom still fall within the protection scope of the present invention.

Claims (6)

1. A multi-label classification method based on classifier chain label sequence optimization, characterized by comprising the following steps:
obtaining an input sample to be classified, wherein the input sample comprises: pictures, texts;
identifying an input sample by using a classifier chain model, generating a sample label set, and forming a classifier chain, wherein the sample label is an attribute corresponding to a picture or an attribute corresponding to a text;
obtaining a co-occurrence matrix corresponding to the sample tag set by utilizing co-occurrence analysis, wherein elements of the co-occurrence matrix are the probability of simultaneous occurrence and simultaneous non-occurrence of sample tag set elements in the sample tag set;
forming a co-occurrence vector by using a plurality of co-occurrence branches of the co-occurrence matrix;
acquiring a starting branch of a classifier chain according to the co-occurrence vector, generating an order of classifier chain labels based on a greedy strategy, and classifying multiple labels of an input sample, wherein the method comprises the following steps:
adding the co-occurrence branch with the largest co-occurrence rate in the co-occurrence vector to the initial branch of the classifier chain;
searching, for the tag at the tail of the classifier chain, the corresponding maximum co-occurrence branch, and continuously growing the whole chain structure until the tag sequence of the whole classifier chain is obtained, which comprises: selecting a third sample tag element at the end of the classifier chain; if a co-occurrence branch taking the third sample tag element as an endpoint exists in the co-occurrence vector, adding the fourth sample tag element at the other end of the co-occurrence branch to the tail of the classifier chain and removing the fourth sample tag element from the tag set; otherwise, traversing the remaining tag set, acquiring the suboptimal fifth sample tag element, adding the fifth sample tag element to the classifier chain and removing the fifth sample tag element from the tag set; and so on, so as to continuously grow the whole chain structure.
2. The method of multi-label classification based on classifier chain label sequence optimization of claim 1 wherein the step of forming a co-occurrence vector using a plurality of co-occurrence branches of a co-occurrence matrix comprises:
obtaining the co-occurrence rate corresponding to each first sample tag element in the co-occurrence matrix, and obtaining the maximum co-occurrence rate;
acquiring a second sample tag element corresponding to the maximum co-occurrence rate of each first sample tag element;
forming a plurality of co-occurrence branches by each first sample tag element, the maximum co-occurrence rate corresponding to each first sample tag element and the second sample tag element corresponding to the maximum co-occurrence rate of each first sample tag element;
a co-occurrence vector is composed of a plurality of co-occurrence branches.
3. The multi-label classification method based on classifier chain label sequence optimization of claim 2 wherein co-occurrence is the ratio of each element in the co-occurrence matrix to the number of input samples to be classified.
4. A multi-tag classification system based on classifier chain tag sequence optimization, comprising:
the sample acquisition module is used for acquiring an input sample to be classified, wherein the input sample comprises: pictures, texts;
the classifier chain model identification module is used for acquiring a sample label set of an input sample to be classified, wherein the sample label is an attribute corresponding to a picture or an attribute corresponding to a text;
the co-occurrence analysis module is used for acquiring a co-occurrence matrix corresponding to the sample tag set, wherein elements of the co-occurrence matrix are the probability that sample tag set elements in the sample tag set occur simultaneously and do not occur simultaneously;
the co-occurrence vector acquisition module is used for acquiring co-occurrence vectors by utilizing a plurality of co-occurrence branches of the co-occurrence matrix;
the classifier chain label sequence generating module is used for acquiring the initial branches of the classifier chains according to the co-occurrence vector, generating the required sequence of the classifier chains based on a greedy strategy, and is used for multi-label classification of input samples, and comprises the following steps:
adding the co-occurrence branch with the largest co-occurrence rate in the co-occurrence vector to the initial branch of the classifier chain;
searching, for the tag at the tail of the classifier chain, the corresponding maximum co-occurrence branch, and continuously growing the whole chain structure until the tag sequence of the whole classifier chain is obtained, which comprises: selecting a third sample tag element at the end of the classifier chain; if a co-occurrence branch taking the third sample tag element as an endpoint exists in the co-occurrence vector, adding the fourth sample tag element at the other end of the co-occurrence branch to the tail of the classifier chain and removing the fourth sample tag element from the tag set; otherwise, traversing the remaining tag set, acquiring the suboptimal fifth sample tag element, adding the fifth sample tag element to the classifier chain and removing the fifth sample tag element from the tag set; and so on, so as to continuously grow the whole chain structure.
5. A terminal, comprising: at least one processor, and a memory communicatively coupled to the at least one processor, wherein the memory stores instructions executable by the at least one processor to cause the at least one processor to perform the multi-label classification method based on classifier chain label sequence optimization of any one of claims 1-3.
6. A computer readable storage medium having stored thereon computer instructions for causing the computer to perform the multi-label classification method based on classifier chain label sequence optimization of any one of claims 1-3.
CN202010397834.6A 2020-05-12 2020-05-12 Optimization method and system for classifier chain tag sequence Active CN111553442B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010397834.6A CN111553442B (en) 2020-05-12 2020-05-12 Optimization method and system for classifier chain tag sequence


Publications (2)

Publication Number Publication Date
CN111553442A CN111553442A (en) 2020-08-18
CN111553442B true CN111553442B (en) 2024-03-12

Family

ID=72000679

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010397834.6A Active CN111553442B (en) 2020-05-12 2020-05-12 Optimization method and system for classifier chain tag sequence

Country Status (1)

Country Link
CN (1) CN111553442B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112800222B (en) * 2021-01-26 2022-07-19 天津科技大学 Multi-task auxiliary limit multi-label short text classification method using co-occurrence information
CN113568738A (en) * 2021-07-02 2021-10-29 上海淇玥信息技术有限公司 Resource allocation method and device based on multi-label classification, electronic equipment and medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109036577A (en) * 2018-07-27 2018-12-18 合肥工业大学 Diabetic complication analysis method and device
CN109783636A (en) * 2018-12-12 2019-05-21 重庆邮电大学 A kind of car review subject distillation method based on classifier chains
CN110442707A (en) * 2019-06-21 2019-11-12 电子科技大学 A kind of multi-tag file classification method based on seq2seq
CN110751188A (en) * 2019-09-26 2020-02-04 华南师范大学 User label prediction method, system and storage medium based on multi-label learning

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9916540B2 (en) * 2015-01-22 2018-03-13 Microsoft Technology Licensing, Llc Scalable-effort classifiers for energy-efficient machine learning
US11086918B2 (en) * 2016-12-07 2021-08-10 Mitsubishi Electric Research Laboratories, Inc. Method and system for multi-label classification
US10862765B2 (en) * 2018-07-31 2020-12-08 EMC IP Holding Company LLC Allocation of shared computing resources using a classifier chain


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Multi-label text classification based on the PLSA topic model; Jiang Mingchu; Pan Zhisong; You Jun; Journal of Data Acquisition and Processing; 2016-05-15 (03); full text *
Multi-label optimal-order selection classification algorithm based on a double-layer structure; Liu Geqiao; Guo Tao; Computer Engineering and Design; 2016-04-16 (04); full text *

Also Published As

Publication number Publication date
CN111553442A (en) 2020-08-18


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20210112

Address after: 102209 18 Riverside Avenue, Changping District science and Technology City, Beijing

Applicant after: GLOBAL ENERGY INTERCONNECTION RESEARCH INSTITUTE Co.,Ltd.

Applicant after: STATE GRID CORPORATION OF CHINA

Applicant after: STATE GRID INFORMATION & TELECOMMUNICATION BRANCH

Applicant after: INFORMATION COMMUNICATION COMPANY OF STATE GRID SHANDONG ELECTRIC POWER Co.

Address before: 102209 18 Riverside Avenue, Changping District science and Technology City, Beijing

Applicant before: GLOBAL ENERGY INTERCONNECTION RESEARCH INSTITUTE Co.,Ltd.

Applicant before: STATE GRID CORPORATION OF CHINA

CB02 Change of applicant information

Address after: 102209 18 Riverside Avenue, Changping District science and Technology City, Beijing

Applicant after: State Grid Smart Grid Research Institute Co.,Ltd.

Applicant after: STATE GRID CORPORATION OF CHINA

Applicant after: STATE GRID INFORMATION & TELECOMMUNICATION BRANCH

Applicant after: INFORMATION COMMUNICATION COMPANY OF STATE GRID SHANDONG ELECTRIC POWER Co.

Address before: 102209 18 Riverside Avenue, Changping District science and Technology City, Beijing

Applicant before: GLOBAL ENERGY INTERCONNECTION RESEARCH INSTITUTE Co.,Ltd.

Applicant before: STATE GRID CORPORATION OF CHINA

Applicant before: STATE GRID INFORMATION & TELECOMMUNICATION BRANCH

Applicant before: INFORMATION COMMUNICATION COMPANY OF STATE GRID SHANDONG ELECTRIC POWER Co.

CB03 Change of inventor or designer information

Inventor after: Zheng Rongrong

Inventor after: Xue Wenting

Inventor after: Zhang Qiang

Inventor after: Song Bochuan

Inventor after: Jia Quanye

Inventor after: Chai Bo

Inventor after: Zhang Wenbin

Inventor before: Zhang Qiang

Inventor before: Song Bochuan

Inventor before: Jia Quanye

Inventor before: Chai Bo

GR01 Patent grant