CN110347791B - Topic recommendation method based on multi-label classification convolutional neural network - Google Patents

Info

Publication number
CN110347791B
Authority
CN
China
Prior art keywords
label
neural network
convolutional neural
correlation
labels
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910541695.7A
Other languages
Chinese (zh)
Other versions
CN110347791A (en)
Inventor
袁锦杰
蔡瑞初
郝志峰
温雯
王丽娟
陈炳丰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong University of Technology
Original Assignee
Guangdong University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong University of Technology
Priority to CN201910541695.7A
Publication of CN110347791A
Application granted
Publication of CN110347791B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Computational Linguistics (AREA)
  • Biomedical Technology (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biophysics (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to the field of education, in particular to a topic recommendation method based on a multi-label classification convolutional neural network.

Description

Topic recommendation method based on multi-label classification convolutional neural network
Technical Field
The invention relates to the field of education, in particular to a topic recommendation method based on a multi-label classification convolutional neural network.
Background
With the rapid development of computer technology, technology has been widely applied in education. Finding similar questions from a single question is particularly useful. A student who has not mastered a certain question type or examination point needs more similar questions to consolidate that knowledge. For teachers, examination papers tend to contain similar questions: the examination points are fixed while the questions vary, so retrieving other questions on the same examination points from one question makes it much easier to set papers. The key to finding questions similar to a given question is extracting its examination point information. Examination point extraction is the process of discovering and identifying the concepts, key points and rule relationships contained in the question information, which changes the way traditional knowledge points and examination points are organized and managed. Current models for examination point extraction include rough sets, genetic algorithms, neural networks, multi-label classification and latent semantic indexing. Multi-label learning has attracted wide attention and made considerable progress in past research. How to learn and exploit the dependencies between labels is widely recognized as a key issue, and learning and exploiting these dependencies effectively is key to improving the performance of multi-label classification models. Current multi-label learning methods, however, still fall short in efficiency and accuracy.
Disclosure of Invention
To address the shortcomings in efficiency and accuracy of prior-art multi-label learning methods, the invention provides a topic recommendation method based on a multi-label classification convolutional neural network.
To solve the above technical problem, the technical scheme of the invention is as follows:
a topic recommendation method based on a multi-label classification convolutional neural network comprises the following steps:
step S1: acquiring a plurality of questions and an examination point set as sample instances, marking the examination point labels of each question, and storing the marked questions as a question bank;
step S2: obtaining a group of keywords for each question in the question bank, and training word vectors for all the keywords;
step S3: calculating the local correlation matrix of the labels to obtain the local correlation between labels, and augmenting the training set by searching for and matching high-correlation label pairs according to that local correlation;
step S4: constructing a one-dimensional convolutional neural network, wherein the input layer is the word vectors of a question's keywords, the output layer is the predicted values of the examination point labels, and the effective elements of the local correlation matrix between labels are added as neurons to the first fully connected layer; training the one-dimensional convolutional neural network and selecting the optimal model;
step S5: inputting a new question into the one-dimensional convolutional neural network of the optimal model, outputting the predicted values of its examination point labels, grouping all questions with a clustering method according to their examination point features, obtaining other questions whose examination points are similar to those of the new question, and taking them as recommended questions.
Preferably, in step S1, if a question includes an examination point, the corresponding examination point label value of the sample instance is set to 1; otherwise it is set to 0.
Preferably, in step S2, the pictures, stop words and punctuation marks of each question are filtered out while special symbols and professional vocabulary are retained, giving a group of keywords for the question; the length of each sample is extended to the maximum number of keywords in the question set, with blank positions filled with a designated character so that all samples have the same input dimension; and a word vector is trained for each keyword.
Preferably, in step S3, let $l_t$ and $l_z$ be any two examination point labels, and define the local correlation of $l_t$ and $l_z$ as:
$$C_{tz} = \frac{n(l_t \cap l_z)}{n(l_t \cup l_z)}$$
where $n(l_t \cap l_z)$ is the number of questions for which both labels are 1 and $n(l_t \cup l_z)$ is the number of questions for which at least one of the two labels is 1. Computing the local correlation of every pair of labels yields a symmetric matrix C whose diagonal elements are 1, since the correlation of a label with itself is 1. Based on the matrix C, the training set is augmented by searching for and matching high-correlation label pairs.
Preferably, the method for searching for high-correlation label pairs comprises the following steps:
setting a local correlation threshold g; for each sample instance, listing all labels whose value is 1 and pairing them two by two; if the local correlation of the two labels in a pair is smaller than g, the pair is discarded; otherwise it is retained and regarded as a high-correlation label pair, so that each sample instance may correspond to zero or more high-correlation label pairs.
Preferably, the method for matching high-correlation label pairs comprises the following steps:
for each high-correlation label pair in the question bank, traversing the high-correlation label pairs of the other sample instances and searching for an identical pair; if two label pairs match, denoting the pair as (lu, lv), taking the arithmetic mean of the two sample instances to which the two pairs belong to generate a new positive instance whose lu and lv label values are 1, and adding the new instance to a new training set Dk; finally merging Dk with the original sample instance set into the total training set.
Preferably, if the arithmetic-mean instance of the two instances whose label pairs match already exists in Dk, the corresponding labels of the existing instance are set to 1 directly and no new training sample is added.
Preferably, in step S4, the one-dimensional convolutional neural network is built as follows:
the input layer is the group of keywords of a question represented by word vectors; the number of channels of the one-dimensional convolutional neural network equals the word-vector dimension; the output-layer activation function is the Sigmoid function; the cost function is the classical cross-entropy function; the effective elements of the local correlation matrix are extracted, namely the elements that remain after the diagonal and the symmetric redundant part are removed; each element corresponds to one neuron and is added to the first fully connected layer of the network, so that the model can exploit the correlation between labels during learning and prediction.
Preferably, in step S5, the examination point label features of the new question and of all questions in the question bank are taken as the data set and divided into several clusters; if noise points or outliers exist, the corresponding question has no similar question and forms a cluster by itself.
Preferably, in step S5, if the input new question already exists in the question bank, the clustering result of the question bank is used directly to find the recommended questions; otherwise, the examination points of the question are predicted by the convolutional neural network, and questions in the question bank whose examination points are similar to those of the new question are found with the clustering method and taken as recommended questions; if the examination point feature of the new question is a noise point during clustering, no question can be recommended.
Compared with the prior art, the technical scheme of the invention has the following beneficial effects:
1. The invention emphasizes the correlation between highly correlated labels in the training set. Label pairs whose correlation is not lower than a given threshold are regarded as high-correlation label pairs, which adjusts the imbalance rate of each label and lets the model learn these high correlations better, improving classification accuracy.
2. The convolutional neural network automatically extracts features from the question keywords, which helps it classify the examination point labels. In addition, the method adds the correlation information between labels to the first fully connected layer of the convolutional neural network, so that the correlation between labels is taken into account during training, improving the efficiency and accuracy of classification.
Drawings
FIG. 1 is a general flow chart of an implementation of the present invention;
FIG. 2 is a flow chart of calculating the local correlation between labels according to the present invention;
FIG. 3 is a flow chart of adding training instances by searching for and matching high-correlation label pairs according to the present invention;
FIG. 4 is a schematic diagram of the one-dimensional convolutional neural network according to the present invention.
Detailed Description
The drawings are for illustrative purposes only and are not to be construed as limiting the present patent;
for the purpose of better illustrating the embodiments, certain elements of the drawings may be omitted, enlarged or reduced and do not represent the actual product dimensions;
it will be appreciated by those skilled in the art that certain well-known structures in the drawings and descriptions thereof may be omitted.
The technical scheme of the invention is further described below with reference to the accompanying drawings and examples.
Example 1
As shown in FIG. 1, FIG. 2 and FIG. 3, a topic recommendation method based on a multi-label classification convolutional neural network includes the following steps:
step S1: acquiring a plurality of questions and an examination point set as sample instances, marking the examination point labels of each question, and storing the marked questions as a question bank;
step S2: obtaining a group of keywords for each question in the question bank, and training word vectors for all the keywords;
step S3: calculating the local correlation matrix of the labels to obtain the local correlation between labels, and augmenting the training set by searching for and matching high-correlation label pairs according to that local correlation;
step S4: constructing a one-dimensional convolutional neural network, wherein the input layer is the word vectors of a question's keywords, the output layer is the predicted values of the examination point labels, and the effective elements of the local correlation matrix between labels are added as neurons to the first fully connected layer; training the one-dimensional convolutional neural network and selecting the optimal model;
step S5: inputting a new question into the one-dimensional convolutional neural network of the optimal model, outputting the predicted values of its examination point labels, grouping all questions with a clustering method according to their examination point features, obtaining other questions whose examination points are similar to those of the new question, and taking them as recommended questions.
As a preferred embodiment, in step S1, if a question includes an examination point, the corresponding examination point label value of the sample instance is set to 1; otherwise it is set to 0.
As a preferred embodiment, in step S2, the pictures, stop words and punctuation marks of each question are filtered out while special symbols and professional vocabulary are retained, giving a group of keywords for the question; the length of each sample is extended to the maximum number of keywords in the question set, with blank positions filled with a designated character so that all samples have the same input dimension; and a word vector is trained for each keyword.
As a preferred embodiment, in step S3, let $l_t$ and $l_z$ be any two examination point labels; the local correlation of $l_t$ and $l_z$ is defined as:
$$C_{tz} = \frac{n(l_t \cap l_z)}{n(l_t \cup l_z)}$$
where $n(l_t \cap l_z)$ is the number of questions for which both labels are 1 and $n(l_t \cup l_z)$ is the number of questions for which at least one of the two labels is 1. Computing the local correlation of every pair of labels yields a symmetric matrix C whose diagonal elements are 1, since the correlation of a label with itself is 1. Based on the matrix C, the training set is augmented by searching for and matching high-correlation label pairs.
As a preferred embodiment, the method for searching for high-correlation label pairs comprises the following steps:
setting a local correlation threshold g; for each sample instance, listing all labels whose value is 1 and pairing them two by two; if the local correlation of the two labels in a pair is smaller than g, the pair is discarded; otherwise it is retained and regarded as a high-correlation label pair, so that each sample instance may correspond to zero or more high-correlation label pairs.
As a preferred embodiment, the method for matching high-correlation label pairs comprises the following steps:
for each high-correlation label pair in the question bank, traversing the high-correlation label pairs of the other sample instances and searching for an identical pair; if two label pairs match, denoting the pair as (lu, lv), taking the arithmetic mean of the two sample instances to which the two pairs belong to generate a new positive instance whose lu and lv label values are 1, and adding the new instance to a new training set Dk; finally merging Dk with the original sample instance set into the total training set.
As a preferred embodiment, if the arithmetic-mean instance of the two instances whose label pairs match already exists in Dk, the corresponding labels of the existing instance are set to 1 directly and no new training sample is added.
As a preferred embodiment, in step S4, the one-dimensional convolutional neural network is built as follows:
the input layer is the group of keywords of a question represented by word vectors; the number of channels of the one-dimensional convolutional neural network equals the word-vector dimension; the output-layer activation function is the Sigmoid function; the cost function is the classical cross-entropy function; the effective elements of the local correlation matrix are extracted, namely the elements that remain after the diagonal and the symmetric redundant part are removed; each element corresponds to one neuron and is added to the first fully connected layer of the network, so that the model can exploit the correlation between labels during learning and prediction.
As a preferred embodiment, in step S5, the examination point label features of the new question and of all questions in the question bank are taken as the data set and divided into several clusters; if noise points or outliers exist, the corresponding question has no similar question and forms a cluster by itself.
As a preferred embodiment, in step S5, if the input new question already exists in the question bank, the clustering result of the question bank is used directly to find the recommended questions; otherwise, the examination points of the question are predicted by the convolutional neural network, and questions in the question bank whose examination points are similar to those of the new question are found with the clustering method and taken as recommended questions; if the examination point feature of the new question is a noise point during clustering, no question can be recommended.
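The recommendation step can be sketched as follows. This is a minimal illustration, not the patented implementation: claim 5 names density-based clustering, and a small DBSCAN-style routine is assumed here as one such method. The Hamming distance, the parameters `eps` and `min_pts`, and the function names are all assumptions; the question bank reuses the five demonstration label vectors from Example 2.

```python
def hamming(a, b):
    """Number of positions where two label vectors differ (assumed distance)."""
    return sum(x != y for x, y in zip(a, b))

def dbscan(points, eps, min_pts):
    """Minimal DBSCAN: returns one cluster id per point, -1 marks a noise point."""
    NOISE, UNVISITED = -1, None
    labels = [UNVISITED] * len(points)

    def neighbours(i):
        return [j for j in range(len(points))
                if hamming(points[i], points[j]) <= eps]

    cluster = 0
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE          # may later become a border point
            continue
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster    # border point joins the cluster
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            nb = neighbours(j)
            if len(nb) >= min_pts:     # core point: keep expanding
                queue.extend(nb)
        cluster += 1
    return labels

def recommend(new_labels, bank, eps=0, min_pts=2):
    """Cluster the bank together with the new question and return its cluster mates."""
    names = list(bank)
    points = [bank[n] for n in names] + [new_labels]
    ids = dbscan(points, eps, min_pts)
    new_id = ids[-1]
    if new_id == -1:
        return []                      # noise point: nothing can be recommended
    return [n for n, cid in zip(names, ids) if cid == new_id]

# Question bank: examination point label vectors of x1..x5 from Example 2.
bank = {
    "x1": [1, 0, 0, 0, 0],
    "x2": [0, 1, 1, 0, 0],
    "x3": [1, 1, 1, 0, 1],
    "x4": [0, 1, 1, 0, 0],
    "x5": [1, 0, 0, 1, 1],
}
print(recommend([0, 1, 1, 0, 0], bank))  # predicted labels of a new question
print(recommend([0, 0, 0, 1, 0], bank))  # a new question that ends up as noise
```

With `eps=0` only questions with identical examination point labels are grouped; a larger `eps` would also admit questions that differ in a few labels.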
Example 2
In this embodiment, in step S2, the pictures, stop words and punctuation marks of each question are filtered out while special symbols and professional vocabulary are retained, giving a group of keywords for the question. The length of each sample is extended to the maximum number of keywords in the question set, and blank positions are filled with a designated character so that all samples have the same input dimension. A word vector is then trained for each keyword.
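The keyword extraction and padding described above can be sketched as follows. The stop-word list, the padding token, the tokenizing regular expression and the two example questions are illustrative assumptions, since the patent does not specify them; picture filtering and word-vector training are omitted.

```python
import re

# Hypothetical stop-word list and padding token (not specified in the patent).
STOP_WORDS = {"the", "a", "an", "of", "is", "to", "and", "in"}
PAD_TOKEN = "<PAD>"

def extract_keywords(question):
    """Drop punctuation and stop words; keep words, numbers and math symbols."""
    # A token is either an alphanumeric word or one retained special symbol.
    tokens = re.findall(r"[A-Za-z0-9]+|[+\-*/=<>^]", question.lower())
    return [t for t in tokens if t not in STOP_WORDS]

def pad_samples(keyword_lists):
    """Pad every keyword list to the length of the longest one."""
    max_len = max(len(k) for k in keyword_lists)
    return [k + [PAD_TOKEN] * (max_len - len(k)) for k in keyword_lists]

questions = [
    "Solve the equation x + 2 = 5.",
    "Find the area of a circle.",
]
keywords = [extract_keywords(q) for q in questions]
padded = pad_samples(keywords)
for row in padded:
    print(row)
```

After padding, every sample has the same number of positions, so the word vectors of its keywords form an input of fixed size.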
Five sample instances are used as a demonstration:

| Sample instance | Examination point 1 | Examination point 2 | Examination point 3 | Examination point 4 | Examination point 5 |
|---|---|---|---|---|---|
| x1 | 1 | 0 | 0 | 0 | 0 |
| x2 | 0 | 1 | 1 | 0 | 0 |
| x3 | 1 | 1 | 1 | 0 | 1 |
| x4 | 0 | 1 | 1 | 0 | 0 |
| x5 | 1 | 0 | 0 | 1 | 1 |
As a preferred embodiment, in step S3, the flow for calculating the local correlation between labels is shown in FIG. 2. Let $l_t$ and $l_z$ be any two examination point labels; their local correlation is defined as:
$$C_{tz} = \frac{n(l_t \cap l_z)}{n(l_t \cup l_z)}$$
where $n(l_t \cap l_z)$ is the number of instances in which both labels are 1 and $n(l_t \cup l_z)$ is the number of instances in which at least one of the two labels is 1. Computing the local correlation of every pair of labels yields a symmetric matrix C whose diagonal elements are 1, since the correlation of a label with itself is 1. The local correlation matrix C of the example samples is as follows:
| | Examination point 1 | Examination point 2 | Examination point 3 | Examination point 4 | Examination point 5 |
|---|---|---|---|---|---|
| Examination point 1 | 1 | 1/5 | 1/5 | 1/3 | 2/3 |
| Examination point 2 | 1/5 | 1 | 1 | 0 | 1/4 |
| Examination point 3 | 1/5 | 1 | 1 | 0 | 1/4 |
| Examination point 4 | 1/3 | 0 | 0 | 1 | 1/2 |
| Examination point 5 | 2/3 | 1/4 | 1/4 | 1/2 | 1 |
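The matrix above can be reproduced with a short sketch. The function name and the use of exact fractions are illustrative choices, and returning 0 for a pair of labels that never occurs is an assumption, since the patent does not cover that case.

```python
from fractions import Fraction

# Label matrix of the demonstration samples: rows x1..x5,
# columns examination points 1..5.
Y = [
    [1, 0, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [1, 1, 1, 0, 1],
    [0, 1, 1, 0, 0],
    [1, 0, 0, 1, 1],
]

def local_correlation_matrix(labels):
    """C[t][z] = (#instances with both labels 1) / (#instances with at least one 1)."""
    h = len(labels[0])
    C = [[Fraction(0)] * h for _ in range(h)]
    for t in range(h):
        for z in range(h):
            both = sum(1 for row in labels if row[t] == 1 and row[z] == 1)
            either = sum(1 for row in labels if row[t] == 1 or row[z] == 1)
            # Assumption: if neither label ever occurs, the correlation is 0.
            C[t][z] = Fraction(both, either) if either else Fraction(0)
    return C

C = local_correlation_matrix(Y)
for row in C:
    print([str(v) for v in row])
```

For example, examination points 1 and 5 are both 1 in x3 and x5, and at least one of them is 1 in x1, x3 and x5, which gives 2/3.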
After the local correlation matrix is obtained, training instances are added by searching for and matching high-correlation label pairs, as shown in FIG. 3. A local correlation threshold g is set. For each instance, all labels whose value is 1 are listed and paired two by two. If the local correlation of the two labels in a pair is smaller than g, the pair is discarded; otherwise it is retained and called a high-correlation label pair, so each sample may correspond to zero or more high-correlation label pairs. The high-correlation label pairs of the example samples are as follows (with the correlation threshold g set to 0.6):
| Sample instance | High-correlation label pairs |
|---|---|
| x1 | None |
| x2 | (examination point 2, examination point 3) |
| x3 | (examination point 1, examination point 5), (examination point 2, examination point 3) |
| x4 | (examination point 2, examination point 3) |
| x5 | (examination point 1, examination point 5) |
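The pair search can be sketched as below, applying the threshold rule directly to the demonstration label matrix. The function names are hypothetical, and pairs are reported with 1-based examination point numbers.

```python
from fractions import Fraction
from itertools import combinations

# Label vectors of the demonstration samples (examination points 1..5).
samples = {
    "x1": [1, 0, 0, 0, 0],
    "x2": [0, 1, 1, 0, 0],
    "x3": [1, 1, 1, 0, 1],
    "x4": [0, 1, 1, 0, 0],
    "x5": [1, 0, 0, 1, 1],
}

def local_correlation(rows, t, z):
    """Both-labels count divided by at-least-one-label count for labels t and z."""
    both = sum(1 for r in rows if r[t] == 1 and r[z] == 1)
    either = sum(1 for r in rows if r[t] == 1 or r[z] == 1)
    return Fraction(both, either) if either else Fraction(0)

def high_correlation_pairs(row, rows, g):
    """Pair up the labels that are 1 in `row` and keep pairs with correlation >= g."""
    active = [i for i, v in enumerate(row) if v == 1]
    return [(t + 1, z + 1) for t, z in combinations(active, 2)
            if local_correlation(rows, t, z) >= g]

g = Fraction(3, 5)  # threshold 0.6, as in the example
rows = list(samples.values())
pairs = {name: high_correlation_pairs(row, rows, g) for name, row in samples.items()}
for name, found in pairs.items():
    print(name, found)
```

A sample with a single active label, such as x1, has nothing to pair and therefore yields no high-correlation label pair; x2 and x4 both retain the pair of examination points 2 and 3.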
For each high-correlation label pair of every instance, the high-correlation label pairs of all instances that come after the instance to which it belongs are traversed in search of an identical pair. If two label pairs match, the pair is denoted (lu, lv), and the arithmetic mean of the two instances to which the two pairs belong is taken to generate a new positive instance whose lu and lv label values are 1. The new instance is added to a new training set Dk, and Dk is finally merged with the original instance set into the total training set.
For example, samples x2 and x4 have the same high-correlation label pair (examination point 2, examination point 3). The match succeeds, so a new training instance can be generated from the two samples and stored in Dk, with the label values of examination point 2 and examination point 3 set to 1.
If the arithmetic-mean instance of the two instances whose label pairs match already exists in Dk, the corresponding labels of the existing training instance are set to 1 directly, and no new training instance needs to be added.
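The matching step and the rule for an already existing averaged instance can be sketched as follows. The feature vectors are hypothetical stand-ins for the word-vector inputs, the instances `xa` and `xb` are invented to show two matches between the same two instances, and setting every other label of a generated instance to 0 is an assumption.

```python
from itertools import combinations

def augment(features, pairs, num_labels):
    """Build D_k: averaged instances for instances sharing a high-correlation pair."""
    dk = {}  # averaged feature vector (as a tuple) -> label vector
    for a, b in combinations(sorted(features), 2):
        for lu, lv in sorted(pairs[a] & pairs[b]):
            mean = tuple((p + q) / 2 for p, q in zip(features[a], features[b]))
            # If this averaged instance already exists in D_k, reuse it and
            # only switch on the labels of the newly matched pair.
            labels = dk.setdefault(mean, [0] * num_labels)
            labels[lu - 1] = 1
            labels[lv - 1] = 1
    return dk

# Hypothetical 2-dimensional features; x2 and x4 share the pair (2, 3)
# as in the example, xa and xb share two pairs.
features = {
    "x2": [0.25, 0.5],
    "x4": [0.75, 0.0],
    "xa": [1.0, 2.0],
    "xb": [3.0, 4.0],
}
pairs = {
    "x2": {(2, 3)},
    "x4": {(2, 3)},
    "xa": {(1, 4), (1, 5)},
    "xb": {(1, 4), (1, 5)},
}

dk = augment(features, pairs, num_labels=5)
for mean, labels in dk.items():
    print(mean, labels)
```

The two matches between `xa` and `xb` produce a single averaged instance carrying the labels of both pairs, rather than two duplicate instances.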
Example 3
In this embodiment, a one-dimensional convolutional neural network is built as shown in FIG. 4. The input layer is the group of keywords of a question represented by word vectors, and the number of channels equals the word-vector dimension; if each question has 8 keywords and the word vector of each keyword has dimension 16, the input layer has dimension 8 × 16. The output-layer activation function is the Sigmoid function and the cost function can be the classical cross-entropy function. The training process of the network is not described in detail here: the common back-propagation algorithm can be used to adjust the parameters, and the hyperparameters can be determined by the Grid Search method of the Scikit-Learn framework. The improvement to the convolutional neural network is that the effective elements of the local correlation matrix are extracted, namely the elements that remain after the diagonal and the symmetric redundant part are removed. If the total number of labels is h, the matrix C has size h × h and only h × (h-1)/2 effective elements are added to the first fully connected layer of the network, with one element corresponding to one neuron. Besides the feature neurons obtained from the convolutional and pooling layers, the first fully connected layer therefore also contains neurons carrying label correlation information. The fully connected part can then be treated as a standard neural network that takes this first layer as its input, so that the model can exploit the correlation between labels during learning and prediction, improving efficiency and accuracy.
The output-layer activation function is the Sigmoid function and each output neuron corresponds to one label. If the output value exceeds a set threshold, the corresponding label is set to 1; otherwise it is set to 0.
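The two network-specific details above, extracting the h × (h-1)/2 effective elements and thresholding the Sigmoid outputs, can be sketched without a deep-learning framework. The pooled feature values, the logits and the threshold of 0.5 are hypothetical, and the convolution, pooling and training steps are not shown.

```python
import math

def effective_elements(C):
    """Strict upper triangle of a symmetric h x h matrix: h*(h-1)/2 values."""
    h = len(C)
    return [C[t][z] for t in range(h) for z in range(t + 1, h)]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def predict_labels(logits, threshold=0.5):
    """Set a label to 1 when its Sigmoid output exceeds the threshold, else 0."""
    return [1 if sigmoid(v) > threshold else 0 for v in logits]

# Local correlation matrix of the example samples (h = 5).
C = [
    [1,   1/5, 1/5, 1/3, 2/3],
    [1/5, 1,   1,   0,   1/4],
    [1/5, 1,   1,   0,   1/4],
    [1/3, 0,   0,   1,   1/2],
    [2/3, 1/4, 1/4, 1/2, 1],
]

pooled = [0.3, 0.1, 0.7, 0.5]             # hypothetical conv + pooling features
fc_input = pooled + effective_elements(C)  # neurons of the first fully connected layer
print(len(effective_elements(C)), len(fc_input))
print(predict_labels([2.0, -1.0, 0.0]))    # hypothetical output-layer pre-activations
```

For h = 5 labels this adds 10 correlation neurons next to the feature neurons; the same correlation values are appended for every sample because they describe the label set rather than an individual question.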
The same or similar reference numerals correspond to the same or similar components;
the terms describing the positional relationship in the drawings are merely illustrative, and are not to be construed as limiting the present patent;
it is to be understood that the above examples of the present invention are provided by way of illustration only and do not limit the embodiments of the present invention. Other variations or modifications based on the above teachings will be apparent to those of ordinary skill in the art. It is neither necessary nor possible to enumerate all embodiments here. Any modification, equivalent replacement or improvement that comes within the spirit and principles of the invention is intended to be protected by the following claims.

Claims (6)

1. A topic recommendation method based on a multi-label classification convolutional neural network, characterized by comprising the following steps:
step S1: acquiring a plurality of questions and an examination point set as sample instances, marking the examination point labels of each question, and storing the marked questions as a question bank;
step S2: obtaining a group of keywords for each question in the question bank, and training word vectors for all the keywords;
step S3: calculating the local correlation matrix of the labels to obtain the local correlation between labels, and augmenting the training set by searching for and matching high-correlation label pairs according to that local correlation;
step S4: constructing a one-dimensional convolutional neural network, wherein the input layer is the word vectors of a question's keywords, the output layer is the predicted values of the examination point labels, and the effective elements of the local correlation matrix between labels are added as neurons to the first fully connected layer; training the one-dimensional convolutional neural network and selecting the optimal model;
step S5: inputting a new question into the one-dimensional convolutional neural network of the optimal model, outputting the predicted values of its examination point labels, grouping all questions with a clustering method according to their examination point features, obtaining other questions whose examination points are similar to those of the new question, and taking them as recommended questions;
in step S1, if a question includes an examination point, the corresponding examination point label value of the sample instance is set to 1; otherwise it is set to 0;
in step S3, let $l_t$ and $l_z$ be any two examination point labels, and define the local correlation of $l_t$ and $l_z$ as:
$$C_{tz} = \frac{n(l_t \cap l_z)}{n(l_t \cup l_z)}$$
where $n(l_t \cap l_z)$ is the number of questions for which both labels are 1 and $n(l_t \cup l_z)$ is the number of questions for which at least one of the two labels is 1; computing the local correlation of every pair of labels yields a symmetric matrix C whose diagonal elements are 1, since the correlation of a label with itself is 1; based on the matrix C, the training set is augmented by searching for and matching high-correlation label pairs;
the method for searching for high-correlation label pairs comprises the following steps:
setting a local correlation threshold g; for each sample instance, listing all labels whose value is 1 and pairing them two by two; if the local correlation of the two labels in a pair is smaller than g, the pair is discarded; otherwise it is retained and regarded as a high-correlation label pair, so that each sample instance may correspond to zero or more high-correlation label pairs;
the method for matching high-correlation label pairs comprises the following steps:
for each high-correlation label pair in the question bank, traversing the high-correlation label pairs of the other sample instances and searching for an identical pair; if two label pairs match, denoting the pair as (lu, lv), taking the arithmetic mean of the two sample instances to which the two pairs belong to generate a new positive instance whose lu and lv label values are 1, and adding the new instance to a new training set Dk; finally merging Dk with the original sample instance set into the total training set.
2. The title recommendation method based on the multi-label classification convolutional neural network according to claim 1, wherein in step S2, pictures, stop words and punctuation marks of the title are required to be filtered, special symbols and professional words are reserved, a group of keywords of the title are obtained, the length of each sample is increased to the maximum number of keywords in the title set, blank positions are filled with designated characters, the input dimensions of the samples are consistent, and word vectors of each keyword are trained;
3. The topic recommendation method based on the multi-label classification convolutional neural network according to claim 2, wherein if the arithmetic-mean instance of the two instances corresponding to a successfully matched label pair already exists in $D_k$, the corresponding labels of the existing instance are set to 1 directly, without adding a new training sample.
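One way to realize this rule is to key the generated set by the averaged feature vector, as in the sketch below. Storing the set as a dictionary and the helper name `add_or_merge` are illustrative choices, not something the claim prescribes.

```python
def add_or_merge(D_k, mean_x, u, v, n_labels):
    """Add an averaged instance, or update the labels of an identical existing one."""
    key = tuple(mean_x)  # exact match on the averaged features
    if key in D_k:
        D_k[key][u] = D_k[key][v] = 1  # reuse the instance, just switch the labels on
    else:
        y = [0] * n_labels
        y[u] = y[v] = 1
        D_k[key] = y

D_k = {}
add_or_merge(D_k, [2.0, 4.0], 0, 1, n_labels=4)  # new instance
add_or_merge(D_k, [2.0, 4.0], 1, 2, n_labels=4)  # same features: labels merged
```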
4. The topic recommendation method based on the multi-label classification convolutional neural network according to claim 1, wherein in step S4, the specific steps of building a one-dimensional convolutional neural network are as follows:
the input layer is the group of keywords of a topic, each represented by its word vector; the number of channels of the one-dimensional convolutional neural network equals the dimension of the word vectors; the output layer uses the Sigmoid activation function; the cost function is the classical cross-entropy function; the effective elements of the local correlation matrix are extracted, the effective elements being those that remain after the diagonal and the symmetric redundant part are removed; and each effective element corresponds to one neuron that is added to the first fully connected layer of the one-dimensional convolutional neural network, so that the model can make use of the correlation among labels during learning and prediction.
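The sketch below illustrates only the non-convolutional pieces of this architecture: extracting the effective elements (the strict upper triangle of C), a Sigmoid output, and a cross-entropy cost. Treating the effective elements as extra inputs concatenated with the convolutional features, and reading "cross entropy" as the per-label binary form, are interpretations; the numeric values are made up.

```python
import math

def effective_elements(C):
    """Strict upper triangle of C, row by row: no diagonal, no mirrored half."""
    n = len(C)
    return [C[i][j] for i in range(n) for j in range(i + 1, n)]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def binary_cross_entropy(y_true, y_prob, eps=1e-12):
    """Mean per-label cross entropy; eps guards against log(0)."""
    terms = [
        t * math.log(max(p, eps)) + (1 - t) * math.log(max(1 - p, eps))
        for t, p in zip(y_true, y_prob)
    ]
    return -sum(terms) / len(terms)

C = [
    [1.0, 0.6, 0.2],
    [0.6, 1.0, 0.4],
    [0.2, 0.4, 1.0],
]
corr_inputs = effective_elements(C)      # n(n-1)/2 values for n labels

conv_features = [0.3, -1.2]              # hypothetical flattened convolution output
fc_input = conv_features + corr_inputs   # assumption: one extra input per element

probs = [sigmoid(z) for z in [0.0, 2.0, -2.0]]  # hypothetical output-layer logits
loss = binary_cross_entropy([1, 1, 0], probs)
```

For n labels the matrix contributes n(n-1)/2 extra neurons, so the first fully connected layer grows quadratically with the number of labels.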
5. The method according to claim 1, wherein in step S5, density-based clustering is used to group the examination-point label features of the new topic and of all topics in the topic library into clusters, the features of the new topic and of the library topics being clustered together; and a topic identified as a noise point or outlier has no similar topics.
6. The topic recommendation method based on the multi-label classification convolutional neural network according to claim 1, wherein in step S5, if the input new topic already exists in the topic library, the clustering result of the topic library is used directly to find the recommended topics; otherwise, the examination points of the new topic are predicted by the convolutional neural network, and topics in the topic library whose examination points are similar to those of the new topic are found by the clustering method and taken as the recommended topics; and if the examination-point feature of the new topic is a noise point during clustering, no topic can be recommended.
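The recommendation branch for a topic that is not yet in the library can be sketched with a small DBSCAN-style routine. The claims only say "density-based clustering", so DBSCAN, the Euclidean distance, and the `eps` and `min_pts` values are assumptions, and the predicted label vector is supplied by hand in place of the network output.

```python
def region_query(X, i, eps):
    """Indices of all points within distance eps of point i (including i)."""
    return [j for j in range(len(X))
            if sum((a - b) ** 2 for a, b in zip(X[i], X[j])) <= eps * eps]

def dbscan(X, eps, min_pts):
    """Minimal DBSCAN: returns a cluster id per point, -1 for noise."""
    labels = [None] * len(X)
    cluster = 0
    for i in range(len(X)):
        if labels[i] is not None:
            continue
        neighbors = region_query(X, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = -1  # provisionally noise; may become a border point
            continue
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point joins the cluster
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(X, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is a core point: keep expanding
        cluster += 1
    return labels

def recommend(new_features, bank, eps=1.0, min_pts=2):
    """Cluster the new topic together with the library; return similar topic indices."""
    labels = dbscan(bank + [new_features], eps, min_pts)
    if labels[-1] == -1:
        return []  # the new topic is a noise point: nothing to recommend
    return [i for i, lab in enumerate(labels[:-1]) if lab == labels[-1]]

# Toy library of examination-point label vectors (hypothetical).
bank = [[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 1, 1], [0, 0, 1, 1], [1, 0, 0, 1]]
similar = recommend([1, 1, 0, 0], bank)   # falls in the cluster of topics 0 and 1
nothing = recommend([0, 1, 0, 1], bank)   # too far from every topic: noise
```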
CN201910541695.7A 2019-06-20 2019-06-20 Topic recommendation method based on multi-label classification convolutional neural network Active CN110347791B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910541695.7A CN110347791B (en) 2019-06-20 2019-06-20 Topic recommendation method based on multi-label classification convolutional neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910541695.7A CN110347791B (en) 2019-06-20 2019-06-20 Topic recommendation method based on multi-label classification convolutional neural network

Publications (2)

Publication Number Publication Date
CN110347791A CN110347791A (en) 2019-10-18
CN110347791B true CN110347791B (en) 2023-06-16

Family

ID=68182672

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910541695.7A Active CN110347791B (en) 2019-06-20 2019-06-20 Topic recommendation method based on multi-label classification convolutional neural network

Country Status (1)

Country Link
CN (1) CN110347791B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112861896A (en) * 2019-11-27 2021-05-28 北京沃东天骏信息技术有限公司 Image identification method and device
CN111931875B (en) * 2020-10-10 2021-10-08 北京世纪好未来教育科技有限公司 Data processing method, electronic device and computer readable medium
CN112669181B (en) * 2020-12-29 2023-06-30 吉林工商学院 Assessment method for education practice training
CN112883284B (en) * 2021-04-14 2023-04-07 首都师范大学 Online learning system based on network and data analysis and test question recommendation method
CN114091607B (en) * 2021-11-24 2024-05-03 燕山大学 Semi-supervised multi-label online stream feature selection method based on neighborhood rough set

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9830526B1 (en) * 2016-05-26 2017-11-28 Adobe Systems Incorporated Generating image features based on robust feature-learning
CN109308319A (en) * 2018-08-21 2019-02-05 深圳中兴网信科技有限公司 File classification method, document sorting apparatus and computer readable storage medium

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104834729B (en) * 2015-05-14 2018-08-10 作业帮教育科技(北京)有限公司 Topic recommends method and topic recommendation apparatus
US20170075978A1 (en) * 2015-09-16 2017-03-16 Linkedin Corporation Model-based identification of relevant content
US11086918B2 (en) * 2016-12-07 2021-08-10 Mitsubishi Electric Research Laboratories, Inc. Method and system for multi-label classification
CN107895050A (en) * 2017-12-07 2018-04-10 联想(北京)有限公司 Image searching method and system
CN109388709A (en) * 2018-08-20 2019-02-26 国政通科技有限公司 Automatic Creating Test Paper method, electronic equipment and storage medium
CN109086453A (en) * 2018-08-29 2018-12-25 华中科技大学 A kind of method and system for extracting label correlation from neighbours' example
CN109670042A (en) * 2018-12-04 2019-04-23 广东宜教通教育有限公司 A kind of examination question classification and grade of difficulty method based on recurrent neural network
CN109635100A (en) * 2018-12-24 2019-04-16 上海仁静信息技术有限公司 A kind of recommended method, device, electronic equipment and the storage medium of similar topic

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
NADAQ: natural language database querying based on deep learning; BoYan Xu et al.; 《IEEE Access》; pp. 35012-35017 *
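Convolutional neural network multi-label classification algorithm based on label correlation; 蒋俊钊 et al.; 《工业控制计算机》 (Industrial Control Computer); pp. 105-109 *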

Also Published As

Publication number Publication date
CN110347791A (en) 2019-10-18

Similar Documents

Publication Publication Date Title
CN110347791B (en) Topic recommendation method based on multi-label classification convolutional neural network
CN109885692B (en) Knowledge data storage method, apparatus, computer device and storage medium
CN109189901B (en) Method for automatically discovering new classification and corresponding corpus in intelligent customer service system
CN106776538A (en) The information extracting method of enterprise's noncanonical format document
CN112070138B (en) Construction method of multi-label mixed classification model, news classification method and system
CN108038492A (en) A kind of perceptual term vector and sensibility classification method based on deep learning
CN109271514B (en) Generation method, classification method, device and storage medium of short text classification model
CN109446423B (en) System and method for judging sentiment of news and texts
CN112819023A (en) Sample set acquisition method and device, computer equipment and storage medium
CN113010683B (en) Entity relationship identification method and system based on improved graph attention network
Rajamohana et al. An effective hybrid cuckoo search with harmony search for review spam detection
CN113434688B (en) Data processing method and device for public opinion classification model training
CN114491024B (en) Specific field multi-label text classification method based on small sample
CN111709225B (en) Event causal relationship discriminating method, device and computer readable storage medium
CN114153978A (en) Model training method, information extraction method, device, equipment and storage medium
CN111639185B (en) Relation information extraction method, device, electronic equipment and readable storage medium
CN113946657A (en) Knowledge reasoning-based automatic identification method for power service intention
CN109543038B (en) Emotion analysis method applied to text data
CN109992667B (en) Text classification method and device
CN103268346A (en) Semi-supervised classification method and semi-supervised classification system
Ransing et al. Screening and Ranking Resumes using Stacked Model
CN112380346B (en) Financial news emotion analysis method and device, computer equipment and storage medium
CN113392868A (en) Model training method, related device, equipment and storage medium
CN116226747A (en) Training method of data classification model, data classification method and electronic equipment
CN111460817A (en) Method and system for recommending criminal legal document related law provision

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant