CN114519097A

CN114519097A - Academic paper recommendation method for heterogeneous information network enhancement

Info

Publication number: CN114519097A
Application number: CN202210418401.3A
Authority: CN
Inventors: 刘柏嵩; 吴俊超; 沈小烽; 张雪垣; 王冰源
Original assignee: Ningbo University
Current assignee: Ningbo University
Priority date: 2022-04-21
Filing date: 2022-04-21
Publication date: 2022-05-20
Anticipated expiration: 2042-04-21
Also published as: CN114519097B

Abstract

The invention discloses an academic paper recommendation method for heterogeneous information network enhancement, which comprises the following steps: step 1, constructing a heterogeneous information network, wherein the heterogeneous information network comprises 3 types of nodes, namely users, papers and labels, and 3 types of relations, namely an interaction relation between users and papers, a citation relation between papers, and a membership relation between papers and labels; step 2, learning the interaction features of users and papers by a matrix factorization algorithm; step 3, inputting the interaction features into a heterogeneous graph attention network to learn high-order features of the papers in the heterogeneous information network; step 4, fusing the features learned in steps 2 and 3 by outer product calculation; and step 5, inputting the features fused in step 4 into a deep recommendation model to predict scores. The invention uses a heterogeneous information network to alleviate the sparsity of interaction data and can improve recommendation accuracy.

Description

Academic paper recommendation method for heterogeneous information network enhancement

Technical Field

The invention relates to an academic paper recommendation method, in particular to an academic paper recommendation method based on heterogeneous information network enhancement.

Background

With the explosive growth in the volume of published academic work and the rapid iteration of knowledge, researchers find it difficult to locate academic papers that meet their needs and face increasingly serious paper information overload. Academic Paper Recommendation Systems (APRs), which accurately recommend papers to researchers, are becoming indispensable tools. Collaborative Filtering (CF), which predicts a user's personalized preferences from the user's historical interactions, is widely used in recommendation systems; however, CF cannot deliver robust performance when the interaction matrix is very sparse. In recent years, many methods have been proposed to improve recommendation performance using various kinds of auxiliary information. In paper recommendation, two kinds of auxiliary information are widely adopted: textual information and structural information.

Textual information enriches paper features with the titles and abstracts of the papers, but the papers in the resulting recommendation lists tend to share similar topics and lack diversity and novelty. Structural information can be divided into citation networks and Heterogeneous Information Networks (HINs). A citation network reflects the relationship between one paper and another, but it suffers from a cold-start problem because a new paper has zero citations. A heterogeneous information network fuses multi-source data, and its rich semantic information can effectively alleviate these problems; how to use heterogeneous information networks to enhance paper recommendation has attracted the interest of researchers. The basic idea of existing paper recommendation methods based on heterogeneous information networks, such as PRHNE and HGRec, is to learn the features of users and papers with a graph embedding method, calculate user-paper scores from the learned features, and generate recommendations by ranking the scores. Although these methods effectively improve paper recommendation performance, the following problems exist:

1) only the first-order or second-order similarity among network nodes is considered during feature learning, and the high-order connection relations among nodes are not mined;

2) when the recommendation is generated, only the cosine similarity between the user and the paper features is considered, the complex interaction relationship between the user and the paper is not mined, and the enhancement effect of the heterogeneous information network on the recommendation performance cannot be fully reflected.

Therefore, there is a need to develop a new academic paper recommendation method based on heterogeneous information network enhancement to solve the existing problems.

Disclosure of Invention

The invention aims to provide an academic paper recommendation method for heterogeneous information network enhancement. The invention solves the problem of sparse interactive data by utilizing a heterogeneous information network, and can improve the recommendation accuracy.

The technical scheme of the invention is as follows: a heterogeneous information network enhanced academic paper recommendation method comprises the following steps:

step 1, constructing a heterogeneous information network, wherein the heterogeneous information network comprises 3 types of nodes including users, papers and labels, and 3 types of relations including an interaction relation between users and papers, a reference relation between papers and papers, and a subordinate relation between papers and labels;

step 2, learning the interaction features of users and papers by a matrix factorization algorithm;

step 3, inputting the interaction features into a heterogeneous graph attention network to learn high-order features of the papers in the heterogeneous information network;

step 4, fusing the features learned in steps 2 and 3 by outer product calculation;

step 5, inputting the features fused in step 4 into a deep recommendation model to predict scores.

In the aforementioned academic paper recommendation method for heterogeneous information network enhancement, step 1 constructs the heterogeneous information network based on the CiteULike data set, the 3 types of nodes, namely users, papers and labels, are represented by the symbols U, P and T respectively, and step 1 specifically comprises the following substeps:

substep 1.1, converting the data in the source files into triples of the form (h, t, r), wherein h represents the head node id, t represents the tail node id, and r represents the relation type between the head node h and the tail node t;

substep 1.2, establishing relation matrices from the triples, comprising 3 matrices: a | U | × | P | interaction relation matrix (UP) between users and papers, a | P | × | P | citation relation matrix (PP) between papers, and a | P | × | T | membership relation matrix (PT) between papers and labels, wherein each matrix is established as follows: first, the matrix is initialized with all zeros; then each triple locates one matrix element, with the head node as the row number and the tail node as the column number, and that element is set to 1;

substep 1.3, aligning the paper ids across the relation matrices and establishing the heterogeneous information network.
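Substeps 1.1 and 1.2 can be illustrated with a minimal NumPy sketch. The function name `build_relation_matrix`, the relation tags "UP", "PP" and "PT" used inside the triples, and the toy ids are assumptions made for illustration; the source only fixes the (h, t, r) triple form and the rule that the head node selects the row and the tail node selects the column.

```python
import numpy as np

def build_relation_matrix(triples, relation, n_rows, n_cols):
    """Return a 0/1 matrix with M[h, t] = 1 for every triple (h, t, r) whose r matches."""
    matrix = np.zeros((n_rows, n_cols), dtype=np.int8)  # initialize with all zeros
    for head, tail, rel in triples:
        if rel == relation:
            matrix[head, tail] = 1  # head = row number, tail = column number
    return matrix

# Toy triples (h, t, r) for 2 users, 3 papers and 2 labels; ids are illustrative.
triples = [
    (0, 0, "UP"), (0, 1, "UP"), (1, 1, "UP"), (1, 2, "UP"),  # user-paper interactions
    (0, 2, "PP"),                                            # paper 0 cites paper 2
    (0, 0, "PT"), (1, 0, "PT"), (2, 1, "PT"),                # paper-label membership
]
UP = build_relation_matrix(triples, "UP", n_rows=2, n_cols=3)
PP = build_relation_matrix(triples, "PP", n_rows=3, n_cols=3)
PT = build_relation_matrix(triples, "PT", n_rows=3, n_cols=2)
```

Because all three matrices index papers with the same ids, the paper rows and columns are already aligned in this toy case, which is what substep 1.3 requires of real data.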

In the aforementioned academic paper recommendation method for heterogeneous information network enhancement, the step 2 specifically includes the following substeps:

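substep 2.1, randomly initializing the user interaction feature matrix $\mathbf{P} \in \mathbb{R}^{|U| \times d}$ and the paper interaction feature matrix $\mathbf{Q} \in \mathbb{R}^{|P| \times d}$;

substep 2.2, updating the matrices $\mathbf{P}$ and $\mathbf{Q}$ based on the loss function, which is as follows:

$$\mathcal{L}_{MF} = \sum_{u \in U}\sum_{v \in P}\left(r_{uv} - p_u^{\top} q_v\right)^2 + \lambda\left(\lVert \mathbf{P} \rVert_F^2 + \lVert \mathbf{Q} \rVert_F^2\right),$$

wherein $r_{uv}$ is the entry of the user-paper interaction matrix for user $u$ and paper $v$, $p_u$ represents the interaction feature of user $u$, $q_v$ represents the interaction feature of paper $v$, $d$ represents the dimension of the interaction features, and $\lambda$ represents the regularization coefficient that prevents overfitting; the matrices $\mathbf{P}$ and $\mathbf{Q}$ are updated iteratively until $\mathcal{L}_{MF}$ no longer decreases.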
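The factorization in step 2 can be sketched as follows. This is a minimal illustration that assumes a squared-error loss with L2 regularization and plain full-batch gradient descent; the function name `factorize` and the hyperparameters `d`, `lam`, `lr`, `max_epochs` and `tol` are illustrative and are not taken from the source.

```python
import numpy as np

def factorize(R, d=8, lam=0.01, lr=0.05, max_epochs=500, tol=1e-6, seed=0):
    """Factorize R (|U| x |P|) into P (|U| x d) and Q (|P| x d) by gradient descent."""
    rng = np.random.default_rng(seed)
    P = rng.normal(scale=0.1, size=(R.shape[0], d))  # random initialization
    Q = rng.normal(scale=0.1, size=(R.shape[1], d))
    prev_loss = np.inf
    for _ in range(max_epochs):
        err = R - P @ Q.T
        loss = float((err ** 2).sum() + lam * ((P ** 2).sum() + (Q ** 2).sum()))
        if prev_loss - loss < tol:  # stop once the loss no longer decreases
            break
        prev_loss = loss
        grad_P = -2 * err @ Q + 2 * lam * P
        grad_Q = -2 * err.T @ P + 2 * lam * Q
        P, Q = P - lr * grad_P, Q - lr * grad_Q
    return P, Q, loss

# Toy interaction matrix: 3 users x 4 papers.
R = np.array([[1, 1, 0, 0], [0, 1, 1, 0], [1, 0, 0, 1]], dtype=float)
P, Q, final_loss = factorize(R)
```

The rows of `P` and `Q` play the role of the user and paper interaction features that the later steps consume.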

In the aforementioned academic paper recommendation method for heterogeneous information network enhancement, the step 3 specifically includes the following substeps:

substep 3.1, giving a set of paper-related meta-paths {PUP, PP, PTP} and calculating the meta-path neighbor matrices based on the relation matrices obtained in substep 1.2, wherein $UP$, $PP$ and $PT$ denote the user-paper, paper-paper and paper-label relation matrices, and the calculation formulas are as follows:

for the meta-path PUP, $M_{PUP} = UP^{\top} \cdot UP$,

for the meta-path PP, $M_{PP} = PP$,

for the meta-path PTP, $M_{PTP} = PT \cdot PT^{\top}$,

wherein $\top$ represents the matrix transpose; the neighbor matrices obtained by calculation need to be converted into binary matrices: a threshold $\theta$ is set, and an element of the matrix is set to 1 when it is larger than $\theta$ and to 0 otherwise, with the calculation formula as follows:

$$A_{ij} = \begin{cases} 1, & M_{ij} > \theta \\ 0, & \text{otherwise} \end{cases}$$

wherein $M_{ij}$ represents the element in the $i$-th row and $j$-th column of the neighbor matrix and $\theta$ is a user-defined threshold; from $M_{PUP}$, $M_{PP}$ and $M_{PTP}$, 3 binary neighbor matrices $A_{PUP}$, $A_{PP}$ and $A_{PTP}$ can be obtained by calculation, in which a value of 1 represents a neighbor relation and a value of 0 represents no neighbor relation;
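The meta-path neighbor matrices and their binarization can be sketched on toy data as follows. The products for PUP and PTP follow from the matrix shapes given in substep 1.2; using the citation matrix directly for PP and the threshold value 0 are assumptions of this sketch, as is the helper name `binarize`.

```python
import numpy as np

def binarize(M, threshold):
    """Set entries greater than the threshold to 1 and all others to 0."""
    return (M > threshold).astype(np.int8)

# Toy relation matrices: 2 users, 3 papers, 2 labels.
UP = np.array([[1, 1, 0], [0, 1, 1]])             # |U| x |P|
PP = np.array([[0, 0, 1], [0, 0, 0], [0, 0, 0]])  # |P| x |P|
PT = np.array([[1, 0], [1, 0], [0, 1]])           # |P| x |T|

M_PUP = UP.T @ UP   # number of users shared by each pair of papers
M_PP = PP           # assumption: the citation matrix is used directly
M_PTP = PT @ PT.T   # number of labels shared by each pair of papers

A_PUP = binarize(M_PUP, threshold=0)
A_PP = binarize(M_PP, threshold=0)
A_PTP = binarize(M_PTP, threshold=0)
```

Raising the threshold keeps only strongly connected pairs: with a threshold of 1, only the entry of `M_PUP` equal to 2 survives.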

substep 3.2, aggregating the neighbor features based on the binary neighbor matrices; node-level attention is introduced so that meaningful neighbor features are aggregated to learn the target node features, with the calculation formula as follows:

$$\alpha_{ij}^{\Phi} = \frac{\exp\left(\sigma\left(a_{\Phi}^{\top}\left[h_i \,\Vert\, h_j\right]\right)\right)}{\sum_{k \in N_i^{\Phi}} \exp\left(\sigma\left(a_{\Phi}^{\top}\left[h_i \,\Vert\, h_k\right]\right)\right)},$$

wherein $\alpha_{ij}^{\Phi}$ is the weight coefficient of paper node $j$ for the target paper node $i$, $N_i^{\Phi}$ is the set of neighbors of paper $i$ under the meta-path $\Phi$, $\exp$ is the exponential function, $\sigma$ is the activation function, $a_{\Phi}^{\top}$ represents the transpose of the query vector of the node attention layer, $h_i$ is the interaction feature of paper node $i$, and $\Vert$ is the concatenation operation; the neighbor information is then aggregated according to the weight coefficients, with the calculation formula as follows:

$$z_i^{\Phi} = \sum_{j \in N_i^{\Phi}} \alpha_{ij}^{\Phi}\, h_j,$$

wherein $z_i^{\Phi}$ represents the feature of paper $i$ obtained by aggregating, according to the coefficients $\alpha_{ij}^{\Phi}$, the features of its neighbors under the specific meta-path $\Phi$;
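The node-level attention of substep 3.2 can be sketched for a single meta-path as below. The sketch assumes a softmax over neighbor scores computed from the concatenated pair of features, with LeakyReLU as the activation function; the source names only "an activation function", so that choice, the function names and the toy inputs are assumptions.

```python
import numpy as np

def leaky_relu(x, slope=0.2):
    return np.where(x > 0, x, slope * x)

def node_attention(H, A, a):
    """Aggregate neighbor features under one meta-path.

    H: (n, d) paper features, A: (n, n) binary neighbor matrix,
    a: (2d,) query vector of the node attention layer.
    """
    n = H.shape[0]
    Z = np.zeros_like(H, dtype=float)
    alpha = np.zeros((n, n))
    for i in range(n):
        neighbors = np.flatnonzero(A[i])
        if neighbors.size == 0:
            continue  # no neighbors under this meta-path: leave the row at zero
        # score each neighbor j from the concatenation [h_i || h_j]
        scores = np.array([leaky_relu(a @ np.concatenate([H[i], H[j]])) for j in neighbors])
        weights = np.exp(scores - scores.max())  # softmax, shifted for stability
        alpha[i, neighbors] = weights / weights.sum()
        Z[i] = alpha[i, neighbors] @ H[neighbors]
    return Z, alpha

rng = np.random.default_rng(0)
H = rng.normal(size=(3, 4))                      # 3 papers, feature dimension 4
A = np.array([[1, 1, 0], [1, 1, 1], [0, 1, 1]])  # toy binary neighbor matrix
Z, alpha = node_attention(H, A, a=rng.normal(size=8))
```

Each row of `alpha` sums to 1 over the neighbors of that paper, and non-neighbors receive zero weight.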

substep 3.3, after the paper features $z^{\Phi_1}, \dots, z^{\Phi_K}$ under the different meta-paths are learned through the node attention layer, meta-path-level attention is introduced to fuse the paper features under the different meta-paths, learning the first-order feature $e^{(1)}$ of the paper in the heterogeneous information network $G$, with the calculation formulas as follows:

$$\beta_i = \frac{\exp\left(w^{\top}\left(W z^{\Phi_i} + b\right)\right)}{\sum_{k=1}^{K} \exp\left(w^{\top}\left(W z^{\Phi_k} + b\right)\right)},$$

$$e^{(1)} = \sum_{i=1}^{K} \beta_i\, z^{\Phi_i},$$

wherein $W$ and $b$ are respectively the weight matrix and the bias of the meta-path attention, $w^{\top}$ is the transpose of the query vector of the meta-path attention layer, $\beta_i$ is the weight coefficient corresponding to the $i$-th meta-path, and $K$ is the total number of meta-paths;
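The meta-path-level fusion of substep 3.3 can be sketched as a softmax over meta-paths followed by a weighted sum. The scoring form used here, a query vector applied to an affine map of each meta-path feature, is an assumption chosen to match the named parameters (weight matrix, bias, query vector); the function name and argument names are illustrative.

```python
import numpy as np

def meta_path_attention(Z_list, W, b, w):
    """Fuse per-meta-path paper features with per-paper attention weights.

    Z_list: K arrays of shape (n, d); W: (d, d); b: (d,); w: (d,) query vector.
    """
    # one score per paper and meta-path: w^T (W z + b)
    scores = np.stack([(Z @ W.T + b) @ w for Z in Z_list], axis=1)   # (n, K)
    scores -= scores.max(axis=1, keepdims=True)                      # numerical stability
    beta = np.exp(scores)
    beta /= beta.sum(axis=1, keepdims=True)                          # softmax over meta-paths
    fused = sum(beta[:, k:k + 1] * Z for k, Z in enumerate(Z_list))  # (n, d)
    return fused, beta

rng = np.random.default_rng(0)
Z_list = [rng.normal(size=(5, 4)) for _ in range(3)]  # K = 3 meta-paths, 5 papers
W, b, w = rng.normal(size=(4, 4)), rng.normal(size=4), rng.normal(size=4)
fused, beta = meta_path_attention(Z_list, W, b, w)
```

Stacking this fusion on top of the node-level aggregation gives one layer of the heterogeneous graph attention network; feeding `fused` back in as the input features gives the next layer.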

substep 3.4, learning the high-order feature $e_H$ of the paper by iteratively passing through $L$ layers of the heterogeneous graph attention network, wherein $e_H$ is calculated from $e^{(1)}, \dots, e^{(L)}$, and $e^{(l)}$ represents the paper feature learned by the $l$-th layer of the heterogeneous graph attention network.

In the aforementioned academic paper recommendation method for heterogeneous information network enhancement, before step 4 is performed, the user feature $p$, the paper interaction feature $q$ and the paper network node feature $e_H$ are obtained through step 2 and step 3; the paper interaction feature and the paper network node feature are summed to obtain the new paper feature $\tilde{q} = q + e_H$.

In the aforementioned academic paper recommendation method for heterogeneous information network enhancement, step 4 is based on the new paper feature $\tilde{q}$ and uses the outer product to fuse the user and paper features into the interaction map $E$, with the calculation formula as follows:

$$E = p \otimes \tilde{q} = p\,\tilde{q}^{\top}, \qquad E_{ij} = p_i\,\tilde{q}_j,$$

wherein $E \in \mathbb{R}^{d \times d}$ is a two-dimensional matrix, $p$ and $\tilde{q}$ represent the feature vectors of a specific user and a specific paper, and a subscript on a vector denotes the value at that position.

In the aforementioned method for recommending academic papers for heterogeneous information network enhancement, the step 5 specifically includes the following substeps:

substep 5.1, constructing a neural network of 6 convolutional layers and 1 fully connected layer, wherein each convolutional layer has 32 convolution kernels of size 2 × 2 with a stride of 2, and the fully connected layer has dimension 32 × 1; the interaction map $E$ obtained in step 4 is then input into the convolutional network to predict the score, calculated according to the following formula:

$$\hat{y} = W_o \cdot \mathrm{flatten}\left(E^{(6)}\right) + b_o, \qquad E^{(l)} = \mathrm{ReLU}\left(W_l * E^{(l-1)} + b_l\right), \quad E^{(0)} = E,$$

wherein $\hat{y}$ is the score predicted by the model, $W_l$ and $b_l$ are the convolution kernel parameters and the bias term of the $l$-th layer, $*$ represents the convolution operation, ReLU is the activation function, $W_o$ and $b_o$ represent the weight and the bias of the fully connected layer, and flatten is the operation that converts a matrix into a vector;
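The layer sizes in substep 5.1 can be checked with a short arithmetic sketch. It assumes convolutions without padding and an interaction map of side length 64; the value 64 is not stated in the source and is inferred here because it is the size that six stride-2 layers reduce to 1 × 1, which matches the stated 32 × 1 fully connected layer.

```python
def conv_output_size(size, kernel=2, stride=2, padding=0):
    """Spatial size after one convolution (standard formula, no dilation)."""
    return (size + 2 * padding - kernel) // stride + 1

def feature_map_sizes(d, layers=6):
    """Side length of the d x d interaction map after each of the conv layers."""
    sizes = [d]
    for _ in range(layers):
        sizes.append(conv_output_size(sizes[-1]))
    return sizes

sizes = feature_map_sizes(64)    # side lengths from the input to the last conv layer
flattened = 32 * sizes[-1] ** 2  # 32 channels times the final spatial area
```

Under these assumptions each layer halves the side length, so the flattened output has exactly 32 values, one per convolution kernel.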

substep 5.2, selecting BPR as the loss function, which optimizes the model parameters by maximizing the score margin between positive and negative samples, with the calculation formula as follows:

$$\mathcal{L}_{BPR} = \sum_{(u,i,j) \in D} -\ln \sigma\left(\hat{y}_{ui} - \hat{y}_{uj}\right) + \lambda \lVert \Theta \rVert^2,$$

wherein $D$ represents the training set, $i$ is a positive sample of user $u$, $j$ is a negative sample of user $u$, $\sigma$ is the activation function, $\hat{y}_{ui}$ and $\hat{y}_{uj}$ are the scores predicted by the model for the positive and negative samples, $\Theta$ denotes the model parameters, and $\lambda$ is the regularization coefficient that prevents overfitting.
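The BPR loss of substep 5.2 can be sketched in NumPy as follows, assuming the standard form with a sigmoid activation and an L2 penalty on the parameters. The function name `bpr_loss` and its arguments are illustrative, not taken from the source.

```python
import numpy as np

def bpr_loss(pos_scores, neg_scores, params=(), lam=0.0):
    """Sum of -ln(sigmoid(pos - neg)) over sample pairs plus an L2 penalty."""
    diff = np.asarray(pos_scores, dtype=float) - np.asarray(neg_scores, dtype=float)
    # -ln(sigmoid(x)) equals ln(1 + exp(-x)); logaddexp evaluates it stably
    rank_term = np.logaddexp(0.0, -diff).sum()
    reg_term = lam * sum(float((w ** 2).sum()) for w in params)
    return float(rank_term + reg_term)

well_ranked = bpr_loss([2.0], [0.0])    # positive scored above negative
badly_ranked = bpr_loss([0.0], [2.0])   # positive scored below negative
```

The loss shrinks as the positive sample is scored further above the negative one, which is why minimizing it widens the score margin.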

Compared with the prior art, the invention has the following beneficial effects: it provides an approach to academic paper recommendation that addresses two shortcomings of conventional paper recommendation algorithms, namely that the high-order relations between nodes and the complex interaction relations between users and papers are not mined, and it uses a heterogeneous information network to alleviate the sparsity of user-paper interaction data.

Drawings

FIG. 1 is a flow chart of the present invention;

FIG. 2 is an exemplary diagram of a heterogeneous information network;

FIG. 3 is a portion of an academic paper recommendation model diagram enhanced by heterogeneous information networks;

FIG. 4 is another part of an academic paper recommendation model diagram enhanced by a heterogeneous information network.

Detailed Description

The invention is further illustrated by the following figures and examples, which are not to be construed as limiting the invention.

Example: an academic paper recommendation method for heterogeneous information network enhancement, the flow of which is shown in FIG. 1.

Step 1, constructing a heterogeneous information network.

FIG. 2 shows a heterogeneous information network constructed based on the CiteULike data set, wherein part (a) of FIG. 2 represents the node types, (b) represents the heterogeneous information network, (c) represents the meta-paths, and (d) represents the meta-path neighbors. The network includes 3 types of nodes, namely user U, paper P and label T, and 3 types of relations: the interaction relation between users and papers, the citation relation between papers, and the containment relation between papers and labels. The CiteULike data set is a public data set suited to the field of paper recommendation; three files, namely user.dat, positions.dat and item-tag.dat, are selected as the original data, wherein user.dat records the papers each user has clicked, positions.dat records the citations of each paper, and item-tag.dat records the labels of each paper. The specific steps are as follows:

substep 1.1, converting the data in the three files into triples of the form (h, t, r), wherein h represents the head node id, t represents the tail node id, and r represents the relation type between the head node h and the tail node t;

substep 1.2, establishing relation matrices from the triples, comprising 3 matrices: a | U | × | P | interaction relation matrix (UP) between users and papers, a | P | × | P | citation relation matrix (PP) between papers, and a | P | × | T | membership relation matrix (PT) between papers and labels, wherein each matrix is established as follows: first, the matrix is initialized with all zeros; then each triple locates one matrix element, with the head node as the row number and the tail node as the column number, and that element is set to 1.

Step 2, learning the interaction features of users and papers by a matrix factorization algorithm.

Matrix factorization is a classic machine learning recommendation model that decomposes the user-paper historical interaction matrix $R$ into a user interaction feature matrix $\mathbf{P}$ and a paper interaction feature matrix $\mathbf{Q}$; the specific steps are as follows:

substep 2.1, randomly initializing the user interaction feature matrix $\mathbf{P} \in \mathbb{R}^{|U| \times d}$ and the paper interaction feature matrix $\mathbf{Q} \in \mathbb{R}^{|P| \times d}$;

substep 2.2, updating the matrices $\mathbf{P}$ and $\mathbf{Q}$ based on the loss function, which is as follows:

$$\mathcal{L}_{MF} = \sum_{u \in U}\sum_{v \in P}\left(r_{uv} - p_u^{\top} q_v\right)^2 + \lambda\left(\lVert \mathbf{P} \rVert_F^2 + \lVert \mathbf{Q} \rVert_F^2\right),$$

wherein $r_{uv}$ is the entry of $R$ for user $u$ and paper $v$, $p_u$ represents the interaction feature of user $u$, $q_v$ represents the interaction feature of paper $v$, $d$ represents the dimension of the interaction features, and $\lambda$ represents the regularization coefficient that prevents overfitting; the matrices $\mathbf{P}$ and $\mathbf{Q}$ are updated iteratively until $\mathcal{L}_{MF}$ no longer decreases.

Step 3, inputting the interaction features into the heterogeneous graph attention network to learn the high-order features of the papers in the heterogeneous information network.

Existing paper recommendation methods based on graph neural networks do not consider the inherent differences between nodes and can lose important heterogeneous information. A meta-path is a specific path connecting network nodes, and different meta-paths reflect different semantic information. As shown in part (c) of FIG. 2, two papers can be connected through multiple meta-paths, such as paper-user-paper (PUP) and paper-label-paper (PTP), where PUP indicates that the two papers were interacted with by the same user and PTP indicates that the two papers share a label. A paper node has different neighbor nodes under different meta-paths. Referring to FIG. 3, on the one hand, the neighbor nodes under the same meta-path may differ in importance; for example, paper-paper (PP) reflects the citation relation between papers, and a paper may cite other papers for different reasons. On the other hand, different meta-paths may affect the target node differently; for example, paper-user-paper (PUP) may provide more information for learning paper features than paper-label-paper (PTP). To account for the influence that a paper's neighbors under different meta-paths have on the target node, the node features of the paper are learned with a heterogeneous graph attention network, with the specific steps as follows:

substep 3.1, the working principle of a graph neural network is to propagate information between nodes based on neighbor matrices; a set of paper-related meta-paths {PUP, PP, PTP} is given, and the meta-path neighbor matrices are calculated from the relation matrices $UP$, $PP$ and $PT$ obtained in substep 1.2, with the calculation formulas as follows:

for the meta-path PUP, $M_{PUP} = UP^{\top} \cdot UP$,

for the meta-path PP, $M_{PP} = PP$,

for the meta-path PTP, $M_{PTP} = PT \cdot PT^{\top}$,

wherein $\top$ represents the matrix transpose; the neighbor matrices obtained by calculation suffer from unbalanced data distribution and need to be converted into binary matrices: a threshold $\theta$ is set, and an element of the matrix is set to 1 when it is larger than $\theta$ and to 0 otherwise, with the calculation formula as follows:

$$A_{ij} = \begin{cases} 1, & M_{ij} > \theta \\ 0, & \text{otherwise} \end{cases}$$

wherein $M_{ij}$ represents the element in the $i$-th row and $j$-th column of the neighbor matrix and $\theta$ is a user-defined threshold; from $M_{PUP}$, $M_{PP}$ and $M_{PTP}$, 3 binary neighbor matrices $A_{PUP}$, $A_{PP}$ and $A_{PTP}$ can be obtained by calculation, in which a value of 1 represents a neighbor relation and a value of 0 represents no neighbor relation;

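substep 3.2, aggregating the neighbor features based on the binary neighbor matrices; a traditional graph convolutional network ignores the different importance of the neighbor nodes to the target node. Taking the meta-path PP as an example, which reflects the citation relation between two papers, and referring to FIG. 3, for the two citations p₂ and p₄ of the target node p₃, the authors may have cited the two papers for different purposes. In order to distinguish the importance of different nodes to the target node, node-level attention is introduced so that meaningful neighbor features are aggregated to learn the target node features, with the calculation formula as follows:

$$\alpha_{ij}^{\Phi} = \frac{\exp\left(\sigma\left(a_{\Phi}^{\top}\left[h_i \,\Vert\, h_j\right]\right)\right)}{\sum_{k \in N_i^{\Phi}} \exp\left(\sigma\left(a_{\Phi}^{\top}\left[h_i \,\Vert\, h_k\right]\right)\right)},$$

wherein $\alpha_{ij}^{\Phi}$ is the weight coefficient of paper node $j$ for the target paper node $i$, $N_i^{\Phi}$ is the set of neighbors of paper $i$ under the meta-path $\Phi$, $\exp$ is the exponential function, $\sigma$ is the activation function, $a_{\Phi}^{\top}$ represents the transpose of the query vector of the node attention layer, $h_i$ is the interaction feature of paper node $i$, and $\Vert$ is the concatenation operation; the neighbor information is then aggregated according to the weight coefficients, with the calculation formula as follows:

$$z_i^{\Phi} = \sum_{j \in N_i^{\Phi}} \alpha_{ij}^{\Phi}\, h_j,$$

wherein $z_i^{\Phi}$ represents the feature of paper $i$ obtained by aggregating, according to the coefficients $\alpha_{ij}^{\Phi}$, the features of its neighbors under the specific meta-path $\Phi$;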

substep 3.3, after the paper features $z^{\Phi_1}, \dots, z^{\Phi_K}$ under the different meta-paths are learned through the node attention layer, meta-path-level attention is introduced to fuse the paper features under the different meta-paths, learning the first-order feature $e^{(1)}$ of the paper in the heterogeneous information network $G$, with the calculation formulas as follows:

$$\beta_i = \frac{\exp\left(w^{\top}\left(W z^{\Phi_i} + b\right)\right)}{\sum_{k=1}^{K} \exp\left(w^{\top}\left(W z^{\Phi_k} + b\right)\right)},$$

$$e^{(1)} = \sum_{i=1}^{K} \beta_i\, z^{\Phi_i},$$

wherein $W$ and $b$ are respectively the weight matrix and the bias of the meta-path attention, $w^{\top}$ is the transpose of the query vector of the meta-path attention layer, $\beta_i$ is the weight coefficient corresponding to the $i$-th meta-path, and $K$ is the total number of meta-paths;

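substep 3.4, learning the high-order feature $e_H$ of the paper by iteratively passing through $L$ layers of the heterogeneous graph attention network, wherein $e_H$ is calculated from $e^{(1)}, \dots, e^{(L)}$, and $e^{(l)}$ represents the paper feature learned by the $l$-th layer of the heterogeneous graph attention network.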

Step 4, fusing the features learned in steps 2 and 3 by outer product calculation.

The user feature $p$, the paper interaction feature $q$ and the paper network node feature $e_H$ are available through steps 2 and 3. The paper interaction feature and the paper network node feature are summed to obtain the new paper feature $\tilde{q} = q + e_H$, and the outer product is then used to fuse the user and paper features into the interaction map $E$, with the calculation formula as follows:

$$E = p \otimes \tilde{q} = p\,\tilde{q}^{\top}, \qquad E_{ij} = p_i\,\tilde{q}_j,$$

wherein $E \in \mathbb{R}^{d \times d}$ is a two-dimensional matrix, $p$ and $\tilde{q}$ represent the feature vectors of a specific user and a specific paper, and a subscript on a vector denotes the value at that position. Compared with inner product and concatenation, the outer product has the following advantages: 1) the inner product only uses the diagonal elements of the interaction map, whereas the outer product carries more information to model and retains rich semantics even on sparse data; 2) concatenation ignores the correlations between different feature dimensions, whereas the outer product models the pairwise interactions between dimensions, so the features from the heterogeneous information network can be fully used; 3) the two-dimensional matrix structure helps a convolutional neural network learn complex interaction relations, and at the same network scale a convolutional neural network has fewer parameters than a multilayer perceptron, so the model can be stacked deeper, fully mining the enhancement that heterogeneous information network features bring to recommendation and achieving stronger generalization.
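The relation between the outer product and the inner product described above can be checked with a small NumPy sketch. The vectors are toy values chosen for illustration; `np.outer` builds the interaction map, and the inner product of the same two vectors equals the sum of its diagonal.

```python
import numpy as np

p = np.array([1.0, 2.0, 3.0])    # user feature vector (toy values)
q = np.array([0.5, -1.0, 2.0])   # paper feature vector (toy values)

E = np.outer(p, q)               # interaction map, E[i, j] = p[i] * q[j]
inner = p @ q                    # inner product of the same two vectors
diagonal_sum = np.trace(E)       # uses only the diagonal entries of E
```

The off-diagonal entries of `E`, which pair different feature dimensions, are exactly the information that the inner product discards.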

Step 5, the recommendation model. Splicing the models in FIG. 3 and FIG. 4 yields the complete diagram of the academic paper recommendation model enhanced by the heterogeneous information network, wherein e_H in FIG. 3 points to e_H in FIG. 4.

Traditional paper recommendation methods predict scores by cosine similarity, sort the scores and return a top-k list of papers. A simple cosine operation cannot fit the complex interaction relation between users and papers, whereas a neural network, with its ability to fit arbitrary functions, can mine it. A recommendation model is therefore built with a convolutional neural network, with the specific steps as follows:

substep 5.1, constructing a neural network of 6 convolutional layers and 1 fully connected layer, wherein each convolutional layer has 32 convolution kernels of size 2 × 2 with a stride of 2, and the fully connected layer has dimension 32 × 1; the interaction map $E$ obtained in step 4 is then input into the convolutional network to predict the score, calculated according to the following formula:

$$\hat{y} = W_o \cdot \mathrm{flatten}\left(E^{(6)}\right) + b_o, \qquad E^{(l)} = \mathrm{ReLU}\left(W_l * E^{(l-1)} + b_l\right), \quad E^{(0)} = E,$$

wherein $\hat{y}$ is the score predicted by the model, $W_l$ and $b_l$ are the convolution kernel parameters and the bias term of the $l$-th layer, $*$ represents the convolution operation, ReLU is the activation function, $W_o$ and $b_o$ represent the weight and the bias of the fully connected layer, and flatten is the operation that converts a matrix into a vector;

substep 5.2, because the invention focuses on top-K performance, BPR is selected as the loss function, which optimizes the model parameters by maximizing the score margin between positive and negative samples, with the calculation formula as follows:

$$\mathcal{L}_{BPR} = \sum_{(u,i,j) \in D} -\ln \sigma\left(\hat{y}_{ui} - \hat{y}_{uj}\right) + \lambda \lVert \Theta \rVert^2,$$

wherein $D$ represents the training set, $i$ is a positive sample of user $u$, $j$ is a negative sample of user $u$, $\sigma$ is the activation function, $\hat{y}_{ui}$ and $\hat{y}_{uj}$ are the scores predicted by the model for the positive and negative samples, $\Theta$ denotes the model parameters, and $\lambda$ is the regularization coefficient that prevents overfitting.

The process of the HIN-APR algorithm of the invention is as follows:

Algorithm: HIN-APR
Input: HIN G = (V, E); meta-path set MP; depth L; interaction matrix R
Output: predicted score ŷ
Initialize all parameters;
P, Q = matrix-factorization(R);
For number of training epochs do
    For batch (u, v) from R do
        e_H = HGAT(G, MP, L, Q);
        p = P(u);
        q = Q(v);
        E = p ⊗ (q + e_H(v));
        ŷ = CNN(E);
        Update parameters by gradient descent;
    End for
End for
Return ŷ;

Function HGAT(G, MP, L, Q):
    e = Q; Y = [];
    For l = 0 … L do:
        For mp in MP do:
            y = node-attention(neighbor(mp), e);
            Y.append(y);
        End for
        e = meta-path-attention(Y);
        Y = [];
    End for
    Return e_H;

To demonstrate the advantages of the invention in academic paper recommendation, experiments are carried out on the CiteULike academic paper recommendation data set, which is divided into citeulike-a and citeulike-t, whose interaction data sparsity is 0.22% and 0.07% respectively. The data sets are split by leave-one-out: for each user, the last interaction is held out as the test positive sample and the remaining interactions are used as training positive samples; 999 papers the user has not interacted with are randomly selected as test negative samples for each user, and training negative samples are randomly selected at a ratio of 1:1 during training. The experiments compare against 6 current mainstream models, namely BPR-MF, Neu-MF, NGCF, HE-Rec, LGRec and CGPRec, with Hit Rate (HR) and Normalized Discounted Cumulative Gain (NDCG) as evaluation indexes. The following table shows the experimental results:

from the above table it can be seen that:

1) The HIN-APR algorithm outperforms the other models on both data sets. Compared with the best result among the other models, HR improves by 1.85% on average and NDCG improves by 3.42% on average, which verifies that modeling high-order connections and complex interactions in HIN-APR effectively enhances paper recommendation performance.

2) Among the other models, the HIN-based recommendation methods are superior to the collaborative filtering methods on the whole, which shows that adding a HIN effectively alleviates data sparsity and improves recommendation performance. Among the HIN-based methods, CGPRec is superior to HE-Rec and LGRec on every index, which shows the limitation of the latter two, which model only low-order connection relations between nodes, and the effectiveness of modeling high-order relations with GCN for recommendation. In addition, HIN-APR outperforms CGPRec because, while also taking high-order relations into account, HIN-APR introduces node-level and meta-path-level attention layers that propagate neighbor information more accurately.

3) Among the collaborative filtering methods, Neu-MF and NGCF improve on BPR-MF to different degrees on both data sets, which shows that matrix factorization cannot mine complex interactions. On the sparser data set citeulike-t, the improvement of Neu-MF and NGCF over BPR-MF is smaller than on citeulike-a, which shows that collaborative filtering methods face an overfitting problem on sparse data, whereas HIN-APR still improves greatly, which shows that outer product and convolution calculation make good use of heterogeneous information network features and alleviate the data sparsity problem.

The above are only preferred embodiments of the present invention, and the scope of the present invention is not limited to the above examples, and all technical solutions that fall under the spirit of the present invention belong to the scope of the present invention. It should be noted that modifications and embellishments within the scope of the invention may occur to those skilled in the art without departing from the principle of the invention, and are considered to be within the scope of the invention.

Claims

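1. An academic paper recommendation method for heterogeneous information network enhancement, characterized by comprising the following steps:

step 1, constructing a heterogeneous information network, wherein the heterogeneous information network comprises 3 types of nodes, namely users, papers and labels, and 3 types of relations, namely an interaction relation between users and papers, a citation relation between papers, and a membership relation between papers and labels;

step 2, learning the interaction features of users and papers by a matrix factorization algorithm;

step 3, inputting the interaction features into a heterogeneous graph attention network to learn high-order features of the papers in the heterogeneous information network;

step 4, fusing the features learned in steps 2 and 3 by outer product calculation;

step 5, inputting the features fused in step 4 into a deep recommendation model to predict scores.

2. The method of claim 1, wherein step 1 constructs the heterogeneous information network based on the CiteULike data set, the 3 types of nodes, namely users, papers and labels, are represented by the symbols U, P and T respectively, and step 1 specifically comprises the following substeps:

substep 1.1, converting the data in the source files into triples of the form (h, t, r), wherein h represents the head node id, t represents the tail node id, and r represents the relation type between the head node h and the tail node t;

substep 1.2, establishing relation matrices from the triples, comprising a | U | × | P | interaction relation matrix between users and papers, a | P | × | P | citation relation matrix between papers, and a | P | × | T | membership relation matrix between papers and labels, wherein each matrix is first initialized with all zeros, and each triple then locates one matrix element, with the head node as the row number and the tail node as the column number, and that element is set to 1;

substep 1.3, aligning the paper ids across the relation matrices and establishing the heterogeneous information network.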

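3. The method of claim 2, wherein step 2 specifically comprises the following substeps:

substep 2.1, randomly initializing the user interaction feature matrix $\mathbf{P} \in \mathbb{R}^{|U| \times d}$ and the paper interaction feature matrix $\mathbf{Q} \in \mathbb{R}^{|P| \times d}$;

substep 2.2, updating the matrices $\mathbf{P}$ and $\mathbf{Q}$ based on the loss function, which is as follows:

$$\mathcal{L}_{MF} = \sum_{u \in U}\sum_{v \in P}\left(r_{uv} - p_u^{\top} q_v\right)^2 + \lambda\left(\lVert \mathbf{P} \rVert_F^2 + \lVert \mathbf{Q} \rVert_F^2\right),$$

wherein $r_{uv}$ is the entry of the user-paper interaction matrix for user $u$ and paper $v$, $p_u$ represents the interaction feature of user $u$, $q_v$ represents the interaction feature of paper $v$, $d$ represents the dimension of the interaction features, and $\lambda$ represents the regularization coefficient that prevents overfitting; the matrices $\mathbf{P}$ and $\mathbf{Q}$ are updated iteratively until $\mathcal{L}_{MF}$ no longer decreases.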

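4. The method of claim 3, wherein step 3 specifically comprises the following substeps:

substep 3.1, giving a set of paper-related meta-paths {PUP, PP, PTP} and calculating the meta-path neighbor matrices based on the relation matrices obtained in substep 1.2, wherein $UP$, $PP$ and $PT$ denote the user-paper, paper-paper and paper-label relation matrices, and the calculation formulas are as follows:

for the meta-path PUP, $M_{PUP} = UP^{\top} \cdot UP$,

for the meta-path PP, $M_{PP} = PP$,

for the meta-path PTP, $M_{PTP} = PT \cdot PT^{\top}$,

wherein $\top$ represents the matrix transpose; the neighbor matrices are converted into binary matrices by setting a threshold $\theta$, an element being set to 1 when it is larger than $\theta$ and to 0 otherwise:

$$A_{ij} = \begin{cases} 1, & M_{ij} > \theta \\ 0, & \text{otherwise} \end{cases}$$

wherein $M_{ij}$ represents the element in the $i$-th row and $j$-th column of the neighbor matrix and $\theta$ is a user-defined threshold; from $M_{PUP}$, $M_{PP}$ and $M_{PTP}$, 3 binary neighbor matrices $A_{PUP}$, $A_{PP}$ and $A_{PTP}$ can be obtained by calculation, in which a value of 1 represents a neighbor relation and a value of 0 represents no neighbor relation;

substep 3.2, aggregating the neighbor features based on the binary neighbor matrices, introducing node-level attention so that meaningful neighbor features are aggregated to learn the target node features, with the calculation formula as follows:

$$\alpha_{ij}^{\Phi} = \frac{\exp\left(\sigma\left(a_{\Phi}^{\top}\left[h_i \,\Vert\, h_j\right]\right)\right)}{\sum_{k \in N_i^{\Phi}} \exp\left(\sigma\left(a_{\Phi}^{\top}\left[h_i \,\Vert\, h_k\right]\right)\right)},$$

wherein $\alpha_{ij}^{\Phi}$ is the weight coefficient of paper node $j$ for the target paper node $i$, $N_i^{\Phi}$ is the set of neighbors of paper $i$ under the meta-path $\Phi$, $\exp$ is the exponential function, $\sigma$ is the activation function, $a_{\Phi}^{\top}$ represents the transpose of the query vector of the node attention layer, $h_i$ is the interaction feature of paper node $i$, and $\Vert$ is the concatenation operation; the neighbor information is aggregated according to the weight coefficients, with the calculation formula as follows:

$$z_i^{\Phi} = \sum_{j \in N_i^{\Phi}} \alpha_{ij}^{\Phi}\, h_j,$$

wherein $z_i^{\Phi}$ represents the feature of paper $i$ obtained by aggregating, according to the coefficients $\alpha_{ij}^{\Phi}$, the features of its neighbors under the specific meta-path $\Phi$;

substep 3.3, after the paper features $z^{\Phi_1}, \dots, z^{\Phi_K}$ under the different meta-paths are learned through the node attention layer, introducing meta-path-level attention to fuse them and learn the first-order feature $e^{(1)}$ of the paper in the heterogeneous information network $G$, with the calculation formulas as follows:

$$\beta_i = \frac{\exp\left(w^{\top}\left(W z^{\Phi_i} + b\right)\right)}{\sum_{k=1}^{K} \exp\left(w^{\top}\left(W z^{\Phi_k} + b\right)\right)},$$

$$e^{(1)} = \sum_{i=1}^{K} \beta_i\, z^{\Phi_i},$$

wherein $W$ and $b$ are respectively the weight matrix and the bias of the meta-path attention, $w^{\top}$ is the transpose of the query vector of the meta-path attention layer, $\beta_i$ is the weight coefficient corresponding to the $i$-th meta-path, and $K$ is the total number of meta-paths;

substep 3.4, learning the high-order feature $e_H$ of the paper by iteratively passing through $L$ layers of the heterogeneous graph attention network, wherein $e_H$ is calculated from $e^{(1)}, \dots, e^{(L)}$, and $e^{(l)}$ represents the paper feature learned by the $l$-th layer of the heterogeneous graph attention network.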

5. The method of claim 4, wherein the method comprises: before the step 4 is carried out, the user characteristics are obtained through the step 2 and the step 3