CN115630153A - Research student literature resource recommendation method based on big data technology - Google Patents

Info

Publication number
CN115630153A
Authority
CN
China
Prior art keywords
user
information
item
learning
project
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211409115.7A
Other languages
Chinese (zh)
Inventor
师娇
许勇
李中行
吴小坤
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
South China University of Technology SCUT
Original Assignee
South China University of Technology SCUT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by South China University of Technology SCUT
Priority to CN202211409115.7A
Publication of CN115630153A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/335 Filtering based on additional data, e.g. user or group profiles
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a graduate student literature resource recommendation method based on big data technology, which comprises a similarity-based pre-recommendation method and a deep-learning-based personalized recommendation method. The similarity-based pre-recommendation method makes a preliminary estimate of the documents that interest the user, generates a recommendation table for the user, and collects user-item interaction information through user feedback. The deep-learning-based personalized recommendation method first generates personalized user and item representations, uses a graph neural network to model user-item interactions, uses a learnable hypergraph network to establish global user-user and item-item associations, and uses the labels generated by the pre-recommendation method to optimize the recommendation strategy. By introducing a personalized recommendation strategy that follows the research activity patterns of graduate students and by applying big data technology, the method achieves accurate literature resource recommendation driven by each graduate student's individual research needs.

Description

Research student literature resource recommendation method based on big data technology
Technical Field
The invention relates to the field of personalized literature resource recommendation, in particular to a research student literature resource recommendation method based on a big data technology.
Background
With the rapid development of Internet technology, online literature retrieval and study has become a routine way for researchers to acquire academic information and keep up with research progress and frontiers. However, the massive literature resources on academic search engines also cause serious information overload and information asymmetry, and leave learners disoriented. Introducing a personalized recommendation strategy is an effective way to address these problems. Personalized recommendation of literature resources relies on analyzing and modeling the needs of the users of those resources. Compared with experienced researchers, graduate students are newcomers to scientific research: their research experience is limited and they may have published few papers or none, so the traditional modeling approach, which infers research interests from existing research activities and results, is difficult to apply to them. A graduate student's research topic mainly comes from the supervisor or the team, whose existing research foundation is an important support for the student's work, and during literature retrieval graduate students pay attention to the frontier developments of the research field and to the progress of teams working on the same or similar topics. There is currently no literature resource recommendation method aimed at graduate students. To address this, the inventors designed a personalized literature resource recommendation method that is based on the research activity patterns of graduate students and relies on big data technology.
Disclosure of Invention
The embodiment of the invention provides a graduate student literature resource recommendation method based on big data technology, which recommends literature of interest in a personalized manner according to the basic information and literature reading patterns of different graduate students, thereby improving their research and learning efficiency.
The embodiment of the invention provides a graduate student literature resource recommendation method based on big data technology, which comprises the following steps:
acquiring information on the research field in which a user is interested;
matching the information on the research field in which the user is interested against preset document types, using the similarity-based pre-recommendation method, to obtain similarities;
sorting by similarity from high to low to obtain a preliminary user literature recommendation table;
acquiring user feedback information;
obtaining user-item interaction information according to the user feedback information and the preliminary user literature recommendation table;
recommending literature resources, module by module, using the deep-learning-based personalized recommendation method according to the user-item interaction information;
the deep-learning-based personalized recommendation method comprises building a self-supervised deep learning network that comprises: a user/item personalized representation module; a user-item interaction representation module; a user-user and item-item global association module; a self-supervised reinforcement learning module; and an iterative update module.
In this scheme, the pre-recommendation method based on similarity further includes:
acquiring representation text information of a user;
mapping the representation of the user from a text form into a high-dimensional vector space (user) based on a preset Vector Space Model (VSM);
mapping different document types into the same space (item);
matching the user vectors and the item vectors in the shared space to obtain similarities;
judging whether the similarity between a user vector and an item vector is greater than a preset similarity threshold; if so, setting the corresponding document type as a type in which the user is interested; if not, treating the user as not interested in that type.
In this scheme, the pre-recommendation method based on similarity further includes:
sorting the documents according to the similarity from high to low;
acquiring the user's binary judgment of whether the recommended literature resources match the needs of the current research stage;
and obtaining the next batch of recommended literature resources and a user-item interaction table according to that binary judgment.
In this scheme, the user/item personalized representation module specifically includes:
acquiring user information;
obtaining a personalized user representation by fusing multiple kinds of user information;
feeding the user information into preset embedding layers to generate embedded representations of the user information;
measuring the influence weights of different features on the recommendation task with an attention network, and fusing the weighted features with a fully connected layer to obtain the user primary representation;
and generating embedded representations for the different document types with a preset embedding layer to obtain the item primary representation.
In this solution, the user-item interaction representation module specifically includes:
updating the user/item representations through the message-passing mechanism of a preset graph convolution network;
feeding the user-item interaction table into the preset graph convolution network to obtain a user-item interaction graph;
and learning the local user-item interaction relationships by aggregating the representations of adjacent nodes to obtain a first contrastive learning view.
In this scheme, the user-user and item-item global association module specifically includes:
performing association learning with a learnable hypergraph structure;
the learnable hypergraph structure consists of a set of learnable hyperedges; different users/items serve as different nodes, and each learnable hyperedge acts as an information hub that connects all users/items with different weights and aggregates the information of all users/items from a global perspective to update the embedded representations of the nodes, yielding a second contrastive learning view;
different hyperedges act as different channels that capture the complex user-user and item-item relationships along multiple semantic dimensions.
In this scheme, the learnable hypergraph network structure specifically includes:
generating a learnable hypergraph parameter matrix by a low-rank decomposition method;
generating the learnable hypergraph structure from the user primary representation and the item primary representation of claim 4 in combination with a learnable multilayer perceptron network.
In this scheme, the self-supervised reinforcement learning module specifically includes:
adopting a contrastive learning method that takes different views as contrastive learning objects and optimizes the model gradients through a mutual information maximization mechanism;
performing contrastive learning between the first contrastive learning view of claim 5 and the second contrastive learning view of claim 6, taking the same user/item under different views as a positive pair and different users/items under different views as negative pairs, and using a mutual information maximization function as the optimization objective to obtain discriminative user/item representations.
In this scheme, the self-supervised reinforcement learning module further includes:
alleviating data noise through data augmentation of the user data;
randomly masking the user-item interaction graph and feeding it into the graph convolution network to obtain a third contrastive learning view;
the third contrastive learning view and the first contrastive learning view supervise each other.
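The positive/negative pairing described for the contrastive step can be sketched as follows. The text names a "mutual information maximization function" without giving its form here, so this sketch assumes the widely used InfoNCE loss, a lower-bound estimator of mutual information. The function names, the cosine similarity and the `temperature` value are illustrative assumptions, not the patent's definitions.

```python
import math

def cosine(a, b):
    """Cosine similarity of two non-zero dense vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def info_nce(view_a, view_b, temperature=0.2):
    """Mean InfoNCE loss (assumed objective, not taken from the source).

    Row j of view_a and row j of view_b describe the same user/item
    (positive pair); every other row of view_b is a negative for row j.
    """
    total = 0.0
    for j, anchor in enumerate(view_a):
        logits = [cosine(anchor, other) / temperature for other in view_b]
        log_denominator = math.log(sum(math.exp(l) for l in logits))
        total += -(logits[j] - log_denominator)
    return total / len(view_a)

# Hypothetical two-user example: a local view and two candidate global views
local_view = [[1.0, 0.0], [0.0, 1.0]]
aligned = info_nce(local_view, [[0.9, 0.1], [0.1, 0.9]])
shuffled = info_nce(local_view, [[0.1, 0.9], [0.9, 0.1]])
```

In this toy example the loss is lower when each user's two views agree (`aligned`) than when they are swapped (`shuffled`), which is the behavior a contrastive objective is meant to reward.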
In this scheme, the iterative update module specifically includes: periodically updating the user-item interaction table and the model parameters using an iterative update strategy.
By adopting the self-supervised reinforcement learning method and the learnable hypergraph structure, the invention effectively addresses data sparsity, data noise and the over-smoothing of traditional graph neural networks in the literature recommendation scenario. For data sparsity, a local-global feature contrastive learning strategy lets the deep learning model perform mutual cooperative supervision between the local and global perspectives, so that each compensates for the information the other loses. Meanwhile, a mutual information maximization function serves as the optimization strategy, and the additional supervision task guides the model to learn richer knowledge. For data noise, a data augmentation strategy masks user-item interaction edges, excluding from the user-item interaction table target documents in which the user is not interested. For over-smoothing, the learnable hypergraph structure aggregates all user/item information in a single step from a global perspective, avoiding the node over-smoothing caused by repeated aggregation in a graph neural network.
In addition, the invention collects user-item interaction information through the pre-recommendation strategy and generates supervision labels for the subsequent personalized recommendation algorithm. Traditional methods make only coarse, unsupervised recommendations from a single piece of information such as a graduate student's research field, and cannot capture the student's literature reading patterns. The personalized recommendation algorithm mines graduate students' literature reading preferences and interests in depth and recommends more valuable literature, which shortens literature retrieval time and improves learning efficiency.
Drawings
FIG. 1 is a flow chart of the graduate student literature resource recommendation method based on big data technology according to the present invention;
FIG. 2 is a diagram illustrating the graduate student literature resource recommendation method based on big data technology according to the present invention;
FIG. 3 is an architecture diagram of the deep-learning-based personalized recommendation method according to the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be obtained by a person skilled in the art without inventive step based on the embodiments of the present invention, are within the scope of protection of the present invention.
Specifically, the graduate student literature resource recommendation method based on big data technology can be regarded as a literature recommendation system for graduate students. In the description of the present invention, following the convention of general recommendation systems, a graduate student is described as a user and a document type is described as an item. Let the user set be $U = \{u_1, u_2, u_3, \ldots, u_j, \ldots, u_J\}$ and the item set be $I = \{i_1, i_2, i_3, \ldots, i_k, \ldots, i_K\}$.
FIG. 1 shows a flow chart of the graduate student literature resource recommendation method based on big data technology.
As shown in FIG. 1, the invention discloses a graduate student literature resource recommendation method based on big data technology, comprising the following steps:
S102, acquiring information on the research field in which the user is interested;
S104, matching the information on the research field in which the user is interested against preset document types, using the similarity-based pre-recommendation method, to obtain similarities;
S106, sorting by similarity from high to low to obtain a preliminary user literature recommendation table;
S108, acquiring user feedback information;
S110, obtaining user-item interaction information according to the user feedback information and the preliminary user literature recommendation table;
S112, recommending literature resources, module by module, using the deep-learning-based personalized recommendation method according to the user-item interaction information.
It should be noted that keywords are extracted from the information on the research field in which the user is interested and are set as the user's "tags". Documents with the highest similarity to that information are ranked first; the preliminary user literature recommendation table is obtained by sorting by similarity from high to low and is returned to the user side, and the user's feedback on the table is then collected, for example: dissatisfied with document A and satisfied with document B. The user-item interaction information is obtained from this feedback and the preliminary user literature recommendation table. The deep-learning-based personalized recommendation method requires building a self-supervised deep learning network that comprises: a user/item personalized representation module; a user-item interaction representation module; a user-user and item-item global association module; a self-supervised reinforcement learning module; and an iterative update module.
According to the embodiment of the invention, the pre-recommendation method based on the similarity further comprises the following steps:
acquiring the representation text information of a user;
mapping the representation of the user from a text form into a high-dimensional vector space (user) based on a preset Vector Space Model (VSM);
mapping different document types into the same space (item);
matching the user vectors and the item vectors in the shared space to obtain similarities;
judging whether the similarity between a user vector and an item vector is greater than a preset similarity threshold; if so, setting the corresponding document type as a type in which the user is interested; if not, treating the user as not interested in that type.
It should be noted that, in the similarity-based pre-recommendation method, a vector space model (VSM) maps the representation of each user from text form into a high-dimensional vector space (user); similarly, the vector space model maps the different document types into the same space (item); the document types that may interest the user are then preliminarily recommended by evaluating the similarity between the user vectors and the item vectors in that space. For example, if the preset similarity threshold is 80, a document type is regarded as possibly interesting to the user when the similarity of the corresponding user-item vectors is greater than 80.
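The matching step can be illustrated with a minimal sketch. It assumes a bag-of-words term-frequency vector as the vector space model and cosine similarity as the similarity measure, neither of which is specified in the text. The function names and sample texts are hypothetical, and the threshold of 0.8 is an assumed rescaling of the example value 80 onto the 0 to 1 cosine range.

```python
import math
from collections import Counter

def to_vector(text):
    """Map text to a sparse term-frequency vector (a simple VSM; assumed)."""
    return Counter(text.lower().split())

def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def pre_recommend(user_text, item_texts, threshold=0.8):
    """Return (item, similarity) pairs above the threshold, highest first."""
    user_vec = to_vector(user_text)
    scored = [(name, cosine_similarity(user_vec, to_vector(text)))
              for name, text in item_texts.items()]
    kept = [(name, s) for name, s in scored if s > threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)

# Hypothetical document types (items) described by keywords
items = {
    "graph learning": "graph neural network recommendation",
    "optics": "laser optics photonics",
}
ranked = pre_recommend("graph neural network recommendation system", items)
```

Here only the "graph learning" item passes the threshold; the "optics" item shares no terms with the user text and is dropped.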
According to the embodiment of the invention, the pre-recommendation method based on the similarity further comprises the following steps:
sorting the documents according to the similarity from high to low;
acquiring the user's binary judgment of whether the recommended literature resources match the needs of the current research stage;
and obtaining the next batch of recommended literature resources and a user-item interaction table according to that binary judgment.
It should be noted that, in the similarity-based pre-recommendation method, documents are sorted by similarity from high to low and the n most relevant documents are recommended to the user each time, where n is an integer greater than 0, for example n = 20. The user makes a binary judgment (satisfied or not) of whether the recommended literature resources match the needs of the current research stage, and the next recommendation is generated automatically once the judgment is complete.
Specifically, the vector space model maps document titles and keywords into the high-dimensional vector space, and the documents are matched against the items (document types) by relevance. The first n documents with the highest relevance are selected from the most relevant item first; if the user judges the recommended documents satisfactory, documents n+1 to 2n are then selected from that item by relevance; if the user judges them unsatisfactory, the first n documents with the highest relevance are selected from the second most relevant item, and so on.
These operations are repeated iteratively to build the user-item interaction table, which provides label data for the deep-learning-based personalized recommendation method. That is, if the user judges the recommended documents satisfactory, the user's interaction count for the item is increased by 1; otherwise no operation is performed.
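The batch-and-feedback loop can be sketched as follows for a single user. This is an illustration under assumptions: the function name, the `is_satisfied` callback, the `max_rounds` cap and the rule of moving to the next item when an item runs out of documents are hypothetical choices that the text does not specify.

```python
from collections import defaultdict

def collect_interactions(ranked_items, docs_by_item, is_satisfied, n=20, max_rounds=10):
    """Simulate the feedback loop and return per-item interaction counts.

    ranked_items : item names, most relevant first
    docs_by_item : item name -> documents sorted by relevance, best first
    is_satisfied : callback(batch) -> bool, the user's binary judgment
    """
    counts = defaultdict(int)
    item_idx, offset = 0, 0
    for _ in range(max_rounds):
        if item_idx >= len(ranked_items):
            break
        item = ranked_items[item_idx]
        batch = docs_by_item[item][offset:offset + n]
        if not batch:                            # item exhausted: try the next item
            item_idx, offset = item_idx + 1, 0
            continue
        if is_satisfied(batch):
            counts[item] += 1                    # satisfied: interaction count + 1
            offset += n                          # next batch: documents n+1 .. 2n
        else:
            item_idx, offset = item_idx + 1, 0   # unsatisfied: move to the next item
    return dict(counts)

# Hypothetical example: the user only likes documents of item "B"
docs = {"A": ["a1", "a2", "a3", "a4"], "B": ["b1", "b2", "b3", "b4"]}
counts = collect_interactions(["A", "B"], docs,
                              is_satisfied=lambda batch: batch[0].startswith("b"),
                              n=2)
```

With these inputs the first batch from item A is rejected, the two batches from item B are accepted, and the resulting row of the interaction table is `{"B": 2}`.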
According to the embodiment of the invention, the user/item personalized representation module specifically comprises:
acquiring user information;
obtaining a personalized user representation by fusing multiple kinds of user information;
feeding the user information into preset embedding layers to generate embedded representations of the user information;
measuring the influence weights of different features on the recommendation task with an attention network, and fusing the weighted features with a fully connected layer to obtain the user primary representation;
and generating embedded representations for the different document types with a preset embedding layer to obtain the item primary representation.
It should be noted that the user information includes the user's registration information, such as the user's college and major. The user/item personalized representation module generates a personalized user representation by fusing multiple kinds of user information. Multiple embedding layers generate an embedded representation for each piece of basic information f (e.g., major, research interest) of each user $u_j$; similarly, an embedded representation is generated for each item $i_k$, as shown in formula (1):
[Formula (1): image not reproduced in the source text]
where the two outputs are the embedded representation of the f-th basic information of user $u_j$ and the embedded representation of item $i_k$, the user and item embedding layers each have their own learnable parameters, and d denotes the embedding dimension. The embedded representations of a user's F pieces of basic information are fused by vector concatenation, where F denotes the number of pieces of user basic information. On top of this, an attention network measures the influence weight of each feature on the recommendation task. Specifically, the self-attention network generates three matrices from the fused user representation, namely the query Q, the key K and the value V, which are used in the self-attention operation and map the fused representation into latent representations in three different spaces. Q, K and V are obtained from the fused representation through three learnable weight matrices, as shown in formula (2):
[Formula (2): image not reproduced in the source text]
Q and K are used to compute the attention weights, and the result is matrix-multiplied with V, which accounts for the influence of the different information features on the recommendation task and generates the user primary representation $z^{(u)}$, as shown in formula (3):
[Definition of Attention(Q, K) via softmax: image not reproduced in the source text]
$z^{(u)} = \mathrm{Attention}(Q, K)\,V$ (3)
where Attention denotes the computation of the attention weights and softmax denotes the nonlinear activation function. Similarly, an embedding layer generates embedded representations for the different document types, yielding the item primary representation $z^{(i)}$.
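Because the images for formulas (1) to (3) are absent from this text, the following is only a sketch of how such a fusion step is commonly written, using standard scaled dot-product self-attention. The symbols $X$, $W^{Q}$, $W^{K}$ and $W^{V}$, the row-wise stacking of the F feature embeddings and the $\sqrt{d}$ scaling are assumptions and are not taken from the source. Only the last line, $z^{(u)} = \mathrm{Attention}(Q, K)\,V$, appears in the text.

```latex
% Assumed layout: X \in \mathbb{R}^{F \times d} stacks the F feature embeddings of one user.
% W^{Q}, W^{K}, W^{V} are the three learnable weight matrices (names assumed).
\begin{aligned}
Q &= X W^{Q}, \qquad K = X W^{K}, \qquad V = X W^{V}, \\
\mathrm{Attention}(Q, K) &= \mathrm{softmax}\!\left(\frac{Q K^{\top}}{\sqrt{d}}\right), \\
z^{(u)} &= \mathrm{Attention}(Q, K)\, V .
\end{aligned}
```

Under this reading, each row of $\mathrm{Attention}(Q, K)$ holds the weights one feature assigns to all F features, which matches the stated goal of measuring how much each feature influences the recommendation task.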
According to the embodiment of the invention, the user-project interactive representation module specifically comprises:
updating the user/item representations through the message-passing mechanism of a preset graph convolution network;
feeding the user-item interaction table into the preset graph convolution network to obtain a user-item interaction graph;
and learning the local user-item interaction relationships by aggregating the representations of adjacent nodes to obtain a first contrastive learning view.
It should be noted that the user-item interaction representation module updates the user/item representations through the message-passing mechanism of the graph convolution network. The graph convolution network generates a user-item interaction graph from the user-item interaction table and learns the local user-item interaction relationships by aggregating the representations of adjacent nodes, generating the first contrastive learning view.
The user-item interaction table records the interactions between users and items and is represented by a matrix $A$ with $J$ rows and $K$ columns: if user $u_j$ has interacted with item $i_k$, then $A_{j,k} = 1$; otherwise, $A_{j,k} = 0$. The graph convolution network performs message passing in the form of formula (4):
[Formula (4): image not reproduced in the source text]
where the two message terms denote the information passed from the neighboring items to the central node $u_j$ and from the neighboring users to the central node $i_k$, respectively, and $\sigma$ denotes the LeakyReLU activation function used to generate the nonlinear representation. To handle the scale differences caused by nodes having different numbers of interactions, a normalized adjacency matrix is used, and the update can be written formally as formula (5):
[Formula (5): image not reproduced in the source text]
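The message-passing step with a normalized adjacency matrix can be sketched in plain Python. Since the images for formulas (4) and (5) are not present in this text, the symmetric normalization $A_{j,k} / \sqrt{\deg(u_j)\,\deg(i_k)}$ used below is an assumption taken from the standard graph convolution form. The function names and the LeakyReLU slope are hypothetical.

```python
import math

def leaky_relu(x, slope=0.01):
    return x if x > 0 else slope * x

def matmul(X, Y):
    """Dense matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def normalize_adjacency(A):
    """Assumed normalization: A[j][k] / sqrt(deg(u_j) * deg(i_k))."""
    user_deg = [sum(row) for row in A]
    item_deg = [sum(col) for col in zip(*A)]
    return [[A[j][k] / math.sqrt(user_deg[j] * item_deg[k]) if A[j][k] else 0.0
             for k in range(len(item_deg))]
            for j in range(len(user_deg))]

def propagate(A, z_user, z_item):
    """One step: users aggregate neighboring items, items aggregate neighboring users."""
    A_norm = normalize_adjacency(A)
    A_norm_T = [list(col) for col in zip(*A_norm)]
    g_user = [[leaky_relu(v) for v in row] for row in matmul(A_norm, z_item)]
    g_item = [[leaky_relu(v) for v in row] for row in matmul(A_norm_T, z_user)]
    return g_user, g_item

# Hypothetical interaction table: J = 2 users, K = 3 items (item 3 has no interactions)
A = [[1, 1, 0],
     [0, 1, 0]]
z_user = [[1.0, 0.0], [0.0, 1.0]]
z_item = [[2.0, 0.0], [0.0, 2.0], [4.0, 4.0]]
g_user, g_item = propagate(A, z_user, z_item)
```

The normalization keeps a user with many interactions from producing a much larger aggregate than a user with few, which is the scale issue the paragraph above describes.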
According to the embodiment of the invention, the user-user and item-item global association module specifically comprises:
performing association learning with a learnable hypergraph structure;
the learnable hypergraph structure consists of a set of learnable hyperedges; different users/items serve as different nodes, and each learnable hyperedge acts as an information hub that connects all users/items with different weights and aggregates the information of all users/items from a global perspective to update the embedded representations of the nodes, yielding a second contrastive learning view;
different hyperedges act as different channels that capture the complex user-user and item-item relationships along multiple semantic dimensions.
It should be noted that the user-user and item-item global association module performs association learning with a learnable hypergraph structure. The learnable hypergraph structure consists of a set of learnable hyperedges; the hyperedges for global user association and global item association are denoted $H^{(u)} \in \mathbb{R}^{J \times h}$ and $H^{(i)} \in \mathbb{R}^{K \times h}$, respectively, where $h$ is the number of hyperedges. The learnable hyperedges take different users/items as different nodes; each hyperedge acts as an information hub that connects all users/items with different weights and aggregates the information of all users/items from a global perspective to update the embedded representations of the nodes, generating the second contrastive learning view. Formally, as in formula (6):
$\Psi^{(u)} = \sigma\left(H^{(u)} \cdot H^{(u)\top} \cdot g^{(u)}\right), \qquad \Psi^{(i)} = \sigma\left(H^{(i)} \cdot H^{(i)\top} \cdot g^{(i)}\right)$ (6)
where $g^{(u)}$ is the user representation produced by the first contrastive learning view. First, the transposed learnable hyperedge matrix $H^{(u)\top}$ passes the information of every user in $g^{(u)}$ to the hyperedges, changing the size of the representation from $J \times d$ to $h \times d$. Then a matrix multiplication with $H^{(u)}$ lets each hyperedge pass the aggregated global user information back to the representation of every user, changing the size from $h \times d$ back to $J \times d$. In particular, different hyperedges act as different channels that capture the complex user-user and item-item relationships along multiple semantic dimensions.
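The two-stage propagation of formula (6), nodes to hyperedges and back to nodes, can be traced with a small sketch. The helper names are hypothetical, the matrices are toy values, and $\sigma$ is assumed to be the same LeakyReLU used in the graph convolution step.

```python
def leaky_relu(x, slope=0.01):
    return x if x > 0 else slope * x

def transpose(X):
    return [list(col) for col in zip(*X)]

def matmul(X, Y):
    """Dense matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def hypergraph_propagate(H, g):
    """Psi = sigma(H . H^T . g), computed right to left."""
    hyperedge_msg = matmul(transpose(H), g)   # h x d: each hyperedge pools all nodes
    node_update = matmul(H, hyperedge_msg)    # J x d: hyperedges send the pool back
    return [[leaky_relu(v) for v in row] for row in node_update]

# Toy sizes: J = 3 users, h = 2 hyperedges, d = 2 embedding dimensions
H = [[1.0, 0.0],
     [0.5, 0.5],
     [0.0, 1.0]]
g = [[1.0, 2.0],
     [3.0, 4.0],
     [5.0, 6.0]]
psi = hypergraph_propagate(H, g)
```

User 1 and user 3 share no hyperedge directly, yet both exchange information with user 2 in a single step, which illustrates how one round of hyperedge aggregation provides a global view without stacking many graph layers.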
According to the embodiment of the invention, the learnable hypergraph network structure specifically comprises:
generating a learnable hypergraph parameter matrix by a low-rank decomposition method;
generating the learnable hypergraph structure from the user primary representation and the item primary representation of claim 4 in combination with a learnable multilayer perceptron network.
It should be noted that, in the learnable hypergraph network structure, in order to reduce the parameter number of the neural network model, a low-rank decomposition method is adopted to generate a learnable hypergraph parameter matrix. And generating a learnable hypergraph structure according to the primary user representation and the primary project representation and by combining a multilayer perceptron network, wherein the learnable hypergraph structure is represented by a formula (7) in a form:
$$H^{(u)} = \mathrm{MLP}_u\left(z^{(u)} \cdot W^{(uh)}\right), \quad H^{(i)} = \mathrm{MLP}_i\left(z^{(i)} \cdot W^{(ih)}\right) \tag{7}$$
where $W^{(uh)}$ and $W^{(ih)}$ are learnable parameters, and the hidden layer dimension is much smaller than the number of users (i.e., $d \ll J$). Constructing the hyperedges through low-rank decomposition greatly reduces the number of parameters, shortens the neural network training time, and improves recommendation efficiency. The multilayer perceptron (MLP) network, which comprises a plurality of linear layers and a nonlinear activation function, may be formalized as formula (8):
$$\mathrm{MLP}(x) = w_2 \cdot \sigma\left(w_1 \cdot x + b_1\right) + b_2 \tag{8}$$
where $w_1$ and $w_2$ are trainable weight parameters, and $b_1$ and $b_2$ are trainable bias parameters.
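Formulas (7) and (8) can be illustrated with a small sketch. Everything concrete below is an assumption made for illustration: the shapes ($z$ of size J × d, $W$ of size d × h, square MLP layers of size h × h), the sigmoid activation, the random initialization, and the helper names. The sketch only shows why generating the hyperedge matrix from a small projection needs far fewer parameters than learning a free J × h matrix.

```python
import math
import random

def matvec(w, x):
    """Multiply matrix w (list of rows) by vector x."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]

def mlp(x, w1, b1, w2, b2):
    """Formula (8): w2 . sigma(w1 . x + b1) + b2, with sigma assumed to be a sigmoid."""
    hidden = [1.0 / (1.0 + math.exp(-(v + b))) for v, b in zip(matvec(w1, x), b1)]
    return [v + b for v, b in zip(matvec(w2, hidden), b2)]

def build_hyperedges(z, W, w1, b1, w2, b2):
    """Formula (7): H = MLP(z . W), applied to each user's row independently."""
    W_t = [list(col) for col in zip(*W)]  # (h, d), so matvec(W_t, row) equals row . W
    return [mlp(matvec(W_t, row), w1, b1, w2, b2) for row in z]

def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

rng = random.Random(0)
J, d, h = 200, 8, 4                  # hypothetical sizes with d much smaller than J
z = rand_matrix(J, d, rng)           # primary user representations
W = rand_matrix(d, h, rng)           # low-rank projection W^(uh)
w1, b1 = rand_matrix(h, h, rng), [0.0] * h
w2, b2 = rand_matrix(h, h, rng), [0.0] * h
H = build_hyperedges(z, W, w1, b1, w2, b2)

low_rank_params = d * h + 2 * (h * h + h)  # W plus the two MLP layers
direct_params = J * h                      # a freely learned J x h matrix
```

With these toy sizes the low-rank construction uses 72 parameters, against 800 for a directly parameterized matrix, and the gap widens as the number of users grows.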
According to the embodiment of the invention, the self-supervision reinforcement learning module specifically comprises:
adopting a contrastive learning method, taking different views as contrastive learning objects, and optimizing the gradient of the model by using a mutual information maximization mechanism;
contrasting the first contrastive learning view of claim 5 with the second contrastive learning view of claim 6, taking the same user/item under different views as a positive example pair, taking different users/items under different views as negative example pairs, and taking a mutual information maximization function as the optimization objective to obtain discriminative user/item representations.
It should be noted that the self-supervision reinforcement learning module adopts a contrastive learning method, takes different views as contrastive learning objects, and optimizes the gradient of the model by using a mutual information maximization mechanism. Specifically, the first contrastive learning view $g^{(u)}$ generated by the graph convolution network is contrasted with the second contrastive learning view $\Psi^{(u)}$ generated by the learnable hypergraph network. The same user/item under different views is taken as a positive example pair, different users/items under different views are taken as negative example pairs, and a mutual information maximization function is taken as the optimization objective, so that the local features and the global features supervise each other cooperatively to generate discriminative user/item representations. The user mutual information maximization function is described by formula (9):
Figure BDA0003937843180000121
where cos(·) denotes cosine similarity, which measures the similarity between embeddings from different contrastive learning views;
Figure BDA0003937843180000122
is a temperature coefficient used to adjust the scale of the gradient. By using the mutual information maximization function as the loss function, the representations of the same user in the first and second contrastive learning views are pulled closer together in the high-dimensional space, while the representations of different users in the two views are pushed further apart. This allows the local features and the global features to supervise each other cooperatively under self-supervised conditions, further enhancing the user representation. The item mutual information maximization function is defined analogously.
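The behavior described above can be illustrated with the widely used InfoNCE form of a contrastive loss. Formula (9) itself is not reproduced in this text, so the exact expression below is an assumption and not necessarily the one in the patent; the function names, the temperature value of 0.5, and the toy vectors are likewise hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def info_nce(view_a, view_b, tau=0.5):
    """Mean InfoNCE loss (assumed form): row j of view_a is pulled toward row j
    of view_b (positive pair) and pushed away from the other rows (negatives)."""
    total = 0.0
    for j, anchor in enumerate(view_a):
        logits = [cosine(anchor, other) / tau for other in view_b]
        log_denominator = math.log(sum(math.exp(v) for v in logits))
        total += log_denominator - logits[j]
    return total / len(view_a)

# Two users, two views. In psi_aligned each user resembles itself across views;
# in psi_swapped each user resembles the other user instead.
g = [[1.0, 0.0], [0.0, 1.0]]
psi_aligned = [[0.9, 0.1], [0.1, 0.9]]
psi_swapped = [[0.1, 0.9], [0.9, 0.1]]
loss_aligned = info_nce(g, psi_aligned)
loss_swapped = info_nce(g, psi_swapped)
```

The loss is lower when the same user has similar embeddings in both views, which is the pull-together and push-apart effect the paragraph describes. A smaller temperature sharpens the contrast between positive and negative pairs.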
According to the embodiment of the present invention, the self-supervised reinforcement learning module further includes:
alleviating data noise through data augmentation of user data;
randomly masking the user-item interaction graph and sending the masked graph to the graph convolution network to obtain a third contrastive learning view;
performing cooperative supervision between the third contrastive learning view and the first contrastive learning view.
It should be noted that the self-supervision reinforcement learning module adopts a data augmentation method to alleviate the data noise problem. A data-augmented user-item interaction graph $A'$ is generated by randomly masking the user-item interaction graph $A$ and is input to the graph convolution network to generate the third contrastive learning view. The data augmentation method is described by formula (10):
Figure BDA0003937843180000124
where $A'_{j,k}$ represents the connection between user $u_j$ and item $I_k$ after data augmentation; $M$ denotes random numbers in the range (0, 1), with $M_{j,k}$ being the random number in row $j$ and column $k$ that corresponds to $A'_{j,k}$; and $\epsilon$ represents a preset threshold. If the random number $M_{j,k}$ is less than $\epsilon$, the masking operation is applied to the user-item interaction graph at that position; otherwise, the original graph is kept. The third contrastive learning view is obtained by replacing $A$ in formula (5) with $A'$ and performing the graph convolution operation, which yields the embedded representations $g'^{(u)}$ and $g'^{(i)}$. The third contrastive learning view and the first contrastive learning view then supervise each other cooperatively using the contrastive learning strategy described by formula (9).
In particular, after the deep-learning-based personalized recommendation method obtains the user-item relevance representations, recommendations are made following the literature recommendation procedure of the similarity-based pre-recommendation method.
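The random masking step described for formula (10) can be sketched as follows. The sketch assumes that "masking" an entry means setting it to 0 (removing the edge), which the text implies but does not state explicitly; the function name, the threshold value, and the toy matrix are hypothetical.

```python
import random

def mask_interactions(A, epsilon, rng):
    """Return A' in which entry A[j][k] is masked (set to 0, an assumption)
    whenever its random number M[j][k] is below the threshold epsilon."""
    masked = []
    for row in A:
        new_row = []
        for value in row:
            m = rng.random()  # plays the role of M[j][k]
            new_row.append(0 if m < epsilon else value)
        masked.append(new_row)
    return masked

# Toy user-item interaction matrix: 3 users x 3 items.
A = [[1, 0, 1],
     [0, 1, 1],
     [1, 1, 0]]
A_aug = mask_interactions(A, epsilon=0.2, rng=random.Random(42))
```

A larger threshold masks more interactions and produces a more strongly perturbed third view; a threshold of 0 leaves the graph unchanged.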
According to the embodiment of the present invention, the iterative update module specifically includes: periodically updating the user-item interaction table and the model parameters by adopting an iterative update strategy.
It should be noted that the iterative update strategy mainly includes: 1) User-item interaction table update: the user-item interaction table is periodically updated according to the research students' judgments of whether the recommended literature resources match their current-stage research needs. The table serves as the basis for constructing the user-item interaction graph of the graph convolution network and provides more available labels for the personalized recommendation model. 2) Model parameter update: over time, the user population changes and the number of documents gradually increases, so the user representations and the document representations need to be updated. Following a pre-training and fine-tuning principle, the model is retrained starting from the original trained parameters to update the model parameters.
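The interaction table update in step 1) can be sketched as below. The data layout (a dictionary keyed by user and document type), the rule that the newest judgment overwrites an older one, and all identifiers are assumptions for illustration; the source only states that the table is updated periodically from the users' match judgments.

```python
def update_interaction_table(table, feedback):
    """Merge binary match judgments into a user-item interaction table.

    table:    dict mapping (user_id, item_id) -> 0 or 1
    feedback: iterable of (user_id, item_id, matched) tuples, where matched
              says whether the recommendation fit the user's current needs
    """
    updated = dict(table)  # leave the previous table untouched
    for user_id, item_id, matched in feedback:
        updated[(user_id, item_id)] = 1 if matched else 0
    return updated

# Hypothetical users and document types.
table = {("u1", "nlp"): 1}
feedback = [("u1", "vision", True), ("u1", "nlp", False), ("u2", "nlp", True)]
table = update_interaction_table(table, feedback)
```

The updated table would then be converted into the user-item interaction graph before the model is fine-tuned from its previous parameters.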
FIG. 2 shows an architecture diagram of the research student literature resource recommendation method based on big data technology.
As shown in the figure, the research student literature resource recommendation method based on big data technology provided by the invention comprises a similarity-based pre-recommendation method and a deep-learning-based personalized recommendation method. The similarity-based pre-recommendation method determines a preliminary user literature recommendation table according to similarity, based on user information and the user's interaction data. The deep-learning-based personalized recommendation method comprises: a user and item personalized representation module; a user-item interaction representation module; a user-user and item-item global association module; a self-supervision reinforcement learning module; and an iterative update module.
Fig. 3 shows a personalized recommendation method architecture diagram based on deep learning provided by the invention.
As shown in the figure, the modules of the deep-learning-based personalized recommendation method are shown separately but are closely connected: the three modules are linked by contrasting the first contrastive learning view generated by the user-item interaction representation module, the second contrastive learning view generated by the user-user and item-item global association module, and the third contrastive learning view generated by the self-supervision reinforcement learning module.
In the embodiments provided by the present invention, it should be understood that the disclosed apparatus and method may be implemented in other ways. The above-described device embodiments are merely illustrative, for example, the division of the unit is only a logical functional division, and there may be other division ways in actual implementation, such as: multiple units or components may be combined, or may be integrated into another system, or some features may be omitted, or not implemented. In addition, the coupling, direct coupling or communication connection between the components shown or discussed may be through some interfaces, and the indirect coupling or communication connection between the devices or units may be electrical, mechanical or other forms.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units; can be located in one place or distributed on a plurality of network units; some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, all the functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may be separately regarded as one unit, or two or more units may be integrated into one unit; the integrated unit can be realized in a form of hardware, or in a form of hardware plus a software functional unit.
Those of ordinary skill in the art will understand that: all or part of the steps for realizing the method embodiments can be completed by hardware related to program instructions, the program can be stored in a computer readable storage medium, and the program executes the steps comprising the method embodiments when executed; and the aforementioned storage medium includes: a mobile storage device, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and various media capable of storing program codes.
Alternatively, the integrated unit of the present invention may be stored in a computer-readable storage medium if it is implemented in the form of a software functional module and sold or used as a separate product. Based on such understanding, the technical solutions of the embodiments of the present invention may be essentially implemented or a part contributing to the prior art may be embodied in the form of a software product, which is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the methods described in the embodiments of the present invention. And the aforementioned storage medium includes: a removable storage device, a ROM, a RAM, a magnetic or optical disk, or various other media capable of storing program code.

Claims (10)

1. A researcher literature resource recommendation method based on big data technology is characterized by comprising the following steps:
acquiring research field information in which a user is interested;
matching, by a similarity-based pre-recommendation method, the research field information of interest to the user against preset document types to obtain similarities;
ranking the document types by similarity from high to low to obtain a preliminary user literature recommendation table;
acquiring user feedback information;
obtaining user-item interaction information according to the user feedback information and the preliminary user literature recommendation table;
recommending literature resources, by the modules of a deep-learning-based personalized recommendation method, according to the user-item interaction information;
wherein the deep-learning-based personalized recommendation method comprises building a self-supervised deep learning network comprising: a user/item personalized representation module; a user-item interaction representation module; a user-user and item-item global association module; a self-supervision reinforcement learning module; and an iterative updating module.
2. The method for recommending research student literature resources based on big data technology according to claim 1, wherein the method for pre-recommending based on similarity further comprises:
acquiring the representation text information of a user;
mapping the representation of the user from text form into a high-dimensional vector space (user) based on a preset vector space model (VSM);
mapping the different document types into the same space (item);
matching the user vectors and the item vectors in that space to obtain similarities;
judging whether the similarity of a user vector and an item vector in the space is greater than a preset similarity threshold; if so, setting the corresponding document type as a type of interest to the corresponding user; if not, treating the document type as not of interest to the user.
3. The method for recommending research student literature resources based on big data technology according to claim 1, wherein the method for pre-recommending based on similarity further comprises:
sorting the documents according to the similarity from high to low;
acquiring binary judgment information from the user terminal on whether the recommended literature resources match the current-stage research needs;
obtaining the next recommended literature resource information and a user-item interaction table according to the binary judgment information.
4. The method for recommending research student literature resources based on big data technology according to claim 1, wherein the user/item personalized representation module specifically is:
acquiring user information;
obtaining user personalized representation information based on a mode of fusing multiple information of a user;
sending the user information to a preset embedding layer to generate embedded representation information for the user information;
measuring, with an attention network, the influence weights of different features on the recommendation task, and fusing the weighted features by using a fully connected network layer to obtain primary user representation information;
generating embedded representations for the different document types with a preset embedding layer to obtain primary item representation information.
5. The method for recommending research student literature resources based on big data technology according to claim 1 or 3, wherein the user-item interaction representation module is specifically:
obtaining updated user/item representation information according to a preset message passing mechanism of the graph convolution network;
sending the user-item interaction table information to a preset graph convolution network to obtain a user-item interaction graph;
learning representations of the local user-item interaction relationships by aggregating adjacent nodes to obtain a first contrastive learning view.
6. The method for recommending research student literature resources based on big data technology according to claim 1, wherein the user-user and project-project global association module specifically comprises:
performing associated learning by adopting a learnable hypergraph structure;
the learnable hypergraph structure is composed of a group of learnable hyperedges; different users/items serve as different nodes, and each learnable hyperedge serves as an information hub that connects all users/items with different weights, so that information of all users/items is aggregated from a global perspective to update the embedded representation of each node and obtain a second contrastive learning view;
the different hyperedges serve as different channels to capture the complex user-user and item-item connection relationships from multiple semantic dimensions.
7. The method for recommending research student literature resources based on big data technology according to claim 6, wherein the learnable hypergraph network structure is specifically:
generating a learnable hypergraph parameter matrix by a low-rank decomposition method;
generating a learnable hypergraph structure from the primary user and item representations of claim 4 in combination with a learnable multilayer perceptron network.
8. The method for recommending research student literature resources based on big data technology according to claim 1, wherein the self-supervision reinforcement learning module specifically is:
adopting a contrastive learning method, taking different views as contrastive learning objects, and optimizing the gradient of the model by using a mutual information maximization mechanism;
contrasting the first contrastive learning view of claim 5 with the second contrastive learning view of claim 6, taking the same user/item under different views as a positive example pair, taking different users/items under different views as negative example pairs, and taking a mutual information maximization function as the optimization objective to obtain discriminative user/item representations.
9. The method for recommending research student literature resources based on big data technology according to claim 1 or 5, wherein the self-supervised reinforcement learning module further comprises:
alleviating data noise through data augmentation of user data;
randomly masking the user-item interaction graph and sending the masked graph to the graph convolution network to obtain a third contrastive learning view;
performing cooperative supervision between the third contrastive learning view and the first contrastive learning view.
10. The method for recommending research student literature resources based on big data technology according to claim 1, wherein the iterative updating module specifically is: periodically updating the user-item interaction table and the model parameters by adopting an iterative updating strategy.
CN202211409115.7A 2022-11-11 2022-11-11 Research student literature resource recommendation method based on big data technology Pending CN115630153A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211409115.7A CN115630153A (en) 2022-11-11 2022-11-11 Research student literature resource recommendation method based on big data technology

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211409115.7A CN115630153A (en) 2022-11-11 2022-11-11 Research student literature resource recommendation method based on big data technology

Publications (1)

Publication Number Publication Date
CN115630153A true CN115630153A (en) 2023-01-20

Family

ID=84910568

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211409115.7A Pending CN115630153A (en) 2022-11-11 2022-11-11 Research student literature resource recommendation method based on big data technology

Country Status (1)

Country Link
CN (1) CN115630153A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116522006A (en) * 2023-07-05 2023-08-01 中国传媒大学 Method and system for recommending lessons based on view self-supervision training
CN116541593A (en) * 2023-04-28 2023-08-04 华中师范大学 Course recommendation method based on hypergraph neural network
CN116628350A (en) * 2023-07-26 2023-08-22 山东大学 New paper recommending method and system based on distinguishable subjects

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116541593A (en) * 2023-04-28 2023-08-04 华中师范大学 Course recommendation method based on hypergraph neural network
CN116541593B (en) * 2023-04-28 2024-05-31 华中师范大学 Course recommendation method based on hypergraph neural network
CN116522006A (en) * 2023-07-05 2023-08-01 中国传媒大学 Method and system for recommending lessons based on view self-supervision training
CN116522006B (en) * 2023-07-05 2023-10-20 中国传媒大学 Method and system for recommending lessons based on view self-supervision training
CN116628350A (en) * 2023-07-26 2023-08-22 山东大学 New paper recommending method and system based on distinguishable subjects
CN116628350B (en) * 2023-07-26 2023-10-10 山东大学 New paper recommending method and system based on distinguishable subjects


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination