CN112650920B - Recommendation method fusing social networks based on Bayesian sorting - Google Patents

Recommendation method fusing social networks based on Bayesian sorting

Info

Publication number
CN112650920B
Authority
CN
China
Prior art keywords: user, item, node, items, implicit
Prior art date
Legal status
Active
Application number
CN202011435734.4A
Other languages
Chinese (zh)
Other versions
CN112650920A (en)
Inventor
印鉴
蒙权
高静
方国鑫
Current Assignee
Guangdong Hengdian Information Technology Co ltd
Sun Yat Sen University
Original Assignee
Guangdong Hengdian Information Technology Co ltd
Sun Yat Sen University
Priority date
Filing date
Publication date
Application filed by Guangdong Hengdian Information Technology Co ltd, Sun Yat Sen University
Priority to CN202011435734.4A
Publication of CN112650920A
Application granted
Publication of CN112650920B

Classifications

    • G06F16/9535 - Search customisation based on user profiles and personalisation
    • G06F16/9536 - Search customisation based on social or collaborative filtering
    • G06F18/22 - Matching criteria, e.g. proximity measures
    • G06F18/24155 - Bayesian classification
    • G06N7/01 - Probabilistic graphical models, e.g. probabilistic networks
    • G06Q50/01 - Social networking


Abstract

The invention provides a recommendation method that fuses social networks based on Bayesian ranking. First, a heterogeneous graph is built from the items a user has consumed, the rating feedback, and the social network; the heterogeneous graph is then sampled with a novel heterogeneous-graph walk method, and the sampled data are fed into a Skip-Gram neural network to learn vector representations of users and items. Next, the similarity between user vectors is computed with the cosine similarity formula, and the implicit friends most likely to share similar preferences are identified from the inter-user similarities. Finally, based on each user's implicit-friend relationships, the items are subdivided into several mutually exclusive groups, the preferences are modeled with a Bayesian personalized ranking algorithm, and a personalized recommendation list is generated for each user.

Description

Recommendation method fusing social networks based on Bayesian sorting
Technical Field
The invention relates to the technical field of artificial intelligence, in particular to a recommendation method fusing social networks based on Bayesian ranking.
Background
Recommendation systems address the problem that users cannot quickly pick out items of interest from a very large catalog. However, on a recommendation platform such as an online shopping mall, most users purchase only a tiny fraction of the available goods, so the user-item interaction data are extremely sparse and traditional recommendation systems therefore perform poorly.
With the growth of online social platforms such as QQ, WeChat, and Weibo, researchers have found that a user's preferences are often related to the preferences of the user's social friends. Using these observed social relationships to infer a user's preferences from those of the user's friends can alleviate the data-sparsity problem of recommendation platforms.
Motivated by this observation, many researchers have tried to incorporate social relationships into recommendation systems. Their work shows that applying explicit social relationships to recommendation has the following problems:
1) Explicit social relationships contain a great deal of noise, such as fake users and advertising or marketing accounts. This noise can make explicit social relationships harm, rather than help, the effectiveness of a recommendation system;
2) The meaning of a social relationship is complex and is not limited to preference similarity. Some people become friends because they share similar tastes, while others become friends through work. If explicit friend relationships are treated directly as preference relationships without deeper filtering, the recommendation quality may suffer, so the direct use of explicit friend relationships for recommendation is limited.
Disclosure of Invention
The invention provides a comparatively accurate Bayesian-ranking-based recommendation method that fuses social networks.
In order to achieve this technical effect, the technical solution of the invention is as follows:
A recommendation method fusing social networks based on Bayesian ranking comprises the following steps:
S1: fusing the user's social network and the user-item interaction network to construct a heterogeneous information network graph;
S2: identifying implicit friends of each user based on the heterogeneous-graph embedded representation;
S3: according to each user's rating feedback on items and the implicit friends' interaction records with items, classifying all items for each user in a fine-grained way into 6 mutually exclusive groups: liked items, mediocre items, disliked items, implicit-friend liked items, implicit-friend disliked items, and remaining items;
S4: modeling each user's preference ordering with a Bayesian ranking algorithm: for each user, the model assumes the preference order liked items > implicit-friend liked items > mediocre items > remaining items > implicit-friend disliked items > disliked items; the model is optimized so that the joint probability of this assumption holding is as large as possible, and the user's personalized ranking list at the maximum joint probability is obtained;
S5: when a user logs in to the recommendation platform, the system looks up the user's personalized ranking list in the training results by the user's id and recommends the Top-N ranked items to the user.
Further, the process of step S1 is:
S11: constructing a user-item interaction graph, in which each user or item is represented by its own node; if a user has interacted with an item, the two nodes are connected by an edge whose weight is the user's rating of the item; once all interactions are connected, the user-item interaction graph is obtained;
S12: on the user-item interaction graph, adding the users' social network through a combination unit; if a social relationship exists between two users, their nodes are connected; once the social relationships of all users are connected, the social heterogeneous information network graph is obtained.
Further, the process of step S2 is:
S21: performing walk sampling on the social heterogeneous information network and converting the information from graph form into node-sequence form;
S22: inputting the node-sequence corpus into a Skip-Gram neural network for embedding representation learning, and training to obtain the vector representation of each node;
S23: calculating the similarity between the users' embedding vectors and using it to measure how similar the users' preferences are.
Further, the process of step S21 is:
S211: setting the walk rule from the current node to the next node:
1) If the current node is an item, the next node can only be a user, because there are no edges between items. If the current item node is connected to many users, one of them is selected with probability

$$p(v_{i+1}=u_j \mid v_i)=\frac{e(v_i,u_j)}{\sum_{u_k\in N(v_i)}e(v_i,u_k)}$$

which is proportional to the rating weight on the edge; that is, the higher the user's rating, the higher the probability of being selected;
2) If the current node is a user, the next node type is either a user or an item. A probability α·n^{-1} is first used to decide whether the next node type is a user or an item, where n denotes the number of consecutive visits to user-type nodes; the probability of continuing to select users therefore decreases as n grows, which prevents the walk from staying in one node type too long and keeps the sampling balanced. After the next node type is chosen, if the user type is selected, the next user node is chosen with equal probability among the user nodes connected to the current user node; if the item type is selected, one of the item nodes connected to the current user node is chosen with probability

$$p(v_{i+1}=m_j \mid v_i)=\frac{e(v_i,m_j)}{\sum_{m_k\in N(v_i)}e(v_i,m_k)}$$

which ensures that items the current user rated higher are more likely to be selected;
the walk selection rule is expressed by the following equation (1):

$$p(v_{i+1}\mid v_i)=\begin{cases}\dfrac{e(v_i,v_{i+1})}{\sum_{v_j\in N(v_i)}e(v_i,v_j)}, & v_i\in M\\[2mm] \alpha n^{-1}\cdot\dfrac{1}{|N_{i+1}(v_i)|}, & v_i\in U,\ v_{i+1}\in U\\[2mm] \left(1-\alpha n^{-1}\right)\cdot\dfrac{e(v_i,v_{i+1})}{\sum_{v_j\in N(v_i)}e(v_i,v_j)}, & v_i\in U,\ v_{i+1}\in M\end{cases}\tag{1}$$

where v_i denotes the current node; v_{i+1} denotes the next node of the walk; M denotes the item type; U denotes the user type; α ∈ [0,1] is the initial probability of choosing the user type; e(v_i, v_j) denotes the weight of the edge between node v_i and node v_j; |N_{i+1}(v_i)| denotes the number of social friends of node v_i; and n is the number of consecutive visits to user-type nodes;
S212: obtaining the set U of all N user ids and, for each user id in U, performing the following operation: starting from that user node, walk along the edges of the graph according to the designed walk rule; the first step walks to a neighbor of the user node according to the rule, the second step walks to a neighbor of that neighbor, and so on, each step sampled according to the designed probabilities, until the specified walk length L is reached, where L is set according to the complexity of the heterogeneous information network; this yields N node sequences of length L;
S213: repeating the operation of step S212 W times to ensure that the heterogeneous graph is sampled sufficiently thoroughly, where W is set according to the complexity of the heterogeneous information network; this yields W × N node sequences of length L, which form the corpus obtained by walk sampling from the heterogeneous graph.
Further, in step S22 the node-sequence corpus is input into a Skip-Gram neural network for embedding representation learning, and the vector representation of each node is obtained by training. For each current node v_k the optimization objective is:

$$\max_{\theta}\ \sum_{v_k\in V}\ \sum_{v_j\in C(v_k)}\log p(v_j\mid v_k;\theta)$$

where C(v_k) denotes the set of nodes within a window of w nodes before and after node v_k, V denotes the nodes of the corpus, and p(v_j | v_k; θ) is a Softmax function, specifically:

$$p(v_j\mid v_k;\theta)=\frac{\exp(y_{v_j}\cdot y_{v_k})}{\sum_{v\in V_n}\exp(y_v\cdot y_{v_k})}$$

where θ is the weight parameter, V_n denotes the nodes of type n, and y_v denotes the embedding vector of node v.
Further, in step S23 the similarity between the users' embedding vectors is calculated and used to measure how similar the users' preferences are. The similarity between two vectors is computed as:

$$sim(u,v)=\cos(y_u,y_v)=\frac{y_u\cdot y_v}{\lVert y_u\rVert\,\lVert y_v\rVert}$$

For each user, this formula is used to compute the similarity between that user and every other user; the Top-K users with the highest similarity are taken as the user's implicit friends, so that each user finally has Top-K implicit friends.
Further, the process of step S3 is:
S31: the items a user has interacted with are divided into 3 levels according to the scores the user gave them: liked items P_U, mediocre items O_U, and disliked items N_U. Items are rated on a 5-point scale: if the user scored an item 4-5, the user is considered to like it and it is placed in the user's liked items; if the score is 3, the user is considered to find it mediocre and it is placed in the user's mediocre items; if the score is 1-2, the user is considered to dislike it and it is placed in the user's disliked items;
S32: the items the user has not interacted with are divided into 3 levels according to the evaluations of the user's implicit friends: implicit-friend liked items PS_U, implicit-friend disliked items NS_U, and remaining items E_U. Items that an implicit friend of the user has viewed and liked and that do not belong to P_U, O_U, or N_U are placed in the user's implicit-friend liked items; items that an implicit friend has viewed with a score of 3 or less and that do not belong to P_U, O_U, N_U, or PS_U are placed in the user's implicit-friend disliked items; the remaining items, which belong to none of P_U, O_U, N_U, PS_U, or NS_U, are placed in the remaining items;
S33: after these two classification steps, each user has 6 mutually exclusive groups of items, with P_U + O_U + N_U + PS_U + NS_U + E_U = the set of all items, and P_U, O_U, N_U, PS_U, NS_U, E_U pairwise disjoint;
liked items P_U: for every user, let P_U denote the items user u has viewed and liked;
mediocre items O_U: let O_U denote the items user u has viewed and found mediocre;
disliked items N_U: let N_U denote the items user u has viewed and disliked;
implicit-friend liked items PS_U: let PS_U denote the items that some implicit friend of user u has viewed and liked and that do not belong to P_U, O_U, or N_U;
implicit-friend disliked items NS_U: let NS_U denote the items that some implicit friend of user u has viewed with a score of 3 or less and that do not belong to P_U, O_U, N_U, or PS_U;
remaining items E_U: the remaining items that user u has not viewed and that do not belong to P_U, O_U, N_U, PS_U, or NS_U.
Further, the process of step S4 is:
S41: for the fine-grained classification of each user's items into 6 groups, the following assumption is made: the user's preference order is liked > implicit-friend liked > mediocre > remaining > implicit-friend disliked > disliked. This assumption is then expressed as a mathematical model:

$$f:\; x_{ui}\ge x_{uj}\ge x_{uk}\ge x_{ul}\ge x_{um}\ge x_{un}$$

where i ∈ P_u, j ∈ PS_u, k ∈ O_u, l ∈ E_u, m ∈ NS_u, n ∈ N_u;
here x_{ui} denotes user u's preference for a liked item i rated by the user, x_{uj} denotes user u's preference for an implicit-friend liked item j, x_{uk} denotes user u's preference for a mediocre item k rated by the user, x_{ul} denotes user u's preference for a remaining item l, x_{um} denotes user u's preference for an implicit-friend disliked item m, and x_{un} denotes user u's preference for a disliked item n rated by the user;
s42: the above basic assumption can be used to maximize AUC, and a larger AUC value, meaning the greater the probability of the combination of the above assumptions, is trained in the following optimized formula:
Figure BDA0002828615560000051
when the optimization target reaches the maximum, obtaining an item list ordered by each user according to the preference degree;
s43: the item ordered list results for each user are stored in a database for easy querying.
Further, in step S5, when a user logs in to the platform, the system reads the user's id information, retrieves the user's item recommendation list from the offline database according to that id, and returns the Top-N items at the head of the list to the user.
Compared with the prior art, the technical solution of the invention has the following beneficial effects:
The method first builds a heterogeneous graph from the items a user has consumed, the rating feedback, and the social network; the heterogeneous graph is then sampled with a novel heterogeneous-graph walk method, and the sampled data are fed into a Skip-Gram neural network to learn vector representations of users and items. Next, the similarity between user vectors is computed with the cosine similarity formula, and each user's implicit friends, who are most likely to share similar preferences, are identified from the inter-user similarities. Finally, based on each user's implicit-friend relationships, the items are subdivided into several mutually exclusive groups, the preferences are modeled with a Bayesian personalized ranking algorithm, and a personalized recommendation list is generated for each user.
Drawings
FIG. 1 is a general flow chart of the process of the present invention;
FIG. 2 is a simplified social heterogeneous information network;
FIG. 3 is a schematic diagram of the node walk-sampled corpus.
Detailed Description
The drawings are for illustrative purposes only and are not to be construed as limiting the present embodiments;
for the purpose of better illustrating the embodiments, certain features of the drawings may be omitted, enlarged or reduced, and do not represent the size of an actual product;
it will be understood by those skilled in the art that certain well-known structures in the drawings and descriptions thereof may be omitted.
The technical solution of the present invention is further described below with reference to the accompanying drawings and examples.
As shown in FIG. 1, a recommendation method fusing social networks based on Bayesian ranking includes the following steps:
S1: fuse the user's social network and the user-item interaction network to construct a heterogeneous information network (this step includes the combination unit 101 in the flowchart);
S2: identify the implicit friends of each user based on the heterogeneous-graph embedded representation. Specifically, in this embodiment a new heterogeneous-graph sampling method is designed: walk sequences of nodes are obtained by walking on the heterogeneous graph, and the vector representation of each user and each item is obtained by learning these node sequences with a Skip-Gram neural network. Finally, the pairwise similarity between users is computed from the user vectors, and the K users most similar to a user are that user's implicit friends (this step includes the node walk sampling unit 102, the Skip-Gram neural network 103, and the similarity computation unit 104 in the flowchart);
S3: according to each user's rating feedback on items and the implicit friends' interaction records with items, all items can be classified for each user in a fine-grained way into 6 mutually exclusive groups: liked items, mediocre items, disliked items, implicit-friend liked items, implicit-friend disliked items, and remaining items (this step includes the fine-grained item classification unit 105 in the flowchart);
S4: model each user's preference ordering with a Bayesian ranking algorithm. For each user, the model assumes the preference order liked items > implicit-friend liked items > mediocre items > remaining items > implicit-friend disliked items > disliked items. The model is optimized so that the joint probability of this assumption holding is as large as possible, and the user's personalized ranking list at the maximum joint probability is obtained (this step includes the Bayesian ranking unit 106 in the flowchart);
S5: when a user logs in to the recommendation platform, the system looks up the user's personalized ranking list in the training results by the user's id and recommends the Top-N ranked items to the user (this step includes the retrieval unit 107 in the flowchart).
(1) S1: fuse the user's social network and the user-item interaction network to construct a heterogeneous information network (this step includes the combination unit 101 in the flowchart).
S11: construct the user-item interaction graph: each user or item is represented by a node; if a user has interacted with an item, the two nodes are connected by an edge whose weight is the user's rating of the item. Once all interactions are connected, the user-item interaction graph is obtained.
S12: on the user-item interaction graph, the social network between users is added through the combination unit. Specifically, if a social relationship exists between two users, their nodes are connected. Once the social relationships of all users are connected, the social heterogeneous information network graph is obtained; its structure is shown in FIG. 2, where u_id denotes a user id and m_id denotes an item id.
(2) S2: identify the implicit friends of each user based on the heterogeneous-graph embedded representation. Specifically, in this embodiment a new heterogeneous-graph sampling method is designed: walk sequences of nodes are obtained by walking on the heterogeneous graph, and the vector representation of each user and each item is obtained by learning these node sequences with a Skip-Gram neural network. Finally, the pairwise similarity between users is computed from the user vectors, and the K users most similar to a user are that user's implicit friends (this step includes the node walk sampling unit 102, the Skip-Gram neural network 103, and the similarity computation unit 104 in the flowchart).
S21: perform walk sampling on the social heterogeneous information network, converting the information from graph form into node-sequence form (this step includes the node walk sampling unit 102 in the flowchart). The procedure is as follows:
s211: a migration rule from the current node to the next node is set.
1) If the current node is an item, then the next node to walk can only be the user, since there is no edge connection between items. Designing probabilities if a current item node is connected to many users
Figure BDA0002828615560000081
To select one of the users, the probability is proportional to the evaluation weight on the edge, i.e. the higher the value of the user evaluation, the higher the probability to be selected.
2) If the current node is a user, then the next node type to walk is a user or an item. In this case, a probability α × n is first designed -1 To decide whether the next node type is a user or an item type, where n represents n consecutive visits to the user type. Thus, the probability of continuing to select users decreases as n increases, so as to prevent the nodes from staying too long in the node type, and ensure more balanced sampling. After the selection of the next node type, if the selection is a user type, the next user node is selected with equal probability from the user nodes having a connection with the current user node. If the type of item is selected, then the item node connected with the current user node is selected according to probability
Figure BDA0002828615560000082
Selecting one of themThe node, this probability ensures that the item scored higher by the current user has a greater probability of being selected.
In summary, the wander selection algorithm designed in this embodiment can be expressed by the following formula (1):
Figure BDA0002828615560000083
in the formula
V i Representing a current node;
V i+1 representing the next node to walk;
m represents an item type;
u represents a user type;
the probability that alpha belongs to [0,1] is the initial user type;
e(v i ,v j ) Representing a node v i And node v j The weight of the edge in between;
|N i +1(v i ) | represents v i The number of social friends of the node;
n is the number of nodes that the user type node is continuously accessed.
S212: acquiring a set U of all N user ids, and executing the following operations on each user id in the user set U: starting from the user node, the migration rule designed in the embodiment performs migration along the edge in the graph, the first step of migration to the adjacent node of the user node according to the rule, the second step of migration to the adjacent node of the adjacent node according to the rule, the steps are repeated, each step needs to perform migration sampling according to the designed probability until the specified step length L is reached, and the size of L is set according to the complexity of the heterogeneous information network. Thus, a node sequence of N L steps is obtained.
S213: repeating the operation of step S212W times ensures that the sampling of the heterogeneous graph is sufficiently comprehensive, wherein the size of W is set according to the complexity of the heterogeneous information network. So far, a node sequence of W × N L steps is obtained, and this is referred to as a corpus obtained by walking and sampling from the heterogeneous graph. The node sequence corpus of fig. 3 below can be specifically referred to.
S22: and inputting the node sequence corpus into a Skip-Gram neural network for embedding, characterizing and learning, and training to obtain a vector representation of each node (the step comprises a Skip Gram neural network of 103 in the flow chart). Specifically, for each current node vk, the optimized objective function is:
Figure BDA0002828615560000091
wherein C (vk) represents a node set of upper and lower w windows of a node vk, V represents each node in a corpus, and p (vnm | vk; theta) is a Softmax function, and specifically comprises the following steps:
Figure BDA0002828615560000092
where θ is a weight parameter, vn represents a node type, y v Representing the embedded vector of node v.
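One possible realization of step S22 is sketched below using the gensim library's Word2Vec in skip-gram mode (gensim >= 4.0 assumed) as the Skip-Gram network. The embedding dimension, window size, and epoch count are illustrative values, and gensim's standard skip-gram objective with negative sampling stands in for the per-node-type Softmax described above.

```python
# A sketch of step S22 using gensim's Word2Vec as the Skip-Gram network.
from gensim.models import Word2Vec

def learn_node_embeddings(corpus, dim=128, window=5, epochs=5):
    """corpus: list of node-id sequences produced by the walk sampling."""
    model = Word2Vec(
        sentences=corpus,
        vector_size=dim,   # embedding dimension of each node
        window=window,     # context window w around the current node v_k
        sg=1,              # skip-gram objective
        min_count=0,
        epochs=epochs,
    )
    # model.wv[node_id] is the learned embedding y_v of node v
    return model.wv
```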
S23: the similarity between the embedded vectors of the users is calculated, and the similarity is used to measure the similarity of preference between the users (this step includes a calculate similarity unit of 104 in the flowchart). The similarity between two vectors is calculated as:
Figure BDA0002828615560000093
for each user, the formula is used for calculating the similarity between the user and all other users, the Top-K users with the highest similarity are taken as the implicit friends of the user, and finally the Top-K implicit friends of each user are obtained
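Continuing the sketch, step S23 can be implemented as below; wv is the KeyedVectors object returned by the previous sketch, user_nodes is the list of user node ids, and top_k is the K of the Top-K implicit friends.

```python
import numpy as np

def implicit_friends(wv, user_nodes, top_k):
    """S23: for each user, the Top-K most similar users by cosine similarity."""
    vecs = np.stack([wv[u] for u in user_nodes])
    norm = vecs / np.linalg.norm(vecs, axis=1, keepdims=True)
    sim = norm @ norm.T                       # pairwise cosine similarities
    np.fill_diagonal(sim, -np.inf)            # exclude the user itself
    friends = {}
    for idx, u in enumerate(user_nodes):
        top = np.argsort(-sim[idx])[:top_k]
        friends[u] = [user_nodes[j] for j in top]
    return friends
```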
(3) S3: according to each user's rating feedback on items and the implicit friends' interaction records with items, all items can be classified for each user in a fine-grained way into 6 mutually exclusive groups: liked items, mediocre items, disliked items, implicit-friend liked items, implicit-friend disliked items, and remaining items (this step includes the fine-grained item classification unit 105 in the flowchart).
S31: the items the user has interacted with are divided into 3 levels according to the scores the user gave them: liked items (P_U), mediocre items (O_U), and disliked items (N_U). Items are rated on a 5-point scale: if the user scored an item 4-5, the user is considered to like it and it is placed in the user's liked items; if the score is 3, the user is considered to find it mediocre and it is placed in the user's mediocre items; if the score is 1-2, the user is considered to dislike it and it is placed in the user's disliked items.
S32: the items the user has not interacted with are divided into 3 levels according to the evaluations of the user's implicit friends: implicit-friend liked items (PS_U), implicit-friend disliked items (NS_U), and remaining items (E_U). Items that an implicit friend of the user has viewed and liked and that do not belong to P_U, O_U, or N_U are placed in the user's implicit-friend liked items; items that an implicit friend has viewed with a score of 3 or less and that do not belong to P_U, O_U, N_U, or PS_U are placed in the user's implicit-friend disliked items; the remaining items, which belong to none of P_U, O_U, N_U, PS_U, or NS_U, are placed in the remaining items.
S33: after these two classification steps, each user has 6 mutually exclusive groups of items, with P_U + O_U + N_U + PS_U + NS_U + E_U = the set of all items, and P_U, O_U, N_U, PS_U, NS_U, E_U pairwise disjoint.
(1) liked items (P_U): for every user, let P_U denote the items user u has viewed and liked;
(2) mediocre items (O_U): let O_U denote the items user u has viewed and found mediocre;
(3) disliked items (N_U): let N_U denote the items user u has viewed and disliked;
(4) implicit-friend liked items (PS_U): let PS_U denote the items that some implicit friend of user u has viewed and liked and that do not belong to P_U, O_U, or N_U;
(5) implicit-friend disliked items (NS_U): let NS_U denote the items that some implicit friend of user u has viewed with a score of 3 or less and that do not belong to P_U, O_U, N_U, or PS_U;
(6) remaining items (E_U): the remaining items that user u has not viewed and that do not belong to P_U, O_U, N_U, PS_U, or NS_U.
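A sketch of the six-way partition of steps S31-S33 for a single user follows; the threshold of 4 or higher for a friend's "liked" rating mirrors the user's own 4-5 liked level and is stated here as an assumption, and the dictionary-based input format is illustrative.

```python
def partition_items(all_items, user_ratings, friend_ratings):
    """S31-S33: split all items into the six mutually exclusive groups for one user.

    user_ratings:   dict item -> score (1-5) given by the user.
    friend_ratings: dict item -> list of scores given by the user's implicit friends.
    """
    P = {i for i, s in user_ratings.items() if s >= 4}   # liked items P_U
    O = {i for i, s in user_ratings.items() if s == 3}   # mediocre items O_U
    N = {i for i, s in user_ratings.items() if s <= 2}   # disliked items N_U
    seen = P | O | N
    PS = {i for i, scores in friend_ratings.items()
          if i not in seen and any(s >= 4 for s in scores)}   # implicit-friend liked PS_U
    NS = {i for i in friend_ratings
          if i not in seen and i not in PS}                   # viewed by a friend, score <= 3 only: NS_U
    E = set(all_items) - seen - PS - NS                       # remaining items E_U
    return P, O, N, PS, NS, E
```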
(4) S4: each user's preference ordering is modeled with a Bayesian ranking algorithm. For each user, the model assumes the preference order liked items > implicit-friend liked items > mediocre items > remaining items > implicit-friend disliked items > disliked items. The model is optimized so that the joint probability of this assumption holding is as large as possible, and the user's personalized ranking list at the maximum joint probability is obtained (this step includes the Bayesian ranking unit 106 in the flowchart).
S41: for the fine-grained classification of each user's items into 6 groups, the following assumption is made: the user's preference order is liked > implicit-friend liked > mediocre > remaining > implicit-friend disliked > disliked. This assumption is then expressed as a mathematical model:

$$f:\; x_{ui}\ge x_{uj}\ge x_{uk}\ge x_{ul}\ge x_{um}\ge x_{un}$$

where i ∈ P_u, j ∈ PS_u, k ∈ O_u, l ∈ E_u, m ∈ NS_u, n ∈ N_u;
here x_{ui} denotes user u's preference for a liked item i rated by the user, x_{uj} denotes user u's preference for an implicit-friend liked item j, x_{uk} denotes user u's preference for a mediocre item k rated by the user, x_{ul} denotes user u's preference for a remaining item l, x_{um} denotes user u's preference for an implicit-friend disliked item m, and x_{un} denotes user u's preference for a disliked item n rated by the user.
S42: the above basic assumption can be used to maximize AUC, a larger AUC value, meaning that the probability of the above assumptions being combined is larger. Training is carried out according to the following optimized formula:
Figure BDA0002828615560000111
when the optimization goal is maximized, a list of items that are ordered by preference level for each user is obtained.
S43: the item ordered list results of each user are stored in a database for easy query.
(5) S5: when a user logs in to the recommendation platform, the system looks up the user's personalized ranking list in the training results by the user's id and recommends the Top-N ranked items to the user.
Items are recommended to the user online: when a user logs in to the platform, the system reads the user's id, retrieves the user's item recommendation list from the offline database according to that id, and returns the Top-N items at the head of the list to the user.
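As an illustration of the online stage of step S5, the stored per-user rankings can be served by a simple lookup keyed on the user id; the in-memory dictionary below stands in for the offline database, whose actual storage backend is not specified in the patent.

```python
# A minimal sketch of step S5: look up the precomputed ranking by user id
# and return the Top-N items.
offline_ranking_db = {}   # user_id -> list of item ids ordered by predicted preference

def store_ranking(user_id, ranked_items):
    offline_ranking_db[user_id] = list(ranked_items)

def recommend_top_n(user_id, n=10):
    """Called when the user logs in: fetch the stored list and return its first N items."""
    ranking = offline_ranking_db.get(user_id, [])
    return ranking[:n]
```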
This concludes the item recommendation process of the Bayesian-ranking method that fuses social networks.
The same or similar reference numerals correspond to the same or similar parts;
the positional relationships depicted in the drawings are for illustrative purposes only and are not to be construed as limiting in the present embodiment;
it should be understood that the above-described embodiments of the present invention are merely examples for clearly illustrating the present invention, and are not intended to limit the embodiments of the present invention. Other variations and modifications will be apparent to persons skilled in the art in light of the above description. And are neither required nor exhaustive of all embodiments. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present invention should be included in the protection scope of the claims of the present invention.

Claims (4)

1. A recommendation method fusing social networks based on Bayesian ranking, characterized by comprising the following steps:
S1: fusing the user's social network and the user-item interaction network to construct a heterogeneous information network graph;
S2: identifying implicit friends of each user based on the heterogeneous-graph embedded representation;
S3: according to each user's rating feedback on items and the implicit friends' interaction records with items, classifying all items for each user in a fine-grained way into 6 mutually exclusive groups: liked items, mediocre items, disliked items, implicit-friend liked items, implicit-friend disliked items, and remaining items;
S4: modeling each user's preference ordering with a Bayesian ranking algorithm: for each user, the model assumes the preference order liked items > implicit-friend liked items > mediocre items > remaining items > implicit-friend disliked items > disliked items; the model is optimized so that the joint probability of this assumption holding is as large as possible, and the user's personalized ranking list at the maximum joint probability is obtained;
S5: when a user logs in to the recommendation platform, the system looks up the user's personalized ranking list in the training results by the user's id and recommends the Top-N ranked items to the user;
the process of step S1 is as follows:
S11: constructing a user-item interaction graph, in which each user or item is represented by its own node; if a user has interacted with an item, the two nodes are connected by an edge whose weight is the user's rating of the item; once all interactions are connected, the user-item interaction graph is obtained;
S12: on the user-item interaction graph, adding the users' social network through a combination unit; if a social relationship exists between two users, their nodes are connected; once the social relationships of all users are connected, the social heterogeneous information network graph is obtained;
the process of step S2 is as follows:
S21: performing walk sampling on the social heterogeneous information network and converting the information from graph form into node-sequence form;
S22: inputting the node-sequence corpus into a Skip-Gram neural network for embedding representation learning, and training to obtain the vector representation of each node;
S23: calculating the similarity between the users' embedding vectors and using it to measure how similar the users' preferences are;
the process of step S21 is as follows:
S211: designing a walk rule from the current node to the next node:
1) if the current node is an item, the next node can only be a user, because there are no edges between items; if the current item node is connected to many users, one of them is selected with probability

$$p(v_{i+1}=u_j \mid v_i)=\frac{e(v_i,u_j)}{\sum_{u_k\in N(v_i)}e(v_i,u_k)}$$

which is proportional to the rating weight on the edge, that is, the higher the user's rating, the higher the probability of being selected;
2) if the current node is a user, the next node type is either a user or an item; a probability α·n^{-1} is first used to decide whether the next node type is a user or an item, where n denotes the number of consecutive visits to user-type nodes, so that the probability of continuing to select users decreases as n grows, which prevents the walk from staying in one node type too long and keeps the sampling balanced; after the next node type is chosen, if the user type is selected, the next user node is chosen with equal probability among the user nodes connected to the current user node; if the item type is selected, one of the item nodes connected to the current user node is chosen with probability

$$p(v_{i+1}=m_j \mid v_i)=\frac{e(v_i,m_j)}{\sum_{m_k\in N(v_i)}e(v_i,m_k)}$$

which ensures that items the current user rated higher are more likely to be selected;
the walk selection rule is expressed by the following equation (1):

$$p(v_{i+1}\mid v_i)=\begin{cases}\dfrac{e(v_i,v_{i+1})}{\sum_{v_j\in N(v_i)}e(v_i,v_j)}, & v_i\in M\\[2mm] \alpha n^{-1}\cdot\dfrac{1}{|N_{i+1}(v_i)|}, & v_i\in U,\ v_{i+1}\in U\\[2mm] \left(1-\alpha n^{-1}\right)\cdot\dfrac{e(v_i,v_{i+1})}{\sum_{v_j\in N(v_i)}e(v_i,v_j)}, & v_i\in U,\ v_{i+1}\in M\end{cases}\tag{1}$$

where v_i denotes the current node; v_{i+1} denotes the next node of the walk; M denotes the item type; U denotes the user type; α ∈ [0,1] is the initial probability of choosing the user type; e(v_i, v_j) denotes the weight of the edge between node v_i and node v_j; |N_{i+1}(v_i)| denotes the number of social friends of node v_i;
S212: obtaining the set U of all N user ids and, for each user id in U, performing the following operation: starting from that user node, walking along the edges of the graph according to the designed walk rule, the first step walking to a neighbor of the user node according to the rule, the second step walking to a neighbor of that neighbor, and so on, each step being sampled according to the designed probabilities, until the specified walk length L is reached, where L is set according to the complexity of the heterogeneous information network, thereby obtaining N node sequences of length L;
S213: repeating the operation of step S212 W times to ensure that the heterogeneous graph is sampled sufficiently thoroughly, where W is set according to the complexity of the heterogeneous information network, thereby obtaining W × N node sequences of length L, namely the corpus obtained by walk sampling from the heterogeneous graph;
in step S22, the node-sequence corpus is input into a Skip-Gram neural network for embedding representation learning, and the vector representation of each node is obtained by training; for each current node v_i, the optimization objective is:

$$\max_{\theta}\ \sum_{v_i\in V}\ \sum_{v_j\in C(v_i)}\log p(v_j\mid v_i;\theta)$$

where C(v_i) denotes the set of nodes within a window of w nodes before and after node v_i, V denotes the nodes of the corpus, and p(v_j | v_i; θ) is a Softmax function, specifically:

$$p(v_j\mid v_i;\theta)=\frac{\exp(y_{v_j}\cdot y_{v_i})}{\sum_{v\in V_n}\exp(y_v\cdot y_{v_i})}$$

where θ is the weight parameter, V_n denotes the nodes of type n, and y_v denotes the embedding vector of node v;
in step S23, the similarity between the users' embedding vectors is calculated and used to measure how similar the users' preferences are, where the similarity between two vectors is computed as:

$$sim(u,v)=\cos(y_u,y_v)=\frac{y_u\cdot y_v}{\lVert y_u\rVert\,\lVert y_v\rVert}$$

for each user, this formula is used to compute the similarity between that user and every other user; the Top-K users with the highest similarity are taken as the user's implicit friends, so that each user finally has Top-K implicit friends;
liked items P_U: for every user, let P_U denote the items user u has viewed and liked;
mediocre items O_U: let O_U denote the items user u has viewed and found mediocre;
disliked items N_U: let N_U denote the items user u has viewed and disliked;
implicit-friend liked items PS_U: let PS_U denote the items that some implicit friend of user u has viewed and liked and that do not belong to P_U, O_U, or N_U;
implicit-friend disliked items NS_U: let NS_U denote the items that some implicit friend of user u has viewed with a score of 3 or less and that do not belong to P_U, O_U, N_U, or PS_U;
remaining items E_U: the remaining items that user u has not viewed and that do not belong to P_U, O_U, N_U, PS_U, or NS_U.
2. The Bayesian-ranking-based recommendation method fusing social networks according to claim 1, wherein the process of step S3 is as follows:
S31: the items the user has interacted with are divided into 3 levels according to the scores the user gave them: liked items P_U, mediocre items O_U, and disliked items N_U; items are rated on a 5-point scale: if the user scored an item 4-5, the user is considered to like it and it is placed in the user's liked items; if the score is 3, the user is considered to find it mediocre and it is placed in the user's mediocre items; if the score is 1-2, the user is considered to dislike it and it is placed in the user's disliked items;
S32: the items the user has not interacted with are divided into 3 levels according to the evaluations of the user's implicit friends: implicit-friend liked items PS_U, implicit-friend disliked items NS_U, and remaining items E_U; items that an implicit friend of the user has viewed and liked and that do not belong to P_U, O_U, or N_U are placed in the user's implicit-friend liked items; items that an implicit friend has viewed with a score of 3 or less and that do not belong to P_U, O_U, N_U, or PS_U are placed in the user's implicit-friend disliked items; the remaining items, which belong to none of P_U, O_U, N_U, PS_U, or NS_U, are placed in the remaining items;
S33: after these two classification steps, each user has 6 mutually exclusive groups of items, with P_U + O_U + N_U + PS_U + NS_U + E_U = the set of all items, and P_U, O_U, N_U, PS_U, NS_U, E_U pairwise disjoint.
3. The Bayesian-ranking-based recommendation method fusing social networks according to claim 2, wherein the process of step S4 is as follows:
S41: for the fine-grained classification of each user's items into 6 groups, the following assumption is made: the user's preference order is liked items > implicit-friend liked items > mediocre items > remaining items > implicit-friend disliked items > disliked items; this assumption is then expressed as a mathematical model:

$$f:\; x_{ui}\ge x_{uj}\ge x_{uk}\ge x_{ul}\ge x_{um}\ge x_{un}$$

where i ∈ P_u, j ∈ PS_u, k ∈ O_u, l ∈ E_u, m ∈ NS_u, n ∈ N_u;
here x_{ui} denotes user u's preference for a liked item i rated by the user, x_{uj} denotes user u's preference for an implicit-friend liked item j, x_{uk} denotes user u's preference for a mediocre item k rated by the user, x_{ul} denotes user u's preference for a remaining item l, x_{um} denotes user u's preference for an implicit-friend disliked item m, and x_{un} denotes user u's preference for a disliked item n rated by the user;
S42: this basic assumption can be used to maximize the AUC; the larger the AUC value, the larger the probability that the assumed orderings hold jointly; training is carried out with the following optimization objective:

$$\max_{\Theta}\;\sum_{u\in U}\ \sum_{(a,b)\in\{(i,j),(j,k),(k,l),(l,m),(m,n)\}}\ln\sigma\!\left(x_{ua}-x_{ub}\right)\;-\;\lambda_{\Theta}\lVert\Theta\rVert^{2}$$

when the optimization objective reaches its maximum, a list of items ordered by preference is obtained for each user;
S43: each user's ordered item list is stored in a database for convenient querying.
4. The Bayesian-ranking-based recommendation method fusing social networks according to claim 3, wherein in step S5, when a user logs in to the platform, the system reads the user's id information, retrieves the user's item recommendation list from an offline database according to that id, and returns the Top-N items at the head of the list to the user.
CN202011435734.4A 2020-12-10 2020-12-10 Recommendation method fusing social networks based on Bayesian sorting Active CN112650920B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011435734.4A CN112650920B (en) 2020-12-10 2020-12-10 Recommendation method fusing social networks based on Bayesian sorting

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011435734.4A CN112650920B (en) 2020-12-10 2020-12-10 Recommendation method fusing social networks based on Bayesian sorting

Publications (2)

Publication Number Publication Date
CN112650920A CN112650920A (en) 2021-04-13
CN112650920B true CN112650920B (en) 2022-11-11

Family

ID=75350667

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011435734.4A Active CN112650920B (en) 2020-12-10 2020-12-10 Recommendation method fusing social networks based on Bayesian sorting

Country Status (1)

Country Link
CN (1) CN112650920B (en)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110910218A (en) * 2019-11-21 2020-03-24 南京邮电大学 Multi-behavior migration recommendation method based on deep learning

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106934071A (en) * 2017-04-27 2017-07-07 北京大学 Recommendation method and device based on Heterogeneous Information network and Bayes's personalized ordering
CN107403390B (en) * 2017-08-02 2020-06-02 桂林电子科技大学 Friend recommendation method integrating Bayesian reasoning and random walk on graph
CN109726747B (en) * 2018-12-20 2021-09-28 西安电子科技大学 Data fusion ordering method based on social network recommendation platform
CN111428147B (en) * 2020-03-25 2021-07-27 合肥工业大学 Social recommendation method of heterogeneous graph volume network combining social and interest information

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110910218A (en) * 2019-11-21 2020-03-24 南京邮电大学 Multi-behavior migration recommendation method based on deep learning

Also Published As

Publication number Publication date
CN112650920A (en) 2021-04-13

Similar Documents

Publication Publication Date Title
CN111428147B (en) Social recommendation method of heterogeneous graph volume network combining social and interest information
Kuo et al. Integration of ART2 neural network and genetic K-means algorithm for analyzing Web browsing paths in electronic commerce
CN108920527A (en) A kind of personalized recommendation method of knowledge based map
CN109190030B (en) Implicit feedback recommendation method fusing node2vec and deep neural network
CN106296312A (en) Online education resource recommendation system based on social media
Park et al. Uniwalk: Explainable and accurate recommendation for rating and network data
CN112507246B (en) Social recommendation method fusing global and local social interest influence
Xue et al. Trust-aware review spam detection
Wang et al. Personalized news recommendation based on consumers' click behavior
Yigit et al. Extended topology based recommendation system for unidirectional social networks
CN111475724A (en) Random walk social network event recommendation method based on user similarity
US20130138662A1 (en) Method for assigning user-centric ranks to database entries within the context of social networking
CN114298783A (en) Commodity recommendation method and system based on matrix decomposition and fusion of user social information
Wang et al. Link prediction in heterogeneous collaboration networks
CN112905906B (en) Recommendation method and system fusing local collaboration and feature intersection
CN112650920B (en) Recommendation method fusing social networks based on Bayesian sorting
CN110795640B (en) Self-adaptive group recommendation method for compensating group member difference
Shalforoushan et al. Link prediction in social networks using Bayesian networks
CN112307343B (en) Cross-E-book city user alignment method based on double-layer iterative compensation and full-face representation
Wasid et al. Particle swarm optimisation-based contextual recommender systems
CN114022233A (en) Novel commodity recommendation method
JP7158870B2 (en) Information processing device, information processing method, and information processing program
Cai et al. Marriage recommendation algorithm based on KD-KNN-LR model
Song et al. Social recommendation based on implicit friends discovering via meta-path
Yonggui et al. Hybrid Recommendation Algorithm Combining Project Importance and Prediction Score

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant