CN116127201A

CN116127201A - Large-scale user recommendation method based on evolutionary multitasking

Info

Publication number: CN116127201A
Application number: CN202310167737.1A
Authority: CN
Inventors: 马海平; 胡以葳; 田野
Original assignee: Anhui University
Current assignee: Anhui University
Priority date: 2023-02-27
Filing date: 2023-02-27
Publication date: 2023-05-16

Abstract

The invention discloses a large-scale user recommendation method based on evolutionary multitasking, comprising the following steps: step 1, acquiring a user-item interaction data set; step 2, mining each user's preference scores for the items with a neural network model to obtain a user-item scoring matrix; step 3, grouping the users by clustering, so that users with similar preferences fall into the same group; step 4, initializing a multi-task population; and step 5, performing information migration among individuals within the same user group and among populations across different user groups, iteratively selecting better solutions for each user through environment selection, and finally selecting one solution as that user's recommendation list. The invention reduces the time and space consumed by large-scale user recommendation optimization, and the clustering technique improves the accuracy of the predicted recommendation results.

Description

Large-scale user recommendation method based on evolutionary multitasking

Technical Field

The invention lies at the intersection of evolutionary computation and data mining, and in particular relates to a recommendation method based on evolutionary multitasking optimization.

Background

The aim of a recommendation system is to help users screen useful information from massive data. Traditional recommendation systems consider only recommendation accuracy, but other properties such as novelty and diversity are also important indicators, so multi-objective recommendation has become an important research direction. Because the system must satisfy several indicators at once, improving some of them inevitably conflicts with others. Multi-objective optimization has therefore become an important technical means for building recommendation systems.

Existing multi-objective optimization methods for recommendation systems fall into two families: scalarization methods, which combine the objectives into a single objective by a weighted sum, and population-based evolutionary algorithms, which optimize the objectives simultaneously.

Scalarization methods typically form a weighted sum of the objectives and then optimize it with a Pareto-efficient SGD scheme based on the multiple gradient descent algorithm, using the KKT conditions to guide the update of the scalarization weights. However, such methods can only optimize objectives that have gradients.

When an evolutionary algorithm is used to optimize a multi-objective recommendation system, it is usually run separately for each user. Because an evolutionary algorithm is iterative, optimizing the users one after another makes the running time excessive when the data set contains many users. Alternatively, the recommendation results of all users are encoded into a single chromosome with real-number coding, and all users are optimized simultaneously in one optimization process. This makes the chromosome excessively long and makes it difficult to bring every user to an optimum at the same time, so the optimization easily falls into a local optimum and the recommended items may fail to meet the users' needs.

Disclosure of Invention

To overcome the above shortcomings of the prior art, the invention provides a large-scale user recommendation method based on evolutionary multitasking, which produces for each user a recommendation list that balances accuracy, novelty and diversity while keeping the recommendation method efficient.

The invention adopts the following technical scheme for solving the technical problems:

the invention discloses a large-scale user recommendation method based on evolutionary multitasking, which is characterized by comprising the following steps:

step one, acquiring the data on users and items:

acquiring a user set $S=\{s_1,s_2,\ldots,s_u,\ldots,s_{|S|}\}$, where $|S|$ denotes the number of users and $s_u$ denotes the user numbered $u$;

acquiring an item set $Q=\{q_1,q_2,\ldots,q_i,\ldots,q_{|Q|}\}$, where $|Q|$ denotes the number of items and $q_i$ denotes the $i$-th item;

acquiring a user-item interaction data set, taking each user's interactions with items as positive samples, and randomly sampling items with which the user has not interacted as negative samples;
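A minimal Python sketch of how the positive and negative samples of step one might be assembled. The function name `build_samples`, the `negatives_per_positive` ratio and the toy data are illustrative assumptions; the text specifies only that non-interacted items are sampled at random.

```python
import random

def build_samples(interactions, all_items, negatives_per_positive=1, seed=0):
    """Return (user, item, label) triples: 1 for observed pairs, 0 for sampled negatives."""
    rng = random.Random(seed)
    samples = []
    for user, seen in interactions.items():
        # Items this user never interacted with are the negative candidates.
        unseen = sorted(set(all_items) - set(seen))
        for item in seen:
            samples.append((user, item, 1))
        # Draw negatives without replacement, capped by the number of unseen items.
        k = min(len(unseen), negatives_per_positive * len(seen))
        for item in rng.sample(unseen, k):
            samples.append((user, item, 0))
    return samples

interactions = {"s1": [1, 2, 3], "s2": [2, 5]}   # toy data, not from the patent
all_items = list(range(1, 9))
samples = build_samples(interactions, all_items)
```

The ratio of negatives to positives is a tuning choice that the text leaves open.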

step two, obtaining the users' scoring matrix over the items through a prediction model, wherein the prediction model comprises a coding layer, a fully connected layer, an attention interaction layer, an interaction output layer and a prediction layer;

step 2.1, the coding layer applies one-hot coding to user $s_u$ and the $i$-th item $q_i$ to obtain the sparse vector $\theta_u$ of user $s_u$ and the sparse vector of item $q_i$; the two sparse vectors are then each mapped to an $E$-dimensional vector by the fully connected layer, giving the representation vector $p_u$ of user $s_u$ and the representation vector $o_i$ of item $q_i$;

step 2.2, the attention interaction layer processes $p_u$ and $o_i$ and outputs the attention vector $a_{u,i}$;

step 2.3, the interaction output layer concatenates $p_u$ and $o_i$ and applies dot multiplication between the concatenated vector and the attention vector $a_{u,i}$ to obtain the interaction vector $f_{u,i}$ of user $s_u$ and item $q_i$;

step 2.4, the prediction layer passes the interaction vector $f_{u,i}$ through multiple fully connected layers and outputs the predicted interaction score $r_{u,i}$;

step 2.5, the mean square error of the predicted interaction scores is used as the loss function to be minimized, and the prediction model is optimized with the Adam algorithm until the maximum number of iterations is reached, giving the optimal prediction model and, from its output, the matrix of interaction scores predicted for every user on every item;
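The forward pass of the prediction model in step two can be sketched in plain Python as follows. The text does not give the internal form of the attention interaction layer, so the softmax over a linear map of the concatenated vector, the reading of the dot multiplication as an element-wise product, the single ReLU hidden layer, the sigmoid output and all parameter names are assumptions. Training with the mean square error loss and Adam is omitted.

```python
import math
import random

def linear(x, W, b):
    """Dense layer: W has len(b) rows, each of length len(x)."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def init(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

def predict(u, i, params):
    p_u = params["user_emb"][u]      # one-hot vector times weight matrix = row lookup
    o_i = params["item_emb"][i]
    z = p_u + o_i                    # list concatenation, length 2E
    a = softmax(linear(z, params["W_att"], params["b_att"]))   # assumed attention form
    f = [zi * ai for zi, ai in zip(z, a)]                      # element-wise product
    h = [max(0.0, v) for v in linear(f, params["W1"], params["b1"])]  # ReLU hidden layer
    logit = linear(h, params["W2"], params["b2"])[0]
    return 1.0 / (1.0 + math.exp(-logit))                      # score in (0, 1)

rng = random.Random(0)
E, H, n_users, n_items = 4, 8, 3, 5   # toy sizes
params = {
    "user_emb": init(n_users, E, rng), "item_emb": init(n_items, E, rng),
    "W_att": init(2 * E, 2 * E, rng), "b_att": [0.0] * (2 * E),
    "W1": init(H, 2 * E, rng), "b1": [0.0] * H,
    "W2": init(1, H, rng), "b2": [0.0],
}
# Score matrix: one row per user, one column per item.
scores = [[predict(u, i, params) for i in range(n_items)] for u in range(n_users)]
```

With untrained random weights the scores are meaningless; the sketch only shows how the layers compose into the user-by-item score matrix used by the later steps.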

step three, setting the number of clusters to $K$ and letting $top$ denote the number of candidate items; selecting the $top$ items most preferred by user $s_u$ as the candidate item set $candidates_u$; taking the number of items shared by two users' candidate item sets as the similarity between those users; and grouping users with high mutual similarity into one class by a clustering algorithm, so that the $|S|$ users are divided into $K$ groups, giving the user set $U=\{U_1,U_2,\ldots,U_j,\ldots,U_K\}$, where $U_j$ denotes the user group of the $j$-th category, $U_j=\{P_{j,1},P_{j,2},\ldots,P_{j,m},\ldots,P_{j,M}\}$, and $P_{j,m}$ denotes the $m$-th user in $U_j$;
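One way to realize the grouping of step three is sketched below. The text does not name the clustering algorithm, so the greedy average-linkage merging is a stand-in, and the function names are hypothetical. The candidate sets are toy data built so that the overlaps match the FIG. 2 example (7 shared items within each pair).

```python
from itertools import combinations

def overlap(a, b):
    """Similarity of two users: number of items shared by their candidate sets."""
    return len(set(a) & set(b))

def cluster_users(candidates, k):
    """Greedy average-linkage merging on candidate overlap until k groups remain."""
    groups = [[u] for u in candidates]
    while len(groups) > k:
        def avg_sim(pair):
            g, h = groups[pair[0]], groups[pair[1]]
            total = sum(overlap(candidates[u], candidates[v]) for u in g for v in h)
            return total / (len(g) * len(h))
        # combinations yields a < b, so deleting index b leaves index a valid.
        a, b = max(combinations(range(len(groups)), 2), key=avg_sim)
        groups[a] = groups[a] + groups[b]
        del groups[b]
    return groups

candidates = {
    "s1": list(range(1, 11)),  "s2": list(range(4, 14)),    # share items 4..10
    "s3": list(range(21, 31)), "s4": list(range(24, 34)),   # share items 24..30
}
groups = cluster_users(candidates, k=2)
```

Here `groups` comes out as `[["s1", "s2"], ["s3", "s4"]]`, the same partition as in the figure example.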

step four, initializing the population:

step 4.1, defining the current iteration number as $l$ and the maximum iteration number as $L$, and letting $N$ be the number of individuals in a population; using real-number coding, each of the $N$ recommendation results of every user in the user group $U_j$ of the $j$-th category is encoded as an individual of length $T$, each decision variable of which is the serial number of a recommended item; the $N$ recommendation results of one user thus form one population, the $N$ recommendation results of the $m$-th user $P_{j,m}$ are recorded as the $m$-th population of generation $l$, and the populations of all users in $U_j$ together form the generation-$l$ populations of $U_j$;

the $n$-th recommendation result of the $m$-th user $P_{j,m}$ in $U_j$ is recorded as the $n$-th individual $X^{l}_{j,m,n}=(x^{l}_{j,m,n,1},\ldots,x^{l}_{j,m,n,T})$ of the $m$-th population of generation $l$, where $x^{l}_{j,m,n,t}$ denotes the serial number of the $t$-th recommended item in the $n$-th recommendation result of user $P_{j,m}$ in generation $l$;

step 4.2, based on the interaction score matrix predicted for every user on every item, randomly selecting the serial numbers of $T$ distinct items from the candidate item set $candidates_{j,m}$ of the $m$-th user $P_{j,m}$ to initialize $X^{l}_{j,m,n}$;
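The initialization of step 4.2 amounts to sampling without replacement from each user's candidate set. A short sketch, in which the function name, the user keys and the toy candidate sets are hypothetical, while $N=10$ and $T=10$ are the values given in the embodiment:

```python
import random

def init_population(candidate_items, n_individuals, length, rng):
    """One population: n_individuals recommendation lists of `length` distinct items."""
    return [rng.sample(candidate_items, length) for _ in range(n_individuals)]

rng = random.Random(42)
# Toy candidate sets for two users of one group.
group_candidates = {"P_j1": list(range(1, 31)), "P_j2": list(range(20, 50))}
N, T = 10, 10
populations = {user: init_population(c, N, T, rng) for user, c in group_candidates.items()}
```

Each user of the group thus owns one population, so a group of $M$ users yields $M$ populations that evolve together.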

step 4.3, obtaining by formula (1) the accuracy index of the $n$-th recommendation result of the $m$-th user $P_{j,m}$ in generation $l$; in formula (1), the score term denotes the score that user $P_{j,m}$ gives to the item with serial number $x^{l}_{j,m,n,t}$, the $t$-th entry of that recommendation result;

obtaining by formula (2) the novelty index of the $n$-th recommendation result of the $m$-th user $P_{j,m}$ in generation $l$; in formula (2), $popular_t$ denotes the popularity of the item with serial number $x^{l}_{j,m,n,t}$;

obtaining by formula (3) the diversity index of the $n$-th recommendation result of the $m$-th user $P_{j,m}$ in generation $l$; in formula (3), $label_{x^{l}_{j,m,n,t}}$ denotes the category label of the item with serial number $x^{l}_{j,m,n,t}$, and $label_{all}$ denotes the category labels of all items in the user-item interaction data set;

constructing by formula (4) the multi-objective optimization function of generation $l$, which maximizes the accuracy, novelty and diversity indices;
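Formulas (1) to (3) are not legible in this text, so the sketch below uses common stand-in definitions rather than the patent's own: accuracy as the mean predicted score, novelty as the mean self-information of item popularity, and diversity as the share of all category labels covered by the list. All three forms, the function names and the toy data are assumptions.

```python
import math

def accuracy(rec, scores):
    """Assumed form: mean predicted score of the recommended items."""
    return sum(scores[i] for i in rec) / len(rec)

def novelty(rec, popularity):
    """Assumed form: mean of -log2(popularity), popularity being a fraction in (0, 1]."""
    return sum(-math.log2(popularity[i]) for i in rec) / len(rec)

def diversity(rec, labels, all_labels):
    """Assumed form: distinct category labels in the list over all category labels."""
    return len({labels[i] for i in rec}) / len(all_labels)

def objectives(rec, scores, popularity, labels, all_labels):
    """Objective vector to be maximized, in the spirit of formula (4)."""
    return (accuracy(rec, scores), novelty(rec, popularity),
            diversity(rec, labels, all_labels))

scores = {1: 0.9, 2: 0.5, 3: 0.1}            # toy predicted scores for one user
popularity = {1: 0.5, 2: 0.25, 3: 0.125}     # toy popularity fractions
labels = {1: "a", 2: "a", 3: "b"}
all_labels = {"a", "b", "c"}
f_12 = objectives([1, 2], scores, popularity, labels, all_labels)
f_13 = objectives([1, 3], scores, popularity, labels, all_labels)
```

The two toy lists illustrate the conflict the method addresses: `[1, 2]` is more accurate, while `[1, 3]` is more novel and more diverse, so neither dominates the other.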

step five, performing information migration among individuals within the same user group and among populations across different user groups, and iteratively selecting the optimal solution for each user through environment selection.

The large-scale user recommendation method based on evolutionary multitasking of the invention is further characterized in that step five comprises:

step 5.1, performing information migration among individuals within the same user group:

step 5.1.1, using binary tournament selection based on formula (4), selecting $2\times N$ recommendation results to take part in evolution, giving the mating pool of generation $l$;

step 5.1.2, selecting two generation-$l$ recommendation results from the mating pool as parents and performing a crossover operation on them to obtain two generation-$l$ crossover recommendation results, the $t$-th entry of each parent being the serial number of its $t$-th recommended item;

step 5.1.3, performing a mutation operation on each crossover recommendation result with probability $m_P$: randomly selecting a number $r'$ from $\{1,2,3,\ldots,T\}$, and replacing the serial number of the $r'$-th recommended item with the serial number of a different item randomly selected from the candidate item set of the user to whom the result belongs, thereby obtaining a generation-$l$ mutated recommendation result;

step 5.1.4, after the crossover and mutation operations of steps 5.1.2 to 5.1.3 have been applied to all recommendation results in the mating pool, the mutated recommendation results of all populations of generation $l$ are obtained; these are merged with the generation-$l$ populations of the user group $U_j$ of the $j$-th category to form $M$ new populations of generation $l$; the fitness of each recommendation result in the $M$ new populations is evaluated by formula (4), environment selection is carried out on the new populations by non-dominated sorting and crowding distance, and the best $N$ recommendation results are retained to form the $M$ populations of generation $l+1$;
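The environment selection of step 5.1.4 combines non-dominated sorting with crowding distance, the pairing popularized by NSGA-II. A compact sketch for maximized objective vectors follows; the function names and the toy objective values are illustrative, and the simple quadratic-time sorting is chosen for readability rather than speed.

```python
def dominates(a, b):
    """a dominates b when maximizing: no worse everywhere, strictly better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def non_dominated_fronts(objs):
    """Peel off successive fronts of mutually non-dominated indices."""
    remaining = set(range(len(objs)))
    fronts = []
    while remaining:
        front = [i for i in remaining
                 if not any(dominates(objs[j], objs[i]) for j in remaining if j != i)]
        fronts.append(sorted(front))
        remaining -= set(front)
    return fronts

def crowding_distance(front, objs):
    """Larger distance means a less crowded region; boundary points get infinity."""
    dist = {i: 0.0 for i in front}
    for k in range(len(objs[0])):
        ordered = sorted(front, key=lambda i: objs[i][k])
        lo, hi = objs[ordered[0]][k], objs[ordered[-1]][k]
        dist[ordered[0]] = dist[ordered[-1]] = float("inf")
        if hi == lo:
            continue
        for prev, cur, nxt in zip(ordered, ordered[1:], ordered[2:]):
            dist[cur] += (objs[nxt][k] - objs[prev][k]) / (hi - lo)
    return dist

def environment_selection(objs, n_keep):
    """Indices of the n_keep survivors: whole fronts first, then the least crowded."""
    chosen = []
    for front in non_dominated_fronts(objs):
        if len(chosen) + len(front) <= n_keep:
            chosen.extend(front)
        else:
            dist = crowding_distance(front, objs)
            front.sort(key=lambda i: dist[i], reverse=True)
            chosen.extend(front[:n_keep - len(chosen)])
            break
    return chosen

objs = [(1, 1), (2, 2), (3, 1), (1, 3), (0, 0)]   # toy two-objective values
fronts = non_dominated_fronts(objs)
keep_four = environment_selection(objs, 4)
keep_two = environment_selection(objs, 2)
```

In the toy data, indices 1, 2 and 3 form the first front; when only two survivors are allowed, the two boundary points of that front are kept because their crowding distance is infinite.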

step 5.2, performing information migration among populations across different user groups:

if more than half of the recommendation results of the $m$-th user $P_{j,m}$ remain unchanged over several successive generations, computing the similarity between the user group $U_j$ of the $j$-th category and the user groups of the other categories, selecting the user group with the highest similarity, and performing crossover and mutation operations between its recommendation results and all generation-$l$ recommendation results of user $P_{j,m}$, so that the $m$-th population of generation $l+1$ is obtained by the procedure of step 5.1.4;

step 5.3, assigning $l+1$ to $l$ and judging whether $l$ has reached $L$; if not, returning to step 4.3 and continuing in sequence; otherwise, selecting one individual from the $m$-th population of generation $L$ as the recommendation result of the $m$-th user $P_{j,m}$ in the user group $U_j$ of the $j$-th category.
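The trigger for cross-group migration in step 5.2 can be sketched as a stagnation check on one user's population history. The text says only "several successive generations", so the window of 3, the function names and the toy populations are assumptions; how the similarity between user groups is computed is not specified and is left out.

```python
def unchanged_fraction(previous, current):
    """Share of current recommendation results already present in the previous generation."""
    prev = {tuple(ind) for ind in previous}
    return sum(tuple(ind) in prev for ind in current) / len(current)

def is_stagnant(history, window=3):
    """True when more than half of the results stayed unchanged in each of the last `window` steps."""
    if len(history) < window + 1:
        return False
    recent = history[-(window + 1):]
    return all(unchanged_fraction(a, b) > 0.5 for a, b in zip(recent, recent[1:]))

g = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]            # toy population of one user
changed = [[9, 9, 9], [8, 8, 8], [1, 2, 3]]      # only one result survives
stuck = is_stagnant([g, g, g, g])
moving = is_stagnant([g, g, g, changed])
```

When the check fires for a user, that user's population would be paired with results from the most similar other group before the procedure of step 5.1.4 is applied.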

Step 5.1.2 comprises:

step a, judging whether the two selected parent recommendation results belong to the same user; if so, executing step b, otherwise executing step c;

step b, performing a crossover operation on the two parents with probability $c_P$: randomly selecting a number $r$ from $\{1,2,3,\ldots,T\}$ and exchanging the first $r$ entries of one parent with the first $r$ entries of the other, giving two generation-$l$ crossover recommendation results, the $r$-th entry of each parent being the serial number of its $r$-th recommended item;

step c, merging the two parents into one new generation-$l$ recommendation result, using formula (1) as the fitness value, and using binary tournament selection to pick the serial numbers of $T$ items from the merged result twice, forming two generation-$l$ crossover recommendation results, one taken as the crossover recommendation result generated for the first parent and the other as that generated for the second parent.
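The two crossover branches of steps a to c can be sketched as follows. Several details are assumptions not fixed by the text: the per-item predicted score stands in for the formula (1) fitness inside the tournament, a tournament winner is removed from the pool so that each child keeps distinct items, and each child is scored with the scores of the user it is generated for. All function names and toy data are hypothetical.

```python
import random

def prefix_crossover(p1, p2, rng):
    """Step b: swap the first r entries of two results of the same user."""
    r = rng.randint(1, len(p1))
    return p2[:r] + p1[r:], p1[:r] + p2[r:]

def tournament_pick(pool, scores, length, rng):
    """Step c helper: fill one child by repeated binary tournaments on predicted score."""
    pool = list(pool)
    child = []
    while len(child) < length:
        a, b = rng.choice(pool), rng.choice(pool)
        winner = a if scores[a] >= scores[b] else b
        child.append(winner)
        pool.remove(winner)      # assumption: keep items distinct within a child
    return child

def crossover(p1, user1, p2, user2, scores1, scores2, c_p, rng):
    if user1 == user2:                           # step a, then step b
        if rng.random() < c_p:
            return prefix_crossover(p1, p2, rng)
        return list(p1), list(p2)
    merged = list(dict.fromkeys(p1 + p2))        # step c: merged pool, duplicates dropped
    return (tournament_pick(merged, scores1, len(p1), rng),
            tournament_pick(merged, scores2, len(p2), rng))

rng = random.Random(1)
p1, p2 = list(range(1, 11)), list(range(11, 21))     # toy recommendation lists
scores = {i: i / 100 for i in range(1, 21)}          # toy predicted scores
same_user = crossover(p1, "P_j1", p2, "P_j1", scores, scores, 1.0, rng)
cross_user = crossover(p1, "P_j1", p2, "P_j2", scores, scores, 1.0, rng)
```

The cross-user branch is where information moves between tasks: items that appear only in another user's list can enter a child if they score well for the receiving user.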

The electronic device of the invention comprises a memory and a processor, wherein the memory is used for storing a program for supporting the processor to execute any large-scale user recommendation method, and the processor is configured to execute the program stored in the memory.

The invention provides a computer readable storage medium, wherein a computer program is stored on the computer readable storage medium, and the computer program is executed by a processor to execute the steps of any large-scale user recommendation method.

Compared with the prior art, the invention has the beneficial effects that:

1. The invention applies evolutionary multitasking to the optimization of a multi-objective recommendation system for the first time. Conventional optimization methods generate an optimized recommendation result for only one user at a time and ignore the correlation between different tasks. The invention generates optimized recommendation results for multiple users simultaneously in a single optimization process, which greatly reduces the computation required by the recommendation algorithm.

2. To avoid the negative transfer that simultaneous optimization may cause, the invention uses a clustering algorithm to treat users with similar interests as one user group during optimization, and designs a new operator to accelerate convergence, thereby increasing recommendation speed.

3. By designing a population optimization scheme, genetic operators, and information migration strategies between different populations, the invention reduces computation and improves recommendation efficiency while preserving recommendation accuracy.

Drawings

FIG. 1 is a flow chart of the method of the present invention;

FIG. 2 is a diagram illustrating an example of user clustering in accordance with the present invention;

FIG. 3 is an exemplary diagram of population initialization in accordance with the present invention;

FIG. 4 is a diagram illustrating the generation of a next generation population according to the present invention.

Detailed Description

In this embodiment, a large-scale user recommendation method based on evolutionary multitasking is carried out, as shown in FIG. 1, according to the following steps:

step one, acquiring the data on users and items;

step two, obtaining the users' scoring matrix over the items through a prediction model, wherein the prediction model comprises a coding layer, a fully connected layer, an attention interaction layer, an interaction output layer and a prediction layer;

step 2.1, the coding layer applies one-hot coding to user $s_u$ and the $i$-th item $q_i$ to obtain the sparse vector $\theta_u$ of user $s_u$ and the sparse vector of item $q_i$; the two sparse vectors are then each mapped to an $E$-dimensional vector by the fully connected layer, giving the representation vector $p_u$ of user $s_u$ and the representation vector $o_i$ of item $q_i$;

step 2.2, the attention interaction layer processes $p_u$ and $o_i$ and outputs the attention vector $a_{u,i}$;

step 2.3, the interaction output layer concatenates $p_u$ and $o_i$ and applies dot multiplication between the concatenated vector and the attention vector $a_{u,i}$ to obtain the interaction vector $f_{u,i}$ of user $s_u$ and item $q_i$;

step 2.4, the prediction layer passes the interaction vector $f_{u,i}$ through multiple fully connected layers and outputs the predicted interaction score $r_{u,i}$;

step 2.5, the mean square error of the predicted interaction scores is used as the loss function to be minimized, and the prediction model is optimized with the Adam algorithm until the maximum number of iterations is reached, giving the optimal prediction model and, from its output, the matrix of interaction scores predicted for every user on every item;

step three, setting the number of clusters to $K$ and letting $top$ denote the number of candidate items; selecting the $top$ items most preferred by user $s_u$ as the candidate item set $candidates_u$; taking the number of items shared by two users' candidate item sets as the similarity between those users; and grouping users with high mutual similarity into one class by a clustering algorithm, so that the $|S|$ users are divided into $K$ groups, where $K=|S|/10$ in this embodiment, giving the user set $U=\{U_1,U_2,\ldots,U_j,\ldots,U_K\}$, where $U_j$ denotes the user group of the $j$-th category, $U_j=\{P_{j,1},P_{j,2},\ldots,P_{j,m},\ldots,P_{j,M}\}$, and $P_{j,m}$ denotes the $m$-th user in $U_j$. In the example of FIG. 2, suppose there are 4 users $s_1,s_2,s_3,s_4$; counting the items shared by the candidate item sets of different users gives a similarity of 7 between $s_1$ and $s_2$ and a similarity of 7 between $s_3$ and $s_4$, so the final clustering result places $s_1$ and $s_2$ in class $U_1$ and $s_3$ and $s_4$ in class $U_2$.

step four, initializing the population:

step 4.1, defining the current iteration number as $l$ and the maximum iteration number as $L$, with $L=100$ in this embodiment, and letting $N$ be the number of individuals in a population, with $N=10$ in this embodiment; using real-number coding, each of the $N$ recommendation results of every user in the user group $U_j$ of the $j$-th category is encoded as an individual of length $T$, with $T=10$ in this embodiment, each decision variable of which is the serial number of a recommended item; the $N$ recommendation results of one user thus form one population, and the $N$ recommendation results of the $m$-th user $P_{j,m}$ are recorded as the $m$-th population of generation $l$; $x^{l}_{j,m,n,t}$ denotes the serial number of the $t$-th recommended item in the $n$-th recommendation result of user $P_{j,m}$ in generation $l$. The population individuals consist of the recommendation results of all users in the group, encoded as a matrix; FIG. 3 shows an example of the population individuals of one class with 2 users and a recommendation length of 10.

step 4.2, initializing each individual with the serial numbers of $T$ distinct items randomly selected from the candidate item set of its user;

step 4.3, obtaining by formula (1) the accuracy index of each recommendation result; in formula (1), the score term denotes the score that the $m$-th user $P_{j,m}$ gives to the item with serial number $x^{l}_{j,m,n,t}$;

obtaining by formula (2) the novelty index of each recommendation result; in formula (2), $popular_t$ denotes the popularity of the item with serial number $x^{l}_{j,m,n,t}$;

obtaining by formula (3) the diversity index of each recommendation result; in formula (3), $label_{x^{l}_{j,m,n,t}}$ denotes the category label of the item with serial number $x^{l}_{j,m,n,t}$, and $label_{all}$ denotes the category labels of all items in the user-item interaction data set;

step five, performing information migration among individuals within the same user group and among populations across different user groups, and iteratively selecting the optimal solution for each user through environment selection;

step 5.1.1, using binary tournament selection based on formula (4) to build the mating pool of generation $l$;

step 5.1.2, selecting two generation-$l$ recommendation results from the mating pool as parents and performing a crossover operation on them, the $t$-th entry of each parent being the serial number of its $t$-th recommended item:

step a, judging whether the two parents belong to the same user;

step b, performing a crossover operation on the two parents with probability $c_P$: randomly selecting a number $r$ from $\{1,2,3,\ldots,T\}$ and exchanging the first $r$ entries of the two parents, the $r$-th entry of each parent being the serial number of its $r$-th recommended item;

step c, merging the two parents and forming two crossover recommendation results, one generated for each parent;

step 5.1.3, performing a mutation operation on each crossover recommendation result with probability $m_P$: randomly selecting a number $r'$ from $\{1,2,3,\ldots,T\}$ and replacing the serial number of the $r'$-th recommended item with a different item serial number, thereby obtaining a generation-$l$ mutated recommendation result.

FIG. 4 gives an example of how the crossover and mutation operators work. The user group in the example contains 2 users, $s_1$ and $s_2$; $X_1$ and $X_2$ are the recommendation results of $s_1$ and $s_2$, each of length 10, and the probabilities $c_P$ and $m_P$ are both 0.5. The crossover operation is performed first. Suppose the generated random number is 0.3, which is below $c_P$, so crossover is applied to $X_1$ and $X_2$. For $X_1$, a number is randomly selected from $\{1,2,\ldots,10\}$; if it is 3, crossing (1,23,15,9,5,12,4,18,22,14) with (18,2,16,8,20,24,4,25,17,30) gives (1,23,15,8,20,24,4,25,17,30) and (18,2,16,9,5,12,4,18,22,14). Likewise for $X_2$, if the randomly selected number is 4, crossing (10,2,7,6,5,3,11,13,17,19) with (10,2,15,6,21,12,11,27,28,29) gives (10,2,7,6,21,12,11,27,28,29) and (10,2,15,6,5,3,11,13,17,19), so that two offspring are obtained.

The mutation operator is then applied to each offspring. For the offspring of $s_1$, suppose the generated random number is 0.4, which is below $m_P$, so mutation is applied: a number is randomly selected from $\{1,2,\ldots,10\}$, say 4, and a new item 6 is selected from the candidate item set $candidates_{s_1}$ of user $s_1$ to replace the original item 8. A similar operation is performed for $s_2$, and the results are two new recommendation results.
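The numbers in the FIG. 4 example can be replayed with deterministic versions of the two operators, taking the crossover point and the mutation position as arguments instead of drawing them at random. The function names are hypothetical.

```python
def swap_prefix(p1, p2, r):
    """Exchange the first r entries of two equal-length recommendation lists."""
    return p2[:r] + p1[r:], p1[:r] + p2[r:]

def mutate_at(rec, position, new_item):
    """Replace the entry at a 1-based position with a different item."""
    assert new_item != rec[position - 1]
    out = list(rec)
    out[position - 1] = new_item
    return out

# First pair from the example, crossover point r = 3.
a = [1, 23, 15, 9, 5, 12, 4, 18, 22, 14]
b = [18, 2, 16, 8, 20, 24, 4, 25, 17, 30]
child_1, child_2 = swap_prefix(a, b, 3)

# Second pair from the example, crossover point r = 4.
c = [10, 2, 7, 6, 5, 3, 11, 13, 17, 19]
d = [10, 2, 15, 6, 21, 12, 11, 27, 28, 29]
child_3, child_4 = swap_prefix(c, d, 4)

# Mutation from the example: position 4 holds item 8, which is replaced by item 6.
mutated = mutate_at(child_2, 4, 6)
```

Swapping the first $r$ entries produces the same pair of children as swapping the tails, which is why the children match the example regardless of the order in which they are listed.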

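step 5.1.4, after crossover and mutation have been applied to the generation-$l$ mating pool, the mutated recommendation results are merged with the generation-$l$ populations.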

In this embodiment, an electronic device includes a memory for storing a program supporting the processor to execute the above method, and a processor configured to execute the program stored in the memory.

In this embodiment, a computer-readable storage medium stores a computer program that, when executed by a processor, performs the steps of the method described above.

Claims

1. A large-scale user recommendation method based on evolutionary multitasking, characterized by comprising the following steps:

step one, acquiring the data on users and items;

step 2.1, the coding layer applies one-hot coding to user $s_u$ and the $i$-th item $q_i$ to obtain the sparse vector $\theta_u$ of user $s_u$ and the sparse vector of item $q_i$;

step four, initializing the population:

step 4.1, defining the current iteration number as $l$ and the maximum iteration number as $L$, and letting $N$ be the number of individuals in a population; using real-number coding, each of the $N$ recommendation results of every user in the user group $U_j$ of the $j$-th category is encoded as an individual of length $T$, each decision variable of which is the serial number of a recommended item; the $N$ recommendation results of one user thus form one population, and the $N$ recommendation results of the $m$-th user $P_{j,m}$ are recorded as the $m$-th population of generation $l$; each individual is then initialized;

obtaining by formula (1) the accuracy index, wherein in formula (1) the score term denotes the score that the $m$-th user $P_{j,m}$ gives to the item with serial number $x^{l}_{j,m,n,t}$;

obtaining by formula (2) the novelty index, wherein in formula (2) $popular_t$ denotes the popularity of the item with serial number $x^{l}_{j,m,n,t}$;

obtaining by formula (3) the diversity index, wherein in formula (3) $label_{x^{l}_{j,m,n,t}}$ denotes the category label of the item with serial number $x^{l}_{j,m,n,t}$, and $label_{all}$ denotes the category labels of all items in the user-item interaction data set;

constructing by formula (4) the multi-objective optimization function of generation $l$.

2. The large-scale user recommendation method based on evolutionary multitasking according to claim 1, characterized in that step five comprises:

step 5.1.1, using binary tournament selection based on formula (4) to build the mating pool of generation $l$;

step 5.1.2, selecting two generation-$l$ recommendation results from the mating pool as parents and performing a crossover operation on them to obtain two generation-$l$ crossover recommendation results, the $t$-th entry of each parent being the serial number of its $t$-th recommended item;

step 5.1.3, performing a mutation operation on each crossover recommendation result with probability $m_P$: randomly selecting a number $r'$ from $\{1,2,3,\ldots,T\}$ and replacing the serial number of the $r'$-th recommended item with a different item serial number, thereby obtaining a generation-$l$ mutated recommendation result;

step 5.1.4, after crossover and mutation have been applied to the generation-$l$ mating pool, the mutated recommendation results are merged with the generation-$l$ populations;

step 5.3, assigning $l+1$ to $l$ and judging whether $l$ has reached $L$; if not, returning to step 4.3 and continuing in sequence; otherwise, selecting one individual from the $m$-th population of generation $L$ as the recommendation result of the $m$-th user $P_{j,m}$ in the user group $U_j$ of the $j$-th category.

3. The large-scale user recommendation method based on evolutionary multitasking according to claim 2, characterized in that step 5.1.2 comprises:

step a, judging whether the two selected parent recommendation results belong to the same user;

step b, performing a crossover operation on the two parents with probability $c_P$: randomly selecting a number $r$ from $\{1,2,3,\ldots,T\}$ and exchanging the first $r$ entries of the two parents, the $r$-th entry of each parent being the serial number of its $r$-th recommended item;

step c, merging the two parents and forming two crossover recommendation results, one taken as the crossover recommendation result generated for the first parent and the other as that generated for the second parent.

4. An electronic device comprising a memory and a processor, wherein the memory is for storing a program supporting the processor to perform the large-scale user recommendation method of any one of claims 1-3, the processor being configured to execute the program stored in the memory.

5. A computer readable storage medium having stored thereon a computer program, characterized in that the computer program when executed by a processor performs the steps of the large-scale user recommendation method according to any of claims 1-3.