CN111651672A - Time-interval user activity recommendation method and device based on deep learning - Google Patents

Time-interval user activity recommendation method and device based on deep learning

Info

Publication number
CN111651672A
Authority
CN
China
Prior art keywords
activity
user
data
neural network
time
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010463499.5A
Other languages
Chinese (zh)
Inventor
黄新恩
黄茉
翁增仁
胡锦锋
郑升尉
陈海量
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fujian Boss Software Co ltd
Original Assignee
Fujian Boss Software Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fujian Boss Software Co ltd
Priority to CN202010463499.5A
Publication of CN111651672A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9537 Spatial or temporal dependent retrieval, e.g. spatiotemporal queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/048 Activation functions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Artificial Intelligence (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to a time-interval user activity recommendation method based on deep learning, comprising the following steps: step 1, when a user logs in to a system, obtaining the user data and the current time data of the user, and at the same time acquiring activity data of the N activities provided by the system, where N is the total number of activities; step 2, combining the user data, the current time data and each activity data to obtain N groups of input data, where each group of input data comprises the user data, the time data and one activity data; step 3, preprocessing each group of input data, converting text-type data into numerical data; step 4, inputting the N groups of preprocessed input data into a deep neural network, which outputs a prediction score for each activity for the user at the current time; and step 5, recommending the activity with the highest prediction score to the user.

Description

Time-interval user activity recommendation method and device based on deep learning
Technical Field
The invention relates to a time-interval user activity recommendation method and device based on deep learning, and belongs to the field of computer artificial intelligence recommendation.
Background
A recommendation system is an application that infers the goods or services a user currently needs or is interested in from the user's historical behavior, social relations, points of interest, context and other information. Its main task is to connect users with information. For users, a recommendation system helps them find articles or services they like, supports decision-making and surfaces new things they may like; for merchants, it provides personalized services to users, improves user trust and stickiness, and increases revenue.
In this age of information explosion, information overload has driven recommendation systems into daily life: e-commerce, movie and video websites, personalized music websites, social networks, personalized reading, location-based services, personalized mail, personalized advertising, and so on. Whether you are shopping online, ordering takeout, listening to music, watching dramas, checking mail or browsing feeds, a recommendation system is pushing content you may be interested in without your noticing.
A personalized recommendation system relies on users' behavior data and generally exists as an application within different websites. Recommendation systems are widely used in daily life, for example in online shopping, music software and movie websites, where personalized content is recommended according to personal preferences, habits and other information. However, few systems analyze which activities a user is interested in during a specific time period, based on the user's operations together with the time at which they occur, and then make recommendations accordingly. Here, an activity refers to an action performed by a user in a particular system; it can also be understood as the user's operational behavior on a sub-page of a website, according to daily usage habits or needs.
Disclosure of Invention
To solve this technical problem, the invention provides a time-interval user activity recommendation method based on deep learning. The method collects user data, activity data and time data, and uses a deep neural network to analyze, from the month and the time period in which user operations occur, which activities the user is interested in during a specific time period, thereby achieving accurate time-interval user activity recommendation.
The first technical scheme of the invention is as follows:
A time-interval user activity recommendation method based on deep learning comprises the following steps: step 1, when a user logs in to a system, obtaining the user data and the current time data of the user, and at the same time acquiring activity data of the N activities provided by the system, where N is the total number of activities; step 2, combining the user data, the current time data and each activity data to obtain N groups of input data, where each group of input data comprises the user data, the time data and one activity data; step 3, preprocessing each group of input data, converting text-type data into numerical data; step 4, inputting the N groups of preprocessed input data into a deep neural network, which outputs a prediction score for each activity for the user at the current time; and step 5, recommending the activity with the highest prediction score to the user.
More preferably, the deep neural network comprises a first neural network for processing user data, a second neural network for processing activity data, a third neural network for processing time data, and an overall fully connected layer; the input of the first neural network is the preprocessed user data and its output is a user feature vector; the input of the second neural network is the preprocessed activity data and its output is an activity feature vector; the input of the third neural network is the preprocessed time data and its output is a time feature vector; the user feature vector, the activity feature vector and the time feature vector are concatenated and input into the overall fully connected layer, whose output is the prediction score for the user performing each activity at that time.
Preferably, the user data in step 2 includes a user ID and a user occupation, the activity data includes an activity ID, an activity name and an activity type, and the time data is a date; in step 3, each group of input data is preprocessed, specifically: respectively converting the user ID and the activity ID into digital numbers; establishing a mapping table of user occupation and activity type numbers, and converting the user occupation and activity types into corresponding digital numbers according to the mapping table; unifying the text length of the activity name, and filling the blank part with numbers; keeping the month information in the time data unchanged, dividing the day information according to time periods, wherein each time period corresponds to a digital number.
Preferably, the first neural network comprises a first embedding layer and a first fully connected layer; the first embedding layer converts the preprocessed user data into vectors with dimensions (a, b), yielding a user ID feature and a user occupation feature; the user ID feature and the user occupation feature are concatenated and input into the first fully connected layer, which outputs the user feature vector; the second neural network comprises a second embedding layer, a text convolutional neural network and a second fully connected layer; the second embedding layer converts the preprocessed activity ID and activity type into vectors with dimensions (a, b), yielding an activity ID feature and an activity type feature; the text convolutional neural network extracts semantic features from the activity name and outputs an activity name feature with dimensions (a, b); the activity ID feature, the activity type feature and the activity name feature are concatenated and input into the second fully connected layer, which outputs the activity feature vector; the third neural network comprises a third embedding layer and a third fully connected layer; the third embedding layer converts the month information and the time-period information into vectors with dimensions (a, b/2), yielding a month feature and a within-month time-period feature; the month feature and the within-month time-period feature are input into the third fully connected layer, which outputs the time feature vector.
Preferably, the deep neural network is trained as follows: creating a training set: collecting previously generated user activity data as training samples, where the training set comprises a plurality of training samples, each training sample comprises user data, activity data and time data, and the real score is calculated from the length of time the user spent operating the activity; preprocessing the training samples: applying the preprocessing of step 3 to the user data, the activity data and the time data; training the deep neural network: inputting the preprocessed data into the deep neural network to obtain a prediction score for each training sample; using an MSE (mean squared error) loss function to calculate the average error between the real values and the predicted values of samples 1 to n; the loss function is:
$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - y_i'\right)^2$$
where y_i is the real score of the i-th training sample and y_i' is the prediction score output by the deep neural network; using stochastic gradient descent, adjusting the parameter values of the deep neural network along the gradient direction that reduces the average error, and calculating the updated weight values and bias values of the deep neural network; and reconfiguring the deep neural network with the updated weight values and bias values and training it again with the training samples until the minimum of the loss function is found, completing the convergence of the deep neural network.
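As a worked illustration of the update described above, and assuming a learning rate η and a parameter vector θ (symbols that are not named in the source), one stochastic gradient descent step on the MSE loss, the mean of the squared differences between y_i and y_i', can be written as follows. The second expression is the derivative of the MSE with respect to a single prediction y_i', which back propagation then carries through the network.

```latex
\theta \leftarrow \theta - \eta\,\nabla_{\theta}\,\mathrm{MSE},
\qquad
\frac{\partial\,\mathrm{MSE}}{\partial y_i'} = -\frac{2}{n}\left(y_i - y_i'\right)
```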
The invention also provides a time-interval user activity recommendation device based on deep learning.
The second technical scheme of the invention is as follows:
A time-interval user activity recommendation device based on deep learning comprises a processor and a memory, the memory storing instructions, and the processor executing the instructions to perform the following steps: step 1, when a user logs in to a system, obtaining the user data and the current time data of the user, and at the same time acquiring activity data of the N activities provided by the system, where N is the total number of activities; step 2, combining the user data, the current time data and each activity data to obtain N groups of input data, where each group of input data comprises the user data, the time data and one activity data; step 3, preprocessing each group of input data, converting text-type data into numerical data; step 4, inputting the N groups of preprocessed input data into a deep neural network, which outputs a prediction score for each activity for the user at the current time; and step 5, recommending the activity with the highest prediction score to the user.
More preferably, the deep neural network comprises a first neural network for processing user data, a second neural network for processing activity data, a third neural network for processing time data, and an overall fully connected layer; the input of the first neural network is the preprocessed user data and its output is a user feature vector; the input of the second neural network is the preprocessed activity data and its output is an activity feature vector; the input of the third neural network is the preprocessed time data and its output is a time feature vector; the user feature vector, the activity feature vector and the time feature vector are concatenated and input into the overall fully connected layer, whose output is the prediction score for the user performing each activity at that time.
Preferably, the user data in step 2 includes a user ID and a user occupation, the activity data includes an activity ID, an activity name and an activity type, and the time data is a date; in step 3, each group of input data is preprocessed, specifically: respectively converting the user ID and the activity ID into digital numbers; establishing a mapping table of user occupation and activity type numbers, and converting the user occupation and activity types into corresponding digital numbers according to the mapping table; unifying the text length of the activity name, and filling the blank part with numbers; keeping the month information in the time data unchanged, dividing the day information according to time periods, wherein each time period corresponds to a digital number.
Preferably, the first neural network comprises a first embedding layer and a first fully connected layer; the first embedding layer converts the preprocessed user data into vectors with dimensions (a, b), yielding a user ID feature and a user occupation feature; the user ID feature and the user occupation feature are concatenated and input into the first fully connected layer, which outputs the user feature vector; the second neural network comprises a second embedding layer, a text convolutional neural network and a second fully connected layer; the second embedding layer converts the preprocessed activity ID and activity type into vectors with dimensions (a, b), yielding an activity ID feature and an activity type feature; the text convolutional neural network extracts semantic features from the activity name and outputs an activity name feature with dimensions (a, b); the activity ID feature, the activity type feature and the activity name feature are concatenated and input into the second fully connected layer, which outputs the activity feature vector; the third neural network comprises a third embedding layer and a third fully connected layer; the third embedding layer converts the month information and the time-period information into vectors with dimensions (a, b/2), yielding a month feature and a within-month time-period feature; the month feature and the within-month time-period feature are input into the third fully connected layer, which outputs the time feature vector.
Preferably, the deep neural network is trained as follows: creating a training set: collecting previously generated user activity data as training samples, where the training set comprises a plurality of training samples, each training sample comprises user data, activity data and time data, and the real score is calculated from the length of time the user spent operating the activity; preprocessing the training samples: applying the preprocessing of step 3 to the user data, the activity data and the time data; training the deep neural network: inputting the preprocessed data into the deep neural network to obtain a prediction score for each training sample; using an MSE (mean squared error) loss function to calculate the average error between the real values and the predicted values of samples 1 to n; the loss function is:
$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - y_i'\right)^2$$
where y_i is the real score of the i-th training sample and y_i' is the prediction score output by the deep neural network; using stochastic gradient descent, adjusting the parameter values of the deep neural network along the gradient direction that reduces the average error, and calculating the updated weight values and bias values of the deep neural network; and reconfiguring the deep neural network with the updated weight values and bias values and training it again with the training samples until the minimum of the loss function is found, completing the convergence of the deep neural network.
The invention has the following beneficial effects:
1. The invention relates to a time-interval user activity recommendation method and device based on deep learning.
2. The time-interval user activity recommendation method and device based on deep learning can better capture the nonlinear, time-varying relationship between user needs and activities, and their performance is superior to that of a conventional recommendation system.
Drawings
FIG. 1 is a flowchart of a time-interval user activity recommendation method based on deep learning according to the present invention;
FIG. 2 is a schematic diagram of a deep neural network according to the present invention.
Detailed Description
The invention is described in detail below with reference to the figures and the specific embodiments.
Example one
Referring to FIG. 1 and FIG. 2, a time-interval user activity recommendation method based on deep learning comprises the following steps. Step 1: when a user logs in to a system, the user data and the current time data of the user are obtained; preferably, the user data comprises a user ID and a user occupation, and the time data is a date comprising a month and a day. At the same time, activity data of the N activities provided by the system are acquired, where N is the total number of activities and the activity data comprises an activity ID, an activity name and an activity type. Step 2: the user data, the current time data and each activity data are combined to obtain N groups of input data, where each group comprises the user data, the time data and one activity data. Step 3: each group of input data is preprocessed, converting text-type data into numerical data. The preprocessed data are fed into embedding layers, which take the place of a feature matrix obtained through a feature extraction network; this reduces the feature dimension and the amount of computation. Specifically, the user ID and the activity ID are each converted into numeric codes; the codes for user IDs are unique and never repeated, and likewise for activity IDs.
For example, after conversion the numeric codes for the user ID and the activity ID range from 0 to U and from 0 to A respectively, where U is the total number of users minus 1 and A is the total number of activities minus 1. A mapping table is established for user occupations and for activity types, and each user occupation and activity type is converted into the corresponding numeric code according to its table. For example, a text-to-number mapping table is created for user occupation and for activity type respectively, and each user occupation and activity type is then converted into the corresponding number, with ranges 0 to J and 0 to C, where J is the total number of user occupations minus 1 and C is the total number of activity types minus 1. Specifically, the mapping established for user occupation is {1: R&D engineer; 2: front-end engineer; 3: AI algorithm engineer; 4: project manager; ...; J: general manager}, and the mapping established for activity type is {1: attendance class; 2: finance class; 3: personnel class; ...; 4: complaint class}. The text length of the activity names is unified, with the blank part filled with a number, for example 0. The month information in the time data is kept unchanged, and the day information is divided into time periods, each corresponding to a numeric code. For example, the month information is represented by the original values 1 to 12, and the days of a month may be divided into 6 time periods, represented by the numbers 1 to 6, so that recommendations differ by time period.
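The preprocessing in step 3 could be sketched roughly as follows. This is a minimal illustration and not the patent's implementation: the helper names, the unified name length of 8, the character-level encoding of activity names and the equal-width split of a month into 6 periods are all assumptions, since the source does not specify them.

```python
# Illustrative preprocessing sketch. All names and constants are hypothetical.
NAME_LENGTH = 8          # assumed unified text length for activity names
PAD = 0                  # blank positions are filled with 0, as in the text
PERIODS_PER_MONTH = 6    # the embodiment divides a month into 6 periods


def build_index(values):
    """Give each distinct value a unique number 0..K-1 in first-seen order."""
    index = {}
    for value in values:
        if value not in index:
            index[value] = len(index)
    return index


def build_char_index(names):
    """Number the characters of all names from 1; 0 is reserved for padding."""
    index = {}
    for name in names:
        for ch in name:
            index.setdefault(ch, len(index) + 1)
    return index


def encode_name(name, char_index, length=NAME_LENGTH):
    """Convert a name to numbers, then truncate or pad it to a fixed length."""
    ids = [char_index[ch] for ch in name][:length]
    return ids + [PAD] * (length - len(ids))


def day_to_period(day, periods=PERIODS_PER_MONTH, days_in_month=31):
    """Map day 1..31 to a period 1..periods (equal-width buckets, assumed)."""
    if not 1 <= day <= days_in_month:
        raise ValueError("day out of range")
    return (day - 1) * periods // days_in_month + 1


user_index = build_index(["u_a", "u_b", "u_a"])
char_index = build_char_index(["leave", "card"])
encoded = encode_name("leave", char_index)
```

Reserving 0 for padding keeps the filler distinguishable from real characters, which matters once the encoded name is fed to an embedding layer.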
Step 4: the N groups of preprocessed input data are input into a deep neural network, which outputs a prediction score for each activity for the user at the current time. Step 5: the activity with the highest prediction score is recommended to the user.
The application process of the above embodiment is described by taking an office system as an example:
Step 1: when the user Zhang San logs in to the office system, the system acquires the user ID, the user occupation and the current time data: user data [Zhang San, project manager], time data [May 1]. The system also automatically obtains all activity IDs, activity names and corresponding activity types: activity data list [1, leave request, attendance class; 2, make-up clock-in, attendance class; ...; N, invoicing record query, finance class].
Step 2, combining the user data, the current time data and each activity data to obtain N groups of input data:
user data [Zhang San, project manager], time data [May 1], activity data [1, leave request, attendance class];
user data [Zhang San, project manager], time data [May 1], activity data [2, make-up clock-in, attendance class];
.......
user data [Zhang San, project manager], time data [May 1], activity data [N, invoicing record query, finance class].
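The combination in step 2 amounts to pairing one user record and one time record with every activity. A minimal sketch, assuming dictionary records with hypothetical field names and only three activities in place of N:

```python
def build_input_groups(user, time, activities):
    """Pair the user data and time data with each activity (step 2)."""
    return [
        {"user": user, "time": time, "activity": activity}
        for activity in activities
    ]


# Hypothetical records modelled on the office-system example.
user = {"name": "Zhang San", "occupation": "project manager"}
time = {"month": 5, "day": 1}
activities = [
    {"id": 1, "name": "leave request", "type": "attendance"},
    {"id": 2, "name": "make-up clock-in", "type": "attendance"},
    {"id": 3, "name": "invoicing record query", "type": "finance"},
]

groups = build_input_groups(user, time, activities)
```

The result has exactly one group per activity, so the network can later score every activity for the same user and time.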
Step 3, carrying out data preprocessing on each group of input data, converting data in a text form into data in a digital format, and converting the data format after preprocessing into:
user data [185, 4], time data [month: 5, month_split: 5], activity data [1 leave request 00001];
user data [185, 4], time data [month: 5, month_split: 5], activity data [2 make-up clock-in 00001];
.......
user data [185, 4], time data [month: 5, month_split: 5], activity data [N invoicing record query 02].
Step 4, inputting the processed N groups of input data into a deep neural network, and outputting the prediction scores of each activity corresponding to the user at the current time by the deep neural network;
Step 5, recommending the activity with the highest prediction score to the user.
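Steps 4 and 5 reduce to scoring every input group and selecting the maximum. The sketch below uses a placeholder scoring function with made-up scores in place of the trained deep neural network; the function names and the numbers are purely illustrative.

```python
def recommend(groups, score_fn):
    """Score every input group and return the best activity with its score."""
    scored = [(score_fn(group), group["activity"]) for group in groups]
    best_score, best_activity = max(scored, key=lambda pair: pair[0])
    return best_activity, best_score


# Placeholder scores keyed by activity ID; a trained network would produce
# these values from the preprocessed user, time and activity data.
PLACEHOLDER_SCORES = {1: 0.42, 2: 0.87, 3: 0.15}


def placeholder_score(group):
    return PLACEHOLDER_SCORES[group["activity"]["id"]]


groups = [
    {"user": [185, 4], "time": [5, 5], "activity": {"id": 1}},
    {"user": [185, 4], "time": [5, 5], "activity": {"id": 2}},
    {"user": [185, 4], "time": [5, 5], "activity": {"id": 3}},
]

best_activity, best_score = recommend(groups, placeholder_score)
```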
The deep neural network includes a first neural network for processing user data, a second neural network for processing activity data, a third neural network for processing temporal data, and an overall fully-connected layer.
The input of the first neural network is the preprocessed user data, and its output is the user feature vector. Specifically, the first neural network comprises a first embedding layer and a first fully connected layer. The first embedding layer converts the preprocessed user data into vectors with dimensions (a, b), yielding a user ID feature and a user occupation feature; the two features are concatenated and input into the first fully connected layer, which outputs the user feature vector. For example, the concatenation can be performed along the second dimension using the concat function built into TensorFlow, giving a concatenated vector with dimensions (a, 2b). In this embodiment, a is 1 and b is 32. The dimension of the user feature vector can be set as required, for example (1, 200).
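The shapes in the user sub-network can be traced with a small dependency-free sketch. It is not the TensorFlow implementation mentioned above: the random tables stand in for learned parameters, the table sizes (1000 users, 20 occupations) are hypothetical, and the fully connected layer is shown without an activation because the source does not name one. With a = 1, each row vector is represented as a flat list.

```python
import random

B = 32           # embedding width b from the embodiment (a = 1)
OUT_DIM = 200    # example user feature dimension (1, 200)


def make_table(rows, cols, rng):
    """Random matrix standing in for learned parameters."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def fully_connected(x, weights, bias):
    """Linear layer: x has in_dim entries, weights is in_dim x out_dim."""
    return [
        bias[j] + sum(x[i] * weights[i][j] for i in range(len(x)))
        for j in range(len(bias))
    ]


def user_feature_vector(user_id, occupation_id, id_table, occ_table, weights, bias):
    id_feature = id_table[user_id]            # shape (1, b)
    occ_feature = occ_table[occupation_id]    # shape (1, b)
    joined = id_feature + occ_feature         # concatenation -> (1, 2b)
    return fully_connected(joined, weights, bias)


rng = random.Random(0)
id_table = make_table(1000, B, rng)     # hypothetical: 1000 users
occ_table = make_table(20, B, rng)      # hypothetical: 20 occupations
weights = make_table(2 * B, OUT_DIM, rng)
bias = [0.0] * OUT_DIM

features = user_feature_vector(185, 4, id_table, occ_table, weights, bias)
```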
The input of the second neural network is the preprocessed activity data, and its output is the activity feature vector. Specifically, the second neural network comprises a second embedding layer, a text convolutional neural network and a second fully connected layer. The second embedding layer converts the preprocessed activity ID and activity type into vectors with dimensions (a, b), yielding an activity ID feature and an activity type feature; the text convolutional neural network extracts semantic features from the activity name and outputs an activity name feature with dimensions (a, b); the activity ID feature, the activity type feature and the activity name feature are concatenated and input into the second fully connected layer, which outputs the activity feature vector. A text convolutional neural network for extracting the semantic features of the activity name generally comprises: (1) an embedding matrix that converts the preprocessed activity name into embedding vectors; (2) convolutional layers with convolution kernels of different sizes; (3) a ReLU activation function; (4) a max pooling layer; (5) a dropout layer.
Specifically: first, the preprocessed activity name is input into the embedding matrix to obtain the embedding vectors corresponding to the activity name. Second, the embedding vectors are input into the convolutional layer and convolved with kernels of different sizes; the window sizes, i.e. the kernel sizes, are 2 × E, 3 × E and 4 × E, where E is the dimension of the embedding matrix. Next, the ReLU activation function, a rectified linear unit applied to the outputs of hidden-layer neurons, is used to address the vanishing-gradient problem in back propagation. Its formula is f(x) = max(0, x), a piecewise linear function: for an input x, here the output vector of the convolutional layer, all negative values and 0 are output as 0 and all positive values are left unchanged, which is called one-sided suppression. The max pooling layer then extracts features further, removing redundant information and keeping the key features of the activity name as text. Finally, dropout is used to suppress overfitting of the text convolutional neural network. In this embodiment, the text convolutional neural network extracts feature vectors from activity names, so the semantic information they contain is captured, including for different activities of the same activity type. For example, a personal computer subsidy application and a high-temperature subsidy application both belong to the subsidy application type, and their names are partly similar in semantics; the text convolutional neural network can extract feature vectors for both names that reflect their semantic features.
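The convolution, ReLU and max pooling stages described above can be illustrated with a tiny pure-Python forward pass. This is a sketch under assumptions: the embedding matrix and kernels hold hand-picked toy values rather than learned ones, there is one kernel per window size (2 × E, 3 × E, 4 × E) with E = 2, and dropout is omitted because it is only active during training.

```python
def relu(x):
    """f(x) = max(0, x): negatives become 0, positives pass through."""
    return max(0.0, x)


def conv_relu_maxpool(embedded, kernel):
    """Slide a k x E kernel over an L x E matrix, apply ReLU, keep the max."""
    k = len(kernel)
    width = len(kernel[0])
    activations = []
    for start in range(len(embedded) - k + 1):
        total = sum(
            embedded[start + r][c] * kernel[r][c]
            for r in range(k)
            for c in range(width)
        )
        activations.append(relu(total))
    return max(activations)


def text_cnn_features(token_ids, embedding, kernels):
    """Embed the padded name, then produce one pooled feature per kernel."""
    embedded = [embedding[t] for t in token_ids]
    return [conv_relu_maxpool(embedded, kernel) for kernel in kernels]


# Toy parameters (hypothetical): 3 tokens, embedding dimension E = 2.
embedding = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
kernels = [
    [[1.0, 0.0], [0.0, 1.0]],    # window size 2 x E
    [[-1.0, -1.0]] * 3,          # window size 3 x E
    [[1.0, 1.0]] * 4,            # window size 4 x E
]

name_features = text_cnn_features([1, 2, 1, 0], embedding, kernels)
```

The padded name must be at least as long as the largest window (4 here); otherwise that kernel has no position to slide over.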
The third neural network is preprocessed time data and outputs time characteristic vectors, the third neural network comprises a third embedded layer and a third full-connection layer, the third embedded layer converts month information and time period information into vectors with dimensions (a, b/2), and month characteristics and monthly time period characteristics are obtained respectively; inputting the month characteristics and the time period characteristics in the month into a third full-connected layer, and outputting a time characteristic vector by the third full-connected layer.
And the vector splicing mode of the second neural network and the third neural network is the same as that of the first neural network. The dimensionality of the activity characteristic vector and the dimensionality of the time characteristic vector can be adjusted and set according to requirements and generally keeps consistent with the dimensionality of the user characteristic vector.
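As an illustration of the time branch, the sketch below looks up a month embedding and an in-month period embedding of dimension b/2 each, splices them into a b-dimensional vector, and passes the result through one fully connected layer. The value b = 8, the random initialisation, the ReLU activation on the fully connected layer, and all names are assumptions made for the example; the source does not specify them.

```python
import random


def make_table(rows, cols, rng):
    """Randomly initialised table (rows x cols), used for embeddings and weights."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def dense_relu(vec, weights, biases):
    """Fully connected layer followed by ReLU (the activation is an assumption)."""
    return [max(0.0, sum(w * x for w, x in zip(row, vec)) + b)
            for row, b in zip(weights, biases)]


def time_feature(month, period, month_table, period_table, weights, biases):
    """Look up both embeddings, splice them, and apply the fully connected layer."""
    spliced = month_table[month - 1] + period_table[period - 1]
    return dense_relu(spliced, weights, biases)


rng = random.Random(0)
B = 8                                       # hypothetical feature dimension b
month_table = make_table(12, B // 2, rng)   # one row per month, 1..12
period_table = make_table(6, B // 2, rng)   # one row per in-month period, 1..6
weights = make_table(B, B, rng)             # maps the spliced vector back to b dims
biases = [0.0] * B

vec = time_feature(month=7, period=4, month_table=month_table,
                   period_table=period_table, weights=weights, biases=biases)
```

Giving each of the two time inputs half of the dimension means the spliced vector already has dimension b, which matches the statement that the time characteristic vector generally has the same dimension as the user characteristic vector.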
The user characteristic vector, the activity characteristic vector and the time characteristic vector are spliced and input into the total full-connection layer, whose output is the prediction score for the user performing each activity at that time. The activity with the highest prediction score is the activity in which the user is most interested during that time period.
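The scoring and selection step can be sketched as follows. For brevity the total full-connection layer is reduced to a single linear unit, which is a simplifying assumption; the vectors, weights and function names are made-up illustrative values, not trained parameters.

```python
def predict_score(user_vec, activity_vec, time_vec, weights, bias):
    """Splice the three characteristic vectors and apply one linear output unit."""
    spliced = user_vec + activity_vec + time_vec
    return sum(w * x for w, x in zip(weights, spliced)) + bias


def recommend(user_vec, time_vec, activity_vecs, weights, bias):
    """Score every activity for this user and time; return the best one and all scores."""
    scores = {
        name: predict_score(user_vec, vec, time_vec, weights, bias)
        for name, vec in activity_vecs.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical 2-dimensional characteristic vectors.
user_vec = [1.0, 0.0]
time_vec = [0.0, 1.0]
activity_vecs = {
    "high-temperature subsidy application": [1.0, 0.0],
    "personal computer subsidy application": [0.0, 1.0],
}
weights = [0.5, 0.5, 2.0, 1.0, 0.5, 0.5]
bias = 0.0

best, scores = recommend(user_vec, time_vec, activity_vecs, weights, bias)
```

The user and time vectors are shared across the N groups of input data, so only the activity vector changes between calls, and the recommendation is the argmax over the N scores.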
The deep neural network training steps are as follows: creating a training set: collecting previously generated user activity data as training samples, wherein the training set comprises a plurality of training samples, each training sample comprises user data, activity data and time data, and the real score is calculated according to the length of time the user spends operating the activity; preprocessing a training sample: carrying out the preprocessing of step 3 on the user data, the activity data and the time data; training the deep neural network: inputting the preprocessed data into the deep neural network to obtain a prediction score corresponding to each training sample; using the MSE (mean squared error) loss function to calculate the average error between the real values and the predicted values of samples 1 to n; the loss function is:
$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - y_i'\right)^2$$
wherein y_i is the real score corresponding to the i-th training sample, and y_i' is the prediction score output by the deep neural network; adjusting the parameter values of the deep neural network along the gradient that reduces the average error by adopting a stochastic gradient descent method, and calculating the updated weight values and bias values of the deep neural network; and reconfiguring the deep neural network with the updated weight values and bias values, and training the deep neural network with the training samples again until the minimum value of the loss function is found, so that the prediction score is close to the real score and the convergence of the deep neural network is completed.
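The loss and the update rule described above can be illustrated on a toy problem. The sketch below replaces the deep neural network with a one-dimensional linear model y' = w·x + b so that the gradient can be written by hand; the data, learning rate and epoch count are hypothetical and chosen only so that the example converges quickly.

```python
def mse(y_true, y_pred):
    """Mean squared error over n samples: (1/n) * sum((y_i - y_i')^2)."""
    n = len(y_true)
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n


def sgd_step(w, b, x, y, lr):
    """One stochastic-gradient update on a single sample for y' = w*x + b.
    The gradient of (y' - y)^2 is 2*(y' - y)*x for w and 2*(y' - y) for b."""
    err = (w * x + b) - y
    return w - lr * 2 * err * x, b - lr * 2 * err


# Toy "real scores" generated by y = 2x + 1 (hypothetical data).
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

w, b = 0.0, 0.0
initial_loss = mse(ys, [w * x + b for x in xs])

for epoch in range(200):
    for x, y in zip(xs, ys):
        w, b = sgd_step(w, b, x, y, lr=0.05)

final_loss = mse(ys, [w * x + b for x in xs])
```

In the described method the same loop runs over mini-batches of real training samples, and the gradients with respect to all weights and biases are obtained by back propagation rather than by hand.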
The training process of the deep neural network is now illustrated:
firstly, setting training parameters:
Learning_rate:0.0001
Batch_size:256
Num_epochs:100
Dropout_keep:0.5
Show_every_n_batches:20
Learning_rate is the learning rate of the neural network: if the learning rate is too small, the model easily falls into a local optimum, and if it is too large, the model easily overshoots the global optimum; Batch_size is the batch size of the training input data, namely the number of training samples drawn from the training set in each training step; Num_epochs is the number of passes over the whole training set, here set to 100, i.e., the whole training set is trained 100 times; dropout is an optimization scheme for neural networks that helps avoid overfitting of the model, and Dropout_keep is here set to 0.5, i.e., half of the hidden-layer nodes are kept to participate in each training step; Show_every_n_batches is how often the loss is output during training, here set to 20, i.e., the loss is output every 20 batches.
Secondly, after the training is iterated to 10000 times, the learning rate is adjusted to 5 × 10^-5, and the training is continued.
And finally, stopping training after 50000 times of iteration, and storing the trained model.
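The training parameters and the two-stage schedule above can be written down as a small configuration plus two helper functions. The dictionary keys mirror the parameter names in the text; the function names are hypothetical, and treating iteration 10000 itself as the first iteration at the lower learning rate is an assumption, since the text does not state which side of the boundary it falls on.

```python
# Training parameters as listed in the embodiment.
CONFIG = {
    "learning_rate": 1e-4,
    "batch_size": 256,
    "num_epochs": 100,
    "dropout_keep": 0.5,
    "show_every_n_batches": 20,
}


def learning_rate_at(iteration, base_lr=CONFIG["learning_rate"],
                     decay_step=10000, decayed_lr=5e-5):
    """Return the learning rate for a given iteration: the base rate first,
    then the reduced rate once decay_step iterations have been run."""
    return base_lr if iteration < decay_step else decayed_lr


def should_stop(iteration, max_iterations=50000):
    """Training stops once max_iterations iterations have been run."""
    return iteration >= max_iterations


def should_log(batch_index, every=CONFIG["show_every_n_batches"]):
    """The loss is reported every `every` batches."""
    return batch_index % every == 0
```

A training loop would query `learning_rate_at` before each update, print the loss when `should_log` is true, and save the model when `should_stop` first returns true.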
The invention relates to a time-interval user activity recommendation method based on deep learning.
Example two
A deep learning based time-segment user activity recommendation device comprising a processor and a memory, the memory having instructions stored thereon, the processor executing the instructions to perform the steps of: step 1, a user logs in a system to obtain user data and current time data of the user; preferably, the user data comprises a user ID and a user occupation, and the time data is a date comprising a month and a day; and meanwhile, acquiring activity data of N activities provided by the system, wherein N is the total number of the activities, and the activity data comprises an activity ID, an activity name and an activity type. Step 2, combining the user data, the current time data and each activity data to obtain N groups of input data, wherein each group of input data comprises the user data, the time data and one activity data; and 3, preprocessing each group of input data, and converting text type data into numerical type data. The data are input into the embedding layer after being preprocessed, the existing characteristic matrix obtained through a characteristic extraction network is replaced, the characteristic dimension can be reduced, and the system calculation amount can be reduced. Specifically, the method comprises the following steps: the user ID and the activity ID are respectively converted into digital numbers, the digital numbers of the user ID are not repeated and have uniqueness, and similarly, the digital numbers of the activity ID are not repeated and have uniqueness; establishing a mapping table of user occupation and activity type numbers, and converting the user occupation and activity types into corresponding digital numbers according to the mapping table; unifying the text length of the activity name, and filling the blank part with numbers; keeping the month information in the time data unchanged, dividing the day information according to time periods, wherein each time period corresponds to a digital number. 
For example, the month information is represented by the original data 1 to 12, the day information may be divided into 6 time periods in one month, so that recommendation of different time periods is realized, and the divided 6 time periods are represented by the numbers 1 to 6. Step 4, inputting the processed N groups of input data into a deep neural network, and outputting the prediction scores of each activity corresponding to the user at the current time by the deep neural network; and 5, recommending the activity with the highest prediction score to the user.
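The preprocessing of step 3 can be sketched as below. Several details are assumptions made for the example rather than statements from the source: activity names are split on whitespace, padding uses the number 0, the fixed name length is 6, and the six in-month periods are taken to be five-day blocks with day 31 folded into period 6. The records and all function names are hypothetical.

```python
def build_index(values, start=0):
    """Map each distinct value to a unique integer, in first-seen order."""
    index = {}
    for value in values:
        if value not in index:
            index[value] = len(index) + start
    return index


def pad_name(token_ids, length, pad_id=0):
    """Truncate or right-pad a token-id list to a fixed length."""
    return (token_ids + [pad_id] * length)[:length]


def day_to_period(day, periods=6, days_per_period=5):
    """Assumed split: days 1-5 -> 1, 6-10 -> 2, ..., 26-31 -> 6."""
    if not 1 <= day <= 31:
        raise ValueError("day must be in 1..31")
    return min((day - 1) // days_per_period + 1, periods)


# Hypothetical raw records:
# (user ID, occupation, activity ID, activity name, activity type, month, day)
raw = [
    ("u-ann", "teacher", "act-7", "high temperature subsidy application",
     "subsidy application", 7, 18),
    ("u-bob", "engineer", "act-9", "personal computer subsidy application",
     "subsidy application", 7, 31),
]

user_ids = build_index(r[0] for r in raw)
occupations = build_index(r[1] for r in raw)
activity_ids = build_index(r[2] for r in raw)
activity_types = build_index(r[4] for r in raw)
# Word ids start at 1 because 0 is reserved for padding.
words = build_index((w for r in raw for w in r[3].split()), start=1)


def preprocess(record, name_length=6):
    """Convert one raw record into purely numerical input data."""
    user, job, act, name, kind, month, day = record
    return {
        "user_id": user_ids[user],
        "occupation": occupations[job],
        "activity_id": activity_ids[act],
        "activity_type": activity_types[kind],
        "activity_name": pad_name([words[w] for w in name.split()], name_length),
        "month": month,                 # kept unchanged, 1..12
        "period": day_to_period(day),   # 1..6
    }


rows = [preprocess(r) for r in raw]
```

Both example activities share the same activity type number, and their padded names share the ids for "subsidy" and "application", which is the overlap the text convolutional neural network is meant to pick up.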
The deep neural network comprises a first neural network for processing user data, a second neural network for processing activity data, a third neural network for processing time data, and a total full-connection layer; the input of the first neural network is the preprocessed user data, and its output is a user characteristic vector; the input of the second neural network is the preprocessed activity data, and its output is an activity characteristic vector; the input of the third neural network is the preprocessed time data, and its output is a time characteristic vector; the user characteristic vector, the activity characteristic vector and the time characteristic vector are spliced and then input into the total full-connection layer, and the output of the total full-connection layer is the prediction score for the user performing each activity at that time.
The user data in the step 2 comprises a user ID and a user occupation, the activity data comprises an activity ID, an activity name and an activity type, and the time data is a date; in step 3, each group of input data is preprocessed, specifically: respectively converting the user ID and the activity ID into digital numbers; establishing a mapping table of user occupation and activity type numbers, converting the user occupation and activity types into corresponding digital numbers according to the mapping table, keeping the monthly information in the date unchanged, dividing the daily information according to time periods, and corresponding each time period to one digital number.
The first neural network comprises a first embedding layer and a first full-connection layer; the first embedding layer converts the preprocessed user data into vectors with dimensions (a, b), obtaining the user ID characteristics and the user occupation characteristics respectively; the user ID characteristics and the user occupation characteristics are spliced and input into the first full-connection layer, and the first full-connection layer outputs the user characteristic vector; the second neural network comprises a second embedding layer, a text convolutional neural network and a second full-connection layer; the second embedding layer converts the preprocessed activity ID and activity type into vectors with dimensions (a, b), obtaining the activity ID characteristics and the activity type characteristics respectively; the text convolutional neural network performs semantic feature extraction on the activity name and outputs the activity name characteristics with dimensions (a, b); the activity ID characteristics, the activity type characteristics and the activity name characteristics are spliced and input into the second full-connection layer, and the second full-connection layer outputs the activity characteristic vector; the third neural network comprises a third embedding layer and a third full-connection layer; the third embedding layer converts the month information and the time period information into vectors with dimensions (a, b/2), obtaining the month characteristics and the in-month time period characteristics respectively; the month characteristics and the in-month time period characteristics are input into the third full-connection layer, and the third full-connection layer outputs the time characteristic vector.
The deep neural network training steps are as follows: creating a training set: collecting previously generated user activity data as training samples, wherein the training set comprises a plurality of training samples, each training sample comprises user data, activity data and time data, and the real score is calculated according to the length of time the user spends operating the activity; preprocessing a training sample: carrying out the preprocessing of step 3 on the user data, the activity data and the time data; training the deep neural network: inputting the preprocessed data into the deep neural network to obtain a prediction score corresponding to each training sample; using the MSE (mean squared error) loss function to calculate the average error between the real values and the predicted values of samples 1 to n; the loss function is:
$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - y_i'\right)^2$$
wherein y_i is the real score corresponding to the i-th training sample, and y_i' is the prediction score output by the deep neural network; adjusting the parameter values of the deep neural network along the gradient that reduces the average error by adopting a stochastic gradient descent method, and calculating the updated weight values and bias values of the deep neural network; and reconfiguring the deep neural network with the updated weight values and bias values, and training the deep neural network with the training samples again until the minimum value of the loss function is found, so as to finish the convergence of the deep neural network.
The training process of the deep neural network is now illustrated:
firstly, setting training parameters:
Learning_rate:0.0001
Batch_size:256
Num_epochs:100
Dropout_keep:0.5
Show_every_n_batches:20
Learning_rate is the learning rate of the neural network: if the learning rate is too small, the model easily falls into a local optimum, and if it is too large, the model easily overshoots the global optimum; Batch_size is the batch size of the training input data, namely the number of training samples drawn from the training set in each training step; Num_epochs is the number of passes over the whole training set, here set to 100, i.e., the whole training set is trained 100 times; dropout is an optimization scheme for neural networks that helps avoid overfitting of the model, and Dropout_keep is here set to 0.5, i.e., half of the hidden-layer nodes are kept to participate in each training step; Show_every_n_batches is how often the loss is output during training, here set to 20, i.e., the loss is output every 20 batches.
Secondly, after the training is iterated to 10000 times, the learning rate is adjusted to 5 × 10^-5, and the training is continued.
And finally, stopping training after 50000 times of iteration, and storing the trained model.
The invention relates to a time-interval user activity recommendation device based on deep learning, which analyzes activities interested by a user according to the operation of the user, collects user data, activity data and time data, analyzes the activities interested in a specific time interval by utilizing a deep neural network, and realizes accurate time-interval user activity recommendation.
The above description is only an embodiment of the present invention, and not intended to limit the scope of the present invention, and all modifications of equivalent structures and equivalent processes performed by the present specification and drawings, or directly or indirectly applied to other related technical fields, are included in the scope of the present invention.

Claims (10)

1. A time-interval user activity recommendation method based on deep learning is characterized by comprising the following steps:
step 1, a user logs in a system to obtain user data and current time data of the user; meanwhile, acquiring activity data of N activities provided by the system, wherein N is the total number of the activities;
step 2, combining the user data, the current time data and each activity data to obtain N groups of input data, wherein each group of input data comprises the user data, the time data and one activity data;
step 3, preprocessing each group of input data, and converting text type data into numerical type data;
step 4, inputting the processed N groups of input data into a deep neural network, and outputting the prediction scores of each activity corresponding to the user at the current time by the deep neural network;
and 5, recommending the activity with the highest prediction score to the user.
2. The deep learning-based time-share user activity recommendation method according to claim 1, wherein: the deep neural network comprises a first neural network for processing user data, a second neural network for processing activity data, a third neural network for processing time data, and a total full-connection layer; the input of the first neural network is the preprocessed user data, and its output is a user characteristic vector; the input of the second neural network is the preprocessed activity data, and its output is an activity characteristic vector; the input of the third neural network is the preprocessed time data, and its output is a time characteristic vector; the user characteristic vector, the activity characteristic vector and the time characteristic vector are spliced and then input into the total full-connection layer, and the output of the total full-connection layer is the prediction score for the user performing each activity at that time.
3. The deep learning-based time-share user activity recommendation method according to claim 2, wherein: the user data in the step 2 comprises a user ID and a user occupation, the activity data comprises an activity ID, an activity name and an activity type, and the time data is a date; in step 3, each group of input data is preprocessed, specifically: respectively converting the user ID and the activity ID into digital numbers; establishing a mapping table of user occupation and activity type numbers, and converting the user occupation and activity types into corresponding digital numbers according to the mapping table; unifying the text length of the activity name, and filling the blank part with numbers; keeping the month information in the time data unchanged, dividing the day information according to time periods, wherein each time period corresponds to a digital number.
4. The deep learning-based time-share user activity recommendation method according to claim 3, wherein: the first neural network comprises a first embedding layer and a first full-connection layer; the first embedding layer converts the preprocessed user data into vectors with dimensions (a, b), obtaining the user ID characteristics and the user occupation characteristics respectively; the user ID characteristics and the user occupation characteristics are spliced and input into the first full-connection layer, and the first full-connection layer outputs the user characteristic vector; the second neural network comprises a second embedding layer, a text convolutional neural network and a second full-connection layer; the second embedding layer converts the preprocessed activity ID and activity type into vectors with dimensions (a, b), obtaining the activity ID characteristics and the activity type characteristics respectively; the text convolutional neural network performs semantic feature extraction on the activity name and outputs the activity name characteristics with dimensions (a, b); the activity ID characteristics, the activity type characteristics and the activity name characteristics are spliced and input into the second full-connection layer, and the second full-connection layer outputs the activity characteristic vector; the third neural network comprises a third embedding layer and a third full-connection layer; the third embedding layer converts the month information and the time period information into vectors with dimensions (a, b/2), obtaining the month characteristics and the in-month time period characteristics respectively; the month characteristics and the in-month time period characteristics are input into the third full-connection layer, and the third full-connection layer outputs the time characteristic vector.
5. The deep learning-based time-share user activity recommendation method according to claim 1, wherein: the deep neural network training steps are as follows:
creating a training set: collecting previously generated user activity data as training samples, wherein the training set comprises a plurality of training samples, each training sample comprises user data, activity data and time data, and real scores are calculated according to the time length of the user operating the activity;
preprocessing a training sample: carrying out preprocessing in the step 3 on the user data, the activity data and the time data;
training a deep neural network: inputting the preprocessed data into the deep neural network to obtain a prediction score corresponding to each training sample; using the MSE (mean squared error) loss function to calculate the average error between the real values and the predicted values of samples 1 to n; the loss function is:
$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - y_i'\right)^2$$
wherein y_i is the real score corresponding to the i-th training sample, and y_i' is the prediction score output by the deep neural network; adjusting the parameter values of the deep neural network along the gradient that reduces the average error by adopting a stochastic gradient descent method, and calculating the updated weight values and bias values of the deep neural network; and reconfiguring the deep neural network with the updated weight values and bias values, and training the deep neural network with the training samples again until the minimum value of the loss function is found, so as to finish the convergence of the deep neural network.
6. A deep learning based time-share user activity recommendation device, comprising a processor and a memory, wherein the memory has instructions stored thereon, and the processor executes the instructions to perform the following steps:
step 1, a user logs in a system to obtain user data and current time data of the user; meanwhile, acquiring activity data of N activities provided by the system, wherein N is the total number of the activities;
step 2, combining the user data, the current time data and each activity data to obtain N groups of input data, wherein each group of input data comprises the user data, the time data and one activity data;
step 3, preprocessing each group of input data, and converting text type data into numerical type data;
step 4, inputting the processed N groups of input data into a deep neural network, and outputting the prediction scores of each activity corresponding to the user at the current time by the deep neural network;
and 5, recommending the activity with the highest prediction score to the user.
7. The deep learning-based timesharing user activity recommendation device of claim 6, wherein: the deep neural network comprises a first neural network for processing user data, a second neural network for processing activity data, a third neural network for processing time data, and a total full-connection layer; the input of the first neural network is the preprocessed user data, and its output is a user characteristic vector; the input of the second neural network is the preprocessed activity data, and its output is an activity characteristic vector; the input of the third neural network is the preprocessed time data, and its output is a time characteristic vector; the user characteristic vector, the activity characteristic vector and the time characteristic vector are spliced and then input into the total full-connection layer, and the output of the total full-connection layer is the prediction score for the user performing each activity at that time.
8. The deep learning-based timesharing user activity recommendation device of claim 7, wherein: the user data in the step 2 comprises a user ID and a user occupation, the activity data comprises an activity ID, an activity name and an activity type, and the time data is a date; in step 3, each group of input data is preprocessed, specifically: respectively converting the user ID and the activity ID into digital numbers; establishing a mapping table of user occupation and activity type numbers, and converting the user occupation and activity types into corresponding digital numbers according to the mapping table; unifying the text length of the activity name, and filling the blank part with numbers; keeping the month information in the time data unchanged, dividing the day information according to time periods, wherein each time period corresponds to a digital number.
9. The deep learning-based timesharing user activity recommendation device of claim 8, wherein: the first neural network comprises a first embedding layer and a first full-connection layer; the first embedding layer converts the preprocessed user data into vectors with dimensions (a, b), obtaining the user ID characteristics and the user occupation characteristics respectively; the user ID characteristics and the user occupation characteristics are spliced and input into the first full-connection layer, and the first full-connection layer outputs the user characteristic vector; the second neural network comprises a second embedding layer, a text convolutional neural network and a second full-connection layer; the second embedding layer converts the preprocessed activity ID and activity type into vectors with dimensions (a, b), obtaining the activity ID characteristics and the activity type characteristics respectively; the text convolutional neural network performs semantic feature extraction on the activity name and outputs the activity name characteristics with dimensions (a, b); the activity ID characteristics, the activity type characteristics and the activity name characteristics are spliced and input into the second full-connection layer, and the second full-connection layer outputs the activity characteristic vector; the third neural network comprises a third embedding layer and a third full-connection layer; the third embedding layer converts the month information and the time period information into vectors with dimensions (a, b/2), obtaining the month characteristics and the in-month time period characteristics respectively; the month characteristics and the in-month time period characteristics are input into the third full-connection layer, and the third full-connection layer outputs the time characteristic vector.
10. The deep learning-based timesharing user activity recommendation device of claim 6, wherein: the deep neural network training steps are as follows:
creating a training set: collecting previously generated user activity data as training samples, wherein the training set comprises a plurality of training samples, each training sample comprises user data, activity data and time data, and real scores are calculated according to the time length of the user operating the activity;
preprocessing a training sample: carrying out preprocessing in the step 3 on the user data, the activity data and the time data;
training a deep neural network: inputting the preprocessed data into the deep neural network to obtain a prediction score corresponding to each training sample; using the MSE (mean squared error) loss function to calculate the average error between the real values and the predicted values of samples 1 to n; the loss function is:
$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - y_i'\right)^2$$
wherein y_i is the real score corresponding to the i-th training sample, and y_i' is the prediction score output by the deep neural network; adjusting the parameter values of the deep neural network along the gradient that reduces the average error by adopting a stochastic gradient descent method, and calculating the updated weight values and bias values of the deep neural network; and reconfiguring the deep neural network with the updated weight values and bias values, and training the deep neural network with the training samples again until the minimum value of the loss function is found, so as to finish the convergence of the deep neural network.
CN202010463499.5A 2020-05-27 2020-05-27 Time-interval user activity recommendation method and device based on deep learning Pending CN111651672A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010463499.5A CN111651672A (en) 2020-05-27 2020-05-27 Time-interval user activity recommendation method and device based on deep learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010463499.5A CN111651672A (en) 2020-05-27 2020-05-27 Time-interval user activity recommendation method and device based on deep learning

Publications (1)

Publication Number Publication Date
CN111651672A true CN111651672A (en) 2020-09-11

Family

ID=72352792

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010463499.5A Pending CN111651672A (en) 2020-05-27 2020-05-27 Time-interval user activity recommendation method and device based on deep learning

Country Status (1)

Country Link
CN (1) CN111651672A (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110399565A (en) * 2019-07-29 2019-11-01 北京理工大学 Based on when null cycle attention mechanism recurrent neural network point of interest recommended method
CN110489665A (en) * 2019-08-16 2019-11-22 北京信息科技大学 A kind of microblogging personalized recommendation method based on scene modeling and convolutional neural networks
CN110490686A (en) * 2019-07-08 2019-11-22 西北大学 A kind of building of commodity Rating Model, recommended method and system based on Time Perception


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
丁瑞峰: "基于深度神经网络的个性化兴趣点推荐方法研究", 《中国优秀博硕士学位论文全文数据库(硕士)信息科技辑》 *


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20200911
