CN115422472B

CN115422472B - User attention demand decision method based on artificial intelligent recognition and big data system

Info

Publication number: CN115422472B
Application number: CN202211118028.6A
Authority: CN
Inventors: 戴蔚; 张明娥
Original assignee: Hebei Pangu Network Technology Co ltd
Current assignee: Hebei Pangu Network Technology Co ltd
Priority date: 2022-09-14
Filing date: 2022-09-14
Publication date: 2023-11-07
Anticipated expiration: 2042-09-14
Also published as: CN115422472A

Abstract

The embodiments of the application provide a user attention demand decision method based on artificial intelligence recognition and a big data system. Interest points are mined from the Internet activity track big data of a target user to obtain interest pointing data corresponding to the target user, and the interest pointing data are loaded into the historical platform interest big data of the target user. Interest path feature analysis is performed on the historical platform interest big data of the target user, and a user attention demand decision is made based on the interest path feature analysis result to obtain the user attention demand distribution of the target user. Page content optimization is then performed on the online service pages subscribed to by the target user according to the user attention demand distribution. Because the user attention demand decision further combines the interest path feature dimension after interest point mining, its reliability can be improved compared with making the decision in the interest point dimension alone, which further improves the reliability of page content optimization for the relevant users.

Description

User attention demand decision method based on artificial intelligent recognition and big data system

Technical Field

The application relates to the technical field of artificial intelligence, in particular to a user attention demand decision method and a big data system based on artificial intelligence identification.

Background

The rapid development of Internet technology has given a strong impetus to online content distribution across industries. Optimizing online push content through user demand mining allows Internet product or service information to be presented to the user comprehensively, which can improve the user penetration of Internet products. Internet products are driven by people and their needs; they are meant to genuinely serve users and attend to them. How to effectively apply artificial intelligence technology to mine users' attention demands, and thereby better optimize online push content, is therefore a research focus for Internet service providers. However, the inventors have found through research that the related art, which makes user attention demand decisions in the interest point dimension only, has poor reliability.

Disclosure of Invention

In order to at least overcome the defects in the prior art, the application aims to provide a user attention demand decision method based on artificial intelligent recognition and a big data system.

In a first aspect, the present application provides a user attention demand decision method based on artificial intelligence recognition, applied to a big data system, where the big data system is communicatively connected to a plurality of internet page servers, the method includes:

performing interest point mining on the Internet activity track big data of a target user based on a user interest mining model obtained through artificial intelligence model training, obtaining interest pointing data corresponding to the target user, and loading the interest pointing data into the historical platform interest big data of the target user;

performing interest path feature analysis on the historical platform interest big data of the target user, and performing a user attention demand decision based on the interest path feature analysis result to obtain the user attention demand distribution of the target user;

and performing page content optimization on the online service pages subscribed to by the target user based on the user attention demand distribution of the target user.

In a second aspect, the embodiment of the application further provides a user attention demand decision system based on artificial intelligent recognition, wherein the user attention demand decision system based on artificial intelligent recognition comprises a big data system and a plurality of internet page servers in communication connection with the big data system;

The big data system is used for:

According to the above technical solution, interest point mining is performed on the Internet activity track big data of the target user to obtain interest pointing data corresponding to the target user, and the interest pointing data are loaded into the historical platform interest big data of the target user. Interest path feature analysis is performed on the historical platform interest big data of the target user, and a user attention demand decision is made based on the interest path feature analysis result to obtain the user attention demand distribution of the target user. Page content optimization is then performed on the online service pages subscribed to by the target user according to the user attention demand distribution. Because the user attention demand decision further combines the interest path feature dimension after interest point mining, its reliability can be improved compared with making the decision in the interest point dimension alone, which further improves the reliability of page content optimization for the relevant users.

Drawings

Fig. 1 is a flow chart of a user attention demand decision method based on artificial intelligence recognition according to an embodiment of the present invention.

Detailed Description

The architecture of a user attention demand decision system 10 based on artificial intelligence recognition, provided according to an embodiment of the present invention, is described below. The system 10 may include a big data system 100 and an internet page server 200 communicatively connected to the big data system 100. The big data system 100 and the internet page server 200 may cooperate to execute the user attention demand decision method based on artificial intelligence recognition described in the following method embodiments; for the steps executed by the big data system 100 and the internet page server 200, refer to the detailed description of the method embodiments below.

The user attention demand decision method based on artificial intelligence recognition provided in this embodiment may be executed by the big data system 100, and is described in detail below with reference to fig. 1.

Process100, performing interest point mining on the Internet activity track big data of the target user based on a user interest mining model obtained through artificial intelligence model training, obtaining interest pointing data corresponding to the target user, and loading the interest pointing data into the historical platform interest big data of the target user.

In this embodiment, the Internet activity track big data may represent a track data set generated by a series of Internet activities (such as e-commerce live-stream browsing activities, e-commerce ordering activities, and e-commerce sharing activities) performed by the target user on related Internet service pages (such as e-commerce online service pages). The interest pointing data may represent the interest data section pointed to by an interest point of the target user (such as a certain e-commerce page object, for example an e-commerce category section).

Process200, performing interest path feature analysis on the historical platform interest big data of the target user, performing a user attention demand decision based on the interest path feature analysis result, and obtaining the user attention demand distribution of the target user.

In this embodiment, an interest path feature may represent a feature related to an interest path of the target user, where an interest path may refer to the path formed by the user's operation points on the page data in the course of generating an interest point.

Process300, performing page content optimization on the online service pages subscribed to by the target user based on the user attention demand distribution of the target user.

For example, target page content matching each user attention demand in the user attention demand distribution of the target user may be obtained from the most recently updated page content library, and the target page content may be loaded onto the online service page subscribed to by the target user to optimize its page content.
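The content-matching step of Process300 can be illustrated with a minimal sketch. It assumes, purely for illustration, that the user attention demand distribution is a mapping from a demand label to a weight and that the page content library maps each demand label to items ordered newest first; the function name `select_target_content`, the `per_demand` parameter and all sample data are hypothetical and do not come from the application.

```python
from typing import Dict, List


def select_target_content(
    demand_distribution: Dict[str, float],
    content_library: Dict[str, List[str]],
    per_demand: int = 2,
) -> List[str]:
    """Pick library items for each attention demand, strongest demand first."""
    selected: List[str] = []
    ranked = sorted(demand_distribution.items(), key=lambda kv: kv[1], reverse=True)
    for demand, _weight in ranked:
        # Assumption: items under each label are already ordered newest first.
        selected.extend(content_library.get(demand, [])[:per_demand])
    return selected


distribution = {"outdoor gear": 0.6, "home audio": 0.3, "books": 0.1}
library = {
    "outdoor gear": ["tent-2024", "stove-x", "lamp-3"],
    "home audio": ["speaker-a"],
}
page = select_target_content(distribution, library)
```

Demands with no matching library entry (here "books") simply contribute nothing; a real system would need its own fallback policy.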

Based on the above steps, in this embodiment, interest point mining is performed on the Internet activity track big data of the target user to obtain interest pointing data corresponding to the target user, and the interest pointing data are loaded into the historical platform interest big data of the target user. Interest path feature analysis is performed on the historical platform interest big data of the target user, and a user attention demand decision is made based on the interest path feature analysis result to obtain the user attention demand distribution of the target user. Page content optimization is then performed on the online service pages subscribed to by the target user according to the user attention demand distribution. Because the user attention demand decision further combines the interest path feature dimension after interest point mining, its reliability can be improved compared with making the decision in the interest point dimension alone, which further improves the reliability of page content optimization for the relevant users.

In some exemplary designs, Process200 may be implemented by the following embodiments.

Process101, acquiring candidate platform interest event data, performing interest path network analysis on the candidate platform interest event data, and determining an interest path network for representing the candidate platform interest event data; the candidate platform interest event data includes a plurality of platform interest events; the interest path network includes a plurality of interest path node features; one platform interest event corresponds to one interest path node feature.

Process102, determining demand influence coefficients respectively corresponding to the plurality of interest path node features, performing feature fusion on the plurality of interest path node features based on the demand influence coefficients, and determining a first interest path feature.

For example, the demand influence coefficient may be used to measure how important an interest path node feature is for user attention demand mining. The larger the demand influence coefficient, the more important the corresponding interest path node feature, and the greater its weight in the first interest path feature that is finally output. The demand influence coefficient corresponding to each interest path node feature may be obtained from a trained model that takes the interest path node feature as input and outputs the demand influence coefficient.

For example, after the demand influence coefficients are obtained, feature fusion may be performed on the interest path node features, that is, the interest path node features are aggregated based on the demand influence coefficients to obtain the first interest path feature.
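One plausible reading of this aggregation is a coefficient-weighted sum of the node feature vectors. The sketch below assumes exactly that, and additionally normalizes the coefficients so they sum to 1; neither the weighted-sum form nor the normalization is stated in the application, and the function name `fuse_node_features` and the sample vectors are hypothetical.

```python
from typing import List, Sequence


def fuse_node_features(
    features: Sequence[Sequence[float]],
    coefficients: Sequence[float],
) -> List[float]:
    """Aggregate node feature vectors, weighting each by its demand influence coefficient."""
    if not features or len(features) != len(coefficients):
        raise ValueError("need one coefficient per interest path node feature")
    total = sum(coefficients)
    if total <= 0:
        raise ValueError("coefficients must have a positive sum")
    dim = len(features[0])
    fused = [0.0] * dim
    for vector, coefficient in zip(features, coefficients):
        weight = coefficient / total  # assumed normalization step
        for k in range(dim):
            fused[k] += weight * vector[k]
    return fused


node_features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
coefficients = [0.5, 0.25, 0.25]
first_interest_path_feature = fuse_node_features(node_features, coefficients)
```

A node feature with a larger coefficient pulls the fused vector further toward itself, which matches the statement that a larger coefficient means a larger weight in the first interest path feature.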

Process103, clustering the plurality of interest path node features, determining the member interest path node features contained in each of a plurality of feature clusters, determining the frequent pattern metric value corresponding to each member interest path node feature based on the plurality of feature clusters and a preset frequent pattern tree, and determining a second interest path feature based on the plurality of frequent pattern metric values; the frequent pattern metric value corresponding to a member interest path node feature is determined based on the member interest path node features in the feature cluster to which it belongs. The second interest path feature may be a feature set formed by the member interest path node features whose frequent pattern metric value is greater than a preset frequent pattern metric value.

To better analyze the correlations among the interest path node features, the interest path node features may be clustered, that is, divided into feature clusters based on their relevance to one another; the platform interest events corresponding to the interest path node features in one feature cluster belong to the same class of platform interest events. Some of the interest path node features are then extracted from each feature cluster as member interest path node features.
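The shape of Process103 can be sketched as follows, with two loudly stated assumptions. First, clustering is replaced by grouping on a known event class label. Second, the frequent pattern metric, which the application derives from a preset frequent pattern tree without giving a formula, is replaced by a stand-in: the mean cosine similarity between a member feature and the other members of its own cluster. The stand-in only preserves the property that correlations are measured within a cluster; it is not the application's metric, and all names and data are hypothetical.

```python
import math
from collections import defaultdict
from typing import Dict, List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def second_interest_path_feature(
    features: Dict[str, Sequence[float]],
    event_classes: Dict[str, str],
    threshold: float,
) -> List[str]:
    """Keep member features whose within-cluster metric exceeds the threshold."""
    clusters: Dict[str, List[str]] = defaultdict(list)
    for name, event_class in event_classes.items():
        clusters[event_class].append(name)  # stand-in for real clustering

    kept: List[str] = []
    for members in clusters.values():
        for name in members:
            peers = [m for m in members if m != name]
            if not peers:
                continue
            # Stand-in metric: only peers from the same cluster are compared.
            metric = sum(cosine(features[name], features[p]) for p in peers) / len(peers)
            if metric > threshold:
                kept.append(name)
    return sorted(kept)


features = {"F1": [1, 0], "F2": [1, 0], "F3": [0, 1], "F4": [1, 1], "F5": [1, 1]}
event_classes = {"F1": "browse", "F2": "browse", "F3": "browse", "F4": "order", "F5": "order"}
second = second_interest_path_feature(features, event_classes, threshold=0.6)
```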

Process104, determining user attention demand decision information of the candidate platform interest event data based on the first interest path feature and the second interest path feature.

For example, after the first interest path feature and the second interest path feature are obtained, a user attention demand decision model may be used to make a prediction from the first interest path feature and the second interest path feature and output the user attention demand decision information.

With the above steps, interest path network analysis can be performed on candidate platform interest event data that includes a plurality of platform interest events, and an interest path network for representing the candidate platform interest event data is determined; the interest path network includes the interest path node features respectively corresponding to the plurality of platform interest events. Information about the platform interest events is then mined under two schemes. In the first scheme, the information of each platform interest event is mined independently: the demand influence coefficients respectively corresponding to the plurality of interest path node features are determined, feature fusion is performed on the plurality of interest path node features based on the demand influence coefficients, and the first interest path feature is determined. In the second scheme, related information among platform interest events of the same class is mined: the plurality of interest path node features are clustered, the member interest path node features contained in each of the feature clusters are determined, the frequent pattern metric value corresponding to each member interest path node feature is determined based on the feature clusters and a preset frequent pattern tree, and the second interest path feature is determined based on the frequent pattern metric values. Finally, the user attention demand decision information of the candidate platform interest event data is determined based on the first interest path feature and the second interest path feature.
Because the first interest path feature and the second interest path feature obtained under the two schemes complement and constrain each other, the accuracy of the user attention demand decision can be improved. In addition, when the frequent pattern metric value corresponding to a member interest path node feature is calculated, the preset frequent pattern tree ensures that only correlations with member interest path node features belonging to the same feature cluster are taken into account, which improves decision reliability.

In some embodiments, for the scheme of determining the demand influence coefficients respectively corresponding to the plurality of interest path node features, performing feature fusion on the plurality of interest path node features based on the demand influence coefficients, and determining the first interest path feature, see the following Process201-Process210:

Process201, loading the plurality of interest path node features into a first demand influence decision branch in a user attention demand decision model; the first demand influence decision branch comprises a demand influence decision unit and a demand influence fusion unit.

Process202, in the demand influence decision unit, performing a demand influence decision on each of the plurality of interest path node features, and determining the demand influence coefficients respectively corresponding to the plurality of interest path node features.

Process203, in the demand influence fusion unit, fusing each interest path node feature based on its demand influence coefficient, determining the fused interest path node feature corresponding to each interest path node feature, aggregating the fused interest path node features, and determining the first interest path feature.

The following describes a training method for the initial user attention demand decision model, which may include at least the following Process301-Process305:

Process301, acquiring sample platform interest event data, performing interest path network analysis on the sample platform interest event data, and determining a reference interest path network for representing the sample platform interest event data; the sample platform interest event data includes a plurality of reference platform interest events; the reference interest path network includes a plurality of reference interest path node features; one reference platform interest event corresponds to one reference interest path node feature.

For example, for the implementation of Process301, refer to the description of Process101, which is not repeated here.

Process302, loading the plurality of reference platform interest events to an initial user attention demand decision model, determining, in the initial user attention demand decision model, the reference demand influence coefficients respectively corresponding to the plurality of reference interest path node features, and performing feature fusion on the plurality of reference interest path node features based on those reference demand influence coefficients to determine a first reference interest path feature.

The initial user attention demand decision model may include a first initial demand influence decision branch. In the first initial demand influence decision branch, the big data system may determine the reference demand influence coefficients respectively corresponding to the plurality of reference interest path node features, perform feature fusion on the plurality of reference interest path node features based on those reference demand influence coefficients, and determine the first reference interest path feature. For the specific implementation, refer to the description of Process201-Process203, which is not repeated here.

Process303, in the initial user attention demand decision model, clustering the plurality of reference interest path node features, determining the reference member interest path node features contained in each of a plurality of reference feature clusters, determining the reference frequent pattern metric value corresponding to each reference member interest path node feature based on the plurality of reference feature clusters and a preset frequent pattern tree, and determining a second reference interest path feature based on the plurality of reference frequent pattern metric values; the reference frequent pattern metric value corresponding to a reference member interest path node feature is determined according to the reference member interest path node features in the reference feature cluster to which it belongs.

The initial user attention demand decision model may further include a second initial demand influence decision branch. In the second initial demand influence decision branch, the plurality of reference interest path node features are clustered, the reference member interest path node features contained in each of the plurality of reference feature clusters are determined, the reference frequent pattern metric value corresponding to each reference member interest path node feature is determined based on the plurality of reference feature clusters and the preset frequent pattern tree, and the second reference interest path feature is determined based on the plurality of reference frequent pattern metric values; for details, refer to the description of Process205-Process209.

Process304, determining, in the initial user attention demand decision model, prior user attention demand decision information of the sample platform interest event data based on the first reference interest path feature and the second reference interest path feature.

The initial user attention demand decision model may also include an initial classification sub-network, in which the prior user attention demand decision information of the sample platform interest event data is determined based on the first reference interest path feature and the second reference interest path feature. For the specific implementation, refer to the description of Process210, which is not repeated here.

Process305, performing model tuning on the initial user attention demand decision model based on the plurality of reference feature clusters, the reference demand influence coefficients respectively corresponding to the plurality of reference interest path node features, the prior user attention demand decision information, and the user attention demand corresponding to the sample platform interest event data, and determining a user attention demand decision model for outputting user attention demand decision information of candidate platform interest event data.

For example, because the first demand influence decision branch and the second demand influence decision branch in the finally obtained user attention demand decision model are loaded with the same interest path node features, the demand influence coefficient distribution of the first demand influence decision branch over the plurality of interest path node features should be consistent with that of the second demand influence decision branch. Therefore, in training the initial user attention demand decision model, the big data system may first determine a first model training cost value based on the plurality of reference feature clusters and the reference demand influence coefficients respectively corresponding to the plurality of reference interest path node features; then determine a second model training cost value based on the prior user attention demand decision information and the user attention demand corresponding to the sample platform interest event data; and finally perform a weighted summation of the first model training cost value and the second model training cost value to determine a target model training cost value. Model tuning is performed on the initial user attention demand decision model based on the target model training cost value, and the user attention demand decision model for outputting user attention demand decision information of candidate platform interest event data is determined. The first model training cost value is used to ensure that the two network branches of the finally trained user attention demand decision model produce consistent demand influence coefficient distributions for the same interest path node features.
The second model training cost value is used to ensure that the user attention demand decision information output by the finally trained user attention demand decision model is closer to the real result.
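The weighted summation of the two cost values can be sketched as below. The application does not specify the form of the second model training cost value or the weights; the sketch assumes a cross-entropy between predicted demand probabilities and the labeled demand, and uses hypothetical weights `w_first` and `w_second`. The value passed as `first_cost` is a placeholder number, not a computed result.

```python
import math
from typing import Dict


def cross_entropy(predicted: Dict[str, float], true_demand: str) -> float:
    """Assumed second cost: negative log-probability assigned to the labeled demand."""
    return -math.log(max(predicted[true_demand], 1e-12))


def target_training_cost(
    first_cost: float,
    second_cost: float,
    w_first: float = 0.5,   # hypothetical weight for the consistency cost
    w_second: float = 1.0,  # hypothetical weight for the prediction cost
) -> float:
    """Weighted sum of the first and second model training cost values."""
    return w_first * first_cost + w_second * second_cost


predicted = {"outdoor gear": 0.7, "home audio": 0.2, "books": 0.1}
second_cost = cross_entropy(predicted, "outdoor gear")
target_cost = target_training_cost(first_cost=0.02, second_cost=second_cost)
```

Raising `w_first` pushes training harder toward consistent coefficients within each cluster; raising `w_second` favors prediction accuracy.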

For example, the first model training cost value may be determined from the plurality of reference feature clusters and the reference demand influence coefficients respectively corresponding to the plurality of reference interest path node features as follows: acquire the i-th reference feature cluster among the plurality of reference feature clusters, where i is a positive integer not greater than the number of reference feature clusters; take the reference interest path node features contained in the i-th reference feature cluster as target reference interest path node features; determine the output training cost value corresponding to the i-th reference feature cluster based on the reference demand influence coefficients corresponding to the target reference interest path node features and the number of target reference interest path node features; and accumulate the output training cost values respectively corresponding to each reference feature cluster to determine the first model training cost value. When the big data system performs attention demand analysis on the sample platform interest event data, it clusters the reference interest path node features contained in the sample platform interest event data in the second initial demand influence decision branch and determines a plurality of reference feature clusters. The reference interest path node features in the same reference feature cluster receive the same degree of attention in the second initial demand influence decision branch, so they should also receive the same degree of attention in the first initial demand influence decision branch.
For example, suppose the sample platform interest event data includes 6 reference interest path node features, F1, F2, F3, F4, F5 and F6, whose reference demand influence coefficients generated in the first initial demand influence decision branch are 0.10, 0.22, 0.11, 0.31, 0.22 and 0.12 in order, and the reference feature clusters generated in the second initial demand influence decision branch are reference feature cluster 1 {F1, F3, F6} and reference feature cluster 2 {F2, F4, F5}. The reference demand influence coefficients corresponding to F1, F3 and F6 in reference feature cluster 1 are nearly identical, which is reasonable; but the reference demand influence coefficient corresponding to F4 in reference feature cluster 2 is significantly higher than those of F2 and F5, which is unreasonable and therefore needs to be adjusted through the first model training cost value. That is, the demand influence coefficients generated in the first demand influence decision branch by the reference interest path node features in the same reference feature cluster should be uniformly distributed, so an output training cost value can be determined for each reference feature cluster. Finally, the output training cost values respectively corresponding to each reference feature cluster are accumulated to obtain the first model training cost value.

Illustratively, the output training cost value corresponding to the i-th reference feature cluster may be determined from the reference demand influence coefficients corresponding to the target reference interest path node features and the number of target reference interest path node features as follows: acquire the fitted demand influence coefficient distribution formed by the reference demand influence coefficients corresponding to the target reference interest path node features; normalize the fitted demand influence coefficient distribution to determine a first demand influence coefficient distribution; take the uniform demand influence coefficient distribution corresponding to the number of target reference interest path node features as a second demand influence coefficient distribution; and determine the output training cost value corresponding to the i-th reference feature cluster based on the first demand influence coefficient distribution and the second demand influence coefficient distribution. Suppose the reference demand influence coefficients corresponding to the target reference interest path node features are 0.10, 0.12 and 0.11, so the fitted demand influence coefficient distribution is [0.10, 0.12, 0.11]. Because the subsequent calculation of the output training cost value takes probabilities as input, the fitted demand influence coefficient distribution must be normalized so that its elements sum to 1, which gives a first demand influence coefficient distribution of approximately [0.303, 0.364, 0.333]. The number of target reference interest path node features is 3, so the corresponding uniform demand influence coefficient distribution, used as the second demand influence coefficient distribution, is [1/3, 1/3, 1/3].
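The per-cluster computation can be sketched with the coefficient values from the six-feature example. The application says only that the output training cost value is determined from the first (normalized) and second (uniform) distributions; the sketch assumes a Kullback-Leibler divergence between them, which is one common choice and not something the application states. Function names are hypothetical.

```python
import math
from typing import Dict, List, Sequence


def normalize(coefficients: Sequence[float]) -> List[float]:
    """Scale the fitted coefficients so they sum to 1 (first distribution)."""
    total = sum(coefficients)
    return [c / total for c in coefficients]


def cluster_cost(coefficients: Sequence[float]) -> float:
    """Assumed cost: KL divergence from the normalized coefficients to the uniform distribution."""
    first = normalize(coefficients)
    uniform = 1.0 / len(first)  # every entry of the second distribution
    return sum(p * math.log(p / uniform) for p in first if p > 0)


def first_model_training_cost(
    coefficients: Dict[str, float],
    clusters: Sequence[Sequence[str]],
) -> float:
    """Accumulate the output training cost values over all reference feature clusters."""
    return sum(cluster_cost([coefficients[name] for name in members]) for members in clusters)


coefficients = {"F1": 0.10, "F2": 0.22, "F3": 0.11, "F4": 0.31, "F5": 0.22, "F6": 0.12}
clusters = [["F1", "F3", "F6"], ["F2", "F4", "F5"]]
cost_cluster_1 = cluster_cost([coefficients[n] for n in clusters[0]])
cost_cluster_2 = cluster_cost([coefficients[n] for n in clusters[1]])
total_cost = first_model_training_cost(coefficients, clusters)
```

Under this assumed cost, cluster 2 is penalized more than cluster 1 because the coefficient of F4 departs further from the uniform value, which is the behavior the example calls for.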

In some exemplary designs, for Process100, performing interest point mining on the Internet activity track big data of the target user based on the user interest mining model obtained through artificial intelligence model training, and obtaining the interest pointing data corresponding to the target user, may be implemented by the following embodiments.

Process401, acquiring a reference Internet activity track data sequence, wherein the reference Internet activity track data sequence comprises reference Internet activity track data carrying prior interest pointing data.

The prior interest pointing data represents the pointing behavior node data of the target interest point data in the reference Internet activity track data.

Process402, acquiring the reference activity track blocks corresponding to the reference Internet activity track data, and obtaining a reference activity track block cluster from the reference activity track blocks.

The reference activity track block cluster carries prior cluster interest data corresponding to the prior interest pointing data, and a reference activity track block is a track block obtained by splitting the reference Internet activity track data based on the user activity time-space domain.
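Splitting a track into blocks can be sketched as follows. The application does not define how the "user activity time-space domain" drives the split; the sketch assumes that a new block starts whenever the page changes (space) or the gap since the previous event exceeds a threshold (time). The event layout `(timestamp_seconds, page, action)`, the `max_gap_seconds` parameter and the sample track are all hypothetical.

```python
from typing import List, Tuple

Event = Tuple[int, str, str]  # hypothetical layout: (timestamp_seconds, page, action)


def split_into_blocks(events: List[Event], max_gap_seconds: int = 300) -> List[List[Event]]:
    """Split a track into blocks at page changes or long pauses (assumed rule)."""
    blocks: List[List[Event]] = []
    for event in sorted(events, key=lambda e: e[0]):
        if blocks:
            previous = blocks[-1][-1]
            same_page = previous[1] == event[1]
            close_in_time = event[0] - previous[0] <= max_gap_seconds
            if same_page and close_in_time:
                blocks[-1].append(event)
                continue
        blocks.append([event])  # start a new block
    return blocks


trajectory = [
    (0, "category/shoes", "browse"),
    (40, "category/shoes", "share"),
    (90, "live/room-7", "browse"),
    (1000, "live/room-7", "order"),
]
blocks = split_into_blocks(trajectory)
```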

In some exemplary design considerations, after dividing the reference internet activity trajectory data into reference activity trajectory blocks, forming a reference activity trajectory block cluster includes at least one of the following.

And inducing the reference moving track blocks belonging to the same reference Internet moving track data into the same reference moving track block cluster, and determining the reference moving track block clusters corresponding to the reference Internet moving track data.

In this case, the prior interest pointing data labeled on the reference internet activity track data serves as the prior cluster interest data of the reference activity track block cluster. The pointing behavior node data of the target interest point data in each reference activity track block obtained by splitting is determined from the data partition in which the labeled target interest point data is located in the reference internet activity track data, and each reference activity track block is labeled with block interest data accordingly.

In some exemplary design ideas, when the prior interest pointing data of the reference internet activity track data indicates that the reference internet activity track data does not include the target interest point data, the reference activity track block cluster only needs prior cluster interest data, that is, prior cluster interest data characterizing that the reference activity track block cluster does not include the target interest point data, and the reference activity track blocks do not need to be labeled with block interest data.

After each piece of reference internet activity track data is split into reference activity track blocks, a reference activity track block sequence consisting of the reference activity track blocks is obtained, and n reference activity track blocks are randomly drawn from the reference activity track block sequence to form a reference activity track block cluster, where n is a preset positive integer.

That is, the reference activity track blocks in the same reference activity track block cluster are from the same or different reference internet activity track data.

The prior cluster interest data of the reference activity track block cluster is determined based on the block interest data corresponding to the reference activity track blocks; alternatively, it is determined based on the prior interest pointing data of the reference internet activity track data from which the reference activity track blocks originate. For example, if the prior interest pointing data of every source reference internet activity track data indicates that no target interest point data exists, the reference activity track blocks naturally do not include target interest point data either, and the prior cluster interest data of the reference activity track block cluster indicates that target interest point data is not included.

When the reference internet activity track data from which a reference activity track block originates contains target interest point data, the prior cluster interest data of the reference activity track block cluster needs to be determined based on the block interest data.
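A minimal sketch of forming a cluster by random sampling and deriving its prior cluster interest data from block interest data follows. The dictionary layout of a block and the rule that a cluster counts as containing the target when any of its blocks does are assumptions for illustration; the text fixes neither.

```python
import random


def sample_cluster(block_sequence, n, rng):
    """Randomly draw n distinct blocks from the pooled block sequence."""
    return rng.sample(block_sequence, n)


def cluster_prior(cluster):
    """Assumed rule: the cluster contains the target if any block does."""
    return any(block["has_target"] for block in cluster)


# Hypothetical pool: blocks split from two source trajectories, A and B.
pool = [
    {"source": "A", "has_target": False},
    {"source": "A", "has_target": True},
    {"source": "B", "has_target": False},
    {"source": "B", "has_target": False},
]
cluster = sample_cluster(pool, n=3, rng=random.Random(0))
label = cluster_prior(cluster)
```

When every source trajectory is labeled as free of target data, `cluster_prior` returns `False` without any per-block labeling, which matches the shortcut described above.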

The two approaches may also be combined: reference activity track blocks belonging to the same reference internet activity track data are grouped into the same reference activity track block cluster, and n reference activity track blocks are also randomly drawn from the reference activity track block sequence to form reference activity track block clusters. That is, the reference activity track block clusters include both clusters whose blocks are split from the same reference internet activity track data and clusters whose blocks are split from different reference internet activity track data.

The Process403 performs interest mining on the reference activity track block cluster based on the initialized user interest mining model, and determines a relative entropy training cost function value and a first cross entropy training cost function value corresponding to the reference activity track block cluster according to the feature training cost function value information between the prior cluster interest data and the predicted cluster interest data.

In some exemplary design ideas, the relative entropy training cost function value is determined according to the feature training cost function value information between the feature attention factor distribution, obtained by predicting the target interest point data in the reference activity track block cluster, and the interest influence parameter distribution corresponding to the prior cluster interest data of the reference activity track block cluster. That is, after the reference activity track block cluster is encoded based on the initialized user interest mining model, distribution state analysis is performed on the resulting coding description to obtain the feature attention factor distribution of the target interest point data in the reference activity track block cluster. The reference activity track block cluster carries prior cluster interest data, which indicates the interest influence parameter distribution corresponding to the cluster, and the relative entropy training cost function value of the cluster is then determined from the feature training cost function value information between the feature attention factor distribution and the interest influence parameter distribution. In other words, the reference activity track block cluster is encoded based on the initialized user interest mining model to determine a cluster interest description vector, and the relative entropy training cost function value corresponding to the cluster is determined according to the feature attention factor distribution corresponding to the cluster interest description vector and the interest influence parameter distribution corresponding to the prior cluster interest data.

In some exemplary design ideas, the first cross entropy training cost function value is determined according to the feature training cost function value information between the prediction result, obtained by analyzing the target interest point data in the reference activity track block cluster, and the prior cluster interest data of the reference activity track block cluster. That is, after the reference activity track block cluster is encoded based on the initialized user interest mining model, the extracted features are analyzed to obtain the predicted pointing behavior node data of the target interest point data in the reference activity track block cluster, and the first cross entropy training cost function value of the cluster is determined based on the feature training cost function value information between the predicted pointing behavior node data and the pointing behavior node data represented by the prior cluster interest data. In other words, the first cross entropy training cost function value corresponding to the reference activity track block cluster is determined according to the feature training cost function value information between the mining data, obtained by mining the target interest point data from the cluster interest description vector, and the prior cluster interest data.
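For intuition, a cross entropy cost between a cluster-level prediction and its prior label can be sketched as below. Treating the prior cluster interest data as a binary present/absent label and the prediction as a single probability is an assumption; the text does not state the label encoding.

```python
import math


def binary_cross_entropy(predicted_prob, label, eps=1e-12):
    """Cross entropy between a predicted probability and a 0/1 prior label."""
    p = min(max(predicted_prob, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(label * math.log(p) + (1 - label) * math.log(1.0 - p))


# Hypothetical predictions for a cluster whose prior label says the
# target interest point data is present (label = 1).
loss_good = binary_cross_entropy(0.9, 1)  # confident and correct
loss_bad = binary_cross_entropy(0.1, 1)   # confident and wrong
```

The same function would serve for the block-level cost by passing a block-level prediction and the block interest data instead.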

The Process404 performs interest mining on the reference activity track block based on the initialized user interest mining model, and determines a second cross entropy training cost function value corresponding to the reference activity track block according to the feature training cost function value information between the prior interest pointing data and the predicted interest pointing data.

In some exemplary design ideas, the second cross entropy training cost function value is determined according to the feature training cost function value information between the prediction result, obtained by analyzing the target interest point data in the reference activity track block, and the block interest data of the reference activity track block. The block interest data can be implemented as the prior interest pointing data of the reference internet activity track data, or as a track block label inferred from the prior interest pointing data.

That is, after the reference activity track block is encoded based on the initialized user interest mining model, the target interest point data is analyzed on the resulting coding description to obtain the predicted pointing behavior node data of the target interest point data in the reference activity track block, and the second cross entropy training cost function value of the reference activity track block is determined based on the feature training cost function value information between the predicted pointing behavior node data and the pointing behavior node data represented by the block interest data.

The Process403 and the Process404 are two parallel steps: the Process403 may be executed before the Process404, the Process404 may be executed before the Process403, or the two may be executed simultaneously, which is not limited in this embodiment.

The Process405 performs model parameter layer tuning and selection on the initialized user interest mining model according to the relative entropy training cost function value, the first cross entropy training cost function value and the second cross entropy training cost function value.

In some exemplary design ideas, the relative entropy training cost function value, the first cross entropy training cost function value and the second cross entropy training cost function value are fused to obtain a total training cost function value, and model parameter layer tuning and selection is then performed on the initialized user interest mining model according to the total training cost function value.

In some exemplary design ideas, when the relative entropy training cost function value, the first cross entropy training cost function value and the second cross entropy training cost function value are fused, each is weighted by its own corresponding weight; that is, a weighted calculation is performed on the three values to determine the total training cost function value.
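The weighted fusion can be written as a one-line weighted sum. The weight values and the sample cost values below are hypothetical, as the text does not give them.

```python
def total_cost(relative_entropy, first_ce, second_ce, weights=(1.0, 1.0, 1.0)):
    """Weighted sum of the three training cost function values."""
    w_kl, w_ce1, w_ce2 = weights
    return w_kl * relative_entropy + w_ce1 * first_ce + w_ce2 * second_ce


# Hypothetical cost values and weights, for illustration only.
total = total_cost(0.02, 0.40, 0.30, weights=(0.5, 1.0, 1.0))
```

Lowering the first weight, as in this example, reduces the pull of the attention-distribution term relative to the two classification terms.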

In some exemplary design ideas, when the model parameter layer of the initialized user interest mining model is optimized and selected according to the total training cost function value, the model parameters of the initialized user interest mining model are adjusted according to a gradient descent method.
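A bare-bones gradient descent update is sketched below on a toy one-parameter cost, (w - 3)^2, chosen only so the gradient can be written by hand. The learning rate, step count and toy cost are assumptions; in practice the gradients would come from the total training cost function value of the model.

```python
def gradient_descent_step(params, grads, learning_rate):
    """Move each parameter against its gradient."""
    return [p - learning_rate * g for p, g in zip(params, grads)]


# Toy cost f(w) = (w - 3)^2 with gradient 2 * (w - 3).
params = [0.0]
for _ in range(100):
    grads = [2.0 * (params[0] - 3.0)]
    params = gradient_descent_step(params, grads, learning_rate=0.1)
```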

In some exemplary design ideas, when the initialized user interest mining model is trained, the parameters of the first interest description vector aggregation unit and the second interest description vector aggregation unit in the model are adjusted based on the global training cost function value; in some exemplary design ideas, the parameters of other network layers in the initialized user interest mining model can also be adjusted based on the global training cost function value.

Based on the above steps, in the model configuration flow of the initialized user interest mining model, for reference internet activity track data requiring interest mining, the initialized user interest mining model is tuned using both the reference activity track blocks and the reference activity track block clusters. This improves the effectiveness of mining target interest point data in the reference activity track blocks and avoids a large error in the mining data for the whole reference internet activity track data caused by a mining error on a single reference activity track block.

In some exemplary design ideas, the initialized user interest mining model includes a first interest description vector aggregation unit and a second interest description vector aggregation unit, and the relative entropy training cost function value and the cross entropy training cost function values are determined by means of these interest description vector aggregation units. The above-described Process403 and Process404 may be implemented as the following steps.

The Process4031 encodes the reference activity track block cluster based on the initialized user interest mining model and determines a cluster interest description vector.

In some exemplary design ideas, the reference activity track block cluster is encoded by the coding unit in the initialized user interest mining model to determine the cluster interest description vector. In some exemplary design ideas, the coding unit encodes each reference activity track block in the reference activity track block cluster to determine block coding description variables, and the cluster interest description vector is composed of these block coding description variables; that is, the cluster interest description vector is a sequence of block coding description variables.

The above takes the coding unit being a component of the initialized user interest mining model as an example; in some exemplary design ideas, the coding unit may also be implemented as an independent feature extraction network.

In some exemplary design ideas, the coding unit upsamples/downsamples the reference active track blocks in the reference active track block cluster based on a convolution operation, thereby obtaining block coding description variables, and integrates the block coding description variables to form a cluster interest description vector.
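The convolution-based encoding can be illustrated with a strided one-dimensional convolution that downsamples each block. Representing a block as a list of numbers, the averaging kernel and the stride of 2 are assumptions; a trained coding unit would learn its kernels.

```python
def conv1d(signal, kernel, stride=1):
    """Valid 1-D convolution (cross-correlation form) with a stride."""
    width = len(kernel)
    return [
        sum(signal[start + k] * kernel[k] for k in range(width))
        for start in range(0, len(signal) - width + 1, stride)
    ]


def encode_cluster(blocks, kernel, stride=2):
    """Encode each block; the cluster vector is the sequence of block codes."""
    return [conv1d(block, kernel, stride) for block in blocks]


# Hypothetical numeric blocks, e.g. per-step activity intensities.
blocks = [[1.0, 2.0, 3.0, 4.0], [0.0, 1.0, 0.0, 1.0]]
cluster_vector = encode_cluster(blocks, kernel=[0.5, 0.5], stride=2)
```

With a stride of 2, each four-element block is reduced to a two-element block coding description variable, and the cluster interest description vector is the list of those variables.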

The Process4032 performs first interest vector learning on the cluster interest description vector based on the first interest description vector aggregation unit in the initialized user interest mining model, and determines a first interest description aggregate vector.

The initialized user interest mining model comprises the first interest description vector aggregation unit, where an interest description vector aggregation unit can be understood as a fully connected layer.

The Process4033 performs attention-mechanism-based interest vector learning on the first interest description aggregate vector, and determines the feature attention factor distribution corresponding to the reference activity track block cluster.
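One common way to obtain a distribution over the blocks of a cluster is to score each block vector and normalize the scores with softmax. The sketch below assumes dot-product scoring against a query vector; the text does not specify the form of the attention mechanism, so both the scoring rule and the names are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    shift = max(scores)
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_distribution(block_vectors, query):
    """Dot-product score per block, normalized over the cluster."""
    scores = [sum(v * q for v, q in zip(vec, query)) for vec in block_vectors]
    return softmax(scores)


# Hypothetical aggregated block vectors and a hypothetical query vector.
block_vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
attention = attention_distribution(block_vectors, query=[1.0, 1.0])
```

The resulting list sums to 1 and assigns the largest weight to the block whose vector aligns best with the query.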

The Process4034 determines the interest influence parameter distribution corresponding to the reference activity track block cluster according to the prior cluster interest data.

The manner in which the prior cluster interest data is obtained is described in the above Process402, and is not described herein.

In some exemplary design considerations, determining the interest influencing parameter distribution of the reference active trajectory block cluster from the a priori cluster interest data includes at least one of the following.

If the prior cluster interest data indicates that no target interest point data exists in the reference activity track blocks of the reference activity track block cluster, the interest influence parameter distribution corresponding to the reference activity track block cluster is determined to be a uniform distribution.

If the priori cluster interest data are used for representing that target interest point data exist in a reference moving track block in the reference moving track block cluster, block interest data corresponding to the reference moving track block in the priori cluster interest data are obtained, and interest influence parameter distribution of the reference moving track block cluster is determined according to the block interest data.
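The two cases can be sketched as a single function. The uniform case follows the text; for the case where target data exists, spreading the probability mass evenly over the blocks labeled as containing the target is an assumed rule, since the text says only that the distribution is determined from the block interest data.

```python
def interest_influence_distribution(block_labels):
    """block_labels: 0/1 flags, where 1 means the block holds target data."""
    n = len(block_labels)
    positives = sum(block_labels)
    if positives == 0:
        # No target anywhere in the cluster: uniform distribution.
        return [1.0 / n] * n
    # Assumed rule: spread the mass evenly over the positive blocks.
    return [label / positives for label in block_labels]


no_target = interest_influence_distribution([0, 0, 0, 0])
with_target = interest_influence_distribution([0, 1, 0, 1])
```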

The Process4035 determines a relative entropy training cost function value corresponding to the reference activity track block cluster according to the feature training cost function value information between the feature attention factor distribution and the interest influence parameter distribution.

In some exemplary design ideas, after the feature attention factor distribution and the interest influence parameter distribution are determined, feature training cost function value information between the feature attention factor distribution and the interest influence parameter distribution is determined, so that a relative entropy training cost function value corresponding to the reference moving track block cluster is obtained.

The Process4036 performs second interest vector learning on the first interest description aggregate vector and the feature attention factor distribution corresponding to the reference activity track block cluster based on the second interest description vector aggregation unit in the initialized user interest mining model, and determines a second interest description aggregate vector as the mining data.

In the above process, the training cost function value is determined according to the feature attention factor distribution of the reference activity track block cluster. In addition, a cross entropy training cost function value can be determined for the mining data of the target interest point data in the reference activity track block cluster.

In some exemplary design ideas, the first interest description aggregate vector and the feature attention factor distribution corresponding to the reference activity track block cluster are aggregated, and the aggregated feature is input to the second interest description vector aggregation unit.
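One plausible form of this aggregation is an attention-weighted sum of the per-block vectors, sketched below. The text does not state how the two inputs are combined, so the weighted-sum rule and the name `attention_pool` are assumptions.

```python
def attention_pool(block_vectors, attention):
    """Weighted sum of block vectors using the attention distribution."""
    dim = len(block_vectors[0])
    return [
        sum(a * vec[d] for a, vec in zip(attention, block_vectors))
        for d in range(dim)
    ]


# Hypothetical two-block cluster with a hypothetical attention distribution.
pooled = attention_pool([[1.0, 0.0], [0.0, 1.0]], [0.25, 0.75])
```

The pooled vector would then be passed to the second aggregation unit, which under the earlier description can be understood as a fully connected layer.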

The Process4037 determines a first cross entropy training cost function value corresponding to the reference activity track block cluster according to the feature training cost function value information between the second interest description aggregate vector and the prior cluster interest data.

The Process4041 encodes the reference activity track block based on the initialized user interest mining model and determines a block coding description variable.

In some exemplary design ideas, the reference activity track block is encoded by the coding unit in the initialized user interest mining model to determine the block coding description variable.

The Process4042 performs first interest vector learning on the block coding description variable based on the first interest description vector aggregation unit in the initialized user interest mining model, and determines a third interest description aggregate vector.

The Process4043 performs second interest vector learning on the third interest description aggregate vector based on the second interest description vector aggregation unit in the initialized user interest mining model, and determines a fourth interest description aggregate vector as predicted interest pointing data.

The Process4044 determines a second cross entropy training cost function value corresponding to the reference activity track block according to the feature training cost function value information between the fourth interest description aggregate vector and the block interest data.

In some exemplary design ideas, weighting calculation is performed on the relative entropy training cost function value, the first cross entropy training cost function value and the second cross entropy training cost function value, the global training cost function value is determined, and model parameter layer tuning and selection are performed on the initialized user interest mining model according to the global training cost function value.

In some exemplary design considerations, the Process402 may also be implemented as follows.

The Process4021 divides the reference internet activity track data based on the user activity time-space domain to determine a reference activity track block.

In some exemplary design concepts, the manner in which the splitting of the reference internet activity trajectory data based on the user activity time-space domain is performed is described in the above Process402, which is not described herein again.

The Process4022 allocates the reference activity track blocks belonging to the same reference internet activity track data to the same cluster, and determines the reference activity track block cluster.

In some exemplary design ideas, all the reference activity track blocks belonging to the same reference internet activity track data are distributed to the same cluster, and the reference activity track block cluster is determined; or, part of the reference activity track blocks belonging to the same reference internet activity track data are allocated to the same cluster.

When only part of the reference activity track blocks are allocated to the same cluster, the allocation includes at least one of the following.

1. Randomly select n reference activity track blocks from the reference activity track blocks of the reference internet activity track data, and allocate the n reference activity track blocks to the same cluster.

2. Select n reference activity track blocks from a designated position area of the reference internet activity track data, and allocate the n reference activity track blocks to the same cluster.

3. Select n reference activity track blocks in a skipping manner from the reference activity track blocks of the reference internet activity track data, and allocate the n reference activity track blocks to the same cluster.

That is, one of every two adjacent reference activity track blocks is selected and allocated to the same cluster.
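The skipping selection can be expressed with a slice that keeps every second block. Starting from the first block and raising an error when fewer than n blocks remain are assumptions for illustration.

```python
def select_every_other(blocks, n):
    """Keep one of every two adjacent blocks, then take the first n."""
    picked = blocks[::2]
    if len(picked) < n:
        raise ValueError("not enough blocks for the requested cluster size")
    return picked[:n]


# Hypothetical block identifiers in trajectory order.
cluster = select_every_other(["b0", "b1", "b2", "b3", "b4", "b5", "b6"], n=3)
```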

In some exemplary design ideas, if the reference moving track blocks in the reference moving track block cluster are from the same reference internet moving track data, the priori interest pointing data corresponding to the reference internet moving track data is used as the priori cluster interest data corresponding to the reference moving track block cluster.

The Process4023 mixes and allocates the reference activity track blocks belonging to different reference internet activity track data to the same cluster to determine the reference activity track block cluster.

In some exemplary design ideas, the mixed allocation includes at least one of the following allocation schemes.

First, at least one reference activity track block is selected from the reference activity track blocks of each reference internet activity track data, and the selected blocks are allocated to the same cluster to obtain a reference activity track block cluster.

The number of the reference activity track blocks obtained from each reference Internet activity track data is the same or different.

Secondly, mixing the reference moving track blocks of different reference Internet moving track data, determining a reference moving track block sequence, and randomly acquiring n reference moving track blocks from the reference moving track block sequence to form a reference moving track block cluster.

Thirdly, a portion of the reference activity track blocks is taken from each group of reference internet activity track data classified under a different label, and these portions together form a reference activity track block cluster.

It should be noted that the above-mentioned allocation manner of the reference active track block clusters is merely an example, and the present embodiment is not limited thereto.

In some exemplary design considerations, if the reference active track blocks in the reference active track block cluster are from different reference internet active track data, determining prior cluster interest data corresponding to the reference active track block cluster according to the block interest data corresponding to the reference active track block.

In some embodiments, big data system 100 may include a processor 110, a machine-readable storage medium 120, a bus 130, and a communication unit 140.

The processor 110 may perform various suitable actions and processes by programs stored in the machine-readable storage medium 120, such as the program instructions associated with the artificial intelligence recognition-based user attention demand decision method described in the foregoing embodiments. The processor 110, the machine-readable storage medium 120, and the communication unit 140 communicate signals over the bus 130.

In particular, the processes described in the above exemplary flowcharts may be implemented as computer software programs, in accordance with embodiments of the present invention. For example, embodiments of the present invention include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method shown in the flowcharts. In such an embodiment, the computer program may be downloaded and installed from a network via the communication unit 140, which, when executed by the processor 110, performs the above-described functions defined in the method of the embodiment of the invention.

Still another embodiment of the present invention provides a computer readable storage medium having stored therein computer executable instructions which, when executed by a processor, implement the artificial intelligence recognition based user attention demand decision method as described in any of the above embodiments.

Yet another embodiment of the present invention provides a computer program product comprising a computer program which, when executed by a processor, implements the artificial intelligence recognition based user attention demand decision method as described in any of the above embodiments.

It should be understood that, although various operation steps are indicated by arrows in the flowcharts of the embodiments of the present application, the order in which these steps are implemented is not limited to the order indicated by the arrows. In some implementations of embodiments of the application, the implementation steps in the flowcharts may be performed in other orders as desired, unless explicitly stated herein. Furthermore, some or all of the steps in the flowcharts may include multiple sub-steps or multiple stages based on the actual implementation scenario. Some or all of these sub-steps or phases may be performed at the same time, or each of these sub-steps or phases may be performed at different times, respectively. In the case of different execution time, the execution sequence of the sub-steps or stages can be flexibly configured according to the requirement, which is not limited by the embodiment of the present application.

The foregoing is merely an optional implementation of some implementation scenarios of the present application. It should be noted that, for those skilled in the art, other similar implementations adopted on the basis of the technical idea of the present application, without departing from that technical idea, also fall within the protection scope of the embodiments of the present application.

Claims

1. A user attention demand decision method based on artificial intelligence recognition, comprising:

performing page content optimization on online service pages subscribed by the target user based on user attention demand distribution of the target user;

the step of carrying out interest path feature analysis on the historical platform interest big data of the target user and carrying out user interest demand decision based on the interest path feature analysis result to obtain the user interest demand distribution of the target user specifically comprises the following steps:

acquiring candidate platform interest event data matched with a content service item waiting to be online from historical platform interest big data of a target user, analyzing interest path network of the candidate platform interest event data, and determining an interest path network for representing the candidate platform interest event data; the candidate platform interest event data includes a plurality of platform interest events; the interest path network includes a plurality of interest path node features; a platform interest event corresponds to an interest path node feature;

determining demand influence coefficients corresponding to the plurality of interest path node features respectively, carrying out feature fusion on the plurality of interest path node features based on the demand influence coefficients, and determining a first interest path feature;

clustering the plurality of interest path node features, determining member interest path node features contained in the plurality of feature clusters respectively, determining frequent pattern metric values corresponding to each member interest path node feature based on the plurality of feature clusters and a preset frequent pattern tree, and determining a second interest path feature based on the plurality of frequent pattern metric values; the frequent pattern metric value corresponding to the member interest path node characteristic is determined according to the member interest path node characteristic in the belonging characteristic cluster;

and determining user attention demand decision information of the candidate platform interest event data based on the first interest path feature and the second interest path feature.

2. The method for determining a user attention demand decision based on artificial intelligence recognition according to claim 1, wherein the determining a demand influence coefficient corresponding to each of the plurality of interest path node features, performing feature fusion on the plurality of interest path node features based on the demand influence coefficient, and determining a first interest path feature, specifically includes:

loading the plurality of interest path node features into a first demand influence decision branch in a user attention demand decision model; the first demand influence decision branch comprises a demand influence decision unit and a demand influence fusion unit;

in the demand influence decision unit, respectively carrying out demand influence decision on the plurality of interest path node characteristics, and determining demand influence coefficients respectively corresponding to the plurality of interest path node characteristics;

and in the demand influence fusion unit, fusing each interest path node characteristic based on the demand influence coefficient, determining the fused interest path node characteristic corresponding to each interest path node characteristic, summarizing the fused interest path node characteristics, and determining a first interest path characteristic.

3. The method for determining user attention demand decision based on artificial intelligence recognition according to claim 1, wherein the step of determining the user attention demand decision information of the candidate platform interest event data based on the first interest path feature and the second interest path feature specifically comprises:

fusing the first interest path feature and the second interest path feature to determine a fused interest path feature;

carrying out user attention demand decision on the fused interest path feature, and determining user attention demand decision information of the candidate platform interest event data;

the method further comprises the steps of:

acquiring sample platform interest event data, carrying out interest path network analysis on the sample platform interest event data, and determining a reference interest path network for representing the sample platform interest event data; the sample platform interest event data includes a plurality of reference platform interest events; the reference interest path network includes a plurality of reference interest path node features; each reference platform interest event corresponds to one reference interest path node feature;

loading the plurality of reference platform interest events to an initial user attention demand decision model, determining reference demand influence coefficients corresponding to the plurality of reference interest path node features respectively in the initial user attention demand decision model, performing feature fusion on the plurality of reference interest path node features based on the reference demand influence coefficients corresponding to the plurality of reference interest path node features respectively, and determining a first reference interest path feature;

in the initial user attention demand decision model, clustering the plurality of reference interest path node features, determining the reference member interest path node features respectively contained in a plurality of reference feature clusters, determining a reference frequent pattern metric value corresponding to each reference member interest path node feature based on the plurality of reference feature clusters and a preset frequent pattern tree, and determining a second reference interest path feature based on the plurality of reference frequent pattern metric values; the reference frequent pattern metric value corresponding to a reference member interest path node feature is determined according to the reference member interest path node features in the reference feature cluster to which it belongs;

determining, in the initial user attention demand decision model, prior user attention demand decision information for the sample platform interest event data based on the first reference interest path feature and the second reference interest path feature;

and performing model tuning on the initial user attention demand decision model based on the plurality of reference feature clusters, the reference demand influence coefficients respectively corresponding to the plurality of reference interest path node features, the prior user attention demand decision information and the user attention demand corresponding to the sample platform interest event data, and determining a user attention demand decision model for outputting user attention demand decision information of candidate platform interest event data.

4. The user attention demand decision method based on artificial intelligence recognition according to claim 3, wherein the step of performing model tuning on the initial user attention demand decision model based on the plurality of reference feature clusters, the reference demand influence coefficients respectively corresponding to the plurality of reference interest path node features, the prior user attention demand decision information, and the user attention demand corresponding to the sample platform interest event data, and determining a user attention demand decision model for outputting user attention demand decision information of candidate platform interest event data specifically comprises:

determining a first model training cost value based on the plurality of reference feature clusters and the reference demand influence coefficients respectively corresponding to the plurality of reference interest path node features;

determining a second model training cost value based on the prior user attention demand decision information and the user attention demand corresponding to the sample platform interest event data;

carrying out weighted summation on the first model training cost value and the second model training cost value, and determining a target model training cost value;

performing model tuning on the initial user attention demand decision model based on the target model training cost value, and determining a user attention demand decision model for outputting user attention demand decision information of candidate platform interest event data;

the step of determining the first model training cost value based on the plurality of reference feature clusters and the reference demand influence coefficients respectively corresponding to the plurality of reference interest path node features specifically includes:

acquiring an ith reference feature cluster in the plurality of reference feature clusters; i is a positive integer, and i is not greater than the number of reference feature clusters;

taking the reference interest path node features contained in the ith reference feature cluster as target reference interest path node features;

determining an output training cost value corresponding to the ith reference feature cluster based on the reference demand influence coefficients corresponding to the target reference interest path node features and the number of the target reference interest path node features;

accumulating the output training cost values corresponding to each reference feature cluster respectively to determine a first model training cost value;

the step of determining the output training cost value corresponding to the ith reference feature cluster based on the reference demand influence coefficient corresponding to the target reference interest path node feature and the number of the target reference interest path node features specifically includes:

acquiring a fitted demand influence coefficient distribution formed by the reference demand influence coefficients corresponding to the target reference interest path node features;

normalizing the fitted demand influence coefficient distribution to determine a first demand influence coefficient distribution;

taking the uniform demand influence coefficient distribution corresponding to the number of the target reference interest path node features as a second demand influence coefficient distribution;

and determining the output training cost value corresponding to the ith reference feature cluster based on the first demand influence coefficient distribution and the second demand influence coefficient distribution.
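By way of non-limiting illustration, the cost construction of claim 4 can be sketched as follows. The claim only says that the per-cluster output training cost value is determined "based on" the first (normalized) and second (uniform) distributions; using the Kullback-Leibler divergence between them is an assumption of this sketch, as are the function names, the index-list cluster representation, and the default weights. The claim also does not state whether training drives this term toward or away from uniformity.

```python
import math


def cluster_output_cost(coefficients):
    """Output training cost for one reference feature cluster (sketch)."""
    total = sum(coefficients)
    first = [c / total for c in coefficients]   # normalized fitted distribution
    uniform = 1.0 / len(coefficients)           # second (uniform) distribution
    # Assumption: KL(first || second); zero-probability terms contribute nothing.
    return sum(p * math.log(p / uniform) for p in first if p > 0.0)


def first_model_training_cost(clusters, coefficients):
    """Accumulate the output cost over all clusters.

    clusters: list of lists of node indices, one list per reference feature cluster.
    coefficients: reference demand influence coefficient for each node index.
    """
    return sum(cluster_output_cost([coefficients[i] for i in members])
               for members in clusters)


def target_training_cost(first_cost, second_cost, w_first=0.5, w_second=0.5):
    """Weighted sum of the two model training cost values (weights are hypothetical)."""
    return w_first * first_cost + w_second * second_cost
```

Under this assumption, a cluster whose members all receive equal coefficients contributes zero cost, and the cost grows as the coefficients within a cluster become more uneven.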

5. The user attention demand decision method based on artificial intelligence recognition according to any one of claims 1 to 4, wherein the step of performing interest point mining on the Internet activity track big data of a target user by using a user interest mining model obtained through artificial intelligence model training, to obtain interest pointing data corresponding to the target user, specifically comprises:

splitting the Internet activity track big data of the target user into a plurality of pieces of Internet activity track data, inputting the pieces of Internet activity track data into the user interest mining model obtained through artificial intelligence model training to mine interest points, and obtaining the interest pointing data corresponding to the target user;

the training of the user interest mining model comprises the following steps:

acquiring a reference Internet activity track data sequence, wherein the reference Internet activity track data sequence comprises reference Internet activity track data carrying prior interest pointing data, and the prior interest pointing data represents pointing behavior node data of target interest point data in the reference Internet activity track data;

acquiring a reference activity track block corresponding to the reference Internet activity track data, and acquiring a reference activity track block cluster according to the reference activity track block, wherein the reference activity track block cluster carries prior cluster interest data corresponding to the prior interest pointing data, and the reference activity track block is a track block obtained by splitting the reference Internet activity track data based on a user activity time-space domain;

performing interest mining on the reference activity track block cluster based on an initialized user interest mining model, and determining a relative entropy training cost function value and a first cross entropy training cost function value corresponding to the reference activity track block cluster according to feature training cost function value information between the prior cluster interest data and predicted cluster interest data;

performing interest mining on the reference activity track block based on the initialized user interest mining model, and determining a second cross entropy training cost function value corresponding to the reference activity track block according to feature training cost function value information between the prior interest pointing data and predicted interest pointing data;

performing weighted calculation on the relative entropy training cost function value, the first cross entropy training cost function value and the second cross entropy training cost function value to determine a global training cost function value;

and adjusting and selecting a model parameter layer according to the global training cost function value, wherein the adjusted and selected model parameter layer is used for mining target interest point data in the Internet activity track data.
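By way of non-limiting illustration, the weighted calculation of the global training cost function value in claim 5 can be sketched as a linear combination of the three terms. The claim does not give the weights or the exact form of the cross entropy; the binary cross entropy, the clamping constant, the function names, and the default weights below are assumptions of this sketch.

```python
import math


def cross_entropy(predicted_prob, label, eps=1e-12):
    """Binary cross entropy for one prediction (assumed form).

    predicted_prob: model probability that target interest point data is present.
    label: 1 if the prior interest data marks it present, else 0.
    """
    p = min(max(predicted_prob, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(label * math.log(p) + (1 - label) * math.log(1.0 - p))


def global_training_cost(relative_entropy, first_ce, second_ce,
                         weights=(1.0, 1.0, 1.0)):
    """Weighted combination of the cluster-level relative entropy term, the
    cluster-level cross entropy term and the block-level cross entropy term.
    The weights are hypothetical hyperparameters."""
    w_re, w_ce1, w_ce2 = weights
    return w_re * relative_entropy + w_ce1 * first_ce + w_ce2 * second_ce
```

The resulting scalar is what the model parameter layers would be tuned against, for example by gradient descent in a framework of choice.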

6. The user attention demand decision method based on artificial intelligence recognition according to claim 5, wherein the step of performing interest mining on the reference activity track block cluster based on an initialized user interest mining model, and determining a relative entropy training cost function value and a first cross entropy training cost function value corresponding to the reference activity track block cluster according to feature training cost function value information between the prior cluster interest data and predicted cluster interest data, specifically comprises:

performing coding description on the reference activity track block cluster based on the initialized user interest mining model, and determining a cluster interest description vector;

performing, based on a first interest description vector aggregation unit in the initialized user interest mining model, first interest vector learning on the cluster interest description vector, and determining a first interest description aggregation vector;

performing attention-mechanism-based interest vector learning on the first interest description aggregation vector, and determining a feature attention factor distribution corresponding to the reference activity track block cluster;

if the prior cluster interest data indicates that the target interest point data does not exist in the reference activity track blocks in the reference activity track block cluster, determining that the interest influence parameter distribution of the reference activity track block cluster is a uniform distribution;

if the prior cluster interest data indicates that the target interest point data exists in the reference activity track blocks in the reference activity track block cluster, acquiring block interest data corresponding to the reference activity track blocks from the prior cluster interest data;

determining the interest influence parameter distribution of the reference activity track block cluster according to the block interest data;

determining the relative entropy training cost function value corresponding to the reference activity track block cluster according to feature training cost function value information between the feature attention factor distribution and the interest influence parameter distribution;

performing, based on a second interest description vector aggregation unit in the initialized user interest mining model, second interest vector learning on the first interest description aggregation vector and the feature attention factor distribution corresponding to the reference activity track block cluster, and determining a second interest description aggregation vector;

and determining the first cross entropy training cost function value corresponding to the reference activity track block cluster according to feature training cost function value information between the second interest description aggregation vector and the prior cluster interest data.
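By way of non-limiting illustration, the relative entropy term of claim 6 compares the model's attention over the blocks of a cluster with a target distribution derived from the prior labels. The claim fixes only the uniform case; spreading the probability mass equally over the blocks marked as containing target interest point data, the direction of the divergence, the 0/1 block labels, and the function names are assumptions of this sketch.

```python
import math


def interest_influence_distribution(block_labels):
    """Target distribution over the blocks of one cluster (sketch).

    block_labels: 0/1 flags, 1 if the block contains target interest point data.
    No positive block -> uniform distribution, as stated in the claim.
    Otherwise (assumption) mass is shared equally among the positive blocks.
    """
    count = len(block_labels)
    positives = sum(block_labels)
    if positives == 0:
        return [1.0 / count] * count
    return [label / positives for label in block_labels]


def relative_entropy(target, attention, eps=1e-12):
    """KL(target || attention); terms with zero target mass contribute nothing."""
    return sum(t * math.log(t / max(a, eps))
               for t, a in zip(target, attention) if t > 0.0)
```

Under these assumptions, the term is zero when the feature attention factor distribution matches the target exactly and increases as attention drifts onto blocks without target interest point data.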

7. The user attention demand decision method based on artificial intelligence recognition according to claim 5, wherein the step of performing interest mining on the reference activity track block based on the initialized user interest mining model, and determining a second cross entropy training cost function value corresponding to the reference activity track block according to feature training cost function value information between the prior interest pointing data and predicted interest pointing data, specifically comprises:

performing coding description on the reference activity track block based on the initialized user interest mining model, and determining a block coding description variable;

based on a first interest description vector aggregation unit in the initialized user interest mining model, performing first interest vector learning on the block coding description variable, and determining a third interest description aggregation vector;

performing second interest vector learning on the third interest description aggregation vector based on a second interest description vector aggregation unit in the initialized user interest mining model, and determining a fourth interest description aggregation vector as the predicted interest pointing data;

and determining the second cross entropy training cost function value corresponding to the reference activity track block according to feature training cost function value information between the fourth interest description aggregation vector and block interest data, wherein the block interest data is interest data corresponding to the reference activity track block determined according to the prior interest pointing data.

8. The user attention demand decision method based on artificial intelligence recognition according to claim 5, wherein the step of acquiring the reference activity track block corresponding to the reference Internet activity track data and acquiring the reference activity track block cluster according to the reference activity track block specifically comprises:

splitting the reference Internet activity track data based on a user activity time-space domain, and determining the reference activity track block;

assigning reference activity track blocks belonging to the same reference Internet activity track data to the same cluster, and determining the reference activity track block cluster; or mixing reference activity track blocks belonging to different reference Internet activity track data and assigning them to the same cluster, and determining the reference activity track block cluster;

if the reference activity track blocks in the reference activity track block cluster come from the same reference Internet activity track data, using the prior interest pointing data corresponding to the reference Internet activity track data as the prior cluster interest data corresponding to the reference activity track block cluster;

and if the reference activity track blocks in the reference activity track block cluster come from different reference Internet activity track data, determining the prior cluster interest data corresponding to the reference activity track block cluster according to the block interest data corresponding to the reference activity track blocks.
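By way of non-limiting illustration, the two cluster-building options of claim 8 and the corresponding choice of prior cluster interest data can be sketched as follows. The `Block` record, the boolean form of the interest data, the fixed cluster size, the seeded shuffle, and the rule that a mixed cluster is positive when any of its blocks is positive are all assumptions of this sketch; the claim only states that the mixed-cluster label is determined from the block interest data.

```python
import random
from collections import defaultdict, namedtuple

# Hypothetical record: the source track data a block was split from, and
# whether the block contains target interest point data.
Block = namedtuple("Block", ["source_id", "has_interest"])


def same_source_clusters(blocks):
    """Option 1: blocks from the same reference track data share a cluster."""
    groups = defaultdict(list)
    for block in blocks:
        groups[block.source_id].append(block)
    return list(groups.values())


def mixed_clusters(blocks, cluster_size, seed=0):
    """Option 2: blocks from different sources are shuffled into clusters."""
    shuffled = list(blocks)
    random.Random(seed).shuffle(shuffled)
    return [shuffled[i:i + cluster_size]
            for i in range(0, len(shuffled), cluster_size)]


def prior_cluster_interest(cluster, source_labels):
    """Prior cluster interest data for one cluster.

    Single-source cluster: reuse the label of the source track data.
    Mixed cluster (assumption): positive if any member block is positive.
    """
    sources = {block.source_id for block in cluster}
    if len(sources) == 1:
        return source_labels[next(iter(sources))]
    return any(block.has_interest for block in cluster)
```

Mixing blocks across sources is one way to obtain clusters with varied proportions of positive blocks from a limited set of labeled track data.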

9. A big data system, characterized in that the big data system comprises a processor and a memory for storing a computer program capable of running on the processor, the processor being configured to execute, when running the computer program, the user attention demand decision method based on artificial intelligence recognition according to any one of claims 1 to 8.