CN114817747B - User behavior analysis method based on internet big data and cloud computing service system - Google Patents

Info

Publication number
CN114817747B
Authority
CN
China
Prior art keywords
intention
behavior
data
behavior intention
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210572060.5A
Other languages
Chinese (zh)
Other versions
CN114817747A (en
Inventor
朱新平
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Cummy Technology Shanghai Co ltd
Original Assignee
Cummy Technology Shanghai Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Cummy Technology Shanghai Co ltd
Priority to CN202210572060.5A
Priority to CN202211291929.5A (CN115587252A)
Publication of CN114817747A
Application granted
Publication of CN114817747B
Legal status: Active
Anticipated expiration

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/95Retrieval from the web
    • G06F16/953Querying, e.g. by the use of web search engines
    • G06F16/9535Search customisation based on user profiles and personalisation
    • G06F16/9536Search customisation based on social or collaborative filtering

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

Embodiments of the invention provide a user behavior analysis method based on internet big data and a cloud computing service system. After target user behavior intention data matching the current service requirement to be online is obtained, behavior intention relationship extraction is performed on it to generate a corresponding behavior intention relationship map. Target behavior intention entities whose behavior intention relationship attributes are associated with at least two behavior intention entities are then extracted from the map, yielding at least one target behavior intention entity together with the associated behavior intention entities to which each target behavior intention entity is linked through those relationship attributes. Because behavior intention entities with multiple associations are extracted on the basis of the current service requirement to be online and the behavior intention relationships, and internet content is pushed according to the extracted entities, the degree of matching between the pushed internet content and the current service requirement to be online can be improved.

Description

User behavior analysis method based on internet big data and cloud computing service system
Technical Field
The invention relates to the technical field of big data mining, in particular to a user behavior analysis method based on internet big data and a cloud computing service system.
Background
With the rise of online e-commerce service platforms, the traffic-driven era of the internet has come to an end, and the internet of the future will be an era of big data plus algorithms. Accurate internet content pushing based on big data algorithms is being applied in every area of users' lives.
Disclosure of Invention
In order to overcome at least the above disadvantages in the prior art, the present invention provides a user behavior analysis method based on internet big data and a cloud computing service system.
In a first aspect, an embodiment of the present invention provides an internet big data-based user behavior analysis method, which is applied to a cloud computing service system, and the method includes:
performing behavior intention mining on internet behavior big data of a target user to generate user behavior intention data of the target user, loading the user behavior intention data into a user behavior intention big data log of the target user in real time, and generating target user behavior intention data matched with the current service requirement to be online based on the user behavior intention big data log;
performing behavior intention relationship extraction on the target user behavior intention data to generate a corresponding behavior intention relationship map, wherein the behavior intention relationship map is used for expressing a plurality of behavior intention entities and behavior intention relationship attributes among the behavior intention entities;
extracting target behavioral intention entities of which behavioral intention relationship attributes are associated with at least two behavioral intention entities from the behavioral intention relationship graph to obtain at least one target behavioral intention entity and associated behavioral intention entities of which each target behavioral intention entity is associated through the behavioral intention relationship attributes;
and carrying out internet content pushing on the internet service page corresponding to the target user based on at least one target behavioral intention entity and the associated behavioral intention entity associated with each target behavioral intention entity through the behavioral intention relationship attribute.
In some possible implementations of the first aspect, the step of performing behavioral intention mining on internet behavior big data of a target user to generate user behavioral intention data of the target user includes:
inputting the Internet behavior big data of the target user into a user behavior intention mining model meeting the model deployment requirement, and generating user behavior intention data of the target user;
the user behavior intention mining model comprises a student intention positioning unit, a teacher intention positioning unit and a behavior intention mining unit;
the training step of the user behavior intention mining model comprises the following steps:
processing first sample user behavior data carrying prior user behavior intention labeling information according to the student intention positioning unit, and outputting student intention positioning information of the first sample user behavior data, wherein the student intention positioning information of the first sample user behavior data represents a student intention positioning point and a student intention positioning label of a sample user in the first sample user behavior data;
respectively processing second sample user behavior data which do not carry prior user behavior intention marking information according to the student intention positioning unit and the teacher intention positioning unit, and outputting student intention positioning information and teacher intention positioning information of the second sample user behavior data; the student intention positioning information of the second sample user behavior data represents student intention positioning points and student intention positioning labels of the sample users in the second sample user behavior data, and the teacher intention positioning information of the second sample user behavior data represents teacher intention positioning points and teacher intention positioning labels of the sample users in the second sample user behavior data;
respectively mining the behavior intention characteristics of the first sample user behavior data and the behavior intention characteristics of the second sample user behavior data obtained by the student intention positioning unit according to the behavior intention mining unit, and outputting the behavior intention data of the first sample user behavior data and the behavior intention data of the second sample user behavior data;
and performing model parameter optimization on the user behavior intention mining model based on the student intention positioning information of the first sample user behavior data, the student intention positioning information and the teacher intention positioning information of the second sample user behavior data, and the behavior intention data.
In some possible embodiments of the first aspect, the model parameter tuning of the user behavioral intention mining model based on the student intention positioning information of the first sample user behavior data, the student intention positioning information and the teacher intention positioning information of the second sample user behavior data, and the behavioral intention data includes:
determining first intention learning cost information and second intention learning cost information based on the student intention positioning information of the first sample user behavior data and the prior user behavior intention labeling information of the first sample user behavior data; the first intention learning cost information is used for evaluating the intention positioning tag mining precision of the student intention positioning unit on the first sample user behavior data by taking intention positioning point marking information of the first sample user behavior data as a reference, and the second intention learning cost information is used for evaluating the intention positioning point mining precision of the student intention positioning unit on the first sample user behavior data by taking the intention positioning point marking information of the first sample user behavior data as a reference;
determining target learning cost based on the student intention positioning information and the teacher intention positioning information of the second sample user behavior data; the target learning cost is used for evaluating intention positioning labels and intention positioning point mining accuracy of the student intention positioning unit on the second sample user behavior data by taking teacher intention positioning information of the second sample user behavior data as comparison training basis information;
determining a behavior intention mining cost based on the behavior intention data and the behavior intention labeling information; wherein the behavioral intention mining cost is used for evaluating the behavioral intention mining precision of the behavioral intention mining unit;
and performing model parameter optimization on the user behavior intention mining model based on the first intention learning cost information, the second intention learning cost information, the target learning cost and the behavior intention mining cost.
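As a rough illustration of how the four costs above might be combined into a single optimization objective, the following sketch uses a weighted sum. The weighted-sum form, the function name `global_learning_cost`, and the weight values are assumptions made for illustration; the text only states that the optimization is based on these four costs.

```python
def global_learning_cost(first_cost, second_cost, target_cost, mining_cost,
                         weights=(1.0, 1.0, 1.0, 1.0)):
    """Combine the four learning costs into one scalar objective.

    Assumption: a weighted sum. The source does not specify the combination.
    """
    costs = (first_cost, second_cost, target_cost, mining_cost)
    return sum(w * c for w, c in zip(weights, costs))


# Hypothetical cost values and weights, chosen only to show the call.
total = global_learning_cost(0.5, 0.25, 1.0, 2.0, weights=(1.0, 1.0, 0.5, 2.0))
```

In a real training loop the scalar returned here would be the quantity whose gradient drives the parameter updates.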
In some possible implementations of the first aspect, the model parameter tuning of the user behavioral intention mining model based on the first intention learning cost information, the second intention learning cost information, the target learning cost, and the behavioral intention mining cost includes:
determining global learning cost information of the user behavior intention mining model based on the first intention learning cost information, the second intention learning cost information, the target learning cost and the behavior intention mining cost;
determining a model weight parameter optimization gradient of the behavior intention mining unit based on the global learning cost information, and carrying out model weight information tuning on the behavior intention mining unit based on the model weight parameter optimization gradient of the behavior intention mining unit;
determining, by a gradient descent method, a model weight parameter optimization gradient of the student intention positioning unit from the model weight parameter optimization gradient of the behavior intention mining unit, and performing model weight information tuning on the student intention positioning unit based on the model weight parameter optimization gradient of the student intention positioning unit;
determining parameters of the teacher intention positioning unit from the parameters of the student intention positioning unit using an exponentially weighted average strategy.
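The exponentially weighted average step can be sketched as follows. This is a minimal illustration, assuming that each unit's parameters are held in a dictionary of named scalar values and that a fixed `decay` factor is used; neither detail is given in the text.

```python
def ema_update(teacher_params, student_params, decay=0.99):
    """Return new teacher parameters as an exponentially weighted average.

    Assumption: parameters are named scalars and `decay` is a fixed constant.
    """
    return {
        name: decay * teacher_params[name] + (1.0 - decay) * student_params[name]
        for name in teacher_params
    }


# Hypothetical parameter values; the teacher moves halfway toward the student.
teacher = {"w": 1.0, "b": 0.0}
student = {"w": 0.0, "b": 2.0}
teacher = ema_update(teacher, student, decay=0.5)
```

Under this scheme the teacher receives no gradient of its own; it only tracks a smoothed history of the student, which is why the student alone is tuned by gradient descent.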
In some possible implementations of the first aspect, the determining of a target learning cost based on the student intention positioning information and the teacher intention positioning information of the second sample user behavior data includes:
determining the target learning cost based on a positioning support degree difference between student intention positioning information and teacher intention positioning information of the second sample user behavior data and a weight parameter; wherein the weight parameter is dynamically adjusted according to teacher intention positioning information of the second sample user behavior data.
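One possible reading of this step is sketched below. It assumes that the "positioning support degree" is a per-label confidence vector, that the difference is measured as a squared error, and that the dynamically adjusted weight is the teacher's highest confidence for the sample. All three choices are illustrative assumptions rather than details given in the text.

```python
def target_learning_cost(student_scores, teacher_scores):
    """Average weighted squared difference between student and teacher scores.

    Assumptions: scores are per-label confidences, and each sample's weight is
    the teacher's maximum confidence for that sample.
    """
    total = 0.0
    for s_vec, t_vec in zip(student_scores, teacher_scores):
        weight = max(t_vec)  # assumed dynamic weight derived from the teacher output
        diff = sum((s - t) ** 2 for s, t in zip(s_vec, t_vec))
        total += weight * diff
    return total / len(teacher_scores)


# Two hypothetical unlabeled samples with two candidate intention labels each.
student_scores = [[0.5, 0.5], [0.25, 0.75]]
teacher_scores = [[1.0, 0.0], [0.5, 0.5]]
cost = target_learning_cost(student_scores, teacher_scores)
```

With this weighting, samples on which the teacher is uncertain contribute less to the cost than samples on which it is confident.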
In some possible implementations of the first aspect, the method further comprises:
selecting second sample user behavior data, of which teacher intention positioning information is not less than a preset positioning confidence coefficient, from the second sample user behavior data; and the student intention positioning information and the teacher intention positioning information of the selected second sample user behavior data are used for determining the target learning cost.
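The selection step can be sketched as a simple threshold filter. The sample identifiers, the confidence values, and the threshold of 0.8 are hypothetical; the text only requires that the teacher's positioning confidence be not less than a preset value.

```python
def select_confident(samples, teacher_confidences, threshold=0.8):
    """Keep samples whose teacher confidence is not less than the threshold."""
    return [
        sample
        for sample, confidence in zip(samples, teacher_confidences)
        if confidence >= threshold
    ]


# Hypothetical unlabeled samples and teacher confidences.
samples = ["s1", "s2", "s3"]
confidences = [0.9, 0.8, 0.5]
selected = select_confident(samples, confidences, threshold=0.8)
```

Only the selected samples would then contribute their student and teacher positioning information to the target learning cost.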
In some possible implementations of the first aspect, processing the second sample user behavior data, which does not carry prior user behavior intention labeling information, with the student intention positioning unit and the teacher intention positioning unit respectively, and outputting the student intention positioning information and the teacher intention positioning information of the second sample user behavior data, includes:
respectively processing the second sample user behavior data based on a first data expansion strategy and a second data expansion strategy, and outputting the second sample user behavior data after the first data expansion and the second sample user behavior data after the second data expansion; wherein the data expansion dimension of the first data expansion strategy is larger than the data expansion dimension of the second data expansion strategy;
processing second sample user behavior data after the first data expansion according to the student intention positioning unit, and outputting student intention positioning information of the second sample user behavior data;
and processing the second sample user behavior data after the second data expansion according to the teacher intention positioning unit, and outputting teacher intention positioning information of the second sample user behavior data.
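The two data expansion strategies can be illustrated with a sketch in which the student receives a more heavily perturbed copy of a sample than the teacher. Random masking of behavior events, the `MASK` token, and the mask ratios are assumptions chosen for illustration; the text only requires that the first strategy's expansion dimension be larger than the second's.

```python
import random

MASK = "<mask>"  # hypothetical placeholder for a masked behavior event


def expand(events, mask_ratio, rng):
    """Return a copy of `events` with a fixed fraction of positions masked."""
    n_mask = round(mask_ratio * len(events))
    positions = set(rng.sample(range(len(events)), n_mask))
    return [MASK if i in positions else event for i, event in enumerate(events)]


rng = random.Random(0)
events = ["view", "click", "search", "add_to_cart", "view",
          "purchase", "share", "view", "click", "exit"]

student_input = expand(events, 0.5, rng)  # first (stronger) expansion, for the student
teacher_input = expand(events, 0.1, rng)  # second (weaker) expansion, for the teacher
```

Feeding the lightly perturbed copy to the teacher keeps its positioning output relatively stable, while the heavily perturbed copy forces the student to reproduce that output from a harder input.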
In some possible implementations of the first aspect, the behavior intention features include feature vectors of a plurality of different behavior intention evaluation dimensions;
the mining the behavior intention characteristics of the first sample user behavior data and the behavior intention characteristics of the second sample user behavior data obtained by the student intention positioning unit according to the behavior intention mining unit, and outputting the behavior intention data of the first sample user behavior data and the behavior intention data of the second sample user behavior data includes:
respectively mining the feature vectors of a plurality of different behavior intention assessment dimensions of the first sample user behavior data obtained by the student intention positioning unit according to the behavior intention mining unit, and outputting a plurality of behavior intention data of the first sample user behavior data;
and respectively mining the feature vectors of a plurality of different behavior intention assessment dimensions of the second sample user behavior data obtained by the student intention positioning unit according to the behavior intention mining unit, and outputting a plurality of behavior intention data of the second sample user behavior data.
In a second aspect, an embodiment of the present invention further provides an internet big data-based user behavior analysis system, where the internet big data-based user behavior analysis system includes a cloud computing service system and a plurality of user terminal devices in communication connection with the cloud computing service system;
the cloud computing service system is used for:
performing behavior intention mining on Internet behavior big data of a target user to generate user behavior intention data of the target user, loading the user behavior intention data into a user behavior intention big data log of the target user in real time, and generating target user behavior intention data matched with the current service requirement to be online based on the user behavior intention big data log;
performing behavior intention relationship extraction on the target user behavior intention data to generate a corresponding behavior intention relationship map, wherein the behavior intention relationship map is used for expressing a plurality of behavior intention entities and behavior intention relationship attributes among the behavior intention entities;
extracting target behavioral intention entities of which behavioral intention relationship attributes are associated with at least two behavioral intention entities from the behavioral intention relationship graph to obtain at least one target behavioral intention entity and associated behavioral intention entities of which each target behavioral intention entity is associated through the behavioral intention relationship attributes;
and carrying out internet content pushing on the internet service page corresponding to the target user based on at least one target behavioral intention entity and the associated behavioral intention entity associated with each target behavioral intention entity through the behavioral intention relationship attribute.
With the scheme of any of the above aspects, after target user behavior intention data matching the current service requirement to be online is obtained, behavior intention relationship extraction is performed on it to generate a corresponding behavior intention relationship map. Target behavior intention entities whose behavior intention relationship attributes are associated with at least two behavior intention entities are then extracted from the map, yielding at least one target behavior intention entity together with the associated behavior intention entities to which each target behavior intention entity is linked through those relationship attributes. Because behavior intention entities with multiple associations are extracted on the basis of the current service requirement to be online and the behavior intention relationships, and internet content is pushed according to the extracted entities, the degree of matching between the pushed internet content and the current service requirement to be online can be improved.
Drawings
Fig. 1 is a schematic flow chart of a user behavior analysis method based on internet big data according to an embodiment of the present invention.
Detailed Description
The architecture of the internet big data based user behavior analysis system 10 according to an embodiment of the present invention is described below. The system 10 may include a cloud computing service system 100 and a user terminal device 200 communicatively connected to the cloud computing service system 100. The cloud computing service system 100 and the user terminal device 200 may cooperatively perform the internet big data based user behavior analysis method described in the following method embodiment; for the specific steps performed by the cloud computing service system 100 and the user terminal device 200, refer to the detailed description of the method embodiment.
The internet big data based user behavior analysis method provided in this embodiment may be executed by the cloud computing service system 100, and will be described in detail with reference to fig. 1.
Process100: performing behavior intention mining on the internet behavior big data of the target user to generate user behavior intention data of the target user, loading the user behavior intention data into a user behavior intention big data log of the target user in real time, and generating, based on the user behavior intention big data log, target user behavior intention data matching the current service requirement to be online.
In this embodiment, a specific implementation of behavior intention mining is described in the embodiments that follow. The user behavior intention data may include a user behavior intention anchor point and a user behavior intention anchor tag, where the anchor point may represent the data node corresponding to a user behavior intention and the anchor tag may represent the data category corresponding to that user behavior intention.
On this basis, the user behavior intention data can be loaded into the user behavior intention big data log of the target user in real time. For a service requirement that is to go online, target user behavior intention data matching that requirement is then generated from the user behavior intention big data log, so as to increase the user attention the service receives after it goes online.
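A minimal data-structure sketch of this step follows. The class name `IntentionRecord`, its field names, and the rule that a record matches a service requirement when its anchor tag is in a required set are all hypothetical; the text does not specify how matching is performed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IntentionRecord:
    anchor_point: str  # data node corresponding to the behavior intention
    anchor_tag: str    # data category corresponding to the behavior intention


def match_requirement(log, required_tags):
    """Assumed matching rule: keep records whose tag is required by the service."""
    return [record for record in log if record.anchor_tag in required_tags]


# The per-user log is modeled as a plain list that is appended to as data arrives.
log = []
log.append(IntentionRecord("video_page_17", "short_video_attention"))
log.append(IntentionRecord("item_page_42", "purchase"))
log.append(IntentionRecord("forum_post_3", "comment"))

target_data = match_requirement(log, {"purchase", "short_video_attention"})
```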
Process200: performing behavior intention relationship extraction on the target user behavior intention data to generate a corresponding behavior intention relationship map.
In this embodiment, after the target user behavior intention data is determined, the behavior intention relationships between different behavior intentions in that data may be further analyzed. An example is the relationship between a purchasing behavior intention for a category A and a short video attention behavior intention for a category B: if the short video attention behavior intention for category B directly triggers the purchasing behavior intention for category A, the relationship is a direct triggering relationship. A corresponding behavior intention relationship map is constructed from these relationships; that is, the behavior intention relationship map represents a plurality of behavior intention entities and the behavior intention relationship attributes among them.
Process300: extracting, from the behavior intention relationship map, target behavior intention entities whose behavior intention relationship attributes are associated with at least two behavior intention entities, to obtain at least one target behavior intention entity and the associated behavior intention entities to which each target behavior intention entity is linked through the behavior intention relationship attributes.
In order to improve the mining depth of the key behavior intention entities, target behavior intention entities whose behavior intention relationship attributes are associated with at least two behavior intention entities can be extracted, yielding at least one target behavior intention entity and the associated behavior intention entities to which each is linked through the behavior intention relationship attributes. For example, if the short video attention behavior intention for category B directly triggers both the purchasing behavior intention for category A and the purchasing behavior intention for category C, then the short video attention behavior intention for category B may be determined as the target behavior intention entity, and the purchasing behavior intentions for categories A and C may be determined as its associated behavior intention entities.
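The category A/B/C example can be sketched as follows, assuming the relationship map is stored as a list of (source entity, relationship attribute, associated entity) triples and that an entity qualifies as a target when it is the source of relationships to at least two distinct entities. The triple representation and the entity names are illustrative assumptions.

```python
from collections import defaultdict


def extract_target_entities(relations, min_associated=2):
    """Map each qualifying source entity to its (attribute, associated entity) links."""
    links_by_source = defaultdict(list)
    for source, attribute, associated in relations:
        links_by_source[source].append((attribute, associated))
    return {
        source: links
        for source, links in links_by_source.items()
        if len({associated for _, associated in links}) >= min_associated
    }


# Hypothetical relationship map for one target user.
relations = [
    ("short_video_attention:B", "directly_triggers", "purchase:A"),
    ("short_video_attention:B", "directly_triggers", "purchase:C"),
    ("search:D", "directly_triggers", "purchase:A"),
]
targets = extract_target_entities(relations)
```

Here `short_video_attention:B` is returned with `purchase:A` and `purchase:C` as its associated entities, while `search:D` is dropped because it is linked to only one entity.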
Process400: pushing internet content to the internet service page corresponding to the target user based on the at least one target behavior intention entity and the associated behavior intention entities to which each target behavior intention entity is linked through the behavior intention relationship attributes.
For example, initial internet content related to each target behavior intention entity and its associated behavior intention entities may be retrieved, and page node contact may be performed on that content in combination with the behavior intention relationship attributes between each target behavior intention entity and its associated behavior intention entities, so as to generate target internet content, which is then pushed to the internet service page corresponding to the target user.
Based on the above steps, after target user behavior intention data matching the current service requirement to be online is obtained, behavior intention relationship extraction is performed on it to generate a corresponding behavior intention relationship map. Target behavior intention entities whose behavior intention relationship attributes are associated with at least two behavior intention entities are then extracted from the map, yielding at least one target behavior intention entity together with the associated behavior intention entities to which each is linked through those relationship attributes. Because behavior intention entities with multiple associations are extracted on the basis of the current service requirement to be online and the behavior intention relationships, and internet content is pushed according to the extracted entities, the degree of matching between the pushed internet content and the current service requirement to be online can be improved.
In some exemplary design considerations, the above Process100 may be implemented by the following embodiments.
Process110: acquiring, from the user behavior intention big data log of the target user, basic user behavior intention data matching the current service requirement to be online, where the basic user behavior intention data is feature data of the basic dimension.
Process120: mining basic behavior intention variables of the basic user behavior intention data.
The basic behavior intention variables of the basic user behavior intention data can be mined through a basic behavior intention variable mining model meeting model convergence requirements, wherein the input of the basic behavior intention variable mining model can be the basic user behavior intention data, and can also be feature data which is obtained by preprocessing the basic user behavior intention data and meets the requirements of a model input format of the basic behavior intention variable mining model.
Process130: matching the basic behavior intention variables against a plurality of derived behavior intention variables in the derived intention database to generate matching state information corresponding to each derived behavior intention variable.
Process140: obtaining, from the derived intention data and based on the matching state information corresponding to each derived behavior intention variable, the derived behavior intention data corresponding to the basic user behavior intention data, and outputting the basic user behavior intention data together with the corresponding derived behavior intention data as the target user behavior intention data matching the current service requirement to be online.
In some exemplary design ideas, the derived intent database includes a plurality of derived intent data and derived behavioral intent variables of each derived intent data, and the derived intent data is feature data of derived dimensions.
In some exemplary design ideas, the derived behavior intention variables of each derived intention data may also be mined through a derived behavior intention variable mining model meeting the model convergence requirement; specifically, the model may perform feature extraction on each derived intention data to generate its derived behavior intention variables. Similarly, the input of the derived behavior intention variable mining model may be the derived intention data itself, or feature data obtained by preprocessing the derived intention data.
In some exemplary design ideas, the derived intention data in the derived intention database may be derived features corresponding to each extended intention data in the derived intention database, where the extended intention data is feature data of the basic dimension and one extended intention data corresponds to a plurality of derived features.
For example, the matching state information corresponding to one derived behavior intention variable may be the correlation degree, such as a correlation parameter value, between the basic behavior intention variable and the derived behavior intention variable, and after the matching state information between the basic behavior intention variable and each derived behavior intention variable is obtained, the derived intention data corresponding to the derived behavior intention variable with the highest correlation degree may be used as the derived behavior intention data, or the derived intention data corresponding to the correlation degree ordered within a preset range may be used as the derived behavior intention data based on a descending order of the correlation degree, or the derived intention data corresponding to each derived behavior intention variable with the correlation degree greater than the preset correlation degree may be used as the derived behavior intention data.
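The three selection strategies described above (highest correlation, a preset range of the descending ranking, and a preset correlation threshold) can be sketched as follows. Cosine similarity is used as the correlation parameter purely as an assumption, and the variable vectors and names are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity, used here as an assumed correlation measure."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def match_derived(base_var, derived_vars, top_k=None, min_corr=None):
    """Select derived intention data names by correlation with the basic variable."""
    ranked = sorted(
        ((cosine(base_var, var), name) for name, var in derived_vars.items()),
        reverse=True,
    )
    if min_corr is not None:
        # Strategy 3: every variable whose correlation exceeds the preset value.
        return [name for score, name in ranked if score > min_corr]
    # Strategy 1 (top_k is None) or strategy 2: head of the descending ranking.
    return [name for _, name in ranked[: top_k or 1]]


base_var = [1.0, 0.0]
derived_vars = {"a": [1.0, 0.0], "b": [1.0, 1.0], "c": [0.0, 1.0]}

best = match_derived(base_var, derived_vars)
top_two = match_derived(base_var, derived_vars, top_k=2)
above = match_derived(base_var, derived_vars, min_corr=0.5)
```

The selected names would then be used to look up the corresponding derived intention data, which is output together with the basic user behavior intention data.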
Based on the above steps, the basic user behavior intention data and the corresponding derivative behavior intention data are combined and output as the target user behavior intention data matched with the business requirement that is currently to be brought online, so that the behavior intention data can be effectively expanded and derived.
In some exemplary design ideas, for feature data of a base dimension (such as base user behavior intention data), behavior intention variables of the feature data can be mined through a base behavior intention variable mining model; for feature data of the derived dimension (such as each derived intention data), behavior intention variables of the feature data can be mined through a derived behavior intention variable mining model; and the basic behavior intention variable mining model and the derived behavior intention variable mining model are obtained by performing model weight parameter tuning on the behavior intention variable learning model according to the example model learning data sequence.
In some exemplary design ideas, the behavior intention variable learning model includes a basic behavior intention variable learning model and a derivative behavior intention variable learning model, traversal model parameters of the basic behavior intention variable learning model and the derivative behavior intention variable learning model can be optimized according to the example model learning data sequence, the basic behavior intention variable learning model meeting model convergence requirements is used as the basic behavior intention variable mining model, and the derivative behavior intention variable learning model meeting the model convergence requirements is used as the derivative behavior intention variable mining model. The architecture of the model parameter layer of the basic behavior intention variable learning model and the derivative behavior intention variable learning model is not particularly limited, and may be configured based on application requirements.
For example, in some exemplary design concepts, the behavior intention variable learning model includes the basic behavior intention variable learning model and the derived behavior intention variable learning model, and the model updating step may include:
obtaining an example model learning data sequence, wherein the example model learning data sequence comprises a basic example model learning data set, and each basic example model learning data in the basic example model learning data set comprises first basic behavior intention training data of a basic dimension and first derivative behavior intention training data of a derivative dimension corresponding to the first basic behavior intention training data;
performing traversal optimization of model parameters, based on the example model learning data sequence, on the behavior intention variable learning model whose model weight parameters have been initialized, until the model learning cost value converges; taking the basic behavior intention variable learning model at convergence as the basic behavior intention variable mining model, and taking the derivative behavior intention variable learning model at convergence as the derivative behavior intention variable mining model; the training process may include the following steps:
loading each piece of first basic behavior intention training data into a basic behavior intention variable learning model, generating behavior intention variables of each piece of first basic behavior intention training data, loading each piece of first derived behavior intention training data into a derived behavior intention variable learning model, and generating behavior intention variables of each piece of first derived behavior intention training data;
determining a first model learning cost value based on the correlation degree of the behavior intention variables of the first basic behavior intention training data and the behavior intention variables of the first derivative behavior intention training data in each basic example model learning data set and the correlation degree of the behavior intention variables of the first basic behavior intention training data and the behavior intention variables of the first derivative behavior intention training data in each basic negative direction model learning data; wherein the basic negative-going model learning data comprises first basic behavior intent training data of one basic example model learning data and first derivative behavior intent training data of another basic example model learning data;
and if the first model learning cost value does not meet the first training convergence requirement, optimizing the model weight information of the basic behavior intention variable learning model and the derivative behavior intention variable learning model, wherein the model learning cost value convergence comprises that the first model learning cost value meets the first training convergence requirement.
When training the behavior intention variable learning model, the first basic behavior intention training data and the first derivative behavior intention training data in each basic example model learning data are feature data of two dimensions that match each other, so the basic example model learning data may also be referred to as basic positive-direction model learning data. Basic negative-direction model learning data consists of first basic behavior intention training data and first derivative behavior intention training data taken from different basic example model learning data, that is, feature data of two dimensions that do not match. Any first basic behavior intention training data may form negative-direction model learning data with each of a plurality of other first derivative behavior intention training data (first derivative behavior intention training data other than the one corresponding to that first basic behavior intention training data). In the training process, the model learning cost value is determined based on the correlation between the behavior intention variables in the basic positive-direction model learning data and the correlation between the behavior intention variables in the basic negative-direction model learning data.
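As an illustration of how negative-direction model learning data can be assembled from the positive-direction data, the following sketch pairs each basic item with the derived items of every other positive pair. The function name `build_pairs` and the string placeholders are hypothetical; the text does not prescribe how many negatives are used per basic item, so all of them are generated here.

```python
def build_pairs(positive_pairs):
    """Split training pairs into positive-direction and negative-direction data.

    positive_pairs: list of (base_item, derived_item) tuples that match each other.
    Returns (positives, negatives); each negative pairs a base item with the
    derived item of a different positive pair.
    """
    negatives = []
    for i, (base_item, _) in enumerate(positive_pairs):
        for j, (_, other_derived) in enumerate(positive_pairs):
            if i != j:  # skip the derived item that actually matches this base item
                negatives.append((base_item, other_derived))
    return list(positive_pairs), negatives


positives, negatives = build_pairs([("b1", "d1"), ("b2", "d2"), ("b3", "d3")])
print(len(positives), len(negatives))
```

With N positive pairs this yields N * (N - 1) negative pairs; in practice a subset could be sampled instead.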
The model learning cost function (loss function) selected in the training process is not limited. The purpose of the model weight parameter tuning is to make the associated parameter value between the behavior intention variables of first basic behavior intention training data and first derivative behavior intention training data that match each other as large as possible, and the associated parameter value between the behavior intention variables of first basic behavior intention training data and first derivative behavior intention training data that do not match each other as small as possible.
For basic positive-direction model learning data, a characteristic distance between a behavior intention variable of first basic behavior intention training data learned through a basic behavior intention variable learning model and a derivative behavior intention variable learned through a derivative behavior intention variable learning model can be calculated, and a corresponding learning cost value is generated. The calculation method for calculating the correlation or the characteristic distance is also different for different loss functions.
In some exemplary design considerations, the determining the first model learning cost value based on the correlation between the behavior intention variable of the first basic behavior intention training data in each basic example model learning data set and the behavior intention variable of the first derived behavior intention training data, and the correlation between the behavior intention variable of the first basic behavior intention training data and the behavior intention variable of the first derived behavior intention training data in each basic negative-direction model learning data may include:
determining characteristic distances between behavior intention variables of first basic behavior intention training data and behavior intention variables of first derived behavior intention training data of the basic example model learning data to generate a first model learning cost value;
for each piece of first basic behavior intention training data, determining a basic associated parameter value corresponding to the first basic behavior intention training data and a derived associated parameter value corresponding to the first basic behavior intention training data, wherein the basic associated parameter value is an associated parameter value between a behavior intention variable of the first basic behavior intention training data and a behavior intention variable of first derived behavior intention training data corresponding to the first basic behavior intention training data, and the derived associated parameter value is an associated parameter value between a behavior intention variable of the first basic behavior intention training data and a behavior intention variable of the first derived behavior intention training data in basic negative-direction model learning data in which the first basic behavior intention training data is located;
acquiring training marking information corresponding to each first basic behavior intention training data, wherein the training marking information comprises associated parameter value marking information corresponding to basic associated parameter values and associated parameter value marking information corresponding to derived associated parameter values;
determining a second model learning cost value based on model output associated parameter values and training annotation information corresponding to the first basic behavior intention training data, wherein the model output associated parameter values comprise basic associated parameter values and derivative associated parameter values, and the second model learning cost value represents a loss function value between the model output associated parameter values and the training annotation information corresponding to the first basic behavior intention training data;
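determining the overall first model learning cost value (the value checked against the first training convergence requirement) based on the first model learning cost value generated from the characteristic distances and the second model learning cost value.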
For example, the first model learning cost value generated from the characteristic distances may be the sum of the mean square errors between the behavior intention variable of the first basic behavior intention training data and the behavior intention variable of the first derivative behavior intention training data in each basic positive-direction model learning data. Alternatively, a correlation parameter value between the two behavior intention variables may be calculated for each basic positive-direction model learning data, 1 minus the correlation parameter value may be taken as the characteristic distance, and the sum of the characteristic distances corresponding to all basic positive-direction model learning data may be used as the first model learning cost value. This cost value drives the behavior intention variables that the model learns for the two dimensions of feature data in basic positive-direction model learning data to be as close as possible.
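Both variants of this feature-distance cost can be sketched in a few lines. The sketch assumes cosine similarity as the correlation parameter value, which the text does not mandate, and the function names are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors (assumed correlation value)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def mse_cost(positive_pairs):
    """Sum over positive pairs of the mean squared error between the two variables."""
    return sum(
        sum((x - y) ** 2 for x, y in zip(base_var, derived_var)) / len(base_var)
        for base_var, derived_var in positive_pairs
    )


def feature_distance_cost(positive_pairs):
    """Sum over positive pairs of (1 - correlation), used as the feature distance."""
    return sum(1.0 - cosine_similarity(b, d) for b, d in positive_pairs)


# Each pair holds (basic behavior intention variable, derived behavior intention variable).
pairs = [([1.0, 0.0], [1.0, 0.0]), ([1.0, 0.0], [0.0, 1.0])]
print(mse_cost(pairs), feature_distance_cost(pairs))
```

Either cost is zero when every matched pair produces identical variables and grows as matched pairs drift apart.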
The second model learning cost value may also be referred to as a matching loss function value. It constrains the associated parameter value between the behavior intention variables of the two data in basic positive-direction model learning data learned by the model to be higher than the associated parameter value between the behavior intention variables of the two data in basic negative-direction model learning data. When this matching loss function value is calculated, the training annotation information is the actual learning information during training, that is, the result that the model is expected to learn. For example, for each first basic behavior intention training data, the associated parameter value annotation information corresponding to the basic associated parameter value refers to the expected associated parameter value between the first basic behavior intention training data and its corresponding first derived behavior intention training data, which may be 1 or another relatively high value; the associated parameter value annotation information corresponding to the derived associated parameter value refers to the expected associated parameter value between the first basic behavior intention training data and first derived behavior intention training data that does not match it, which may be 0 or another relatively small value. The training annotation information may be pre-configured.
Based on the behavior intention variables of the first basic behavior intention training data and of the first derivative behavior intention training data output by the model, the basic associated parameter value and the derivative associated parameter values corresponding to each first basic behavior intention training data can be calculated. These associated parameter values form an associated parameter value sequence, and the loss function value between this sequence and the training annotation information is calculated to generate the second model learning cost value. For example, the associated parameter value sequence can be treated as the confidence distribution predicted by the model and the training annotation information as the real confidence distribution (namely, the labels), and the cross-entropy loss between the two is calculated to obtain the second model learning cost value.
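A minimal sketch of this cross-entropy matching loss for one first basic behavior intention training data is given below. It assumes that the associated parameter value sequence is normalized with a softmax before being compared with the labels; the text only says the sequence is treated as a confidence distribution, so the normalization step and the names `softmax` and `matching_cost` are assumptions.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of scores."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def matching_cost(correlations, labels):
    """Cross-entropy between predicted and annotated confidence distributions.

    correlations[0] is the basic associated parameter value (matching pair);
    the remaining entries are derived associated parameter values (negatives).
    labels holds the annotation for each entry, e.g. 1 for the match, 0 otherwise.
    """
    predicted = softmax(correlations)
    return -sum(t * math.log(p) for t, p in zip(labels, predicted) if t > 0)


labels = [1, 0, 0]
good = matching_cost([0.9, 0.1, 0.0], labels)  # matched pair scores highest
bad = matching_cost([0.1, 0.9, 0.0], labels)   # a negative pair scores highest
print(good < bad)
```

The loss falls as the matched pair's associated parameter value rises above those of the negative pairs, which is the constraint the matching loss is meant to express.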
In some exemplary design ideas, the loading each piece of first basic behavior intention training data into the basic behavior intention variable learning model to generate the behavior intention variable of each piece of first basic behavior intention training data may include:
for each first basic behavior intention training data, executing the following steps on the first basic behavior intention training data through a basic behavior intention variable learning model to generate behavior intention variables of the first basic behavior intention training data:
splitting the first basic behavior intention training data into at least two behavior intention training member data to generate a behavior intention training member data series corresponding to the first basic behavior intention training data; extracting the behavior intention variable of each behavior intention training member data in the behavior intention training member data series based on an intention variable knowledge base, wherein the intention variable knowledge base comprises a plurality of intention variable knowledge points, the number of characteristic values included in the behavior intention variable of each behavior intention training member data is equal to the number of intention variable knowledge points in the intention variable knowledge base, and each characteristic value represents the confidence that the behavior intention training member data corresponds to the intention variable knowledge point at the position of that characteristic value; generating the behavior intention variable of the first basic behavior intention training data based on the behavior intention variables of the behavior intention training member data;
the above method embodiment may further include:
for each first derived behavior intention training data, determining characteristic behavior intention variables of the first derived behavior intention training data corresponding to an intention variable knowledge base according to the intention variable knowledge base, wherein the behavior intention variables represent confidence degrees of the first derived behavior intention training data corresponding to all intention variable knowledge points in the intention variable knowledge base;
accordingly, the determining the first model learning cost value may include:
and determining a first model learning cost value according to the correlation degree between the behavior intention variable of each behavior intention training member data of the first basic behavior intention training data in each basic example model learning data set and the characteristic behavior intention variable of the first derivative behavior intention training data corresponding to the intention variable knowledge base, the correlation degree between the behavior intention variable of the first basic behavior intention training data in each basic example model learning data set and the behavior intention variable of the first derivative behavior intention training data, and the correlation degree between the behavior intention variable of the first basic behavior intention training data in each basic negative direction model learning data and the behavior intention variable of the first derivative behavior intention training data.
That is, the first model learning cost value additionally includes a loss (which may be referred to as a third model learning cost value) corresponding to the correlation between the behavior intention variables of the behavior intention training member data of the first basic behavior intention training data in each basic example model learning data and the characteristic behavior intention variable of the first derivative behavior intention training data corresponding to the intention variable knowledge base. Under the constraint of this loss, the behavior intention variables of the behavior intention training member data in the first basic behavior intention training data learned by the basic behavior intention variable learning model are driven to maximize the confidence of the first derivative behavior intention training data corresponding to that first basic behavior intention training data.
In some exemplary design ideas, the intention variable knowledge points in the intention variable knowledge base are behavior intention variable data units that can be used to represent both the behavior intention training member data of the first basic behavior intention training data and the first derived behavior intention training data, and the form of the intention variable knowledge points can be configured as required. For the first derived behavior intention training data, the characteristic behavior intention variable corresponding to the intention variable knowledge base represents the confidences that the first derived behavior intention training data corresponds to the respective intention variable knowledge points in the intention variable knowledge base. When calculating the third model learning cost value for each basic positive-direction model learning data, the confidence of the characteristic behavior intention variable of the first derived behavior intention training data can be obtained from the behavior intention variable sequence (namely, the feature vectors formed by the characteristic values) of the behavior intention training member data of the first basic behavior intention training data. Maximizing this confidence under the constraint of the third model learning cost value allows the behavior intention variable of the first basic behavior intention training data learned by the basic behavior intention variable learning model to contain the semantic information of the first derived behavior intention training data.
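One possible reading of the third model learning cost value is sketched below. The text does not specify how the member-level variables are aggregated or how the confidence is scored, so this sketch assumes that each member variable is a confidence vector over the knowledge points, that the member vectors are averaged, and that the cost is the negative log-likelihood of the derived data's knowledge-point confidences under that average. All names are hypothetical.

```python
import math


def aggregate_members(member_vars):
    """Average the per-member confidence vectors over the knowledge points (assumed)."""
    size = len(member_vars[0])
    return [sum(m[i] for m in member_vars) / len(member_vars) for i in range(size)]


def third_cost(member_vars, derived_conf, eps=1e-12):
    """Negative log-likelihood of the derived data's knowledge-point confidences.

    member_vars: one confidence vector per behavior intention training member data.
    derived_conf: confidences of the derived training data over the same knowledge points.
    Minimizing this value maximizes the confidence assigned to the derived data.
    """
    aggregated = aggregate_members(member_vars)
    return -sum(t * math.log(p + eps) for t, p in zip(derived_conf, aggregated))


members = [[0.8, 0.2], [0.6, 0.4]]         # two member data, two knowledge points
aligned = third_cost(members, [1.0, 0.0])  # derived data maps to knowledge point 0
misaligned = third_cost(members, [0.0, 1.0])
print(aligned < misaligned)
```

Under these assumptions the cost is small when the member data and the matching derived data concentrate on the same knowledge points, which is how semantic information from the derived dimension would reach the basic-dimension variables.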
In some exemplary design ideas, the derived intention data may be derived features of derived dimensions corresponding to extended intention data of preset derived dimensions; the behavior intention variable learning model initialized and configured by the model weight parameters also comprises a label decision model; at this time, the example model learning data sequence further includes a derived example model learning data set, each derived example model learning data in the derived example model learning data set includes second basic behavior intention training data of a basic dimension, second derived behavior intention training data of a derived dimension corresponding to the second basic behavior intention training data, and a calibration derivative label of the second basic behavior intention training data, wherein the second basic behavior intention training data in the derived example model learning data sequence includes second basic behavior intention training data of a preset derived dimension and second basic behavior intention training data of a non-preset derived dimension; after obtaining the behavior intention variable learning model with the first model learning cost value satisfying the first training convergence requirement, the method may further include:
continuing to perform traversal optimization of the model weights of the behavior intention variable learning model according to the derived example model learning data until the second model learning cost value meets a second training convergence requirement, wherein the convergence of the model learning cost value further comprises the second model learning cost value meeting the second training convergence requirement; the above embodiment may further include:
loading each second basic behavior intention training data into a basic behavior intention variable learning model, generating behavior intention variables of each second basic behavior intention training data, loading each second derived behavior intention training data into a derived behavior intention variable learning model, generating behavior intention variables of each second derived behavior intention training data, loading the behavior intention variables of each second basic behavior intention training data into a label decision model, and generating decision derived labels corresponding to each second basic behavior intention training data;
determining a second model learning cost value according to the correlation degree of the behavior intention variables of the second basic behavior intention training data and the behavior intention variables of the second derived behavior intention training data in each derived example model learning data, the correlation degree of the behavior intention variables of the second basic behavior intention training data and the behavior intention variables of the second derived behavior intention training data in each derived negative direction model learning data, and the correlation degree between the calibration derivative labels and the decision derivative labels of each second basic behavior intention training data;
and if the second model learning cost value does not meet the second training convergence requirement, optimizing the model weight information of the behavior intention variable learning model.
The behavior intention variable learning model comprises a basic behavior intention variable learning model and a derivative behavior intention variable learning model, and can also comprise a label decision model which is cascaded with the basic behavior intention variable learning model and used for judging the type of feature data loaded to the basic behavior intention variable learning model according to features output by the basic behavior intention variable learning model. In some embodiments, the process of performing model parameter tuning on the behavior intention variable learning model according to the basic example model learning data in the foregoing embodiments is a preliminary tuning process, and a basic behavior intention variable learning model and a derived behavior intention variable learning model that satisfy the most basic application conditions may be output.
Further, one part of the learning cost value (matching loss) may be calculated according to the correlation between the behavior intention variable of the second basic behavior intention training data and the behavior intention variable of the second derived behavior intention training data in each derived example model learning data, and the correlation between the behavior intention variable of the second basic behavior intention training data and the behavior intention variable of the second derived behavior intention training data in each derived negative-direction model learning data. Another part of the learning cost value (classification loss) may be calculated according to the calibration derivative label and the decision derivative label of each second basic behavior intention training data, and further training of the model may be constrained according to the learning cost values of the two parts. The matching part may be calculated using the method described above for the matching loss (i.e., the second model learning cost value), or using the method described above that combines the first model learning cost value and the second model learning cost value.
For the classification loss, this part of the model learning cost value represents how close the type of the second basic behavior intention training data predicted by the label decision model is to the actual prior label of that data, that is, the calibrated derivative label. For example, the calibrated derivative label of the second basic behavior intention training data may be 1 or 0, where 1 represents that the data is feature data of the preset derivative dimension and 0 represents that it is not. The output of the label decision model may include a basic confidence that the second basic behavior intention training data is of the preset derivative dimension and a derived confidence that it is not. The model learning cost value corresponding to the label decision model may be calculated from the calibrated derivative label of each second basic behavior intention training data and the two confidences output by the label decision model, for example using a binary cross-entropy error, where a smaller error value indicates that the predicted type is closer to the actual prior label.
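The binary cross-entropy calculation mentioned above can be sketched as follows. The sketch assumes that the two confidences output by the label decision model sum to 1, so that only the basic confidence needs to be passed in; the function name and the clipping constant `eps` are hypothetical.

```python
import math


def binary_cross_entropy(labels, base_confidences, eps=1e-12):
    """Mean binary cross-entropy between calibrated labels and predicted confidences.

    labels: calibrated derivative labels (1 = preset derivative dimension, 0 = not).
    base_confidences: predicted confidence that each item is of the preset
    derivative dimension; the derived confidence is assumed to be 1 - p.
    """
    total = 0.0
    for y, p in zip(labels, base_confidences):
        p = min(max(p, eps), 1.0 - eps)  # clip to keep the logarithms finite
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(labels)


accurate = binary_cross_entropy([1, 0], [0.9, 0.1])
inaccurate = binary_cross_entropy([1, 0], [0.1, 0.9])
print(accurate < inaccurate)
```

A smaller value means the predicted type is closer to the calibrated derivative label, matching the interpretation given in the text.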
After the behavior intention variable learning model whose model learning cost value has converged is obtained, the label decision model that meets the model convergence requirement can be used in application to identify the behavior intention variable data type of the basic user behavior intention data. For example, the basic user behavior intention data can be loaded into the basic behavior intention variable learning model that meets the model convergence requirement (namely, the basic behavior intention variable mining model) to generate the basic behavior intention variable of the basic user behavior intention data. The basic behavior intention variable is then loaded into the trained label decision model to generate a basic confidence that the basic user behavior intention data belongs to the preset derivative dimension data and a derivative confidence that it does not. Whether the basic user behavior intention data is feature data of the preset derivative dimension can be determined according to the basic confidence and the derivative confidence.
In some exemplary design ideas, for Process100, in the course of performing behavior intention mining on the internet behavior big data of the target user to generate the user behavior intention data of the target user, the internet behavior big data of the target user may, for example, be input into a user behavior intention mining model that meets model deployment requirements, so as to generate the user behavior intention data of the target user.
The user behavior intention mining model comprises a student intention positioning unit, a teacher intention positioning unit and a behavior intention mining unit; the training step of the user behavior intention mining model can be realized by the following embodiments.
The Process101 processes the first sample user behavior data carrying the prior user behavior intention marking information according to the student intention positioning unit, and outputs student intention positioning information of the first sample user behavior data, wherein the student intention positioning information of the first sample user behavior data represents a student intention positioning point and a student intention positioning label of the sample user in the first sample user behavior data.
The Process102 is used for respectively processing second sample user behavior data which do not carry prior user behavior intention marking information according to the student intention positioning unit and the teacher intention positioning unit and outputting student intention positioning information and teacher intention positioning information of the second sample user behavior data; the student intention positioning information of the second sample user behavior data represents the student intention positioning point and the student intention positioning label of the sample user in the second sample user behavior data, and the teacher intention positioning information of the second sample user behavior data represents the teacher intention positioning point and the teacher intention positioning label of the sample user in the second sample user behavior data.
For example, before processing the second sample user behavior data, data expansion may be performed on the second sample user behavior data, for example, the second sample user behavior data may be processed based on the first data expansion policy and the second data expansion policy, and the second sample user behavior data after the first data expansion and the second sample user behavior data after the second data expansion are output; the data expansion dimensionality of the first data expansion strategy is larger than that of the second data expansion strategy; processing the second sample user behavior data after the first data expansion according to the student intention positioning unit, and outputting student intention positioning information of the second sample user behavior data; and processing the second sample user behavior data after the second data expansion according to the teacher intention positioning unit, and outputting teacher intention positioning information of the second sample user behavior data.
In some embodiments, the first data expansion strategy is a strong data expansion strategy, and the second data expansion strategy is a weak data expansion strategy. The sample user behavior data after the weak data expansion is processed according to the teacher intention positioning unit, so that the analysis precision of the teacher intention positioning unit can be effectively improved, and the sample user behavior data after the strong data expansion is processed according to the student intention positioning unit, so that the robustness of the student intention positioning unit can be effectively improved, and the robustness of the teacher intention positioning unit can be improved.
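The strong and weak data expansion strategies can be illustrated with a simple sketch. The text does not say what the expansion operations are, so this example assumes that expansion means randomly masking events in a behavior sequence, with the strong strategy masking at a higher rate than the weak one. `MASK`, `expand`, the event names, and the two rates are all hypothetical.

```python
import random

MASK = "<mask>"  # hypothetical placeholder for a masked behavior event


def expand(events, mask_rate, seed):
    """Return a copy of the event sequence with each event masked with probability mask_rate."""
    rng = random.Random(seed)
    return [MASK if rng.random() < mask_rate else event for event in events]


events = [f"event_{i}" for i in range(20)]

# Strong expansion feeds the student intention positioning unit,
# weak expansion feeds the teacher intention positioning unit.
student_input = expand(events, mask_rate=0.5, seed=7)
teacher_input = expand(events, mask_rate=0.1, seed=7)
print(student_input.count(MASK), teacher_input.count(MASK))
```

The teacher sees a lightly perturbed view and can therefore produce more reliable intention positioning information, while the student must reproduce it from a heavily perturbed view of the same unlabeled sample.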
Process103 mines, with the behavior intention mining unit, the behavior intention characteristics of the first sample user behavior data and the behavior intention characteristics of the second sample user behavior data obtained by the student intention positioning unit, and outputs the behavior intention data of the first sample user behavior data and the behavior intention data of the second sample user behavior data.
The behavior intention mining unit may be used to perform behavior intention mining. For example, the behavior intention mining unit may include a gradient reversal layer and a behavior intention miner. The behavior intention miner is configured to judge the behavior intentions of the first sample user behavior data and the second sample user behavior data based on the behavior intention characteristics of the first sample user behavior data and the behavior intention characteristics of the second sample user behavior data obtained by the student intention positioning unit, and to output the behavior intention data of the first sample user behavior data and the behavior intention data of the second sample user behavior data. The gradient reversal layer is used for propagating the gradient optimization direction of the behavior intention mining unit, reversed, back to the intention positioning unit when the behavior intention mining unit and the intention positioning unit (the student intention positioning unit and the teacher intention positioning unit) are jointly trained. The behavior intention data characterizes the behavior intention to which the sample user behavior data belongs.
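The role of the gradient reversal layer can be sketched as follows. This is a framework-free illustration, not the implementation of the embodiment: the class name `GradientReversal`, the `scale` factor, and the list-based representation of features and gradients are assumptions. The forward pass leaves the features unchanged, and the backward pass negates the gradient before it reaches the intention positioning unit.

```python
class GradientReversal:
    """Identity in the forward pass, negated (and scaled) gradient in the backward pass."""

    def __init__(self, scale=1.0):
        # scale is a hypothetical knob controlling the strength of the reversal
        self.scale = scale

    def forward(self, features):
        # Features pass through unchanged to the behavior intention miner.
        return list(features)

    def backward(self, grad_output):
        # The gradient coming from the behavior intention miner is reversed
        # before it is propagated to the intention positioning unit.
        return [-self.scale * g for g in grad_output]


grl = GradientReversal(scale=0.5)
features = grl.forward([1.0, 2.0])
grad_to_positioning_unit = grl.backward([2.0, -4.0])
```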
The above behavior intention characteristics include feature vectors of a plurality of different behavior intention evaluation dimensions. For example, the behavior intention data may also be acquired as follows: the feature vectors of the plurality of different behavior intention evaluation dimensions of the first sample user behavior data obtained by the student intention positioning unit are respectively mined by the behavior intention mining unit, and a plurality of behavior intention data of the first sample user behavior data are output; and the feature vectors of the plurality of different behavior intention evaluation dimensions of the second sample user behavior data obtained by the student intention positioning unit are respectively mined by the behavior intention mining unit, and a plurality of behavior intention data of the second sample user behavior data are output.
Process104 performs model parameter optimization on the user behavior intention mining model according to the student intention positioning information of the first sample user behavior data, the student intention positioning information and the teacher intention positioning information of the second sample user behavior data, and the behavior intention data.
Process104 may include the following sub-steps.
Process104a determines first intention learning cost information and second intention learning cost information according to the student intention positioning information of the first sample user behavior data and the prior user behavior intention labeling information of the first sample user behavior data. The first intention learning cost information is used for evaluating the mining precision of the student intention positioning unit for the intention positioning points of the first sample user behavior data, taking the intention positioning point labeling information of the first sample user behavior data as a reference, and the second intention learning cost information is used for evaluating the mining precision of the student intention positioning unit for the intention positioning labels of the first sample user behavior data, taking the intention label labeling information of the first sample user behavior data as a reference.
The prior user behavior intention labeling information of the first sample user behavior data represents an actual intention positioning point (corresponding intention positioning point labeling information) and an actual intention positioning label (corresponding intention label labeling information) of a sample user. The first intention learning cost information can be determined based on the difference between the student intention positioning point and the actual intention positioning point of the first sample user behavior data, and the second intention learning cost information can be determined based on the difference between the student intention positioning label and the actual intention positioning label of the first sample user behavior data.
Process104b determines a target learning cost according to the student intention positioning information and the teacher intention positioning information of the second sample user behavior data. The target learning cost is used for evaluating the mining precision of the student intention positioning unit for the intention positioning labels and the intention positioning points of the second sample user behavior data, taking the teacher intention positioning information of the second sample user behavior data as the reference information for comparison training.
For example, second sample user behavior data whose teacher intention positioning information has a positioning confidence not less than a preset positioning confidence may be selected from the plurality of second sample user behavior data, and the student intention positioning information and the teacher intention positioning information of the selected second sample user behavior data are used for determining the target learning cost.
For example, the teacher intention positioning label may represent an intention positioning label of a sample user. When the positioning confidence corresponding to the teacher intention positioning label is not less than the preset positioning confidence, it may be determined that the sample user has a high probability of belonging to that intention positioning label. Therefore, the second sample user behavior data whose teacher intention positioning label has a positioning confidence not less than the preset positioning confidence can be retained, and the second sample user behavior data whose teacher intention positioning label has a positioning confidence less than the preset positioning confidence can be removed. The selected second sample user behavior data is then output, and the target learning cost is determined based on the student intention positioning information and the teacher intention positioning information of the selected second sample user behavior data.
For example, the preset positioning confidence may be set flexibly, for example to 95%, 96%, or 97%. With such a high preset positioning confidence, the second sample user behavior data is screened based on the positioning confidence corresponding to the teacher intention positioning label, so that inaccurate intention positioning information can be filtered out and the reliability of model decisions is improved.
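The screening step above can be sketched as a simple filter. The dictionary layout of a sample and the field name `teacher_confidence` are hypothetical; only the rule of keeping samples whose teacher positioning confidence is not less than the preset value comes from the text.

```python
def select_confident(samples, min_confidence=0.95):
    """Keep only samples whose teacher positioning confidence reaches the preset value."""
    return [s for s in samples if s["teacher_confidence"] >= min_confidence]


# Hypothetical unlabeled samples with the teacher unit's label and confidence.
second_samples = [
    {"id": "u1", "teacher_label": "purchase", "teacher_confidence": 0.98},
    {"id": "u2", "teacher_label": "browse", "teacher_confidence": 0.80},
    {"id": "u3", "teacher_label": "compare", "teacher_confidence": 0.95},
]

selected = select_confident(second_samples, min_confidence=0.95)
selected_ids = [s["id"] for s in selected]
```

Only the selected samples would then contribute to the target learning cost.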
The target learning cost is calculated based on the student intention positioning information and the teacher intention positioning information of the second sample user behavior data. For example, the target learning cost may be determined according to a weight parameter and the positioning support degree difference between the student intention positioning information and the teacher intention positioning information of the second sample user behavior data, where the weight parameter is dynamically adjusted according to the teacher intention positioning information of the second sample user behavior data.
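One possible form of such a weighted cost is sketched below. The text does not give the formula, so two assumptions are made here and should be read as illustrative only: the difference is measured as a squared difference of positioning support degrees, and the dynamically adjusted weight of each sample is taken to be the teacher's own support degree. The function name `target_learning_cost` is hypothetical.

```python
def target_learning_cost(student_scores, teacher_scores):
    """Weighted mean squared difference between student and teacher support degrees.

    Assumption: the per-sample weight equals the teacher's support degree, so
    samples the teacher is more certain about contribute more to the cost.
    """
    if not teacher_scores or len(student_scores) != len(teacher_scores):
        raise ValueError("score lists must be non-empty and of equal length")
    total = 0.0
    for s, t in zip(student_scores, teacher_scores):
        weight = t  # dynamically derived from the teacher output (assumed rule)
        total += weight * (s - t) ** 2
    return total / len(teacher_scores)


cost = target_learning_cost([0.5], [1.0])
```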
Process104c determines a behavior intention mining cost according to the behavior intention data and the behavior intention labeling information. The behavior intention mining cost is used for evaluating the behavior intention mining precision of the behavior intention mining unit.
For example, the behavior intention mining cost can be calculated from the behavior intention data and the behavior intention labeling information using a cross-entropy loss function.
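A minimal sketch of a cross-entropy based mining cost follows. It assumes that the behavior intention data takes the form of a probability distribution over candidate behavior intentions and that the labeling information is the index of the labeled intention; the function names and the `eps` clamp are illustrative choices.

```python
import math

def cross_entropy(probabilities, true_index, eps=1e-12):
    """Cross-entropy of one prediction against the labeled intention index."""
    # Clamp to eps to avoid log(0) when the labeled class gets zero probability.
    return -math.log(max(probabilities[true_index], eps))

def mining_cost(batch_probabilities, batch_labels):
    """Average cross-entropy over a batch of predictions."""
    losses = [cross_entropy(p, y) for p, y in zip(batch_probabilities, batch_labels)]
    return sum(losses) / len(losses)


batch_cost = mining_cost([[0.5, 0.5], [0.5, 0.5]], [0, 1])
```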
Process104d performs model parameter optimization on the user behavior intention mining model according to the first intention learning cost information, the second intention learning cost information, the target learning cost and the behavior intention mining cost.
In the basic training flow, the student intention positioning unit is trained with the first sample user behavior data carrying the prior user behavior intention labeling information, while the behavior intention mining unit and the student intention positioning unit are jointly optimized with the first sample user behavior data carrying the prior user behavior intention labeling information and the second sample user behavior data not carrying the prior user behavior intention labeling information. For example, global learning cost information corresponding to the basic training flow is determined based on the first intention learning cost information, the second intention learning cost information and the behavior intention mining cost.
At this stage, supervised learning can be performed based on the first sample user behavior data carrying the prior user behavior intention labeling information, and further training is performed based on the first sample user behavior data and the second sample user behavior data. For example, in the basic training flow, a model weight parameter optimization gradient of the behavior intention mining unit may be determined according to the global learning cost information corresponding to the basic training flow, and the model weights of the behavior intention mining unit may be tuned based on that gradient. A model weight parameter optimization gradient of the student intention positioning unit is then determined based on the gradient reversal layer, and the model weights of the student intention positioning unit are tuned based on that gradient. After the model weight tuning iterations of the basic training flow are completed, the student intention positioning unit is trained with the first sample user behavior data carrying the prior user behavior intention labeling information, the behavior intention mining unit and the student intention positioning unit are jointly optimized with the first sample user behavior data carrying the prior user behavior intention labeling information and the second sample user behavior data not carrying the prior user behavior intention labeling information, and the model parameters of the student intention positioning unit and the teacher intention positioning unit are tuned under a consistency constraint based on the second sample user behavior data not carrying the prior user behavior intention labeling information.
For example, global learning cost information of the user behavior intention mining model may be determined according to the first intention learning cost information, the second intention learning cost information, the target learning cost, and the behavior intention mining cost.
At this stage, supervised learning can be performed based on the first sample user behavior data carrying the prior user behavior intention labeling information, semi-supervised learning is performed based on the first sample user behavior data carrying the prior user behavior intention labeling information and the second sample user behavior data not carrying the prior user behavior intention labeling information, and further training is performed based on the first sample user behavior data and the second sample user behavior data. After the model weight tuning iterations are completed, the advanced training flow of the user behavior intention mining model is finished, and the user behavior intention mining model meeting the model deployment requirement is obtained.
For example, in the training process of the advanced training flow, a model weight parameter optimization gradient of the behavior intention mining unit may be determined according to the global learning cost information, and model weight information tuning may be performed on the behavior intention mining unit based on the model weight parameter optimization gradient of the behavior intention mining unit.
Then, according to the model weight parameter optimization gradient of the behavior intention mining unit, the model weight parameter optimization gradient of the student intention positioning unit is determined based on the gradient reversal layer (GRL), and the model weights of the student intention positioning unit are tuned based on the model weight parameter optimization gradient of the student intention positioning unit. Finally, the parameters of the teacher intention positioning unit are determined from the parameters of the student intention positioning unit based on an exponentially weighted average strategy.
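The exponentially weighted average update of the teacher parameters can be sketched as follows. The dictionary representation of parameters, the function name `ema_update`, and the decay value are assumptions for illustration; the text only states that the teacher parameters are derived from the student parameters by an exponentially weighted average.

```python
def ema_update(teacher_params, student_params, decay=0.99):
    """Return teacher parameters updated as decay * teacher + (1 - decay) * student."""
    return {
        name: decay * teacher_params[name] + (1.0 - decay) * student_params[name]
        for name in teacher_params
    }


# Hypothetical scalar parameters; real models would hold tensors per name.
teacher = {"w": 1.0, "b": 0.0}
student = {"w": 0.0, "b": 2.0}

# The teacher is not updated by gradients; it slowly tracks the student.
teacher = ema_update(teacher, student, decay=0.5)
```

A decay close to 1 makes the teacher change slowly, which keeps the positioning information it produces for the unlabeled data relatively stable between training steps.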
Based on the above steps, the embodiment of the application can complete the training of the user behavior intention mining model according to the first sample user behavior data carrying the prior user behavior intention labeling information and the second sample user behavior data not carrying the prior user behavior intention labeling information. As a result, even when the quantity of first sample user behavior data carrying the prior user behavior intention labeling information is limited, a high-reliability user behavior intention mining model can be obtained from a large quantity of second sample user behavior data not carrying the prior user behavior intention labeling information. Meanwhile, because only a small amount of first sample user behavior data carrying the prior user behavior intention labeling information is needed, the model optimization cost of the user behavior intention mining model can be effectively reduced, and the first sample user behavior data carrying the prior user behavior intention labeling information can be utilized more efficiently.
In addition, model parameters of the user behavior intention mining model are optimized according to first sample user behavior data carrying prior user behavior intention marking information and second sample user behavior data not carrying prior user behavior intention marking information, so that the trained user behavior intention mining model can process the first sample user behavior data and the second sample user behavior data, and the generalization of the user behavior intention mining model is improved.
Further, basic training flow training is performed on the user behavior intention mining model based on the first intention learning cost information, the second intention learning cost information and the behavior intention mining cost, and the user behavior intention mining model that has completed the basic training flow is output. In the basic training flow, a model weight parameter optimization gradient of the behavior intention mining unit can be determined according to the global learning cost information corresponding to the basic training flow (namely, the sum of the first intention learning cost information, the second intention learning cost information and the behavior intention mining cost), and the model weights of the behavior intention mining unit can be tuned based on that gradient. Then, according to the model weight parameter optimization gradient of the behavior intention mining unit, the model weight parameter optimization gradient of the student intention positioning unit is determined based on the GRL, and the model weights of the student intention positioning unit are tuned based on the model weight parameter optimization gradient of the student intention positioning unit.
Advanced training flow training is then performed on the user behavior intention mining model that has completed the basic training flow, based on the first intention learning cost information, the second intention learning cost information, the target learning cost and the behavior intention mining cost, and the user behavior intention mining model meeting the model deployment requirement is output. In the advanced training flow, a model weight parameter optimization gradient of the behavior intention mining unit can be determined according to the global learning cost information (namely, the sum of the first intention learning cost information, the second intention learning cost information, the target learning cost and the behavior intention mining cost), and the model weights of the behavior intention mining unit are tuned based on that gradient. Then, according to the model weight parameter optimization gradient of the behavior intention mining unit, the model weight parameter optimization gradient of the student intention positioning unit is determined based on the gradient reversal layer (GRL), and the model weights of the student intention positioning unit are tuned based on the model weight parameter optimization gradient of the student intention positioning unit. Finally, the parameters of the teacher intention positioning unit are determined from the parameters of the student intention positioning unit based on an exponentially weighted average strategy.
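The two global learning costs described above differ only in whether the target learning cost is included. A minimal sketch, with the hypothetical function name `global_learning_cost` and unweighted sums as stated in the text:

```python
def global_learning_cost(first_cost, second_cost, mining_cost, target_cost=None):
    """Sum of the cost terms; the target cost is only used in the advanced flow."""
    total = first_cost + second_cost + mining_cost
    if target_cost is not None:
        total += target_cost
    return total


# Basic training flow: no teacher-student target cost yet.
basic_total = global_learning_cost(1.0, 2.0, 0.5)

# Advanced training flow: the target learning cost is added.
advanced_total = global_learning_cost(1.0, 2.0, 0.5, target_cost=0.25)
```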
In some embodiments, cloud computing business system 100 may include a processor 110, a machine-readable storage medium 120, a bus 130, and a communication unit 140.
The processor 110 may perform various suitable actions and processes based on a program stored in the machine-readable storage medium 120, such as program instructions related to the internet big data based user behavior analysis method described in the foregoing embodiments. The processor 110, the machine-readable storage medium 120, and the communication unit 140 perform signal transmission through the bus 130.
In particular, the processes described in the exemplary flow diagrams above may be implemented as computer software programs, according to embodiments of the present invention. For example, embodiments of the invention include a computer program product comprising a computer program embodied on a computer-readable medium, the computer program comprising program code for performing the method illustrated in the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network via the communication unit 140, and when executed by the processor 110, performs the above-described functions defined in the methods of the embodiments of the present invention.
The invention further provides a computer-readable storage medium in which computer-executable instructions are stored; when the computer-executable instructions are executed by a processor, the method for analyzing user behavior based on internet big data according to any of the above embodiments is implemented.
Yet another embodiment of the present invention further provides a computer program product, which includes a computer program, and when the computer program is executed by a processor, the method for analyzing user behavior based on internet big data according to any of the above embodiments is implemented.
It should be understood that, although each operation step is indicated by an arrow in the flowchart of the embodiment of the present application, the implementation order of the steps is not limited to the order indicated by the arrow. In some implementation scenarios of the embodiments of the present application, the implementation steps in the flowcharts may be performed in other sequences as desired, unless explicitly stated otherwise herein. In addition, some or all of the steps in each flowchart may include multiple sub-steps or multiple stages based on an actual implementation scenario. Some or all of these sub-steps or stages may be performed at the same time, or each of these sub-steps or stages may be performed at different times, respectively. In a scenario where execution times are different, an execution sequence of the sub-steps or the phases may be flexibly configured according to requirements, which is not limited in the embodiment of the present application.
The foregoing describes only optional implementations of some implementation scenarios of this application. It should be noted that, for those skilled in the art, other similar implementation means based on the technical idea of this application, adopted without departing from that technical idea, also fall within the protection scope of the embodiments of this application.

Claims (9)

1. A user behavior analysis method based on Internet big data is characterized in that the method is executed through a cloud computing service system, and the method comprises the following steps:
performing behavior intention mining on internet behavior big data of a target user to generate user behavior intention data of the target user, loading the user behavior intention data into a user behavior intention big data log of the target user in real time, and generating target user behavior intention data matched with the current service requirement to be online based on the user behavior intention big data log;
performing behavior intention relationship extraction on the target user behavior intention data to generate a corresponding behavior intention relationship map, wherein the behavior intention relationship map is used for representing a plurality of behavior intention entities and behavior intention relationship attributes among the behavior intention entities;
extracting target behavioral intention entities of which behavioral intention relationship attributes are associated with at least two behavioral intention entities from the behavioral intention relationship graph to obtain at least one target behavioral intention entity and associated behavioral intention entities of which each target behavioral intention entity is associated through the behavioral intention relationship attributes;
performing internet content pushing on an internet service page corresponding to the target user based on at least one target behavior intention entity and an associated behavior intention entity associated with each target behavior intention entity through a behavior intention relation attribute;
the step of mining the internet behavior big data of the target user to generate the user behavior intention data of the target user comprises the following steps:
inputting the Internet behavior big data of the target user into a user behavior intention mining model meeting the model deployment requirement, and generating user behavior intention data of the target user;
the user behavior intention mining model comprises a student intention positioning unit, a teacher intention positioning unit and a behavior intention mining unit;
the training step of the user behavior intention mining model comprises the following steps:
processing first sample user behavior data carrying prior user behavior intention labeling information according to the student intention positioning unit, and outputting student intention positioning information of the first sample user behavior data, wherein the student intention positioning information of the first sample user behavior data represents a student intention positioning point and a student intention positioning label of a sample user in the first sample user behavior data;
respectively processing second sample user behavior data which do not carry prior user behavior intention marking information according to the student intention positioning unit and the teacher intention positioning unit, and outputting student intention positioning information and teacher intention positioning information of the second sample user behavior data; wherein the student intention positioning information of the second sample user behavior data characterizes student intention positioning points and student intention positioning labels of the sample users in the second sample user behavior data, and the teacher intention positioning information of the second sample user behavior data characterizes teacher intention positioning points and teacher intention positioning labels of the sample users in the second sample user behavior data;
respectively mining the behavior intention characteristics of the first sample user behavior data and the behavior intention characteristics of the second sample user behavior data obtained by the student intention positioning unit according to the behavior intention mining unit, and outputting the behavior intention data of the first sample user behavior data and the behavior intention data of the second sample user behavior data;
and performing model parameter optimization on the user behavior intention mining model based on the student intention positioning information of the first sample user behavior data, the student intention positioning information and the teacher intention positioning information of the second sample user behavior data, and the behavior intention data.
2. The internet big data-based user behavior analysis method according to claim 1, wherein the step of generating target user behavior intention data matching the current business demand to be online based on the user behavior intention big data log comprises:
acquiring basic user behavior intention data which is matched with the current service requirement to be online from a user behavior intention big data log of a target user, wherein the basic user behavior intention data is characteristic data of basic dimensionality;
mining basic behavior intention variables of the basic user behavior intention data;
matching the basic behavior intention variable with a plurality of derived behavior intention variables in a derived intention database to generate matching state information corresponding to each derived behavior intention variable, wherein the derived intention database comprises a plurality of derived intention data and derived behavior intention variables of each derived intention data, and the derived intention data are feature data of derived dimensions;
and obtaining derived behavior intention data corresponding to the basic user behavior intention data from each derived intention data based on the matching state information corresponding to each derived behavior intention variable, and outputting the basic user behavior intention data and the corresponding derived behavior intention data as target user behavior intention data matching the current business demand to be online.
3. The internet big data-based user behavior analysis method according to claim 2, wherein the basic behavior intention variables are mined through a basic behavior intention variable mining model; the derived behavior intention variables of the derived intention data are mined through a derived behavior intention variable mining model;
the model updating steps of the basic behavior intention variable mining model and the derived behavior intention variable mining model comprise:
obtaining a sample model learning data sequence, wherein the sample model learning data sequence comprises a basic sample model learning data set, and each basic sample model learning data in the basic sample model learning data set comprises first basic behavior intention training data of basic dimensionality and first derivative behavior intention training data of derivative dimensionality corresponding to the first basic behavior intention training data;
iteratively optimizing model parameters of a behavior intention variable learning model, whose model weight parameters have been initialized, according to the sample model learning data sequence until the model learning cost value converges, wherein the behavior intention variable learning model comprises a basic behavior intention variable learning model and a derivative behavior intention variable learning model, the basic behavior intention variable learning model when the model learning cost value converges is used as the basic behavior intention variable mining model, and the derivative behavior intention variable learning model when the model learning cost value converges is used as the derivative behavior intention variable mining model; wherein the specific model updating step comprises:
loading each piece of first basic behavior intention training data into a basic behavior intention variable learning model, generating behavior intention variables of each piece of first basic behavior intention training data, loading each piece of first derived behavior intention training data into a derived behavior intention variable learning model, and generating behavior intention variables of each piece of first derived behavior intention training data;
determining a first model learning cost value according to the correlation degree of the behavior intention variables of the first basic behavior intention training data and the behavior intention variables of the first derivative behavior intention training data in each basic example model learning data set and the correlation degree of the behavior intention variables of the first basic behavior intention training data and the behavior intention variables of the first derivative behavior intention training data in each basic negative direction model learning data; wherein the basic negative-going model learning data comprises first basic behavior intent training data of one basic example model learning data and first derivative behavior intent training data of another basic example model learning data;
and if the first model learning cost value does not meet a first training convergence requirement, optimizing the model weight information of the basic behavior intention variable learning model and the derivative behavior intention variable learning model, wherein the model learning cost value convergence comprises that the first model learning cost value meets the first training convergence requirement.
4. The internet big data-based user behavior analysis method according to claim 3, wherein the loading each of the first basic behavior intention training data into a basic behavior intention variable learning model to generate the behavior intention variable of each of the first basic behavior intention training data comprises:
for each first basic behavior intention training data, executing the following steps on the first basic behavior intention training data through the basic behavior intention variable learning model to generate behavior intention variables of the first basic behavior intention training data:
splitting the first basic behavior intention training data into at least two behavior intention training member data to generate a behavior intention training member data cluster corresponding to the first basic behavior intention training data;
according to an intention variable knowledge base, behavior intention variables of each behavior intention training member data in the behavior intention training member data cluster are mined, wherein the intention variable knowledge base comprises a plurality of intention variable knowledge points, the number of characteristic values included in the behavior intention variables of each behavior intention training member data is equal to the number of the intention variable knowledge points in the intention variable knowledge base, and one characteristic value represents the confidence degree of the intention variable knowledge point corresponding to the position of the characteristic value in the intention training member data;
generating behavior intention variables of the first basic behavior intention training data according to the behavior intention variables of the behavior intention training member data;
the method further comprises the following steps:
for each first derived behavior intention training data, determining, according to the intention variable knowledge base, characteristic behavior intention variables of the first derived behavior intention training data corresponding to the intention variable knowledge base, wherein the characteristic behavior intention variables represent the confidence degrees of the first derived behavior intention training data corresponding to each intention variable knowledge point in the intention variable knowledge base;
the determining a first model learning cost value comprises:
determining a first model learning cost value according to a correlation degree between the behavior intention variable of each behavior intention training member data of the first basic behavior intention training data in each basic example model learning data set and the characteristic behavior intention variable of the first derivative behavior intention training data corresponding to the intention variable knowledge base, a correlation degree between the behavior intention variable of the first basic behavior intention training data in each basic example model learning data set and the behavior intention variable of the first derivative behavior intention training data, and a correlation degree between the behavior intention variable of the first basic behavior intention training data in each basic negative-direction model learning data and the behavior intention variable of the first derivative behavior intention training data.
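By way of non-limiting illustration, the splitting and knowledge-base steps of claim 4 might be sketched as follows. The keyword-style knowledge base, the fixed chunk size, the frequency-based confidence, and the mean pooling are all assumptions made for this example; the claim does not specify them, and the function names are hypothetical.

```python
# Hypothetical intention variable knowledge base: one keyword per knowledge point.
KNOWLEDGE_BASE = ["search", "cart", "purchase", "review"]


def split_into_members(training_data, size=2):
    """Split one training record (a list of events) into member data chunks."""
    return [training_data[i:i + size] for i in range(0, len(training_data), size)]


def member_intention_variable(member):
    """Return one confidence per knowledge point, in knowledge-base order.

    Confidence is assumed here to be the share of events in the member
    that match the knowledge point.
    """
    return [sum(1 for event in member if event == kp) / len(member)
            for kp in KNOWLEDGE_BASE]


def aggregate(member_vectors):
    """Mean-pool the member vectors into the variable of the whole record."""
    n = len(member_vectors)
    return [sum(vec[i] for vec in member_vectors) / n
            for i in range(len(KNOWLEDGE_BASE))]


record = ["search", "search", "cart", "purchase"]
members = split_into_members(record)
vectors = [member_intention_variable(m) for m in members]
record_variable = aggregate(vectors)
```

Each member vector has exactly as many characteristic values as there are knowledge points, and the value at a given position is the confidence for the knowledge point at that position, which is the property the claim requires.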
5. The internet big data-based user behavior analysis method according to claim 3, wherein the determining the first model learning cost value according to the correlation between the behavior intention variable of the first basic behavior intention training data and the behavior intention variable of the first derived behavior intention training data in each basic example model learning data set, and the correlation between the behavior intention variable of the first basic behavior intention training data and the behavior intention variable of the first derived behavior intention training data in each basic negative-direction model learning data comprises:
determining characteristic distances between behavior intention variables of first basic behavior intention training data and behavior intention variables of first derivative behavior intention training data of each basic example model learning data to generate a first model learning cost value;
for each piece of first basic behavior intention training data, determining a basic associated parameter value corresponding to the first basic behavior intention training data and a derived associated parameter value corresponding to the first basic behavior intention training data, wherein the basic associated parameter value is an associated parameter value between a behavior intention variable of the first basic behavior intention training data and a behavior intention variable of first derived behavior intention training data corresponding to the first basic behavior intention training data, and the derived associated parameter value is an associated parameter value between a negative-going behavior intention variable of the first basic behavior intention training data and a behavior intention variable of the first derived behavior intention training data in basic model learning data in which the first basic behavior intention training data is located;
acquiring training marking information corresponding to the first basic behavior intention training data, wherein the training marking information comprises associated parameter value marking information corresponding to basic associated parameter values and associated parameter value marking information corresponding to derived associated parameter values;
determining a second model learning cost value according to model output associated parameter values and training marking information corresponding to the first basic behavior intention training data, wherein the model output associated parameter values comprise the basic associated parameter values and the derivative associated parameter values, and the second model learning cost value represents a loss function value between the model output associated parameter values and the training marking information corresponding to the first basic behavior intention training data;
and determining the final first model learning cost value according to the first model learning cost value generated from the characteristic distances and the second model learning cost value.
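By way of non-limiting illustration, the cost computation of claim 5 might be sketched as follows. Euclidean distance as the characteristic distance, cosine similarity as the associated parameter value, marking values of 1 for the example pair and 0 for the negative pair, a mean-squared-error loss, and the additive combination with weight `alpha` are all assumptions made for this example; the claim leaves these choices open.

```python
import math


def euclidean(u, v):
    """Characteristic distance between two behavior intention variables."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def cosine(u, v):
    """Associated parameter value, assumed here to be cosine similarity."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def distance_cost(triples):
    """Mean distance between basic and derived variables of example pairs."""
    return sum(euclidean(basic, derived) for basic, derived, _ in triples) / len(triples)


def association_cost(triples):
    """Mean squared error between model-output associated values and marks.

    The basic associated value (basic vs. derived) is marked 1 and the
    derived associated value (negative basic vs. derived) is marked 0.
    """
    errors = []
    for basic, derived, negative in triples:
        errors.append((cosine(basic, derived) - 1.0) ** 2)
        errors.append((cosine(negative, derived) - 0.0) ** 2)
    return sum(errors) / len(errors)


def combined_cost(triples, alpha=1.0):
    """Distance-based cost plus alpha times the association cost."""
    return distance_cost(triples) + alpha * association_cost(triples)
```

Each triple holds the basic variable, its derived variable, and the negative-direction basic variable. A perfectly aligned example pair with an orthogonal negative yields a combined cost of zero.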
6. The internet big data-based user behavior analysis method according to any one of claims 3 to 5, wherein the derived intention data is derived features of derived dimensions corresponding to extended intention data of preset derived dimensions; the behavior intention variable learning model initialized and configured by the model weight parameters further comprises a label decision model;
the example model learning data sequence further comprises a derived example model learning data set, each derived example model learning data in the derived example model learning data set comprises second basic behavior intention training data of a basic dimension, second derived behavior intention training data of a derived dimension corresponding to the second basic behavior intention training data, and a calibration derivative label of the second basic behavior intention training data, wherein the second basic behavior intention training data in the derived example model learning data set comprises second basic behavior intention training data of a preset derived dimension and second basic behavior intention training data of a non-preset derived dimension;
after obtaining the behavior intention variable learning model with the first model learning cost value meeting the first training convergence requirement, the method further comprises the following steps:
continuing to perform traversal model weight optimization on the behavior intention variable learning model according to the derived example model learning data until a second model learning cost value meets a second training convergence requirement, wherein convergence of the model learning cost value comprises the second model learning cost value meeting the second training convergence requirement; the method further comprises the following steps:
loading each piece of second basic behavior intention training data into a basic behavior intention variable learning model, generating behavior intention variables of each piece of second basic behavior intention training data, loading each piece of second derived behavior intention training data into a derived behavior intention variable learning model, generating behavior intention variables of each piece of second derived behavior intention training data, loading the behavior intention variables of each piece of second basic behavior intention training data into a label decision model, and generating decision derived labels corresponding to each piece of second basic behavior intention training data;
determining a second model learning cost value according to the correlation degree of the behavior intention variables of the second basic behavior intention training data and the behavior intention variables of the second derived behavior intention training data in each derived example model learning data, the correlation degree of the behavior intention variables of the second basic behavior intention training data and the behavior intention variables of the second derived behavior intention training data in each derived negative direction model learning data, and the correlation degree between the calibration derivative labels and the decision derivative labels of each second basic behavior intention training data;
and if the second model learning cost value does not meet the second training convergence requirement, optimizing the model weight information of the behavior intention variable learning model.
7. The internet big data-based user behavior analysis method according to claim 1, wherein performing model parameter tuning on the user behavior intention mining model based on the student intention positioning information of the first sample user behavior data, the student intention positioning information and the teacher intention positioning information of the second sample user behavior data, and the behavior intention data comprises:
determining first intention learning cost information and second intention learning cost information based on the student intention positioning information of the first sample user behavior data and the prior user behavior intention labeling information of the first sample user behavior data; the first intention learning cost information is used for evaluating the intention positioning tag mining precision of the student intention positioning unit on the first sample user behavior data by taking intention positioning point marking information of the first sample user behavior data as a reference, and the second intention learning cost information is used for evaluating the intention positioning point mining precision of the student intention positioning unit on the first sample user behavior data by taking the intention positioning point marking information of the first sample user behavior data as a reference;
determining target learning cost based on the student intention positioning information and the teacher intention positioning information of the second sample user behavior data; the target learning cost is used for evaluating intention positioning labels and intention positioning point mining accuracy of the student intention positioning unit on the second sample user behavior data by taking teacher intention positioning information of the second sample user behavior data as comparison training basis information;
determining behavior intention mining cost based on the behavior intention data and the behavior intention labeling information; wherein the behavioral intention mining cost is used for evaluating the behavioral intention mining precision of the behavioral intention mining unit;
and performing model parameter optimization on the user behavior intention mining model based on the first intention learning cost information, the second intention learning cost information, the target learning cost and the behavior intention mining cost.
8. The internet big data-based user behavior analysis method according to claim 7, wherein the model parameter tuning of the user behavior intention mining model based on the first intention learning cost information, the second intention learning cost information, the target learning cost and the behavior intention mining cost comprises:
determining global learning cost information of the user behavior intention mining model based on the first intention learning cost information, the second intention learning cost information, the target learning cost and the behavior intention mining cost;
determining a model weight parameter optimization gradient of the behavior intention mining unit based on the global learning cost information, and carrying out model weight information tuning on the behavior intention mining unit based on the model weight parameter optimization gradient of the behavior intention mining unit;
determining, by a gradient descent method and based on the model weight parameter optimization gradient of the behavior intention mining unit, a model weight parameter optimization gradient of the student intention positioning unit, and performing model weight information tuning on the student intention positioning unit based on the model weight parameter optimization gradient of the student intention positioning unit;
determining, by an exponentially weighted average strategy, the parameters of the teacher intention positioning unit based on the parameters of the student intention positioning unit;
wherein the determining a target learning cost based on the student intention positioning information and the teacher intention positioning information of the second sample user behavior data comprises:
determining the target learning cost based on a positioning support degree difference between student intention positioning information and teacher intention positioning information of the second sample user behavior data and a weight parameter; wherein the weight parameter is dynamically adjusted according to teacher intention positioning information of the second sample user behavior data.
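By way of non-limiting illustration, the student and teacher steps of claim 8 might be sketched as follows. The weighted-sum form of the global cost, the decay value, the squared difference as the positioning support degree difference, and the use of the teacher's own confidence as the dynamically adjusted weight are all assumptions made for this example; the function names are hypothetical.

```python
def global_cost(first, second, target, mining, weights=(1.0, 1.0, 1.0, 1.0)):
    """Combine the four learning costs into one global cost (weighted sum)."""
    return sum(w * c for w, c in zip(weights, (first, second, target, mining)))


def ema_update(teacher_params, student_params, decay=0.99):
    """Exponentially weighted average: the teacher follows the student.

    teacher <- decay * teacher + (1 - decay) * student, per parameter.
    The teacher is not updated by gradients in this sketch.
    """
    return [decay * t + (1.0 - decay) * s
            for t, s in zip(teacher_params, student_params)]


def target_learning_cost(student_conf, teacher_conf):
    """Weighted difference between student and teacher support degrees.

    The weight of each term is taken from the teacher's confidence, so
    positions the teacher is unsure about contribute less to the cost.
    """
    total = 0.0
    for s, t in zip(student_conf, teacher_conf):
        weight = t  # dynamic weight derived from the teacher output (assumed)
        total += weight * (s - t) ** 2
    return total / len(teacher_conf)
```

In a training loop under these assumptions, the student parameters would be updated from the gradient of `global_cost`, after which `ema_update` would refresh the teacher parameters once per step.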
9. A cloud computing service system, characterized in that the cloud computing service system comprises a processor and a memory for storing a computer program capable of running on the processor, and the processor is used for executing the user behavior analysis method based on internet big data according to any one of claims 1 to 8 when the computer program is run.
CN202210572060.5A 2022-05-25 2022-05-25 User behavior analysis method based on internet big data and cloud computing service system Active CN114817747B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202210572060.5A CN114817747B (en) 2022-05-25 2022-05-25 User behavior analysis method based on internet big data and cloud computing service system
CN202211291929.5A CN115587252A (en) 2022-05-25 2022-05-25 User behavior intention mining method and system based on internet big data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210572060.5A CN114817747B (en) 2022-05-25 2022-05-25 User behavior analysis method based on internet big data and cloud computing service system

Related Child Applications (1)

Application Number Title Priority Date Filing Date
CN202211291929.5A Division CN115587252A (en) 2022-05-25 2022-05-25 User behavior intention mining method and system based on internet big data

Publications (2)

Publication Number Publication Date
CN114817747A CN114817747A (en) 2022-07-29
CN114817747B true CN114817747B (en) 2022-11-15

Family

ID=82516568

Family Applications (2)

Application Number Title Priority Date Filing Date
CN202211291929.5A Pending CN115587252A (en) 2022-05-25 2022-05-25 User behavior intention mining method and system based on internet big data
CN202210572060.5A Active CN114817747B (en) 2022-05-25 2022-05-25 User behavior analysis method based on internet big data and cloud computing service system

Family Applications Before (1)

Application Number Title Priority Date Filing Date
CN202211291929.5A Pending CN115587252A (en) 2022-05-25 2022-05-25 User behavior intention mining method and system based on internet big data

Country Status (1)

Country Link
CN (2) CN115587252A (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115757900B (en) * 2022-12-20 2023-08-01 创贸科技(深圳)集团有限公司 User demand analysis method and system applying artificial intelligent model
CN115858418B (en) * 2023-02-09 2023-05-05 成都有为财商教育科技有限公司 Data caching method and system

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113392330A (en) * 2021-08-17 2021-09-14 湖南轻悦健康管理有限公司 Big data processing method and system based on internet behaviors
CN113628005A (en) * 2021-07-31 2021-11-09 李德财 E-commerce session big data based pushing and updating method and big data AI system
CN113901320A (en) * 2021-10-19 2022-01-07 平安科技(深圳)有限公司 Scene service recommendation method, device, equipment and storage medium

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8468170B2 (en) * 2008-12-15 2013-06-18 Microsoft Creating ad hoc relationships between entities

Also Published As

Publication number Publication date
CN114817747A (en) 2022-07-29
CN115587252A (en) 2023-01-10

Similar Documents

Publication Publication Date Title
CN114817747B (en) User behavior analysis method based on internet big data and cloud computing service system
CN110765265A (en) Information classification extraction method and device, computer equipment and storage medium
CN110598206A (en) Text semantic recognition method and device, computer equipment and storage medium
CN113326426A (en) Information pushing method and system based on big data positioning and artificial intelligence
CN110737818B (en) Network release data processing method, device, computer equipment and storage medium
CN113722493B (en) Text classification data processing method, apparatus and storage medium
CN111783993A (en) Intelligent labeling method and device, intelligent platform and storage medium
US11275888B2 (en) Hyperlink processing method and apparatus
CN113377936A (en) Intelligent question and answer method, device and equipment
CN111241850B (en) Method and device for providing business model
CN111274822A (en) Semantic matching method, device, equipment and storage medium
CN111680165A (en) Information matching method and device, readable storage medium and electronic equipment
CN115935344A (en) Abnormal equipment identification method and device and electronic equipment
CN113343073A (en) Big data and artificial intelligence based information fraud identification method and big data system
US20200027034A1 (en) System and method for relationship identification
CN115098556A (en) User demand matching method and device, electronic equipment and storage medium
CN113642652A (en) Method, device and equipment for generating fusion model
CN111400340A (en) Natural language processing method and device, computer equipment and storage medium
CN114328942A (en) Relationship extraction method, apparatus, device, storage medium and computer program product
CN114896502B (en) User demand decision method applying AI and big data analysis and Internet system
CN114238740A (en) Method and device for determining agent brand of agent main body
CN115062619B (en) Chinese entity linking method, device, equipment and storage medium
CN114647739B (en) Entity chain finger method, device, electronic equipment and storage medium
CN114978765A (en) Big data processing method serving information attack defense and AI attack defense system
CN114090781A (en) Text data-based repulsion event detection method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20221010

Address after: No. 28, Binjiang Avenue, Tongren, Guizhou 554300

Applicant after: Hong Xingfa

Address before: 554300 room 1202, building a, Jinjiang street, Bijiang District, Tongren City, Guizhou Province

Applicant before: Tongren Hengsheng Network Technology Co.,Ltd.

TA01 Transfer of patent application right

Effective date of registration: 20221031

Address after: Room 908, Building 2, No. 968 Jinzhong Road, Changning District, Shanghai, 200000 (actual floor: 8th floor)

Applicant after: CUMMY TECHNOLOGY (SHANGHAI) Co.,Ltd.

Address before: No. 28, Binjiang Avenue, Tongren, Guizhou 554300

Applicant before: Hong Xingfa

GR01 Patent grant