CN114880578A - User behavior data processing method, device, equipment and storage medium - Google Patents

User behavior data processing method, device, equipment and storage medium

Info

Publication number
CN114880578A
Authority
CN
China
Prior art keywords
target
feature vector
user
behavior data
feature
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210622588.9A
Other languages
Chinese (zh)
Inventor
黄福华
郑文琛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
WeBank Co Ltd
Original Assignee
WeBank Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by WeBank Co Ltd filed Critical WeBank Co Ltd
Priority to CN202210622588.9A priority Critical patent/CN114880578A/en
Publication of CN114880578A publication Critical patent/CN114880578A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • Databases & Information Systems (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Software Systems (AREA)
  • Medical Informatics (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application provides a user behavior data processing method, apparatus, device and storage medium. The method comprises: acquiring a target task, where the target task comprises a target feature vector required by an AI model to be trained, and the target feature vector is a combined feature of a plurality of feature vectors of a target user; if the target feature vector required by the AI model does not exist in a feature vector library, acquiring the plurality of feature vectors of the target user from the feature vector library according to the target feature vector required by the AI model, where the feature vector library comprises a plurality of pre-constructed feature vectors obtained from user behavior data of different channels; combining the plurality of feature vectors to obtain the target feature vector; and training the AI model according to the target feature vector to obtain the trained AI model. The application improves feature development efficiency.

Description

User behavior data processing method, device, equipment and storage medium
Technical Field
The present application relates to artificial intelligence technologies, and in particular, to a method, an apparatus, a device, and a storage medium for processing user behavior data.
Background
Data features play an important role in Artificial Intelligence (AI) learning. Good data features can improve model accuracy, whereas a good algorithm or model can only approach the upper bound that the data and features allow.
Currently, if a model needs business behavior features from many aspects, manual feature engineering is required. If another model needs other business behavior features, feature engineering has to be carried out again even when those features partially overlap with the features of the existing model. This causes repeated work and low feature development efficiency.
Disclosure of Invention
The application provides a user behavior data processing method, a user behavior data processing device, user behavior data processing equipment and a storage medium, which are used for solving the problem of low feature development efficiency.
In a first aspect, the present application provides a user behavior data processing method, including: acquiring a target task, where the target task comprises a target feature vector required by an AI model to be trained, and the target feature vector is a combined feature of a plurality of feature vectors of a target user; if the target feature vector required by the AI model does not exist in a feature vector library, acquiring the plurality of feature vectors of the target user from the feature vector library according to the target feature vector required by the AI model, where the feature vector library comprises a plurality of pre-constructed feature vectors obtained from user behavior data of different channels; combining the plurality of feature vectors to obtain the target feature vector; and training the AI model according to the target feature vector to obtain the trained AI model.
Optionally, before obtaining the plurality of feature vectors from the feature vector library according to the target feature vector required by the AI model, the method further includes: acquiring sample data of a target user of a feature vector to be constructed; the sample data of the target user comprises an identifier of the target user and behavior data of the target user, and the identifier of the target user and the behavior data of the target user are uncoupled data; extracting the identifier of the target user from the sample data to obtain the sample identifier of the feature vector to be constructed; extracting the behavior data of the target user from the sample data according to the sample identification of the feature vector to be constructed; and performing feature coding on the behavior data of the target user to obtain a feature vector of the target user.
Optionally, the obtaining sample data of the target user of the feature vector to be constructed includes: acquiring original data of a target user of a feature vector to be constructed; the original data comprises the identification of the target user and original behavior data of the target user; the identification of the target user and the original behavior data of the target user are coupled data; and performing characteristic decoupling processing on the identification of the target user and the original behavior data of the target user to obtain sample data of the target user.
Optionally, the performing feature coding on the behavior data of the target user to obtain a feature vector of the target user includes: determining a data type of the behavior data of the target user, where the data types comprise a character type, a numeric type and a classification type; if the data type of the behavior data of the target user is the character type, encoding the behavior data of the target user through a first encoding mode to obtain a first feature vector of the target user, where the first encoding mode is an encoding mode used for behavior data of the character type; if the data type of the behavior data of the target user is the numeric type, encoding the behavior data of the target user through a second encoding mode to obtain a second feature vector of the target user, where the second encoding mode is an encoding mode used for behavior data of the numeric type; if the data type of the behavior data of the target user is the classification type, encoding the behavior data of the target user through a third encoding mode to obtain a third feature vector of the target user, where the third encoding mode is an encoding mode used for behavior data of the classification type.
Optionally, the feature vector library further includes a plurality of pre-stored target feature vectors, and the method further includes: determining whether the target feature vector required by the AI model exists in the feature vector library; and if the target feature vector required by the AI model exists in the feature vector library, acquiring the target feature vector required by the AI model from the feature vector library.
Optionally, before determining whether the target feature vector required by the AI model exists in the feature vector library, the method further includes: determining the storage value of each target feature vector; sorting the target feature vectors in descending order of storage value; and storing the top N target feature vectors in the feature vector library.
Optionally, the determining a storage value of the target feature vector includes: determining the use frequency and the combination cost of the target feature vector for AI model training; wherein the combination cost is time consumed for combining a plurality of feature vectors for constructing the target feature vector; and determining the storage value of the target feature vector according to the weighted sum of the use frequency and the combination cost of the target feature vector.
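As a rough illustration of the ranking described above, the following sketch computes a storage value as a weighted sum of use frequency and combination cost and keeps the top N target feature vectors. The equal weights, the class and field names, and the sample numbers are assumptions made for illustration only; the application does not specify them.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CombinedFeature:
    name: str                # hypothetical identifier of a target feature vector
    use_frequency: float     # how often it is used for AI model training
    combination_cost: float  # time spent combining its component vectors


def storage_value(f: CombinedFeature, w_freq: float = 0.5, w_cost: float = 0.5) -> float:
    """Weighted sum of use frequency and combination cost (weights are assumed)."""
    return w_freq * f.use_frequency + w_cost * f.combination_cost


def select_top_n(features: List[CombinedFeature], n: int) -> List[CombinedFeature]:
    """Sort by storage value in descending order and keep the first n entries."""
    return sorted(features, key=storage_value, reverse=True)[:n]


candidates = [
    CombinedFeature("app+public_account", use_frequency=10, combination_cost=2.0),
    CombinedFeature("public_account+applet", use_frequency=3, combination_cost=8.0),
    CombinedFeature("app+applet", use_frequency=1, combination_cost=1.0),
]
stored = select_top_n(candidates, n=2)
```

With these assumed numbers, a rarely used but expensive combination can still outrank a cheap one, which is the point of weighting both terms.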
Optionally, the target feature vector is usage behavior data of the user on a withdrawal platform; the plurality of feature vectors are feature vectors constructed according to the use behavior data of different withdrawal platforms of the user; the AI model is a model used for predicting the withdrawal rate of the user according to the target feature vector and a label, wherein the withdrawal rate is used for representing the withdrawal probability of the user, and the label is used for representing whether the user withdraws money or not.
In a second aspect, the present application provides a user behavior data processing apparatus, including: an acquisition module, configured to acquire a target task, where the target task comprises a target feature vector required by an AI model to be trained, and the target feature vector is a combined feature of a plurality of feature vectors of a target user; the acquisition module is further configured to, if the target feature vector required by the AI model does not exist in a feature vector library, acquire the plurality of feature vectors of the target user from the feature vector library according to the target feature vector required by the AI model, where the feature vector library comprises a plurality of pre-constructed feature vectors obtained from user behavior data of different channels; a combination module, configured to combine the plurality of feature vectors to obtain the target feature vector; and a training module, configured to train the AI model according to the target feature vector to obtain the trained AI model.
In a third aspect, the present application provides an electronic device, comprising: a processor, and a memory communicatively coupled to the processor; the memory stores computer-executable instructions; the processor executes computer-executable instructions stored by the memory to implement the method of the first aspect.
In a fourth aspect, the present application provides a computer-readable storage medium having stored thereon computer-executable instructions for implementing the method according to the first aspect when executed by a processor.
In a fifth aspect, the present application provides a computer program product comprising a computer program which, when executed by a processor, implements the method of the first aspect.
According to the user behavior data processing method, apparatus, device and storage medium provided by the present application, when the target feature vector required by the AI model is not found in the feature vector library, a plurality of feature vectors of the target user are looked up in the feature vector library according to the target feature vector required by the AI model, and the plurality of feature vectors are combined and then used for training the AI model. Because the feature vector library comprises a plurality of pre-constructed feature vectors obtained from user behavior data of different channels, when the target feature vectors used by different models partially overlap, the overlapping features are developed only once and can be reused by the different models. Repeated feature development is avoided and feature development efficiency is improved.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present application and together with the description, serve to explain the principles of the application.
Fig. 1 is a schematic view of an application scenario provided in an embodiment of the present application;
Fig. 2 is a first flowchart of a user behavior data processing method according to an embodiment of the present application;
Fig. 3 is an exemplary diagram of a feature vector combination provided by an embodiment of the present application;
Fig. 4 is a flowchart of constructing a feature vector library according to an embodiment of the present application;
Fig. 5 is a second flowchart of a user behavior data processing method according to an embodiment of the present application;
Fig. 6 is a schematic structural diagram of a user behavior data processing apparatus according to an embodiment of the present application;
Fig. 7 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
With the above figures, there are shown specific embodiments of the present application, which will be described in more detail below. These drawings and written description are not intended to limit the scope of the inventive concepts in any manner, but rather to illustrate the inventive concepts to those skilled in the art by reference to specific embodiments.
Detailed Description
Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The implementations described in the following exemplary examples do not represent all implementations consistent with the present application. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the application, as detailed in the appended claims.
Machine learning is defined as the use of experience, which is mainly in the form of data in computers, to improve the performance of computer systems themselves, and thus, data is the premise and basis for machine learning.
To make the performance of a prediction model as good as possible, we need to extract as much information as possible from the raw data. The purpose of extracting this information is to obtain better training data, that is, to manually design the input variables X of the model.
Feature engineering is the process of transforming raw data into features that describe the data well, so that a model built on those features performs optimally on unseen data.
In machine learning, a data set needs to be constructed first. The task of converting raw data into a data set is called feature engineering. For example, when we need to predict the withdrawal rate, the raw data needed will contain the usage behavior data of each user on the withdrawal platform. These attribute data are the features of the data set. The task of creating a data set is to identify useful attributes in the raw data, to create new attributes that affect the result from existing attributes, or to process these attributes so that they can be used for modeling or for improving the result.
Fig. 1 is a schematic view of an application scenario provided in an embodiment of the present application. As shown in Fig. 1, features of a user, for example, the withdrawal behavior data in the usage behavior data of a withdrawal application (APP) and the withdrawal behavior data in the usage behavior data of a withdrawal public account, may be input into a model A to be trained on an electronic device, and model A may then be trained. The trained model A can be used to predict the withdrawal rate of the user, that is, the probability that the user withdraws money, so that withdrawal information can be recommended to the user.
In the above process, the usage behavior data of the withdrawal APP and the usage behavior data of the withdrawal public account need to be obtained from the withdrawal APP and the withdrawal public account respectively. The two sets of usage behavior data are then merged, feature coding is performed on the merged data, and the training data for training model A is finally obtained.
Suppose there is another model B whose training data requires the usage behavior data of the withdrawal public account and the usage behavior data of the withdrawal applet of user A. In this case, the usage behavior data of the user has to be acquired from the withdrawal public account and the withdrawal applet respectively, the two sets of usage behavior data have to be merged, and feature coding has to be performed on the merged data to obtain the training data for training model B.
It can be seen that, although model B and model A share partially overlapping features, such as the usage behavior data of the withdrawal public account, feature development has to be carried out again for the features required by model B when its training data is constructed. The overlapping features are therefore developed repeatedly, which results in low feature development efficiency.
In view of the above technical problems, an embodiment of the present application provides a user behavior data processing method. The method constructs a feature vector for each feature of each user and stores the constructed feature vectors in a feature vector library. When the target feature vector required by an AI model to be trained is a combined feature and is not found in the feature vector library, the feature vectors that make up the target feature vector are obtained from the feature vector library and combined to obtain the target feature vector, and the AI model is trained according to the target feature vector. Because the component feature vectors are taken from the feature vector library and combined, they can be reused without repeated development, which improves feature development efficiency.
The following describes the technical solutions of the present application and how to solve the above technical problems in detail with specific embodiments in conjunction with the accompanying drawings. The following several specific embodiments may be combined with each other, and details of the same or similar concepts or processes may not be repeated in some embodiments. Embodiments of the present application will be described below with reference to the accompanying drawings.
Fig. 2 is a first flowchart of a user behavior data processing method according to an embodiment of the present application. As shown in fig. 2, the user behavior data processing method includes the following steps:
S201, acquiring a target task; the target task comprises a target feature vector required by the AI model to be trained, and the target feature vector is the combined feature of a plurality of feature vectors of a target user.
The execution subject of the method in this embodiment may be any device having a data processing function.
The target task refers to the function that the user requires the AI model to have. Optionally, the target task may be a withdrawal rate prediction task or the like.
Optionally, the target user may be the analysis object of the target task, and may be an individual user or an enterprise user.
The target feature vector of the target user is the training data required by the AI model to be trained. It can be understood as a characteristic representation of the attributes of the target user: it marks the traits of the target user and is the key to distinguishing the target user from other users.
The user can input a target feature vector required by the AI model to be trained and a plurality of feature vectors constituting the target feature vector through the electronic device.
In some scenarios, the target feature vector may be usage behavior data of the withdrawal platform by the user, and the plurality of feature vectors are feature vectors constructed according to the usage behavior data of different withdrawal platforms of the user.
S202, if the target feature vector required by the AI model does not exist in the feature vector library, acquiring a plurality of feature vectors of the target user from the feature vector library according to the target feature vector required by the AI model; the feature vector library comprises a plurality of pre-constructed feature vectors, and the feature vectors are obtained according to user behavior data of different channels.
When the electronic device acquires a target task, it searches the feature vector library, according to the target feature vector required by the AI model, for a target feature vector that can be used directly. If no directly usable target feature vector is found in the feature vector library, the electronic device searches the feature vector library for the plurality of feature vectors that make up the target feature vector.
Fig. 3 is an exemplary diagram of a feature vector combination provided in an embodiment of the present application. As shown in Fig. 3, assume that the target feature vector covers the usage behavior data of the withdrawal APP, the withdrawal public account and the withdrawal applet of user A, and that no feature vector corresponding to this combined usage behavior data is found in the feature vector library. In this case, the feature vector corresponding to the usage behavior data of the withdrawal APP of user A, the feature vector corresponding to the usage behavior data of the withdrawal public account of user A, and the feature vector corresponding to the usage behavior data of the withdrawal applet of user A are each looked up in the feature vector library.
S203, combining the plurality of feature vectors to obtain a target feature vector.
When multiple feature vectors are found in the feature vector library, the multiple feature vectors may be combined as a target feature vector.
With reference to Fig. 3, the feature vector corresponding to the usage behavior data of the withdrawal APP of user A, the feature vector corresponding to the usage behavior data of the withdrawal public account, and the feature vector corresponding to the usage behavior data of the withdrawal applet are combined to obtain the target feature vector of user A.
S204, training the AI model according to the target feature vector to obtain the trained AI model.
Referring to fig. 3, in this step, the target feature vector is input into the AI model as training data of the AI model, and the AI model is trained, and the specific training process may refer to the training process of the AI model in the related art, which is not described in detail herein.
The target feature vector and the label are used as training data of the AI model to train the AI model so as to predict the withdrawal rate of the user, the label is used for representing whether the user withdraws money, and the withdrawal rate is used for representing the withdrawal probability of the user.
In this embodiment, when the target feature vector required by the AI model is not found in the feature vector library, a plurality of feature vectors of the target user are looked up in the feature vector library according to the target feature vector required by the AI model, and the plurality of feature vectors are combined and used for training the AI model. Because the feature vector library comprises a plurality of pre-constructed feature vectors obtained from user behavior data of different channels, when the target feature vectors used by different models partially overlap, the overlapping features are developed only once and can be reused by the different models. Repeated feature development is avoided and feature development efficiency is improved.
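The lookup-then-combine flow of steps S201 to S203 can be sketched roughly as follows. This is only an illustrative sketch: the in-memory dictionaries standing in for the feature vector library, the key layout, the sample values, and the choice of concatenation as the combination operation are all assumptions, and the training step S204 is omitted.

```python
from typing import Dict, List, Tuple

# Hypothetical in-memory stand-in for the feature vector library.
# Keys are (user_id, channel); values are pre-constructed feature vectors.
base_vectors: Dict[Tuple[str, str], List[float]] = {
    ("user_a", "app"): [0.2, 1.0],
    ("user_a", "public_account"): [0.7],
    ("user_a", "applet"): [0.1, 0.4],
}
# Pre-stored combined (target) feature vectors, keyed by (user_id, channels).
combined_vectors: Dict[Tuple[str, Tuple[str, ...]], List[float]] = {}


def get_target_vector(user_id: str, channels: Tuple[str, ...]) -> List[float]:
    """Return the target feature vector, combining base vectors on a miss."""
    cached = combined_vectors.get((user_id, channels))
    if cached is not None:
        return cached  # the target vector already exists in the library
    target: List[float] = []
    for channel in channels:  # fetch each component vector from the library
        target.extend(base_vectors[(user_id, channel)])
    return target  # combination by concatenation is an assumption


# Model A and model B need different combinations that share one component.
model_a_input = get_target_vector("user_a", ("app", "public_account"))
model_b_input = get_target_vector("user_a", ("public_account", "applet"))
```

Both calls reuse the same public-account vector, which was constructed once, instead of re-encoding that channel for each model.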
The above embodiment describes the use process of the feature vector library, and before the feature vector library is applied to select the feature vector, the feature vector library also needs to be constructed. How to construct the feature vector library will be described below with reference to the accompanying drawings:
Fig. 4 is a flowchart for constructing a feature vector library according to an embodiment of the present application. As shown in Fig. 4, constructing the feature vector library may include:
S401, obtaining sample data of a target user of a feature vector to be constructed; the sample data of the target user comprises the identification of the target user and the behavior data of the target user, and the identification of the target user and the behavior data of the target user are uncoupled data.
The target user is the subject for which a feature vector is to be constructed. In some scenarios, the behavior data of the target user is the usage behavior data of the user on a withdrawal platform.
The usage behavior data of the user on the withdrawal platform includes, for example, the number of clicks and the dwell time of the user on pages of the withdrawal APP, the withdrawal public account or the withdrawal applet.
Acquiring the sample data of the target user of the feature vector to be constructed includes acquiring the sample data of the target user from the APP, from the public account and from the applet. The sample data comprises the identification of the target user and the usage behavior data of the target user on the withdrawal platform.
When the identifier of the target user and the behavior data of the target user are uncoupled data, the identifier of the target user and the behavior data of the target user can be extracted separately.
S402, extracting the identification of the target user from the sample data to obtain the sample identification of the feature vector to be constructed.
The identification of the target user is information that uniquely characterizes the target user. For an enterprise user, the identification of the target user may be the name of the enterprise. For an individual user, the identification of the target user may be information that can uniquely identify the user, such as the name, mobile phone number, or identification number of the user.
S403, extracting behavior data of the target user from the sample data according to the sample identification of the feature vector to be constructed.
Step S402 and step S403 respectively extract the identification of the target user and the behavior data, so as to perform feature coding separately for each behavior data subsequently.
The sample characteristics refer to behavior data to be subjected to characteristic coding.
In some scenarios, extracting the behavior data of the target user from the sample data according to the sample identification of the feature vector to be constructed may mean extracting the withdrawal behavior data of the user from the sample data of the withdrawal APP, from the sample data of the withdrawal public account, or from the sample data of the withdrawal applet, according to that sample identification.
S404, performing feature coding on the behavior data of the target user to obtain a feature vector of the target user.
The behavior data of the target user obtained through steps S401-S403 is the most primitive data set, and may include various non-numeric special symbols, such as characters, while the data required for the AI model training is numeric, so that for various special behavior data, it is necessary to encode, i.e. quantize, the behavior data so as to convert the behavior data into data that can be recognized by a computer.
In this embodiment, optionally, feature coding is performed on the usage behavior data of the withdrawal APP of the user, the usage behavior data of the withdrawal public number, and the usage behavior data of the withdrawal applet, respectively, to obtain a usage behavior feature vector of the withdrawal APP of the user, a usage behavior feature vector of the withdrawal public number, and a usage behavior feature vector of the withdrawal applet.
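Steps S401 to S404 can be pictured with the following minimal sketch, which builds one feature vector per user per channel and stores it in a library. The record layout, the field names (`clicks`, `dwell_seconds`), and the placeholder encoding that simply casts numeric fields to floats are assumptions for illustration; the application leaves these details open.

```python
from typing import Dict, List, Tuple

# Hypothetical decoupled sample data: one record per user per channel (S401).
samples = [
    {"channel": "app", "user_id": "u1", "clicks": 12, "dwell_seconds": 30.5},
    {"channel": "public_account", "user_id": "u1", "clicks": 3, "dwell_seconds": 8.0},
    {"channel": "applet", "user_id": "u1", "clicks": 5, "dwell_seconds": 12.0},
]


def extract_ids(records: List[dict]) -> List[str]:
    """S402: collect the distinct user identifiers (sample identifiers)."""
    return sorted({r["user_id"] for r in records})


def extract_behavior(records: List[dict], user_id: str, channel: str) -> dict:
    """S403: pull one user's behavior fields for one channel."""
    for r in records:
        if r["user_id"] == user_id and r["channel"] == channel:
            return {k: v for k, v in r.items() if k not in ("user_id", "channel")}
    return {}


def encode(behavior: dict) -> List[float]:
    """S404: placeholder encoding that casts numeric fields to floats in key order."""
    return [float(behavior[k]) for k in sorted(behavior)]


library: Dict[Tuple[str, str], List[float]] = {}
for uid in extract_ids(samples):
    for channel in ("app", "public_account", "applet"):
        behavior = extract_behavior(samples, uid, channel)
        if behavior:
            library[(uid, channel)] = encode(behavior)
```

Keeping one entry per (user, channel) pair is what later allows different models to pick and combine only the channels they need.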
In one or more embodiments of the present application, optionally, step S401 may include:
step a1, acquiring original data of a target user of a feature vector to be constructed; the original data comprises the identification of the target user and the original behavior data of the target user; the original behavior data of the target user and the target user are coupling data.
The original behavior data of the target user is coupled with the original behavior data of the target user, when the identification of the target user is extracted, the original behavior data of the target user can be extracted at the same time, and in the feature coding process, only the original behavior data of the target user and the original behavior data of the target user can be coded at the same time.
Step a2, performing feature decoupling processing on the identification of the target user and the original behavior data of the target user to obtain sample data of the target user.
Feature decoupling separates the identification of the target user from the original behavior data of the target user. When the target user has multiple items of original behavior data, these items can also be separated from one another, so that they are mutually independent and each item can be independently extracted from the sample data for feature coding.
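The decoupling in steps a1 and a2 can be illustrated with a minimal Python sketch. The packed record layout (`id|field=value;field=value`), the field names, and the function names are assumptions made only for illustration; the embodiment does not specify how the coupled original data is stored.

```python
from typing import Dict, List, Tuple

# Hypothetical coupled original data: the user identification and all behavior
# values are packed into one string, so they can only be read together.
RAW_RECORDS = [
    "u001|city=Shenzhen;withdraw_count=3",
    "u002|city=Beijing;withdraw_count=18",
]

def decouple(record: str) -> Tuple[str, Dict[str, str]]:
    """Split one coupled record into the identification and independent behavior fields."""
    user_id, packed = record.split("|", 1)
    behaviors = dict(item.split("=", 1) for item in packed.split(";"))
    return user_id, behaviors

def build_sample_data(records: List[str]) -> Dict[str, Dict[str, str]]:
    """Return sample data keyed by identification; each behavior is separately addressable."""
    return dict(decouple(record) for record in records)

SAMPLE_DATA = build_sample_data(RAW_RECORDS)
```

After decoupling, a single behavior item such as `SAMPLE_DATA["u001"]["city"]` can be extracted and encoded on its own.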
In one or more embodiments of the present application, optionally, step S404 may include:
Step b1, determining the data type of the behavior data of the target user; the data types include a character type, a numeric type, and a classification type.
Step b2, if the data type of the behavior data of the target user is the character type, encoding the behavior data of the target user through a first encoding mode to obtain a first feature vector of the target user; the first encoding mode is an encoding mode for behavior data of the character type.
Optionally, the first encoding mode may be Index encoding. Index encoding is mainly used to encode discrete features, for example discontinuous values and text, so as to convert the discrete features into numerical variables. Index encoding may also serve as a form of data normalization.
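As an illustration of Index encoding, the following Python sketch assigns each distinct value the next unused integer in order of first appearance. The function names and the first-appearance ordering are assumptions; the embodiment does not state how indices are assigned.

```python
from typing import Dict, Hashable, Iterable

def fit_index(values: Iterable[Hashable]) -> Dict[Hashable, int]:
    """Build a value -> index mapping, numbering values in order of first appearance."""
    index: Dict[Hashable, int] = {}
    for value in values:
        if value not in index:
            index[value] = len(index)
    return index

def index_encode(value: Hashable, index: Dict[Hashable, int]) -> int:
    """Look up the index of a value; an unseen value raises KeyError."""
    return index[value]

CITY_INDEX = fit_index(["Shenzhen", "Beijing", "Shenzhen", "Shanghai"])
```

With this mapping, `index_encode("Shenzhen", CITY_INDEX)` gives 0 and `index_encode("Shanghai", CITY_INDEX)` gives 2.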
Step b3, if the data type of the behavior data of the target user is the numeric type, encoding the behavior data of the target user through a second encoding mode to obtain a second feature vector of the target user; the second encoding mode is an encoding mode for behavior data of the numeric type.
Optionally, the second encoding mode may be binning, which is a process of converting one continuous feature into a plurality of binary features, generally based on value intervals. For example, when the behavior data of the target user is 18 and the bins include a bin [0, 10) numbered 0 and a bin [10, 20) numbered 1, 18 is assigned to the bin numbered 1 according to the binning technique.
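The binning example can be sketched in Python as follows, using the half-open bins [0, 10) and [10, 20) from the text. The function name and the choice to reject out-of-range values are assumptions for illustration.

```python
from bisect import bisect_right
from typing import Sequence

def bucket_encode(value: float, boundaries: Sequence[float]) -> int:
    """Return the number of the half-open bin [boundaries[i], boundaries[i+1]) containing value.

    boundaries must be sorted ascending; [0, 10, 20] defines bin 0 = [0, 10)
    and bin 1 = [10, 20).
    """
    if value < boundaries[0] or value >= boundaries[-1]:
        raise ValueError(f"{value} is outside the configured bins")
    return bisect_right(boundaries, value) - 1

BIN_BOUNDARIES = [0, 10, 20]
```

Here `bucket_encode(18, BIN_BOUNDARIES)` returns 1, and a value on a boundary such as 10 also falls into bin 1 because each bin is closed on the left.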
Step b4, if the data type of the behavior data of the target user is the classification type, encoding the behavior data of the target user through a third encoding mode to obtain a third feature vector of the target user; the third encoding mode is an encoding mode for behavior data of the classification type.
Optionally, the third encoding mode may be one-hot encoding, which encodes N states using an N-bit state register; each state has its own independent register bit, and only one bit is active at any time.
It will be appreciated that, for each item of behavior data with m possible values, one-hot encoding produces m binary features; these features are mutually exclusive, with only one activated at a time. One-hot encoding handles categorical behavior data that classifiers cannot process well directly, and to some extent also expands the feature set.
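A minimal Python sketch of one-hot encoding follows. The category list (three channel names) and the function name are illustrative assumptions.

```python
from typing import List, Sequence

def one_hot(value: str, categories: Sequence[str]) -> List[int]:
    """Encode value as a vector of len(categories) bits with exactly one bit set."""
    if value not in categories:
        raise ValueError(f"unknown category: {value!r}")
    return [1 if category == value else 0 for category in categories]

# Hypothetical category list with m = 3 possible values.
CHANNELS = ["app", "public_account", "applet"]
```

For example, `one_hot("public_account", CHANNELS)` yields `[0, 1, 0]`: three binary features, of which only one is active.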
In the embodiment, different encoding modes are adopted for different types of behavior data to perform feature encoding, and a uniform feature encoding mode is adopted for each type of behavior data, so that each feature vector can be conveniently combined when feature combination is performed.
The above embodiment describes an implementation process in which feature coding is performed on each item of mutually independent behavior data to obtain a feature vector, the feature vector is stored in a feature vector library, and, when a target feature vector is needed, the plurality of feature vectors constituting the target feature vector are looked up in the feature vector library and combined to obtain the target feature vector. In practice, the training data required by different AI models may be the same; if feature vectors are selected from the feature vector library and combined separately for each AI model, the same combination is performed repeatedly, which lowers the generation efficiency of training data and affects model training efficiency. Based on this, when the target task is obtained, it can first be determined whether the target feature vector required by the AI model exists in the feature vector library; if it does, the target feature vector required by the AI model is acquired directly from the feature vector library.
Before determining whether the target feature vector required by the AI model exists in the feature vector library, a part of the target feature vector may be stored in the feature vector library in advance, and when the target task is acquired, whether the target feature vector required by the target task exists may be first searched in the feature vector library. When the target characteristic vector which can be directly used exists in the characteristic vector library, the target characteristic vector is directly obtained from the characteristic vector library, the characteristic combination process can be reduced, the generation efficiency of training data is improved, and further the model training efficiency is improved.
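The lookup-before-combine logic can be sketched in Python as below. The class name, the use of the ordered list of feature names as the lookup key, and combination by concatenation are all assumptions; the embodiment does not fix how a target feature vector is identified or how the vectors are combined.

```python
from typing import Dict, List, Sequence, Tuple

class FeatureVectorLibrary:
    """Toy in-memory library holding feature vectors and pre-stored target feature vectors."""

    def __init__(self) -> None:
        self.features: Dict[str, List[float]] = {}
        self.targets: Dict[Tuple[str, ...], List[float]] = {}
        self.combine_calls = 0  # counts how often a combination was actually computed

    def store_target(self, names: Sequence[str], vector: Sequence[float]) -> None:
        """Pre-store a combined target feature vector under its component names."""
        self.targets[tuple(names)] = list(vector)

    def get_target(self, names: Sequence[str]) -> List[float]:
        """Return a stored target vector if present; otherwise combine the components."""
        key = tuple(names)
        if key in self.targets:
            return self.targets[key]
        self.combine_calls += 1
        combined: List[float] = []
        for name in names:
            combined.extend(self.features[name])  # assumed combination: concatenation
        return combined
```

Once a target feature vector has been stored with `store_target`, later calls to `get_target` with the same names return it without recombining.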
In one or more embodiments of the present application, before determining whether the target feature vector required by the AI model exists in the feature vector library, the method further includes: determining the storage value of each target feature vector; sorting the target feature vectors in descending order of storage value; and storing the N top-ranked target feature vectors into the feature vector library.
In some embodiments, optionally, determining a storage value of the target feature vector comprises: determining the use frequency and the combination cost of the target feature vector for AI model training; the combination cost is used for representing the time consumed for combining a plurality of feature vectors for constructing the target feature vector; and determining the storage value of the target feature vector according to the weighted sum of the use frequency and the combination cost of the target feature vector.
The above embodiment can be expressed as the following formula (1):
v=a*f+b*t; (1)
In formula (1), v represents the storage value of the target feature vector; f represents the use frequency of the target feature vector, which can be determined according to the ratio of the number of uses to the use duration in a period of time; t represents the time taken to combine the plurality of feature vectors that construct the target feature vector; a and b are weights whose sum is 1.
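Formula (1) and the top-N selection can be illustrated with the following Python sketch. The function names, the default weight a = 0.5, and the sample statistics are assumptions; the embodiment does not give concrete weights or units for f and t.

```python
from typing import Dict, List, Tuple

def storage_value(frequency: float, combine_time: float, a: float = 0.5) -> float:
    """Formula (1): v = a*f + b*t, with b = 1 - a so that the weights sum to 1."""
    b = 1.0 - a
    return a * frequency + b * combine_time

def top_n_by_value(stats: Dict[str, Tuple[float, float]], n: int, a: float = 0.5) -> List[str]:
    """Sort target feature vectors by storage value, descending, and keep the first n."""
    ranked = sorted(
        stats,
        key=lambda name: storage_value(stats[name][0], stats[name][1], a),
        reverse=True,
    )
    return ranked[:n]

# Hypothetical (use frequency f, combination time t) per target feature vector.
STATS = {"A": (4.0, 2.0), "B": (1.0, 10.0), "C": (2.0, 1.0)}
```

With a = b = 0.5 the storage values are 3.0 for A, 5.5 for B and 1.5 for C, so `top_n_by_value(STATS, 2)` keeps B and A. Because f and t may be on different scales, an implementation would probably normalize them before weighting; that step is not described in the source.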
In one or more embodiments of the present application, before determining whether the target feature vector required by the AI model exists in the feature vector library, the method further includes: recording the number of times the target feature vector is used for AI model training; and if the number of uses of the target feature vector is greater than a preset number, storing the target feature vector into the feature vector library. Alternatively, the use frequency of the target feature vector for AI model training can be recorded; and if the use frequency of the target feature vector is greater than a preset frequency, the target feature vector is stored into the feature vector library.
Taking the number of uses as an example, in this embodiment, when the target task is obtained, the number of uses of the target feature vector is increased by 1 regardless of whether the target feature vector is obtained directly from the feature vector library according to the target task, or multiple feature vectors are obtained from the feature vector library according to the target task and combined into the target feature vector.
When the number of uses of the target feature vector is greater than the preset number, the target feature vector has been used for training AI models multiple times. Therefore, to avoid repeatedly acquiring multiple feature vectors from the feature vector library and combining them into the target feature vector each time, the target feature vector may be stored in the feature vector library. When a target task is subsequently acquired, the feature vector library is first searched for a directly usable target feature vector.
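The count-based rule can be sketched in Python as follows. The class name, the key type, and the threshold value used in the usage note are illustrative assumptions.

```python
from typing import Dict, Hashable, Set

class UsageTracker:
    """Counts uses of each target feature vector and marks it for storage past a threshold."""

    def __init__(self, preset_times: int) -> None:
        self.preset_times = preset_times
        self.counts: Dict[Hashable, int] = {}
        self.stored: Set[Hashable] = set()

    def record_use(self, key: Hashable) -> bool:
        """Increase the use count by 1; return True once the vector should be stored."""
        self.counts[key] = self.counts.get(key, 0) + 1
        if self.counts[key] > self.preset_times:
            self.stored.add(key)
        return key in self.stored
```

With `UsageTracker(preset_times=2)`, the first two calls to `record_use` for a given key return False and the third returns True, since the count must be strictly greater than the preset number.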
In one or more embodiments of the present application, before determining whether the target feature vector required by the AI model exists in the feature vector library, the method further includes: recording the number of times each target feature vector is used for AI model training; sorting the target feature vectors in descending order of the number of uses; and storing the N top-ranked target feature vectors into the feature vector library. Alternatively, the use frequency of each target feature vector for AI model training can be recorded; the target feature vectors are sorted in descending order of use frequency; and the N top-ranked target feature vectors are stored into the feature vector library.
As the feature vector library is used, more and more feature vectors are combined and more and more target feature vectors are obtained; some of these target feature vectors are used many times for AI model training, while others are used only a few times. Sorting the target feature vectors by number of uses or use frequency and storing only the top N keeps the most frequently used ones in the feature vector library. When a target task is subsequently acquired, the feature vector library is first searched for a directly usable target feature vector.
Fig. 5 is a second flowchart of a user behavior data processing method according to an embodiment of the present application. As shown in fig. 5, the user behavior data processing method may include the following four processes: creating samples, extracting features, feature encoding, and feature combinations. The following description will be made separately.
S501, creating a sample.
Optionally, the id of the sample data required by the feature vector to be created is extracted from the original data. For example, for an enterprise financial transaction, the sample id may be a list of enterprise names that satisfy some condition. In addition, each sample data corresponds to a time point in addition to the sample id, and in the feature calculation, feature data of a certain time period can be screened out and calculated based on the time point.
S502, extracting features.
For each sample id and time point in the sample data, and for each business behavior, the features of that behavior are extracted from the original table to form a series of behavior features. At this stage, these behavior features are only original values. For example, in a loan business, the user's withdrawal behavior yields withdrawal behavior feature data; one behavior feature of the withdrawal behavior data is the number of withdrawals, whose value may be 3.
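The extraction of an original feature value per sample id and time point can be sketched in Python as follows. The event table layout, the 30-day look-back window, and all names are assumptions for illustration; the embodiment only states that data of a certain time period is selected based on the time point.

```python
from datetime import datetime, timedelta
from typing import List, Tuple

# Hypothetical original table: (sample id, behavior, timestamp).
EVENTS: List[Tuple[str, str, datetime]] = [
    ("u001", "withdraw", datetime(2022, 3, 1)),
    ("u001", "withdraw", datetime(2022, 5, 5)),
    ("u001", "withdraw", datetime(2022, 5, 10)),
    ("u001", "withdraw", datetime(2022, 5, 20)),
    ("u002", "withdraw", datetime(2022, 5, 15)),
]

def count_behavior(events, sample_id: str, behavior: str,
                   time_point: datetime, window_days: int = 30) -> int:
    """Count events of one behavior for one sample in [time_point - window, time_point)."""
    start = time_point - timedelta(days=window_days)
    return sum(
        1
        for sid, name, ts in events
        if sid == sample_id and name == behavior and start <= ts < time_point
    )
```

For sample `u001` with time point 1 June 2022, the three May withdrawals fall inside the window and the March one does not, so the original value of the withdrawal-count feature is 3.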
S503, feature coding.
When feature encoding is performed on the extracted features, the encoding mode may include index, bucket, and onehot. index assigns an index to the original value of a categorical feature; for example, for a city field whose original value is Shenzhen, the index is 0. bucket refers to bucketing, in which numbers are assigned to different buckets; for example, with buckets [0, 10) and [10, 20) numbered from 0, [0, 10) is bucket 0 and [10, 20) is bucket 1, so 18 falls into the second bucket and its encoded value is 1. onehot refers to one-hot encoding.
Optionally, the feature encoding further includes: multiple fields are aggregated into one vector. For example, two fields and their values are: city 0, age 1, the feature vector after aggregation is: [0,1].
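The aggregation of encoded fields into one vector can be sketched in Python as below, reproducing the city and age example. The per-field encoder table, the bucket boundaries for age, and the raw age value of 18 are assumptions chosen so that the encoded values match the example (city 0, age 1).

```python
from bisect import bisect_right
from typing import Callable, Dict, List, Sequence

CITY_INDEX = {"Shenzhen": 0}   # index encoding for the city field
AGE_BOUNDARIES = [0, 10, 20]   # buckets [0, 10) -> 0 and [10, 20) -> 1

# One encoder per field; each maps an original value to its encoded number.
ENCODERS: Dict[str, Callable[[object], int]] = {
    "city": lambda value: CITY_INDEX[value],
    "age": lambda value: bisect_right(AGE_BOUNDARIES, value) - 1,
}

def encode_and_aggregate(record: Dict[str, object], field_order: Sequence[str]) -> List[int]:
    """Encode each field, then aggregate the encoded values in a fixed field order."""
    return [ENCODERS[field](record[field]) for field in field_order]
```

Calling `encode_and_aggregate({"city": "Shenzhen", "age": 18}, ["city", "age"])` gives `[0, 1]`. Keeping the field order fixed means the same position in the vector always refers to the same field.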
S504, combining the features.
In the feature combination process, a plurality of feature vectors can be selected from the feature vector library according to a plurality of feature vectors required by the target task, and the feature vectors are combined to obtain a target feature vector for AI model training.
S505, analyzing or modeling.
The combined target feature vector can be applied to data analysis or modeling to mine the value of data and guide the business.
The following describes an application of the user behavior data processing method provided by the embodiment of the present application in practice with reference to several examples.
In some optional examples, an embodiment of the present application may provide a withdrawal behavior data processing method, which specifically includes:
Step A1, acquiring a target task; the target task comprises a target feature vector required by a withdrawal rate prediction model to be trained, and the target feature vector is obtained according to the usage behavior data of the user on withdrawal platforms of different channels and is used for representing the withdrawal behavior features of the user.
Step A2, if the target feature vector required by the withdrawal rate prediction model to be trained does not exist in the feature vector library, acquiring a plurality of feature vectors of the user from the feature vector library according to the target feature vector required by the withdrawal rate prediction model to be trained; the feature vector library comprises a plurality of pre-constructed feature vectors, and the feature vectors are constructed according to the usage behavior data of the user on different withdrawal platforms.
Step A3, combining the plurality of feature vectors of the user to obtain the target feature vector of the user.
Step A4, training the withdrawal rate prediction model to be trained according to the target feature vector and the label of the user to obtain a trained withdrawal rate prediction model; the withdrawal rate represents the probability that the user withdraws money, and the label represents whether the user has withdrawn money.
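The training step can be illustrated with a small Python sketch. Logistic regression trained by stochastic gradient descent is used here purely as a stand-in, since the embodiment does not name a model type; the feature vectors, labels, learning rate, and epoch count are invented for illustration.

```python
import math
from typing import List, Sequence, Tuple

def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))

def predict_rate(weights: Sequence[float], bias: float, x: Sequence[float]) -> float:
    """Predicted withdrawal probability for one target feature vector."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)

def train_logistic(vectors: Sequence[Sequence[float]], labels: Sequence[int],
                   epochs: int = 200, lr: float = 0.5) -> Tuple[List[float], float]:
    """Fit weights and bias by per-sample gradient steps on the log loss."""
    weights = [0.0] * len(vectors[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(vectors, labels):
            error = predict_rate(weights, bias, x) - y
            weights = [w - lr * error * xi for w, xi in zip(weights, x)]
            bias -= lr * error
    return weights, bias

# Hypothetical combined target feature vectors and labels (1 = user withdrew).
TARGET_VECTORS = [[1, 0, 1], [1, 0, 0], [0, 1, 0], [0, 1, 1]]
LABELS = [1, 1, 0, 0]
WEIGHTS, BIAS = train_logistic(TARGET_VECTORS, LABELS)
```

After training on this toy data, `predict_rate(WEIGHTS, BIAS, [1, 0, 1])` is above 0.5 and `predict_rate(WEIGHTS, BIAS, [0, 1, 0])` is below 0.5, matching the labels.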
In other optional examples, an embodiment of the present application may further provide a loan behavior data processing method, which specifically includes:
Step B1, acquiring a target task; the target task comprises a target feature vector required by the overdue rate prediction model to be trained, and the target feature vector is a combined vector obtained according to the behavior data of the user on loan platforms of different channels.
In this step, optionally, the different channels include an APP channel, a public account channel, a telemarketing channel, and the like.
Step B2, if the target feature vector required by the overdue rate prediction model to be trained does not exist in the feature vector library, acquiring a plurality of feature vectors of the user from the feature vector library according to the target feature vector required by the overdue rate prediction model to be trained; the feature vector library comprises a plurality of pre-constructed feature vectors, and the feature vectors are constructed according to the behavior data of the user in different channels.
Step B3, combining the plurality of feature vectors of the user to obtain the target feature vector of the user.
Step B4, training the overdue rate prediction model to be trained according to the target feature vector and the label of the user to obtain a trained overdue rate prediction model; the overdue rate represents the probability that the user's repayment is overdue, and the label represents whether the user is overdue.
In some further optional examples, an embodiment of the present application may further provide a user behavior data processing method, which specifically includes:
Step C1, acquiring a target task; the target task comprises a target feature vector required by an information recommendation model to be trained, and the target feature vector is a combined vector obtained according to the information usage behaviors of the user in different channels.
In this step, optionally, the information usage behaviors in different channels include information browsing behaviors and information clicking behaviors in an APP channel, a public account channel and a telemarketing channel.
Taking banking business as an example, the information browsing behaviors and information clicking behaviors of a user on the pages of a bank APP, a bank public account and a bank telemarketing system can be the information usage behaviors of the user in different channels.
Step C2, if the target feature vector required by the information recommendation model to be trained does not exist in the feature vector library, acquiring a plurality of feature vectors of the user from the feature vector library according to the target feature vector required by the information recommendation model to be trained; the feature vector library comprises a plurality of pre-constructed feature vectors, and the feature vectors are constructed according to the information usage behaviors of the user in different channels.
Step C3, combining the plurality of feature vectors of the user to obtain the target feature vector of the user.
Step C4, training the information recommendation model to be trained according to the target feature vector and the label of the user to obtain a trained information recommendation model; the label represents whether the user clicks the information.
The trained information recommendation model can be used for information recommendation for users.
It should be understood that when the information recommendation model of the present embodiment is used for information recommendation, the information thereof may be advertisement information, commodity information, news information, and the like. In addition, it should be understood that the above examples are only illustrations, and any scenario in which the features are divided into modules and then combined may be applied to the user behavior data processing method of the present embodiment.
Fig. 6 is a schematic structural diagram of a user behavior data processing apparatus according to an embodiment of the present application. As shown in fig. 6, the user behavior data processing apparatus includes: a first obtaining module 61, a second obtaining module 62, a combination module 63 and a training module 64; the first obtaining module 61 is configured to obtain a target task; the target task comprises a target feature vector required by an AI model to be trained, and the target feature vector is the combined feature of a plurality of feature vectors of a target user; the second obtaining module 62 is configured to, when the target feature vector required by the AI model does not exist in the feature vector library, obtain a plurality of feature vectors of the target user from the feature vector library according to the target feature vector required by the AI model; the feature vector library comprises a plurality of pre-constructed feature vectors, and the feature vectors are vectors obtained according to user behavior data of different channels; the combination module 63 is configured to combine the plurality of feature vectors to obtain the target feature vector; and the training module 64 is configured to train the AI model according to the target feature vector, so as to obtain a trained AI model.
Optionally, the apparatus further comprises: an extraction module 65 and an encoding module 66; the first obtaining module 61 is further configured to obtain sample data of a target user of a feature vector to be constructed; the sample data of the target user comprises an identifier of the target user and behavior data of the target user, and the identifier of the target user and the behavior data of the target user are uncoupled data; an extracting module 65, configured to extract the identifier of the target user from the sample data to obtain a sample identifier of the feature vector to be constructed; the extracting module 65 is further configured to extract behavior data of the target user from the sample data according to the sample identifier of the feature vector to be constructed; and the encoding module 66 is configured to perform feature encoding on the behavior data of the target user to obtain a feature vector of the target user.
Optionally, the obtaining, by the first obtaining module 61, sample data of a target user of a feature vector to be constructed includes: acquiring original data of a target user of a feature vector to be constructed; the original data comprises the identification of the target user and original behavior data of the target user; the identification of the target user and the original behavior data of the target user are coupled data; and performing characteristic decoupling processing on the identification of the target user and the original behavior data of the target user to obtain sample data of the target user.
Optionally, the encoding module 66 performs feature encoding on the behavior data of the target user to obtain a feature vector of the target user, which specifically includes: determining a data type of the behavior data of the target user; the data types comprise a character type, a numeric type and a classification type; if the data type of the behavior data of the target user is the character type, encoding the behavior data of the target user through a first encoding mode to obtain a first feature vector of the target user; the first encoding mode is an encoding mode for behavior data of the character type; if the data type of the behavior data of the target user is the numeric type, encoding the behavior data of the target user through a second encoding mode to obtain a second feature vector of the target user; the second encoding mode is an encoding mode for behavior data of the numeric type; if the data type of the behavior data of the target user is the classification type, encoding the behavior data of the target user through a third encoding mode to obtain a third feature vector of the target user; the third encoding mode is an encoding mode for behavior data of the classification type.
Optionally, the feature vector library further includes a plurality of pre-stored target feature vectors, and the apparatus further includes: a determining module 67, configured to determine whether a target feature vector required by the AI model exists in the feature vector library; the second obtaining module 62 is further configured to obtain the target feature vector required by the AI model from the feature vector library when the target feature vector required by the AI model exists in the feature vector library.
Optionally, the apparatus further comprises: a sorting module 68 and a storage module 69; the determining module 67 is further configured to determine a storage value of each target feature vector; the sorting module 68 is configured to sort the target feature vectors in descending order of storage value; and the storage module 69 is configured to store the top N target feature vectors into the feature vector library.
Optionally, the determining module 67 determines the storage value of the target feature vector, which specifically includes: determining the use frequency and the combination cost of the target feature vector for AI model training; wherein the combination cost is time consumed for combining a plurality of feature vectors for constructing the target feature vector; and determining the storage value of the target feature vector according to the weighted sum of the use frequency and the combination cost of the target feature vector.
Optionally, the target feature vector is obtained according to the usage behavior data of the user on withdrawal platforms; the plurality of feature vectors are feature vectors constructed according to the usage behavior data of the user on different withdrawal platforms; the AI model is a model used for predicting the withdrawal rate of the user according to the target feature vector and a label, wherein the withdrawal rate represents the probability that the user withdraws money, and the label represents whether the user has withdrawn money.
The user behavior data processing device provided in the embodiment of the present application may be used to implement the technical solution of the user behavior data processing method in the foregoing embodiment, and the implementation principle and the technical effect are similar, which are not described herein again.
It should be noted that the division of the modules of the above apparatus is only a logical division, and the actual implementation may be wholly or partially integrated into one physical entity, or may be physically separated. And these modules can be realized in the form of software called by processing element; or may be implemented entirely in hardware; and part of the modules can be realized in the form of calling software by the processing element, and part of the modules can be realized in the form of hardware. For example, the training module 64 may be a separate processing element, or may be integrated into a chip of the apparatus, or may be stored in a memory of the apparatus in the form of program code, and a processing element of the apparatus calls and executes the functions of the training module 64. Other modules are implemented similarly. In addition, all or part of the modules can be integrated together or can be independently realized. The processing element here may be an integrated circuit with signal processing capabilities. In implementation, each step of the above method or each module above may be implemented by an integrated logic circuit of hardware in a processor element or an instruction in the form of software.
Fig. 7 is a schematic structural diagram of an electronic device according to an embodiment of the present application. As shown in fig. 7, the electronic device may include: transceiver 71, processor 72, memory 73.
The processor 72 executes the computer-executable instructions stored in the memory, which cause the processor 72 to perform the solutions of the embodiments described above. The processor 72 may be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), and the like; it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, discrete gate or transistor logic, or discrete hardware components.
The memory 73 is connected to the processor 72 via the system bus, through which the two communicate with each other; the memory 73 stores computer program instructions.
The transceiver 71 may be used to acquire a target task.
The system bus may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The system bus may be divided into an address bus, a data bus, a control bus, and the like. For ease of illustration, only one thick line is shown, but this does not mean that there is only one bus or one type of bus. The transceiver is used to enable communication between the database access device and other computers (e.g., clients, read-write libraries, and read-only libraries). The memory may include Random Access Memory (RAM) and may also include non-volatile memory.
The embodiment of the application also provides a chip for running the instructions, and the chip is used for executing the technical scheme of the user behavior data processing method in the embodiment.
The embodiment of the present application further provides a computer-readable storage medium, where a computer instruction is stored in the computer-readable storage medium, and when the computer instruction runs on a computer, the computer is enabled to execute the technical solution of the user behavior data processing method in the foregoing embodiment.
The embodiment of the present application further provides a computer program product, where the computer program product includes a computer program, the computer program is stored in a computer-readable storage medium, at least one processor can read the computer program from the computer-readable storage medium, and when the computer program is executed by the at least one processor, the technical solution of the user behavior data processing method in the foregoing embodiment can be implemented.
Other embodiments of the present application will be apparent to those skilled in the art from consideration of the specification and practice of the invention disclosed herein. This application is intended to cover any variations, uses, or adaptations of the invention following, in general, the principles of the application and including such departures from the present disclosure as come within known or customary practice within the art to which the invention pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the application being indicated by the following claims.
It will be understood that the present application is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the application is limited only by the appended claims.

Claims (12)

1. A user behavior data processing method is characterized by comprising the following steps:
acquiring a target task; the target task comprises a target characteristic vector required by an AI model to be trained, and the target characteristic vector is the combined characteristic of a plurality of characteristic vectors of a target user;
if the target characteristic vector required by the AI model does not exist in the characteristic vector library, acquiring a plurality of characteristic vectors of the target user in the characteristic vector library according to the target characteristic vector required by the AI model; the feature vector library comprises a plurality of pre-constructed feature vectors, and the feature vectors are vectors obtained according to user behavior data of different channels;
combining the plurality of feature vectors to obtain the target feature vector;
and training the AI model according to the target characteristic vector to obtain the trained AI model.
2. The method according to claim 1, wherein before the obtaining the plurality of feature vectors from the feature vector library according to the target feature vectors required by the AI model, the method further comprises:
acquiring sample data of a target user of a feature vector to be constructed; the sample data of the target user comprises an identifier of the target user and behavior data of the target user, and the identifier of the target user and the behavior data of the target user are uncoupled data;
extracting the identifier of the target user from the sample data to obtain the sample identifier of the feature vector to be constructed;
extracting the behavior data of the target user from the sample data according to the sample identification of the feature vector to be constructed;
and performing feature coding on the behavior data of the target user to obtain a feature vector of the target user.
3. The method according to claim 2, wherein acquiring the sample data of the target user for whom the feature vector is to be constructed comprises:
acquiring original data of the target user for whom the feature vector is to be constructed, wherein the original data comprises the identifier of the target user and original behavior data of the target user, and the identifier of the target user and the original behavior data of the target user are coupled data; and
performing feature decoupling processing on the identifier of the target user and the original behavior data of the target user to obtain the sample data of the target user.
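The preprocessing described in claims 2 and 3 might look like the sketch below. The claims do not define what "coupled" data looks like, so this example assumes, purely for illustration, that each raw record is a single string of the form `user_id|behavior`. The function names and the separator are hypothetical.

```python
from typing import Dict, List


def decouple(raw_records: List[str], sep: str = "|") -> List[Dict[str, str]]:
    """Split coupled 'user_id|behavior' strings into records with separate fields."""
    samples = []
    for record in raw_records:
        # Split only on the first separator so the behavior text may contain it.
        user_id, behavior = record.split(sep, 1)
        samples.append({"user_id": user_id, "behavior": behavior})
    return samples


def sample_ids(samples: List[Dict[str, str]]) -> List[str]:
    """Extract the unique user identifiers, preserving first-seen order."""
    return list(dict.fromkeys(s["user_id"] for s in samples))


def behavior_for(samples: List[Dict[str, str]], user_id: str) -> List[str]:
    """Extract one user's behavior data using the sample identifier."""
    return [s["behavior"] for s in samples if s["user_id"] == user_id]
```

Once the identifier and the behavior data sit in separate fields, the behavior data can be encoded without touching the identifier.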
4. The method according to claim 2 or 3, wherein performing feature encoding on the behavior data of the target user to obtain the feature vector of the target user comprises:
determining a data type of the behavior data of the target user, wherein the data type is one of a character type, a numeric type and a classification type;
if the data type of the behavior data of the target user is the character type, encoding the behavior data of the target user in a first encoding mode to obtain a first feature vector of the target user, wherein the first encoding mode is an encoding mode used for behavior data of the character type;
if the data type of the behavior data of the target user is the numeric type, encoding the behavior data of the target user in a second encoding mode to obtain a second feature vector of the target user, wherein the second encoding mode is an encoding mode used for behavior data of the numeric type; and
if the data type of the behavior data of the target user is the classification type, encoding the behavior data of the target user in a third encoding mode to obtain a third feature vector of the target user, wherein the third encoding mode is an encoding mode used for behavior data of the classification type.
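The type-dependent encoding of claim 4 can be illustrated with a small dispatcher. The claim does not name the three encoding modes, so the choices below (character-bucket counts for character data, min-max scaling for numeric data, one-hot encoding for classification data) are assumptions picked for the sketch, and all function names are hypothetical.

```python
from typing import List, Sequence


def encode_text(value: str, buckets: int = 8) -> List[float]:
    """Assumed first mode: count characters into a fixed number of buckets."""
    vec = [0.0] * buckets
    for ch in value:
        vec[ord(ch) % buckets] += 1.0
    return vec


def encode_number(value: float, low: float, high: float) -> List[float]:
    """Assumed second mode: min-max scale a number into [0, 1]."""
    if high == low:
        return [0.0]
    return [(value - low) / (high - low)]


def encode_category(value: str, vocabulary: Sequence[str]) -> List[float]:
    """Assumed third mode: one-hot encode against a known vocabulary."""
    return [1.0 if value == v else 0.0 for v in vocabulary]


def encode(value, data_type: str, **params) -> List[float]:
    """Pick the encoding mode from the declared data type."""
    if data_type == "character":
        return encode_text(value, **params)
    if data_type == "numeric":
        return encode_number(value, **params)
    if data_type == "classification":
        return encode_category(value, **params)
    raise ValueError(f"unsupported data type: {data_type}")
```

For example, `encode("gold", "classification", vocabulary=["silver", "gold"])` yields a two-element one-hot vector with the second position set.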
5. The method according to any one of claims 1-3, wherein the feature vector library further comprises a plurality of pre-stored target feature vectors, the method further comprising:
determining whether the target feature vector required by the AI model exists in the feature vector library; and
if the target feature vector required by the AI model exists in the feature vector library, acquiring the target feature vector required by the AI model from the feature vector library.
6. The method according to claim 5, wherein before determining whether the target feature vector required by the AI model exists in the feature vector library, the method further comprises:
determining a storage value of each target feature vector;
sorting the target feature vectors in descending order of storage value; and
storing the top N target feature vectors in the feature vector library.
7. The method according to claim 6, wherein determining the storage value of the target feature vector comprises:
determining a use frequency of the target feature vector in AI model training and a combination cost of the target feature vector, wherein the combination cost is the time consumed in combining the plurality of feature vectors used to construct the target feature vector; and
determining the storage value of the target feature vector as a weighted sum of the use frequency and the combination cost of the target feature vector.
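Claims 6 and 7 together describe a ranking rule for deciding which combined vectors are worth keeping in the library. A minimal sketch follows. The weights, the `Stats` record and the function names are hypothetical, since the claims state only that the storage value is a weighted sum of use frequency and combination cost.

```python
from typing import Dict, List, NamedTuple


class Stats(NamedTuple):
    use_frequency: float    # how often the vector was used for model training
    combine_seconds: float  # time spent combining its component vectors


def storage_value(stats: Stats, w_freq: float = 0.5, w_cost: float = 0.5) -> float:
    """Weighted sum of use frequency and combination cost (weights are assumed)."""
    return w_freq * stats.use_frequency + w_cost * stats.combine_seconds


def select_for_storage(
    candidates: Dict[str, Stats],
    n: int,
    w_freq: float = 0.5,
    w_cost: float = 0.5,
) -> List[str]:
    """Rank candidates by storage value, descending, and keep the top n names."""
    ranked = sorted(
        candidates,
        key=lambda name: storage_value(candidates[name], w_freq, w_cost),
        reverse=True,
    )
    return ranked[:n]
```

With equal weights, a vector used 4 times that takes 30 seconds to combine (value 17.0) outranks one used 10 times that takes 2 seconds (value 6.0). Because the two terms have different units, a practical version would probably normalize them before weighting; the claims do not address this.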
8. The method according to any one of claims 1 to 3, wherein the target feature vector characterizes the user's usage behavior on withdrawal platforms, and the plurality of feature vectors are constructed from the user's usage behavior data on different withdrawal platforms; and
the AI model is a model for predicting a withdrawal rate of the user according to the target feature vector and a label, wherein the withdrawal rate represents the probability that the user withdraws money, and the label indicates whether the user has withdrawn money.
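As an illustration of claim 8, the sketch below trains a model that maps a target feature vector to a withdrawal probability using binary labels. The claim does not specify the model type; logistic regression trained by stochastic gradient descent is assumed here only because it is a simple way to produce a probability. The function names, learning rate and epoch count are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(
    features: Sequence[Sequence[float]],
    labels: Sequence[int],
    lr: float = 0.5,
    epochs: int = 500,
) -> Tuple[List[float], float]:
    """Fit weights and bias with per-sample gradient steps on the log loss."""
    weights = [0.0] * len(features[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            p = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))
            err = p - y  # gradient of the log loss with respect to the logit
            weights = [w - lr * err * xi for w, xi in zip(weights, x)]
            bias -= lr * err
    return weights, bias


def withdrawal_rate(weights: Sequence[float], bias: float, x: Sequence[float]) -> float:
    """Predicted probability that the user withdraws money."""
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))
```

Here each row of `features` stands for one user's target feature vector, and each label is 1 if that user withdrew money and 0 otherwise.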
9. A user behavior data processing apparatus, comprising:
an acquisition module, configured to acquire a target task, wherein the target task comprises a target feature vector required by an AI model to be trained, and the target feature vector is a combined feature of a plurality of feature vectors of a target user;
the acquisition module being further configured to acquire the plurality of feature vectors of the target user from a feature vector library according to the target feature vector required by the AI model if the target feature vector required by the AI model does not exist in the feature vector library, wherein the feature vector library comprises a plurality of pre-constructed feature vectors, and the feature vectors are obtained from user behavior data of different channels;
a combination module, configured to combine the plurality of feature vectors to obtain the target feature vector; and
a training module, configured to train the AI model according to the target feature vector to obtain a trained AI model.
10. An electronic device, comprising: a processor, and a memory communicatively coupled to the processor;
the memory stores computer-executable instructions;
the processor executes computer-executable instructions stored by the memory to implement the method of any of claims 1-8.
11. A computer-readable storage medium having computer-executable instructions stored therein, which when executed by a processor, are configured to implement the method of any one of claims 1-8.
12. A computer program product, characterized by comprising a computer program which, when executed by a processor, implements the method of any one of claims 1-8.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210622588.9A CN114880578A (en) 2022-06-02 2022-06-02 User behavior data processing method, device, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN114880578A true CN114880578A (en) 2022-08-09

Family

ID=82678898

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210622588.9A Pending CN114880578A (en) 2022-06-02 2022-06-02 User behavior data processing method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN114880578A (en)

Similar Documents

Publication Publication Date Title
EP3985578A1 (en) Method and system for automatically training machine learning model
WO2021164382A1 (en) Method and apparatus for performing feature processing for user classification model
CN111275491A (en) Data processing method and device
CN108491406B (en) Information classification method and device, computer equipment and storage medium
CN113742488B (en) Embedded knowledge graph completion method and device based on multitask learning
CN117520503A (en) Financial customer service dialogue generation method, device, equipment and medium based on LLM model
CN109783381B (en) Test data generation method, device and system
CN114328681A (en) Data conversion method and device, electronic equipment and storage medium
CN113656699A (en) User feature vector determination method, related device and medium
CN113988195A (en) Private domain traffic clue mining method and device, vehicle and readable medium
CN112801784A (en) Bit currency address mining method and device for digital currency exchange
CN108830302B (en) Image classification method, training method, classification prediction method and related device
CN113282686B (en) Association rule determining method and device for unbalanced sample
CN113297482B (en) User portrayal describing method and system of search engine data based on multiple models
CN112541357B (en) Entity identification method and device and intelligent equipment
CN114880578A (en) User behavior data processing method, device, equipment and storage medium
CN114780649A (en) Method and device for identifying structured data entity type
CN111159397B (en) Text classification method and device and server
CN116861226A (en) Data processing method and related device
CN114021716A (en) Model training method and system and electronic equipment
CN113705201A (en) Text-based event probability prediction evaluation algorithm, electronic device and storage medium
CN111523318A (en) Chinese phrase analysis method, system, storage medium and electronic equipment
CN114418752B (en) Method and device for processing user data without type label, electronic equipment and medium
CN117093715B (en) Word stock expansion method, system, computer equipment and storage medium
CN113535805B (en) Data mining method, related device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication