CN113822684B - Black-producing user identification model training method and device, electronic device and storage medium - Google Patents

Black-producing user identification model training method and device, electronic device and storage medium

Info

Publication number
CN113822684B
CN113822684B
Authority
CN
China
Prior art keywords
model
loss
sub
user
training
Prior art date
Legal status
Active
Application number
CN202111145600.3A
Other languages
Chinese (zh)
Other versions
CN113822684A (en)
Inventor
张徵
秦超
陈柏宇
Current Assignee
Beijing QIYI Century Science and Technology Co Ltd
Original Assignee
Beijing QIYI Century Science and Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing QIYI Century Science and Technology Co Ltd
Priority to CN202111145600.3A
Publication of CN113822684A
Application granted
Publication of CN113822684B
Legal status: Active


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/018 Certifying business or products
    • G06Q30/0185 Product, service or business identity fraud
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G06F40/216 Parsing using statistical methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00 Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30 Computing systems specially adapted for manufacturing

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Molecular Biology (AREA)
  • Business, Economics & Management (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Biophysics (AREA)
  • Evolutionary Computation (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Probability & Statistics with Applications (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Accounting & Taxation (AREA)
  • Development Economics (AREA)
  • Economics (AREA)
  • Finance (AREA)
  • Marketing (AREA)
  • Strategic Management (AREA)
  • General Business, Economics & Management (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Machine Translation (AREA)

Abstract

The embodiments of the invention provide a black-producing user identification model training method and device, an electronic device, and a storage medium. The method comprises: training a first basic model with a first sample set until a first constraint condition is met to obtain a first recognition model; determining a second basic model based on the second sub-model in the first recognition model; and training the second basic model with a second sample set to obtain a second recognition model. Because the second sub-model has already been jointly trained with the first sub-model, and the second basic model is derived from the second sub-model, training can be completed with only a small number of user behavior feature sequences labeled with tag data in the second sample set, reducing the influence of the number of positive samples available for training the user identification model on the model's accuracy.

Description

Black-producing user identification model training method and device, electronic device and storage medium
Technical Field
The present invention relates to the field of data processing technologies, and in particular to a black-producing user identification model training method and apparatus, an electronic device, and a storage medium.
Background
With the development of internet technology, the internet services provided by internet service providers keep increasing. In actual internet service scenarios, however, some people carry out abnormal activities based on these services, for example: stealing normal users' information; maliciously liking, commenting, or placing orders; and publishing illegal transaction information or fraud messages. A person performing such abnormal activities is referred to herein as an abnormal user or a black-producing user. Internet service providers need to continuously identify these abnormal users to ensure the account security of normal users and the normal operation of internet services.
In the related art, a trained user identification model analyzes a user's behavior features to identify abnormal users exhibiting abnormal behaviors. The training process of the user identification model comprises: manually selecting the user behavior features of abnormal users and labeling them as positive samples, manually selecting the user behavior features of normal users as negative samples, and training the user identification model with these positive and negative samples to obtain a trained user identification model.
However, the inventors found in their research that this training method requires manually selecting a large number of abnormal users' behavior features as positive samples. In a practical scenario, the number of abnormal users is far smaller than the number of normal users, so few abnormal-user behavior features are available to select as positive samples, and manually selecting them involves a great deal of work. The resulting shortage of positive samples available for training the user identification model ultimately affects the model's accuracy.
Disclosure of Invention
The embodiments of the invention aim to provide a black-producing user identification model training method and device, an electronic device, and a storage medium, so as to reduce the influence of the number of positive samples available for training a user identification model on the accuracy of that model. The specific technical solutions are as follows:
In a first aspect of the present invention, there is provided a black-producing user identification model training method, the method comprising:
training a first basic model with a first sample set until a first constraint condition is met, to obtain a first recognition model; the first recognition model is used for predicting, based on a text feature sequence and a user behavior feature sequence, whether the text corresponding to the text feature sequence is spam content; the first basic model includes a first sub-model and a second sub-model, where the first sub-model analyzes the text feature sequence to obtain a first spam content prediction result and the second sub-model analyzes the user behavior feature sequence to obtain a second spam content prediction result; the first constraint condition is associated with a first loss, and the first loss comprises a second loss, a third loss, and a fourth loss, the second loss being the loss of the first sub-model, the third loss being the loss of the second sub-model, and the fourth loss being the feature loss between the first sub-model and the second sub-model; and
determining a second basic model based on the second sub-model in the first recognition model, and training the second basic model with a second sample set to obtain a second recognition model; the second recognition model is used for recognizing, based on the user behavior feature sequence, whether the user corresponding to the user behavior feature sequence is a black-producing user.
In a second aspect of the present invention, there is also provided a black-producing user identification method, the method comprising:
acquiring a user behavior feature sequence of a user to be identified;
inputting the user behavior feature sequence of the user to be identified into a trained second recognition model to obtain a prediction result for the user to be identified, where the trained second recognition model is obtained by training with any of the above black-producing user identification model training methods; and
determining, based on the prediction result for the user to be identified, whether the user to be identified is a black-producing user.
In a third aspect of the present invention, there is also provided a training device for a black-producing user identification model, the device comprising:
the first training module is used for training the first basic model with the first sample set until the first constraint condition is met, to obtain the first recognition model; the first recognition model is used for predicting, based on a text feature sequence and a user behavior feature sequence, whether the text corresponding to the text feature sequence is spam content; the first basic model includes a first sub-model and a second sub-model, where the first sub-model analyzes the text feature sequence to obtain a first spam content prediction result and the second sub-model analyzes the user behavior feature sequence to obtain a second spam content prediction result; the first constraint condition is associated with a first loss, and the first loss comprises a second loss, a third loss, and a fourth loss, the second loss being the loss of the first sub-model, the third loss being the loss of the second sub-model, and the fourth loss being the feature loss between the first sub-model and the second sub-model;
the second training module is used for taking the second sub-model in the first recognition model as the second basic model and training the second basic model with the second sample set to obtain the second recognition model; the second recognition model is used for recognizing, based on the user behavior feature sequence, whether the user corresponding to the user behavior feature sequence is a black-producing user.
In a fourth aspect of the present invention, there is also provided a black-producing user identification apparatus, the apparatus comprising:
the acquisition module is used for acquiring a user behavior feature sequence of the user to be identified;
the identification module is used for inputting the user behavior feature sequence of the user to be identified into a trained second recognition model to obtain a prediction result for the user to be identified, where the trained second recognition model is obtained by training with the above black-producing user identification model training device;
and the determining module is used for determining, based on the prediction result for the user to be identified, whether the user to be identified is a black-producing user.
In a fifth aspect of the present invention, there is also provided an electronic device, including a processor, a communication interface, a memory, and a communication bus, where the processor, the communication interface, and the memory communicate with each other through the communication bus;
A memory for storing a computer program;
and a processor configured to implement the steps of any of the methods described herein when executing the program stored on the memory.
In a sixth aspect of the present invention there is also provided a computer readable storage medium having a computer program stored therein, the computer program when executed by a processor implementing the steps of any of the methods described herein.
In a seventh aspect of the invention there is also provided a computer program product containing instructions which, when run on a computer, cause the computer to perform the steps of any of the methods described herein.
The embodiments of the invention provide a black-producing user identification model training method and device, an electronic device, and a storage medium, where the training method comprises: training a first basic model with a first sample set until a first constraint condition is met, to obtain a first recognition model; and determining a second basic model based on the second sub-model in the first recognition model, and training the second basic model with a second sample set to obtain a second recognition model. The first recognition model is used for predicting, based on a text feature sequence and a user behavior feature sequence, whether the text corresponding to the text feature sequence is spam content. The first basic model includes a first sub-model and a second sub-model, where the first sub-model analyzes the text feature sequence to obtain a first spam content prediction result and the second sub-model analyzes the user behavior feature sequence to obtain a second spam content prediction result. The first constraint condition is associated with a first loss comprising a second loss, a third loss, and a fourth loss: the second loss is the loss of the first sub-model, the third loss is the loss of the second sub-model, and the fourth loss is the feature loss between the first sub-model and the second sub-model. The second recognition model is used for recognizing, based on the user behavior feature sequence, whether the user corresponding to the user behavior feature sequence is a black-producing user.
It can be seen that, in the embodiments of the present invention, the first sub-model and the second sub-model in the first basic model are jointly trained with the first sample set to obtain the second loss of the first sub-model, the third loss of the second sub-model, and the fourth loss between the two sub-models, and the training parameters of the first basic model are adjusted with the first loss comprising the second, third, and fourth losses, yielding the trained parameters of the first recognition model. A second basic model is then derived from the second sub-model in the first recognition model and trained with the second sample set. Because the second sub-model has already been jointly trained with the first sub-model, and the second basic model is derived from it, training can be completed with only a small number of user behavior feature sequences labeled with tag data in the second sample set. This reduces the influence of the number of available positive samples on the accuracy of the user identification model, and improves the accuracy of the second recognition model even when only a small number of labeled user behavior feature sequences are used. Furthermore, the workload of labeling tag data for text feature sequences is far smaller than that of labeling tag data for user behavior feature sequences, so the workload of manually selecting and labeling abnormal users' behavior features can be reduced.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below.
FIG. 1 is a flowchart of a first implementation of a training method for a black-producing user identification model according to an embodiment of the present invention;
FIG. 2 is a flowchart of a second implementation of a training method for a black-producing user identification model according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of a network structure according to an embodiment of the present invention;
FIG. 4 is a flowchart of a third implementation of a training method for a black-producing user identification model according to an embodiment of the present invention;
FIG. 5 is a flowchart of a black-producing user identification method according to an embodiment of the present invention;
FIG. 6 is a diagram of an overall frame in an embodiment of the invention;
FIG. 7 is a flowchart of a fourth implementation of a training method for a black-producing user identification model according to an embodiment of the present invention;
FIG. 8 is a schematic structural diagram of a training device for a black-producing user identification model according to an embodiment of the present invention;
fig. 9 is a schematic structural diagram of a black-producing user identification device according to an embodiment of the present invention;
fig. 10 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be described below with reference to the accompanying drawings in the embodiments of the present invention.
In an actual internet service scenario, some people carry out abnormal activities based on the internet services provided by an internet service provider, for example: stealing normal users' information, maliciously liking and spamming comments on video comment pages, or publishing illegal transaction information or fraud information through the bullet-screen (barrage) function. A person performing such abnormal activities is called an abnormal user or a black-producing user. Internet service providers need to continuously identify these abnormal users with an identification model to ensure the account security of normal users and the normal operation of internet services.
However, the inventors found that the related art usually requires manually selecting a large number of abnormal users' behavior features to train the recognition model. Since the number of abnormal users is far smaller than the number of normal users, few abnormal-user behavior features are available to select as positive samples, manually selecting them involves a great deal of work, and the number of such positive samples is limited. The small number of positive samples available for training the user recognition model ultimately affects its accuracy.
To solve the above problems, the embodiments of the present invention provide a black-producing user identification model training method and device, an electronic device, and a storage medium, so as to reduce the workload of manually selecting and labeling abnormal users' behavior features.
The black-producing user identification model training method of the embodiments of the present invention is described first. Fig. 1 is a flowchart of a first implementation of the method, which may include:
s110, training a first basic model by using a first sample set until a first constraint condition is met, so as to obtain a first recognition model; the first recognition model is used for predicting whether texts corresponding to the text feature sequences are junk contents or not based on the text feature sequences and the user behavior feature sequences; the first base model includes: the system comprises a first sub-model and a second sub-model, wherein the first sub-model is used for analyzing a text characteristic sequence to obtain a first junk content prediction result, and the second sub-model is used for analyzing a user behavior characteristic sequence to obtain a second junk content prediction result; wherein the first constraint is associated with a first loss, the first loss comprising: a second loss, a third loss, and a fourth loss, the second loss being a loss of the first sub-model, the third loss being a loss of the second sub-model, the fourth loss being a characteristic loss between the first sub-model and the second sub-model;
S120, determining a second basic model based on the second sub-model in the first recognition model, and training the second basic model with a second sample set to obtain a second recognition model; the second recognition model is used for recognizing, based on the user behavior feature sequence, whether the user corresponding to the user behavior feature sequence is a black-producing user.
To reduce the impact of the number of positive samples available for training the user recognition model on its accuracy, in the embodiments of the present invention a first basic model may first be trained with a first sample set comprising a text feature sequence, a user behavior feature sequence, and first tag data indicating whether the text corresponding to the text feature sequence is spam content.
In some examples, the text feature sequence is a word set obtained by processing text content published by a user on an internet service platform. The text content may be normal or abnormal comment information published on a comment page, normal or abnormal bullet-screen information published through the bullet-screen function, or normal or abnormal post content published in a forum, a post bar, or a microblog; the abnormal comment, bullet-screen, or post information may be illegal transaction information, fraud information, and the like. For example, the text content may be as shown in Table 1, with corresponding first tags also as shown in Table 1.
For example, the text feature sequence may be a word set obtained by word segmentation of the text content, a word set obtained by processing the text content with a bag-of-words model, or the like.
In some examples, the text content shown in Table 1 is relatively easy to obtain, and it is relatively easy to determine whether such text content is spam content, so obtaining the text feature sequence and the first tag is relatively easy. It will be appreciated that the text content shown in Table 1 is for illustration only; in practical applications the amount of text content may be far greater, for example in the millions or tens of millions.
Table 1. Example text content and tags (rendered as an image in the original publication)
In still other examples, the text feature sequence may be a word set obtained by processing the text content with a bag-of-words model, or with a TF-IDF (term frequency–inverse document frequency) model, or the like. Other word segmentation methods may also be used to obtain the word set from the text content.
The bag-of-words model performs word segmentation on the text content and counts the number of occurrences of each word in the text, yielding all words contained in the text content and each word's frequency of occurrence.
The TF-IDF model performs word segmentation on the text content, then computes each word's frequency of occurrence in the text content and the inverse document frequency of the documents containing the word, derives a weight for each word from these two values, and finally selects the words meeting a preset selection condition as the word set of the text feature sequence, as in the sketch below. The preset selection condition may be that a word's weight ranks among the top N in descending order, or that the weight is greater than or equal to a preset weight threshold.
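As an illustration of this selection step, the following minimal sketch computes TF-IDF weights and keeps the top-N words per text. It assumes scikit-learn; the vectorizer, the value of N, and all variable names are illustrative and not taken from the patent.

```python
# Illustrative sketch (not from the patent): selecting a word set by TF-IDF weight.
from sklearn.feature_extraction.text import TfidfVectorizer

texts = ["normal comment about an episode", "add me for cheap vip accounts"]  # toy data
vectorizer = TfidfVectorizer()
weights = vectorizer.fit_transform(texts).toarray()   # one row of TF-IDF weights per text
words = vectorizer.get_feature_names_out()

N = 5                                                 # preset selection condition: top-N weights
for row in weights:
    top = row.argsort()[::-1][:N]                     # indices of the N largest weights
    word_set = [words[i] for i in top if row[i] > 0]  # the text feature sequence's word set
    print(word_set)
```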
In some examples, when a user publishes text content, the text content has a correspondence with the user. When the user operates on the internet service platform, corresponding user behaviors are generated, and these behaviors likewise correspond to the user. A user behavior feature sequence can therefore be obtained. The user behavior feature sequence may be the data representation of the user performing different behaviors; for example, when the user logs in to a video website, each user behavior feature may be represented by at least one of "login device fingerprint", "login IP", "wifi flag", and "name of the commented video", and when the user logs in to a forum, post bar, or microblog, each user behavior feature may be represented by at least one of "login device account", "posting time", "post modification time", and "posting destination".
In some examples, user behavior feature sequences are difficult to obtain, and the workload of labeling tag data for user behavior feature sequences is far greater than that of labeling tag data for text feature sequences. Therefore, if the first sub-model for analyzing the text feature sequence and the second sub-model for analyzing the user behavior feature sequence are trained jointly, the second sub-model can be obtained while only the tag data of the text feature sequence needs to be labeled.
Therefore, in the embodiments of the present invention, the first basic model may be a twin (Siamese) neural network comprising two sub-models, namely a first sub-model and a second sub-model. The text feature sequence is input into the first sub-model and the user behavior feature sequence into the second sub-model to obtain the first loss of the first basic model, and the model parameters of the first basic model are then adjusted based on the first loss, thereby training the first basic model.
In still other examples, after the first basic model is trained to obtain the first recognition model, a second basic model for analyzing the user behavior feature sequence may be determined based on the second sub-model of the first recognition model:
for example, the second sub-model may be extracted from the first recognition model and taken directly as the second basic model;
alternatively, the model parameters of the second sub-model in the first recognition model may be obtained and stored in a preset knowledge base containing the model parameters of several models, i.e. the preset knowledge base is a set of model parameters of several models.
The model parameters of the second sub-model are then retrieved from the knowledge base and migrated into a second basic model having the same model structure as the second sub-model.
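As a concrete illustration, with PyTorch this migration could be performed via the model's state dict, as sketched below. The class and variable names (SubModel, first_recognition_model) are hypothetical; a SubModel-style class is sketched after the network-structure description further below.

```python
# Illustrative sketch (names are hypothetical): storing the trained second
# sub-model's parameters as a knowledge base and migrating them into a
# second basic model with the same structure.
import torch

# save the second sub-model's parameters as the "knowledge base"
torch.save(first_recognition_model.submodel2.state_dict(), "knowledge_base.pt")

# migrate the parameters into a structurally identical second basic model
second_basic_model = SubModel()
second_basic_model.load_state_dict(torch.load("knowledge_base.pt"))
```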
It will be appreciated that when the first recognition model is trained, the second sub-model's knowledge of user behavior is obtained as well; this knowledge can be represented by the second sub-model's model parameters, so the knowledge base here may simply be those parameters.
After the second basic model is obtained, since it is used for analyzing the user behavior feature sequence, it can be trained further to achieve higher analysis accuracy. At this point the second basic model can be trained with a second sample set comprising user behavior feature sequences and second tag data until a second constraint condition is met, yielding a second recognition model that identifies, based on a user behavior feature sequence, whether the corresponding user is a black-producing user.
The second constraint condition may be that the loss of the second basic model converges, that the number of training iterations reaches a third preset count threshold, or that the number of times the difference between losses from two adjacent training rounds falls below an error threshold is greater than or equal to the third preset count threshold; the second constraint condition is not limited here.
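One possible second-stage training loop consistent with this description is sketched below (PyTorch assumed; the loss function, optimizer, learning rate, and convergence threshold are illustrative assumptions, not specified by the patent; the sub-model is assumed to return a feature and a normalized prediction, as in the network-structure sketch below).

```python
# Illustrative sketch: fine-tuning the second basic model on the small second sample set.
import torch
import torch.nn.functional as F

optimizer = torch.optim.Adam(second_basic_model.parameters(), lr=1e-4)
prev_loss = float("inf")

for behavior_seq, second_tag in second_sample_set:    # second_sample_set is illustrative
    _, probs = second_basic_model(behavior_seq)       # normalized prediction
    loss = F.nll_loss(torch.log(probs + 1e-9), second_tag)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    # an example second constraint condition: adjacent-round losses have converged
    if abs(prev_loss - loss.item()) < 1e-5:
        break
    prev_loss = loss.item()
```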
In the embodiments of the present invention, the first sub-model and the second sub-model in the first basic model are jointly trained with the first sample set to obtain the second loss of the first sub-model, the third loss of the second sub-model, and the fourth loss between the two sub-models, and the training parameters of the first basic model are adjusted with the first loss comprising the second, third, and fourth losses, yielding the trained parameters of the first recognition model. A second basic model is then derived from the second sub-model in the first recognition model and trained with the second sample set. Because the second sub-model has already been jointly trained with the first sub-model, and the second basic model is derived from it, training can be completed with only a small number of user behavior feature sequences labeled with tag data in the second sample set. This reduces the influence of the number of available positive samples on the accuracy of the user identification model, and improves the accuracy of the second recognition model even when only a small number of labeled user behavior feature sequences are used. Furthermore, the workload of labeling tag data for text feature sequences is far smaller than that of labeling tag data for user behavior feature sequences, so the workload of manually selecting and labeling abnormal users' behavior features can be reduced.
On the basis of the black-producing user identification model training method shown in fig. 1, an embodiment of the present invention further provides a possible implementation. Fig. 2 is a flowchart of a second implementation of the method, which may include:
S210, inputting the text feature sequence into the first sub-model to obtain the first full-connection layer feature output by the second full-connection layer in the first sub-model and the first spam content prediction result output by the normalization layer of the first sub-model;
S220, inputting the user behavior feature sequence into the second sub-model to obtain the second full-connection layer feature output by the second full-connection layer in the second sub-model and the second spam content prediction result output by the normalization layer of the second sub-model;
S230, calculating the second loss based on the first tag data and the first spam content prediction result, calculating the third loss based on the first tag data and the second spam content prediction result, and calculating the fourth loss based on the first full-connection layer feature and the second full-connection layer feature;
S240, determining the first loss from the second loss, the third loss, and the fourth loss;
S250, adjusting the training parameters in the first basic model according to the first loss until the first constraint condition is met, to obtain the first recognition model.
S260, determining a second basic model based on the second sub-model in the first recognition model, and training the second basic model with a second sample set to obtain a second recognition model; the second recognition model is used for recognizing, based on the user behavior feature sequence, whether the user corresponding to the user behavior feature sequence is a black-producing user.
In some examples, after the text feature sequence is input into the first sub-model, the first full-connection layer feature and the first spam content prediction result output by the first sub-model can be obtained;
after the user behavior feature sequence is input into the second sub-model, the second full-connection layer feature and the second spam content prediction result output by the second sub-model can be obtained.
The second loss may then be calculated based on the first tag data and the first spam content prediction result, the third loss based on the first tag data and the second spam content prediction result, and the fourth loss based on the first full-connection layer feature and the second full-connection layer feature. The first loss is determined from the second, third, and fourth losses, and the training parameters in the first basic model are adjusted according to the first loss until the first constraint condition is met, yielding the first recognition model.
The second loss represents the deviation between the first spam content prediction result and the first tag data, and the third loss represents the deviation between the second spam content prediction result and the first tag data.
The first full-connection layer feature is a feature of the text feature sequence and can represent it; the second full-connection layer feature is a feature of the user behavior feature sequence and can represent it. The fourth loss therefore represents the distance between the first full-connection layer feature, obtained by the first sub-model's transformation of the text feature sequence, and the second full-connection layer feature, obtained by the second sub-model's transformation of the user behavior feature sequence.
In this way, minimizing the first loss also minimizes the distance between the first and second full-connection layer features, i.e. brings the first sub-model's transformation of the text feature sequence closest to the second sub-model's transformation of the user behavior feature sequence. The results of training the first sub-model on text features can thus be used in the second sub-model.
In still other examples, the first constraint condition may include at least one of: the first loss converges; or the number of times the difference between first losses obtained in two adjacent training rounds is smaller than a preset error threshold is greater than or equal to a second preset count threshold; and the like. The first constraint condition is not limited here, as long as it is associated with the first loss. By associating the first constraint condition with the first loss, the recognition accuracy of the finally trained first recognition model is tied to the magnitude of the first loss: the larger the first loss, the lower the recognition accuracy of the trained first recognition model; conversely, the smaller the first loss, the higher its recognition accuracy.
In still other examples, when training the first basic model with the first sample set until the first constraint condition is met to obtain the first recognition model, a text feature sequence in the first sample set, the user behavior feature sequence corresponding to it, and the corresponding first tag data may be acquired;
the text feature sequence is input into the first sub-model to obtain the first full-connection layer feature and the first spam content prediction result, and the user behavior feature sequence is input into the second sub-model to obtain the second full-connection layer feature and the second spam content prediction result;
further, the second loss may be calculated based on the first tag data and the first spam content prediction result, the third loss based on the first tag data and the second spam content prediction result, and the fourth loss based on the first full-connection layer feature and the second full-connection layer feature, and the first loss is determined from the second, third, and fourth losses;
finally, the training parameters in the first basic model may be adjusted according to the first loss, and the process returns to the step of acquiring a text feature sequence in the first sample set, the corresponding user behavior feature sequence, and the corresponding first tag data, until the first constraint condition is met and the first recognition model is obtained.
In some examples, the first sub-model and the second sub-model may use the same network structure or different network structures. For example, when they are the same, both may be BERT (Bidirectional Encoder Representations from Transformers) models or both ALBERT (A Lite BERT) models; when they are different, the first sub-model may be a BERT model and the second an ALBERT model, or the first an ALBERT model and the second a BERT model.
In yet other examples, the first sub-model and the second sub-model have the same network structure, which may be the network structure shown in fig. 3, including: a plurality of embedding layers 301, a plurality of two-way long-short-term memory layers 302, a reverse feedforward neural network layer 303, a forward feedforward neural network layer 304, a first full-connection layer 305, a hidden representation layer 306, a second full-connection layer 307, a logistic regression layer 308, and a normalization layer 309.
As shown in fig. 3, each embedding layer 301 is connected to a two-way long-short-term memory layer 302; the two-way long-short-term memory layers 302 are connected to the reverse feedforward neural network layer 303 and the forward feedforward neural network layer 304, respectively; the reverse feedforward neural network layer 303 and the forward feedforward neural network layer 304 are both connected to the first full-connection layer 305; and the first full-connection layer 305, the hidden representation layer 306, the second full-connection layer 307, the logistic regression layer 308, and the normalization layer 309 are connected in sequence.
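The following PyTorch sketch shows one plausible reading of this structure. The layer dimensions, the use of the final time step, and the exact wiring of the forward/reverse feedforward layers to the two halves of the BiLSTM output are assumptions for illustration; the patent does not fix these details.

```python
# Illustrative sketch of the Fig. 3 network structure (dimensions and wiring are assumptions).
import torch
import torch.nn as nn

class SubModel(nn.Module):
    def __init__(self, vocab_size: int = 10000, emb_dim: int = 128, hidden: int = 64):
        super().__init__()
        self.hidden = hidden
        self.embedding = nn.Embedding(vocab_size, emb_dim)        # embedding layers 301
        self.bilstm = nn.LSTM(emb_dim, hidden, batch_first=True,
                              bidirectional=True)                 # memory layers 302
        self.ffn_reverse = nn.Linear(hidden, hidden)              # reverse feedforward layer 303
        self.ffn_forward = nn.Linear(hidden, hidden)              # forward feedforward layer 304
        self.fc1 = nn.Linear(2 * hidden, hidden)                  # first full-connection layer 305
        self.hidden_repr = nn.Linear(hidden, hidden)              # hidden representation layer 306
        self.fc2 = nn.Linear(hidden, hidden)                      # second full-connection layer 307
        self.logreg = nn.Linear(hidden, 2)                        # logistic regression layer 308

    def forward(self, seq: torch.Tensor):
        emb = self.embedding(seq)              # discrete ids -> continuous vectors
        out, _ = self.bilstm(emb)              # (batch, len, 2 * hidden)
        fwd = out[:, -1, :self.hidden]         # forward half of the last time step
        bwd = out[:, -1, self.hidden:]         # backward half of the last time step
        h = torch.cat([self.ffn_forward(fwd), self.ffn_reverse(bwd)], dim=-1)
        fc1_feat = torch.relu(self.fc1(h))     # third/fourth full-connection layer feature
        hidden_feat = torch.relu(self.hidden_repr(fc1_feat))
        fc2_feat = self.fc2(hidden_feat)       # first/second full-connection layer feature
        logits = self.logreg(fc2_feat)         # to-be-normalized prediction result
        probs = torch.softmax(logits, dim=-1)  # normalization layer 309 output
        return fc2_feat, probs
```

Both sub-models of the twin network would instantiate this same class, with the text feature sequence fed to one instance and the user behavior feature sequence to the other.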
After the text feature sequence is input into the first sub-model, the plurality of embedding layers 301 of the first sub-model process the text feature sequence and output continuous word vectors. The continuous word vectors are input into the two-way long-short-term memory layers 302 of the first sub-model to obtain first word vectors; the first word vectors are input into the reverse feedforward neural network layer 303 and the forward feedforward neural network layer 304 of the first sub-model, respectively, to obtain second word vectors output by the reverse layer and third word vectors output by the forward layer; and the second and third word vectors are input into the first full-connection layer 305 of the first sub-model to obtain the third full-connection layer feature output by that layer.
The third full-connection layer feature is then input into the hidden representation layer 306 of the first sub-model to obtain the first hidden feature; the first hidden feature is input into the second full-connection layer 307 of the first sub-model to obtain the first full-connection layer feature; the first full-connection layer feature is input into the logistic regression layer 308 of the first sub-model to obtain the first to-be-normalized prediction result; and finally the first to-be-normalized prediction result is input into the normalization layer 309 of the first sub-model to obtain the first spam content prediction result. The first spam content prediction result is derived from the text feature sequence and indicates whether the text is spam content.
After the user behavior feature sequence is input into the second sub-model, the plurality of embedding layers 301 of the second sub-model process the user behavior feature sequence and output continuous behavior feature vectors. The continuous behavior feature vectors are input into the two-way long-short-term memory layers 302 of the second sub-model to obtain first behavior feature vectors; the first behavior feature vectors are input into the reverse feedforward neural network layer 303 and the forward feedforward neural network layer 304 of the second sub-model, respectively, to obtain second behavior feature vectors output by the reverse layer and third behavior feature vectors output by the forward layer; and the second and third behavior feature vectors are input into the first full-connection layer 305 of the second sub-model to obtain the fourth full-connection layer feature output by that layer.
The fourth full-connection layer feature is then input into the hidden representation layer 306 of the second sub-model to obtain the second hidden feature; the second hidden feature is input into the second full-connection layer 307 of the second sub-model to obtain the second full-connection layer feature; the second full-connection layer feature is input into the logistic regression layer 308 of the second sub-model to obtain the second to-be-normalized prediction result; and finally the second to-be-normalized prediction result is input into the normalization layer 309 of the second sub-model to obtain the second spam content prediction result. The second spam content prediction result is derived from the user behavior feature sequence and indicates whether the corresponding text is spam content.
In some examples, the two-way long-short-term memory layer 302 is a recurrent neural network. Because it contains multiple gating structures, which together with the recurrent connections form the special structure of the two-way long-short-term memory layer, it can mitigate the gradient vanishing and gradient explosion problems that arise when training on continuous word vectors and continuous behavior feature vectors.
With the network structure of the embodiments of the present invention, discrete word vectors can be converted into continuous word vectors through the embedding layers 301, and discrete behavior features can likewise be converted into continuous behavior feature vectors. The two-way long-short-term memory layers 302 address gradient vanishing and gradient explosion during training. Moreover, using the same network structure for both sub-models makes it convenient to adjust the model parameters of the first and second sub-models together, reducing the time cost of training and improving training efficiency.
In some examples, after the first spam content prediction result, the second spam content prediction result, the first full-connection layer feature, and the second full-connection layer feature are obtained, the second loss may be calculated based on the first tag data and the first spam content prediction result, the third loss based on the first tag data and the second spam content prediction result, and the fourth loss based on the first full-connection layer feature and the second full-connection layer feature; the second loss represents the deviation between the first spam content prediction result and the first tag data, and the third loss represents the deviation between the second spam content prediction result and the first tag data.
In some examples, the fourth loss satisfies the following formula:

MMD(X, Y) = ‖ (1/n) Σᵢ₌₁ⁿ φ(xᵢ) − (1/m) Σⱼ₌₁ᵐ φ(yⱼ) ‖_H

where MMD(X, Y) is the maximum mean difference between the first full-connection layer feature X = [x₁, …, xᵢ, …, xₙ] and the second full-connection layer feature Y = [y₁, …, yⱼ, …, yₘ]; φ(xᵢ) is the feature value of the first full-connection layer feature in the reproducing kernel Hilbert space H, obtained by mapping the i-th feature of X into that space; and φ(yⱼ) is the feature value of the second full-connection layer feature in the same space, obtained by mapping the j-th feature of Y into it. The maximum mean difference represents the distance between the first full-connection layer feature and the second full-connection layer feature.
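For intuition, the formula above can be computed as in the sketch below, where the mapping φ is taken to be the identity for simplicity (a kernelized mapping into the reproducing kernel Hilbert space would be used in practice); shapes and names are illustrative.

```python
# Illustrative sketch of the fourth loss: maximum mean difference between the
# first and second full-connection layer features, with phi taken as the identity map.
import torch

def mmd_loss(x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    # x: (n, d) first full-connection layer features; y: (m, d) second ones
    return torch.norm(x.mean(dim=0) - y.mean(dim=0), p=2)

domain_loss = mmd_loss(torch.randn(32, 64), torch.randn(32, 64))  # toy shapes
```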
It will be appreciated that the purpose of transfer learning is to apply knowledge learned in a source domain to a different but related target domain. To achieve this, the distance between the source-domain data and the target-domain data must be minimized, and the maximum mean difference can measure the distance between data from two different but related domains. It is therefore used here to measure the distance between the text feature sequence and the user behavior feature sequence.
In some examples, calculating the maximum mean difference between the first and second full-connection layer features and adding it to the network's loss when training the first basic model enables the subsequent migration of the second sub-model in the first recognition model, i.e. migrating the model parameters of the second sub-model, as a knowledge base, into the second basic model.
After the second, third, and fourth losses are obtained, the first loss of the first basic model can be calculated from them, and the training parameters in the first basic model are then adjusted according to the first loss until the first constraint condition is met, yielding the first recognition model. The second basic model can then be determined based on the first recognition model and trained with the second sample set to obtain the second recognition model.
It is to be understood that step S260 in this embodiment is the same as or similar to step S120 in the first embodiment and is not described again here.
On the basis of the black-producing user identification model training method shown in fig. 2, an embodiment of the present invention further provides a possible implementation. Fig. 4 is a flowchart of a third implementation of the method, which may include:
S410, inputting the text feature sequence into the first sub-model to obtain the first full-connection layer feature output by the second full-connection layer in the first sub-model and the first spam content prediction result output by the normalization layer of the first sub-model;
S420, inputting the user behavior feature sequence into the second sub-model to obtain the second full-connection layer feature output by the second full-connection layer in the second sub-model and the second spam content prediction result output by the normalization layer of the second sub-model;
S430, calculating the second loss based on the first tag data and the first spam content prediction result, calculating the third loss based on the first tag data and the second spam content prediction result, and calculating the fourth loss based on the first full-connection layer feature and the second full-connection layer feature;
S440, weighting the second loss, the third loss, and the fourth loss to obtain the first loss;
S450, adjusting the training parameters in the first basic model according to the first loss until the first constraint condition is met, to obtain the first recognition model.
S460, determining a second basic model based on the second sub-model in the first recognition model, and training the second basic model with a second sample set to obtain a second recognition model; the second recognition model is used for recognizing, based on the user behavior feature sequence, whether the user corresponding to the user behavior feature sequence is a black-producing user.
In some examples, the second, third, and fourth losses may be weighted such that the first loss may be obtained, where the weighting may be a weighted sum or a weighted average.
In still other examples, when calculating the first loss by weighted summation, the second loss may first be multiplied by a first weighting coefficient, the third loss by a second weighting coefficient, and the fourth loss by a third weighting coefficient, and the three weighted losses are then summed. For example, the first loss may be calculated with the following formula:
Total Loss = a*loss1 + b*loss2 + c*domain_loss
where Total Loss is the first loss, loss1 the second loss, loss2 the third loss, and domain_loss the fourth loss; a, b, and c are the first, second, and third weighting coefficients, respectively, and are manually set hyperparameters.
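Putting the pieces together, one joint-training step under this weighted first loss might look like the sketch below. The coefficient values, the negative-log-likelihood loss on the normalized predictions, and the names submodel1/submodel2 (instances of the SubModel sketch above) are illustrative assumptions.

```python
# Illustrative sketch: one joint-training step with Total Loss = a*loss1 + b*loss2 + c*domain_loss.
import torch
import torch.nn.functional as F

a, b, c = 1.0, 1.0, 0.5                           # illustrative weighting coefficients

feat1, pred1 = submodel1(text_seq)                # first sub-model on the text feature sequence
feat2, pred2 = submodel2(behavior_seq)            # second sub-model on the behavior sequence

loss1 = F.nll_loss(torch.log(pred1 + 1e-9), first_tags)   # second loss
loss2 = F.nll_loss(torch.log(pred2 + 1e-9), first_tags)   # third loss
domain_loss = mmd_loss(feat1, feat2)                      # fourth loss (sketch above)

total_loss = a * loss1 + b * loss2 + c * domain_loss      # first loss
optimizer.zero_grad()
total_loss.backward()
optimizer.step()
```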
In some examples, the first, second, and third weighting coefficients may be fixed, preset parameters, or parameters adjusted each time the model parameters are adjusted.
It is understood that steps S410 to S430 and S450 to S460 in this embodiment are the same as or similar to steps S210 to S230 and S250 to S260 in the second embodiment and are not described again here.
In some examples, after the second recognition model is obtained, it may be used to identify users. Fig. 5 is a flowchart of a black-producing user identification method according to an embodiment of the present invention, which may include:
S510, acquiring a user behavior feature sequence of a user to be identified;
S520, inputting the user behavior feature sequence of the user to be identified into a trained second recognition model, and determining whether the user to be identified is a black-producing user, where the trained second recognition model is obtained by training with the black-producing user identification model training method shown in any of the above embodiments.
Specifically, the user behavior feature sequence of the user to be identified may be input into the second recognition model, which outputs the probability that the user to be identified is a black-producing user based on that sequence; whether the user is a black-producing user can then be determined from this probability.
For example, when the probability that the user to be identified is a black-producing user is greater than 50%, the user may be determined to be a black-producing user; otherwise, the user is determined not to be a black-producing user.
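In code, this thresholding step could look like the following sketch (names follow the earlier sketches; the class index used for "black-producing" is an assumption).

```python
# Illustrative sketch: identifying a user to be identified with the second recognition model.
import torch

second_recognition_model.eval()
with torch.no_grad():
    _, probs = second_recognition_model(behavior_seq)  # normalized prediction
    p_black = probs[0, 1].item()                       # assumed: class 1 = black-producing

is_black_producing_user = p_black > 0.5               # the 50% threshold from the text
```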
In still other examples, the second recognition model may also directly output whether the user to be identified is a black-producing user based on the user behavior feature sequence; for example, it may directly output "yes" to indicate that the user is a black-producing user or "no" to indicate that the user is not.
In still other examples, the second sample set may also be updated periodically, and the second recognition model may be retrained using the updated second sample set. In this way, the second recognition model can be continuously updated, and black-producing users can be identified in a more timely manner.
For a clearer description of an embodiment of the present invention, a description is given here with reference to the overall framework diagram shown in fig. 6 and the flow diagram shown in fig. 7.
Firstly, massive text content may be collected offline. The text content may be comment information posted by users on a comment page, or barrage (bullet-screen) information posted by users using a barrage function; the comment or barrage information may include normal comment information or abnormal comment information, and the abnormal comment information may be illegal transaction information, fraud information, or the like. The massive text content is then labeled with first labels, and features are extracted from it to obtain a text feature set and a user behavior feature set, that is, a content feature set and a user behavior feature set.
In still other examples, the text content may also be information posted by users in other scenarios, such as post content published in forums, post bars, or microblogs, and may include normal post content or abnormal post content.
A twin network is then constructed. As shown in fig. 6, the twin network consists of a first sub-model 610 and a second sub-model 620 that use the same network structure. The network structure includes: a plurality of embedding layers 301, a plurality of bidirectional long short-term memory layers 302, a reverse feedforward neural network layer 303, a forward feedforward neural network layer 304, a first full-connection layer 305, a hidden representation layer 306, a second full-connection layer 307, a logistic regression layer 308, and a normalization layer 309, wherein:
as shown in fig. 3, each embedding layer 301 is connected to a bidirectional long short-term memory layer 302; the bidirectional long short-term memory layers 302 are connected to the reverse feedforward neural network layer 303 and the forward feedforward neural network layer 304, respectively; the reverse feedforward neural network layer 303 and the forward feedforward neural network layer 304 are both connected to the first full-connection layer 305; and the first full-connection layer 305, the hidden representation layer 306, the second full-connection layer 307, the logistic regression layer 308, and the normalization layer 309 are connected in sequence.
Since the embedding layers 301 can transform discrete features into continuous vectors, after the text feature set and the user behavior feature set are obtained, the text feature set may be converted into a text feature sequence and the user behavior feature set into a user behavior feature sequence.
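A minimal PyTorch-style sketch of one branch of the twin network is given below for illustration; all dimensions, the layer counts, and the interpretation of the reverse/forward feedforward layers as transforms of the backward/forward LSTM states are assumptions, since the embodiment does not fix them:

import torch
import torch.nn as nn

class SubModel(nn.Module):
    """One branch of the twin network (layers 301-309), with hypothetical sizes."""

    def __init__(self, vocab_sizes, emb_dim=32, hidden_dim=64):
        super().__init__()
        self.hd = hidden_dim
        # A plurality of embedding layers (301): one per discrete feature field.
        self.embeddings = nn.ModuleList(
            [nn.Embedding(v, emb_dim) for v in vocab_sizes])
        # Bidirectional long short-term memory layers (302).
        self.bilstm = nn.LSTM(emb_dim * len(vocab_sizes), hidden_dim,
                              batch_first=True, bidirectional=True)
        # Feedforward layers for the reverse (303) and forward (304) states.
        self.bwd_ff = nn.Linear(hidden_dim, hidden_dim)
        self.fwd_ff = nn.Linear(hidden_dim, hidden_dim)
        self.fc1 = nn.Linear(2 * hidden_dim, hidden_dim)  # first full-connection layer (305)
        self.hidden = nn.Linear(hidden_dim, hidden_dim)   # hidden representation layer (306)
        self.fc2 = nn.Linear(hidden_dim, hidden_dim)      # second full-connection layer (307)
        self.classifier = nn.Linear(hidden_dim, 1)        # logistic regression layer (308)

    def forward(self, fields):
        # fields: list of (batch, seq_len) LongTensors, one per feature field.
        x = torch.cat([emb(f) for emb, f in zip(self.embeddings, fields)], dim=-1)
        out, _ = self.bilstm(x)                   # (batch, seq_len, 2*hidden_dim)
        fwd = self.fwd_ff(out[:, -1, :self.hd])   # final forward-direction state
        bwd = self.bwd_ff(out[:, 0, self.hd:])    # final backward-direction state
        h = torch.relu(self.fc1(torch.cat([fwd, bwd], dim=-1)))
        h = torch.relu(self.hidden(h))
        fc2_feat = self.fc2(h)                    # feature used for the fourth loss
        # Sigmoid plays the role of the logistic regression/normalization
        # layers (308/309) for a single output.
        prob = torch.sigmoid(self.classifier(fc2_feat))
        return fc2_feat, prob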
At this time, the text feature sequence and the user behavior feature sequence may be input into the twin network, and the twin network may be trained offline. Training the twin network may be referred to as task 1; accordingly, training the twin network may also be referred to herein as training on task 1.
After training is completed, a first recognition model is obtained, and the model parameters of the second sub-model in the first recognition model may then be migrated, as a knowledge base, to the second basic model 630.
The migrated second basic model 630 is then fine-tuned offline using the second sample set. Since the second basic model 630 is fine-tuned for task 2, this step may be referred to as continuing fine-tuning on task 2; in this way, the target second recognition model that can be deployed online is obtained. The second recognition model is deployed online and becomes an online risk-control model.
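A hedged sketch of this migration-and-fine-tuning step follows, reusing the SubModel sketch above; `second_sub_model`, `behavior_vocab_sizes`, and the data loader `task2_loader` are hypothetical names assumed to exist, and the binary cross-entropy objective is an assumption:

import copy
import torch

# Hypothetical knowledge base: a dict from model name to saved parameters.
knowledge_base = {
    "second_sub_model": copy.deepcopy(second_sub_model.state_dict()),
}

# The second basic model has the same structure as the second sub-model,
# so the saved parameters can be loaded directly.
second_basic_model = SubModel(vocab_sizes=behavior_vocab_sizes)
second_basic_model.load_state_dict(knowledge_base["second_sub_model"])

# Fine-tune on task 2 using the second sample set.
optimizer = torch.optim.Adam(second_basic_model.parameters(), lr=1e-4)
bce = torch.nn.BCELoss()
for fields, labels in task2_loader:          # yields (behavior fields, labels)
    _, prob = second_basic_model(fields)
    loss = bce(prob.squeeze(-1), labels.float())
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()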
When a real-time stream of user behavior is obtained, behavior features are extracted from the stream sequentially, using sliding windows of a preset size, in the time order of the user behaviors in the stream. The extracted behavior features are then input into the second recognition model deployed online, which identifies them online and outputs a user identification result, that is, a risk-control result, in real time. The user identification result indicates whether the user is a black-producing user.
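For illustration, the sliding-window extraction over the real-time stream may be sketched as follows; the window size, the `encode_behavior` helper, and the deployed-model interface are all assumptions, as the embodiment only specifies a preset-size window ordered by time:

from collections import deque

WINDOW_SIZE = 50                    # hypothetical preset window size
window = deque(maxlen=WINDOW_SIZE)  # keeps the most recent behaviors

def on_behavior_event(event, deployed_model):
    """Handle one user behavior event from the real-time stream, in time order."""
    window.append(encode_behavior(event))  # encode_behavior is assumed
    if len(window) < WINDOW_SIZE:
        return None                        # wait until the window is full
    prob = deployed_model(list(window))    # online second recognition model
    return "black-producing user" if prob > 0.5 else "not a black-producing user"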
According to the embodiment of the present invention, training the second recognition model does not depend on labeling a large amount of user behavior data, which reduces the labor cost of labeling user behavior data. In addition, the black-producing user identification model training method provided by the embodiment of the present invention does not require formulating rules at various levels, and thus avoids the problems of rule invalidation and contradictions between rules. Furthermore, the second recognition model trained by this method can detect user behavior in real time with high timeliness: as long as a user's short-term behavior sequence is abnormal, the second recognition model can detect it.
Corresponding to the above method embodiments, an embodiment of the present invention further provides a model training device. As shown in fig. 8, which is a schematic structural diagram of a model training device in an embodiment of the present invention, the device may include:
a first training module 810, configured to train a first basic model with a first sample set until a first constraint condition is satisfied, so as to obtain a first recognition model; the first recognition model is used for predicting, based on a text feature sequence and a user behavior feature sequence, whether the text corresponding to the text feature sequence is junk content; the first basic model includes a first sub-model and a second sub-model, wherein the first sub-model is used for analyzing the text feature sequence to obtain a first junk content prediction result, and the second sub-model is used for analyzing the user behavior feature sequence to obtain a second junk content prediction result; and the first constraint condition is associated with a first loss, the first loss including a second loss, a third loss, and a fourth loss, where the second loss is the loss of the first sub-model, the third loss is the loss of the second sub-model, and the fourth loss is a feature loss between the first sub-model and the second sub-model;
a second training module 820, configured to take the second sub-model in the first recognition model as a second basic model, and train the second basic model with a second sample set to obtain a second recognition model; the second recognition model is used for recognizing, based on a user behavior feature sequence, whether the user corresponding to the user behavior feature sequence is a black-producing user.
In some examples, the first sample set includes: a text feature sequence, a user behavior feature sequence, and first tag data, the first tag data being used for indicating whether the text corresponding to the text feature sequence is junk content;
the first training module 810 is specifically configured to:
inputting the text feature sequence into a first sub-model to obtain a first full-connection layer feature and a first junk content prediction result;
inputting the user behavior feature sequence into a second sub-model to obtain a second full-connection layer feature and a second junk content prediction result;
calculating a second loss based on the first tag data and the first junk content prediction result, calculating a third loss based on the first tag data and the second junk content prediction result, and calculating a fourth loss based on the first full-connection layer feature and the second full-connection layer feature;
Determining a first loss according to the second loss, the third loss and the fourth loss;
and adjusting training parameters in the first basic model according to the first loss until the first constraint condition is met, so as to obtain the first recognition model.
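Putting these steps together, one training step of the first basic model may be sketched as follows; `first_sub`, `second_sub`, `text_fields`, `behavior_fields`, and `labels` are assumed names, the binary cross-entropy choice for the second and third losses is an assumption, `total_loss` is the sketch given earlier, and `mmd` is the sketch given after the maximum-mean-discrepancy description below:

import torch

bce = torch.nn.BCELoss()
optimizer = torch.optim.Adam(
    list(first_sub.parameters()) + list(second_sub.parameters()), lr=1e-3)

fc_feat_1, pred_1 = first_sub(text_fields)        # first sub-model branch
fc_feat_2, pred_2 = second_sub(behavior_fields)   # second sub-model branch

loss1 = bce(pred_1.squeeze(-1), labels.float())   # second loss
loss2 = bce(pred_2.squeeze(-1), labels.float())   # third loss
domain_loss = mmd(fc_feat_1, fc_feat_2)           # fourth loss
loss = total_loss(loss1, loss2, domain_loss)      # first loss

optimizer.zero_grad()
loss.backward()
optimizer.step()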
In some examples, the first sub-model and the second sub-model each include:
a plurality of embedding layers, a plurality of bidirectional long short-term memory layers, a reverse feedforward neural network layer, a forward feedforward neural network layer, a first full-connection layer, a hidden representation layer, a second full-connection layer, a logistic regression layer, and a normalization layer;
wherein the second full-connection layer in the first sub-model outputs the first full-connection layer feature, and the second full-connection layer in the second sub-model outputs the second full-connection layer feature.
In some examples, the first training module 810 is specifically configured to:
calculating the maximum mean discrepancy (MMD) between the first full-connection layer feature and the second full-connection layer feature, and determining the maximum mean discrepancy as the fourth loss.
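A hedged sketch of the maximum mean discrepancy computation is given below; the Gaussian kernel and its bandwidth are assumptions, since the embodiment only names the discrepancy itself:

import torch

def mmd(x, y, sigma=1.0):
    """Biased estimate of MMD^2 between feature batches x and y,
    using a Gaussian (RBF) kernel with hypothetical bandwidth sigma."""
    def rbf(a, b):
        d = torch.cdist(a, b) ** 2               # pairwise squared distances
        return torch.exp(-d / (2 * sigma ** 2))  # Gaussian kernel matrix
    return rbf(x, x).mean() + rbf(y, y).mean() - 2 * rbf(x, y).mean()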
In some examples, the first training module 810 is further to:
and weighting the second loss, the third loss, and the fourth loss to obtain the first loss.
In some examples, the second training module 820 is specifically configured to:
obtaining the model parameters of the second sub-model in the first recognition model, and storing them in a preset knowledge base, wherein the preset knowledge base includes model parameters of a plurality of models;
and obtaining the model parameters of the second sub-model from the knowledge base and migrating them to a second basic model, wherein the second basic model has the same model structure as the second sub-model.
An embodiment of the present invention further provides a black-producing user identification device. As shown in fig. 9, which is a schematic structural diagram of a black-producing user identification device in an embodiment of the present invention, the device may include:
an obtaining module 910, configured to obtain a user behavior feature sequence of a user to be identified;
the recognition module 920 is configured to input the user behavior feature sequence of the user to be identified into a trained second recognition model, and determine whether the user to be identified is a black-producing user, wherein the trained second recognition model is obtained through training by the model training device shown in fig. 8.
An embodiment of the present invention further provides an electronic device. As shown in fig. 10, the electronic device includes a processor 1001, a communication interface 1002, a memory 1003, and a communication bus 1004, wherein the processor 1001, the communication interface 1002, and the memory 1003 communicate with each other through the communication bus 1004;
the memory 1003 is configured to store a computer program;
the processor 1001 is configured to implement the steps shown in any of the above embodiments when executing the program stored in the memory 1003.
The communication bus mentioned above for the electronic device may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The communication bus may be divided into an address bus, a data bus, a control bus, and so on. For ease of illustration, only one thick line is shown in the figure, but this does not mean that there is only one bus or only one type of bus.
The communication interface is used for communication between the electronic device and other devices.
The memory may include a random access memory (RAM) or a non-volatile memory, for example at least one disk memory. Optionally, the memory may also be at least one storage device located remotely from the aforementioned processor.
The processor may be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), and the like; it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
In yet another embodiment of the present invention, there is also provided a computer readable storage medium having a computer program stored therein, the computer program implementing the steps shown in any of the above embodiments when executed by a processor.
In yet another embodiment of the present invention, there is also provided a computer program product containing instructions which, when run on a computer, cause the computer to perform the steps shown in any of the embodiments described above.
The above embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented in software, they may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the flows or functions according to the embodiments of the present invention are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another, for example by wired (e.g., coaxial cable, optical fiber, digital subscriber line (DSL)) or wireless (e.g., infrared, radio, microwave) means. The computer-readable storage medium may be any available medium accessible to a computer, or a data storage device, such as a server or data center, that integrates one or more available media. The available medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., solid state disk (SSD)), among others.
It should be noted that relational terms such as "first" and "second" are used herein solely to distinguish one entity or action from another, and do not necessarily require or imply any actual relationship or order between such entities or actions. Moreover, the terms "comprises", "comprising", or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.
In this specification, the embodiments are described in a related manner; identical or similar parts of the embodiments may be referred to one another, and each embodiment focuses on its differences from the other embodiments. In particular, for the device, electronic device, and storage medium embodiments, the description is relatively brief since they are substantially similar to the method embodiments; for relevant points, reference may be made to the description of the method embodiments.
The foregoing description is only of the preferred embodiments of the present invention and is not intended to limit the scope of the present invention. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention are included in the protection scope of the present invention.

Claims (11)

1. A black-producing user identification model training method, the method comprising:
training a first basic model by using a first sample set until a first constraint condition is met, so as to obtain a first recognition model; the first recognition model is used for predicting, based on a text feature sequence and a user behavior feature sequence, whether the text corresponding to the text feature sequence is junk content; the first basic model includes a first sub-model and a second sub-model, wherein the first sub-model is used for analyzing the text feature sequence to obtain a first junk content prediction result, and the second sub-model is used for analyzing the user behavior feature sequence to obtain a second junk content prediction result; and the first constraint condition is associated with a first loss, the first loss including a second loss, a third loss, and a fourth loss, where the second loss is the loss of the first sub-model, the third loss is the loss of the second sub-model, and the fourth loss is a feature loss between the first sub-model and the second sub-model;
determining a second basic model based on the second sub-model in the first recognition model, and training the second basic model by using a second sample set to obtain a second recognition model; the second recognition model is used for recognizing, based on the user behavior feature sequence, whether the user corresponding to the user behavior feature sequence is a black-producing user.
2. The method of claim 1, wherein the first sample set comprises: a text feature sequence, a user behavior feature sequence, and first tag data, the first tag data being used for indicating whether the text corresponding to the text feature sequence is junk content;
and wherein training the first basic model by using the first sample set until the first constraint condition is met, so as to obtain the first recognition model, comprises:
inputting the text feature sequence into the first sub-model to obtain a first full-connection layer feature and a first junk content prediction result;
inputting the user behavior feature sequence into the second sub-model to obtain a second full-connection layer feature and the second junk content prediction result;
calculating the second loss based on the first tag data and the first junk content prediction result, calculating the third loss based on the first tag data and the second junk content prediction result, and calculating the fourth loss based on the first full-connection layer feature and the second full-connection layer feature;
Determining the first loss according to the second loss, the third loss and the fourth loss;
and adjusting training parameters in the first basic model according to the first loss until the first constraint condition is met, so as to obtain the first recognition model.
3. The method of claim 2, wherein the first sub-model and the second sub-model each comprise:
a plurality of embedding layers, a plurality of bidirectional long short-term memory layers, a reverse feedforward neural network layer, a forward feedforward neural network layer, a first full-connection layer, a hidden representation layer, a second full-connection layer, a logistic regression layer, and a normalization layer;
wherein a second full-connection layer in the first sub-model outputs the first full-connection layer feature, and a second full-connection layer in the second sub-model outputs the second full-connection layer feature.
4. The method of claim 2, wherein the calculating a fourth loss based on the first full connection layer feature and the second full connection layer feature comprises:
calculating the maximum mean discrepancy between the first full-connection layer feature and the second full-connection layer feature; and determining the maximum mean discrepancy as the fourth loss.
5. The method of claim 2, wherein the determining the first loss based on the second loss, the third loss, and the fourth loss comprises:
weighting the second loss, the third loss, and the fourth loss to obtain the first loss.
6. The method of claim 1, wherein the determining a second base model based on a second sub-model in the first recognition model comprises:
obtaining model parameters of a second sub-model in the first recognition model, and storing the model parameters of the second sub-model into a preset knowledge base, wherein the preset knowledge base comprises model parameters of a plurality of models;
and obtaining the model parameters of the second sub-model from the knowledge base and migrating them to the second basic model, wherein the second basic model has the same model structure as the second sub-model.
7. A black-producing user identification method, characterized in that the method comprises:
acquiring a user behavior feature sequence of a user to be identified;
inputting the user behavior feature sequence of the user to be identified into a trained second recognition model, and determining whether the user to be identified is a black-producing user, wherein the trained second recognition model is obtained through training by the black-producing user identification model training method according to any one of claims 1-6.
8. A black-producing user identification model training device, the device comprising:
the first training module is used for training a first basic model by using a first sample set until a first constraint condition is met, so as to obtain a first recognition model; the first recognition model is used for predicting, based on a text feature sequence and a user behavior feature sequence, whether the text corresponding to the text feature sequence is junk content; the first basic model includes a first sub-model and a second sub-model, wherein the first sub-model is used for analyzing the text feature sequence to obtain a first junk content prediction result, and the second sub-model is used for analyzing the user behavior feature sequence to obtain a second junk content prediction result; and the first constraint condition is associated with a first loss, the first loss including a second loss, a third loss, and a fourth loss, where the second loss is the loss of the first sub-model, the third loss is the loss of the second sub-model, and the fourth loss is a feature loss between the first sub-model and the second sub-model;
the second training module is used for taking the second sub-model in the first recognition model as a second basic model, and training the second basic model by using a second sample set to obtain a second recognition model; the second recognition model is used for recognizing, based on the user behavior feature sequence, whether the user corresponding to the user behavior feature sequence is a black-producing user.
9. A black-producing user identification device, characterized in that the device comprises:
the acquisition module is used for acquiring a user behavior feature sequence of the user to be identified;
the recognition module is used for inputting the user behavior feature sequence of the user to be identified into a trained second recognition model, and determining whether the user to be identified is a black-producing user, wherein the trained second recognition model is obtained through training by the black-producing user identification model training device of claim 8.
10. An electronic device, characterized by comprising a processor, a communication interface, a memory, and a communication bus, wherein the processor, the communication interface, and the memory communicate with each other through the communication bus;
a memory for storing a computer program;
a processor for implementing the steps of the method of any one of claims 1 to 7 when executing a program stored on a memory.
11. A computer-readable storage medium, characterized in that the computer-readable storage medium has stored therein a computer program which, when executed by a processor, implements the steps of the method of any of claims 1-7.
CN202111145600.3A 2021-09-28 2021-09-28 Black-birth user identification model training method and device, electronic equipment and storage medium Active CN113822684B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111145600.3A CN113822684B (en) 2021-09-28 2021-09-28 Black-birth user identification model training method and device, electronic equipment and storage medium


Publications (2)

Publication Number Publication Date
CN113822684A CN113822684A (en) 2021-12-21
CN113822684B true CN113822684B (en) 2023-06-06

Family

ID=78915779

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111145600.3A Active CN113822684B (en) 2021-09-28 2021-09-28 Black-birth user identification model training method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN113822684B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115564577B (en) * 2022-12-02 2023-04-07 成都新希望金融信息有限公司 Abnormal user identification method and device, electronic equipment and storage medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109168044A (en) * 2018-10-11 2019-01-08 北京奇艺世纪科技有限公司 A kind of determination method and device of video features
CN110008980A (en) * 2019-01-02 2019-07-12 阿里巴巴集团控股有限公司 Identification model generation method, recognition methods, device, equipment and storage medium
WO2020156004A1 (en) * 2019-02-01 2020-08-06 阿里巴巴集团控股有限公司 Model training method, apparatus and system
CN112686046A (en) * 2021-01-06 2021-04-20 上海明略人工智能(集团)有限公司 Model training method, device, equipment and computer readable medium
CN112926699A (en) * 2021-04-25 2021-06-08 恒生电子股份有限公司 Abnormal object identification method, device, equipment and storage medium
CN113392179A (en) * 2020-12-21 2021-09-14 腾讯科技(深圳)有限公司 Text labeling method and device, electronic equipment and storage medium

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11310247B2 (en) * 2016-12-21 2022-04-19 Micro Focus Llc Abnormal behavior detection of enterprise entities using time-series data
KR102418859B1 (en) * 2017-12-12 2022-07-11 한국전자통신연구원 System and method for managing dangerous factors in AEO certification process
US11385633B2 (en) * 2018-04-09 2022-07-12 Diveplane Corporation Model reduction and training efficiency in computer-based reasoning and artificial intelligence systems


Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Design and application of a network abnormal behavior detection algorithm based on affinity propagation; Chen Junjie; China Master's Theses Full-text Database, Information Science and Technology; full text *
Joint recognition of intent and semantic slot filling incorporating multiple constraints; Hou Lixian; Li Yanling; Lin Min; Li Chengcheng; Journal of Frontiers of Computer Science and Technology (09); full text *
Joint extraction of opinion targets and opinion words for Chinese microblogs; Liu Quanchao; Huang Heyan; Feng Chong; Acta Electronica Sinica (07); full text *

Also Published As

Publication number Publication date
CN113822684A (en) 2021-12-21

Similar Documents

Publication Publication Date Title
Fanaee-T et al. Event labeling combining ensemble detectors and background knowledge
US20200349430A1 (en) System and method for predicting domain reputation
CN111371806A (en) Web attack detection method and device
CN111881983B (en) Data processing method and device based on classification model, electronic equipment and medium
JP5454357B2 (en) Information processing apparatus and method, and program
CN110929785B (en) Data classification method, device, terminal equipment and readable storage medium
CN111818198B (en) Domain name detection method, domain name detection device, equipment and medium
CN110598070B (en) Application type identification method and device, server and storage medium
Shindarev et al. Approach to identifying of employees profiles in websites of social networks aimed to analyze social engineering vulnerabilities
CN110674317A (en) Entity linking method and device based on graph neural network
CN112839014B (en) Method, system, equipment and medium for establishing abnormal visitor identification model
CN111881398B (en) Page type determining method, device and equipment and computer storage medium
CN113139052B (en) Rumor detection method and device based on graph neural network feature aggregation
CN111160959A (en) User click conversion estimation method and device
CN115310510A (en) Target safety identification method and device based on optimization rule decision tree and electronic equipment
CN111159481B (en) Edge prediction method and device for graph data and terminal equipment
CN113822684B (en) Black-birth user identification model training method and device, electronic equipment and storage medium
CN115130542A (en) Model training method, text processing device and electronic equipment
Chua et al. Problem Understanding of Fake News Detection from a Data Mining Perspective
CN114492576A (en) Abnormal user detection method, system, storage medium and electronic equipment
CN112487819A (en) Method, system, electronic device and storage medium for identifying homonyms among enterprises
Li et al. SCX-SD: semi-supervised method for contextual sarcasm detection
Lijun et al. An intuitionistic calculus to complex abnormal event recognition on data streams
CN113157993A (en) Network water army behavior early warning model based on time sequence graph polarization analysis
Li et al. Using big data from the web to train chinese traffic word representation model in vector space

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant