CN116976397A - Information selection method and device, electronic equipment and storage medium - Google Patents

Information selection method and device, electronic equipment and storage medium

Info

Publication number
CN116976397A
CN116976397A (application CN202310108688.4A)
Authority
CN
China
Prior art keywords
information
sample
model
value
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310108688.4A
Other languages
Chinese (zh)
Inventor
苏宏祖
张一飞
杨雪娇
华画
王双洋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN202310108688.4A priority Critical patent/CN116976397A/en
Publication of CN116976397A publication Critical patent/CN116976397A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00Commerce
    • G06Q30/06Buying, selling or leasing transactions
    • G06Q30/0601Electronic shopping [e-shopping]
    • G06Q30/0631Item recommendations

Abstract

The application discloses an information selection method and device, an electronic device, and a storage medium. The embodiments of the application relate to artificial intelligence, machine learning, cloud technology, and related technical fields. The method comprises the following steps: acquiring object features of a target object and information features of a plurality of pieces of information to be selected; processing the object features of the target object and the information features of each piece of information to be selected through an information selection model to obtain a target value for each piece of information to be selected; and selecting target information for the target object from the plurality of pieces of information to be selected according to their target values. Because the first discriminator distinguishes well between information of the first data domain and information of the second data domain, the prediction bias of the information selection model trained with the first discriminator is greatly reduced; the target information determined from the target values therefore better matches the needs of the target object, improving information delivery efficiency.

Description

Information selection method and device, electronic equipment and storage medium
Technical Field
The present application relates to the field of electronic information technologies, and in particular, to an information selection method, an information selection device, an electronic device, and a storage medium.
Background
An intelligent selection (recommendation) service selects information and services that a user may like, based on big data and artificial intelligence technology combined with accumulated knowledge from multiple industry fields. For example, such a service may select goods a user may like based on the user's browsing and purchase records.
At present, in intelligent selection scenarios, an initial model can be trained on samples from multiple data domains to obtain an information selection model covering those domains. The information to be selected in each domain is then scored by the information selection model to obtain a prediction score for each piece of information to be selected; the information to be pushed to a user is determined from the candidates according to these prediction scores, and the determined information is pushed to the user.
However, when the initial model is trained in this way, samples from different data domains interfere with each other, so the trained information selection model has a large prediction bias and the prediction scores it assigns to information to be selected are inaccurate. The information determined from those scores is then not information the user actually likes, information delivery efficiency is low, and the user experience is poor.
Disclosure of Invention
In view of the above, the embodiments of the present application provide an information selecting method, an information selecting device, an electronic device, and a storage medium.
In a first aspect, an embodiment of the present application provides an information selection method, where the method includes: acquiring object features of a target object and information features of a plurality of pieces of information to be selected, where the plurality of pieces of information to be selected belong to a first data domain; processing the object features of the target object and the information features of each piece of information to be selected through an information selection model to obtain a target value for each piece of information to be selected, where the target value of each piece of information to be selected characterizes the target object's attention to that information; the information selection model is obtained by training a first model with a first discriminator; the first model is configured with the parameters of a second model, and the second model is obtained by training on sample information of a second data domain; the first discriminator is obtained by training a second discriminator with an adversarial loss value, and the adversarial loss value characterizes the second discriminator's ability to distinguish information of the first data domain from information of the second data domain; and selecting target information for the target object from the plurality of pieces of information to be selected according to their respective target values.
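As a minimal sketch of the scoring step above (the patent does not specify the model architecture or feature encoding; the function and the toy dot-product model below are illustrative assumptions only):

```python
from typing import Callable, Sequence

def score_candidates(model: Callable[[Sequence[float], Sequence[float]], float],
                     object_features: Sequence[float],
                     candidates: Sequence[Sequence[float]]) -> list:
    """Run the information selection model on the (object feature,
    information feature) pair for each candidate to obtain its target
    value, i.e. the target object's predicted attention to it."""
    return [model(object_features, feat) for feat in candidates]

# Toy stand-in model (illustrative): dot product of the two feature vectors.
toy_model = lambda obj, info: sum(a * b for a, b in zip(obj, info))
scores = score_candidates(toy_model, [1.0, 0.5],
                          [[0.1, 0.1], [0.9, 0.9], [0.5, 0.0]])
```

The target information would then be picked from `scores`, e.g. by threshold or top-k, as the final step of the claimed method describes.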
In a second aspect, an embodiment of the present application provides an information selection apparatus, including: an acquisition module, configured to acquire object features of a target object and information features of a plurality of pieces of information to be selected, where the plurality of pieces of information to be selected belong to a first data domain; a target value determining module, configured to process the object features of the target object and the information features of each piece of information to be selected through an information selection model to obtain a target value for each piece of information to be selected, where the target value of each piece of information to be selected characterizes the target object's attention to that information; the information selection model is obtained by training a first model with a first discriminator; the first model is configured with the parameters of a second model, and the second model is obtained by training on sample information of a second data domain; the first discriminator is obtained by training a second discriminator with an adversarial loss value, and the adversarial loss value characterizes the second discriminator's ability to distinguish information of the first data domain from information of the second data domain; and an information selection module, configured to select target information for the target object from the plurality of pieces of information to be selected according to their respective target values.
Optionally, the apparatus further comprises a first training module, configured to: determine, through the first model, sample features of first sample information of the first data domain as first sample features; determine, through the first model, sample features of second sample information of the second data domain as second sample features; determine an adversarial loss value according to the second discriminator's discrimination result for the first sample features and its discrimination result for the second sample features; train the second discriminator according to the adversarial loss value to obtain a third discriminator; train the first model according to the third discriminator's discrimination result for the first sample features to obtain a third model; take the third model as a new first model, the third discriminator as a new second discriminator, third sample information of the first data domain as new first sample information, and fourth sample information of the second data domain as new second sample information, and return to the step of determining the sample features of the first sample information of the first data domain through the first model, until a preset condition is satisfied; and take the third model obtained in the last training round as the information selection model and the third discriminator obtained in the last training round as the first discriminator.
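The alternating scheme above resembles domain-adversarial training. A toy, self-contained sketch (illustrative only; the patent does not specify architectures or loss functions beyond "adversarial", and the scalar model, logistic discriminator, and learning rate are assumptions):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def training_round(w_model, w_disc, x1, x2, lr=0.1):
    """One round of the claimed loop. The 'first model' is a scalar
    feature extractor w_model; the 'second discriminator' w_disc is a
    logistic classifier labeling first-domain features 1, second-domain 0.
    Step (1) trains the discriminator on the adversarial loss, yielding
    the 'third discriminator'; step (2) trains the model against that
    discriminator's result on first-domain features ('third model')."""
    f1, f2 = w_model * x1, w_model * x2          # first/second sample features
    # (1) Discriminator step: gradient of binary cross-entropy w.r.t. w_disc.
    p1, p2 = sigmoid(w_disc * f1), sigmoid(w_disc * f2)
    grad_d = np.mean((p1 - 1.0) * f1) + np.mean(p2 * f2)
    w_disc = w_disc - lr * grad_d                # "third discriminator"
    # (2) Model step: make first-domain features look like second-domain
    # ones to the updated discriminator (push its output toward 0).
    p1 = sigmoid(w_disc * w_model * x1)
    grad_m = np.mean(p1 * w_disc * x1)
    w_model = w_model - lr * grad_m              # "third model"
    return w_model, w_disc
```

In the patent's loop, each round also draws fresh first- and second-domain sample information before the next call.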
Optionally, the first training module is further configured to: determine, through the third discriminator, a discrimination result for the first sample features as an intermediate discrimination result of the first sample features; determine a selection loss value according to the intermediate discrimination result, where the selection loss value characterizes the first model's ability to extract features from sample information of the first data domain; and train the first model according to the selection loss value to obtain the third model.
Optionally, the first training module is further configured to: take the third model obtained in the last training round as a model to be corrected; obtain a correction sample, where the correction sample comprises correction sample information of the first data domain and a correction sample value corresponding to the correction sample information, the correction sample value characterizing the degree of attention of the sample object corresponding to the correction sample information to that information; determine, through the model to be corrected, a predicted correction sample value for the correction sample information; determine a loss value as a correction loss value according to the predicted correction sample value and the correction sample value; and train the model to be corrected according to the correction loss value to obtain the information selection model.
Optionally, the first training module is further configured to: determine a first correction loss value according to the predicted correction sample value and the correction sample value; determine, through the model to be corrected, sample features for the correction sample information as corrected sample features; determine, through the first discriminator, a discrimination result for the corrected sample features; determine a second correction loss value according to that discrimination result; and determine the correction loss value according to the first correction loss value and the second correction loss value.
Optionally, the first training module is further configured to obtain a first coefficient for a first correction loss value and a second coefficient for a second correction loss value; calculating the product of the first correction loss value and the first coefficient as a first product result; calculating the product of the second correction loss value and the second coefficient as a second product result; and calculating the sum of the first product result and the second product result as a correction loss value.
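The combination described above is a plain weighted sum. A minimal sketch (the coefficient values are illustrative; the patent does not fix them):

```python
def correction_loss(first_correction_loss: float,
                    second_correction_loss: float,
                    first_coefficient: float = 1.0,
                    second_coefficient: float = 0.5) -> float:
    """Correction loss = first coefficient * first correction loss
    + second coefficient * second correction loss, as claimed."""
    first_product = first_coefficient * first_correction_loss
    second_product = second_coefficient * second_correction_loss
    return first_product + second_product

print(correction_loss(0.8, 0.2))  # 1.0 * 0.8 + 0.5 * 0.2 = 0.9
```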
Optionally, the first training module is further configured to input first sample information of the first data field into an embedded network of the first model to obtain a first embedded feature; processing the first embedded feature through a first task tower of the first model to obtain a first task feature; processing the first embedded feature through a second task tower of the first model to obtain a second task feature; and splicing the first task feature and the second task feature to obtain the first sample feature.
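The embedding-plus-two-towers path above can be sketched as follows. The dimensions and the use of plain linear layers with tanh are assumptions for illustration; the patent does not specify them:

```python
import numpy as np

rng = np.random.default_rng(1)

W_embed = rng.normal(size=(8, 16))    # embedded network of the first model
W_tower1 = rng.normal(size=(16, 4))   # first task tower
W_tower2 = rng.normal(size=(16, 4))   # second task tower

def first_sample_feature(first_sample_information: np.ndarray) -> np.ndarray:
    """Embed the first sample information, run both task towers on the
    embedded feature, and splice the two task features together."""
    embedded = first_sample_information @ W_embed   # first embedded feature
    task1 = np.tanh(embedded @ W_tower1)            # first task feature
    task2 = np.tanh(embedded @ W_tower2)            # second task feature
    return np.concatenate([task1, task2])           # spliced first sample feature

print(first_sample_feature(rng.normal(size=8)).shape)  # (8,)
```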
Optionally, the device further includes a second training module, configured to: obtain a first training sample, where the first training sample includes fifth sample information of the second data domain and a first sample value corresponding to the fifth sample information, the first sample value characterizing the degree of attention of the sample object corresponding to the fifth sample information to the fifth sample information; determine, through an embedded network in a fourth model, embedded features for the fifth sample information as second embedded features; process the second embedded features through a third task tower in the fourth model to obtain a first predicted sample value; determine a first target loss value according to the first predicted sample value and the first sample value; and train the fourth model according to the first target loss value to obtain the second model.
Optionally, the apparatus further includes a third training module, configured to obtain a second training sample, where the second training sample includes sixth sample information in a second data field, a second sample value corresponding to the sixth sample information, and a third sample value corresponding to the sixth sample information; the second sample value represents the probability that the sixth sample information is selected by the sample object corresponding to the sixth sample information, and the third sample value represents the probability that the sixth sample information is converted by the sample object corresponding to the sixth sample information; determining an embedded feature for the sixth sample information as a third embedded feature through an embedded network in the fifth model; processing the third embedded feature through a fourth task tower in the fifth model to obtain a second predicted sample value; processing the third embedded feature through a fifth task tower in the fifth model to obtain a third predicted sample value; and training the fifth model according to the second sample value, the third sample value, the second predicted sample value and the third predicted sample value to obtain a second model.
Optionally, the third training module is further configured to: determine a first loss value according to the second predicted sample value and the second sample value; determine a second loss value according to the third predicted sample value and the third sample value; calculate the product of the second sample value and the third sample value as a fourth sample value; calculate the product of the second predicted sample value and the third predicted sample value as a fourth predicted sample value; determine a third loss value according to the fourth predicted sample value and the fourth sample value; calculate the sum of the first loss value, the second loss value and the third loss value as a fourth loss value; calculate the sum of the fourth loss value and the first loss value as a second target loss value; and train the fifth model according to the second target loss value to obtain the second model.
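This three-part objective follows an ESMM-style pattern: the probability of being selected and converted is the product of the selection and conversion probabilities. A sketch under stated assumptions (the patent does not say which loss function is used; binary cross-entropy is assumed here, and the extra first-loss term mirrors the claimed sum):

```python
import numpy as np

def binary_cross_entropy(p, y, eps=1e-7):
    p = np.clip(p, eps, 1 - eps)
    return float(-(y * np.log(p) + (1 - y) * np.log(1 - p)).mean())

def second_target_loss(second_pred, third_pred, second_value, third_value):
    """Names follow the patent's numbering: second (selection) and third
    (conversion) values/predictions, with the fourth values formed as
    their products."""
    first_loss = binary_cross_entropy(second_pred, second_value)   # selection
    second_loss = binary_cross_entropy(third_pred, third_value)    # conversion
    fourth_pred = second_pred * third_pred        # fourth predicted sample value
    fourth_value = second_value * third_value     # fourth sample value
    third_loss = binary_cross_entropy(fourth_pred, fourth_value)   # joint event
    fourth_loss = first_loss + second_loss + third_loss
    return fourth_loss + first_loss               # second target loss value
```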
Optionally, the information selecting module is further configured to select, from the plurality of pieces of information to be selected, information to be selected for which a target value reaches a preset threshold, as target information for the target object; or selecting a preset number of pieces of information to be selected from the plurality of pieces of information to be selected according to the order of the target value from high to low, and taking the information to be selected as target information aiming at a target object.
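The two selection strategies above can be sketched directly (function names are illustrative):

```python
def select_by_threshold(target_values, preset_threshold):
    """First strategy: every candidate whose target value reaches the
    preset threshold becomes target information for the target object."""
    return [i for i, v in enumerate(target_values) if v >= preset_threshold]

def select_top_n(target_values, preset_number):
    """Second strategy: the preset number of candidates with the highest
    target values, in descending order of target value."""
    order = sorted(range(len(target_values)),
                   key=target_values.__getitem__, reverse=True)
    return order[:preset_number]

values = [0.9, 0.2, 0.7, 0.4]
print(select_by_threshold(values, 0.5))  # [0, 2]
print(select_top_n(values, 2))           # [0, 2]
```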
Optionally, the acquisition module is further configured to acquire the object features of the target object and the information features of the plurality of pieces of information to be selected in response to a received transmission request including the plurality of pieces of information to be selected; correspondingly, the information selection module is further configured to send the target information to the target object.
In a third aspect, an embodiment of the present application provides an electronic device, including a processor and a memory; one or more programs are stored in the memory and configured to be executed by the processor to implement the methods described above.
In a fourth aspect, embodiments of the present application provide a computer readable storage medium having program code stored therein, wherein the program code, when executed by a processor, performs the method described above.
In a fifth aspect, embodiments of the present application provide a computer program product or computer program comprising computer instructions stored in a computer readable storage medium. The processor of the electronic device reads the computer instructions from the computer readable storage medium and executes the computer instructions to cause the electronic device to perform the method described above.
According to the information selection method and device, the electronic device, and the storage medium of the present application, the first discriminator is obtained by training the second discriminator with the adversarial loss value, and the adversarial loss value characterizes the second discriminator's ability to distinguish information of the first data domain from information of the second data domain. The first discriminator therefore distinguishes the two domains well, so the prediction bias of the information selection model trained with the first discriminator is greatly reduced and the target values it assigns to information to be selected are more accurate. The target information determined from those target values thus better matches the needs of the target object, improving information delivery efficiency.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are needed in the description of the embodiments will be briefly described below, it being obvious that the drawings in the following description are only some embodiments of the present application, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 shows a schematic diagram of an application scenario proposed by an embodiment of the present application;
FIG. 2 is a flow chart of a method for information selection according to an embodiment of the present application;
FIG. 3 is a flow chart of a method for information selection according to yet another embodiment of the present application;
FIG. 4 is a schematic diagram showing the structure of a fourth model in an embodiment of the present application;
FIG. 5 is a flow chart of a method for information selection according to yet another embodiment of the present application;
FIG. 6 is a schematic diagram showing the structure of a fifth model in the embodiment of the present application;
FIG. 7 is a flow chart of a method for information selection according to yet another embodiment of the present application;
FIG. 8 is a schematic diagram of a first model in an embodiment of the application;
FIG. 9 is a schematic view showing the structure of a further first model in the embodiment of the present application;
FIG. 10 is a schematic diagram of a training process of a second arbiter in an embodiment of the application;
FIG. 11 is a flow chart illustrating a method for information selection according to yet another embodiment of the present application;
FIG. 12 is a schematic diagram showing a structure of a model to be corrected according to an embodiment of the present application;
FIG. 13 is a schematic diagram showing the structure of another model to be corrected according to the embodiment of the present application;
FIG. 14 is a schematic diagram showing a training process of the model to be corrected shown in FIG. 12;
FIG. 15 is a schematic diagram showing a training process of the model to be corrected shown in FIG. 13;
fig. 16 is a block diagram of an information selecting apparatus according to an embodiment of the present application;
fig. 17 shows a block diagram of an electronic device for performing the information selection method according to an embodiment of the present application.
Detailed Description
The following describes the technical solutions in the embodiments of the present application clearly and completely with reference to the accompanying drawings. The described embodiments are obviously only some, not all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present application without inventive effort fall within the scope of the present application.
In the following description, the terms "first", "second", and the like are used merely to distinguish between similar objects and do not imply a particular ordering. It should be understood that objects labeled "first", "second", and so on may be interchanged where permitted, so that the embodiments of the application described herein can be practiced in orders other than those illustrated or described herein.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. The terminology used herein is for the purpose of describing embodiments of the application only and is not intended to be limiting of the application.
It should be noted that "a plurality" herein means two or more. "And/or" describes an association relationship between associated objects and indicates that three relationships may exist; for example, "A and/or B" may mean: A exists alone, A and B exist together, or B exists alone. The character "/" generally indicates that the associated objects before and after it are in an "or" relationship.
In addition, in the embodiments of the present application, information such as a user's occupation, education, age, life stage (e.g. whether married), workplace, and preferred products must be obtained with the user's permission or consent, and the collection, use, processing, and storage of such information must comply with the regulations of the region where the user is located.
The application discloses an information selection method, an information selection device, electronic equipment and a storage medium, and relates to artificial intelligence machine learning, cloud technology and the like.
Artificial intelligence (Artificial Intelligence, AI) is the theory, method, technique, and application system that uses a digital computer, or a machine controlled by a digital computer, to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use knowledge to obtain optimal results. In other words, artificial intelligence is a comprehensive branch of computer science that attempts to understand the essence of intelligence and to produce a new kind of intelligent machine that can react in a way similar to human intelligence. Artificial intelligence studies the design principles and implementation methods of various intelligent machines, enabling machines to perceive, reason, and make decisions.
Artificial intelligence technology is a comprehensive discipline covering a wide range of fields, involving both hardware-level and software-level technologies. Basic artificial intelligence technologies generally include sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing, operation/interaction systems, and mechatronics. Artificial intelligence software technologies mainly include computer vision, speech processing, natural language processing, and machine learning/deep learning.
Machine Learning (ML) is a multi-field interdisciplinary subject involving probability theory, statistics, approximation theory, convex analysis, algorithmic complexity theory, and other disciplines. It studies how a computer can simulate or implement human learning behavior to acquire new knowledge or skills and reorganize existing knowledge structures to continuously improve its own performance. Machine learning is the core of artificial intelligence and the fundamental way to make computers intelligent; it is applied throughout all areas of artificial intelligence. Machine learning and deep learning typically include techniques such as artificial neural networks, belief networks, reinforcement learning, transfer learning, inductive learning, and learning from demonstrations.
Cloud technology (Cloud technology) refers to a hosting technology for integrating hardware, software, network and other series resources in a wide area network or a local area network to realize calculation, storage, processing and sharing of data.
Cloud technology is a general term for network technology, information technology, integration technology, management platform technology, application technology, and the like based on the cloud computing business model; it can form resource pools that are used on demand, flexibly and conveniently. Cloud computing technology will become an important support. Background services of technical network systems, such as video websites, picture websites, and other portal websites, require a large amount of computing and storage resources. With the development of the internet industry, each article may have its own identification mark in the future, which needs to be transmitted to a background system for logical processing; data of different levels will be processed separately, and all kinds of industry data require strong backend system support, which can only be realized through cloud computing.
Cloud storage (cloud storage) is a new concept that extends and develops in the concept of cloud computing, and a distributed cloud storage system (hereinafter referred to as a storage system for short) refers to a storage system that integrates a large number of storage devices (storage devices are also referred to as storage nodes) of various types in a network to work cooperatively through application software or application interfaces through functions such as cluster application, grid technology, and a distributed storage file system, so as to provide data storage and service access functions for the outside.
At present, the storage method of such a storage system is as follows: when logical volumes are created, each logical volume is allocated physical storage space, which may be composed of the disks of one or several storage devices. A client stores data on a certain logical volume, that is, the data is stored on a file system. The file system divides the data into multiple parts, each of which is an object; an object contains not only the data itself but also additional information such as a data identification (ID). The file system writes each object into the physical storage space of the logical volume and records the storage location information of each object, so that when the client requests access to the data, the file system can let the client access the data according to the recorded storage location of each object.
The process by which the storage system allocates physical storage space to a logical volume specifically includes: dividing physical storage space into stripes in advance according to estimates of the capacity of the objects to be stored in the logical volume (these estimates often leave a large margin relative to the capacity of the objects actually stored) and the configuration of the redundant array of independent disks (RAID, Redundant Array of Independent Disks). A logical volume can be understood as a stripe, and physical storage space is thereby allocated to the logical volume.
As shown in fig. 1, an application scenario to which the embodiment of the present application is applicable includes a terminal 20 and a server 10, where the terminal 20 and the server 10 are connected through a wired or wireless network. The terminal 20 may be a terminal device capable of page presentation, such as a smart phone, tablet computer, notebook computer, desktop computer, smart home appliance, vehicle-mounted terminal, aircraft, wearable device, or virtual reality device, or a device running applications capable of invoking page presentation (e.g., instant messaging applications, shopping applications, search applications, game applications, forum applications, map and traffic applications, etc.).
The server 10 may be an independent physical server, a server cluster or a distributed system formed by a plurality of physical servers, or a cloud server providing cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, CDNs (Content Delivery Network, content delivery networks), basic cloud computing services such as big data and artificial intelligent platforms, and the like. The server 10 may be used to provide services for applications running at the terminal 20.
The terminal 20 may send a selection request to the server 10, and the server 10 may obtain the object feature and the information features of the plurality of information to be selected according to the selection request, process the object feature and the information features of the plurality of information to be selected through a preset information selection model to obtain a target value of each information to be selected, determine target information from the plurality of information to be selected according to the target value of the information to be selected, and feed back the target information to the terminal 20.
The information to be selected refers to candidate information, and may be, for example, an advertisement to be selected, a commodity to be selected, or an application to be selected. The target object is the object at which the information to be selected is aimed, i.e., the push target corresponding to the information to be selected; for example, the target object may be a user.
In some possible implementations, the server 10 may obtain the information selection model according to the first discriminator and the second model, and deploy the information selection model locally on the server 10.
In another embodiment, the terminal 20 may receive a selection request, obtain the object feature and the information features of the plurality of pieces of information to be selected according to the selection request, process the object feature and the information features of the plurality of pieces of information to be selected through the information selection model issued by the server 10 to obtain a target value of each piece of information to be selected, and determine the target information from the plurality of pieces of information to be selected according to the target values of the information to be selected.
Alternatively, the server 10 may store the obtained information selection model in a cloud storage system, and when the information selection method of the present application is executed, the terminal 20 obtains the information selection model from the cloud storage system.
For convenience of description, in the following embodiments, information selection is described as an example performed by the electronic device.
Referring to fig. 2, fig. 2 is a flowchart illustrating a method for selecting information according to an embodiment of the present application, where the method may be applied to an electronic device, and the electronic device may be the terminal 20 or the server 10 in fig. 1, and the method includes:
s110, obtaining object characteristics of a target object and information characteristics of a plurality of pieces of information to be selected, wherein the plurality of pieces of information to be selected belong to information of a first data field.
The object characteristics of the target object may refer to characteristic information describing the target object, for example, when the target object is a user, the object characteristics of the target object may refer to user characteristics such as a user identifier, a user gender, a user age, a city in which the user is located, and occupation information of the user, and when the target object is a pet, for example, the object characteristics of the target object may refer to pet characteristics such as a pet variety, a pet identifier, a pet gender, a pet age, and a city in which the pet is located.
The data field may refer to a scene or a data category. For example, advertisement information under a class-A game may be used as the information of one data field; as another example, men's T-shirts may be used as the information of one data field.
Under different scenes and actual demands, the determined data fields may differ in size. For example, for one scene a1, advertisement information of all games may be used as the information of one data field; for another scene a2, advertisement information under an educational game may be used as the information of one data field; and for yet another scene a3, advertisement information of an educational game and a shooting game together may be used as the information of one data field.
The first data field may refer to a set data field, the first data field may include a plurality of pieces of information, and the information to be selected is determined from the plurality of pieces of information included in the first data field. For example, a plurality of newly generated pieces of information in the first data field may be determined as the information to be selected, or the pieces of information that occur most frequently (or are most frequently selected by the target object) in the first data field may be determined as the information to be selected.
Optionally, in the case that the first data field is set, the object characteristics of the target object may further include the target object's click rate, conversion rate, preference information, preference category, and the like with respect to the information of the first data field.
The information characteristics of a piece of information to be selected may include the activation count associated with the information, payment information of the information, the information content, the display position of the information, the display duration of the information, the click statistics of the information, and the like.
For example, when the target object is a user and the information to be selected is a game advertisement, the object features of the target object may include a user identifier, a user gender, a user age, a city in which the user is located, and game behavior statistics: user historical advertisement click frequency, user advertisement click preference, user game type preference and the like; the information features of the information to be selected may include advertisement display information, advertisement display position, advertisement display time, statistics of clicking conditions of advertisements, activity number of game b1, payment amount of game b1, activity number of game b2, payment amount of game b2, etc., where game b1 and game b2 belong to the same data field.
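As a purely illustrative sketch (every field name and value below is invented, not taken from the application), the object features and information features in this example could be gathered into a single model input:

```python
# Object features of the target object (user features); values are made up.
object_features = {
    "user_id": "u_001", "gender": "f", "age": 28, "city": "Shenzhen",
    "hist_ad_click_freq": 0.12, "game_type_preference": "puzzle",
}

# Information features of one piece of information to be selected (a game ad);
# game b1 and game b2 belong to the same data field, as in the example above.
info_features = {
    "ad_display_position": 2, "ad_display_time": "20:00",
    "ad_click_count": 1534,
    "game_b1_active": 52000, "game_b1_payment": 18000.0,
    "game_b2_active": 41000, "game_b2_payment": 9500.0,
}

# Both feature groups are merged into one input record for the model.
model_input = {**object_features, **info_features}
```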
S120, processing object characteristics of the target object and information characteristics of each piece of information to be selected through the information selection model to obtain a target value of each piece of information to be selected.
The target value of each piece of information to be selected represents the target object's attention to that information. The information selection model is obtained by training a first model through a first discriminator; the first model is configured with parameters of a second model, and the second model is obtained by training with sample information of a second data domain; the first discriminator is obtained by training a second discriminator with an adversarial loss value, which characterizes the second discriminator's ability to distinguish the information of the first data field from the information of the second data field. The target value may refer to, for example, the click rate and/or the conversion rate determined by the information selection model for the information to be selected.
Click-through rate refers to CTR; for example, for advertisements, the advertisement click-through rate is used in an advertisement system to indicate the probability that a user clicks on an advertisement.
Conversion rate refers to CVR; for example, for advertisements, the advertisement conversion rate is used in an advertisement system to indicate the probability that a user downloads, installs, registers, logs in, or pays within 72 hours after clicking on the advertisement.
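As a hedged illustration of how these two probabilities relate to raw counts (the helper name is an assumption, and the 72-hour attribution window follows the definition above), CTR can be estimated as clicks over impressions and CVR as attributed conversions over clicks:

```python
def estimate_ctr_cvr(impressions, clicks, conversions):
    """Estimate click-through rate and conversion rate from raw counts.

    impressions: number of times the advertisement was shown
    clicks:      number of clicks on the advertisement
    conversions: clicks followed by a download/install/register/log-in/
                 payment within the attribution window (e.g. 72 hours)
    """
    ctr = clicks / impressions if impressions else 0.0
    cvr = conversions / clicks if clicks else 0.0
    return ctr, cvr

# e.g. 10,000 impressions, 300 clicks, 30 attributed conversions
ctr, cvr = estimate_ctr_cvr(10_000, 300, 30)  # → (0.03, 0.1)
```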
For each piece of information to be selected, the object characteristics of the target object and the information characteristics of that piece of information are input into the information selection model to obtain the target value output by the information selection model for that piece of information; the target value may be the click rate of the information to be selected, the conversion rate of the information to be selected, or both the click rate and the conversion rate of the information to be selected.
The second data field refers to a data field different from the first data field. For example, if the first data field includes advertisement information of an educational game, the second data field may include advertisement information of a shooting game; as another example, the first data field may include clothing and the second data field may include electronic products.
The initial model and sample information of the second data domain can be obtained, and the initial model is trained with the sample information of the second data domain to obtain the second model for the second data domain; the initial model may be a parameter-initialized neural network model with target value (click rate and/or conversion rate) prediction capability, and the structure of the initial model is not particularly limited.
After obtaining the second model for the second data domain, an initial model which is not trained can be obtained, and parameter configuration is carried out on the initial model which is not trained through network parameters in the second model, so that the initial model after configuration is obtained as the first model.
A parameter-initialized discriminator, or a discriminator that already has some discrimination capability, may be obtained as the second discriminator. A first-data-domain sample feature of sample information of the first data domain is determined through the first model, and the discrimination result for that feature is determined through the second discriminator; likewise, a second-data-domain sample feature of sample information of the second data domain is determined through the first model, and the discrimination result for that feature is determined through the second discriminator. A loss value (such as a cross-entropy loss value) for the second discriminator is then determined from the discrimination results for the first-data-domain sample feature and the second-data-domain sample feature and used as the adversarial loss value, and the second discriminator is trained with the adversarial loss value to obtain the first discriminator.
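A minimal numerical sketch of this adversarial step, using binary cross-entropy over domain labels; the function names, the stand-in one-dimensional "features", and the fixed logistic discriminator are illustrative assumptions, not the patent's implementation:

```python
import math

def bce(label, prob, eps=1e-7):
    """Binary cross-entropy for a single example."""
    p = min(max(prob, eps), 1.0 - eps)
    return -(label * math.log(p) + (1 - label) * math.log(1 - p))

def adversarial_loss(discriminator, first_domain_feats, second_domain_feats):
    """Mean cross-entropy of the discriminator's domain predictions:
    label 1 for first-data-domain sample features, 0 for second-domain."""
    losses = [bce(1.0, discriminator(f)) for f in first_domain_feats]
    losses += [bce(0.0, discriminator(f)) for f in second_domain_feats]
    return sum(losses) / len(losses)

# Stand-in discriminator: a logistic score over a 1-D "sample feature".
disc = lambda f: 1.0 / (1.0 + math.exp(-f))
loss = adversarial_loss(disc, [2.0, 1.5], [-1.8, -2.2])
```

Training the second discriminator to lower this loss sharpens its ability to separate the two domains; training the first model so that the resulting discriminator can no longer separate them pushes the two domains' features together, which is the adversarial adaptation described above.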
A first-data-domain sample feature of sample information of the first data domain is determined through the first model, the discrimination result for that feature is determined through the first discriminator, a loss value (such as a cross-entropy loss value) is determined from that discrimination result, and the first model is trained with the determined loss value to obtain the information selection model.
It should be noted that, when the target value output by the information selection model includes only one of the click rate and the conversion rate, the information selection model is obtained by training according to the above process.
When the target value output by the information selection model includes both the click rate and the conversion rate, the first model, the second model, and the information selection model each include a task tower for the click rate and a task tower for the conversion rate; correspondingly, the first discriminator includes a first discriminator for the click rate and a first discriminator for the conversion rate, and the second discriminator includes a second discriminator for the click rate and a second discriminator for the conversion rate. The training process of the task tower, first discriminator, and second discriminator for the click rate is similar to that described above, as is the training process of the task tower, first discriminator, and second discriminator for the conversion rate, and neither is repeated here; when the training of both task towers is completed, the two trained task towers are combined to obtain the information selection model.
The information selection model may be called the target model, and the second model used to obtain the information selection model may be called the source model. The second data field used to train the second model may be called the source domain, which refers to the set of samples used to pre-train the prediction model, i.e., the samples used for model training before migration in the adversarial-adaptation method; correspondingly, the first data field targeted by the obtained information selection model may be called the target domain, which refers to the batch of samples used for final model testing and inference, i.e., the samples used for model training and testing after migration in the adversarial-adaptation method.
S130, selecting target information aiming at a target object from the plurality of pieces of information to be selected according to the target values of the plurality of pieces of information to be selected.
In the present application, the larger the target value of a piece of information to be selected, the higher the target object's attention to it; the smaller the target value, the lower the attention. After the target values of the pieces of information to be selected are obtained, the piece of information with the largest target value may be selected as the target information and pushed to the target object.
As an implementation manner, when the target value includes the click rate and the conversion rate, the click rate and the conversion rate may be weighted and summed to obtain a summation result, and one piece of information to be selected with the largest summation result is selected from the plurality of pieces of information to be selected as the target information, and the target information is pushed to the target object. The click rate and the weight of the conversion rate are not limited in the application.
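This weighted-sum selection can be sketched as follows; the weights and candidate numbers are placeholders, since the application does not fix them:

```python
def select_by_weighted_score(candidates, w_ctr=0.5, w_cvr=0.5):
    """candidates: list of (info_id, ctr, cvr) tuples. Returns the id of the
    piece of information to be selected with the largest weighted sum of
    click rate and conversion rate (the target information to push)."""
    return max(candidates, key=lambda c: w_ctr * c[1] + w_cvr * c[2])[0]

# Illustrative candidates: (id, predicted click rate, predicted conversion rate)
ads = [("ad_a", 0.040, 0.010), ("ad_b", 0.030, 0.025), ("ad_c", 0.020, 0.012)]
best = select_by_weighted_score(ads)  # → "ad_b" with these example numbers
```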
When the information selection model comprises a task tower corresponding to the click rate, the output target value is the click rate, when the information selection model comprises a task tower corresponding to the conversion rate, the output target value is the conversion rate, and when the information selection model comprises the task tower corresponding to the click rate and the conversion rate, the output target value is the click rate and the conversion rate.
As yet another embodiment, S130 may include: selecting information to be selected with target value reaching a preset threshold value from the plurality of information to be selected as target information aiming at a target object. The preset threshold may be determined according to the requirement and the first data field, which is not limited by the present application.
For information to be selected in the same data field, different contents included in the target value correspond to different preset thresholds; for example, the preset threshold corresponding to a click-rate target value differs from the preset threshold corresponding to a conversion-rate target value. Information to be selected whose target value reaches the preset threshold may be used as the target information.
If the target value includes the click rate and the conversion rate, the click rate and the conversion rate may be weighted and summed to obtain a summation result, and the information to be selected, of which the summation result reaches a corresponding preset threshold, is used as target information for the target object, where the preset threshold corresponding to the summation result may be different from the preset threshold corresponding to the click rate and the preset threshold corresponding to the conversion rate.
As yet another embodiment, S130 may include: and selecting a preset number of pieces of information to be selected from the plurality of pieces of information to be selected according to the order of the target value from high to low, and taking the information to be selected as target information aiming at a target object. The preset number may be determined according to the requirement and the first data field, and the present application is not limited thereto, for example, the preset number may be 3.
If the target value includes the click rate and the conversion rate, the click rate and the conversion rate can be weighted and summed to obtain a summation result, and a preset number of information to be selected from the plurality of information to be selected is selected according to the order of the summation result from high to low to be used as target information for the target object.
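The two selection strategies above — keeping candidates whose target value reaches a preset threshold, and taking a preset number of candidates in descending order of target value — can be sketched as follows (the threshold, the preset number, and the candidate scores are illustrative):

```python
def select_by_threshold(candidates, threshold):
    """Keep every candidate whose (possibly already combined) target value
    reaches the preset threshold."""
    return [info_id for info_id, score in candidates if score >= threshold]

def select_top_k(candidates, k=3):
    """Pick a preset number of candidates in order of target value, high to low."""
    ranked = sorted(candidates, key=lambda c: c[1], reverse=True)
    return [info_id for info_id, _ in ranked[:k]]

# Illustrative (id, target value) pairs, e.g. weighted CTR/CVR sums.
scored = [("ad_a", 0.025), ("ad_b", 0.0275), ("ad_c", 0.016), ("ad_d", 0.031)]
```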
Alternatively, in one possible implementation, S110 may include: responding to a received sending request comprising a plurality of pieces of information to be selected, and acquiring object characteristics of a target object and information characteristics of the plurality of pieces of information to be selected; accordingly, after S130, it may further include: and sending the target information to the target object.
The electronic device may receive a transmission request including a plurality of pieces of information to be selected sent by another server (a server different from the server 10 in fig. 1, e.g., one belonging to a different provider or scenario), and the electronic device then obtains, in response to the transmission request, the object characteristics of the target object and the information characteristics of the plurality of pieces of information to be selected.
After the target information is obtained, the target information is transmitted to the target object so as to present the target information to the target object. For example, when the target object is a user, target information is obtained, the target information is pushed to the terminal of the user, and the terminal of the user can output the target information so that the user can conveniently view the target information.
In this embodiment, the first discriminator is obtained by training the second discriminator with the adversarial loss value, which characterizes the second discriminator's ability to distinguish the information of the first data domain from the information of the second data domain; the first discriminator therefore distinguishes the two domains better. As a result, the prediction deviation of the information selection model obtained through training with the first discriminator is greatly reduced, the target values of the information to be selected determined by the information selection model are more accurate, the match between the target information determined from those target values and the target object's needs is improved, and information transmission efficiency is improved.
Referring to fig. 3, fig. 3 is a flowchart illustrating an information selecting method according to another embodiment of the present application, where the method may be applied to an electronic device, and the electronic device may be the terminal 20 or the server 10 in fig. 1, and the method includes:
s210, acquiring a first training sample.
The first training sample comprises fifth sample information of the second data field and a first sample value corresponding to the fifth sample information, and the first sample value represents the attention degree of a sample object corresponding to the fifth sample information.
Information in the second data field used for training the second model may be obtained as the fifth sample information; the object for which the fifth sample information is intended is the sample object. For example, if the fifth sample information is advertisement information of the c1 game, the sample object is a user who pays attention to advertisement information of the c1 game.
Wherein, the first sample value of the sample object for the fifth sample information may refer to a click rate or conversion rate of the sample object for the fifth sample information.
S220, determining an embedded feature aiming at fifth sample information through an embedded network in a fourth model as a second embedded feature; and processing the second embedded feature through a third task tower in the fourth model to obtain a first predicted sample value.
The fourth model refers to a basic model that can be used as the second model, and in this embodiment, the fourth model may include an embedded network, a third task tower, and a prediction network, as shown in fig. 4.
Information features corresponding to the fifth sample information and object features of the sample object aiming at the fifth sample information can be summarized to obtain summarized features, the summarized features are divided into dense features (numerical class features such as click rate) and sparse features (non-numerical class features such as names of the sample objects), and then the dense features and the sparse features are input into an embedded network in a fourth model to obtain second embedded features; and inputting the second embedded feature into a third task tower to obtain a result output by the third task tower, and inputting the result output by the third task tower into a prediction network to obtain a first prediction sample value.
In the present application, the task tower may include a multi-layer fully connected neural network, for example, a fully connected layer of 256 neural network nodes in a first layer, 64 neural network nodes in a second layer, and 10 neural network nodes in a third layer, and the output of each fully connected layer is processed using a ReLU activation function. The predictive network is typically a multi-layer fully connected neural network, for example, the first layer of the predictive network is a fully connected layer of 10 neural network nodes, using a ReLU activation function, and the second layer of the predictive network is a fully connected layer of 1 neural network node, using a Sigmoid activation function.
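As a purely illustrative sketch (the random initialization, pure-Python matrices, and embedding dimension are assumptions), the layer sizes quoted above — a 256/64/10 task tower with ReLU activations followed by a 10-node ReLU layer and a 1-node Sigmoid layer in the prediction network — can be wired into a forward pass as follows:

```python
import math, random

random.seed(0)  # deterministic illustrative weights

def dense(in_dim, out_dim):
    """A fully connected layer as (weights, biases) with small random init."""
    w = [[random.uniform(-0.05, 0.05) for _ in range(in_dim)]
         for _ in range(out_dim)]
    return w, [0.0] * out_dim

def forward(layer, x, act):
    """Apply one fully connected layer followed by an activation function."""
    w, b = layer
    return [act(sum(wi * xi for wi, xi in zip(row, x)) + bi)
            for row, bi in zip(w, b)]

relu = lambda v: max(0.0, v)
sigmoid = lambda v: 1.0 / (1.0 + math.exp(-v))

embedding_dim = 32                 # assumed size of the embedded feature

# Task tower: 256 -> 64 -> 10 nodes, ReLU after each fully connected layer.
tower = [dense(embedding_dim, 256), dense(256, 64), dense(64, 10)]
# Prediction network: 10 nodes with ReLU, then 1 node with Sigmoid.
pred_net = [dense(10, 10), dense(10, 1)]

x = [0.1] * embedding_dim          # a stand-in second embedded feature
for layer in tower:
    x = forward(layer, x, relu)
x = forward(pred_net[0], x, relu)
prob = forward(pred_net[1], x, sigmoid)[0]   # e.g. a predicted click rate
```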
S230, determining a first target loss value according to the first predicted sample value and the first sample value; and training the fourth model according to the first target loss value to obtain the second model.
The cross entropy loss value may be determined according to the first predicted sample value and the first sample value, and used as a first target loss value, and the fourth model may be trained according to the first target loss value to obtain the second model.
If the fourth model is directed to the click rate, the first predicted sample value and the first sample value are both click rates; the cross-entropy loss value can be determined according to the first predicted sample value and the first sample value and used as the first target loss value. The calculation process refers to formula one:

$$L_{ctr} = -\left[ y_{ctr} \log \hat{y}_{ctr} + \left( 1 - y_{ctr} \right) \log\left( 1 - \hat{y}_{ctr} \right) \right] \qquad \text{(formula one)}$$

where $L_{ctr}$ is the first target loss value corresponding to the click rate, $y_{ctr}$ is the first sample value, and $\hat{y}_{ctr}$ is the first predicted sample value.
If the fourth model is directed to the conversion rate, the first predicted sample value may include a predicted click rate, a predicted conversion rate, and a predicted click-conversion rate, and the first sample value may include a click rate, a conversion rate, and a click-conversion rate; the cross-entropy loss value corresponding to the click rate, the cross-entropy loss value corresponding to the conversion rate, and the cross-entropy loss value corresponding to the click-conversion rate can be determined, and the first target loss value then determined from the three cross-entropy loss values. The calculation process refers to formula two:

$$L_{cvr} = \beta_a \, \ell\!\left( y_{cvr}, \hat{y}_{cvr} \right) + \beta_c \, \ell\!\left( y_{ctr}, \hat{y}_{ctr} \right) + \beta_m \, \ell\!\left( y_{ctcvr}, \hat{y}_{ctcvr} \right) \qquad \text{(formula two)}$$

where $\ell(y, \hat{y}) = -\left[\, y \log \hat{y} + (1 - y) \log (1 - \hat{y}) \,\right]$ denotes the cross-entropy loss, $L_{cvr}$ is the first target loss value corresponding to the conversion rate, $\beta_a$ is the weight corresponding to the conversion rate, $\beta_c$ is the weight corresponding to the click rate, $\beta_m$ is the weight corresponding to the click-conversion rate, $y_{cvr}$ and $\hat{y}_{cvr}$ are the conversion rates in the first sample value and the first predicted sample value, $y_{ctr}$ and $\hat{y}_{ctr}$ are the click rates in the first sample value and the first predicted sample value, and $y_{ctcvr}$ and $\hat{y}_{ctcvr}$ are the click-conversion rates in the first sample value and the first predicted sample value.
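The weighted combination of the three cross-entropy losses described above can be sketched numerically; the β weights and the sample/prediction values below are illustrative assumptions:

```python
import math

def xent(y, p, eps=1e-7):
    """Cross-entropy loss for a single (label, prediction) pair."""
    p = min(max(p, eps), 1.0 - eps)
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

def first_target_loss_cvr(sample, pred, beta_a=1.0, beta_c=1.0, beta_m=1.0):
    """Weighted sum of the cross-entropy losses for the conversion rate,
    the click rate, and the click-conversion rate."""
    return (beta_a * xent(sample["cvr"], pred["cvr"])
            + beta_c * xent(sample["ctr"], pred["ctr"])
            + beta_m * xent(sample["ctcvr"], pred["ctcvr"]))

sample = {"ctr": 1.0, "cvr": 1.0, "ctcvr": 1.0}   # first sample values
pred   = {"ctr": 0.8, "cvr": 0.5, "ctcvr": 0.4}   # first predicted sample values
loss = first_target_loss_cvr(sample, pred)
```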
As shown in fig. 4, the embedded network, the third task tower and the prediction network in the fourth model may be trained by the first target loss value, so as to obtain a trained second model.
S240, obtaining object characteristics of a target object and information characteristics of a plurality of pieces of information to be selected; processing object characteristics of the target object and information characteristics of each piece of information to be selected through the information selection model to obtain a target value of each piece of information to be selected; and selecting target information aiming at the target object from the plurality of pieces of information to be selected according to the target values of the plurality of pieces of information to be selected.
The description of S240 refers to the descriptions of S110 to S130 above, and will not be repeated here.
In this embodiment, the fourth model is used as the basic model and trained to obtain the second model. The fourth model includes multi-layer fully connected neural networks and has good feature extraction capability, so the second model obtained by training also extracts features well; this improves the feature extraction capability of the information selection model obtained from the second model, and thus the selection effect of the information selection model.
Referring to fig. 5, fig. 5 shows a flowchart of a method for selecting information according to still another embodiment of the present application, where the method may be applied to an electronic device, and the electronic device may be the terminal 20 or the server 10 in fig. 1, and the method includes:
s310, acquiring a second training sample.
The second training sample includes sixth sample information of the second data field, a second sample value corresponding to the sixth sample information, and a third sample value corresponding to the sixth sample information; the second sample value characterizes the probability that the sixth sample information is selected by the sample object corresponding to the sixth sample information, and the third sample value characterizes the probability that the sixth sample information is converted by that sample object.
In this embodiment, the description of the sixth sample information and the sample object refers to the description of the fifth sample information and is not repeated here; the second sample value may be the click rate, and the third sample value may be the conversion rate.
S320, determining an embedded characteristic aiming at the sixth sample information as a third embedded characteristic through an embedded network in the fifth model.
The fifth model may be used as a basic model for training the second model, and in this embodiment, the structure of the fifth model may be as shown in fig. 6, and the fifth model may include an embedded network, a fourth task tower, a prediction network corresponding to the fourth task tower, a fifth task tower, and a prediction network corresponding to the fifth task tower.
The information features corresponding to the sixth sample information and the object features of the sample object aiming at the sixth sample information can be summarized to obtain summarized features, the summarized features are divided into dense features and sparse features, and then the dense features and the sparse features are input into an embedded network in a fifth model to obtain a third embedded feature.
S330, processing the third embedded feature through a fourth task tower in the fifth model to obtain a second predicted sample value; and processing the third embedded feature through a fifth task tower in the fifth model to obtain a third predicted sample value.
The third embedded feature can be input into the fourth task tower to obtain the result output by the fourth task tower, and that result is then input into the prediction network corresponding to the fourth task tower to obtain the second predicted sample value. Similarly, the third embedded feature can be input into the fifth task tower to obtain the result output by the fifth task tower, and that result is then input into the prediction network corresponding to the fifth task tower to obtain the third predicted sample value.
And S340, training the fifth model according to the second sample value, the third sample value, the second predicted sample value and the third predicted sample value to obtain a second model.
A loss value may be determined according to the second sample value, the third sample value, the second predicted sample value, and the third predicted sample value, and the fifth model trained with the determined loss value to obtain the second model. Specifically, a cross-entropy loss value may be determined from the second sample value and the second predicted sample value, another cross-entropy loss value may be determined from the third sample value and the third predicted sample value, and the two cross-entropy loss values may then be summed (possibly as a weighted sum; the weights are not specifically limited in the present application) to obtain a final loss value with which the fifth model is trained to obtain the second model.
As an embodiment, S340 may include: determining a first loss value according to the second predicted sample value and the second sample value; determining a second loss value according to the third predicted sample value and the third sample value; calculating the product of the second sample value and the third sample value as a fourth sample value; calculating the product of the second predicted sample value and the third predicted sample value as a fourth predicted sample value; determining a third loss value according to the fourth predicted sample value and the fourth sample value; calculating the sum of the first loss value, the second loss value, and the third loss value as a fourth loss value; calculating the sum of the fourth loss value and the first loss value as a second target loss value; and training the fifth model according to the second target loss value to obtain the second model.
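This embodiment of S340 can be sketched numerically, using cross-entropy for each loss value; the function names and numbers below are illustrative assumptions:

```python
import math

def xent(y, p, eps=1e-7):
    """Cross-entropy loss for a single (label, prediction) pair."""
    p = min(max(p, eps), 1.0 - eps)
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

def second_target_loss(y2, p2, y3, p3):
    """y2/p2: second sample value and its prediction (click rate);
    y3/p3: third sample value and its prediction (conversion rate)."""
    l1 = xent(y2, p2)              # first loss value
    l2 = xent(y3, p3)              # second loss value
    y4, p4 = y2 * y3, p2 * p3      # fourth sample / fourth predicted value
    l3 = xent(y4, p4)              # third loss value
    l4 = l1 + l2 + l3              # fourth loss value (conversion-rate loss)
    return l4 + l1                 # second target loss value

loss = second_target_loss(y2=1.0, p2=0.8, y3=1.0, p3=0.5)
```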
The first loss value may refer to a cross entropy loss value determined from the second predicted sample value and the second sample value, the second loss value may refer to a cross entropy loss value determined from the third predicted sample value and the third sample value, and the third loss value may be a cross entropy loss value determined from the fourth predicted sample value and the fourth sample value.
The first loss value, the second loss value, and the third loss value are summed to obtain the fourth loss value, which is the loss value corresponding to the conversion rate; the fourth loss value and the first loss value (the loss value for the click rate) are then summed to obtain the second target loss value.
The training process of the fifth model may be as shown in fig. 6, where the embedded network, the fourth task tower, the prediction network corresponding to the fourth task tower, and the prediction network corresponding to the fifth task tower in the fifth model are trained by using the second target loss value, so as to obtain a trained second model.
S350, acquiring object characteristics of a target object and information characteristics of a plurality of pieces of information to be selected; processing object characteristics of the target object and information characteristics of each piece of information to be selected through the information selection model to obtain a target value of each piece of information to be selected; and selecting target information aiming at the target object from the plurality of pieces of information to be selected according to the target values of the plurality of pieces of information to be selected.
The description of S350 refers to the descriptions of S110 to S130 above, and will not be repeated here.
In this embodiment, the fifth model is used as a basic model and trained to obtain the second model. The fifth model includes a multi-layer fully connected neural network and therefore has better feature extraction capability, and it is aimed at different tasks, namely the conversion rate and the click rate, so the second model obtained by training the fifth model has better feature extraction capability and adapts to different tasks. This improves the feature extraction capability of the information selection model obtained according to the second model, and further improves the selection effect of the information selection model.
Referring to fig. 7, fig. 7 is a flowchart illustrating an information selecting method according to still another embodiment of the present application, where the method may be applied to an electronic device, and the electronic device may be the terminal 20 or the server 10 in fig. 1, and the method includes:
S410, determining sample characteristics of first sample information of a first data field through a first model, and taking the sample characteristics as first sample characteristics; sample characteristics of second sample information of a second data field are determined by the first model as second sample characteristics.
The first sample information may refer to information in the first data field that can be used for training the information selection model, and the second sample information may refer to information in the second data field that can be used for training the information selection model.
Information features of information corresponding to the first sample information and object features of the sample object can be input into the first model, features extracted by the first model are obtained to serve as first sample features, and similarly, information features of information corresponding to the second sample information and object features of the sample object can be input into the first model, and features extracted by the first model are obtained to serve as second sample features.
As an embodiment, as shown in fig. 8, the first model may include a task tower for click through rate (or conversion rate), which is a sixth task tower, and an embedded network. The information features of the information corresponding to the first sample information and the object features of the sample object can be summarized to obtain summarized features, the summarized features are divided into dense features and sparse features, and then the dense features and the sparse features are input into an embedded network in the first model to obtain fourth embedded features; and inputting the fourth embedded feature into a sixth task tower to obtain a result output by the sixth task tower as the first sample feature.
Similarly, information features of information corresponding to the second sample information and object features of the sample object can be summarized to obtain summarized features, the summarized features are divided into dense features and sparse features, and then the dense features and the sparse features are input into an embedded network in the first model to obtain a fifth embedded feature; and inputting the fifth embedded feature into a sixth task tower to obtain a result output by the sixth task tower as a second sample feature.
As yet another embodiment, as shown in fig. 9, the first model may include a task tower for click rate, a task tower for conversion rate, and an embedded network, where the task tower corresponding to click rate is a first task tower and the task tower corresponding to conversion rate is a second task tower.
At this time, determining, by the first model, sample characteristics of the first sample information of the first data field as the first sample characteristics includes: inputting first sample information of a first data field into an embedded network of a first model to obtain a first embedded feature; processing the first embedded feature through a first task tower of the first model to obtain a first task feature; processing the first embedded feature through a second task tower of the first model to obtain a second task feature; and splicing the first task feature and the second task feature to obtain the first sample feature.
Accordingly, determining, by the first model, a sample feature of the second sample information of the second data field as the second sample feature comprises: inputting second sample information of a second data field into an embedded network of the first model to obtain a sixth embedded feature; processing the sixth embedded feature through a first task tower of the first model to obtain a third task feature; processing the sixth embedded feature through a second task tower of the first model to obtain a fourth task feature; and splicing the third task feature and the fourth task feature to obtain a second sample feature.
The information features of the information corresponding to the first sample information and the object features of the sample object can be summarized to obtain summarized features, the summarized features are divided into dense features and sparse features, and then the dense features and the sparse features are input into an embedded network in a first model to obtain first embedded features; and respectively inputting the first embedded features into the first task tower and the second task tower to obtain the first task features and the second task features.
Similarly, information features of information corresponding to the second sample information and object features of the sample object can be summarized to obtain summarized features, the summarized features are divided into dense features and sparse features, and then the dense features and the sparse features are input into an embedded network in the first model to obtain a sixth embedded feature; and respectively inputting the sixth embedded feature into the first task tower and the second task tower to obtain a third task feature and a fourth task feature.
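The shared-embedding/two-tower forward pass described above can be sketched in miniature. The layer sizes, the ReLU activations and the plain fully connected layers are illustrative assumptions:

```python
def relu(v):
    return [max(0.0, x) for x in v]

def dense(v, weights, bias):
    # Fully connected layer: out[j] = sum_i v[i] * weights[i][j] + bias[j].
    return [sum(v[i] * weights[i][j] for i in range(len(v))) + bias[j]
            for j in range(len(bias))]

def first_model_features(features, embed, first_tower, second_tower):
    e = relu(dense(features, *embed))   # embedded feature from the embedded network
    t1 = relu(dense(e, *first_tower))   # first task feature (click-rate tower)
    t2 = relu(dense(e, *second_tower))  # second task feature (conversion-rate tower)
    return t1 + t2                      # splice (concatenate) into the sample feature

# Tiny identity-weight example: the spliced feature is twice as long as each tower output.
ident = ([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0])
sample_feature = first_model_features([1.0, 2.0], ident, ident, ident)
```

The spliced output is what gets fed to the discriminator in the following steps.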
S420, determining an adversarial loss value according to the discrimination result of the second discriminator for the first sample feature and the discrimination result for the second sample feature; training the second discriminator according to the adversarial loss value to obtain a third discriminator.
The first sample feature may be input to the second discriminator to obtain a discrimination result of the second discriminator for the first sample feature, and the second sample feature may be input to the second discriminator to obtain a discrimination result of the second discriminator for the second sample feature; the adversarial loss value may then be determined according to the discrimination result of the second discriminator for the first sample feature and the discrimination result for the second sample feature.
Alternatively, the adversarial loss value may be determined according to formula three, which is as follows:

L_adv = E_{e_s ∼ P_s}[D(e_s)] − E_{e_t ∼ P_t}[D(e_t)] + λ E_{ê ∼ P_ê}[(‖∇_ê D(ê)‖₂ − 1)²]

wherein L_adv is the adversarial loss value, P_t refers to the distribution obeyed by the first sample feature, E refers to the calculated expectation, D(e_t) refers to the discrimination result of the second discriminator for the first sample feature, D(e_s) refers to the discrimination result of the second discriminator for the second sample feature, P_s refers to the distribution obeyed by the second sample feature, and the third term after the plus sign refers to the gradient penalty (ê denotes features sampled between the two feature distributions, and λ the penalty weight).
As shown in fig. 10, the discrimination result for the first sample feature and the discrimination result for the second sample feature may be determined by the second discriminator, the adversarial loss value may be determined according to the two discrimination results, and the second discriminator may be trained according to the adversarial loss value until the discrimination capability of the second discriminator reaches the discrimination target, whereupon the trained second discriminator is determined as the third discriminator. The discrimination target may be that the accuracy of the discrimination result of the second discriminator for the first sample feature and the accuracy of its discrimination result for the second sample feature reach respective accuracy thresholds; the accuracy threshold corresponding to the discrimination result of the first sample feature and that corresponding to the discrimination result of the second sample feature may be the same or different, which is not limited by the present application.
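A minimal sketch of an adversarial loss with a gradient penalty, assuming the standard WGAN-GP form. A linear discriminator D(e) = ⟨w, e⟩ is used here purely so that the input gradient, and hence the penalty, can be computed in closed form; the patent's discriminator need not be linear, and the sign convention is an assumption:

```python
import math

def discriminate(w, e):
    # Linear discriminator D(e) = <w, e>; its gradient w.r.t. e is just w.
    return sum(wi * ei for wi, ei in zip(w, e))

def adversarial_loss(w, first_feats, second_feats, lam=10.0):
    # E[D(e_s)] - E[D(e_t)] + lam * (||grad D||_2 - 1)^2
    d_t = sum(discriminate(w, e) for e in first_feats) / len(first_feats)
    d_s = sum(discriminate(w, e) for e in second_feats) / len(second_feats)
    grad_norm = math.sqrt(sum(wi * wi for wi in w))  # exact for a linear D
    return d_s - d_t + lam * (grad_norm - 1.0) ** 2
```

With a nonlinear discriminator the penalty would instead be evaluated by automatic differentiation at features interpolated between the two distributions; with a linear D the gradient is constant, so that detail vanishes.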
S430, training the first model according to the discrimination result of the third discriminator for the first sample features to obtain a third model; then the steps of acquiring the third model as a new first model, acquiring the third discriminator as a new second discriminator, acquiring third sample information of the first data field as new first sample information, and acquiring fourth sample information of the second data field as new second sample information are performed, and the process returns to the step of determining, through the first model, the sample characteristics of the first sample information of the first data field, until a preset condition is met.
The first sample feature can be input into a third discriminator to obtain a discriminating result of the third discriminator aiming at the first sample feature, then a loss value is determined according to the discriminating result of the third discriminator aiming at the first sample feature, and then the first model is trained through the determined loss value to obtain a third model.
In the present application, the preset condition may be that the number of cycles (training the second discriminator once and the first model once counts as one cycle) reaches a set first target number, and/or that the improvement in the model identification accuracy of the third model over multiple consecutive iterations is smaller than a set first accuracy. The first target number and the first accuracy are not limited in this embodiment; for example, the first target number may be 10 and the first accuracy may be 0.00001.
In the application, the model identification accuracy of each model can be measured by the AUC (Area Under the ROC Curve). The AUC is used for evaluating a classification model and indicates the probability that the predictor ranks a positive case above a negative case; a higher AUC value of the prediction result means that the predictor has a better prediction effect.
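AUC as described, the probability that a positive case is ranked above a negative case, can be computed directly from that definition. This is a pairwise sketch for illustration; production code would use a sorted-rank formulation for efficiency:

```python
def auc(labels, scores):
    # Probability that a randomly chosen positive is scored above a
    # randomly chosen negative (ties count as 0.5).
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

An AUC of 1.0 means every positive outranks every negative; 0.5 is chance level.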
The second discriminator may be trained first to obtain the third discriminator, the first model may then be trained through the third discriminator to obtain the trained third model, the third discriminator is then taken as a new second discriminator and the third model as a new first model, and the training process is repeated until the preset condition is met, whereupon the training is determined to be finished.
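The alternating cycle, discriminator once and then model once, until the cycle budget or a minimum accuracy gain is hit, can be outlined as follows. The single-step gain check simplifies the "consecutive iterations" criterion, and all names are illustrative:

```python
def alternating_training(train_discriminator, train_model,
                         max_cycles=10, min_gain=1e-5):
    # One cycle = train the discriminator once, then the model once.
    # Stops when the cycle budget is reached or accuracy stops improving.
    history = []
    for cycle in range(max_cycles):
        discriminator = train_discriminator()  # second -> third discriminator
        accuracy = train_model(discriminator)  # first -> third model
        if history and accuracy - history[-1] < min_gain:
            history.append(accuracy)
            break
        history.append(accuracy)
    return history
```

The callables stand in for the real training routines; the returned history is the per-cycle model identification accuracy (e.g. AUC).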
Optionally, in this embodiment, training the first model according to the discrimination result of the third discriminator for the first sample feature to obtain a third model includes: determining a discrimination result aiming at the first sample characteristic through a third discriminator as an intermediate discrimination result of the first sample characteristic; determining a selected loss value according to the intermediate discrimination result of the first sample characteristic, wherein the selected loss value characterizes the characteristic extraction capacity of the first model on the sample information of the first data domain; and training the first model according to the selected loss value to obtain a third model.
The first sample feature of the first sample information of the first data field determined by the first model can be input into the third discriminator to obtain the discrimination result of the third discriminator for the first sample feature, which is used as the intermediate discrimination result of the first sample feature, and then the selected loss value is determined according to formula four, which is as follows:

L_e = −E_{e_t ∼ P_t}[D(e_t)]

wherein L_e refers to the selected loss value, and D(e_t) refers to the intermediate discrimination result of the third discriminator for the first sample feature.
After the loss value is selected, training the first model by the loss value until the iteration number reaches a set second target number, and stopping the training process to obtain a third model, wherein the second target number can be 50, for example.
As an implementation manner, when the first model structure is as shown in fig. 8, the intermediate discrimination result for the first sample feature may be determined by the third discriminator, the selected loss value may be determined according to the intermediate discrimination result, and the embedded network and the sixth task tower of the first model may be trained through the selected loss value to obtain the third model; in this training process, the prediction network in the first model may not participate in training, and the prediction network of the second model may be used directly.
As still another embodiment, when the first model structure is as shown in fig. 9, the intermediate discrimination result for the first sample feature may be determined by the third discriminator, the selected loss value may be determined according to the intermediate discrimination result, and the embedded network, the first task tower and the second task tower of the first model may be trained through the selected loss value to obtain the third model; in this training process, the prediction network in the first model may not participate in training, and the prediction network of the second model may be used directly.
S440, acquiring a third model obtained in the last training process as an information selection model, and acquiring a third discriminator obtained in the last training process as a first discriminator.
And stopping the training process when the preset condition is met, acquiring a third model obtained in the last training process as an information selection model, and acquiring a third discriminator obtained in the last training process as a first discriminator.
S450, obtaining object characteristics of a target object and information characteristics of a plurality of pieces of information to be selected; processing object characteristics of the target object and information characteristics of each piece of information to be selected through the information selection model to obtain a target value of each piece of information to be selected; and selecting target information aiming at the target object from the plurality of pieces of information to be selected according to the target values of the plurality of pieces of information to be selected.
The description of S450 refers to the descriptions of S110 to S130 above, and will not be repeated here.
In this embodiment, the information selection model is obtained through the countermeasure network training composed of the first model and the first discriminator, and the training process of the information selection model is sufficient, so that the information selection model has better recognition capability for the information of the first data domain, and the recognition effect of the information selection model is improved.
Referring to fig. 11, fig. 11 is a flowchart illustrating an information selecting method according to still another embodiment of the present application, where the method may be applied to an electronic device, and the electronic device may be the terminal 20 or the server 10 in fig. 1, and the method includes:
S510, determining first sample characteristics corresponding to the first sample information and second sample characteristics corresponding to the second sample information; training the second discriminator and the first model according to the first sample characteristics and the second sample characteristics to obtain a third model and a third discriminator; then the steps of acquiring the third model as a new first model, acquiring the third discriminator as a new second discriminator, acquiring third sample information of the first data field as new first sample information, and acquiring fourth sample information of the second data field as new second sample information are performed, and the process returns to the step of determining, through the first model, the sample characteristics of the first sample information of the first data field, until a preset condition is met.
The description of S510 refers to the descriptions of S410-S430 above, and will not be repeated here.
S520, acquiring a third model obtained in the last training process as a model to be corrected; obtaining a correction sample; and determining a predicted correction sample value for the correction sample information through the model to be corrected.
The correction sample comprises correction sample information of the first data field and correction sample values corresponding to the correction sample information, and the correction sample values represent the attention degree of a sample object corresponding to the correction sample information. In this embodiment, the corrected sample value may refer to a click rate and/or a conversion rate of the corrected sample information, and description of the corrected sample information refers to description of the first sample information above, which is not repeated.
In the present embodiment, when the model to be corrected is for one of the click rate and the conversion rate, the model to be corrected may include an embedded network, a seventh task tower, and a prediction network as shown in fig. 12.
Inputting correction sample information into an embedded network of a model to be corrected to obtain a seventh embedded feature; and processing the seventh embedded feature through a seventh task tower of the model to be corrected to obtain a processed result, and inputting the processed result of the seventh task tower into a prediction network to obtain a predicted corrected sample value, wherein the predicted corrected sample value is a predicted click rate or a predicted conversion rate.
In this embodiment, when the model to be corrected is for both the click rate and the conversion rate, the model to be corrected may include an embedded network, an eighth task tower, a prediction network corresponding to the eighth task tower, a ninth task tower, and a prediction network corresponding to the ninth task tower, as shown in fig. 13.
Inputting correction sample information into an embedded network of a model to be corrected to obtain an eighth embedded feature; and processing the eighth embedded feature through an eighth task tower of the model to be corrected to obtain a processed result, inputting the processed result of the eighth task tower into a corresponding prediction network to obtain one of the predicted corrected sample values, wherein the predicted corrected sample value is the predicted click rate, and similarly, processing the eighth embedded feature through a ninth task tower of the model to be corrected to obtain a processed result, inputting the processed result of the ninth task tower into the corresponding prediction network to obtain the other predicted corrected sample value, and the predicted corrected sample value is the predicted conversion rate.
S530, determining a loss value as a correction loss value according to the predicted correction sample value and the correction sample value; training the model to be corrected according to the correction loss value to obtain an information selection model.
After obtaining the predicted corrected sample value and the corrected sample value, a loss value of the predicted corrected sample value and the corrected sample value may be determined, and the corrected loss value may be a cross entropy loss value.
For example, when the predicted corrected sample value includes one of the predicted click rate and the predicted conversion rate, calculating a cross entropy loss value of the predicted corrected sample value and the corresponding corrected sample value as the corrected loss value; for another example, when the predicted correction sample value includes both the predicted click rate and the predicted conversion rate, the cross entropy loss value corresponding to the click rate and the cross entropy loss value corresponding to the conversion rate are calculated respectively, and the two cross entropy loss values are weighted and summed to obtain the correction loss value, wherein the weights of the two cross entropy loss values are not specifically limited in the application.
When the number of iterations reaches the set third target number, or the improvement in the model identification accuracy of the model to be corrected over multiple consecutive iterations is smaller than the set second accuracy, the training is stopped; the third target number and the second accuracy are not limited in this embodiment.
As one embodiment, the method for determining a correction loss value may include determining a first correction loss value based on a predicted correction sample value and a correction sample value; determining sample characteristics aiming at the sample information to be corrected through the model to be corrected, and taking the sample characteristics as corrected sample characteristics; determining a discrimination result for the corrected sample feature by a first discriminator; determining a second correction loss value according to the discrimination result of the correction sample characteristics; and determining a correction loss value according to the first correction loss value and the second correction loss value.
When the model to be corrected is aimed at the click rate, the first correction loss value can be calculated according to formula five, which is as follows:

L₁^ctr = −[y^ctr log ŷ^ctr + (1 − y^ctr) log(1 − ŷ^ctr)]

wherein L₁^ctr is the first correction loss value corresponding to the click rate, y^ctr is the correction sample value, and ŷ^ctr is the predicted correction sample value.
When the model to be corrected is aimed at the conversion rate, the first correction loss value can be calculated according to formula six, which is as follows:

L₁^cvr = CE(y^cvr, ŷ^cvr) + CE(y^ctr, ŷ^ctr) + CE(y^ctcvr, ŷ^ctcvr)

wherein CE(y, ŷ) = −[y log ŷ + (1 − y) log(1 − ŷ)] denotes the cross entropy, L₁^cvr is the first correction loss value corresponding to the conversion rate, y^cvr is the conversion rate in the correction sample values, ŷ^cvr is the conversion rate in the predicted correction sample values, y^ctr is the click rate in the correction sample values, ŷ^ctr is the click rate in the predicted correction sample values, y^ctcvr is the click conversion rate in the correction sample values, and ŷ^ctcvr is the click conversion rate in the predicted correction sample values.
When the model to be corrected is one of the click rate and the conversion rate, the model to be corrected is shown in fig. 12, and correction sample information can be input into an embedded network of the model to be corrected to obtain a seventh embedded feature; and processing the seventh embedded feature through a seventh task tower of the model to be corrected to obtain a processed result, and inputting the processed result of the seventh task tower into a first discriminator to obtain a discriminating result aiming at the corrected sample feature.
When the model to be corrected is aimed at both the click rate and the conversion rate, the model to be corrected is shown in fig. 13, and correction sample information can be input into an embedded network of the model to be corrected to obtain an eighth embedded feature; and processing the eighth embedded feature through an eighth task tower of the model to be corrected to obtain a processed result, inputting the processed result of the eighth task tower into the first discriminator to obtain a discrimination result aiming at the click rate, and similarly, processing the eighth embedded feature through a ninth task tower of the model to be corrected to obtain a processed result, and inputting the processed result of the ninth task tower into the first discriminator to obtain a discrimination result aiming at the conversion rate.
After the discrimination result for the correction sample features is obtained, the second correction loss value is calculated; when the model to be corrected is aimed at the click rate or the conversion rate, the second correction loss value is determined according to formula seven, which is as follows:

L₂ = −E[D_t(e_t)]

wherein L₂ is the second correction loss value, and D_t(e_t) is the discrimination result for the correction sample features.
After obtaining the first corrective loss value and the second corrective loss value, a first coefficient for the first corrective loss value and a second coefficient for the second corrective loss value may be obtained; calculating the product of the first correction loss value and the first coefficient as a first product result; calculating the product of the second correction loss value and the second coefficient as a second product result; and calculating the sum of the first product result and the second product result as a correction loss value. The specific values of the first coefficient and the second coefficient are not limited, and the user can set the values according to the requirements.
When the model to be corrected is aimed at the click rate, the process of calculating the correction loss value refers to formula eight, which is as follows:

L^ctr = λ_c L₁^ctr + λ_e L₂

wherein L^ctr is the correction loss value corresponding to the click rate, λ_e is the second coefficient, and λ_c is the first coefficient for the click rate.
When the model to be corrected is aimed at the conversion rate, the process of calculating the correction loss value refers to formula nine, which is as follows:

L^cvr = λ_a L₁^cvr + λ_e L₂

wherein L^cvr is the correction loss value corresponding to the conversion rate, and λ_a is the first coefficient for the conversion rate. That is, in the present application, the first coefficients corresponding to the click rate and the conversion rate may be different.
When the model to be corrected is one of the click rate and the conversion rate, the training process of the model to be corrected is shown in fig. 14, and the model to be corrected is the model to be corrected shown in fig. 12. After the first correction loss value and the second correction loss value are determined, the correction loss value is determined through the first correction loss value and the second correction loss value, and then the embedded network, the seventh task tower and the prediction network of the model to be corrected are trained through the correction loss value, so that the information selection model is obtained.
When the model to be corrected is aimed at the click rate and the conversion rate, the training process of the model to be corrected is shown in fig. 15, and the model to be corrected is the model to be corrected shown in fig. 13. And determining correction loss values for the click rate and the conversion rate according to a formula eight and a formula nine, summing the correction loss values for the click rate and the conversion rate to obtain a summed correction loss value, and training a model to be corrected according to the summed correction loss value, wherein the model to be corrected can comprise an embedded network, an eighth task tower, a prediction network corresponding to the eighth task tower, a ninth task tower and a prediction network corresponding to the ninth task tower to obtain an information selection model.
It should be noted that the information selection model may be trained for the scenario in which it is aimed at both the click rate and the conversion rate; the task tower portion (including the task tower and the prediction network) for the click rate may then be deleted from the obtained information selection model to obtain an information selection model for the conversion rate, or the task tower portion (including the task tower and the prediction network) for the conversion rate may be deleted to obtain an information selection model for the click rate.
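Deleting one task-tower branch from a jointly trained model amounts to keeping the shared components plus one branch. A dict-based sketch with purely illustrative component names:

```python
def specialize(model, keep_prefix):
    # Keep shared components plus the task tower / prediction network
    # whose names start with the kept objective's prefix.
    shared = {"embedding"}
    return {name: part for name, part in model.items()
            if name in shared or name.startswith(keep_prefix)}

full_model = {
    "embedding": "...",
    "ctr_tower": "...", "ctr_predictor": "...",
    "cvr_tower": "...", "cvr_predictor": "...",
}
cvr_only = specialize(full_model, "cvr")  # click-rate branch deleted
ctr_only = specialize(full_model, "ctr")  # conversion-rate branch deleted
```

The shared embedded network survives in both specializations, which is what makes training once and pruning afterwards workable.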
S540, obtaining object characteristics of a target object and information characteristics of a plurality of pieces of information to be selected; processing object characteristics of the target object and information characteristics of each piece of information to be selected through the information selection model to obtain a target value of each piece of information to be selected; and selecting target information aiming at the target object from the plurality of pieces of information to be selected according to the target values of the plurality of pieces of information to be selected.
The description of S540 refers to the descriptions of S110 to S130 above, and will not be repeated here.
In this embodiment, after the last cycle process is finished, the third model obtained in the last cycle process is corrected and trained through the correction loss value, the trained model is used as the information selection model, after the information selection model is corrected and trained, the information selection capability of the information selection model for the information of the first data domain is better, the selection capability of the information selection model is further improved, the obtained target value for the information to be selected is more accurate, and the information transmission efficiency is improved.
In order to more clearly explain the technical solution of the present application, the information selection method of the present application is explained below in connection with an exemplary scenario. In this scenario, the object is a user (the sample object is also a user), the information in the first data field is game advertisement information in the shooting game category, the information in the second data field is game advertisement information in the action game category, and the server (for example, the server 10 in fig. 1) is used as the execution subject. The batch_size (number of samples per batch) is 8192, num_epochs (number of iterations) is 50, learning_rate (learning rate) is 0.00001, and early_stop (early-stop iteration number) is 10; if the precision on the test set does not improve over 20 consecutive iterations, the model stops training in advance. The above refers to the training process of the target field.
In addition, the optimizer used for model training is the Adam optimizer [2], with the optimizer parameters set to β1=0.9, β2=1e-8.
1. Training of a first model
In this scenario, a fifth model for click rate and conversion rate of game advertisement information is constructed, and the structure of the fifth model is described with reference to fig. 6 in the above embodiment, which is not repeated.
A plurality of pieces of game advertisement information under the action game category, together with the click rate and conversion rate corresponding to each piece, are obtained as a second training sample. The game advertisement features and user features corresponding to each piece of game advertisement information are input into the embedded network of the fifth model to obtain corresponding embedded features; the embedded features are input into the task tower corresponding to the click rate (namely the fourth task tower in the embodiment) and the prediction network corresponding to the click rate to obtain a predicted click rate, and into the task tower corresponding to the conversion rate (namely the fifth task tower in the embodiment) and the prediction network corresponding to the conversion rate to obtain a predicted conversion rate.
The product of the click rate and the conversion rate of the second training sample is calculated to obtain the click conversion rate, and the product of the predicted click rate and the predicted conversion rate is calculated to obtain the predicted click conversion rate. A first loss value is determined according to the predicted click rate and the click rate, a second loss value according to the predicted conversion rate and the conversion rate, and a third loss value according to the predicted click conversion rate and the click conversion rate. The first loss value, the second loss value and the third loss value are summed to obtain a fourth loss value, which is taken as the loss value of the conversion rate. The first loss value and the fourth loss value are then summed, and the embedded network of the fifth model, the fourth task tower and its prediction network, and the fifth task tower and its prediction network are trained according to the summation result (namely the second target loss value in the embodiment), thereby obtaining a trained second model.
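This loss construction can be sketched as follows. The use of binary cross-entropy for each of the three component losses is an assumption (the patent does not reproduce the exact formulas), and `second_target_loss` is an illustrative name; the arithmetic of the fourth loss and the second target loss value follows the description above.

```python
import math

def bce(p, y):
    """Binary cross-entropy of one predicted probability p against label y
    (assumed component loss; labels here are rates, i.e. soft labels)."""
    eps = 1e-12
    return -(y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps))

def second_target_loss(pred_ctr, ctr, pred_cvr, cvr):
    """Combine the CTR, CVR and click-conversion (CTCVR) losses as described:
    the fourth loss sums the first three, and the second target loss value
    sums the fourth loss with the first (CTR) loss again."""
    first_loss = bce(pred_ctr, ctr)                   # click-rate loss
    second_loss = bce(pred_cvr, cvr)                  # conversion-rate loss
    third_loss = bce(pred_ctr * pred_cvr, ctr * cvr)  # click-conversion loss
    fourth_loss = first_loss + second_loss + third_loss
    return first_loss + fourth_loss                   # second target loss value
```

Training the CTCVR product rather than the conversion rate alone lets the conversion tower learn from all impressions, which is the usual motivation for this multi-task layout.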
And re-acquiring a fifth model which is not trained, configuring the fifth model through parameters of the second model, and obtaining the configured model as a first model, wherein the structure of the first model is shown in fig. 9 in the embodiment.
2. Acquisition of a model to be corrected
The method comprises the steps of acquiring a discriminator in an initial state as a second discriminator, acquiring a plurality of pieces of game advertisement information in a shooting game category as first sample information, and acquiring a plurality of pieces of game advertisement information in an action game category as second sample information.
Inputting game advertisement features and user features corresponding to the first sample information into an embedded network of a first model to obtain corresponding embedded features, inputting the embedded features into a task tower corresponding to the click rate in the first model (namely the first task tower) to obtain first task features, inputting the embedded features into a task tower corresponding to the conversion rate in the first model (namely the second task tower) to obtain second task features, and splicing the first task features and the second task features to obtain the first sample features.
Similarly, the game advertisement feature and the user feature corresponding to the second sample information are input into an embedded network of the first model to obtain a corresponding embedded feature, the embedded feature is input into a task tower corresponding to the click rate in the first model (namely the first task tower) to obtain a third task feature, the embedded feature is input into a task tower corresponding to the conversion rate in the first model (namely the second task tower) to obtain a fourth task feature, and the third task feature and the fourth task feature are spliced to obtain the second sample feature.
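The feature construction in the two paragraphs above can be sketched as follows; `embed_net`, `ctr_tower` and `cvr_tower` are illustrative callables standing in for the first model's sub-networks, and features are represented as plain lists.

```python
def sample_feature(embed_net, ctr_tower, cvr_tower, ad_features, user_features):
    """Embed the concatenated advertisement and user features, run the
    embedding through both task towers, and splice (concatenate) the two
    tower outputs into the sample feature handed to the discriminator."""
    embedded = embed_net(ad_features + user_features)
    first_task = ctr_tower(embedded)   # click-rate task-tower feature
    second_task = cvr_tower(embedded)  # conversion-rate task-tower feature
    return first_task + second_task    # spliced sample feature
```

The same function produces both the first sample feature (shooting-game inputs) and the second sample feature (action-game inputs), since both domains pass through the same first model.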
The first sample feature is input into the second discriminator to obtain a discrimination result for the first sample feature, and the second sample feature is input into the second discriminator to obtain a discrimination result for the second sample feature. The adversarial loss value is calculated according to formula three, and the second discriminator is trained through the adversarial loss value to obtain a third discriminator.
The first sample feature is input into the third discriminator to obtain an intermediate discrimination result for the first sample feature, the selected loss value is determined through formula four, and the first model is trained through the selected loss value to obtain a third model.
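Formulas three and four are not reproduced in this excerpt, so the following is an assumed GAN-style stand-in for the two losses: the discriminator learns to separate the domains, while the first model is trained so that first-domain features can no longer be separated.

```python
import math

def adversarial_loss(scores_first_domain, scores_second_domain):
    """Assumed form of formula three: the second discriminator should score
    first-domain (shooting-game) sample features as 1 and second-domain
    (action-game) sample features as 0; lower loss means sharper separation."""
    eps = 1e-12
    loss = sum(-math.log(s + eps) for s in scores_first_domain)
    loss += sum(-math.log(1.0 - s + eps) for s in scores_second_domain)
    return loss / (len(scores_first_domain) + len(scores_second_domain))

def selected_loss(intermediate_scores):
    """Assumed form of formula four: the first model is penalized when the
    third discriminator still confidently labels first-domain features as
    first-domain, pushing the feature distributions of the two domains together."""
    eps = 1e-12
    return sum(-math.log(1.0 - s + eps) for s in intermediate_scores) / len(intermediate_scores)
```

Minimizing `selected_loss` while the discriminator minimizes `adversarial_loss` is the standard domain-adversarial arrangement; the patent's exact formulas may differ.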
The third model is obtained as a new first model, the third discriminator is obtained as a new second discriminator, game advertisement information (which can be different from or the same as the original first sample information) under the shooting game category is obtained as new first sample information, and game advertisement information (which can be different from or the same as the original second sample information) under the action game category is obtained as new second sample information, and the process of training the second discriminator and the first model is repeated until the preset condition is met.
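The repeated cycle above can be sketched as an alternating loop; all callables are illustrative stand-ins for the actual training steps, and iterating over a finite batch list stands in for "until the preset condition is met".

```python
def adversarial_training(model, discriminator, sample_batches, update_disc, update_model):
    """Each round: extract features from both domains with the current model,
    train the discriminator into a new (third) discriminator, train the model
    against it into a new (third) model, and carry both into the next round."""
    for first_samples, second_samples in sample_batches:
        first_feats = [model(s) for s in first_samples]    # shooting-game features
        second_feats = [model(s) for s in second_samples]  # action-game features
        discriminator = update_disc(discriminator, first_feats, second_feats)
        model = update_model(model, discriminator, first_feats)
    return model, discriminator  # model to be corrected, first discriminator
```

Note that fresh sample information may be drawn each round (the text allows the new samples to differ from or equal the previous ones), which the per-round batch pairs model directly.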
A third model after the last iteration in the last cycle is obtained and used as a model to be corrected, and the model to be corrected is shown in fig. 13.
3. Acquisition of information selection model
And obtaining the model to be corrected obtained in the process, and obtaining game advertisement information (which can be different from the first sample information) under the shooting game category and the click rate and conversion rate of the game advertisement information as correction samples.
And inputting the game advertisement features and the user features corresponding to the correction sample information into an embedded network of the model to be corrected to obtain corresponding embedded features (namely the eighth embedded features), inputting the embedded features into a task tower corresponding to the click rate (namely the eighth task tower) and a prediction network corresponding to the click rate to obtain a predicted click rate, and inputting the embedded features into a task tower corresponding to the conversion rate (namely the ninth task tower) and a prediction network corresponding to the conversion rate to obtain a predicted conversion rate.
According to the predicted click rate and the click rate, the first correction loss value corresponding to the click rate is calculated according to formula five; according to the predicted click rate, the conversion rate and the predicted conversion rate, the first correction loss value corresponding to the conversion rate is calculated according to formula six.
Meanwhile, the result output by the task tower corresponding to the click rate in the model to be corrected can be input into the first discriminator to obtain the discrimination result corresponding to the click rate, and the result output by the task tower corresponding to the conversion rate in the model to be corrected is input into the first discriminator to obtain the discrimination result corresponding to the conversion rate.
And calculating a second correction loss value corresponding to the click rate according to a judgment result corresponding to the click rate and a formula seven, and calculating the second correction loss value corresponding to the conversion rate according to the judgment result corresponding to the conversion rate.
Then, the correction loss value for the click rate is calculated according to formula eight, and the correction loss value for the conversion rate is calculated according to formula nine. The two correction loss values are summed to obtain a summed correction loss value, and the model to be corrected (which may include the embedded network, the eighth task tower and its prediction network, and the ninth task tower and its prediction network) is trained with the summed correction loss value, thereby obtaining a trained information selection model. The structure of the information selection model refers to that of the model to be corrected and is not repeated here.
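Consistent with the weighted combination described for the training module below (first coefficient × first correction loss plus second coefficient × second correction loss), the summed correction loss can be sketched as follows; mapping the two weighted sums onto formulas eight and nine is an assumption, since those formulas are not reproduced here.

```python
def total_correction_loss(click_pair, conversion_pair, first_coeff, second_coeff):
    """click_pair / conversion_pair are (first_correction_loss,
    second_correction_loss) tuples for the click-rate and conversion-rate
    tasks; each task's correction loss is a weighted sum, and the two tasks'
    correction losses are then summed."""
    click_loss = first_coeff * click_pair[0] + second_coeff * click_pair[1]              # formula eight (assumed form)
    conversion_loss = first_coeff * conversion_pair[0] + second_coeff * conversion_pair[1]  # formula nine (assumed form)
    return click_loss + conversion_loss
```

The first correction loss keeps the predictions accurate on the target domain, while the discriminator-based second correction loss keeps the features domain-aligned; the coefficients trade the two off.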
At this time, the click-rate task tower and its prediction network can be deleted to obtain an information selection model for the conversion rate, or the conversion-rate task tower and its prediction network can be deleted to obtain an information selection model for the click rate.
In this scenario, the click-rate task tower and its prediction network are deleted to obtain an information selection model for the conversion rate.
4. Selection of information
After the information selection model is trained, the server deploys it locally.
The server receives a transmission request sent by a game server of the shooting game category, the transmission request comprises game advertisement information of 10 shooting game categories, and the server responds to the transmission request to acquire game advertisement characteristics of the game advertisement information of the 10 shooting game categories and user characteristics of a user XZA.
And inputting each game advertisement feature and the user feature into an embedded network of the information selection model to obtain embedded features to be analyzed, and inputting the embedded features to be analyzed into a task tower corresponding to the conversion rate and a prediction network corresponding to the conversion rate to obtain the prediction conversion rate. And traversing all 10 pieces of game advertisement information to obtain the respective prediction conversion rate of the 10 pieces of game advertisement information.
The 2 pieces of game advertisement information with the highest predicted conversion rate are taken as target information, and the server sends these 2 pieces of target information to the terminal of the user XZA so that the user XZA can view them through the terminal.
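The scoring and selection steps above reduce to ranking candidates by predicted conversion rate and keeping the top ones; `predict_cvr` is an illustrative stand-in for the full model forward pass (embedding network, conversion-rate task tower, and prediction network).

```python
def select_target_information(candidates, predict_cvr, top_k=2):
    """Score each candidate advertisement with the information selection
    model's predicted conversion rate and return the top_k highest-scoring
    ones as the target information."""
    ranked = sorted(candidates, key=predict_cvr, reverse=True)
    return ranked[:top_k]
```

The alternative selection rule mentioned later (keeping every candidate whose target value reaches a preset threshold) would replace the slice with a filter on the score.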
In this scenario, the information selection model obtained through the above training process serves as the target model of the target domain (the first data domain, namely the shooting game category). The AUC of the migrated target-domain model's prediction results is on average more than 10% higher than that of a single-domain model (a model trained on only one data domain), indicating a better prediction effect.
Referring to fig. 16, fig. 16 shows a block diagram of an information selecting apparatus according to an embodiment of the present application, the apparatus 1100 includes:
an obtaining module 1110, configured to obtain an object feature of a target object and information features of a plurality of information to be selected, where the plurality of information to be selected belongs to information of a first data domain;
the target value determining module 1120 is configured to process, through the information selection model, the object feature of the target object and the information feature of each piece of information to be selected, to obtain a target value of each piece of information to be selected; the target value of each piece of information to be selected represents the attention of the target object to the information to be selected; the information selection model is obtained by training a first model through a first discriminator; the first model is configured by parameters of a second model, and the second model is obtained by sample information training of a second data domain; the first discriminator is obtained by training the second discriminator through an adversarial loss value, and the adversarial loss value characterizes the distinguishing capability of the second discriminator on the information of the first data domain and the information of the second data domain;
the information selecting module 1130 is configured to select target information for the target object from the plurality of information to be selected according to respective target values of the plurality of information to be selected.
Optionally, the apparatus further comprises a first training module for determining, by the first model, sample features of the first sample information of the first data field as the first sample features; determining sample characteristics of second sample information of the second data domain as second sample characteristics by the first model; determining an adversarial loss value according to the discrimination result of the second discriminator aiming at the first sample characteristic and the discrimination result aiming at the second sample characteristic; training the second discriminator according to the adversarial loss value to obtain a third discriminator; training the first model according to the discrimination result of the third discriminator aiming at the first sample characteristic to obtain a third model; acquiring a third model as a new first model, acquiring a third discriminant as a new second discriminant, acquiring third sample information of the first data domain as new first sample information and acquiring fourth sample information of the second data domain as new second sample information, and returning to the step of executing the determination of the sample characteristics of the first sample information of the first data domain by the first model until a preset condition is satisfied; and acquiring a third model obtained in the last training process as an information selection model, and acquiring a third discriminator obtained in the last training process as a first discriminator.
Optionally, the first training module is further configured to determine, by using a third identifier, a discrimination result for the first sample feature, as an intermediate discrimination result of the first sample feature; determining a selected loss value according to the intermediate discrimination result of the first sample characteristic, wherein the selected loss value characterizes the characteristic extraction capacity of the first model on the sample information of the first data domain; and training the first model according to the selected loss value to obtain a third model.
Optionally, the first training module is further configured to obtain a third model obtained in the last training process, as a model to be corrected; obtaining a correction sample, wherein the correction sample comprises correction sample information of a first data field and correction sample values corresponding to the correction sample information, and the correction sample values represent the attention degree of a sample object corresponding to the correction sample information; determining a predicted correction sample value for correction sample information through a model to be corrected; determining a loss value as a correction loss value according to the predicted correction sample value and the correction sample value; training the model to be corrected according to the correction loss value to obtain an information selection model.
Optionally, the first training module is further configured to determine a first correction loss value according to the predicted correction sample value and the correction sample value; determining sample characteristics aiming at the sample information to be corrected through the model to be corrected, and taking the sample characteristics as corrected sample characteristics; determining a discrimination result for the corrected sample feature by a first discriminator; determining a second correction loss value according to the discrimination result of the correction sample characteristics; and determining a correction loss value according to the first correction loss value and the second correction loss value.
Optionally, the first training module is further configured to obtain a first coefficient for a first correction loss value and a second coefficient for a second correction loss value; calculating the product of the first correction loss value and the first coefficient as a first product result; calculating the product of the second correction loss value and the second coefficient as a second product result; and calculating the sum of the first product result and the second product result as a correction loss value.
Optionally, the first training module is further configured to input first sample information of the first data field into an embedded network of the first model to obtain a first embedded feature; processing the first embedded feature through a first task tower of the first model to obtain a first task feature; processing the first embedded feature through a second task tower of the first model to obtain a second task feature; and splicing the first task feature and the second task feature to obtain the first sample feature.
Optionally, the device further includes a second training module, configured to obtain a first training sample, where the first training sample includes fifth sample information in the second data domain and a first sample value corresponding to the fifth sample information, and the first sample value characterizes a degree of interest of a sample object corresponding to the fifth sample information in the fifth sample information; determining embedded features for the fifth sample information as second embedded features through an embedded network in a fourth model; processing the second embedded feature through a third task tower in the fourth model to obtain a first predicted sample value; determining a first target loss value according to the first predicted sample value and the first sample value; and training the fourth model according to the first target loss value to obtain a second model.
Optionally, the apparatus further includes a third training module, configured to obtain a second training sample, where the second training sample includes sixth sample information in a second data field, a second sample value corresponding to the sixth sample information, and a third sample value corresponding to the sixth sample information; the second sample value represents the probability that the sixth sample information is selected by the sample object corresponding to the sixth sample information, and the third sample value represents the probability that the sixth sample information is converted by the sample object corresponding to the sixth sample information; determining an embedded feature for the sixth sample information as a third embedded feature through an embedded network in the fifth model; processing the third embedded feature through a fourth task tower in the fifth model to obtain a second predicted sample value; processing the third embedded feature through a fifth task tower in the fifth model to obtain a third predicted sample value; and training the fifth model according to the second sample value, the third sample value, the second predicted sample value and the third predicted sample value to obtain a second model.
Optionally, the third training module is further configured to determine a first loss value according to the second predicted sample value and the second sample value; determining a second loss value according to the third predicted sample value and the third sample value; calculating the product of the second sample value and the third sample value as a fourth sample value; calculating the product of the second predicted sample value and the third predicted sample value as a fourth predicted sample value; determining a third loss value according to the fourth predicted sample value and the fourth sample value; calculating the sum of the first loss value, the second loss value and the third loss value as a fourth loss value; calculating the sum of the fourth loss value and the first loss value as a second target loss value; and training the fifth model according to the second target loss value to obtain a second model.
Optionally, the information selecting module 1130 is further configured to select, from the plurality of pieces of information to be selected, information to be selected for which a target value reaches a preset threshold, as target information for a target object; or selecting a preset number of pieces of information to be selected from the plurality of pieces of information to be selected according to the order of the target value from high to low, and taking the information to be selected as target information aiming at a target object.
Optionally, the acquiring module 1110 is further configured to acquire the object feature of the target object and the information features of the plurality of information to be selected in response to a received transmission request including the plurality of information to be selected; correspondingly, the information selection module 1130 is further configured to send the target information to the target object.
It should be noted that, in the present application, the device embodiment and the foregoing method embodiment correspond to each other, and specific principles in the device embodiment may refer to the content in the foregoing method embodiment, which is not described herein again.
Fig. 17 shows a block diagram of an electronic device for performing the information selection method according to an embodiment of the present application. The electronic device may be the terminal 20 or the server 10 in fig. 1, and it should be noted that, the computer system 1200 of the electronic device shown in fig. 17 is only an example, and should not impose any limitation on the functions and the application scope of the embodiments of the present application.
As shown in fig. 17, the computer system 1200 includes a central processing unit (Central Processing Unit, CPU) 1201 which can perform various appropriate actions and processes, such as performing the methods in the above-described embodiments, according to a program stored in a Read-Only Memory (ROM) 1202 or a program loaded from a storage section 1208 into a Random Access Memory (RAM) 1203. In the RAM 1203, various programs and data required for the system operation are also stored. The CPU 1201, ROM 1202, and RAM 1203 are connected to each other through a bus 1204. An Input/Output (I/O) interface 1205 is also connected to bus 1204.
The following components are connected to the I/O interface 1205: an input section 1206 including a keyboard, a mouse, and the like; an output portion 1207 including a Cathode Ray Tube (CRT), a liquid crystal display (Liquid Crystal Display, LCD), and a speaker, etc.; a storage section 1208 including a hard disk or the like; and a communication section 1209 including a network interface card such as a LAN (Local Area Network ) card, a modem, or the like. The communication section 1209 performs communication processing via a network such as the internet. The drive 1210 is also connected to the I/O interface 1205 as needed. A removable medium 1211 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is installed on the drive 1210 as needed, so that a computer program read out therefrom is installed into the storage section 1208 as needed.
In particular, according to embodiments of the present application, the processes described above with reference to flowcharts may be implemented as computer software programs. For example, embodiments of the present application include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method shown in the flowcharts. In such an embodiment, the computer program can be downloaded and installed from a network via the communication portion 1209, and/or installed from the removable media 1211. When executed by a Central Processing Unit (CPU) 1201, performs the various functions defined in the system of the present application.
It should be noted that, the computer readable medium shown in the embodiments of the present application may be a computer readable signal medium or a computer readable storage medium, or any combination of the two. The computer readable storage medium can be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples of the computer-readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-Only Memory (ROM), an erasable programmable read-Only Memory (Erasable Programmable Read Only Memory, EPROM), flash Memory, an optical fiber, a portable compact disc read-Only Memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In the present application, however, the computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, with the computer-readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. 
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wired, etc., or any suitable combination of the foregoing.
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present application. Where each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units involved in the embodiments of the present application may be implemented by software, or may be implemented by hardware, and the described units may also be provided in a processor. Wherein the names of the units do not constitute a limitation of the units themselves in some cases.
As another aspect, the present application also provides a computer-readable storage medium that may be contained in the electronic device described in the above embodiment; or may exist alone without being incorporated into the electronic device. The computer readable storage medium carries computer readable instructions which, when executed by a processor, implement the method of any of the above embodiments.
According to an aspect of embodiments of the present application, there is provided a computer program product or computer program comprising computer instructions stored in a computer readable storage medium. The processor of the electronic device reads the computer instructions from the computer-readable storage medium, and the processor executes the computer instructions to cause the electronic device to perform the method of any of the embodiments described above.
It should be noted that although in the above detailed description several modules or units of a device for action execution are mentioned, such a division is not mandatory. Indeed, the features and functions of two or more modules or units described above may be embodied in one module or unit in accordance with embodiments of the application. Conversely, the features and functions of one module or unit described above may be further divided into a plurality of modules or units to be embodied.
From the above description of embodiments, those skilled in the art will readily appreciate that the example embodiments described herein may be implemented in software, or may be implemented in software in combination with the necessary hardware. Thus, the technical solution according to the embodiments of the present application may be embodied in the form of a software product, which may be stored in a non-volatile storage medium (may be a CD-ROM, a usb disk, a mobile hard disk, etc.) or on a network, and includes several instructions to cause an electronic device (may be a personal computer, a server, a touch terminal, or a network device, etc.) to perform the method according to the embodiments of the present application.
Other embodiments of the application will be apparent to those skilled in the art from consideration of the specification and practice of the embodiments disclosed herein. This application is intended to cover any variations, uses, or adaptations of the application following, in general, the principles of the application and including such departures from the present disclosure as come within known or customary practice within the art to which the application pertains. It is to be understood that the application is not limited to the precise arrangements and instrumentalities shown in the drawings, which have been described above, and that various modifications and changes may be effected without departing from the scope thereof. The scope of the application is limited only by the appended claims.
Finally, it should be noted that the above embodiments are only for illustrating the technical solution of the present application, and are not limiting. Although the application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art will appreciate that the technical schemes described in the foregoing embodiments can still be modified, or some of their technical features can be replaced by equivalents; such modifications and substitutions do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present application.

Claims (15)

1. An information selection method, characterized in that the method comprises:
acquiring object characteristics of a target object and information characteristics of a plurality of pieces of information to be selected, wherein the plurality of pieces of information to be selected belong to information of a first data field;
processing object characteristics of the target object and information characteristics of each piece of information to be selected through an information selection model to obtain a target value of each piece of information to be selected; the target value of each piece of information to be selected represents the attention of the target object to the information to be selected; the information selection model is obtained by training a first model through a first discriminator; the first model is configured by parameters of a second model, and the second model is obtained by sample information training of a second data domain; the first discriminator is obtained by training a second discriminator through an adversarial loss value, and the adversarial loss value characterizes the distinguishing capability of the second discriminator on the information of the first data field and the information of the second data field;
and selecting target information for the target object from the plurality of pieces of information to be selected according to the respective target values of the plurality of pieces of information to be selected.
2. The method of claim 1, wherein the training method of the first discriminator and the information selection model comprises:
determining, through the first model, sample characteristics of first sample information of the first data domain as first sample characteristics;
determining, through the first model, sample characteristics of second sample information of the second data domain as second sample characteristics;
determining a countermeasure loss value according to the discrimination result of the second discriminator for the first sample characteristics and its discrimination result for the second sample characteristics;
training the second discriminator according to the countermeasure loss value to obtain a third discriminator;
training the first model according to the discrimination result of the third discriminator aiming at the first sample characteristic to obtain a third model;
acquiring the third model as a new first model, acquiring the third discriminator as a new second discriminator, acquiring third sample information of the first data domain as new first sample information and fourth sample information of the second data domain as new second sample information, and returning to the step of determining the sample characteristics of the first sample information of the first data domain through the first model, until a preset condition is met;
and acquiring the third model obtained in the last training process as the information selection model, and the third discriminator obtained in the last training process as the first discriminator.
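The alternating updates recited in claim 2 follow the familiar adversarial domain-adaptation pattern: the discriminator is trained to tell the two data domains apart, then the feature extractor is trained to defeat it. A minimal NumPy sketch under assumed forms (linear feature extractor, logistic discriminator, log-loss gradient steps; none of these specifics appear in the claims, and all names are invented):

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30, 30)))

W = rng.normal(size=(4, 2))                      # "first model": linear feature extractor
d = rng.normal(size=2)                           # "second discriminator" weights

first_dom = rng.normal(loc=1.0, size=(64, 4))    # sample information, first data domain
second_dom = rng.normal(loc=0.0, size=(64, 4))   # sample information, second data domain

lr = 0.05
for step in range(100):
    f1, f2 = first_dom @ W, second_dom @ W       # first / second sample features
    p1, p2 = sigmoid(f1 @ d), sigmoid(f2 @ d)    # discrimination results
    # Countermeasure-loss step: push p1 -> 1 and p2 -> 0 ("third discriminator").
    d += lr * (f1.T @ (1 - p1) - f2.T @ p2) / len(f1)
    # Model step: update the extractor so first-domain features fool the updated
    # discriminator ("third model"), i.e. push p1 back toward 0.
    p1 = sigmoid((first_dom @ W) @ d)
    W -= lr * first_dom.T @ (p1[:, None] * d[None, :]) / len(first_dom)

assert np.isfinite(W).all() and np.isfinite(d).all()
```

In this reading, the "preset condition" of claim 2 is simply the fixed iteration budget of the loop.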
3. The method according to claim 2, wherein training the first model according to the discrimination result of the third discriminator for the first sample feature to obtain a third model includes:
determining a discrimination result for the first sample feature by the third discriminator as an intermediate discrimination result of the first sample feature;
determining a selected loss value according to the intermediate discrimination result of the first sample characteristics, wherein the selected loss value characterizes the feature extraction capability of the first model for sample information of the first data domain;
and training the first model according to the selected loss value to obtain a third model.
4. The method according to claim 2, wherein the obtaining a third model obtained in the last training process as the information selection model includes:
acquiring a third model obtained in the last training process as a model to be corrected;
obtaining a correction sample, wherein the correction sample comprises correction sample information of the first data domain and a correction sample value corresponding to the correction sample information, and the correction sample value represents the degree of attention of a sample object to the correction sample information;
determining a predicted corrected sample value for the corrected sample information by the model to be corrected;
determining a loss value as a correction loss value according to the predicted correction sample value and the correction sample value;
training the model to be corrected according to the correction loss value to obtain an information selection model.
5. The method of claim 4, wherein determining a loss value as a correction loss value according to the predicted correction sample value and the correction sample value comprises:
determining a first correction loss value from the predicted correction sample value and the correction sample value;
determining, through the model to be corrected, sample characteristics of the correction sample information as corrected sample characteristics;
determining, through the first discriminator, a discrimination result for the corrected sample characteristics;
determining a second correction loss value according to the discrimination result for the corrected sample characteristics;
and determining the correction loss value according to the first correction loss value and the second correction loss value.
6. The method of claim 5, wherein determining the correction loss value according to the first correction loss value and the second correction loss value comprises:
obtaining a first coefficient for the first correction loss value and a second coefficient for the second correction loss value;
calculating the product of the first correction loss value and the first coefficient as a first product result;
calculating the product of the second correction loss value and the second coefficient as a second product result;
and calculating the sum of the first product result and the second product result as the correction loss value.
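Claims 4 to 6 together describe a fine-tuning loss that mixes a supervised term with a discriminator-based term through two coefficients. A sketch with hypothetical loss forms (squared error for the first term and a log term on the discriminator score for the second; the claims fix neither choice):

```python
import math

def correction_loss(pred_value, sample_value, disc_score, alpha=1.0, beta=0.1):
    """Hypothetical combined correction loss in the shape of claims 4-6."""
    first = (pred_value - sample_value) ** 2      # first correction loss value (claim 5)
    second = -math.log(max(disc_score, 1e-12))    # second correction loss value, from the
                                                  # first discriminator's output
    # Claim 6: two coefficient-weighted products, summed.
    return alpha * first + beta * second

# With a perfect prediction and a fully "fooled" discriminator, both terms vanish.
assert correction_loss(0.7, 0.7, 1.0) == 0.0
```

The coefficients `alpha` and `beta` play the role of the first and second coefficients of claim 6 and would in practice be tuned on held-out first-domain data.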
7. The method according to claim 2, wherein determining, by the first model, sample features of the first sample information of the first data field as first sample features comprises:
inputting the first sample information of the first data domain into an embedded network of the first model to obtain a first embedded feature;
processing the first embedded feature through a first task tower of the first model to obtain a first task feature;
processing the first embedded feature through a second task tower of the first model to obtain a second task feature;
and splicing the first task feature and the second task feature to obtain the first sample feature.
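Claim 7's two task towers and splicing step can be read as running one shared embedding through two heads and concatenating the results. A toy sketch (the tower form, activation, and sizes are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
embedded = rng.normal(size=(8, 16))     # first embedded feature (batch, embedding dim)
W_tower1 = rng.normal(size=(16, 4))     # first task tower (e.g. a click head)
W_tower2 = rng.normal(size=(16, 4))     # second task tower (e.g. a conversion head)

task1 = np.tanh(embedded @ W_tower1)    # first task feature
task2 = np.tanh(embedded @ W_tower2)    # second task feature

# The "splicing" of claim 7: concatenate the two task features per sample.
first_sample_feature = np.concatenate([task1, task2], axis=1)
assert first_sample_feature.shape == (8, 8)
```

Concatenating both task towers' outputs gives the discriminator of claim 2 a view of everything the model extracts, rather than a single task head.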
8. The method of claim 1, wherein the training method of the second model comprises:
acquiring a first training sample, wherein the first training sample comprises fifth sample information of the second data domain and a first sample value corresponding to the fifth sample information, and the first sample value represents the degree of attention of a sample object to the fifth sample information;
determining an embedded feature for the fifth sample information as a second embedded feature through an embedded network in a fourth model;
processing the second embedded feature through a third task tower in the fourth model to obtain a first predicted sample value;
determining a first target loss value according to the first predicted sample value and the first sample value;
and training the fourth model according to the first target loss value to obtain a second model.
9. The method of claim 1, wherein the training method of the second model comprises:
acquiring a second training sample, wherein the second training sample comprises sixth sample information of the second data domain, a second sample value corresponding to the sixth sample information and a third sample value corresponding to the sixth sample information; the second sample value represents the probability that the sixth sample information is selected by the sample object corresponding to the sixth sample information, and the third sample value represents the probability that the sixth sample information is converted by that sample object;
determining an embedded feature for the sixth sample information as a third embedded feature through an embedded network in a fifth model;
processing the third embedded feature through a fourth task tower in the fifth model to obtain a second predicted sample value;
processing the third embedded feature through a fifth task tower in the fifth model to obtain a third predicted sample value;
and training the fifth model according to the second sample value, the third sample value, the second predicted sample value and the third predicted sample value to obtain a second model.
10. The method of claim 9, wherein training the fifth model based on the second sample value, the third sample value, the second predicted sample value, and the third predicted sample value, results in a second model, comprising:
determining a first loss value according to the second predicted sample value and the second sample value;
determining a second loss value according to the third predicted sample value and the third sample value;
calculating the product of the second sample value and the third sample value as a fourth sample value;
calculating the product of the second predicted sample value and the third predicted sample value as a fourth predicted sample value;
determining a third loss value according to the fourth predicted sample value and the fourth sample value;
calculating a sum of the first loss value, the second loss value and the third loss value as a fourth loss value;
calculating the sum of the fourth loss value and the first loss value as a second target loss value;
and training the fifth model according to the second target loss value to obtain a second model.
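Claims 9 and 10 mirror the well-known click/conversion multi-task setup in which a joint label and a joint prediction are formed as products and a third loss is taken on them; claim 10 then sums the three losses and adds the first one again. A sketch using binary cross-entropy as the per-task loss (an assumption; the claims do not name the loss function):

```python
import math

def bce(label, prob):
    """Binary cross-entropy for a single example."""
    return -(label * math.log(prob) + (1 - label) * math.log(1 - prob))

def second_target_loss(y_click, y_convert, p_click, p_convert):
    l1 = bce(y_click, p_click)                          # first loss value
    l2 = bce(y_convert, p_convert)                      # second loss value
    l3 = bce(y_click * y_convert, p_click * p_convert)  # third loss value, on the
                                                        # fourth sample/predicted values
    l4 = l1 + l2 + l3                                   # fourth loss value
    return l4 + l1                                      # second target loss (claim 10)

# With both labels 1 and both predictions 0.5, every term is a multiple of ln 2:
# l1 = l2 = ln 2, l3 = 2 ln 2, so the total is 5 ln 2.
assert abs(second_target_loss(1, 1, 0.5, 0.5) - 5 * math.log(2)) < 1e-9
```

Re-adding the first loss value effectively doubles the weight of the selection (click) task relative to the conversion task in the final objective.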
11. The method according to claim 1, wherein selecting target information for the target object from the plurality of pieces of information to be selected according to the respective target values of the plurality of pieces of information to be selected, comprises:
selecting, from the plurality of pieces of information to be selected, information to be selected whose target value reaches a preset threshold value as target information for the target object; or,
and selecting a preset number of pieces of information to be selected from the plurality of pieces of information to be selected in descending order of target value as target information for the target object.
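The two selection strategies of claim 11 are a score threshold and a top-k cut. A compact sketch (the dict-of-scores interface is invented for illustration):

```python
def select_target_info(target_values, threshold=None, top_k=None):
    """Pick candidate info either by threshold or by descending target value."""
    if threshold is not None:
        # Strategy 1: keep everything whose target value reaches the threshold.
        return [info for info, v in target_values.items() if v >= threshold]
    # Strategy 2: keep the top_k pieces in order of target value, high to low.
    ranked = sorted(target_values, key=target_values.get, reverse=True)
    return ranked[:top_k]

scores = {"a": 0.9, "b": 0.2, "c": 0.5}
assert select_target_info(scores, threshold=0.5) == ["a", "c"]
assert select_target_info(scores, top_k=2) == ["a", "c"]
```

The threshold variant yields a variable number of results, while the top-k variant guarantees a fixed count, which is why serving systems often prefer the latter.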
12. The method according to claim 1, wherein the obtaining the object feature of the target object and the information features of the plurality of pieces of information to be selected includes:
in response to receiving a sending request comprising the plurality of pieces of information to be selected, acquiring the object characteristics of the target object and the information characteristics of the plurality of pieces of information to be selected;
after the selecting of target information for the target object from the plurality of pieces of information to be selected according to the target values of the plurality of pieces of information to be selected, the method further comprises:
and sending the target information to the target object.
13. An information selecting apparatus, the apparatus comprising:
an acquisition module, configured to acquire object characteristics of a target object and information characteristics of a plurality of pieces of information to be selected, wherein the plurality of pieces of information to be selected belong to a first data domain;
a target value determining module, configured to process the object characteristics of the target object and the information characteristics of each piece of information to be selected through an information selection model to obtain a target value of each piece of information to be selected; the target value of each piece of information to be selected represents the attention of the target object to that piece of information; the information selection model is obtained by training a first model through a first discriminator; the first model is configured with parameters of a second model, and the second model is obtained by training on sample information of a second data domain; the first discriminator is obtained by training a second discriminator through a countermeasure loss value, and the countermeasure loss value characterizes the ability of the second discriminator to distinguish information of the first data domain from information of the second data domain;
and an information selection module, configured to select target information for the target object from the plurality of pieces of information to be selected according to the respective target values of the plurality of pieces of information to be selected.
14. An electronic device, comprising:
one or more processors;
a memory;
one or more applications, wherein the one or more applications are stored in the memory and configured to be executed by the one or more processors, the one or more applications configured to perform the method of any of claims 1-12.
15. A computer readable storage medium, characterized in that the computer readable storage medium has stored therein a program code, which is callable by a processor for performing the method according to any one of claims 1-12.
CN202310108688.4A 2023-01-17 2023-01-17 Information selection method and device, electronic equipment and storage medium Pending CN116976397A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310108688.4A CN116976397A (en) 2023-01-17 2023-01-17 Information selection method and device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN116976397A true CN116976397A (en) 2023-10-31

Family

ID=88473738

Country Status (1)

Country Link
CN (1) CN116976397A (en)


Legal Events

Date Code Title Description
PB01 Publication