CN116910372A - Information push model processing method and device, information push method and device - Google Patents

Info

Publication number
CN116910372A
CN116910372A
Authority
CN
China
Prior art keywords
information
sample
pushing
vector
semantic
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202311163354.3A
Other languages
Chinese (zh)
Other versions
CN116910372B (en)
Inventor
叶祺
欧阳文俊
张舒
王惠东
孙思维
丁建波
王峰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN202311163354.3A priority Critical patent/CN116910372B/en
Publication of CN116910372A publication Critical patent/CN116910372A/en
Application granted granted Critical
Publication of CN116910372B publication Critical patent/CN116910372B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/95Retrieval from the web
    • G06F16/953Querying, e.g. by the use of web search engines
    • G06F16/9535Search customisation based on user profiles and personalisation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/10Text processing
    • G06F40/194Calculation of difference between files
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/30Semantic analysis
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00Network arrangements or protocols for supporting network services or applications
    • H04L67/50Network services
    • H04L67/55Push-based network services
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Machine Translation (AREA)

Abstract

An information push model processing method includes the following steps: acquiring a plurality of pieces of first sample information and second sample information, where each piece of first sample information and the second sample information belong to the same information category at least at one level of an information hierarchy; acquiring sample modal information of at least one preset modality corresponding to each piece of first sample information, and determining a sample modal embedded vector for the sample modal information; for each piece of first sample information, determining sample semantic information of the first sample information based on the information category to which the first sample information belongs at each level of the information hierarchy; predicting the push probability of pushing the second sample information according to the sample modal embedded vector of each piece of first sample information and the sample semantic embedded vector of its sample semantic information; and performing model training based on the difference between the predicted push probability and the expected push probability of pushing the second sample information, to obtain an information push model. With this method, the flexibility of information pushing can be improved.

Description

Information push model processing method and device, information push method and device
Technical Field
The present application relates to the field of computer technology, and in particular to an information push model processing method, apparatus, computer device, computer-readable storage medium, and computer program product, and to an information push method, apparatus, computer device, computer-readable storage medium, and computer program product.
Background
With the development of computer technology, information push technology has emerged, through which information of interest can be recommended to different users. For example, a recommendation model may use a user's historical browsing information to predict which information the user is interested in, and then recommend that information to the user.
A conventional recommendation model typically encodes the identifier (ID) of each piece of information directly, so that different pieces of information have unique encodings, and pushes information based on those encodings. In a recommendation scenario, new information identifiers are generated continuously; to cope with the ever-changing identifiers, a conventional recommendation model must be retrained continuously so that it can encode the newly generated identifiers and push the new information.
Conventional recommendation models therefore suffer from inflexible information recommendation.
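This limitation can be illustrated with a minimal, hypothetical sketch (the table and item IDs are invented for illustration): a conventional model whose embedding table is keyed by information identifier simply has no entry for a newly generated identifier, so it cannot score the new item until it is retrained.

```python
# Hypothetical ID-keyed embedding table of a conventional recommendation model.
id_embeddings = {
    "item_001": [0.2, -0.5, 0.1, 0.9],
    "item_002": [0.7, 0.3, -0.4, 0.0],
}

def lookup(item_id):
    """Return the item's embedding, or None for an ID unseen at training time."""
    return id_embeddings.get(item_id)

print(lookup("item_001"))  # known ID: a usable embedding
print(lookup("item_999"))  # newly generated ID: None, retraining would be needed
```

The approach described below avoids this dependence on identifiers by building on modal and semantic features instead.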
Disclosure of Invention
In view of the foregoing, it is desirable to provide an information push method and an information push model processing method, together with corresponding apparatuses, computer devices, computer-readable storage media, and computer program products, that can improve the flexibility of information pushing.
The application provides an information push model processing method, which comprises the following steps:
acquiring a plurality of first sample information and second sample information, wherein each of the first sample information and the second sample information has the same information category to which at least one hierarchy belongs in an information hierarchy structure;
acquiring sample mode information of at least one preset mode corresponding to each piece of first sample information, and determining a sample mode embedding vector of the sample mode information;
for each piece of first sample information, determining an information category to which each level of the first sample information belongs in the information hierarchy;
determining sample semantic information of the first sample information based on the information category to which each level of the first sample information belongs in the information hierarchy;
determining sample semantic embedded vectors of each piece of sample semantic information, and predicting the predicted pushing probability of pushing the second sample information according to the sample modal embedded vectors and the sample semantic embedded vectors of each piece of first sample information;
and acquiring expected pushing probability of pushing the second sample information, and performing model training based on the difference of the predicted pushing probability relative to the expected pushing probability to acquire an information pushing model.
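The steps above can be sketched as follows. This is an illustrative assumption, not the patent's actual implementation: the hash-based `embed` stands in for a learned embedding table, and all names are invented. The key property is that sample semantic information is derived from the category path (one category per level of the information hierarchy), so two distinct items whose categories agree at every level share the same semantic information and hence the same semantic embedded vector.

```python
import hashlib

def semantic_info(category_path):
    """Sample semantic information derived from the item's category at every
    level of the information hierarchy, joined into a single key."""
    return "/".join(category_path)

def embed(text, dim=4):
    """Toy deterministic embedding; a stand-in for a learned embedding table."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return [b / 255.0 for b in digest[:dim]]

# Two distinct items with identical categories at every hierarchy level.
item_a = {"id": "a1", "categories": ["video", "sports", "football"]}
item_b = {"id": "b7", "categories": ["video", "sports", "football"]}

sem_a = embed(semantic_info(item_a["categories"]))
sem_b = embed(semantic_info(item_b["categories"]))
print(sem_a == sem_b)  # True: same category path, identical semantic embedding
```

Because the embedding depends only on the category path, it is unaffected by the items' unique identifiers.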
The application also provides an information push model processing device, which comprises:
the sample information acquisition module is used for acquiring a plurality of first sample information and second sample information, wherein each first sample information and each second sample information belong to the same information category of at least one hierarchy in the information hierarchy structure;
the sample mode determining module is used for acquiring sample mode information of at least one preset mode corresponding to each piece of first sample information and determining sample mode embedding vectors of the sample mode information;
an information category determining module, configured to determine, for each piece of first sample information, an information category to which each level of the first sample information belongs in the information hierarchy;
the sample semantic determining module is used for determining sample semantic information of the first sample information based on the information category of each level of the first sample information in the information hierarchy;
the prediction module is used for determining sample semantic embedded vectors of each piece of sample semantic information, and predicting the prediction pushing probability of pushing the second sample information according to the sample modal embedded vectors and the sample semantic embedded vectors of each piece of first sample information;
the training module is used for acquiring expected pushing probability of pushing the second sample information, and carrying out model training based on the difference of the predicted pushing probability relative to the expected pushing probability to acquire an information pushing model.
The application also provides a computer device comprising a memory and a processor, the memory storing a computer program, the processor implementing the following steps when executing the computer program:
acquiring a plurality of first sample information and second sample information, wherein each of the first sample information and the second sample information has the same information category to which at least one hierarchy belongs in an information hierarchy structure;
acquiring sample mode information of at least one preset mode corresponding to each piece of first sample information, and determining a sample mode embedding vector of the sample mode information;
for each piece of first sample information, determining an information category to which each level of the first sample information belongs in the information hierarchy;
determining sample semantic information of the first sample information based on the information category to which each level of the first sample information belongs in the information hierarchy;
determining sample semantic embedded vectors of each piece of sample semantic information, and predicting the predicted pushing probability of pushing the second sample information according to the sample modal embedded vectors and the sample semantic embedded vectors of each piece of first sample information;
and acquiring expected pushing probability of pushing the second sample information, and performing model training based on the difference of the predicted pushing probability relative to the expected pushing probability to acquire an information pushing model.
The present application also provides a computer readable storage medium having stored thereon a computer program which when executed by a processor performs the steps of:
acquiring a plurality of first sample information and second sample information, wherein each of the first sample information and the second sample information has the same information category to which at least one hierarchy belongs in an information hierarchy structure;
acquiring sample mode information of at least one preset mode corresponding to each piece of first sample information, and determining a sample mode embedding vector of the sample mode information;
for each piece of first sample information, determining an information category to which each level of the first sample information belongs in the information hierarchy;
determining sample semantic information of the first sample information based on the information category to which each level of the first sample information belongs in the information hierarchy;
determining sample semantic embedded vectors of each piece of sample semantic information, and predicting the predicted pushing probability of pushing the second sample information according to the sample modal embedded vectors and the sample semantic embedded vectors of each piece of first sample information;
and acquiring expected pushing probability of pushing the second sample information, and performing model training based on the difference of the predicted pushing probability relative to the expected pushing probability to acquire an information pushing model.
The application also provides a computer program product comprising a computer program which, when executed by a processor, performs the steps of:
acquiring a plurality of first sample information and second sample information, wherein each of the first sample information and the second sample information has the same information category to which at least one hierarchy belongs in an information hierarchy structure;
acquiring sample mode information of at least one preset mode corresponding to each piece of first sample information, and determining a sample mode embedding vector of the sample mode information;
for each piece of first sample information, determining an information category to which each level of the first sample information belongs in the information hierarchy;
determining sample semantic information of the first sample information based on the information category to which each level of the first sample information belongs in the information hierarchy;
determining sample semantic embedded vectors of each piece of sample semantic information, and predicting the predicted pushing probability of pushing the second sample information according to the sample modal embedded vectors and the sample semantic embedded vectors of each piece of first sample information;
and acquiring expected pushing probability of pushing the second sample information, and performing model training based on the difference of the predicted pushing probability relative to the expected pushing probability to acquire an information pushing model.
The information push model processing method, apparatus, computer device, storage medium, and computer program product acquire a plurality of pieces of first sample information and second sample information that belong to the same information category at least at one level of the information hierarchy, so that the probability of pushing the second sample information can be predicted from multiple pieces of first sample information sharing that category. Sample modal information of at least one preset modality is acquired for each piece of first sample information, and a sample modal embedded vector is determined for it; this maps the sample modal information into a vector space in which the key content of each modality can be represented. For each piece of first sample information, determining the information category to which it belongs at every level of the information hierarchy classifies it at multiple levels and yields finer-grained category information. Sample semantic information is then determined from these multi-level categories, so that different pieces of information with identical categories share the same sample semantic information.
A sample semantic embedded vector is determined for each piece of sample semantic information, and the push probability of the second sample information is predicted from the sample modal embedded vectors and sample semantic embedded vectors of the first sample information. The prediction therefore relies on the known modal and semantic features of multiple pieces of first sample information rather than on unique information identifiers; whether or not an information identifier changes, the prediction result is unaffected, which makes prediction more flexible. The expected push probability of the second sample information is acquired, and model training is performed on the difference between the predicted and expected push probabilities, so that the difference gradually shrinks during training, the predicted probability approaches the expected push probability, the precision of the model gradually improves, and the information push model is finally obtained. Because the model attends to the semantic information of different items within the information hierarchy together with their concrete modal information, it can accurately predict the information to be pushed from the semantic and modal information of known items, achieving flexible prediction and pushing of information.
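The summary above does not name a concrete loss function; binary cross-entropy is one common way to realise "training on the difference between the predicted and expected push probability", sketched here as an assumption:

```python
import math

def bce_loss(predicted, expected):
    """Binary cross-entropy between the predicted and the expected push
    probability; it shrinks as the predicted probability approaches the
    expected one, which is all the training objective here requires."""
    eps = 1e-9
    p = min(max(predicted, eps), 1.0 - eps)
    return -(expected * math.log(p) + (1.0 - expected) * math.log(1.0 - p))

# A prediction far from the expected probability is penalised more heavily.
far = bce_loss(0.1, 1.0)   # about 2.30
near = bce_loss(0.9, 1.0)  # about 0.11
print(far > near)  # True
```

Minimising this loss by gradient descent drives the predicted push probability toward the expected one, as the passage above describes.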
The application provides an information pushing method, which comprises the following steps:
acquiring a plurality of first information;
acquiring the modal information of at least one preset mode corresponding to each piece of first information, and determining a mode embedding vector of the modal information;
acquiring a semantic embedded vector corresponding to each piece of first information, wherein the semantic embedded vector is related to the information categories to which the corresponding first information belongs in an information hierarchy, and pieces of first information whose information categories are the same at every level of the hierarchy have identical semantic embedded vectors;
and acquiring a plurality of candidate information, and screening second information for pushing from the plurality of candidate information according to the modal embedded vector and the semantic embedded vector of each piece of first information.
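A minimal sketch of the screening step, under the assumption (not stated in this summary) that each piece of first information has already been fused into a single vector from its modal and semantic embedded vectors, and that candidates are ranked by cosine similarity to the mean of those vectors:

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def screen(first_vectors, candidates, top_k=1):
    """Rank candidates against the mean of the first-information vectors
    and keep the top_k most similar as second information for pushing."""
    dim = len(first_vectors[0])
    mean = [sum(v[i] for v in first_vectors) / len(first_vectors) for i in range(dim)]
    ranked = sorted(candidates.items(), key=lambda kv: cosine(mean, kv[1]), reverse=True)
    return [name for name, _ in ranked[:top_k]]

history = [[1.0, 0.0, 0.2], [0.9, 0.1, 0.0]]  # fused vectors of first information
candidates = {"c1": [1.0, 0.0, 0.1], "c2": [0.0, 1.0, 0.0]}
print(screen(history, candidates))  # ['c1']: closest to the user's history
```

The actual model may fuse and score these vectors differently (for example with the attention mechanism shown in the drawings); this sketch only conveys the embed-then-rank shape of the step.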
The application also provides an information pushing device, which comprises:
the information acquisition module is used for acquiring a plurality of first information;
the vector determining module is used for acquiring the modal information of at least one preset mode corresponding to each piece of first information and determining a mode embedding vector of the modal information;
the semantic acquisition module is used for acquiring a semantic embedded vector corresponding to each piece of first information, wherein the semantic embedded vector is related to the information categories to which the corresponding first information belongs in an information hierarchy, and pieces of first information whose information categories are the same at every level of the hierarchy have identical semantic embedded vectors;
And the screening module is used for acquiring a plurality of candidate information, and screening second information for pushing from the plurality of candidate information according to the modal embedded vector and the semantic embedded vector of each piece of first information.
The application also provides a computer device comprising a memory and a processor, the memory storing a computer program, the processor implementing the following steps when executing the computer program:
acquiring a plurality of first information;
acquiring the modal information of at least one preset mode corresponding to each piece of first information, and determining a mode embedding vector of the modal information;
acquiring a semantic embedded vector corresponding to each piece of first information, wherein the semantic embedded vector is related to the information categories to which the corresponding first information belongs in an information hierarchy, and pieces of first information whose information categories are the same at every level of the hierarchy have identical semantic embedded vectors;
and acquiring a plurality of candidate information, and screening second information for pushing from the plurality of candidate information according to the modal embedded vector and the semantic embedded vector of each piece of first information.
The present application also provides a computer readable storage medium having stored thereon a computer program which when executed by a processor performs the steps of:
acquiring a plurality of first information;
acquiring the modal information of at least one preset mode corresponding to each piece of first information, and determining a mode embedding vector of the modal information;
acquiring a semantic embedded vector corresponding to each piece of first information, wherein the semantic embedded vector is related to the information categories to which the corresponding first information belongs in an information hierarchy, and pieces of first information whose information categories are the same at every level of the hierarchy have identical semantic embedded vectors;
and acquiring a plurality of candidate information, and screening second information for pushing from the plurality of candidate information according to the modal embedded vector and the semantic embedded vector of each piece of first information.
The application also provides a computer program product comprising a computer program which, when executed by a processor, performs the steps of:
acquiring a plurality of first information;
acquiring the modal information of at least one preset mode corresponding to each piece of first information, and determining a mode embedding vector of the modal information;
acquiring a semantic embedded vector corresponding to each piece of first information, wherein the semantic embedded vector is related to the information categories to which the corresponding first information belongs in an information hierarchy, and pieces of first information whose information categories are the same at every level of the hierarchy have identical semantic embedded vectors;
and acquiring a plurality of candidate information, and screening second information for pushing from the plurality of candidate information according to the modal embedded vector and the semantic embedded vector of each piece of first information.
The information pushing method, apparatus, computer device, storage medium, and computer program product acquire a plurality of pieces of first information, acquire modal information of at least one preset modality for each piece of first information, and determine a mode embedded vector for the modal information; this maps the modal information into a vector space in which the key content of each piece of first information under different modalities can be represented. A semantic embedded vector is acquired for each piece of first information; it is related to the information categories to which the first information belongs in the information hierarchy, and pieces of information whose categories are the same at every level have identical semantic embedded vectors. The semantics of information can therefore be determined from fine-grained multi-level category information, different pieces of information with the same categories share one semantic embedded vector, and the problem of low push probability for information that lacks historical interaction data is avoided. A plurality of candidate information is acquired, and second information for pushing is screened from the candidates according to the modal embedded vector and semantic embedded vector of each piece of first information; the selected second information belongs to the same information category as each piece of first information at least at one level of the information hierarchy, so second information similar to each piece of first information can be screened out, achieving accurate recommendation of similar information.
Because the modalities and semantics of known information are combined to predict which similar information to push next, information can be predicted and pushed whether or not it has historical interaction data, which makes information pushing more flexible.
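The cold-start argument above can be made concrete with a small hypothetical example: a never-before-seen item carries no interaction history, but if its category at every level of the hierarchy is known, it reuses the semantic embedded vector of that category path and can be scored immediately.

```python
# Hypothetical semantic table keyed by the full category path; every item
# sharing a path shares the vector, so a brand-new item needs no history.
semantic_table = {
    "video/sports/football": [0.9, 0.1],
    "video/music/pop": [0.1, 0.8],
}

def semantic_vector(category_path):
    """Look up the shared semantic embedded vector for a category path."""
    return semantic_table["/".join(category_path)]

new_item = {"id": "never_seen_before", "categories": ["video", "sports", "football"]}
print(semantic_vector(new_item["categories"]))  # [0.9, 0.1]
```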
Drawings
FIG. 1 is an application environment diagram of an information push model processing method in one embodiment;
FIG. 2 is a flow chart of a method for processing an information push model in one embodiment;
FIG. 3 is a schematic diagram of an architecture of an information push model to be trained in one embodiment;
FIG. 4 is a flowchart of predicting the push probability of pushing second sample information according to the sample modal embedded vector, sample semantic embedded vector, and sample push position embedded vector of each piece of first sample information in one embodiment;
FIG. 5 is a schematic diagram of mapping sample embedding vectors to key space and query space in one embodiment;
FIG. 6 is a schematic diagram of mapping sample embedding vectors to a value space in one embodiment;
FIG. 7 is a flow chart of a multi-set self-attention process for sample embedding vectors of first sample information according to one embodiment;
FIG. 8 is a schematic diagram of mapping sample embedding vectors to key spaces by multiple sets of initial key weights in one embodiment;
FIG. 9 is a schematic diagram of an architecture of a class tree in one embodiment;
FIG. 10 is a flowchart of a method for pushing information in an embodiment;
FIG. 11 is a flow diagram of selecting second information for pushing from a plurality of candidate information in one embodiment;
FIG. 12 is a schematic diagram of an architecture of an information push model in one embodiment;
FIG. 13 is a block diagram of an information push model processing device in one embodiment;
FIG. 14 is a block diagram of an information pushing device in one embodiment;
FIG. 15 is an internal structure diagram of a computer device in one embodiment.
Detailed Description
The present application will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present application more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the application.
The embodiments of the present application can be applied to various scenarios, including but not limited to cloud technology, artificial intelligence, intelligent transportation, assisted driving, and audio/video. For example, they are applicable to the field of artificial intelligence (AI). Artificial intelligence is the theory, method, technique, and application system that uses a digital computer, or a machine controlled by a digital computer, to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use knowledge to obtain the best results. In other words, artificial intelligence is a comprehensive branch of computer science that attempts to understand the essence of intelligence and to produce a new kind of intelligent machine that can react in a manner similar to human intelligence. Artificial intelligence studies the design principles and implementation methods of various intelligent machines, so that machines can perceive, reason, and make decisions. The solution provided by the embodiments of the present application relates to an artificial-intelligence-based information push model processing method and information push method, which are described in detail through the following embodiments.
The information push model processing method provided in the embodiments of the present application can be applied in the application environment shown in FIG. 1, in which the terminal 102 communicates with the server 104 via a network. A data storage system may store the data that the server 104 needs to process; it may be provided separately, integrated on the server 104, or located on a cloud or other server. The terminal 102 and the server 104 can each independently execute the information push model processing method provided in the embodiments of the present application, or they can cooperate to execute it. When they cooperate, the terminal 102 obtains from the server 104 a plurality of pieces of first sample information and second sample information, each piece of first sample information and the second sample information belonging to the same information category at least at one level of the information hierarchy. The terminal 102 obtains from the server 104 the sample modal information of at least one preset modality corresponding to each piece of first sample information, and determines a sample modal embedded vector for it. For each piece of first sample information, the terminal 102 determines the information category to which it belongs at each level of the information hierarchy, and determines sample semantic information based on those categories.
The terminal 102 determines a sample semantic embedded vector for each piece of sample semantic information, and predicts the push probability of pushing the second sample information from the sample modal embedded vector and sample semantic embedded vector of each piece of first sample information. The terminal 102 obtains the expected push probability of pushing the second sample information, and performs model training based on the difference between the predicted and expected push probabilities to obtain an information push model. The information push model may be deployed on the terminal 102 or the server 104.
The terminal 102 may be, but is not limited to, various personal computers, notebook computers, smart phones, tablet computers, intelligent voice interaction devices, smart home appliances, vehicle-mounted terminals, aircraft, portable wearable devices, etc. The server 104 may be an independent physical server, a server cluster or distributed system composed of a plurality of physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, CDNs, big data, and artificial intelligence platforms.
In one embodiment, the information pushing method may also be applied in the application environment as shown in fig. 1. Both the terminal 102 and the server 104 may independently execute the information pushing method provided in the embodiment of the present application. The terminal 102 and the server 104 may also cooperate to perform the information push method provided in the embodiments of the present application.
It should be noted that terms such as "a plurality of" mentioned in the embodiments of the present application each refer to "at least two".
In one embodiment, as shown in fig. 2, an information push model processing method is provided, and the method is applied to the computer device in fig. 1 (the computer device may be a terminal or a server in fig. 1) for illustration, and includes the following steps:
Step S202, a plurality of first sample information and second sample information are acquired, wherein the first sample information and the second sample information belong to the same information category in at least one level of the information hierarchy.
Wherein the first sample information and the second sample information are training samples for model training. The first sample information and the second sample information belong to at least one information category.
The information category refers to the category to which information belongs, for example, but not limited to, a video category, an audio category, a text category, an image category, an image-text category, etc.; the video category may be further divided into drama, movie, documentary, cartoon, etc., but is not limited thereto.
The information hierarchy is a hierarchical structure formed by dividing information categories level by level. For example, an information hierarchy includes 3 levels of information categories: at the first level, information is divided into a video category and an audio category; at the second level, the video category is divided into drama, movie, documentary, etc., and the audio category is divided into variety, genre, language, etc.; at the third level, drama is divided into modern city, swordsman, suspense, etc., and movie is divided into action movie, comedy movie, science fiction movie, etc.
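The 3-level hierarchy described above can be sketched as a nested dictionary; the category names here are illustrative stand-ins, not a fixed taxonomy from the application:

```python
# Toy sketch of a 3-level information hierarchy (category names illustrative).
hierarchy = {
    "video": {
        "drama": ["modern city", "swordsman", "suspense"],
        "movie": ["action", "comedy", "science fiction"],
        "documentary": [],
    },
    "audio": {"variety": [], "genre": [], "language": []},
}

def categories(level):
    """Return all category names at a given level (1, 2, or 3)."""
    if level == 1:
        return list(hierarchy)
    if level == 2:
        return [c for sub in hierarchy.values() for c in sub]
    return [g for sub in hierarchy.values() for gs in sub.values() for g in gs]
```

Here `categories(2)` collects the second-level categories of both the video and audio branches, which is how a piece of sample information can later be classified at each level.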
In this embodiment, each information category may be characterized by a category identification.
Specifically, the computer device acquires a plurality of pieces of sample information, classifies them according to the information hierarchy, and obtains the information category to which each piece of sample information belongs in at least one level of the information hierarchy. Sample information having the same information category in at least one level is screened out from the plurality of pieces of sample information, and the screened sample information is divided into a plurality of first sample information and one piece of second sample information.
In this embodiment, the computer device may acquire a sample information push sequence, and determine a plurality of first sample information and second sample information other than the plurality of first sample information in the sample information push sequence.
In this embodiment, the sample information pushing sequence is a historical information pushing sequence, that is, a sequence of information pushed to a push receiving object within a historical time period.
In this embodiment, the sample information located on the target sequence in the sample information pushing sequence may be used as the second sample information, and the sample information of the other sequence in the sample information pushing sequence may be used as the first sample information.
For example, the sample information at the last sequence bit in the sample information pushing sequence may be used as the second sample information, and the sample information at other sequence bits in the sample information pushing sequence may be used as the first sample information.
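The split described above can be sketched as follows; the target sequence bit is assumed to be the last one, as in the example, and the IDs are illustrative:

```python
def split_push_sequence(push_sequence):
    """Last sequence bit -> second sample (prediction target); rest -> first samples."""
    return push_sequence[:-1], push_sequence[-1]

first_samples, second_sample = split_push_sequence(["ID1", "ID2", "ID3", "ID4"])
```

Any other target sequence bit would work the same way: remove that bit to obtain the first sample information and use the removed item as the second sample information.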
Step S204, sample mode information of at least one preset mode corresponding to each first sample information is obtained, and sample mode embedding vectors of the sample mode information are determined.
The preset mode refers to a visual presentation mode of the first sample information, and specifically may be an image, a text, an audio or a video.
The sample mode information refers to information of the first sample information under a certain preset mode, for example, the sample mode information of the image corresponding to the first sample information refers to the image in the first sample information. The first sample information corresponds to sample mode information of the text, and refers to text information in the first sample information. The first sample information corresponds to sample mode information of audio, and refers to audio information in the first sample information. The first sample information corresponds to sample mode information of the video, and refers to video information in the first sample information.
Specifically, for each first sample information, the computer device may acquire sample mode information of at least one preset mode corresponding to the first sample information. The computer device then maps each sample mode information to a vector space respectively, to obtain a sample mode embedded vector corresponding to each sample mode information.
In this embodiment, as shown in fig. 3, the information push model to be trained may include a modal feature extraction model. The computer equipment can acquire sample mode information of a plurality of preset modes of each first sample information, and input the sample mode information of the plurality of preset modes of the first sample information into the mode characteristic extraction model. The modal encoder in the modal feature extraction model carries out modal encoding on sample modal information of each preset mode to obtain sample modal embedded vectors corresponding to each sample modal information.
The modal embedded vectors output by the encoders tend to have a high dimension. To reduce the dimension of the modal embedded vectors, a representation layer is introduced after each modal encoder; it can be implemented using one or more fully connected layers that convert the high-dimensional embedded vectors to a low dimension. During training, each modal encoder can be fine-tuned to learn the weights of its representation layer.
After the sample mode embedded vectors of all preset modes of each first sample information are obtained, the sample mode embedded vectors are spliced, so that a longer sample embedded vector is obtained for each first sample information, and the sample embedded vectors are stored in a language model of the information push model to be trained. In particular, they may be stored in a modality embedding layer of the language model.
For example, the information identifiers of the n pieces of first sample information are ID1, ID2, …, IDn, respectively, and the plurality of preset modalities may include text and images, and may also include another preset modality Z. The text X1 and the image Y1 of the first sample information whose information identifier is ID1 are acquired, and the text X2 and the image Y2 of the first sample information whose information identifier is ID2 are acquired, and so on, until the text Xn and the image Yn of the first sample information whose information identifier is IDn are acquired. The text X1 and the image Y1 of ID1 are the sample mode information corresponding to the first sample information represented by ID1.
The text X2 and the image Y2 of the ID2 are sample mode information corresponding to the first sample information represented by the ID 2.
The text X1 and the text X2 are input into a text encoder of a modal encoder to map the text X1 and the text X2 into a corresponding sample modal embedded vector X1 and a sample modal embedded vector X2, respectively. The image Y1 and the image Y2 are input into an image encoder of a modal encoder, so that the image Y1 and the image Y2 are respectively mapped into a corresponding sample modal embedded vector Y1 and a sample modal embedded vector Y2.
And carrying out dimension reduction processing on each sample mode embedded vector in a corresponding representation layer. And splicing the sample mode embedded vector x1 after the dimension reduction with the sample mode embedded vector y1 to obtain a sample embedded vector E1 corresponding to the ID 1. And splicing the sample mode embedded vector x2 after the dimension reduction with the sample mode embedded vector y2 to obtain a sample embedded vector E2 corresponding to the ID 2. The sample embedding vector E1 and the sample embedding vector E2 are input to the language model.
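The encode → dimension-reduce → splice flow above can be sketched as follows. The "encoders" and weights here are deterministic toy stand-ins for illustration only, not the actual modal encoders or trained representation-layer weights:

```python
import random

HIGH_DIM, LOW_DIM = 8, 3  # toy dimensions for illustration
random.seed(0)

def toy_encoder(data):
    """Stand-in for a text/image modal encoder: deterministic pseudo-embedding."""
    rnd = random.Random(sum(map(ord, data)))
    return [rnd.uniform(-1.0, 1.0) for _ in range(HIGH_DIM)]

def representation_layer(vec, weights):
    """One fully connected layer projecting HIGH_DIM -> LOW_DIM."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weights]

# Toy representation-layer weights, one per modal encoder.
W_text = [[random.uniform(-0.1, 0.1) for _ in range(HIGH_DIM)] for _ in range(LOW_DIM)]
W_image = [[random.uniform(-0.1, 0.1) for _ in range(HIGH_DIM)] for _ in range(LOW_DIM)]

def sample_embedding(text, image):
    x = representation_layer(toy_encoder(text), W_text)    # reduced text vector, e.g. x1
    y = representation_layer(toy_encoder(image), W_image)  # reduced image vector, e.g. y1
    return x + y  # splicing (concatenation) into the sample embedding, e.g. E1

E1 = sample_embedding("text X1", "image Y1")  # corresponds to ID1
```

The spliced vector has the summed length of the reduced per-mode vectors, which is why the representation layer is applied before, not after, the splicing step.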
Step S206, for each first sample information, determining an information category to which each level of the first sample information belongs in the information hierarchy.
Specifically, for each first sample information, the computer device classifies the first sample information for each level in the information hierarchy, and obtains an information category to which each level of the first sample information for the information hierarchy belongs.
According to the same processing, the information category to which each level of each first sample information belongs in the information hierarchy can be obtained.
Step S208, determining sample semantic information of the first sample information based on the information category to which the first sample information belongs at each level of the information hierarchy.
In particular, the computer device may connect the information categories to which the first sample information belongs at each level of the information hierarchy, and use the result as the sample semantic information of the first sample information.
In this embodiment, the computer device may connect, in the order of the levels in the information hierarchy, the information categories to which the first sample information belongs at the respective levels, and use the connected information categories as the sample semantic information of the first sample information. For example, the sample semantic information is "primary classification name/secondary classification name/…/n-level classification name".
In this embodiment, a sample information identifier of each first sample information may be obtained, and sample semantic information of the first sample information may be obtained based on an information category to which each level of the first sample information belongs in the information hierarchy and the sample information identifier of the first sample information.
Further, in the order of the levels in the information hierarchy, the information categories to which the first sample information belongs at each level are connected with the sample information identifier of the first sample information, to obtain the sample semantic information of the first sample information. The sample information identifier may be connected after the information category of the last level. For example, the sample semantic information is "primary classification name/secondary classification name/…/n-level classification name/information identification".
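The connection described above is a simple path join; a minimal sketch, with illustrative category names and identifier:

```python
def build_semantic_info(level_categories, info_id):
    """Join level-1..n category names in hierarchy order, then append the identifier."""
    return "/".join(list(level_categories) + [info_id])

semantic = build_semantic_info(["video", "drama", "suspense"], "ID1")
```

Two different pieces of information with the same per-level categories differ only in the trailing identifier, so their semantic prefixes coincide, which is what lets the model treat same-category information alike.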
Step S210, determining sample semantic embedded vectors of each piece of sample semantic information, and predicting the prediction pushing probability of pushing the second sample information according to the sample modal embedded vectors and the sample semantic embedded vectors of each piece of first sample information.
The predicted pushing probability refers to a probability of predicting to push the second sample information when pushing the plurality of first sample information.
Specifically, the computer device may map each sample semantic information to a vector space, to obtain a sample semantic embedded vector corresponding to each sample semantic information.
The computer device determines a sample embedding vector for each first sample information based on the sample modality embedding vector and the sample semantic embedding vector for each first sample information. The computer device predicts a predicted push probability of pushing the second sample information based on the sample embedding vector for each of the first sample information.
Step S212, obtaining expected pushing probability of pushing the second sample information, and performing model training based on the difference of the predicted pushing probability relative to the expected pushing probability to obtain an information pushing model.
The expected pushing probability refers to a probability that when a plurality of pieces of first sample information are pushed, second sample information is expected to be pushed. The expected push probability is pre-annotated as a reference result of model training.
Specifically, the computer device obtains an expected push probability of pushing the second sample information, and determines a difference of the predicted push probability relative to the expected push probability. Model training is carried out based on the difference of the predicted pushing probability relative to the expected pushing probability so as to adjust model parameters and continue training, and an information pushing model is obtained after training is completed.
In this embodiment, the computer device obtains the training stop condition, matches the difference with the training stop condition, adjusts the model parameters and continues training when the difference does not satisfy the training stop condition, and stops until the difference determined by training satisfies the training stop condition, thereby obtaining the information push model.
In this embodiment, a target loss function is obtained, and a target loss value is calculated based on the target loss function, the predicted push probability, and the expected push probability. And when the target loss value meets the training stop condition, obtaining an information push model. And when the target loss value does not meet the training stop condition, adjusting the model parameters and continuing training until the target loss value in training meets the training stop condition, and obtaining the information push model.
The training stopping condition may be a preset iteration number, a preset difference threshold, a loss threshold, and the like. For example, when the difference is greater than the difference threshold, parameters of the model are adjusted and training is continued until the difference determined in the training process is less than or equal to the difference threshold, and the training is stopped, so that the information push model with the completed training is obtained.
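The stop logic above can be sketched as a loop bounded by both a difference (loss) threshold and a maximum iteration count; the loss and step functions here are placeholders for the actual forward pass and parameter update:

```python
def train(loss_fn, step_fn, loss_threshold=0.01, max_iters=1000):
    """Adjust parameters via step_fn() while loss_fn() exceeds the threshold."""
    for i in range(max_iters):
        loss = loss_fn()
        if loss <= loss_threshold:  # training stop condition met
            return i, loss
        step_fn()                   # adjust model parameters, continue training
    return max_iters, loss_fn()

# Toy example: a "model" whose loss halves on every parameter update.
state = {"loss": 1.0}
iters, final_loss = train(lambda: state["loss"],
                          lambda: state.__setitem__("loss", state["loss"] / 2))
```

Either condition alone (threshold or iteration cap) matches the alternatives listed above; combining them guards against training that never converges.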
In the information push model processing method, a plurality of first sample information and second sample information are acquired, and the acquired first sample information and second sample information belong to the same information category in at least one level in the information hierarchy structure, so that the probability of pushing the second sample information is predicted through the plurality of first sample information with the same information category of the selected at least one level. Sample mode information of at least one preset mode corresponding to each first sample information is obtained, sample mode embedded vectors of the sample mode information are determined, so that the sample mode information is mapped to a vector space, and key information of the sample mode information under different modes can be represented through the sample mode embedded vectors. For each first sample information, determining the information category to which each level of the first sample information belongs in the information hierarchy can perform multi-level classification on the first sample information to obtain more detailed category information. Based on the information category to which each level of the first sample information belongs in the information hierarchy, the sample semantic information of the first sample information is determined, so that the sample semantic information can be determined through the detailed category information of multiple levels, and the sample semantic information determined by different information with the same category is the same. 
The sample semantic embedding vector of each sample semantic information is determined so as to predict the predicted pushing probability of pushing the second sample information according to the sample modal embedding vector and the sample semantic embedding vector of each first sample information, so that the probability of the second sample information serving as information to be pushed subsequently can be predicted according to the known modal characteristics and semantic characteristics of a plurality of first sample information without predicting through unique identification of the information. And the prediction result is not influenced no matter whether the identification of the information is changed or not by combining the modal characteristics and the semantic characteristics of the first sample information, so that the prediction of the information is more flexible. The expected pushing probability of pushing the second sample information is obtained, model training is conducted based on the difference of the predicted pushing probability relative to the expected pushing probability, so that the difference between the predicted pushing probability and the expected pushing probability is gradually reduced in the training process, the probability predicted by the model continuously tends to the expected pushing probability, the extraction precision of the model is gradually improved, and the information pushing model is finally obtained. The information pushing model focuses on semantic information and specific modal information of different information in the information hierarchical structure, so that information to be pushed subsequently can be accurately predicted based on the semantic information and the modal information of the known information, and flexible prediction and pushing of the information are realized.
In one embodiment, acquiring a plurality of first sample information and second sample information includes:
acquiring a sample information pushing sequence, and determining a plurality of first sample information and second sample information except the plurality of first sample information in the sample information pushing sequence;
according to the sample mode embedded vector and the sample semantic embedded vector of each piece of first sample information, predicting the predicted pushing probability of pushing the second sample information, including:
acquiring sample pushing position information of each piece of first sample information in a sample sequence, and determining a sample pushing position embedding vector of the sample pushing position information; and predicting the predicted pushing probability of pushing the second sample information according to the sample mode embedding vector, the sample semantic embedding vector and the sample pushing position embedding vector of each piece of first sample information.
The sample pushing position information refers to position information of the first sample information in a sample information pushing sequence, and represents pushing sequence bits of the first sample information in the sample information pushing sequence.
The sample pushing position embeds a vector characterizing the position characteristics of the first sample information in the sample information pushing sequence. I.e. sequence features characterizing the push sequence bits of the first sample information in the sample information push sequence.
In particular, the computer device may obtain a sample information push sequence, each sequence bit of which corresponds to one piece of sample information. The computer device may determine a plurality of first sample information from the sample information at each sequence bit of the sample information push sequence, and determine second sample information other than the plurality of first sample information in the sample information push sequence.
The computer device may obtain sample pushing position information of each first sample information in the sample sequence, and map each sample pushing position information to a vector space to obtain a corresponding sample pushing position embedded vector.
In this embodiment, each sample pushing position information may be encoded separately, so as to obtain a sample pushing position embedded vector corresponding to each first sample information.
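The application does not fix a particular position-encoding scheme; one common choice, assumed here purely for illustration, is the sinusoidal position encoding:

```python
import math

def position_embedding(pos, dim=8):
    """Sinusoidal embedding of a push position (0-based sequence bit)."""
    vec = []
    for i in range(0, dim, 2):
        angle = pos / (10000 ** (i / dim))
        vec += [math.sin(angle), math.cos(angle)]
    return vec

p0 = position_embedding(0)  # embedding of the first sequence bit
```

A learned position-embedding table (one trainable vector per sequence bit) would serve the same purpose of characterizing the push order of each first sample information.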
And determining the sample embedding vector of each piece of first sample information according to the sample mode embedding vector, the sample semantic embedding vector and the sample pushing position embedding vector of each piece of first sample information. And predicting the predicted pushing probability of pushing the second sample information according to the sample embedding vector of each first sample information.
In this embodiment, the sample information located at the target sequence bit in the sample information pushing sequence may be used as the second sample information, and the sample information at the other sequence bits in the sample information pushing sequence may be used as the first sample information. The predicted push probability of pushing the second sample information at the target sequence bit is predicted based on the sample embedding vector of each first sample information. For example, if the second sample information is at the 3rd bit in the sample information pushing sequence, the predicted pushing probability of the second sample information being pushed at the 3rd bit of the sample information pushing sequence is predicted.
In this embodiment, a sample information pushing sequence is acquired, a plurality of first sample information and second sample information except for the plurality of first sample information in the sample information pushing sequence are determined, and sample pushing position information of each first sample information in the sample information pushing sequence is acquired to determine a pushing sequence bit of each sample information in the sample information pushing sequence, so that a sample pushing position embedding vector of each sample pushing position information in a vector space is determined. According to the sample mode embedded vector, the sample semantic embedded vector and the sample pushing position embedded vector of each piece of first sample information, the prediction pushing probability of pushing the second sample information is predicted, so that the prediction probability of pushing the second sample information in a specific sequence in the sample information pushing sequence can be predicted more accurately by combining the mode characteristics, the category characteristics and the position characteristics of the known first sample information in the sample information pushing sequence.
In one embodiment, predicting a predicted push probability of pushing the second sample information based on the sample modality embedded vector, the sample semantic embedded vector, and the sample push position embedded vector of each of the first sample information comprises:
For each piece of first sample information, fusing a sample mode embedded vector, a sample semantic embedded vector and a sample pushing position embedded vector of the first sample information to obtain a sample embedded vector of the first sample information; and predicting the predicted pushing probability of pushing the second sample information according to the sample embedding vector of each first sample information.
Specifically, for each first sample information, the computer device performs a stitching process on the sample mode embedded vector, the sample semantic embedded vector and the sample push position embedded vector of the first sample information so as to fuse the sample mode embedded vector, the sample semantic embedded vector and the sample push position embedded vector of the first sample information to obtain the sample embedded vector of the first sample information.
Or for each piece of first sample information, the computer equipment linearly adds the sample mode embedded vector, the sample semantic embedded vector and the sample push position embedded vector of the first sample information to obtain the sample embedded vector of the first sample information.
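The two fusion alternatives above can be sketched directly; the 3-dimensional vectors are toy values for illustration:

```python
def fuse_concat(modal_vec, semantic_vec, position_vec):
    """Splicing: concatenate the three embedding vectors."""
    return modal_vec + semantic_vec + position_vec

def fuse_add(modal_vec, semantic_vec, position_vec):
    """Linear addition: element-wise sum of the three embedding vectors."""
    return [m + s + p for m, s, p in zip(modal_vec, semantic_vec, position_vec)]

m, s, p = [0.1, 0.2, 0.3], [0.4, 0.5, 0.6], [0.01, 0.02, 0.03]
concatenated = fuse_concat(m, s, p)  # dimension 9
added = fuse_add(m, s, p)            # dimension 3
```

Splicing preserves each feature in its own coordinates at the cost of a larger dimension, while linear addition keeps the dimension fixed but requires the three vectors to share a dimension.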
The computer device generates a fusion embedded vector from the sample embedded vector for each first sample information. The fusion embedded vector is used for representing the predicted sample embedded vector of the information to be pushed, so that the probability that the fusion embedded vector is the sample embedded vector of the second sample information can be predicted, and the probability is used as the predicted pushing probability for pushing the second sample information.
In this embodiment, for each first sample information, the sample mode embedding vector, the sample semantic embedding vector and the sample pushing position embedding vector of the first sample information are fused to obtain the sample embedding vector of the first sample information, so that each sample embedding vector fuses the mode feature, the category feature and the position feature in the sample information pushing sequence of the first sample information, and the prediction pushing probability of pushing the second sample information can be accurately predicted according to the known relevant feature of the first sample information in the sample information pushing sequence.
In one embodiment, as shown in fig. 4, according to the sample modality embedding vector, the sample semantic embedding vector, and the sample push position embedding vector of each first sample information, the predicted push probability of predicting the second sample information is predicted, including steps S402-S404:
step S402, predicting the predicted pushing probability of pushing the second sample information at each sequence position of the sample information sequence according to the sample mode embedding vector, the sample semantic embedding vector and the sample pushing position embedding vector of each first sample information.
Specifically, for each piece of first sample information, the computer device fuses the sample modality embedded vector, the sample semantic embedded vector and the sample push position embedded vector of the first sample information to obtain a sample embedded vector of the first sample information. And predicting the predicted pushing probability of pushing the second sample information at each sequence bit of the sample information sequence according to the sample embedding vector of each first sample information.
Further, a fusion embedded vector is generated from the sample embedded vector of each first sample information. And predicting the predicted pushing probability of pushing the second sample information at each sequence position of the sample information sequence according to the fusion embedded vector.
In this embodiment, the probability that the fusion embedded vector is the sample embedded vector of the sample information at each sequence bit in the sample information sequence is predicted, and these probabilities are used as the predicted pushing probabilities of pushing the second sample information at the respective sequence bits. For example, the first sample information 1, 2, 3 and the second sample information 4 are located at sequence bits 1, 2, 3, 4 of the sample information sequence respectively, and the fusion embedded vector is generated from the sample embedded vectors of the first sample information 1, 2, 3. The probabilities that the fusion embedded vector matches the sample embedded vector of the first sample information 1, the first sample information 2, the first sample information 3, and the second sample information 4 are predicted, and these probabilities are taken as the predicted pushing probabilities of pushing the second sample information 4 at sequence bits 1, 2, 3, and 4.
Step S404, based on the predicted pushing probability of pushing the second sample information at each sequence bit of the sample information sequence, a predicted probability distribution of the second sample information corresponding to the sample information sequence is generated.
Specifically, the computer equipment combines the predicted pushing probability of pushing the second sample information in each sequence bit of the sample information sequence to obtain the predicted probability distribution of the second sample information corresponding to the sample information sequence.
For example, the prediction probability distribution is (0.1,0.2,0.1,0.6), which indicates the prediction push probabilities of the second sample information 4 at the sequence bits 1, 2, 3, and 4, respectively.
The expected pushing probability of pushing the second sample information is obtained, model training is carried out based on the difference of the predicted pushing probability relative to the expected pushing probability, and an information pushing model is obtained, and the method comprises the following steps of S406-S408:
in step S406, a desired probability distribution of the second sample information corresponding to the sample information sequence is obtained, where the desired probability distribution includes a desired pushing probability of pushing the second sample information at each sequence bit of the sample information sequence.
The expected probability distribution refers to the probability distribution that the model is expected to achieve in training, and represents the actual pushing probability of pushing the second sample information at each sequence bit of the sample information sequence. For example, the expected probability distribution is (0, 0, 0, 1), indicating that the expected pushing probabilities of the second sample information 4 at sequence bits 1, 2, 3, and 4 are 0, 0, 0, and 1, respectively. The second sample information 4 is thus at the 4th bit in the sample information sequence, meaning that pushing the second sample information 4 at the 4th bit corresponds to the actual pushing result.
In particular, the computer device obtains a desired probability distribution of the second sample information corresponding to the sample information sequence, the desired probability distribution comprising a desired push probability of pushing the second sample information at each order bit of the sample information sequence.
Step S408, model training is carried out based on the difference between the predicted probability distribution and the expected probability distribution, and an information push model is obtained.
Specifically, the computer device determines a difference between the predicted probability distribution and the expected probability distribution, adjusts model parameters based on the difference, and continues training until the training stop condition is met, and obtains an information push model.
In this embodiment, the difference of the predicted probability distribution relative to the desired probability distribution may be characterized by cross entropy: the cross entropy between the predicted probability distribution and the desired probability distribution is determined, and the model is trained based on the cross entropy to obtain the information push model. Cross entropy minimization is achieved through model training, so that the information push model is obtained. The smaller the cross entropy, the smaller the loss.
Cross entropy is used to describe the distance between two probability distributions, i.e. the distance between the result predicted by the model and the real result can be determined by the cross entropy.
For example, the cross entropy is:

H( (0, 0, 0, 1), (0.1, 0.2, 0.1, 0.6) ) = -(0×log0.1 + 0×log0.2 + 0×log0.1 + 1×log0.6) = -log0.6
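The cross entropy above can be computed as a minimal sketch (natural logarithm assumed; the base is a convention choice, and terms with zero expected probability contribute nothing):

```python
import math

def cross_entropy(expected, predicted):
    """H(p, q) = -sum_i p_i * log(q_i): distance of the predicted
    distribution q from the expected distribution p."""
    return -sum(p * math.log(q) for p, q in zip(expected, predicted) if p > 0)

# Expected distribution: the second sample information was actually pushed at bit 4.
expected = (0, 0, 0, 1)
# Predicted push probabilities at sequence bits 1-4.
predicted = (0.1, 0.2, 0.1, 0.6)

loss = cross_entropy(expected, predicted)  # equals -log(0.6)
```

Because the expected distribution is one-hot, the loss reduces to the negative log-probability that the model assigns to the actually pushed position.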
In this embodiment, the sample modality embedding vector, the sample semantic embedding vector and the sample pushing position embedding vector of each first sample information enable each sample embedding vector to merge the modal feature, the category feature and the position feature of the first sample information in the sample information pushing sequence, so that the predicted pushing probability of pushing the second sample information at each sequence bit of the sample information sequence can be accurately predicted from the known relevant features of the first sample information in the sample information pushing sequence. Based on the predicted pushing probability of pushing the second sample information at each sequence bit of the sample information sequence, the predicted probability distribution of the second sample information corresponding to the sample information sequence is generated, and model training is performed based on the difference between the predicted probability distribution and the expected probability distribution to obtain the information push model. The obtained information push model can predict similar next pushed information based on the modal features, the category features and the position features of a plurality of pieces of information in the information sequence, so that information similar to browsed information can be pushed for a user.
In one embodiment, a sample embedding vector of each first sample information is determined according to a sample modality embedding vector and a sample semantic embedding vector of each first sample information; and carrying out multi-scale decoding processing on the basis of the sample embedded vectors of the first sample information to obtain the predicted pushing probability of pushing the second sample information.
Performing multi-scale decoding processing based on the sample embedded vector of each first sample information to obtain a predicted push probability of pushing the second sample information, including:
performing self-attention processing on the sample embedded vectors of the first sample information to obtain fusion embedded vectors; and acquiring sample pushing weights of the second sample information, and determining the predicted pushing probability of pushing the second sample information based on the fusion embedded vector and the sample pushing weights.
In one embodiment, predicting a predicted push probability of pushing the second sample information from the sample modality embedded vector and the sample semantic embedded vector of each first sample information comprises:
determining a sample embedding vector of each first sample information according to the sample mode embedding vector and the sample semantic embedding vector of each first sample information; performing self-attention processing on the sample embedded vectors of the first sample information to obtain fusion embedded vectors; and acquiring sample pushing weights of the second sample information, and determining the predicted pushing probability of pushing the second sample information based on the fusion embedded vector and the sample pushing weights.
Specifically, for each piece of first sample information, a sample mode embedded vector and a sample semantic embedded vector of the first sample information are fused, and a sample embedded vector of the first sample information is obtained. And carrying out self-attention processing on the sample embedded vectors of the first sample information through a self-attention mechanism to obtain a fusion embedded vector. And acquiring a sample pushing weight corresponding to the second sample information, and taking the product of the fusion embedded vector and the sample pushing weight as the predicted pushing probability for pushing the second sample information.
In this embodiment, the sample embedded vectors of the first sample information may be self-attentively processed by a multi-headed self-attentively mechanism to obtain a fusion embedded vector.
In this embodiment, according to the sample mode embedding vector and the sample semantic embedding vector of each first sample information, the sample embedding vector of each first sample information is determined, and the sample embedding vector of each first sample information is subjected to self-attention processing to capture the dependency relationship between the sample embedding vectors, so as to obtain a fusion embedding vector fused with the dependency relationship between the sample embedding vectors. And acquiring the sample pushing weight of the second sample information, and accurately determining the predicted pushing probability of pushing the second sample information based on the fusion embedded vector and the sample pushing weight.
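As a hedged sketch of the pipeline just described (all names, shapes, and the choice of element-wise sum as the fusion operation are assumptions, not the patent's notation), the path from modal/semantic embeddings through single-head self-attention to predicted push probabilities might look like:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, c = 4, 8, 5   # first sample infos, embedding dim, candidate count (all assumed)

# Sample embedding vector = fusion of the sample modal embedding vector and
# the sample semantic embedding vector; element-wise sum assumed as fusion.
modal_emb = rng.normal(size=(n, d))
semantic_emb = rng.normal(size=(n, d))
sample_emb = modal_emb + semantic_emb

# Single-head self-attention over the sample embedding vectors.
W_K, W_Q, W_V = (rng.normal(size=(d, d)) for _ in range(3))
K, Q, V = sample_emb @ W_K, sample_emb @ W_Q, sample_emb @ W_V
scores = Q @ K.T / np.sqrt(d)
weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
weights /= weights.sum(axis=-1, keepdims=True)   # predicted self-attention weights
fused = (weights @ V)[-1]                        # fusion embedded vector (last position assumed)

# Product of the fusion embedded vector and the sample push weights, softmaxed
# into predicted push probabilities over the candidate second sample informations.
push_weights = rng.normal(size=(d, c))
logits = fused @ push_weights
probs = np.exp(logits - logits.max())
probs /= probs.sum()
```

The softmax at the end is one way to turn the vector-weight products into a proper probability distribution; the patent only states that the product yields the predicted push probability.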
In one embodiment, performing self-attention processing on the sample embedded vector of each first sample information to obtain a fusion embedded vector, including:
mapping the sample embedded vectors of the first sample information to a key space respectively to obtain a key vector of each first sample information; mapping sample embedded vectors of the first sample information to a query space respectively to obtain a query vector of each first sample information; according to each key vector and each query vector, determining a predicted self-attention weight corresponding to each first sample information; and fusing each sample embedded vector based on the predicted self-attention weight to obtain a fused embedded vector.
The key space refers to a projection space under the QKV (Query-Key-Value) mechanism. QKV projection is a core operation of the attention mechanism: it projects an input vector into three different spaces, converting the vector into a vector in each of the three spaces. The three spaces are the query space, the key space, and the value space. The value space is also referred to as the content space.
The predicted self-attention weight refers to the self-attention weight determined by the model in the model training process, and the final self-attention weight is obtained after model training is completed. Self-attention weights refer to weights under the self-attention mechanism that characterize the relevance between embedded vectors. The relevance between embedded vectors may be characterized by the similarity between the embedded vectors.
The self-attention mechanism is a mechanism commonly used in computer vision and natural language processing to establish associations between features and assign weights by computing similarities between queries (Q), keys (K), and values (V). In this embodiment, the self-attention mechanism is applied to feature fusion of the sample embedding vectors of the plurality of first sample information, so as to predict second sample information similar to the first sample information.
Specifically, the computer device obtains an initial key weight, and maps the sample embedded vector of each first sample information to the key space based on the initial key weight, so as to obtain the key vector of each first sample information. It also acquires an initial query weight, and maps each sample embedded vector to the query space based on the initial query weight, obtaining the query vector of each piece of first sample information.
The computer device fuses each sample embedding vector based on the predicted self-attention weights to obtain a fused embedding vector. Further, the predicted self-attention weight comprises fusion weights corresponding to each piece of first sample information, and the computer equipment respectively performs weighted summation on the sample embedded vectors of the first sample information and the corresponding fusion weights to obtain fusion embedded vectors.
As shown in FIG. 5, the initial key weight is W_K and the sample embedding vector is E. The sample embedding vectors E are mapped to the key space by means of the initial key weight W_K, obtaining the key vector K.

The initial query weight is W_Q. The sample embedding vectors E are mapped to the query space by means of the initial query weight W_Q, obtaining the query vector Q.

The predicted self-attention weight A corresponding to each first sample information is determined according to each key vector K and each query vector Q.

It will be appreciated that the initial key weight W_K is a weight matrix applied to each sample embedding vector; the initial query weight W_Q is a weight matrix applied to each sample embedding vector; and the predicted self-attention weight A includes a weight for each sample embedding vector.
In this embodiment, for each key vector, the computer device determines the content similarity between the targeted key vector and each query vector, and determines the predicted self-attention weight corresponding to each piece of first sample information according to the content similarities.
Further, normalization processing is carried out on the similarity of each content, and the predicted self-attention weight corresponding to each piece of first sample information is obtained.
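A minimal pure-Python sketch of this step, using dot products as the content similarity (an assumption; the patent does not fix the similarity measure) and a softmax as the normalization:

```python
import math

# Hypothetical 2-D key and query vectors for three first sample informations.
keys    = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
queries = [(0.5, 0.5), (1.0, 0.0), (0.0, 2.0)]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def softmax(xs):
    m = max(xs)                       # subtract max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

# Content similarity of each query vector against every key vector, then
# normalization: row i is the predicted self-attention weight distribution
# for the i-th first sample information.
attention_weights = [softmax([dot(q, k) for k in keys]) for q in queries]
```

After normalization each row sums to 1, so the weights can be used directly for the weighted summation of the sample embedded vectors.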
In this embodiment, the computer device may determine the key similarity between the key vectors, and determine the predicted self-attention weight corresponding to each first sample information according to the content similarity and the key similarity.
In this embodiment, the computer device may determine query similarities between the query vectors, and determine the predicted self-attention weight corresponding to each first sample information according to the content similarities and the query similarities.
In this embodiment, model training is performed based on a difference between a predicted push probability and an expected push probability, so as to obtain an information push model, including:
determining the difference of the predicted push probability relative to the desired push probability;
when the difference does not meet the training stop condition, determining the key similarity between the key vectors, and adjusting the initial key weight according to the key similarity to obtain updated key weight;
determining query similarity among the query vectors, and adjusting initial query weights according to the query similarity to obtain updated query weights;
and taking the updated key weight as the initial key weight of the next iteration, taking the updated query weight as the initial query weight of the next iteration, and returning to the step of mapping the sample embedded vectors of each piece of first sample information to the key space based on the initial key weight and continuing to execute, until the determined difference meets the training stop condition; training then stops and the information push model is obtained.
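The iterative weight-adjustment loop above can be sketched as follows. The patent does not specify how the weights are adjusted; a numeric-gradient descent on the cross-entropy loss is used here as a crude stand-in for backpropagation, and all shapes, the learning rate, and the stop threshold are assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)
n, d, c = 3, 4, 5
X = rng.normal(size=(n, d))        # sample embedding vectors (held fixed here)
target = 2                         # bit where the second sample info was actually pushed

# Initial key / query / content / push weights (shapes are assumptions).
params = {name: rng.normal(size=shape) * 0.1
          for name, shape in [("W_K", (d, d)), ("W_Q", (d, d)),
                              ("W_V", (d, d)), ("W_push", (d, c))]}

def forward(p):
    K, Q, V = X @ p["W_K"], X @ p["W_Q"], X @ p["W_V"]
    A = Q @ K.T / np.sqrt(d)
    A = np.exp(A - A.max(axis=-1, keepdims=True))
    A /= A.sum(axis=-1, keepdims=True)
    fused = (A @ V)[-1]
    s = fused @ p["W_push"]
    s = np.exp(s - s.max())
    return s / s.sum()

def loss(p):                       # cross entropy against the one-hot target
    return -np.log(forward(p)[target])

init_loss = loss(params)
eps, lr = 1e-5, 0.05
for _ in range(150):
    if loss(params) < 0.05:        # training stop condition (assumed threshold)
        break
    for W in params.values():      # numeric gradient on every weight matrix
        g = np.zeros_like(W)
        for idx in np.ndindex(*W.shape):
            W[idx] += eps; up = loss(params)
            W[idx] -= 2 * eps; down = loss(params)
            W[idx] += eps
            g[idx] = (up - down) / (2 * eps)
        W -= lr * g
final_loss = loss(params)
```

Each pass updates the key, query, content, and push weights and feeds them into the next iteration, matching the "updated weight becomes the initial weight of the next iteration" loop in the text.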
In one embodiment, the method further comprises:
acquiring initial content weight, and mapping sample embedded vectors of each first sample information to a value space based on the initial content weight to obtain a content vector of each first sample information;
Fusing each sample embedded vector based on the predicted self-attention weight to obtain a fused embedded vector, comprising:
and fusing the content vectors of the sample embedded vectors based on the predicted self-attention weights to obtain fused embedded vectors.
Further, when the difference does not meet the training stop condition, determining, for each key vector, content similarity between the targeted key vector and each query vector, respectively;
adjusting initial content weight according to the similarity of each content to obtain updated content weight;
and taking the updated content weight as the initial content weight of the next iteration, returning to the step of mapping the sample embedded vector of each piece of first sample information to the key space based on the initial key weight and continuing to execute.
As shown in fig. 6, the predicted self-attention weight A corresponding to each first sample information is determined according to each key vector K and each query vector Q.

Based on the initial content weight W_V, each sample embedding vector E is mapped to the value space, obtaining the content vector V.

Based on the predicted self-attention weight A, the content vectors V are fused to obtain the fusion embedded vector.

It will be appreciated that the initial content weight W_V is a weight matrix that includes a weight for each sample embedded feature.
In this embodiment, the sample embedded vectors of the first sample information are mapped to the key space respectively to obtain the key vectors of each first sample information, the sample embedded vectors of the first sample information are mapped to the query space respectively to obtain the query vectors of each first sample information, and according to the key vectors and the query vectors, the predicted self-attention weight corresponding to each first sample information can be predicted, and the predicted self-attention weight characterizes the respective importance degree of each key vector and each query vector in fusion, namely, characterizes the attention degree of the model to the sample embedded vectors of each first sample information. Based on the prediction self-attention weight, each sample embedded vector is fused, so that the fused embedded vector fused with the modal characteristics and the semantic characteristics of each first sample information can be obtained, and the prediction pushing probability of pushing the second sample information can be accurately predicted.
In one embodiment, as shown in fig. 7, multiple groups of self-attention processing are performed on the sample embedding vectors of each first sample information to obtain the fusion embedding vector.
For example, the multiple groups of initial key weights are W_K1 and W_K2, the multiple groups of initial query weights are W_Q1 and W_Q2, and the multiple groups of initial content weights are W_V1 and W_V2.
Specifically, the computer device obtains multiple groups of initial key weights, and based on each group of initial key weights, respectively maps sample embedded vectors of each piece of first sample information to a key space to obtain key vectors corresponding to each piece of first sample information in each group. And fusing the key vectors of the first sample information in each group aiming at each piece of first sample information to obtain the key vector corresponding to the first sample information.
As shown in FIG. 8, the multiple groups of initial key weights are W_K1 and W_K2, and the sample embedding vector is E. The sample embedding vectors E are mapped to the key space by the initial key weights W_K1 and W_K2 respectively, obtaining the key vector K1 and the key vector K2. The key vector K1 and the key vector K2 are fused to obtain the key vector K.

According to a similar process, the query vector Q1 and the query vector Q2 can be obtained through the multiple groups of initial query weights W_Q1 and W_Q2, and the query vector Q is obtained by fusing the query vector Q1 and the query vector Q2.

The content vector V1 and the content vector V2 can be obtained through the multiple groups of initial content weights W_V1 and W_V2, and the content vector V can be obtained by fusing the content vector V1 and the content vector V2.
The computer device acquires multiple groups of initial query weights, and maps each sample embedded vector to the query space based on each group of initial query weights, obtaining the query vector of each piece of first sample information in each group. For each piece of first sample information, the query vectors of the first sample information in each group are fused to obtain the query vector corresponding to the first sample information.
And acquiring a plurality of groups of initial content weights, and mapping the sample embedded vectors of the first sample information to a value space respectively based on the plurality of groups of initial content weights to obtain the content vector of each first sample information in each group. And fusing the content vectors of the first sample information in each group aiming at each piece of first sample information to obtain the content vector corresponding to the first sample information.
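The multi-group scheme described above fuses the per-group key, query, and content vectors into a single K, Q, and V before attention (unlike standard multi-head attention, which runs each head separately and concatenates the outputs). A sketch under that reading, with element-wise mean assumed as the fusion operation:

```python
import numpy as np

rng = np.random.default_rng(2)
n, d = 4, 8
X = rng.normal(size=(n, d))        # sample embedding vectors

# Two groups of initial key / query / content weights (W_K1 and W_K2, etc.).
W_K = [rng.normal(size=(d, d)) for _ in range(2)]
W_Q = [rng.normal(size=(d, d)) for _ in range(2)]
W_V = [rng.normal(size=(d, d)) for _ in range(2)]

def project_and_fuse(groups):
    """Map the sample embedding vectors with each group's weight (giving
    K1, K2, ...) and fuse the per-group results; mean assumed as fusion."""
    return np.mean([X @ W for W in groups], axis=0)

K = project_and_fuse(W_K)          # fuse K1 and K2 into K
Q = project_and_fuse(W_Q)          # fuse Q1 and Q2 into Q
V = project_and_fuse(W_V)          # fuse V1 and V2 into V

# Attention over the fused K, Q, V yields the fusion embedded vector.
A = Q @ K.T / np.sqrt(d)
A = np.exp(A - A.max(axis=-1, keepdims=True))
A /= A.sum(axis=-1, keepdims=True)
fused = (A @ V)[-1]
```

Concatenation followed by a linear projection would be the conventional multi-head alternative; the patent text itself only says the per-group vectors are "fused".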
In one embodiment, the information hierarchy is characterized by a category tree that includes a plurality of levels of nodes, the nodes between each level being connected by paths; for each first sample information, determining an information category to which the first sample information belongs at each level in the information hierarchy, including:
traversing the nodes of the class tree aiming at each piece of first sample information, and recording the traversed nodes and paths of the first sample information in the class tree; determining a corresponding hierarchy of the traversed node in the class tree, and determining an information class represented by the traversed node in the hierarchy;
Determining the sample semantic information of the first sample information based on the information category to which the first sample information belongs at each level in the information hierarchy, including:
And connecting the information category represented by the traversed node according to the traversed node and path of the first sample information in the category tree, and obtaining the sample semantic information of the first sample information.
The information hierarchy is characterized by a category tree, the category tree comprises a plurality of levels of nodes, and the nodes among each level are connected through paths. Nodes in the category tree characterize information categories. The first node in the class tree is called the root node, and serves as the largest information class in the class tree.
As shown in fig. 9, the category tree includes a first level, a second level, a third level, and a fourth level. The first level includes node A, characterizing information category A; the second level includes nodes B1, B2, B3, characterizing information categories B1, B2, B3 respectively; the third level includes nodes C1, C2, C3, C4, C5, characterizing information categories C1, C2, C3, C4, C5 respectively; and the fourth level includes node D1, characterizing information category D1. Node A of the first level is connected to nodes B1, B2, B3 of the second level by paths AB1, AB2, AB3, respectively. Node B1 is connected to node C1 of the third level by path B1C1. Node B2 is connected to nodes C2, C3 of the third level via paths B2C2, B2C3, respectively. Node B3 is connected to nodes C4, C5 of the third level via paths B3C4, B3C5, respectively. Node C2 is connected to node D1 of the fourth level by path C2D1.
Specifically, a pre-built class tree is stored in the computer device. For each first sample information, the computer device traverses the nodes of the class tree to classify the first sample information at a different level. The computer device records nodes and paths traversed in the category tree for the first sample information.
The computer device determines each node traversed in the class tree for the first sample information, a corresponding level in the class tree, and determines a class of information characterized in the corresponding level by each node traversed.
And the computer equipment sequentially connects the information categories represented by the traversed nodes according to the traversed nodes and paths of the first sample information in the category tree, so as to obtain the sample semantic information of the first sample information.
As shown in fig. 9, the nodes and paths traversed in the category tree for the first sample information 1 are: node A, path AB2, node B2, path B2C3, node C3; the sample semantic information of the first sample information 1 is therefore: information category A-information category B2-information category C3. The nodes and paths traversed in the category tree for the first sample information 2 are: node A, path AB2, node B2, path B2C2, node C2, path C2D1, node D1; the sample semantic information of the first sample information 2 is therefore: information category A-information category B2-information category C2-information category D1.
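The traversal can be sketched with a hypothetical dictionary mirroring the FIG. 9 tree; a depth-first search records the traversed nodes, whose information categories are then connected into the sample semantic information:

```python
# Hypothetical category tree mirroring FIG. 9: each node maps to its child nodes.
CATEGORY_TREE = {
    "A": ["B1", "B2", "B3"],
    "B1": ["C1"],
    "B2": ["C2", "C3"],
    "B3": ["C4", "C5"],
    "C2": ["D1"],
}

def sample_semantic_info(leaf, tree=CATEGORY_TREE, root="A"):
    """Traverse the tree from the root, record the nodes on the path to the
    node classifying the information, and connect the information categories
    they characterize into the sample semantic information."""
    def dfs(node, path):
        path = path + [node]
        if node == leaf:
            return path
        for child in tree.get(node, []):
            found = dfs(child, path)
            if found:
                return found
        return None
    nodes = dfs(root, [])
    return "-".join(f"information category {n}" for n in nodes)

info1 = sample_semantic_info("C3")
info2 = sample_semantic_info("D1")
```

`info1` reproduces the first example in the text (category A to B2 to C3), and `info2` reproduces the second (A to B2 to C2 to D1).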
In this embodiment, the computer device determines the information identifier of the first sample information, connects the information category represented by the traversed node according to the traversed node and path of the first sample information in the category tree, and connects the information identifier of the first sample information to obtain the sample semantic information of the first sample information. The information identification can be connected with the last information category to obtain sample semantic information.
In this embodiment, for each first sample information, the nodes of the class tree are traversed, and the nodes and paths traversed in the class tree for the first sample information are recorded, so that each first sample information is classified in different levels through the levels to which the nodes of the class tree belong and the paths between the nodes, and each first sample information is gradually divided into more detailed information categories. According to the traversed nodes and paths of the first sample information in the class tree, connecting the traversed node-characterized information classes to distinguish similar first sample information through different levels of information classes so as to form sample semantic information of each first sample information. The information categories of different levels are used as the semantics of the information, so that the prediction of the pushing probability does not depend on the historical interaction data of the information, the information without the historical interaction data or with less historical interaction data can obtain a fair pushing opportunity, and the method is beneficial to increasing the exposure rate of new information or cold information.
In one embodiment, as shown in fig. 10, an information push model processing method is provided, which is illustrated by using the method applied to a computer device (the computer device may be a terminal or a server in fig. 1) as an example, and includes the following steps:
in step S1002, a plurality of first information is acquired.
The first information refers to information of historical pushing or candidate information for pushing. The information categories to which at least one hierarchy of the plurality of first information belongs in the information hierarchy are the same.
Specifically, the computer device may obtain a plurality of first information from the history information of the history push, or obtain a plurality of first information from a plurality of candidate information.
The computer device may acquire the information push sequence, or acquire a plurality of pieces of information in the information push sequence, and use each piece of acquired information as the first information. Further, the information pushing sequence may be a historical information pushing sequence, i.e. a sequence of information pushed to a push receiving object in a historical time period.
Step S1004, acquiring the mode information of at least one preset mode corresponding to each piece of first information, and determining a mode embedding vector of the mode information.
The preset mode refers to a visual presentation mode of the first information, and specifically may be an image, a text, an audio or a video.
The mode information refers to information of the first information under a certain preset mode, for example, the mode information of the image corresponding to the first information refers to the image in the first information. The first information corresponds to the modal information of the text, and refers to the text information in the first information. The first information corresponds to the audio mode information, and refers to the audio information in the first information. The first information corresponds to the modal information of the video, and refers to the video information in the first information.
Specifically, for each first information, the computer device may correspond to modality information of at least one preset modality for the first information. The computer equipment maps each mode information to a vector space respectively to obtain a mode embedded vector corresponding to each mode information.
In step S1006, a semantic embedded vector corresponding to each piece of first information is obtained, where the semantic embedded vector is related to the information category to which the corresponding first information belongs in the information hierarchy; in the information hierarchy, information whose information category at each level is consistent has the same semantic embedded vector.
Specifically, for each first information, an information category to which each level of the first information belongs in the information hierarchy is determined. Semantic information of the first information is determined based on the information category to which each level of the first information belongs in the information hierarchy. The computer device may obtain a semantic embedded vector corresponding to each of the first information.
In this embodiment, for each first information, the computer device classifies the first information according to each level in the information hierarchy, to obtain an information category to which each level of the first information belongs in the information hierarchy. The computer device may connect the information categories to which each level of the information hierarchy belongs for the first information as semantic information for the first information.
In this embodiment, the computer device may connect information categories to which the first information is directed in the information hierarchy according to an order of each level in the information hierarchy, and use the connected information categories as semantic information of the first information.
In this embodiment, the information identifier of each first information may be obtained, and the semantic information of the first information may be obtained based on the information category to which each level of the first information belongs in the information hierarchy and the information identifier of the first information.
Further, according to the sequence of each level in the information level structure, the information category and the information identification of each level of the first information in the information level structure are connected, so that the semantic information of the first information is obtained. The information identification may be connected after the information category of the last hierarchy.
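The connection of per-level categories and the identifier can be sketched as a one-liner (the category strings and identifier are hypothetical, and "-" is assumed as the connector, following the earlier examples):

```python
def semantic_info_with_id(level_categories, info_id):
    """Connect the per-level information categories in hierarchy order and
    append the information identifier after the last level's category."""
    return "-".join(list(level_categories) + [info_id])

semantic = semantic_info_with_id(
    ["information category A", "information category B2", "information category C3"],
    "info-42",   # hypothetical information identifier
)
```

Appending the identifier after the last category keeps information with identical classifications distinguishable while preserving the shared category prefix.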
In this embodiment, the information push model includes a semantic embedded vector corresponding to each semantic information. After the computer equipment determines the semantic information of each piece of first information, the semantic embedded vector corresponding to the semantic information of each piece of first information can be obtained through the information push model.
Step S1008, obtaining a plurality of candidate information, and screening second information for pushing from the plurality of candidate information according to the modal embedded vector and the semantic embedded vector of each piece of first information.
Wherein the candidate information is information for pushing.
Specifically, the computer device acquires a plurality of candidate information, and determines a pushing probability of pushing each candidate information according to the modal embedded vector and the semantic embedded vector of each first information. And screening the second information for pushing from the plurality of candidate information according to each pushing probability.
In this embodiment, for each first information, the computer device fuses the modal embedded vector and the semantic embedded vector for the first information to obtain the information embedded vector for the first information. Second information for pushing is selected from the plurality of candidate information based on the information embedding vector of each first information. Further, a push probability of pushing each candidate information is determined based on the information embedding vector of each first information. And screening the second information for pushing from the plurality of candidate information according to each pushing probability.
In this embodiment, a fusion embedding vector is generated based on the information embedding vector of each first information, and a pushing probability of pushing each candidate information is determined based on the fusion embedding vector. Further, a pushing weight corresponding to each piece of candidate information is obtained, and the pushing probability of pushing each piece of candidate information is determined based on the fusion embedded vector and the pushing weight corresponding to each piece of candidate information. The product of the fusion embedded vector and the push weight corresponding to each candidate information can be used as the push probability of the corresponding candidate information.
In this embodiment, the information push model includes a push weight matrix, where the push weight matrix includes a push weight corresponding to each candidate information.
In this embodiment, a plurality of first information is acquired, the modality information of at least one preset modality corresponding to each piece of first information is acquired, and the modality embedding vector of the modality information is determined, so that the modality information is mapped to a vector space and the key information of each piece of first information under different modalities can be represented by the modality embedding vector. The semantic embedded vector corresponding to each piece of first information is acquired; the semantic embedded vector is related to the information category to which the corresponding first information belongs in the information hierarchy, and information whose information category at each level of the hierarchy is consistent has the same semantic embedded vector. The semantics of information are thus determined by multi-level detailed category information, different information with the same classification has the same semantic embedded vector, and the problem of low pushing probability caused by some information having no historical interaction data is avoided. A plurality of candidate information is acquired, and the second information for pushing is screened from the plurality of candidate information according to the modal embedded vector and the semantic embedded vector of each piece of first information; the screened second information belongs to the same information category as each piece of first information at at least one level of the information hierarchy, so that second information similar to each piece of first information can be screened out, realizing accurate recommendation of similar information. The modality and the semantics of known information are combined to predict similar information for subsequent pushing, and information can be predicted and pushed whether or not it has historical interaction data, making information pushing more flexible.
In one embodiment, as shown in fig. 11, the method further includes steps S1102-S1104:
step S1102, obtaining the mode information of each candidate information corresponding to at least one preset mode, and determining the mode embedding vector of each mode information.
Specifically, for each candidate information, the computer device may acquire the mode information of at least one preset mode corresponding to the candidate information. The computer device maps each mode information to a vector space respectively to obtain a mode embedded vector corresponding to each mode information.
Step S1104, obtaining a semantic embedded vector corresponding to each piece of candidate information, and, for each piece of candidate information, fusing the modal embedded vector and the semantic embedded vector of the candidate information to obtain the candidate embedded vector of the targeted candidate information.
Specifically, the semantic embedded vector is related to the information category to which the corresponding candidate information belongs in the information hierarchy. In the information hierarchy, pieces of information whose information categories at each hierarchy are consistent have the same corresponding semantic embedded vector.
For each candidate information, determining an information category to which the candidate information belongs at each level in the information hierarchy. Semantic information of the targeted candidate information is determined based on the information category to which each level of the targeted candidate information belongs in the information hierarchy. The computer device may obtain a semantic embedded vector corresponding to each candidate information.
In this embodiment, for each candidate information, the computer device classifies the candidate information according to each level in the information hierarchy, and obtains an information category to which each level of the candidate information belongs in the information hierarchy. The computer device may connect the information categories to which each level of the information hierarchy belongs for the candidate information as semantic information for the candidate information.
In this embodiment, the computer device may connect information categories to which the candidate information belongs in the information hierarchy according to the order of each level in the information hierarchy, and use the connected information categories as semantic information of the candidate information.
In this embodiment, the information identifier of each candidate information may be obtained, and the semantic information of the candidate information may be obtained based on the information category to which each level of the information hierarchy of the candidate information belongs and the information identifier of the candidate information.
Further, according to the sequence of each level in the information level structure, the information category and the information identification of each level of the candidate information in the information level structure are connected, so that semantic information of the candidate information is obtained. The information identification may be connected after the information category of the last hierarchy.
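As a hedged sketch of this construction (the category names, the "/" separator, and the function name are illustrative assumptions, not from the source), the semantic information can be formed by joining the per-level categories in hierarchy order and connecting the information identification after the last level:

```python
# Hypothetical sketch: build semantic information for a piece of information by
# joining its category at each hierarchy level, then appending its identifier.

def build_semantic_info(level_categories, info_id=None, sep="/"):
    """Join per-level categories in hierarchy order; optionally append the ID
    after the category of the last hierarchy, as described above."""
    parts = list(level_categories)
    if info_id is not None:
        parts.append(str(info_id))
    return sep.join(parts)

# Items sharing all level categories map to the same semantic prefix, so they
# can share one semantic embedded vector.
info = build_semantic_info(["electronics", "phones", "flagship"], info_id="item_42")
```
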
In this embodiment, the information push model includes a semantic embedded vector corresponding to each semantic information. After the computer equipment determines the semantic information of each candidate information, the semantic embedded vector corresponding to the semantic information of each candidate information can be obtained through the information push model.
Screening second information for pushing from the plurality of candidate information according to the modal embedded vector and the semantic embedded vector of each first information, including steps S1106-S1108:
in step S1106, for each first information, the modality embedded vector and the semantic embedded vector of the first information are fused to obtain an information embedded vector of the first information.
Specifically, for each piece of first information, the computer device fuses the modal embedded vector and the semantic embedded vector of the first information to obtain an information embedded vector corresponding to the first information.
Further, the modal embedded vector and the semantic embedded vector of the first information are linearly summed to obtain the information embedded vector of the first information.
In step S1108, the second information for pushing is selected from the plurality of candidate information according to the information embedded vector of each first information and the candidate embedded vector of each candidate information.
Specifically, the computer device calculates the similarity between the information embedding vector of each first information and the candidate embedding vector of each candidate information, and screens the second information for pushing from the plurality of candidate information according to the respective similarity.
For example, candidate information with similarity greater than a similarity threshold is selected as the second information, or candidate information with the largest similarity is selected as the second information, or a preset number of candidate information is selected as the second information according to the sequence of the similarity from high to low.
In this embodiment, the similarity may be represented by a distance, and then a distance between the information embedded vector of each first information and the candidate embedded vector of each candidate information may be determined, and the second information for pushing may be selected from the plurality of candidate information according to the distance.
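The fusion and screening steps S1106–S1108 above can be sketched as follows. This is a minimal illustration under assumptions: the linear sum is taken as element-wise addition, cosine similarity stands in for the similarity measure, and the top-k rule and all names are hypothetical.

```python
import numpy as np

def fuse(modal_vec, semantic_vec):
    """Linear sum of the modal embedded vector and the semantic embedded vector."""
    return modal_vec + semantic_vec

def screen_candidates(first_vecs, candidate_vecs, top_k=2):
    """Return indices of the top_k candidates most similar to any first-information
    embedding, mirroring the similarity-based screening described above."""
    def cos(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
    # best similarity of each candidate against all first-information vectors
    scores = [max(cos(f, c) for f in first_vecs) for c in candidate_vecs]
    order = np.argsort(scores)[::-1]          # high similarity first
    return [int(i) for i in order[:top_k]]
```

In practice the threshold rule or the fixed-count rule from the paragraph above could replace the top-k selection without changing the structure.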
In this embodiment, the mode information of at least one preset mode corresponding to each candidate information is obtained, and the mode embedding vector of each mode information is determined, so that the mode information is mapped to the vector space and the key information of each candidate information under different modes can be represented through the mode embedding vector. A semantic embedded vector corresponding to each candidate information is obtained, where the semantic embedded vector is related to the information category to which the corresponding candidate information belongs in the information hierarchy; pieces of information whose information categories at each level are consistent have the same semantic embedded vector, so that the semantics of information can be determined through multi-level, detailed category information, and different information with the same classification shares an identical semantic embedded vector, avoiding the problem of low pushing probability caused by some information having no historical interaction data.
For each piece of candidate information, merging the modal embedded vector and the semantic embedded vector of the candidate information to obtain a candidate embedded vector of the candidate information, and for each piece of first information, merging the modal embedded vector and the semantic embedded vector of the first information to obtain an information embedded vector of the first information, so that candidate information similar to the first information can be screened out from a plurality of pieces of candidate information to serve as pushed second information according to the information embedded vector of each piece of first information and the candidate embedded vector of each piece of candidate information, and the screened second information is the same as the information category of at least one layer of the first information in the information hierarchy structure, so that second information similar to each piece of first information can be screened out, and accurate recommendation of similar information can be realized. And the mode and the semantics of the known information are combined to predict the similar information of the subsequent pushing, and whether the information has history interaction data or not, the information can be predicted and pushed, so that the information pushing is more flexible.
In one embodiment, obtaining a plurality of first information includes: acquiring a plurality of first information of push receiving object history browsing;
the method further comprises the steps of: and pushing the second information to the push receiving object.
Specifically, the computer device determines a push receiving object and obtains a plurality of first information historically browsed by the push receiving object. The mode information of at least one preset mode corresponding to each first information is obtained, the mode embedding vector of the mode information is determined, the semantic embedding vector corresponding to each first information is obtained, the second information for pushing is screened from a plurality of candidate information according to the mode embedding vector and the semantic embedding vector of each first information, and the second information is pushed to the push receiving object. In this way, information similar to the first information can be screened and pushed for a user based on the information the user has historically browsed, so the information pushing method can be applied to U2I (User-to-Item) recommendation scenes.
In one embodiment, the method further comprises:
determining a plurality of push receiving objects, and pushing a plurality of pieces of first information to the plurality of push receiving objects respectively; when the push receiving object browses a plurality of first information, pushing second information to the push receiving object browsed by the plurality of first information.
Specifically, the computer device acquires a plurality of first information from each candidate information, acquires mode information of at least one preset mode corresponding to each first information, and determines a mode embedding vector of the mode information. And acquiring a semantic embedded vector corresponding to each piece of first information, and screening second information for pushing from the rest of candidate information except the plurality of pieces of first information according to the modal embedded vector and the semantic embedded vector of each piece of first information.
The computer device determines a plurality of push receiving objects to which a plurality of first information is pushed, respectively. The computer equipment detects whether each push receiving object browses a plurality of first information, and when the push receiving object browses a plurality of first information, pushes second information to the push receiving object browsed by a plurality of first information.
In this embodiment, the first information that may be pushed after pushing the plurality of first information is determined in advance, so that after pushing the plurality of first information to the plurality of push receiving objects, if the push receiving object browses the plurality of first information, the second information may be quickly pushed to the push receiving object browsed the plurality of first information, and application of the push method in an I2I (Item-to-Item) recommendation scene may be effectively implemented.
In one embodiment, an information push model processing method and an information push method are provided, applied to a computer device, including:
training process:
and acquiring a sample information pushing sequence, and determining a plurality of first sample information and second sample information except the plurality of first sample information in the sample information pushing sequence.
Sample mode information of at least one preset mode corresponding to each first sample information is obtained, and sample mode embedding vectors of the sample mode information are determined.
For each first sample information, determining an information category to which each level of the first sample information belongs in the information hierarchy; sample semantic information for the first sample information is determined based on the information category to which each level of the information hierarchy for the first sample information belongs.
Determining sample semantic embedding vectors of each piece of sample semantic information, acquiring sample pushing position information of each piece of first sample information in a sample information pushing sequence, and determining the sample pushing position embedding vectors of the sample pushing position information.
And fusing the sample mode embedded vector, the sample semantic embedded vector and the sample push position embedded vector of the first sample information aiming at each piece of first sample information to obtain the sample embedded vector of the first sample information.
Mapping the sample embedded vectors of the first sample information to a key space respectively to obtain a key vector of each first sample information; mapping sample embedded vectors of the first sample information to a query space respectively to obtain a query vector of each first sample information; according to each key vector and each query vector, determining a predicted self-attention weight corresponding to each first sample information; and fusing each sample embedded vector based on the predicted self-attention weight to obtain a fused embedded vector.
And acquiring a sample pushing weight of each first sample and a sample pushing weight of second sample information, and predicting the predicted pushing probability of pushing the second sample information at each sequence position of the sample information sequence based on the fusion embedded vector and each sample pushing weight.
Based on the predicted push probability of pushing the second sample information at each sequence bit of the sample information sequence, a predicted probability distribution of the second sample information corresponding to the sample information sequence is generated.
Acquiring expected probability distribution of the second sample information corresponding to the sample information sequence, wherein the expected probability distribution comprises expected pushing probability of pushing the second sample information in each sequence bit of the sample information sequence; model training is carried out based on the difference between the predicted probability distribution and the expected probability distribution, and an information push model is obtained.
The application process comprises the following steps:
and determining a plurality of push receiving objects through the information push model, and pushing a plurality of first information to the plurality of push receiving objects respectively.
Acquiring the modal information of at least one preset mode corresponding to each first information through an information pushing model, and determining a mode embedding vector of the modal information; and acquiring a semantic embedding vector corresponding to each first information through an information pushing model, wherein the semantic embedding vector is related to an information category to which the corresponding first information belongs in an information hierarchy, and the corresponding semantic embedding vectors are the same in information of which the information category to which each hierarchy belongs in the information hierarchy.
And fusing the modal embedded vector and the semantic embedded vector of the first information for each piece of first information through an information push model to obtain the information embedded vector of the first information.
Acquiring the mode information of at least one preset mode corresponding to each candidate information through the information pushing model, and determining a mode embedding vector of each mode information; and acquiring semantic embedded vectors corresponding to each piece of candidate information through the information pushing model, and, for each piece of candidate information, fusing the modal embedded vector and the semantic embedded vector of the candidate information to obtain the candidate embedded vector of the targeted candidate information.
And screening the second information for pushing from the plurality of candidate information according to the information embedded vector of each first information and the candidate embedded vector of each candidate information through the information pushing model.
When the push receiving object browses a plurality of first information, pushing second information to the push receiving object browsed by the plurality of first information through the information push model.
In one embodiment, an application scenario of an information push model processing method and an information push method is provided, including:
the method comprises the steps of obtaining a sample information pushing sequence, wherein the length of the sample information pushing sequence is n, namely n pieces of sample information exist in the sample information pushing sequence, and the first (n-1) pieces of sample information in the sample information pushing sequence are determined to serve as first sample information.
For each first sample information, a sample embedding feature of each first sample information is obtained in accordance with the process shown in fig. 3.
Next, sample semantic information for each first sample information is determined according to the following steps:
a class tree is constructed from the plurality of information classes of the sample information, where each node represents an information class and each node is assigned an independent class identifier. For each first sample information, the tree is traversed downward from the root node, eventually forming sample semantic information of the shape "primary class name/secondary class name/…/n-level class name".
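The class-tree traversal above can be sketched as follows; the node structure, category names, and identifiers are illustrative assumptions rather than the patent's concrete data model.

```python
# Illustrative sketch: each node of the class tree holds an information class
# and an independent class identifier; walking from the root to an item's leaf
# yields semantic information shaped like "primary/secondary/.../n-level name".

class CategoryNode:
    def __init__(self, name, node_id):
        self.name, self.node_id = name, node_id   # independent class identifier
        self.children = {}

    def add_child(self, child):
        self.children[child.name] = child
        return child

def semantic_path(root, category_chain):
    """Traverse down from the root node along the item's per-level categories."""
    names, node = [], root
    for cat in category_chain:
        node = node.children[cat]
        names.append(node.name)
    return "/".join(names)

root = CategoryNode("root", 0)
news = root.add_child(CategoryNode("news", 1))
news.add_child(CategoryNode("sports", 2))
```
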
Constructing the sample semantic information in this manner lets it serve as a finer-grained class feature. Unlike the one-to-one correspondence of a conventional information ID, a plurality of first sample information may be mapped to the same sample semantic information, so that when new candidate information appears, its semantic information can share the same semantic embedded vector with candidate information of the same kind, giving the semantic information a certain knowledge migration capability.
The information push model to be trained is shown in fig. 12, and includes: an input layer, an embedding layer, a decoding layer and an output layer. Wherein:
input layer: the method comprises 2 sub-input layers, namely a sample information identification sequence input layer and a sample semantic information sequence input layer.
Sample information sequence input layer: this sub-layer inputs the ID sequence of the n first sample information. If the length of a sequence is less than n, the subsequent unfilled positions need to be padded with a mask.
Sample semantic information sequence input layer: this sub-layer inputs the sample semantic information sequence corresponding to the ID sequence of the n first sample information; if the sample semantic information sequence is shorter than the set maximum length n, the subsequent unfilled positions are padded with masks.
Embedding Layer (Embedding Layer): the embedding layer has 3 sub-embedding layers, namely a semantic information embedding layer, a multi-mode information embedding layer and a position information embedding layer.
The output of the embedding layer is to convert each sample semantic information in the sample information ID sequence and the corresponding sample semantic information sequence into a corresponding sample embedding vector. The sample embedding vector is the linear sum of the sample semantic embedding vector, the sample modal embedding vector and the sample position embedding vector corresponding to the same ID.
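As a hedged sketch of this linear-sum composition (the table contents, keys, and dimension are illustrative assumptions), the sample embedding vector for one position might be assembled as:

```python
import numpy as np

# Sketch: for one position in the sequence, the sample embedding vector is the
# linear sum of the semantic, multimodal, and position embeddings looked up for
# the same ID. All table entries below are made-up stand-ins.

d = 8
semantic_table = {"news/sports": np.ones(d) * 0.1}   # adjusted during training
modal_table = {"item_1": np.ones(d) * 0.2}           # fixed multimodal encoder output
position_table = np.ones((16, d)) * 0.3              # one row per sequence position

def sample_embedding(info_id, semantic_key, position):
    """Linear sum of the three sub-embedding-layer outputs for the same ID."""
    return semantic_table[semantic_key] + modal_table[info_id] + position_table[position]

vec = sample_embedding("item_1", "news/sports", 0)
```
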
Semantic information embedding layer: sample semantic embedded vectors corresponding to sample semantic information for querying each first sample information. The semantic embedded vectors of the sample semantic information are generated randomly at first, and weights of the semantic embedded vectors are adjusted through iteration of a model in the model training process, so that the semantic embedded vectors of the sample semantic information are finally obtained.
Multimodal embedding layer: and the multi-mode embedded vector is used for inquiring the multi-mode embedded vector corresponding to each first sample information ID. The multi-modal embedded vector includes modal embedded vectors for a plurality of preset modalities. The multi-modal embedded vector is obtained by a multi-modal encoder as in fig. 3, the multi-modal embedded vector remaining unchanged during the model training process.
Position information embedding layer: for storing a position embedding vector corresponding to the position of the input first sample information ID in the sample information push sequence, which position embedding vector is also updated in the iterative training of the model.
Decoding layer, i.e. Transformer layer: this layer is mainly a deeper decoder built from a stack of multiple Transformer decoder blocks (decoding blocks) with masking mechanisms, to improve the performance and generalization ability of the model. Each decoding block contains the following parts:
a) Multi-Head Self-Attention mechanism with masking mechanism (Masked Multi-Head Self-Attention): the portion captures the dependency between different positions in the input sequence by performing a multi-headed self-attention calculation on the input sequence. The masking mechanism is used to avoid using future information in the self-attention mechanism.
b) Feedforward neural network (Feedforward Neural Network): the feature expression capability of each position is enhanced by performing full-connection layer calculation on the position embedding vector of each position.
c) Residual connection (Residual Connection) and layer normalization (Layer Normalization): the part carries out residual connection on the output of the multi-head self-attention calculation and the output of the feedforward neural network with the input sequence, and carries out layer normalization on the result after the residual connection so as to accelerate the training of the model and improve the performance of the model.
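The masked self-attention step described in a) above can be sketched as follows. This is a single-head illustration under assumptions: W_k and W_q are random stand-ins for learned projection matrices, and the dimensions are arbitrary.

```python
import numpy as np

# Hedged sketch: map sample embeddings to key and query spaces, compute scaled
# dot-product attention scores, mask out future positions so no position can
# attend to information after it, apply a row-wise softmax to get the
# self-attention weights, and fuse the embeddings with those weights.

rng = np.random.default_rng(0)
d, n = 4, 3
X = rng.normal(size=(n, d))                 # n sample embedding vectors
W_k, W_q = rng.normal(size=(d, d)), rng.normal(size=(d, d))

K, Q = X @ W_k, X @ W_q                     # key / query mappings
scores = Q @ K.T / np.sqrt(d)
scores[np.triu(np.ones((n, n), dtype=bool), k=1)] = -np.inf  # mask the future
weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
weights /= weights.sum(axis=-1, keepdims=True)               # row-wise softmax
fused = weights @ X                         # fused embedded vectors
```

A full decoder block would follow this with the feedforward network, residual connection, and layer normalization of parts b) and c).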
Output layer: the output layer consists of a linear transformation function and a normalization (Softmax) function. The linear transformation maps the output of the last Transformer block to a vector space whose dimension equals the number of candidate information, and the Softmax function converts each element in this vector space to a probability value representing the probability of generating the next sample information ID (i.e. the second sample information). Each element represents the push weight of one candidate information.
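A minimal sketch of this output layer follows; the hidden size, candidate count, and random weights are illustrative assumptions standing in for learned parameters.

```python
import numpy as np

# Sketch: a linear transformation maps the last decoder output to one logit
# (push weight) per candidate information, and softmax turns the logits into
# the probability of each candidate being the next sample information ID.

rng = np.random.default_rng(1)
d, num_candidates = 8, 5
h = rng.normal(size=d)                    # last Transformer block output
W = rng.normal(size=(d, num_candidates))
b = np.zeros(num_candidates)

logits = h @ W + b                        # one push weight per candidate
probs = np.exp(logits - logits.max())
probs /= probs.sum()                      # normalized push probabilities
```
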
The objective in this embodiment is to predict the nth sample information given the first (n-1) sample information. For example, consider a sample information push sequence of length n, S = {s_1, s_2, …, s_n}, where s_t represents the sample information ID at the t-th sequence bit in the sample information push sequence. Then, the objective loss function can be expressed as:

L = -Σ_{t=2}^{n} log P(s_t | s_1, s_2, …, s_{t-1})

wherein {s_1, …, s_{t-1}} represents the sequence of first sample information IDs at the 1st to (t-1)-th sequence bits, and P(s_t | s_1, …, s_{t-1}) indicates the probability of predicting s_t given that first sample information ID sequence. The goal of training is to maximize the product of these conditional probabilities, i.e. to minimize the negative log likelihood loss L. In the training process, an optimization algorithm such as stochastic gradient descent is used to minimize the loss function and update the parameters of the model, so that the model can better predict the next sample information ID of the sample information push sequence.
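The negative log likelihood objective can be sketched numerically as follows; the per-step probabilities are made-up stand-ins for the model's softmax outputs, not values from the source.

```python
import numpy as np

# Sketch of the training objective: sum the negative log of the model's
# probability for each true next sample-information ID given its prefix.

def neg_log_likelihood(step_probs):
    """step_probs[t] = model probability assigned to the true ID s_{t+1}
    conditioned on the prefix s_1..s_t; returns the loss L."""
    return -sum(np.log(p) for p in step_probs)

loss = neg_log_likelihood([0.5, 0.25])   # two prediction steps, toy values
```

Driving each step probability toward 1 drives the loss toward 0, which is what gradient descent on L accomplishes.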
After training is completed, the generation of information embedded vectors and I2I recommendation can be realized through an information push model:
when the training of the information push model is completed, in order to obtain the information embedding vector of each candidate information ID, one information ID is generated for each information IDWhere the first element characterizes the ID of the information and the following (n-1) elements are the mask on the padding.
After the ID sequence of each piece of information is input into the information push model, an embedded vector output is obtained at the last Transformer decoding block (i.e. the n-th Transformer decoding block) in the information push model, and this output is the information embedded vector obtained for each information ID through the information push model.
By using the information embedded vector of each information ID, the information closest to a given piece of information can be found through an approximate nearest neighbor (ANN) algorithm to realize information-to-information (I2I) recommendation.
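A minimal sketch of this I2I retrieval follows. A production system would use an ANN index; brute-force cosine search stands in here, and the item IDs and vectors are hypothetical.

```python
import numpy as np

# Hedged sketch: given each item's information embedded vector, return the
# most similar other item as the I2I recommendation.

def nearest_item(query_id, embeddings):
    """embeddings: dict mapping information ID -> embedded vector.
    Returns the ID of the closest other item by cosine similarity."""
    q = embeddings[query_id]
    best_id, best_sim = None, -np.inf
    for item_id, v in embeddings.items():
        if item_id == query_id:
            continue
        sim = float(q @ v / (np.linalg.norm(q) * np.linalg.norm(v)))
        if sim > best_sim:
            best_id, best_sim = item_id, sim
    return best_id
```
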
In other embodiments, the U2I recommendation may also be implemented by an information push model:
after the information push model is trained, an information sequence browsed by a user history can be input, namely each piece of candidate information can be directly output as the push probability of the next push information through the output layer of the information push model. When the U2I recommends the scene, candidate information with highest probability can be selected to carry out relevant recommendation directly according to the output pushing probabilities.
In other embodiments, cold start of information may also be implemented through an information push model:
when new information is input into the information push model, the original information push model structure can be kept unchanged, and the generation mode of the information semantic embedded vector is only slightly adjusted. In order to prevent the new semantic embedded vector from not belonging to the existing semantic embedded vector stored in the semantic information embedding layer, the hash value of the semantic information can be used as the Key of the query, so that even if the new semantic information appears, the related semantic embedded vector can be queried through the hash value corresponding to the semantic information.
Meanwhile, since the multi-modal information of the candidate information depends only on the preset modality of the information instead of the ID of the information, the multi-modal information of the new candidate information can be stored in the multi-modal embedded layer.
It will be appreciated that the ID of the information in this embodiment does not participate in any computation, but is used only for identifying data such as information, semantic embedded vectors, multimodal embedded vectors, etc.
In this embodiment, an information push model is provided that combines the semantic features and the multimodal features of information with the language model architecture. The information pushing model can encode the multi-mode information and the semantic information of the information at the same time, and the processing efficiency of a recommendation algorithm in a recommendation scene can be greatly improved by combining the powerful inference capability of the language model.
In addition, the embodiment can effectively solve the problem of information ID invalidation caused by the new and old information adding and invalidation in the recommendation field. And the recommendation of new information can be realized without continuously retraining the model due to the information ID invalidation problem caused by the new and old information adding and invalidation.
In addition, the embodiment does not need to encode the information ID, but extracts the category of different levels of the information as semantic features, so that the problem of cold start caused by no history interaction data of the information can be effectively avoided.
It should be understood that, although the steps in the flowcharts related to the above embodiments are sequentially shown as indicated by arrows, these steps are not necessarily sequentially performed in the order indicated by the arrows. The steps are not strictly limited to the order of execution unless explicitly recited herein, and the steps may be executed in other orders. Moreover, at least some of the steps in the flowcharts described in the above embodiments may include a plurality of steps or a plurality of stages, which are not necessarily performed at the same time, but may be performed at different times, and the order of the steps or stages is not necessarily performed sequentially, but may be performed alternately or alternately with at least some of the other steps or stages.
Based on the same inventive concept, the embodiment of the application also provides an information push model processing device for realizing the above related information push model processing method. The implementation of the solution provided by the device is similar to the implementation described in the above method, so the specific limitation in the embodiments of the information push model processing device or devices provided below may refer to the limitation of the information push model processing method hereinabove, and will not be described herein.
In one embodiment, as shown in fig. 13, there is provided an information push model processing apparatus 1300, including:
the sample information obtaining module 1302 is configured to obtain a plurality of first sample information and second sample information, where each of the first sample information and the second sample information belongs to the same information category in at least one hierarchy in the information hierarchy.
The sample mode determining module 1304 is configured to obtain sample mode information of at least one preset mode corresponding to each first sample information, and determine a sample mode embedding vector of the sample mode information.
An information category determining module 1306, configured to determine, for each first sample information, an information category to which each level of the information hierarchy for the first sample information belongs.
The sample semantic determining module 1308 is configured to determine sample semantic information of the first sample information based on an information category to which each level of the information hierarchy belongs.
The prediction module 1310 is configured to determine a sample semantic embedding vector of each sample semantic information, and predict a prediction pushing probability of pushing the second sample information according to the sample modality embedding vector and the sample semantic embedding vector of each first sample information.
The training module 1312 is configured to obtain an expected push probability of pushing the second sample information, and perform model training based on a difference between the predicted push probability and the expected push probability, to obtain an information push model.
In this embodiment, a plurality of first sample information and second sample information are acquired, and the acquired first sample information and second sample information belong to the same information category in at least one hierarchy in the information hierarchy structure, so as to predict the probability of pushing the second sample information by the plurality of first sample information with the same information category in the selected at least one hierarchy. Sample mode information of at least one preset mode corresponding to each first sample information is obtained, sample mode embedded vectors of the sample mode information are determined, so that the sample mode information is mapped to a vector space, and key information of the sample mode information under different modes can be represented through the sample mode embedded vectors. For each first sample information, determining the information category to which each level of the first sample information belongs in the information hierarchy can perform multi-level classification on the first sample information to obtain more detailed category information. Based on the information category to which each level of the first sample information belongs in the information hierarchy, the sample semantic information of the first sample information is determined, so that the sample semantic information can be determined through the detailed category information of multiple levels, and the sample semantic information determined by different information with the same category is the same. 
The sample semantic embedded vector of each piece of sample semantic information is determined, and the predicted pushing probability of pushing the second sample information is predicted according to the sample modal embedded vector and the sample semantic embedded vector of each first sample information, so that the probability of the second sample information serving as the information to be pushed subsequently can be predicted from the known modal features and semantic features of the plurality of first sample information, without relying on a unique identifier of the information. Because the modal features and semantic features of the first sample information are combined, the prediction result is unaffected even if the identifier of the information changes, making the prediction of information more flexible. The expected pushing probability of pushing the second sample information is obtained, and model training is performed based on the difference of the predicted pushing probability relative to the expected pushing probability, so that this difference is gradually reduced during training, the probability predicted by the model continuously approaches the expected pushing probability, and the prediction precision of the model is gradually improved, finally yielding the information pushing model. The information pushing model focuses on the semantic information and the specific modal information of different information in the information hierarchy structure, so the information to be pushed subsequently can be accurately predicted based on the semantic information and modal information of known information, realizing flexible prediction and pushing of information.
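The forward pass described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: all variable names, dimensions, and the choices of element-wise summation for fusion, mean pooling, and a sigmoid over a dot product are assumptions made for clarity.

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 8       # embedding dimension (illustrative)
n_first = 4   # number of known first sample information items

modality_emb = rng.normal(size=(n_first, dim))  # sample modality embedding vectors
semantic_emb = rng.normal(size=(n_first, dim))  # sample semantic embedding vectors

# Fuse modality and semantic features per first sample (element-wise sum here).
sample_emb = modality_emb + semantic_emb

# Pool the sequence into one representation (mean pooling as a simple stand-in
# for the self-attention fusion described in a later embodiment).
fused = sample_emb.mean(axis=0)

# Push weight vector of the candidate second sample information.
push_weight = rng.normal(size=dim)

# Predicted push probability via a sigmoid over the dot product.
logit = float(fused @ push_weight)
predicted_prob = 1.0 / (1.0 + np.exp(-logit))
assert 0.0 < predicted_prob < 1.0
```

During training, `predicted_prob` would be compared against the expected push probability and the difference minimized, which is the step the training module performs.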
In one embodiment, the sample information obtaining module 1302 is further configured to obtain a sample information push sequence, determine a plurality of first sample information and second sample information except the plurality of first sample information in the sample information push sequence;
the prediction module 1310 is further configured to obtain sample pushing position information of each first sample information in the sample information push sequence, and determine a sample pushing position embedding vector of the sample pushing position information;
and predicting the predicted pushing probability of pushing the second sample information according to the sample mode embedding vector, the sample semantic embedding vector and the sample pushing position embedding vector of each piece of first sample information.
In this embodiment, a sample information pushing sequence is acquired, and a plurality of first sample information and the second sample information other than the plurality of first sample information in the sample information pushing sequence are determined. The sample pushing position information of each first sample information in the sample information pushing sequence is acquired to determine the pushing sequence bit of each sample information, and the sample pushing position embedding vector of each piece of sample pushing position information in the vector space is then determined. The predicted pushing probability of pushing the second sample information is predicted according to the sample mode embedded vector, the sample semantic embedded vector and the sample pushing position embedded vector of each piece of first sample information, so that the probability of pushing the second sample information at a specific position in the sample information pushing sequence can be predicted more accurately by combining the modal features, category features and position features of the known first sample information in the sequence.
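One common way to realize the sample pushing position embedding vector is a learned lookup table indexed by the sequence bit, sketched below. The table, its dimensions, and the use of a plain index as the position information are assumptions for illustration; the patent does not fix a particular encoding.

```python
import numpy as np

rng = np.random.default_rng(1)
seq_len, dim = 5, 8  # sequence length and embedding dimension (illustrative)

# A learned position table: one row per sequence bit in the push sequence.
position_table = rng.normal(size=(seq_len, dim))

# Sample pushing position information: the index of each first sample
# information in the sample information pushing sequence.
positions = np.arange(seq_len)

# Sample pushing position embedding vectors, obtained by table lookup.
position_emb = position_table[positions]
assert position_emb.shape == (seq_len, dim)
```

In a trained model the table rows would be parameters updated by backpropagation, so that each sequence bit acquires a distinct positional representation.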
In one embodiment, the prediction module 1310 is further configured to, for each piece of first sample information, fuse the sample modality embedded vector, the sample semantic embedded vector and the sample push position embedded vector of the piece of first sample information to obtain a sample embedded vector of the piece of first sample information; and predicting the predicted pushing probability of pushing the second sample information according to the sample embedding vector of each first sample information.
In this embodiment, for each first sample information, the sample mode embedded vector, the sample semantic embedded vector and the sample pushing position embedded vector of the first sample information are fused to obtain the sample embedded vector of the first sample information, so that each sample embedded vector fuses the modal feature, the category feature and the position feature of the first sample information in the sample information pushing sequence, and the predicted pushing probability of pushing the second sample information can be accurately predicted according to the known relevant features of the first sample information in the sample information pushing sequence.
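The three-way fusion can be sketched as a concatenation followed by a linear projection back to the model dimension. This is one plausible fusion choice, not the patent's prescribed one; summation or gating would serve equally well, and all names and dimensions are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)
dim = 8  # model dimension (illustrative)

modality_emb = rng.normal(size=dim)  # sample mode embedded vector
semantic_emb = rng.normal(size=dim)  # sample semantic embedded vector
position_emb = rng.normal(size=dim)  # sample pushing position embedded vector

# Concatenate the three embeddings, then project back to the model dimension.
concat = np.concatenate([modality_emb, semantic_emb, position_emb])
projection = rng.normal(size=(dim, 3 * dim)) / np.sqrt(3 * dim)  # learned in practice
sample_emb = projection @ concat  # sample embedded vector of the first sample information
assert sample_emb.shape == (dim,)
```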
In one embodiment, the prediction module 1310 is further configured to predict a predicted push probability of pushing the second sample information at each sequence bit of the sample information sequence according to the sample modality embedded vector, the sample semantic embedded vector, and the sample push position embedded vector of each first sample information; and generate, based on the predicted push probability of pushing the second sample information at each sequence bit of the sample information sequence, a predicted probability distribution of the second sample information corresponding to the sample information sequence;
The training module 1312 is further configured to obtain an expected probability distribution of the second sample information corresponding to the sample information sequence, where the expected probability distribution includes an expected pushing probability of pushing the second sample information at each sequence bit of the sample information sequence; and perform model training based on the difference between the predicted probability distribution and the expected probability distribution, to obtain an information push model.
In this embodiment, the sample mode embedded vector, the sample semantic embedded vector and the sample pushing position embedded vector of each first sample information together carry the modal feature, the category feature and the position feature of the first sample information in the sample information pushing sequence, so the predicted pushing probability of pushing the second sample information at each sequence bit of the sample information sequence can be accurately predicted from the known relevant features of the first sample information. Based on the predicted pushing probability of pushing the second sample information at each sequence bit, a predicted probability distribution of the second sample information corresponding to the sample information sequence is generated, and model training is performed based on the difference between the predicted probability distribution and the expected probability distribution to obtain the information pushing model. The obtained information pushing model can predict the next information to be pushed based on the modal features, category features and position features of a plurality of pieces of information in an information sequence, so that information similar to browsed information can be pushed to the user.
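Training on the difference between the two distributions is typically done with a cross-entropy loss, as sketched below. The concrete distributions, the one-hot expected distribution, and the use of cross-entropy (rather than, say, KL divergence) are assumptions; the patent only requires the difference between the distributions to shrink.

```python
import numpy as np

# Predicted push probabilities over the sequence bits (illustrative values).
predicted_dist = np.array([0.1, 0.2, 0.6, 0.1])
# Expected probability distribution: here a one-hot ground truth saying the
# second sample information should be pushed at the third sequence bit.
expected_dist = np.array([0.0, 0.0, 1.0, 0.0])

# Cross-entropy between the expected and predicted distributions; minimizing
# it drives the predicted distribution toward the expected one.
eps = 1e-12  # numerical guard against log(0)
cross_entropy = -np.sum(expected_dist * np.log(predicted_dist + eps))
assert 0.5 < cross_entropy < 0.52  # -log(0.6) for this example
```

As the model improves and the predicted mass at the correct sequence bit approaches 1, `cross_entropy` approaches 0, which is the behavior the training module relies on.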
In one embodiment, the prediction module 1310 is further configured to determine a sample embedding vector of each first sample information according to the sample modality embedding vector and the sample semantic embedding vector of each first sample information; performing self-attention processing on the sample embedded vectors of the first sample information to obtain fusion embedded vectors; and acquiring sample pushing weights of the second sample information, and determining the predicted pushing probability of pushing the second sample information based on the fusion embedded vector and the sample pushing weights.
In this embodiment, according to the sample mode embedding vector and the sample semantic embedding vector of each first sample information, the sample embedding vector of each first sample information is determined, and the sample embedding vector of each first sample information is subjected to self-attention processing to capture the dependency relationship between the sample embedding vectors, so as to obtain a fusion embedding vector fused with the dependency relationship between the sample embedding vectors. And acquiring the sample pushing weight of the second sample information, and accurately determining the predicted pushing probability of pushing the second sample information based on the fusion embedded vector and the sample pushing weight.
In one embodiment, the prediction module 1310 is further configured to map the sample embedded vectors of the first sample information to the key space respectively, so as to obtain a key vector of each first sample information; mapping sample embedded vectors of the first sample information to a query space respectively to obtain a query vector of each first sample information; according to each key vector and each query vector, determining a predicted self-attention weight corresponding to each first sample information; and fusing each sample embedded vector based on the predicted self-attention weight to obtain a fused embedded vector.
In this embodiment, the sample embedded vectors of the first sample information are mapped to the key space to obtain the key vector of each first sample information, and mapped to the query space to obtain the query vector of each first sample information. According to the key vectors and the query vectors, the predicted self-attention weight corresponding to each first sample information can be determined; the predicted self-attention weight characterizes the importance of each key vector and query vector in fusion, that is, the degree of attention the model pays to the sample embedded vector of each first sample information. By fusing the sample embedded vectors based on the predicted self-attention weights, a fusion embedded vector that merges the modal features and semantic features of each first sample information is obtained, so the predicted pushing probability of pushing the second sample information can be accurately predicted.
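The key/query mapping and weighted fusion read as standard scaled dot-product self-attention, sketched below. The projection matrices, the scaling by the square root of the dimension, and the final mean pooling into a single fusion embedded vector are conventional choices assumed here, not details fixed by the patent text.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(3)
n, dim = 4, 8
sample_emb = rng.normal(size=(n, dim))  # sample embedded vectors

# Map the sample embedded vectors to the key space and the query space.
W_k = rng.normal(size=(dim, dim)) / np.sqrt(dim)
W_q = rng.normal(size=(dim, dim)) / np.sqrt(dim)
keys = sample_emb @ W_k
queries = sample_emb @ W_q

# Predicted self-attention weights from the key and query vectors.
attn = softmax(queries @ keys.T / np.sqrt(dim), axis=-1)
assert np.allclose(attn.sum(axis=-1), 1.0)  # each row is a distribution

# Fuse the sample embedded vectors with the attention weights, then pool
# into a single fusion embedded vector.
fused_per_item = attn @ sample_emb
fusion_embedded_vector = fused_per_item.mean(axis=0)
assert fusion_embedded_vector.shape == (dim,)
```

In the described apparatus, `fusion_embedded_vector` would then be combined with the sample pushing weight of the second sample information to produce the predicted pushing probability.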
In one embodiment, the information category determining module 1306 is further configured to traverse, for each first sample information, a node of the category tree, and record the node and the path traversed in the category tree for the first sample information; determining a corresponding hierarchy of the traversed node in the class tree, and determining an information class represented by the traversed node in the hierarchy;
The sample semantic determining module 1308 is further configured to connect the information categories represented by the traversed nodes according to the traversed nodes and paths of the first sample information in the category tree, so as to obtain sample semantic information of the first sample information.
In this embodiment, for each first sample information, the nodes of the category tree are traversed, and the nodes and paths traversed in the category tree for that first sample information are recorded, so that each first sample information is classified at different levels through the levels to which the traversed nodes belong and the paths between them, and each first sample information is gradually divided into more detailed information categories. According to the nodes and paths traversed in the category tree, the information categories represented by the traversed nodes are connected, so that similar first sample information can be distinguished by information categories at different levels, forming the sample semantic information of each first sample information. Using information categories at different levels as the semantics of information means the prediction of the pushing probability does not depend on the historical interaction data of the information, so information with little or no historical interaction data obtains a fair pushing opportunity, which helps increase the exposure rate of new or cold information.
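The traversal-and-connect step can be sketched as a path search in a small category tree. The tree contents, field names, and the "/" separator used to connect the categories are purely illustrative assumptions.

```python
# A toy category tree; each node carries an information category name.
category_tree = {
    "name": "root",
    "children": [
        {"name": "electronics", "children": [
            {"name": "phone", "children": []},
            {"name": "laptop", "children": []},
        ]},
        {"name": "clothing", "children": [
            {"name": "shoes", "children": []},
        ]},
    ],
}

def semantic_path(tree, leaf_category):
    """Return the category names on the path from the root to leaf_category,
    or an empty list if the category is not in the tree."""
    if tree["name"] == leaf_category:
        return [tree["name"]]
    for child in tree["children"]:
        sub = semantic_path(child, leaf_category)
        if sub:
            return [tree["name"]] + sub
    return []

# Connect the information categories along the traversed path to form the
# sample semantic information of a "phone" item (root node dropped).
path = semantic_path(category_tree, "phone")
semantic_info = "/".join(path[1:])
assert semantic_info == "electronics/phone"
```

Two items with the same leaf category produce identical semantic information, which is what lets items without interaction history share the semantic embedding of their category.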
Based on the same inventive concept, the embodiment of the application also provides an information pushing device for realizing the above related information pushing method. The implementation of the solution provided by the device is similar to the implementation described in the above method, so the specific limitation in one or more embodiments of the information pushing device provided below may refer to the limitation of the information pushing method hereinabove, and will not be repeated herein.
In one embodiment, as shown in fig. 14, there is provided an information pushing apparatus 1400, comprising:
the information acquisition module 1402 is configured to acquire a plurality of first information.
The vector determining module 1404 is configured to obtain modality information of at least one preset modality corresponding to each first information, and determine a modality embedding vector of the modality information.
The semantic acquisition module 1406 is configured to acquire a semantic embedded vector corresponding to each first information, where the semantic embedded vector is related to the information category to which the corresponding first information belongs in an information hierarchy, and pieces of information whose information categories at each hierarchy level of the information hierarchy are consistent have the same semantic embedded vector.
And a filtering module 1408, configured to obtain a plurality of candidate information, and filter the second information for pushing from the plurality of candidate information according to the modal embedded vector and the semantic embedded vector of each first information.
In this embodiment, a plurality of first information is acquired; the modality information of at least one preset modality corresponding to each first information is acquired, and the modality embedding vector of the modality information is determined, so that the modality information is mapped to a vector space and the key information of each first information under different modalities can be represented by the modality embedding vector. The semantic embedded vector corresponding to each first information is acquired; the semantic embedded vector is related to the information category to which the corresponding first information belongs in the information hierarchy, and pieces of information with identical information categories at every hierarchy level share the same semantic embedded vector, so the semantics of information are determined from multi-level detailed category information and the problem of low pushing probability for information without historical interaction data is avoided. A plurality of candidate information is acquired, and second information for pushing is screened from the candidates according to the modal embedded vector and semantic embedded vector of each first information. The screened second information belongs to the same information category as each first information in at least one layer of the information hierarchy structure, so second information similar to each first information can be screened out, realizing accurate recommendation of similar information. Since the modality and semantics of known information are combined to predict the similar information to be pushed subsequently, information can be predicted and pushed regardless of whether it has historical interaction data, making information pushing more flexible.
In one embodiment, the vector determining module 1404 is further configured to obtain modality information of each candidate information corresponding to at least one preset modality, and determine a modality embedded vector of each modality information; acquiring a semantic embedded vector corresponding to each candidate information, and respectively fusing the modal embedded vector and the semantic embedded vector of each candidate information to acquire a candidate embedded vector of each candidate information;
respectively fusing the modal embedded vector and the semantic embedded vector of each piece of first information to obtain the information embedded vector of each piece of first information;
the filtering module 1408 is further configured to filter the second information for pushing from the plurality of candidate information according to the information embedded vector of each first information and the candidate embedded vector of each candidate information.
In this embodiment, the modality information of at least one preset modality corresponding to each candidate information is obtained, and the modality embedding vector of each piece of modality information is determined, so that the modality information is mapped to the vector space and the key information of each candidate information under different modalities can be represented by its modality embedding vector. The semantic embedded vector corresponding to each candidate information is obtained; the semantic embedded vector is related to the information category to which the corresponding candidate information belongs in the information hierarchy, and pieces of information with identical information categories at every hierarchy level share the same semantic embedded vector, so the semantics of information are determined from multi-level detailed category information and the problem of low pushing probability for information without historical interaction data is avoided.
For each piece of candidate information, the modal embedded vector and the semantic embedded vector of the candidate information are fused to obtain its candidate embedded vector, and for each piece of first information, the modal embedded vector and the semantic embedded vector of the first information are fused to obtain its information embedded vector. Candidate information similar to the first information can then be screened out from the plurality of candidate information as the second information for pushing, according to the information embedded vector of each first information and the candidate embedded vector of each candidate information. The screened second information belongs to the same information category as the first information in at least one layer of the information hierarchy structure, so second information similar to each first information can be screened out, realizing accurate recommendation of similar information. Since the modality and semantics of known information are combined to predict the similar information to be pushed subsequently, information can be predicted and pushed regardless of whether it has historical interaction data, making information pushing more flexible.
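The screening step amounts to a nearest-neighbor search in the shared embedding space, sketched below with cosine similarity. The similarity measure, dimensions, and variable names are assumptions for illustration; any vector similarity (dot product, Euclidean distance) could play the same role.

```python
import numpy as np

rng = np.random.default_rng(4)
dim, n_candidates = 8, 6

info_emb = rng.normal(size=dim)                        # fused information embedded vector
candidate_embs = rng.normal(size=(n_candidates, dim))  # candidate embedded vectors

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Screen the second information for pushing: the candidate whose embedded
# vector is most similar to the information embedded vector.
scores = [cosine(info_emb, c) for c in candidate_embs]
best = int(np.argmax(scores))
assert 0 <= best < n_candidates
```

In practice the first-information embeddings would first be pooled into one query vector, and the top-k most similar candidates (rather than a single one) would typically be selected for pushing.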
In one embodiment, the information obtaining module 1402 is further configured to obtain a plurality of first information of a push receiving object history browse;
the apparatus further comprises:
and the pushing module is used for pushing the second information to the pushing receiving object.
In this embodiment, information similar to the first information can be screened and pushed for the user based on the information the user has browsed historically, so the information pushing method can be applied to the U2I (User-to-Item) recommendation scene.
In one embodiment, the apparatus further comprises:
the pushing module is used for determining a plurality of pushing receiving objects and pushing the plurality of pieces of first information to the pushing receiving objects respectively; and when a pushing receiving object has browsed the plurality of first information, pushing the second information to the pushing receiving object that has browsed the plurality of first information.
In this embodiment, the second information that may be pushed after the plurality of first information is determined in advance, so that after the plurality of first information is pushed to the plurality of push receiving objects, if a push receiving object has browsed the plurality of first information, the second information can be quickly pushed to that push receiving object, effectively realizing the application of the pushing method in an I2I (Item-to-Item) recommendation scene.
The above information push model processing apparatus and the respective modules in the information pushing device may be implemented in whole or in part by software, hardware, or combinations thereof. The above modules may be embedded in hardware in, or independent of, a processor in the computer device, or may be stored as software in a memory in the computer device, so that the processor can call and execute the operations corresponding to the above modules.
In one embodiment, a computer device is provided, which may be a terminal or a server. Taking the terminal as an example, its internal structure may be as shown in fig. 15. The computer device includes a processor, a memory, an input/output interface, a communication interface, a display unit, and an input device. The processor, the memory and the input/output interface are connected through a system bus, and the communication interface, the display unit and the input device are connected to the system bus through the input/output interface. The processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for the operation of the operating system and the computer program in the non-volatile storage medium. The input/output interface of the computer device is used to exchange information between the processor and external devices. The communication interface of the computer device is used for wired or wireless communication with an external terminal; the wireless mode can be realized through WIFI, a mobile cellular network, NFC (near field communication) or other technologies. The computer program, when executed by the processor, implements an information push model processing method and an information push method.
The display unit of the computer device is used to form a visual picture, and may be a display screen, a projection device, or a virtual reality imaging device; the display screen may be a liquid crystal display screen or an electronic ink display screen. The input device of the computer device may be a touch layer covering the display screen, a key, a trackball or a touchpad arranged on the housing of the computer device, or an external keyboard, touchpad, mouse, or the like.
It will be appreciated by those skilled in the art that the structure shown in fig. 15 is merely a block diagram of a portion of the structure associated with the present inventive arrangements and is not limiting of the computer device to which the present inventive arrangements are applied, and that a particular computer device may include more or fewer components than shown, or may combine certain components, or have a different arrangement of components.
In an embodiment, there is also provided a computer device comprising a memory and a processor, the memory having stored therein a computer program, the processor implementing the steps of the method embodiments described above when the computer program is executed.
In one embodiment, a computer-readable storage medium is provided, on which a computer program is stored which, when executed by a processor, carries out the steps of the method embodiments described above.
In an embodiment, a computer program product is provided, comprising a computer program which, when executed by a processor, implements the steps of the method embodiments described above.
It should be noted that, the user information (including but not limited to user equipment information, user personal information, etc.) and the data (including but not limited to data for analysis, stored data, presented data, etc.) related to the present application are information and data authorized by the user or sufficiently authorized by each party, and the collection, use and processing of the related data need to comply with related laws and regulations and standards.
The information pushing involved in the present application may be refused by the user, and a convenient way to refuse it is provided.
Those skilled in the art will appreciate that implementing all or part of the methods described above may be accomplished by way of a computer program stored on a non-volatile computer-readable storage medium, which, when executed, may comprise the steps of the embodiments of the methods described above. Any reference to memory, database, or other medium used in the embodiments provided herein may include at least one of non-volatile and volatile memory. The non-volatile memory may include Read-Only Memory (ROM), magnetic tape, floppy disk, flash memory, optical memory, high-density embedded non-volatile memory, resistive random access memory (ReRAM), magnetoresistive random access memory (Magnetoresistive Random Access Memory, MRAM), ferroelectric memory (Ferroelectric Random Access Memory, FRAM), phase change memory (Phase Change Memory, PCM), graphene memory, and the like. Volatile memory may include Random Access Memory (RAM), external cache memory, and the like. By way of illustration, and not limitation, RAM can take a variety of forms, such as static random access memory (Static Random Access Memory, SRAM) or dynamic random access memory (Dynamic Random Access Memory, DRAM). The databases referred to in the embodiments provided herein may include at least one of a relational database and a non-relational database. The non-relational database may include, but is not limited to, a blockchain-based distributed database, and the like. The processor referred to in the embodiments provided in the present application may be a general-purpose processor, a central processing unit, a graphics processor, a digital signal processor, a programmable logic unit, a data processing logic unit based on quantum computing, or the like, but is not limited thereto.
The technical features of the above embodiments may be arbitrarily combined, and all possible combinations of the technical features in the above embodiments are not described for brevity of description, however, as long as there is no contradiction between the combinations of the technical features, they should be considered as the scope of the description.
The foregoing examples illustrate only a few embodiments of the application and are described in detail herein without thereby limiting the scope of the application. It should be noted that it will be apparent to those skilled in the art that several variations and modifications can be made without departing from the spirit of the application, which are all within the scope of the application. Accordingly, the scope of the application should be assessed as that of the appended claims.

Claims (14)

1. An information push model processing method, which is characterized by comprising the following steps:
acquiring a plurality of first sample information and second sample information, wherein the first sample information and the second sample information belong to the same information category in at least one hierarchy of an information hierarchy structure;
acquiring sample mode information of at least one preset mode corresponding to each piece of first sample information, and determining a sample mode embedding vector of the sample mode information;
For each piece of first sample information, determining an information category to which each level of the first sample information belongs in the information hierarchy;
determining sample semantic information of the first sample information based on the information category to which each level of the first sample information belongs in the information hierarchy;
determining sample semantic embedded vectors of each piece of sample semantic information, and predicting the predicted pushing probability of pushing the second sample information according to the sample modal embedded vectors and the sample semantic embedded vectors of each piece of first sample information;
and acquiring expected pushing probability of pushing the second sample information, and performing model training based on the difference of the predicted pushing probability relative to the expected pushing probability to acquire an information pushing model.
2. The method of claim 1, wherein the obtaining a plurality of first sample information and second sample information comprises:
acquiring a sample information push sequence, and determining a plurality of first sample information and second sample information except the plurality of first sample information in the sample information push sequence;
the predicting a predicted push probability of pushing the second sample information according to the sample modality embedded vector and the sample semantic embedded vector of each piece of the first sample information includes:
Acquiring sample pushing position information of each piece of first sample information in the sample information pushing sequence, and determining a sample pushing position embedding vector of the sample pushing position information;
and predicting the predicted pushing probability of pushing the second sample information according to the sample mode embedding vector, the sample semantic embedding vector and the sample pushing position embedding vector of each piece of first sample information.
3. The method of claim 2, wherein predicting a predicted push probability of pushing the second sample information based on the sample modality embedded vector, the sample semantic embedded vector, and the sample push location embedded vector for each of the first sample information comprises:
for each piece of first sample information, fusing the sample mode embedded vector, the sample semantic embedded vector and the sample push position embedded vector of the first sample information to obtain a sample embedded vector of the first sample information;
and predicting the predicted pushing probability of pushing the second sample information according to the sample embedded vector of each piece of the first sample information.
9. The method of claim 2, wherein predicting a predicted push probability of pushing the second sample information based on the sample modality embedding vector, the sample semantic embedding vector, and the sample push position embedding vector of each piece of the first sample information comprises:
predicting a predicted push probability of pushing the second sample information at each sequence position of the sample information push sequence according to the sample modality embedding vector, the sample semantic embedding vector and the sample push position embedding vector of each piece of the first sample information;
generating a predicted probability distribution of the second sample information corresponding to the sample information push sequence based on the predicted push probability of pushing the second sample information at each sequence position of the sample information push sequence;
the obtaining an expected push probability of pushing the second sample information, and performing model training based on the difference of the predicted push probability relative to the expected push probability to obtain an information push model, includes:
acquiring an expected probability distribution of the second sample information corresponding to the sample information push sequence, wherein the expected probability distribution comprises an expected push probability of pushing the second sample information at each sequence position of the sample information push sequence;
and performing model training based on the difference between the predicted probability distribution and the expected probability distribution to obtain the information push model.
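The "difference between the predicted probability distribution and the expected probability distribution" in claim 4 could be measured, for example, by KL divergence (an assumption; cross-entropy or another divergence would fit the claim equally well):

```python
import numpy as np

def kl_divergence(expected: np.ndarray, predicted: np.ndarray,
                  eps: float = 1e-12) -> float:
    """KL(expected || predicted) over per-sequence-position push
    probabilities; eps guards the logarithm at zero entries."""
    expected = expected / expected.sum()
    predicted = predicted / predicted.sum()
    return float(np.sum(expected * np.log((expected + eps) / (predicted + eps))))

# Predicted push probability of the second sample information at each
# sequence position of the sample information push sequence.
predicted = np.array([0.2, 0.5, 0.3])
# Expected distribution: actually pushed at the second sequence position.
expected = np.array([0.0, 1.0, 0.0])
loss = kl_divergence(expected, predicted)
```

The loss is zero exactly when the two distributions coincide, so minimizing it drives the predicted distribution toward the expected one during training.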
5. The method of claim 1, wherein predicting a predicted push probability of pushing the second sample information based on the sample modality embedding vector and the sample semantic embedding vector of each piece of the first sample information comprises:
determining a sample embedding vector of each piece of first sample information according to the sample modality embedding vector and the sample semantic embedding vector of each piece of first sample information;
performing self-attention processing on the sample embedding vectors of each piece of first sample information to obtain a fusion embedding vector;
and acquiring a sample push weight of the second sample information, and determining a predicted push probability of pushing the second sample information based on the fusion embedding vector and the sample push weight.
6. The method of claim 5, wherein performing self-attention processing on the sample embedding vectors of each piece of the first sample information to obtain a fusion embedding vector comprises:
mapping the sample embedding vector of each piece of first sample information to a key space to obtain a key vector of each piece of first sample information;
mapping the sample embedding vector of each piece of first sample information to a query space to obtain a query vector of each piece of first sample information;
determining a predicted self-attention weight corresponding to each piece of first sample information according to each key vector and each query vector;
and fusing the sample embedding vectors based on the predicted self-attention weights to obtain the fusion embedding vector.
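Claims 5 and 6 together describe scaled dot-product self-attention followed by a scored push decision; a sketch under assumed random projection matrices (a trained model would learn `W_k`, `W_q`, the pooling, and the push weight):

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 4
X = rng.normal(size=(3, dim))     # sample embedding vectors of 3 pieces of first sample information

W_k = rng.normal(size=(dim, dim)) # key-space projection (random stand-in for a learned matrix)
W_q = rng.normal(size=(dim, dim)) # query-space projection (random stand-in for a learned matrix)

K = X @ W_k                       # key vector of each piece of first sample information
Q = X @ W_q                       # query vector of each piece of first sample information

scores = Q @ K.T / np.sqrt(dim)   # pre-softmax self-attention scores
weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
weights /= weights.sum(axis=-1, keepdims=True)  # softmax per query row

# Fuse the sample embedding vectors into one fusion embedding vector
# (mean pooling over the attended rows is an assumption).
fused = (weights @ X).mean(axis=0)

push_weight = rng.normal(size=dim)  # sample push weight of the second sample information
predicted_prob = 1.0 / (1.0 + np.exp(-(fused @ push_weight)))  # sigmoid score in (0, 1)
```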
7. The method of any one of claims 1 to 6, wherein the information hierarchy is characterized by a category tree comprising nodes at a plurality of hierarchical levels, the nodes of adjacent hierarchical levels being connected by paths; the determining, for each piece of first sample information, an information category to which the first sample information belongs at each level of the information hierarchy includes:
for each piece of first sample information, traversing the nodes of the category tree, and recording the nodes and paths traversed in the category tree for the first sample information;
determining the hierarchical level corresponding to each traversed node in the category tree, and determining the information category characterized by the traversed node at that level;
the determining sample semantic information of the first sample information based on the information category of the first sample information at each level of the information hierarchy comprises:
and concatenating, according to the nodes and paths traversed in the category tree for the first sample information, the information categories represented by the traversed nodes to obtain the sample semantic information of the first sample information.
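A sketch of the category-tree traversal in claim 7. The tree shape, category names, and the `/` separator are illustrative assumptions; the claim only requires recording the traversed nodes and paths and concatenating the categories they represent:

```python
# Minimal category tree: each node name maps to its child nodes.
class_tree = {
    "root": ["electronics", "apparel"],
    "electronics": ["phones"],
    "apparel": [],
    "phones": [],
}

def path_to(node: str, tree: dict, current: str = "root", path=None):
    """Depth-first traversal recording the nodes traversed from the
    root down to `node`; returns None if the node is absent."""
    path = (path or []) + [current]
    if current == node:
        return path
    for child in tree.get(current, []):
        found = path_to(node, tree, child, path)
        if found:
            return found
    return None

def semantic_info(node: str, tree: dict) -> str:
    """Concatenate the information category at each traversed level
    into one sample semantic information string (root excluded)."""
    return "/".join(path_to(node, tree)[1:])
```

For example, `semantic_info("phones", class_tree)` yields `"electronics/phones"`, so two items sharing every level of the hierarchy produce identical semantic information and hence identical semantic embedding vectors.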
8. An information pushing method, characterized in that the method comprises:
acquiring a plurality of first information;
acquiring modality information of at least one preset modality corresponding to each piece of first information, and determining a modality embedding vector of the modality information;
acquiring a semantic embedding vector corresponding to each piece of first information, wherein the semantic embedding vector is related to the information category to which the corresponding first information belongs at each level of an information hierarchy, and first information whose information categories are consistent at each level of the information hierarchy has the same semantic embedding vector;
and acquiring a plurality of candidate information, and screening, from the plurality of candidate information, second information for pushing according to the modality embedding vector and the semantic embedding vector of each piece of first information.
9. The method of claim 8, wherein the method further comprises:
acquiring modality information of at least one preset modality corresponding to each piece of candidate information, and determining a modality embedding vector of each piece of modality information;
acquiring a semantic embedding vector corresponding to each piece of candidate information, and, for each piece of candidate information, fusing the modality embedding vector and the semantic embedding vector of the candidate information to obtain a candidate embedding vector of the candidate information;
the screening, from the plurality of candidate information, second information for pushing according to the modality embedding vector and the semantic embedding vector of each piece of first information includes:
for each piece of first information, fusing the modality embedding vector and the semantic embedding vector of the first information to obtain an information embedding vector of the first information;
and screening the second information for pushing from the plurality of candidate information according to the information embedding vector of each piece of first information and the candidate embedding vector of each piece of candidate information.
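A sketch of the screening step in claim 9, with the information embedding vectors assumed already fused. Dot-product scoring against the mean first-information embedding is an assumption; the claim only says screening is performed "according to" the two sets of vectors:

```python
import numpy as np

def screen_candidates(info_embeddings: np.ndarray,
                      candidate_embeddings: np.ndarray,
                      top_k: int = 1) -> np.ndarray:
    """Score each candidate embedding vector against the mean
    information embedding vector and keep the top_k indices."""
    profile = info_embeddings.mean(axis=0)       # aggregate first-information profile
    scores = candidate_embeddings @ profile      # dot-product relevance per candidate
    return np.argsort(scores)[::-1][:top_k]      # indices of the best-scoring candidates

info = np.array([[1.0, 0.0], [0.8, 0.2]])        # information embedding vectors
candidates = np.array([[0.0, 1.0],               # dissimilar candidate
                       [1.0, 0.1]])              # similar candidate
selected = screen_candidates(info, candidates)   # selects candidate index 1
```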
10. The method of claim 8, wherein the method further comprises:
determining a plurality of push receiving objects, and pushing the plurality of first information to the plurality of push receiving objects respectively;
and when a push receiving object has browsed the plurality of first information, pushing the second information to the push receiving object that has browsed the plurality of first information.
11. An information push model processing device, characterized in that the device comprises:
a sample information acquisition module, configured to acquire a plurality of first sample information and second sample information, wherein each first sample information and the second sample information belong to the same information category in at least one level of an information hierarchy;
a sample modality determining module, configured to acquire sample modality information of at least one preset modality corresponding to each piece of first sample information, and determine a sample modality embedding vector of the sample modality information;
an information category determining module, configured to determine, for each piece of first sample information, an information category to which the first sample information belongs at each level of the information hierarchy;
a sample semantic determining module, configured to determine sample semantic information of the first sample information based on the information category of the first sample information at each level of the information hierarchy;
a prediction module, configured to determine a sample semantic embedding vector of each piece of sample semantic information, and predict a predicted push probability of pushing the second sample information according to the sample modality embedding vector and the sample semantic embedding vector of each piece of first sample information;
and a training module, configured to acquire an expected push probability of pushing the second sample information, and perform model training based on the difference of the predicted push probability relative to the expected push probability to obtain an information push model.
12. An information pushing apparatus, characterized in that the apparatus comprises:
an information acquisition module, configured to acquire a plurality of first information;
a vector determining module, configured to acquire modality information of at least one preset modality corresponding to each piece of first information, and determine a modality embedding vector of the modality information;
a semantic acquisition module, configured to acquire a semantic embedding vector corresponding to each piece of first information, wherein the semantic embedding vector is related to the information category to which the corresponding first information belongs at each level of an information hierarchy, and first information whose information categories are consistent at each level of the information hierarchy has the same semantic embedding vector;
and a screening module, configured to acquire a plurality of candidate information, and screen, from the plurality of candidate information, second information for pushing according to the modality embedding vector and the semantic embedding vector of each piece of first information.
13. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor, when executing the computer program, implements the steps of the method of any one of claims 1 to 10.
14. A computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, implements the steps of the method of any of claims 1 to 10.
CN202311163354.3A 2023-09-11 2023-09-11 Information push model processing method and device, information push method and device Active CN116910372B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311163354.3A CN116910372B (en) 2023-09-11 2023-09-11 Information push model processing method and device, information push method and device


Publications (2)

Publication Number Publication Date
CN116910372A true CN116910372A (en) 2023-10-20
CN116910372B CN116910372B (en) 2024-01-26

Family

ID=88353440


Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111625694A * 2020-06-05 2020-09-04 Bank of China Co., Ltd. Multistage label processing method and device and computer equipment
CN111651692A * 2020-06-02 2020-09-11 Tencent Technology (Beijing) Co., Ltd. Information recommendation method and device based on artificial intelligence and electronic equipment
CN111695037A * 2020-06-11 2020-09-22 Tencent Technology (Beijing) Co., Ltd. Information recommendation method and device based on artificial intelligence and electronic equipment
CN111737573A * 2020-06-17 2020-10-02 Beijing Sankuai Online Technology Co., Ltd. Resource recommendation method, device, equipment and storage medium
CN112417260A * 2019-08-20 2021-02-26 Tencent Technology (Shenzhen) Co., Ltd. Localized recommendation method and device and storage medium
CN112464087A * 2020-11-23 2021-03-09 Beijing Minglue Software *** Co., Ltd. Recommendation probability output method and device, storage medium and electronic equipment



Similar Documents

Publication Publication Date Title
CN108920720B (en) Large-scale image retrieval method based on depth hash and GPU acceleration
Ertugrul et al. Movie genre classification from plot summaries using bidirectional LSTM
Xu et al. Predicting destinations by a deep learning based approach
CN110909182A (en) Multimedia resource searching method and device, computer equipment and storage medium
CN117056351B (en) SQL sentence generation method, device and equipment
WO2023087914A1 (en) Method and apparatus for selecting recommended content, and device, storage medium and program product
CN110347853B (en) Image hash code generation method based on recurrent neural network
CN115526390A (en) Method, device and storage medium for predicting enterprise risk
KR101646926B1 (en) Method and system of deep concept hioerarchy for reconstruction in multi-modality data
CN116910372B (en) Information push model processing method and device, information push method and device
CN117115706A (en) Video scene graph generation method based on multi-scale space-time attention network
CN115204318B (en) Event automatic hierarchical classification method and electronic equipment
CN116522232A (en) Document classification method, device, equipment and storage medium
JP2010015441A (en) Information processor, content information retrieval method, and information processing system
CN115438755A (en) Incremental training method and device of classification model and computer equipment
CN114707633A (en) Feature extraction method, feature extraction device, electronic equipment and storage medium
CN112927810B (en) Smart medical response method based on big data and smart medical cloud computing system
CN114757391A (en) Service quality prediction method based on network data space design
CN113095901A (en) Recommendation method, training method of related model, electronic equipment and storage device
CN114528491A (en) Information processing method, information processing device, computer equipment and storage medium
CN116703531B (en) Article data processing method, apparatus, computer device and storage medium
CN117391150B (en) Graph data retrieval model training method based on hierarchical pooling graph hash
CN117994470B (en) Multi-mode hierarchical self-adaptive digital grid reconstruction method and device
CN113365072B (en) Feature map compression method and device, computing equipment and storage medium
CN113033819B (en) Heterogeneous model-based federated learning method, device and medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
REG Reference to a national code (ref country code: HK; ref legal event code: DE; ref document number: 40098486)