WO2023241207A1 - Data processing method, apparatus, device, computer-readable storage medium, and computer program product

Info

Publication number
WO2023241207A1
Authority
WO
WIPO (PCT)
Prior art keywords
information
recommended
updated
target
weight
Application number
PCT/CN2023/088857
Other languages
English (en)
French (fr)
Inventor
沈春旭
成昊
Original Assignee
腾讯科技(深圳)有限公司
Application filed by 腾讯科技(深圳)有限公司
Publication of WO2023241207A1
Priority to US18/587,671 (US20240211991A1)

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0241 Advertisements
    • G06Q30/0242 Determining effectiveness of advertisements
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9538 Presentation of query results
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0241 Advertisements
    • G06Q30/0251 Targeted advertisements
    • G06Q30/0255 Targeted advertisements based on user history
    • G06Q30/0256 User search
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0241 Advertisements
    • G06Q30/0251 Targeted advertisements
    • G06Q30/0269 Targeted advertisements based on user profile or attribute
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Definitions

  • This application relates to information recommendation technology in the field of artificial intelligence, and in particular, to a data processing method, device, equipment, computer-readable storage medium and computer program product.
  • A cold start object is an object whose number of conversions is zero during information recommendation. Because a cold start object has converted no recommended information, its vertex in the conversion bipartite graph of recommended objects and recommended information is an isolated vertex. An isolated vertex has no edge connections in a graph neural network, so information cannot be propagated effectively and the conversion probability cannot be determined in the cold start scenario. This lowers the accuracy of information recommendation and thereby increases its resource consumption.
  • Embodiments of the present application provide a data processing method, device, equipment, computer-readable storage medium and computer program product, which can improve the accuracy of information recommendation and reduce the resource consumption of information recommendation.
  • Embodiments of the present application provide a data processing method, which is executed by a data processing device.
  • The method includes:
  • obtaining features of an object to be recommended, wherein the features of the object to be recommended are obtained from a nonlinear mapping result of target second-order information, the target second-order information is obtained by aggregating object features respectively corresponding to at least one interactive object and first information features respectively corresponding to at least one piece of recommended information, each interactive object is an object that interacts with the object to be recommended, and the at least one piece of recommended information is all of the information converted by each of the interactive objects;
  • obtaining features of information to be recommended, wherein the information to be recommended is any piece of recommended information converted by the interactive objects; and
  • performing information recommendation on the object to be recommended based on a fusion result of the features of the object to be recommended and the features of the information to be recommended.
  • the data processing device includes:
  • a feature acquisition module configured to obtain features of an object to be recommended, wherein the features of the object to be recommended are obtained from a nonlinear mapping result of target second-order information, the target second-order information is obtained by aggregating object features respectively corresponding to at least one interactive object and first information features respectively corresponding to at least one piece of recommended information, each interactive object is an object that interacts with the object to be recommended, and the at least one piece of recommended information is all of the information converted by each of the interactive objects;
  • the feature acquisition module is further configured to acquire features of the information to be recommended corresponding to the information to be recommended, where the information to be recommended is any of the recommended information converted by the interactive object;
  • An information recommendation module is configured to recommend information for the object to be recommended based on the fusion result of the characteristics of the object to be recommended and the characteristics of the information to be recommended.
  • An embodiment of the present application provides a data processing device, including:
  • a memory configured to store computer programs or computer-executable instructions; and
  • a processor configured to implement the data processing method provided by the embodiments of the present application when executing the computer programs or computer-executable instructions stored in the memory.
  • Embodiments of the present application provide a computer-readable storage medium that stores computer programs or computer-executable instructions. When executed by a processor, the computer programs or computer-executable instructions implement the data processing method provided by the embodiments of the present application.
  • Embodiments of the present application provide a computer program product that includes a computer program or computer-executable instructions. When the computer program or computer-executable instructions are executed by a processor, the data processing method provided by the embodiments of the present application is implemented.
  • The target second-order information used includes not only the interaction between objects but also the interaction between objects and recommended information, so it is a kind of heterogeneous information. Because the features of the object to be recommended are obtained by nonlinear mapping of the target second-order information, the object to be recommended is accurately associated with the recommended information, and the object to be recommended can be accurately identified from its features.
  • Figure 1 is a schematic architectural diagram of an information recommendation system provided by an embodiment of this application.
  • Figure 2 is a schematic structural diagram of the server in Figure 1 provided by an embodiment of the present application.
  • Figure 3a is schematic flowchart 1 of the data processing method provided by the embodiment of the present application.
  • Figure 3b is schematic flowchart 2 of the data processing method provided by the embodiment of the present application.
  • Figure 3c is a schematic flowchart of obtaining characteristics of objects to be recommended according to an embodiment of the present application.
  • Figure 4a is a schematic flow chart of model training provided by the embodiment of the present application.
  • Figure 4b is a schematic flow chart 2 of model training provided by the embodiment of the present application.
  • Figure 5 is a schematic diagram of an exemplary information recommendation process provided by an embodiment of the present application.
  • Figure 6 is a schematic diagram of an exemplary click bipartite graph provided by an embodiment of the present application.
  • Figure 7 is a schematic diagram of an exemplary social graph provided by an embodiment of the present application.
  • Figure 8 is a schematic diagram of an exemplary heterogeneous graph provided by an embodiment of the present application.
  • Figure 9 is a schematic diagram of an exemplary heterogeneous information aggregation provided by an embodiment of the present application.
  • Figure 10 is another exemplary heterogeneous information aggregation schematic diagram provided by the embodiment of the present application.
  • Figure 11 is an exemplary weight update schematic diagram provided by the embodiment of the present application.
  • Figure 12 is an exemplary model performance comparison diagram provided by the embodiment of the present application.
  • The terms "first/second" and the like are used only to distinguish similar objects and do not represent a specific ordering of objects. It can be understood that, where permitted, the specific order or sequence of "first/second" and the like may be interchanged, so that the embodiments of the application described herein can be practiced in an order other than that illustrated or described herein.
  • AI: Artificial Intelligence
  • Machine learning (ML) is a multi-field interdisciplinary subject involving probability theory, statistics, approximation theory, convex analysis, algorithm complexity theory, and other disciplines. It studies how computers simulate or implement human learning behavior in order to acquire new knowledge or skills and to reorganize existing knowledge structures so as to continuously improve their own performance.
  • Machine learning is the core of artificial intelligence and the fundamental way to make computers intelligent. Its applications cover all fields of artificial intelligence.
  • Machine learning usually includes techniques such as artificial neural networks, belief networks, reinforcement learning, transfer learning, and inductive learning.
  • An artificial neural network is a mathematical model that imitates the structure and function of biological neural networks. Exemplary structures of artificial neural networks include the graph convolutional network (GCN), deep neural network (DNN), convolutional neural network (CNN), recurrent neural network (RNN), neural state machine (NSM), and phase-functioned neural network (PFNN).
  • GCN: graph convolutional network
  • DNN: deep neural network
  • CNN: convolutional neural network
  • RNN: recurrent neural network
  • NSM: neural state machine
  • PFNN: phase-functioned neural network
  • the specified heterogeneous graph and the heterogeneous graph to be updated involved in the embodiments of this application are models corresponding to artificial neural networks.
  • Cloud computing is a computing model that distributes computing tasks over a resource pool composed of a large number of computers, enabling application systems to obtain computing power, storage space, and information services as needed. The network that provides the resources in the pool is called a "cloud". From the user's perspective, the resources in the "cloud" can be expanded without limit, and can be obtained at any time, used on demand, expanded at any time, and paid for according to use.
  • the data processing method provided by the embodiments of this application can be implemented through cloud computing.
  • Conversion probability refers to the probability of a successful conversion, where a successful conversion includes clicking, browsing, downloading, purchasing, running, registering, and so on. Conversion probability therefore includes click probability (click-through rate, CTR), browsing probability, download probability, purchase probability, running probability, registration probability, and so on; for example, the probability that an account exposed to an advertisement clicks on the advertisement, or the probability that an account exposed to an advertisement purchases the target resource.
  • Homogeneous graph: a graph that includes one type of vertex and one type of edge. The object interaction graph in the embodiments of this application is a homogeneous graph.
  • Heterogeneous graph: a graph in which the number of vertex types, the number of edge types, or both is greater than or equal to two. The object information conversion graph, the specified heterogeneous graph, and the heterogeneous graph to be updated in the embodiments of this application are all heterogeneous graphs.
  • Bipartite graph: a graph with two types of vertices in which every edge connects vertices of different types. That is, the vertex set of a bipartite graph consists of two disjoint subsets, and the two endpoints of every edge belong to different subsets, so vertices in the same subset are not adjacent.
  • The object information conversion graph is a bipartite graph that includes an object set and an information set, and each edge represents an object's successful conversion of a piece of information.
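  • The graph definitions above can be illustrated with a minimal Python sketch. The vertex IDs, edge lists, and the helper `build_adjacency` are hypothetical and chosen only for illustration; they do not come from the application. The sketch builds a conversion bipartite graph and a homogeneous object interaction graph, then lists the cold start objects, which are the isolated vertices of the conversion graph.

```python
from collections import defaultdict

# Hypothetical toy data: object IDs start with "u", information IDs with "i".
conversions = [("u1", "i1"), ("u1", "i2"), ("u2", "i2")]   # object converted information
interactions = [("u1", "u2"), ("u2", "u3"), ("u3", "u4")]  # object interacted with object
all_objects = {"u1", "u2", "u3", "u4"}


def build_adjacency(edges):
    """Return an undirected adjacency map {vertex: set(neighbours)}."""
    adj = defaultdict(set)
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    return adj


conversion_graph = build_adjacency(conversions)    # bipartite: object <-> information
interaction_graph = build_adjacency(interactions)  # homogeneous: object <-> object

# Bipartite check: every conversion edge joins one object and one piece of information.
is_bipartite = all((a in all_objects) != (b in all_objects) for a, b in conversions)

# Cold start objects are the isolated vertices of the conversion graph.
cold_start = sorted(o for o in all_objects if not conversion_graph.get(o))
print(cold_start)
```

  • In this toy data, `u3` and `u4` have no conversion edges but still have neighbours in the interaction graph, which is the situation the rest of the description addresses.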
  • SNS: Social Networking Service
  • To recommend information, a recommendation strategy can first be determined from the historical conversion relationship between recommended information and objects, and information can then be recommended in the current information recommendation process based on that strategy.
  • However, the objects of information recommendation are not the same each time; that is, the current object to be recommended may not appear in the historical conversion relationship between recommended information and objects, in which case it is a cold start object. A recommendation strategy determined from that historical conversion relationship therefore cannot be applied to the current object to be recommended, which lowers the accuracy of information recommendation in cold start scenarios and in turn increases the resource consumption of information recommendation.
  • GCNs used for information recommendation include Neural Graph Collaborative Filtering (NGCF) and the simplified GCN (LightGCN). NGCF constructs a bipartite graph with objects and information as vertices and estimates the conversion probability through message propagation; LightGCN estimates the conversion probability after removing the feature transformation and nonlinear operations of NGCF.
  • NGCF: Neural Graph Collaborative Filtering
  • LightGCN: simplified GCN
  • In the cold start scenario, however, the object to be recommended is an isolated vertex. An isolated vertex has no edge connections in the GCN, so information cannot be propagated effectively. The conversion probability of the object to be recommended for the recommended information therefore cannot be determined in the cold start scenario, which lowers the accuracy of information recommendation in that scenario and thereby increases the resource consumption of information recommendation.
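  • The isolated-vertex problem can be made concrete with a small NumPy sketch of one LightGCN-style propagation layer, that is, multiplication by the symmetrically normalised adjacency matrix with no weight matrix and no activation. The toy conversion matrix `R` and the random initial embeddings are assumptions made for illustration only.

```python
import numpy as np

# Toy conversion bipartite graph: 3 objects (rows) x 2 pieces of information (columns).
# Object 2 has converted nothing, so it is a cold start object (isolated vertex).
R = np.array([[1, 1],
              [0, 1],
              [0, 0]], dtype=float)
n_obj, n_info = R.shape

# Full symmetric adjacency matrix over objects + information.
A = np.zeros((n_obj + n_info, n_obj + n_info))
A[:n_obj, n_obj:] = R
A[n_obj:, :n_obj] = R.T

# Symmetric normalisation D^{-1/2} A D^{-1/2}; zero-degree vertices get a zero scale.
deg = A.sum(axis=1)
inv_sqrt = np.zeros_like(deg)
inv_sqrt[deg > 0] = deg[deg > 0] ** -0.5
A_hat = inv_sqrt[:, None] * A * inv_sqrt[None, :]

rng = np.random.default_rng(0)
E0 = rng.normal(size=(n_obj + n_info, 4))  # hypothetical initial embeddings
E1 = A_hat @ E0                            # one propagation layer, no weights or activation

cold_message = E1[2]                       # what the cold start object receives
print(cold_message)
```

  • The row of `A_hat` for the cold start object is all zeros, so the propagated message it receives is the zero vector regardless of the other embeddings, while connected objects receive non-trivial messages.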
  • exploration and exploitation strategies can also be used.
  • transfer learning and meta-learning can be used, that is, by learning the conversion behavior of cold-start objects and recommended information in other information recommendation scenarios, and adapting to the current information recommendation scenario through knowledge transfer.
  • To recommend information for cold start objects, related recommendations can also be made based on a knowledge graph (KG), using the similarity of recommended information on the KG; that is, the information recommended to a cold start object is the recommended information converted by objects similar to the cold start object on the KG.
  • A heterogeneous graph neural model can also be used, in which the "object-recommended information" conversion bipartite graph and the "object-object" social graph serve as subgraphs of a whole graph; the whole graph is obtained by linearly splicing the two subgraphs, and SNS prior knowledge is used to recommend information for cold start objects.
  • embodiments of the present application provide a data processing method, device, equipment, computer-readable storage medium and computer program product, which can improve the accuracy of information recommendation and reduce the resource consumption of information recommendation.
  • the following describes exemplary applications of the data processing equipment provided by the embodiments of the present application.
  • The data processing equipment provided by the embodiments of the present application can be implemented as various types of terminals, such as smartphones, smart watches, laptops, tablets, desktop computers, smart home appliances, set-top boxes, smart vehicle-mounted devices, portable music players, personal digital assistants, dedicated messaging devices, intelligent voice interaction devices, portable game devices, and smart speakers; it can also be implemented as a server, or as a combination of the two.
  • Below, an exemplary application in which the device is implemented as a server is described.
  • Figure 1 is a schematic architectural diagram of an information recommendation system provided by an embodiment of the present application. As shown in Figure 1, to support an information recommendation application, in the information recommendation system 100 the terminal 200 (terminal 200-1 and terminal 200-2 are shown as examples) is connected to the server 400 (referred to as the data processing equipment) through the network 300.
  • the network 300 can be a wide area network or a local area network, or a combination of the two.
  • The information recommendation system 100 also includes a database 500 that provides data support to the server 400. Figure 1 shows the case in which the database 500 is independent of the server 400; the database 500 can also be integrated in the server 400, which is not limited in the embodiments of the present application.
  • the terminal 200 is configured to display target information to be recommended on a graphical interface (graphical interface 210-1 and graphical interface 210-2 are shown as examples).
  • The server 400 is used to: obtain the features of the object to be recommended, wherein the features of the object to be recommended are obtained from the nonlinear mapping result of the target second-order information, the target second-order information is obtained by aggregating the object features respectively corresponding to at least one interactive object and the first information features respectively corresponding to at least one piece of recommended information, each interactive object is an object that interacts with the object to be recommended, and the at least one piece of recommended information is all of the information converted by each interactive object; obtain the features of the information to be recommended, wherein the information to be recommended is any piece of recommended information converted by the interactive objects; and, based on the fusion result of the features of the object to be recommended and the features of the information to be recommended, send the target information to be recommended, which is recommended to the object to be recommended, to the terminal 200 through the network 300.
  • The server 400 may be an independent physical server, a server cluster or distributed system composed of multiple physical servers, or a cloud server that provides basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communications, middleware services, domain name services, security services, a Content Delivery Network (CDN), and big data and artificial intelligence platforms.
  • the terminal and the server can be connected directly or indirectly through wired or wireless communication methods, which are not limited in the embodiments of this application.
  • FIG. 2 is a schematic structural diagram of a server in Figure 1 provided by an embodiment of the present application.
  • the server 400 shown in Figure 2 includes: at least one processor 410, a memory 450 and at least one network interface 420.
  • The various components in the server 400 are coupled together by the bus system 440.
  • The bus system 440 is used to implement connection and communication between these components.
  • The bus system 440 also includes a power bus, a control bus, and a status signal bus.
  • The various buses are labeled as bus system 440 in Figure 2.
  • The processor 410 may be an integrated circuit chip with signal processing capabilities, such as a general-purpose processor, a digital signal processor (DSP), another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, where the general-purpose processor may be a microprocessor or any conventional processor.
  • DSP: Digital Signal Processor
  • Memory 450 may be removable, non-removable, or a combination thereof.
  • Exemplary hardware devices include solid state memory, hard disk drives, optical disk drives, etc.
  • Memory 450 optionally includes one or more storage devices physically located remotely from the processor 410.
  • Memory 450 includes volatile memory or non-volatile memory, and may include both volatile and non-volatile memory.
  • Non-volatile memory can be read-only memory (ROM), and volatile memory can be random access memory (RAM).
  • ROM: read-only memory
  • RAM: random access memory
  • the memory 450 described in the embodiments of this application is intended to include any suitable type of memory.
  • the memory 450 is capable of storing data to support various operations, examples of which include programs, modules, and data structures, or subsets or supersets thereof, as exemplarily described below.
  • The operating system 451 includes system programs used to process various basic system services and perform hardware-related tasks, such as a framework layer, a core library layer, and a driver layer;
  • Network communication module 452, used to reach other computer devices via one or more (wired or wireless) network interfaces 420.
  • Example network interfaces 420 include Bluetooth, Wi-Fi, and Universal Serial Bus (USB);
  • the data processing device provided by the embodiment of the present application can be implemented in software.
  • Figure 2 shows the data processing device 455 stored in the memory 450, which can be software in the form of programs, plug-ins, and so on, and includes the following software modules: feature acquisition module 4551, information recommendation module 4552, object judgment module 4553, and model training module 4554. These modules are logical, so they can be combined or further split in any way according to the functions implemented. The functions of each module are explained below.
  • the data processing device provided by the embodiment of the present application can be implemented in hardware.
  • The data processing device provided by the embodiment of the present application can be a processor in the form of a hardware decoding processor that is programmed to execute the data processing method provided by the embodiments of this application. For example, a processor in the form of a hardware decoding processor may use one or more application-specific integrated circuits (ASIC), DSPs, programmable logic devices (PLD), complex programmable logic devices (CPLD), field-programmable gate arrays (FPGA), or other electronic components.
  • ASIC: Application-Specific Integrated Circuit
  • DSP: digital signal processor
  • PLD: programmable logic device
  • CPLD: Complex Programmable Logic Device
  • FPGA: Field-Programmable Gate Array
  • the data processing method provided by the embodiment of the present application will be described in conjunction with the exemplary application and implementation of the data processing device provided by the embodiment of the present application.
  • the data processing method provided by the embodiments of this application is applied to various information recommendation scenarios such as cloud technology, artificial intelligence, smart transportation, and vehicle-mounted.
  • Figure 3a is a schematic flowchart 1 of the data processing method provided by the embodiment of the present application, which will be described in conjunction with the steps shown in Figure 3a.
  • Step 301: Obtain the features of the object to be recommended.
  • When the data processing device performs information recommendation on the object to be recommended, it first determines, for the object to be recommended, a feature representation associated with the recommended information, which is the features of the object to be recommended. The features of the object to be recommended are determined based on at least one interactive object; each interactive object is an object that the object to be recommended has interacted with, and each interactive object has converted at least one piece of recommended information.
  • the characteristics of the object to be recommended may be determined in real time or may be determined in advance, which is not limited in the embodiments of the present application.
  • The data processing device first obtains the feature representation of each interactive object (called the object features) and the feature representation of each piece of recommended information (called the first information features), thereby obtaining the object features respectively corresponding to the at least one interactive object and the first information features respectively corresponding to the at least one piece of recommended information. The data processing device then aggregates these object features and first information features to obtain the target second-order information of the object to be recommended. Finally, the data processing device performs nonlinear mapping on the target second-order information to obtain the features of the object to be recommended.
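  • The aggregation step described above can be sketched as follows. The text here does not fix the aggregation operator, so this sketch assumes mean pooling over the interactive objects and over the information they converted, followed by concatenation; the embedding tables, graph dictionaries, and the function `target_second_order_info` are hypothetical names introduced for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 4

# Hypothetical embedding tables (in practice these would be learned).
object_features = {o: rng.normal(size=dim) for o in ["u1", "u2", "u3"]}
info_features = {i: rng.normal(size=dim) for i in ["i1", "i2", "i3"]}

# Hypothetical graphs: u3 is the object to be recommended and has no conversions.
interaction_graph = {"u3": ["u1", "u2"]}                     # object -> interactive objects
conversion_graph = {"u1": ["i1", "i2"], "u2": ["i2", "i3"]}  # object -> converted information


def target_second_order_info(obj):
    """Aggregate interactive-object features and the features of everything they converted."""
    neighbours = interaction_graph.get(obj, [])
    if not neighbours:
        return np.zeros(2 * dim)
    converted = sorted({i for n in neighbours for i in conversion_graph.get(n, [])})
    obj_part = np.mean([object_features[n] for n in neighbours], axis=0)
    info_part = (np.mean([info_features[i] for i in converted], axis=0)
                 if converted else np.zeros(dim))
    # Assumption: concatenate the two mean-pooled parts.
    return np.concatenate([obj_part, info_part])


second_order = target_second_order_info("u3")
print(second_order.shape)
```

  • Even though `u3` has no conversion edges of its own, the aggregated vector is non-trivial because it draws on the objects `u3` interacts with and on the information those objects converted.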
  • The object features can be an embedded representation of the interactive object, a one-hot encoding of the interactive object, a feature representation corresponding to the label of the interactive object, and so on, which is not limited in the embodiments of the present application.
  • the characteristics of the object to be recommended are the nonlinear mapping results corresponding to the target second-order information.
  • the target second-order information is obtained by aggregating the object characteristics corresponding to at least one interactive object and the first information characteristics corresponding to at least one recommendation information.
  • The at least one interactive object is an object that interacts with the object to be recommended, and the at least one piece of recommended information is the information converted by each interactive object.
  • Nonlinear mapping refers to processing that raises the dimension of the feature space, for example, data processing based on a kernel function (such as a Gaussian kernel function). The result of the nonlinear mapping has a higher feature-space dimension than the data before the mapping; that is, the target second-order information is a low-dimensional feature and the features of the object to be recommended are high-dimensional features. In this way, the object to be recommended can be effectively associated with each piece of recommended information, and the similarity between objects can be accurately determined.
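  • One possible way to realise a Gaussian-kernel-based nonlinear mapping that raises the feature dimension is random Fourier features. This is a swapped-in, well-known approximation technique used here purely as an illustration; the text does not state that the application uses it. The function `gaussian_rff`, the bandwidth `sigma`, and the sample vectors are assumptions.

```python
import numpy as np


def gaussian_rff(x, out_dim, sigma=1.0, seed=0):
    """Map x to out_dim random Fourier features approximating a Gaussian kernel."""
    rng = np.random.default_rng(seed)
    W = rng.normal(scale=1.0 / sigma, size=(out_dim, x.shape[-1]))
    b = rng.uniform(0.0, 2.0 * np.pi, size=out_dim)
    return np.sqrt(2.0 / out_dim) * np.cos(x @ W.T + b)


# Hypothetical low-dimensional target second-order information for two objects.
x = np.array([0.2, -0.1, 0.4, 0.0])
y = np.array([0.1, 0.0, 0.3, 0.1])

# The same seed gives both objects the same random projection.
fx = gaussian_rff(x, out_dim=2048)
fy = gaussian_rff(y, out_dim=2048)

# Inner product of the mapped features approximates exp(-||x - y||^2 / (2 * sigma^2)).
approx_kernel = float(fx @ fy)
exact_kernel = float(np.exp(-np.sum((x - y) ** 2) / 2.0))
print(fx.shape, approx_kernel, exact_kernel)
```

  • The 4-dimensional inputs become 2048-dimensional features, and the inner product of two mapped vectors approximates their Gaussian-kernel similarity, which is one way the similarity between objects could be compared in the higher-dimensional space.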
  • The field may also be the same as the field to which the at least one interactive object and the at least one piece of recommended information corresponding to each interactive object belong; for example, both are in the game field, or both are in the instant messaging field.
  • Step 302: Obtain the features of the information to be recommended.
  • The information to be recommended is any piece of recommended information converted by the interactive objects, that is, any one of the at least one piece of recommended information converted by any interactive object. Here, the data processing device obtains the feature representation of the information to be recommended, thereby obtaining the features of the information to be recommended.
  • The features of the information to be recommended are used to determine whether the information to be recommended can be recommended to the object to be recommended. In addition, the features of the information to be recommended can be an embedded representation of the information to be recommended, a one-hot encoding of the information to be recommended, a feature representation corresponding to the label of the information to be recommended, and so on, which is not limited in the embodiments of the present application.
  • Step 303 Based on the fusion result of the characteristics of the object to be recommended and the characteristics of the information to be recommended, perform information recommendation on the object to be recommended.
  • After the data processing device obtains the characteristics of the object to be recommended and the characteristics of the information to be recommended, it determines, based on both, whether to recommend the information to be recommended to the object to be recommended. Here, the data processing device first fuses the characteristics of the object to be recommended and the characteristics of the information to be recommended, and then uses an activation function (for example, the Sigmoid function) to process the fusion result, thus obtaining the conversion probability of the object to be recommended for the information to be recommended.
  • Because the target second-order information is obtained by aggregating the object characteristics corresponding to at least one interactive object and the first information features corresponding to at least one recommendation information, the following information can be associated: interaction information between objects, and conversion information between objects and recommendation information. In addition, because the characteristics of the object to be recommended are obtained through nonlinear mapping of the target second-order information, objects and recommendation information can interact effectively, which improves the accuracy of information recommendation and reduces its resource consumption.
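Step 303 can be illustrated with a minimal Python sketch. The source only states that the two features are "fused" and then passed through an activation function such as Sigmoid; the inner product used as the fusion operator here, and the names `sigmoid` and `conversion_probability`, are assumptions made for illustration.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function mapping a real-valued score to the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def conversion_probability(object_feat, info_feat):
    """Fuse two feature vectors and squash the fused score with Sigmoid.

    The inner product is an assumed fusion operator; the source does not
    specify how the features are fused.
    """
    if len(object_feat) != len(info_feat):
        raise ValueError("feature vectors must have the same length")
    fused = sum(u * v for u, v in zip(object_feat, info_feat))
    return sigmoid(fused)
```

A fused score of zero yields a probability of exactly 0.5, and larger positive scores move the probability toward 1.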
  • Figure 3b is a second schematic flow chart of the data processing method provided by the embodiment of the present application; as shown in Figure 3b, in the embodiment of the present application, step 301 can be implemented through steps 3011 to 3014; that is to say, the data processing device obtaining the characteristics of the object to be recommended corresponding to the object to be recommended includes steps 3011 to 3014. Each step will be described below.
  • Step 3011 Obtain target second-order information corresponding to the object to be recommended.
  • The data processing device can directly aggregate the object characteristics corresponding to at least one interactive object and the first information features corresponding to at least one recommendation information to obtain the target second-order information; for example, the data processing device obtains a first accumulation result of the object characteristics respectively corresponding to the at least one interactive object, obtains a second accumulation result of the first information features corresponding to the at least one recommendation information of each interactive object, and accumulates the first accumulation result and the at least one second accumulation result corresponding to the at least one interactive object to obtain the target second-order information.
  • The data processing device can also use weights when aggregating the object characteristics corresponding to at least one interactive object and the first information features corresponding to at least one recommendation information to obtain the target second-order information. That is to say, obtaining the target second-order information corresponding to the object to be recommended includes: the data processing device obtains the interaction weight between the object to be recommended and the interactive object, and obtains the conversion weight between the interactive object and the recommendation information; then, the data processing device obtains a first fusion result of the interaction weight and the object characteristics, and a second fusion result of the conversion weight and the first information feature; finally, it combines the at least one first fusion result corresponding to the at least one interactive object, and the at least one second fusion result corresponding to the at least one recommendation information converted by each interactive object, into the target second-order information corresponding to the object to be recommended.
  • The interaction weight refers to the weight between the object to be recommended and the interactive object, indicating the degree of intimacy between them; the data processing device can determine the degree of intimacy between the object to be recommended and the interactive object through at least one of the following: number of interactions, duration of interaction, frequency of interaction, and mode of interaction.
  • The conversion weight refers to the weight between the interactive object and the recommendation information, indicating the degree of conversion between them; the data processing device can determine the degree of conversion between the interactive object and the recommendation information through at least one of the following: number of conversions, conversion duration, conversion frequency, and conversion method (for example, click, order, browse, play, follow, etc.).
  • The data processing device fuses the interaction weight with the object characteristics of the interactive object corresponding to that interaction weight, thereby obtaining the first fusion result; therefore, for at least one interactive object, at least one first fusion result is obtained. The data processing device fuses the conversion weight with the first information feature of the corresponding recommendation information, thereby obtaining a second fusion result; therefore, for at least one recommendation information, at least one second fusion result is obtained, that is, each interactive object corresponds to at least one second fusion result.
  • the method of combining the target second-order information may be addition, splicing, weighted fusion, etc., and the embodiments of the present application are not limited to this.
  • The first accumulation result is obtained by the data processing device directly accumulating the at least one object characteristic corresponding to the at least one interactive object, whereas the first fusion result is obtained by weighting the at least one object characteristic according to the degree of intimacy between the object to be recommended and the interactive object.
  • The second accumulation result is obtained by the data processing device directly accumulating the at least one first information feature corresponding to the at least one recommendation information, whereas the second fusion result is obtained by weighting the at least one first information feature according to the degree of conversion between the interactive object and the recommendation information.
  • When the aggregation uses these weights, the accuracy of the target second-order information can be improved, which in turn improves the accuracy of the characteristics of the object to be recommended. This enables the characteristics of the object to be recommended to be effectively associated with the information to be recommended, thereby improving the accuracy of information recommendation.
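The weighted aggregation of step 3011 can be sketched as follows. This is an illustrative sketch only: it uses scalar multiplication as the fusion of a weight with a feature and addition as the combination method (addition is one of the options the text lists), and the function name and input layout are hypothetical.

```python
def target_second_order(interactions):
    """Aggregate weighted neighbour features into second-order information.

    `interactions` is a list of tuples
        (interaction_weight, object_feature, conversions)
    where `conversions` is a list of (conversion_weight, info_feature).
    All feature vectors are assumed to share the same length.
    """
    dim = len(interactions[0][1])
    total = [0.0] * dim
    for inter_w, obj_feat, conversions in interactions:
        # First fusion result: interaction weight times object feature.
        for i, x in enumerate(obj_feat):
            total[i] += inter_w * x
        # Second fusion results: conversion weight times information feature.
        for conv_w, info_feat in conversions:
            for i, x in enumerate(info_feat):
                total[i] += conv_w * x
    return total
```

Setting every weight to 1 reduces this to the direct accumulation variant described in the same step.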
  • Step 3012 Obtain the spatial distance between the target second-order information and the specified center information.
  • The data processing device can obtain the designated center information; the designated center information is determined through multiple pieces of second-order information, which include the target second-order information and other second-order information besides it. The other second-order information can be the second-order information corresponding to at least one interactive object; for example, the designated center information is the mean of the multiple pieces of second-order information. In addition, second-order information is the aggregation result of the characteristics of the objects that an object interacts with and the characteristics of the information converted by each of those interactive objects.
  • The data processing device obtains the feature difference between the target second-order information and the designated center information, and determines the feature difference as the spatial distance between the target second-order information and the designated center information, such as the Euclidean distance.
  • Step 3013 Perform nonlinear mapping of spatial distance based on multiple specified mapping parameters to obtain multiple second-order features to be fused.
  • The data processing device can obtain multiple specified mapping parameters, where a specified mapping parameter represents a mapping space range, such as a Gaussian bandwidth; here, the data processing device uses each specified mapping parameter to perform nonlinear mapping of the spatial distance, so that the spatial distance is mapped to different mapping space ranges and multiple second-order features to be fused are obtained for the multiple specified mapping parameters.
  • each second-order feature to be fused is the result of nonlinear mapping based on the corresponding specified mapping parameters.
  • Step 3014 Obtain the first nonlinear mapping results corresponding to multiple second-order features to be fused, and obtain the features of the objects to be recommended based on the first nonlinear mapping results.
  • After the data processing device obtains the multiple second-order features to be fused, it integrates them, thereby obtaining the first nonlinear mapping result; here, the data processing device can directly determine the first nonlinear mapping result as the characteristics of the object to be recommended, or can combine the first nonlinear mapping result and other information (such as information converted by the object to be recommended) into the characteristics of the object to be recommended; the embodiment of the present application does not limit this. Therefore, the nonlinear mapping result corresponding to the target second-order information includes at least the first nonlinear mapping result.
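Steps 3012 to 3014 can be sketched together in Python. The sketch makes several assumptions that the source leaves open: the designated center is the mean of the second-order vectors (given in the text as an example), the spatial distance is a scalar Euclidean distance, each specified mapping parameter is a Gaussian bandwidth, and the second-order features to be fused are integrated by concatenation. All function names are hypothetical.

```python
import math


def mean_vector(vectors):
    """Element-wise mean of equally sized vectors (the designated center)."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def euclidean(a, b):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def gaussian_kernel(distance, bandwidth):
    """Gaussian (RBF) response of a distance for one bandwidth."""
    return math.exp(-(distance ** 2) / (2.0 * bandwidth ** 2))


def first_nonlinear_mapping(target, all_second_order, bandwidths):
    """Map the target second-order vector through several Gaussian kernels.

    Returns one value per bandwidth; concatenating them is the assumed
    way of integrating the second-order features to be fused.
    """
    center = mean_vector(all_second_order)
    distance = euclidean(target, center)
    return [gaussian_kernel(distance, s) for s in bandwidths]
```

With this choice the output length equals the number of bandwidths, so using more bandwidths than input dimensions is what raises the feature dimension in the sketch.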
  • Before the data processing device obtains the characteristics of the object to be recommended corresponding to the object to be recommended, the data processing method also includes: the data processing device obtains the conversion identifier between the object to be recommended and the information database to be recommended.
  • The information database to be recommended includes the at least one recommendation information converted by each interactive object in the at least one interactive object; the data processing device detects the conversion status of the object to be recommended for each recommendation information in the information database to be recommended to obtain the conversion identifier, where the conversion identifier indicates whether the object to be recommended has converted recommendation information in the information database to be recommended.
  • Figure 3c is a schematic flow chart of obtaining the characteristics of an object to be recommended according to an embodiment of the present application; as shown in Figure 3c, in the embodiment of the present application, in step 3014, the data processing device obtaining the characteristics of the object to be recommended based on the first nonlinear mapping result includes step 30141A and step 30142A. Each step will be described below.
  • Step 30141A When the conversion identifier indicates that the information database to be recommended includes at least one converted information, aggregate at least one second information feature corresponding to the at least one converted information to obtain the target first-order information corresponding to the object to be recommended.
  • When the conversion identifier indicates that the information database to be recommended includes at least one converted information corresponding to the object to be recommended, the object to be recommended has converted recommendation information in the information database to be recommended, and that converted recommendation information is the at least one converted information; the object to be recommended is therefore a non-cold-start object. Each converted information is recommendation information in the information database to be recommended that the object to be recommended has converted. In this case, the association between the object to be recommended and the recommendation information can also be established through the at least one converted information that the object itself has converted; thus, the data processing device aggregates the second information features corresponding to the at least one converted information to obtain the target first-order information corresponding to the object to be recommended. That is to say, the target first-order information is obtained by aggregating the at least one second information feature corresponding to the converted information.
  • the second information feature is a feature of the transformed information.
  • Step 30142A Obtain the second nonlinear mapping result corresponding to the target first-order information, and combine the second nonlinear mapping result and the first nonlinear mapping result into the characteristics of the object to be recommended.
  • The data processing device performs nonlinear mapping on the target first-order information, thereby obtaining the second nonlinear mapping result; the process of performing nonlinear mapping on the target first-order information is similar to the process of performing nonlinear mapping on the target second-order information, and will not be described again in the embodiments of this application.
  • The data processing device obtains the characteristics of the object to be recommended by combining the first nonlinear mapping result and the second nonlinear mapping result, where the combination method may be addition, weighted addition, etc.; and, in the process of combining the first nonlinear mapping result with other information to obtain the characteristics of the object to be recommended, the other combined information is the second nonlinear mapping result.
  • When the object to be recommended is a non-cold-start object, its characteristics are obtained by combining the first nonlinear mapping result corresponding to the target second-order information with the second nonlinear mapping result corresponding to the target first-order information, which increases the diversity of the data on which the characteristics of the object to be recommended are based; thus, the accuracy of the characteristics of the object to be recommended can be improved, which in turn improves the accuracy of information recommendation and reduces its resource consumption.
  • The data processing device combining the second nonlinear mapping result and the first nonlinear mapping result into the characteristics of the object to be recommended includes: the data processing device combines the second nonlinear mapping result and the first nonlinear mapping result to obtain initial aggregation information; obtains a first combination weight that is negatively correlated with the initial aggregation information and positively correlated with the second nonlinear mapping result, and obtains a second combination weight corresponding to the first combination weight; and combines a third fusion result and a fourth fusion result into the characteristics of the object to be recommended. The fourth fusion result refers to the fusion result of the second combination weight and the first nonlinear mapping result; the third fusion result correspondingly pairs the first combination weight with the second nonlinear mapping result.
  • The data processing device can combine the first nonlinear mapping result and the second nonlinear mapping result by adding the two to obtain the initial aggregation information; in addition, the first combination weight is negatively correlated with the second combination weight. For example, the first combination weight is the difference between 1 and the second combination weight.
  • The data processing device can combine the third fusion result and the fourth fusion result by addition to obtain the characteristics of the object to be recommended.
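The weighted combination for a non-cold-start object can be sketched as below. The source fixes only the correlations (the first combination weight falls with the initial aggregation information, rises with the second nonlinear mapping result, and sums to 1 with the second combination weight); the element-wise ratio used here is one assumed function with those properties, and pairing the first combination weight with the second nonlinear mapping result is likewise an assumption. Inputs are assumed to be positive, as Gaussian kernel outputs are.

```python
def combine_mappings(first_result, second_result, eps=1e-12):
    """Combine two nonlinear mapping results element by element.

    first_result  : mapping result of the target second-order information
    second_result : mapping result of the target first-order information
    `eps` only guards against division by zero.
    """
    combined = []
    for f, s in zip(first_result, second_result):
        initial = f + s                 # initial aggregation information
        w1 = s / (initial + eps)        # first combination weight (assumed form)
        w2 = 1.0 - w1                   # second combination weight
        third = w1 * s                  # assumed third fusion result
        fourth = w2 * f                 # fourth fusion result
        combined.append(third + fourth)
    return combined
```

Under this choice, the larger of the two mapping results at each position receives the larger weight.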
  • In step 3014, the data processing device obtaining the characteristics of the object to be recommended based on the first nonlinear mapping result includes step 30141B, which will be described below.
  • Step 30141B When the conversion identifier indicates that the conversion object library is independent of the object to be recommended, determine the first nonlinear mapping result as the feature of the object to be recommended.
  • When the conversion identifier indicates that the conversion object library corresponding to the information database to be recommended is independent of the object to be recommended, the object to be recommended has not converted any recommendation information in the information database to be recommended, and the object to be recommended is a cold-start object; that is to say, the object to be recommended does not belong to the conversion object library, where the conversion object library is the collection of objects that have converted information to be recommended in the information database to be recommended. In this case, the association between the object to be recommended and the recommendation information is obtained through the first nonlinear mapping result of the target second-order information.
  • The characteristics of the object to be recommended and the characteristics of the information to be recommended are obtained from a specified heterogeneous graph. In the specified heterogeneous graph, the vertices are feature representations of objects (including the object to be recommended and at least one interactive object) or of recommendation information (including the at least one recommendation information corresponding to each interactive object), and the edges connect objects to objects, or objects to recommendation information; the edges can be weighted edges or unweighted edges.
  • When an edge is a weighted edge, its weight represents the degree of association between the two vertices, for example, the degree of intimacy between objects or the degree of conversion between an object and recommendation information; and the characteristics of an object are obtained by aggregating the characteristics of its associated objects and of the converted recommendation information.
  • FIG. 4a is a schematic flow chart of model training provided by an embodiment of the present application. As shown in Figure 4a, the designated heterogeneous graph is obtained through steps 305 to 309. Each step will be described below.
  • Step 305 Construct an object interaction graph based on the interaction records between at least two first objects.
  • The at least two first objects include the object to be recommended and at least one interactive object; in the object interaction graph constructed by the data processing device, a vertex is the feature representation of a first object, and an edge indicates that the two first objects have interacted; when the edge is a weighted edge, its weight indicates the degree of intimacy determined based on the interaction information between the two first objects (for example, the number of interactions, the frequency of interaction, the type of interaction, etc.).
  • Step 306 Construct an object information conversion graph based on the conversion record of at least one second object to at least one initial recommendation information.
  • The at least one initial recommendation information includes the at least one recommendation information converted by each interactive object; in the object information conversion graph constructed by the data processing device, a vertex is the feature representation of a second object or the feature representation of initial recommendation information, and an edge indicates that the second object has converted the initial recommendation information; when the edge is a weighted edge, its weight indicates the degree of conversion determined based on the conversion information of the second object for the initial recommendation information (for example, the number of conversions, conversion duration, conversion frequency, conversion type, etc.).
  • Step 307 Based on the common object between at least two first objects and at least one second object, fuse the object interaction graph and the object information transformation graph to obtain a heterogeneous graph to be updated.
  • After the data processing device obtains the object interaction graph and the object information conversion graph, it first obtains the common object between the at least two first objects and the at least one second object; it then combines the relevant information of the first objects associated with the common object through edges in the object interaction graph with the relevant information of the initial recommendation information associated with the common object through edges in the object information conversion graph, and thereby obtains the heterogeneous graph to be updated. That is, the heterogeneous graph to be updated includes the interaction relationship between the common object and the first objects it interacts with, and also includes the conversion relationship between the common object and the initial recommendation information it has converted.
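The graph fusion of steps 305 to 307 can be sketched with plain dictionaries. The edge-dictionary layout, the key names `"objects"` and `"infos"`, and the function name are hypothetical; the sketch keeps, for each common object, its weighted neighbours from both source graphs.

```python
def build_heterogeneous_graph(interaction_edges, conversion_edges):
    """Fuse an object interaction graph and an object information conversion graph.

    interaction_edges : {(object_a, object_b): weight}, undirected
    conversion_edges  : {(object, info): weight}
    Returns {common_object: {"objects": {...}, "infos": {...}}}.
    """
    first_objects = {obj for edge in interaction_edges for obj in edge}
    second_objects = {obj for obj, _ in conversion_edges}
    common = first_objects & second_objects  # objects present in both graphs

    graph = {}
    for obj in common:
        neighbours = {"objects": {}, "infos": {}}
        # Interaction relationships of the common object.
        for (a, b), weight in interaction_edges.items():
            if a == obj:
                neighbours["objects"][b] = weight
            elif b == obj:
                neighbours["objects"][a] = weight
        # Conversion relationships of the common object.
        for (source, info), weight in conversion_edges.items():
            if source == obj:
                neighbours["infos"][info] = weight
        graph[obj] = neighbours
    return graph
```

For example, an object that appears only in the conversion records has no entry in the fused graph, because it is not a common object.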
  • Step 308 Iteratively update the object vertices in the heterogeneous graph to be updated based on the nonlinear mapping results corresponding to the second-order information of each object vertex in the heterogeneous graph to be updated.
  • Some of the vertices in the heterogeneous graph to be updated are object vertices and some are information vertices; the object vertices are the feature representations of the first objects or the second objects, and the information vertices are the feature representations of the initial recommendation information. Here, the data processing device iteratively updates the object vertices of the heterogeneous graph to be updated to complete the update of the heterogeneous graph to be updated; and the data processing device updates each object vertex based on the nonlinear mapping result corresponding to the second-order information of that object vertex.
  • The second-order information of each object vertex involves the object vertices corresponding to the objects with which that object vertex interacts, and the information vertices corresponding to the initial recommendation information converted by those interacted objects; in addition, the second-order information in the embodiment of the present application is obtained by aggregating the characteristics of the objects that an object interacts with and the characteristics of the recommendation information converted by the interacted objects. Therefore, the target second-order information is the second-order information of the object to be recommended.
  • Step 309 Determine the heterogeneous graph to be updated after the iterative update as the designated heterogeneous graph.
  • The data processing device iteratively updates the heterogeneous graph to be updated; when the iteratively updated heterogeneous graph to be updated satisfies a specified cut-off condition, the iterative update ends, and the iteratively updated heterogeneous graph to be updated is determined to be the specified heterogeneous graph.
  • The specified cut-off condition means that the heterogeneous graph to be updated after the current iterative update reaches a specified indicator; for example, the accuracy is greater than a specified accuracy, or the loss function value is less than a specified loss function value.
  • Figure 4b is a second schematic diagram of the model training process provided by the embodiment of the present application; as shown in Figure 4b, in the embodiment of the present application, step 308 can be implemented through steps 3081 to 3086 (not shown in the figure); that is to say, the data processing device iteratively updating the object vertices in the heterogeneous graph to be updated, based on the nonlinear mapping results corresponding to the second-order information of each object vertex in the heterogeneous graph to be updated, includes steps 3081 to 3086. Each step is explained below.
  • Step 3081 Perform the following processing on each object vertex in the heterogeneous graph to be updated: update the object vertex based on the nonlinear mapping result corresponding to the second-order information of the object vertex.
  • Step 3082 Determine the heterogeneous graph to be updated that has been updated as the current heterogeneous graph.
  • The process by which the data processing device obtains the second-order information of each object vertex in the heterogeneous graph to be updated is similar to the process of obtaining the target second-order information; the process of obtaining the nonlinear mapping result corresponding to the second-order information of an object vertex is similar to the process of obtaining the nonlinear mapping result corresponding to the target second-order information; and the process by which the data processing device updates an object vertex in the heterogeneous graph to be updated is similar to the process of obtaining the characteristics of the object to be recommended based on the nonlinear mapping result corresponding to the target second-order information. These processes are not described again here.
  • Step 3083 Perform attention updates on the edge weights in the current heterogeneous graph to obtain the edge weights to be updated.
  • The data processing device performing an attention update on the edge weights in the current heterogeneous graph to obtain the edge weights to be updated includes: the data processing device performs the following processing on each current object vertex in the current heterogeneous graph: obtain at least one adjacent object vertex corresponding to the current object vertex, where an adjacent object vertex is an object vertex adjacent to the current object vertex; based on the at least one adjacent object vertex, determine the attention interaction weight between the current object vertex and each adjacent object vertex; obtain at least one adjacent information vertex corresponding to the current object vertex, where an adjacent information vertex is an information vertex adjacent to the current object vertex; based on the at least one adjacent information vertex, determine the attention conversion weight between the current object vertex and each adjacent information vertex. An edge weight to be updated is either an attention interaction weight or an attention conversion weight.
  • For each adjacent object vertex in the at least one adjacent object vertex, the data processing device determines the attention interaction weight based on the proportion of that adjacent object vertex in the at least one adjacent object vertex; that is to say, the attention interaction weight is the proportion of each adjacent object vertex in the at least one adjacent object vertex. Likewise, for each adjacent information vertex in the at least one adjacent information vertex, the data processing device determines the attention conversion weight based on the proportion of that adjacent information vertex in the at least one adjacent information vertex; that is, the attention conversion weight is the proportion of each adjacent information vertex in the at least one adjacent information vertex.
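One way to read "proportion" in step 3083 is as each neighbour's share of the total edge weight among neighbours of the same type. The sketch below implements that reading; it is an assumption, since the source does not define how the proportion is computed, and the function name is hypothetical.

```python
def attention_weights(edge_weights):
    """Normalise the edge weights of one vertex so that they sum to 1.

    edge_weights : {neighbour_id: current_edge_weight} for neighbours of a
    single type (all object vertices, or all information vertices).
    """
    total = sum(edge_weights.values())
    if total == 0:
        raise ValueError("edge weights must not sum to zero")
    return {neighbour: w / total for neighbour, w in edge_weights.items()}
```

Calling the function once on the adjacent object vertices and once on the adjacent information vertices gives the attention interaction weights and the attention conversion weights respectively.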
  • Step 3084 Perform adaptive enhancement on the edge weight to be updated to obtain the target edge weight.
  • The data processing device performing adaptive enhancement on the edge weight to be updated to obtain the target edge weight includes: the data processing device obtains at least one heterogeneous edge weight to be updated, where the at least one heterogeneous edge weight to be updated is adjacent to the target edge weight to be updated and is of a different type from the target edge weight to be updated, and the target edge weight to be updated is any one edge weight to be updated.
  • When the target edge weight to be updated is an attention interaction weight, the data processing device obtains at least one attention conversion weight adjacent to the attention interaction weight, and adds each attention conversion weight and the attention interaction weight to obtain a first weight sum; then, the data processing device determines a first enhancement parameter that is negatively correlated with the first weight sum and positively correlated with the attention conversion weight, and fuses the first enhancement parameter with the corresponding attention conversion weight to obtain a first enhancement weight; finally, the data processing device adds the attention interaction weight to the at least one first enhancement weight corresponding to the at least one attention conversion weight to obtain the updated target edge weight to be updated, which is the target edge weight in the current heterogeneous graph.
  • When the target edge weight to be updated is an attention conversion weight, the data processing device obtains at least one attention interaction weight adjacent to the attention conversion weight, and adds each attention interaction weight and the attention conversion weight to obtain a second weight sum; then, the data processing device determines a second enhancement parameter that is negatively correlated with the second weight sum and positively correlated with the attention interaction weight, and fuses the second enhancement parameter with the corresponding attention interaction weight to obtain a second enhancement weight; finally, the data processing device adds the attention conversion weight to the at least one second enhancement weight corresponding to the at least one attention interaction weight to obtain the updated target edge weight to be updated, which is the target edge weight in the current heterogeneous graph.
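The adaptive enhancement of step 3084 can be sketched as follows. The source states only that the enhancement parameter falls with the weight sum and rises with the heterogeneous weight; the ratio `w / weight_sum` and multiplication as the fusion operator are assumed choices, and the function name is hypothetical. The same function covers both cases, since they differ only in which weight type is the target.

```python
def enhance_edge_weight(target_weight, heterogeneous_weights):
    """Enhance one edge weight with adjacent weights of the other type.

    target_weight         : attention interaction (or conversion) weight
    heterogeneous_weights : adjacent attention conversion (or interaction) weights
    """
    weight_sum = target_weight + sum(heterogeneous_weights)
    if weight_sum == 0:
        return target_weight
    enhanced = target_weight
    for w in heterogeneous_weights:
        parameter = w / weight_sum      # enhancement parameter (assumed form)
        enhanced += parameter * w       # enhancement weight added to the target
    return enhanced
```

With no adjacent heterogeneous weights the target weight is returned unchanged.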
  • Step 3085 Based on the target edge weight, aggregate the second-order information of each current object vertex in the current heterogeneous graph.
  • The process by which the data processing device obtains the second-order information of each current object vertex in the current heterogeneous graph based on the target edge weight is similar to the process by which it obtains the target second-order information using the interaction weight and the conversion weight, and is not described again here.
  • Step 3086: Iteratively update the object vertices in the current heterogeneous graph based on the nonlinear mapping results corresponding to the second-order information of the current object vertices.
  • the process of iteratively updating the current heterogeneous graph by the data processing device is similar to the process of iteratively updating the heterogeneous graph, and will not be described again in this embodiment of the present application.
  • The data processing device performs information recommendation for the object to be recommended based on the fusion result of the feature of the object to be recommended and the feature of the information to be recommended, which includes: determining, based on the fusion result, the conversion probability of the object to be recommended for the information to be recommended; when the information library to be recommended includes at least two pieces of information to be recommended, sorting the at least two pieces of information to be recommended in descending order of the at least two corresponding conversion probabilities to obtain a sequence of information to be recommended; selecting, in order, a specified number of pieces of information to be recommended from the sequence as the target information to be recommended; and recommending the target information to be recommended to the object to be recommended.
  • The specified number is at least one.
  • the data processing device can also compare the conversion probability with the specified probability, and when the conversion probability is greater than the specified probability, recommend the information to be recommended to the object to be recommended.
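  • The two selection strategies above (top-N by descending conversion probability, and comparison against a specified probability) can be sketched together. The function name `select_recommendations` and the parameters `top_n` and `min_prob` are hypothetical names introduced for this illustration.

```python
def select_recommendations(conversion_probs, top_n=1, min_prob=None):
    """Pick the information to recommend from {info_id: conversion probability}.

    top_n: the specified number of items to keep (at least one)
    min_prob: optional specified probability; only items above it are kept
    """
    # Sort in descending order of conversion probability.
    ranked = sorted(conversion_probs.items(), key=lambda kv: kv[1], reverse=True)
    if min_prob is not None:
        ranked = [(info, p) for info, p in ranked if p > min_prob]
    # Take the first top_n entries of the sequence.
    return [info for info, _ in ranked[:top_n]]
```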
  • Either only the object vertices in the heterogeneous graph may be updated, or all vertices in the heterogeneous graph (including object vertices and information vertices) may be updated, which is not limited by the embodiments of this application. Moreover, the process by which the data processing device updates the information vertices in the heterogeneous graph is similar to the process of updating the object vertices, and will not be described again in the embodiments of this application.
  • This exemplary application is executed by a server (called the data processing device). It describes how, in the gaming field, the server constructs an "account-advertisement" click bipartite graph (called the object information conversion graph) from the historical clicks between accounts and advertisements in historical recommendation data, constructs an "account-account" social graph (called the object interaction graph), fuses the click bipartite graph and the social graph into a heterogeneous graph (called the heterogeneous graph to be updated), and performs heterogeneous information aggregation on the vertices corresponding to the accounts in the heterogeneous graph. This realizes information interaction between cold-start accounts and advertisements, thereby improving the accuracy of information recommendation for cold-start accounts.
  • Figure 5 is a schematic diagram of an exemplary information recommendation process provided by an embodiment of the present application; as shown in Figure 5, this exemplary information recommendation process includes a data collection stage 5-1, an information aggregation stage 5-2 and an advertisement recommendation stage 5-3.
  • In the data collection stage 5-1, historical recommendation data of advertisements in the gaming field are collected, the data of account clicks on advertisements (called interaction records) are extracted from the historical recommendation data, and an "account-advertisement" click bipartite graph is constructed from these click data. In the click bipartite graph, each vertex is a vector representation (called a feature representation) of an account (called a second object) or of an advertisement (called initial recommendation information); an edge between two vertices indicates that the account clicked on the advertisement, and the weight of the edge indicates the conversion relationship between the account and the advertisement. For example, the weight of the edge is positively correlated with the number of times the account clicked on the advertisement or with the consumption duration.
  • Figure 6 is a schematic diagram of an exemplary click bipartite graph provided by an embodiment of the present application; as shown in Figure 6, in the click bipartite graph 6-1, vertices A, B, C and D are vector representations of accounts, and vertices a, b, c and d are vector representations of advertisements. Taking the connection between vertex B and vertex a as an example: the account corresponding to vertex B clicked on the advertisement corresponding to vertex a, so there is an edge between vertex B and vertex a, and the weight of this edge is $W_{Ba}$.
  • Vertex A has no edge to any of vertices a, b, c and d and is therefore an isolated vertex in the click bipartite graph 6-1; the account corresponding to vertex A is thus a cold-start account (called a cold-start object), while the accounts corresponding to vertices B, C and D are non-cold-start accounts (called non-cold-start objects).
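  • As a rough illustration of the data collection stage, the sketch below builds click-graph edge weights from click records and flags cold-start accounts as isolated vertices. The function name `build_click_graph` and the use of a raw click count as the edge weight are assumptions (the text only states that the weight is positively correlated with the click count); the sample records are hypothetical, with edges chosen to match the example vertices.

```python
from collections import defaultdict


def build_click_graph(accounts, click_records):
    """Build click bipartite graph weights and list cold-start accounts.

    accounts: iterable of account ids
    click_records: iterable of (account, advertisement) click events
    """
    weights = defaultdict(int)
    for account, ad in click_records:
        # Assumed weight: number of clicks of this account on this ad.
        weights[(account, ad)] += 1
    clicked = {account for account, _ in weights}
    # An account with no click edge is an isolated vertex, i.e. a cold-start account.
    cold_start = [a for a in accounts if a not in clicked]
    return dict(weights), cold_start


records = [("B", "a"), ("B", "a"), ("C", "b"), ("C", "c"), ("D", "d")]
click_weights, cold_accounts = build_click_graph(["A", "B", "C", "D"], records)
```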
  • In the social graph, each vertex (called a first object) is a vector representation of an account; an edge between two vertices indicates that the two accounts interact, for example through virtual resource interaction, team games or communication, and the weight of the edge represents the intimacy of the interaction between the two accounts.
  • In the social graph 7-1 of Figure 7, vertices A, B, C and D are vector representations of accounts; the account corresponding to vertex A interacts with the accounts corresponding to vertices B, C and D, so that vertex A has an edge to each of vertices B, C and D, with weights $W_{AB}$, $W_{AC}$ and $W_{AD}$ respectively; in addition, the weight of the edge between vertex C and vertex D is $W_{CD}$.
  • the click bipartite graph and the social graph are fused to obtain a heterogeneous graph (called a heterogeneous graph to be updated).
  • In the heterogeneous graph, the vertex set V is the set of vector representations corresponding to the accounts and the advertisements; U is the complete set of vector representations of the accounts in the click bipartite graph $G_B$, and u is an individual account vector representation, so $u \in U$; I is the complete set of vector representations of the advertisements in the social graph $G_S$, and i is an individual advertisement vector representation, so $i \in I$. Therefore, the vertices and the weighted edges of the click bipartite graph (edge weights called conversion weights) can be expressed as $u_m$ or $i_n$, and $w_{m,n}$; the vertices and the weighted edges of the social graph (edge weights called interaction weights) can be expressed as $u_m$ and $w_{m_1,m_2}$; here m is the account index and n is the advertisement index.
  • Figure 8 is a schematic diagram of an exemplary heterogeneous graph provided by an embodiment of the present application; as shown in Figure 8, the heterogeneous graph 8-1 is obtained by fusing the click bipartite graph 6-1 in Figure 6 and the social graph 7-1 in Figure 7.
  • vertex A can be associated with vertices a, b, c and d; therefore, the association between cold start accounts and advertisements is reflected in the heterogeneous graph.
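  • A minimal sketch of the fusion step follows, assuming both graphs are given as dictionaries of weighted edges that share account ids. The function name `fuse_graphs`, the edge-type labels "interaction" and "conversion", and the nested-dictionary adjacency layout are illustrative choices, not the embodiment's data structure.

```python
def fuse_graphs(social_edges, click_edges):
    """Fuse a social graph and a click bipartite graph via their common accounts.

    social_edges: {(account, account): interaction weight}
    click_edges: {(account, advertisement): conversion weight}
    Returns an undirected adjacency map: vertex -> {neighbor: (edge type, weight)}.
    """
    graph = {}

    def add_edge(x, y, kind, weight):
        graph.setdefault(x, {})[y] = (kind, weight)
        graph.setdefault(y, {})[x] = (kind, weight)

    for (u1, u2), w in social_edges.items():
        add_edge(u1, u2, "interaction", w)
    for (u, i), w in click_edges.items():
        add_edge(u, i, "conversion", w)
    return graph
```

  • Because account ids are shared, a cold-start account that is isolated in the click bipartite graph gains a path to advertisements through its social neighbors in the fused graph.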
  • Heterogeneous information aggregation is performed on the vertices corresponding to the accounts in the heterogeneous graph, covering both cold-start accounts and non-cold-start accounts. When aggregating heterogeneous information for a cold-start account, the second-order information of the vertex is used; when aggregating heterogeneous information for a non-cold-start account, both the first-order information and the second-order information of the vertex are used.
  • first-order information can be realized through equation (1), which is as follows.
  • Second-order information can be realized through equation (2), which is as follows.
  • Figure 9 is a schematic diagram of exemplary heterogeneous information aggregation provided by an embodiment of the present application; as shown in Figure 9, for vertex A, which corresponds to the cold-start account in the heterogeneous graph 8-1 of Figure 8, the corresponding second-order information can be expressed as [vertex A - vertex B - vertex a; vertex A - vertex C - vertex b; vertex A - vertex C - vertex c; vertex A - vertex D - vertex d], as shown by the solid lines in Figure 9.
  • Figure 10 is a schematic diagram of another exemplary heterogeneous information aggregation provided by an embodiment of the present application; as shown in Figure 10, for vertex C, which corresponds to a non-cold-start account in the heterogeneous graph 8-1 of Figure 8, the corresponding first-order information can be expressed as [vertex C - vertex b; vertex C - vertex c], shown as edge 10-1 and edge 10-2 in Figure 10; the second-order information can be expressed as [vertex C - vertex A; vertex C - vertex D - vertex d], shown as edges 10-3, 10-4 and 10-5 in Figure 10.
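  • The path lists given for vertices A and C can be reproduced with a small enumeration. The dictionaries `SOCIAL` and `CLICKS` below encode the example edges of the heterogeneous graph; the function names and the convention of emitting a two-vertex path when a social neighbor has clicked no advertisement are assumptions made to match the listed paths.

```python
# Example edges of the heterogeneous graph (accounts A-D, advertisements a-d).
SOCIAL = {"A": ["B", "C", "D"], "B": ["A"], "C": ["A", "D"], "D": ["A", "C"]}
CLICKS = {"A": [], "B": ["a"], "C": ["b", "c"], "D": ["d"]}


def first_order_paths(account):
    """Paths from an account to the advertisements it clicked itself."""
    return [(account, ad) for ad in CLICKS[account]]


def second_order_paths(account):
    """Paths from an account through each social neighbor to that neighbor's ads."""
    paths = []
    for friend in SOCIAL[account]:
        ads = CLICKS[friend]
        if ads:
            paths.extend((account, friend, ad) for ad in ads)
        else:
            # The neighbor clicked nothing, so the path stops at the neighbor.
            paths.append((account, friend))
    return paths
```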
  • k represents the index value of Gaussian bandwidth
  • K represents a set of multiple Gaussian bandwidths (called multiple specified mapping parameters); represents the Gaussian kernel function, here
  • (called the first combination weight)
  • formula (4) and formula (5) are as follows.
  • $\sigma_k$ represents the k-th Gaussian bandwidth parameter, and represents the Gaussian kernel function center.
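  • Since the formulas themselves are not reproduced in this text, the sketch below uses the standard Gaussian (RBF) kernel as an assumed form of the nonlinear mapping: the squared distance between a piece of second-order information and the kernel center is mapped once per bandwidth. The function name `gaussian_features`, the argument names, and the exact kernel expression are assumptions.

```python
import math


def gaussian_features(second_order, center, bandwidths):
    """Map the distance to the kernel center through several Gaussian bandwidths.

    second_order: aggregated second-order information (list of floats)
    center: Gaussian kernel function center (list of floats, same length)
    bandwidths: the set K of Gaussian bandwidth parameters
    """
    # Squared Euclidean distance between the information and the center.
    sq_dist = sum((x - c) ** 2 for x, c in zip(second_order, center))
    # One feature per bandwidth; a larger bandwidth covers a wider mapping range.
    return [math.exp(-sq_dist / (2.0 * sigma ** 2)) for sigma in bandwidths]
```

  • Using several bandwidths yields one feature per mapping space range, which matches the description of multiple second-order features to be fused.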
  • the weights are updated. Since the heterogeneous graph originates from two different types of subgraphs (click bipartite graph and social graph), the two different types of subgraphs belong to heterogeneous information. Therefore, an adaptive weighted attention mechanism is used to update the weights; that is, the attention mechanism is first used to update the weights on a single subgraph, and then the adaptive weighting mechanism is used to fuse the weights on different subgraphs. Therefore, the attention mechanism is used to update the weight of the edge belonging to the click bipartite graph in the heterogeneous graph. The process is shown in equation (6).
  • LeakyReLU() represents the activation layer function;
  • Neighbor(I) represents the set of vector representations of all advertisements clicked by account u, which is called the at least one adjacent information vertex corresponding to the current object vertex, $i' \in \mathrm{Neighbor}(I)$; represents the vector representation of advertisement i′.
  • The process of using the attention mechanism to update the weights of the edges belonging to the social graph in the heterogeneous graph is shown in equation (7).
  • Neighbor(U) represents the set of vector representations of all accounts that account u interacts with, which is called the at least one adjacent object vertex corresponding to the current object vertex, $u' \in \mathrm{Neighbor}(U)$; represents the vector representation of account u′; is called the attention conversion weight; is called the attention interaction weight.
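  • Equations (6) and (7) are not reproduced in this text, so the following is only a generic attention sketch in the same spirit: each neighbor is scored against the current vertex, the score passes through LeakyReLU, and the scores are normalized over the neighbor set. Scoring by a plain dot product, the softmax normalization, and the slope value 0.2 are assumptions; the embodiment may use learned parameters.

```python
import math


def leaky_relu(x, slope=0.2):
    """LeakyReLU activation with an assumed negative slope."""
    return x if x > 0 else slope * x


def attention_weights(center, neighbors, slope=0.2):
    """Attention weights between one vertex and its same-type neighbors.

    center: vector of the current object vertex
    neighbors: non-empty {neighbor id: vector} for Neighbor(I) or Neighbor(U)
    """
    scores = {}
    for name, vec in neighbors.items():
        raw = sum(c * v for c, v in zip(center, vec))  # assumed dot-product score
        scores[name] = leaky_relu(raw, slope)
    # Numerically stable softmax over the neighbor set.
    max_score = max(scores.values())
    exp_scores = {n: math.exp(s - max_score) for n, s in scores.items()}
    total = sum(exp_scores.values())
    return {n: e / total for n, e in exp_scores.items()}
```

  • Calling the function once with the adjacent information vertices and once with the adjacent object vertices gives the two kinds of attention-updated weights for a vertex.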
  • is called the second enhancement parameter
  • is called the first enhancement parameter
  • Figure 11 is a schematic diagram of an exemplary weight update provided by an embodiment of the present application; as shown in Figure 11, when the attention mechanism is used to update the weight 11-1 ($W_{AC}$), the update is performed based on vertex A and vertex D; when the adaptive weighting mechanism is used to update the weight 11-1, the update is performed based on the weight 11-1, the weight 11-2 and the weight 11-3 as already updated by the attention mechanism.
  • The advertisement click probability $Y_{u,i}$ can be determined based on the vertex corresponding to account u and the vertex corresponding to advertisement i in the final heterogeneous graph, as shown in equation (10).
  • $Y_{u,i} = \mathrm{Sigmoid}(V_u * V_i)$ (10);
  • where $V_u$ represents the vertex corresponding to account u in the final heterogeneous graph, $V_i$ represents the vertex corresponding to advertisement i in the final heterogeneous graph, Sigmoid() is the activation layer function, and $V_u * V_i$ is called the fusion result of the feature of the object to be recommended and the feature of the information to be recommended.
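  • Equation (10) can be sketched directly. The product of the two vertex vectors is interpreted here as an inner product, which is an assumption; the function name `click_probability` is also hypothetical.

```python
import math


def click_probability(v_u, v_i):
    """Click probability as the sigmoid of the fused account and ad vectors."""
    # Assumed fusion: inner product of the two vertex vectors.
    score = sum(a * b for a, b in zip(v_u, v_i))
    # Numerically stable sigmoid.
    if score >= 0:
        return 1.0 / (1.0 + math.exp(-score))
    e = math.exp(score)
    return e / (1.0 + e)
```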
  • The ad click probability of each advertisement (for example, N advertisements, where N is a positive integer) is estimated from the final heterogeneous graph; the advertisements are sorted by ad click probability, and the advertisements with the highest ad click probability are selected and recommended to the account, thereby achieving information recommendation.
  • FIG 12 is an exemplary model performance comparison diagram provided by the embodiment of the present application; as shown in Figure 12, the horizontal axis represents the application date (0507 to 0511), and the vertical axis represents the performance index (0.06 to 0.13 ); Curve 12-1 is the performance information corresponding to baseline model 1, curve 12-2 is the performance information corresponding to baseline model 2, curve 12-3 is the performance information corresponding to baseline model 3, and curve 12-4 is the embodiment of the present application. Performance information corresponding to the provided data processing method. From curve 12-1 to curve 12-4, it can be seen that the data processing method provided by the embodiment of the present application is better than the baseline model 1 to baseline model 3 in terms of performance indicators.
  • The account and the advertisement can exchange information; thus, even a cold-start account has an effective association with advertisements.
  • The adaptive attention mechanism is also used to update the edge weights, which can improve the effect of nonlinear aggregation; in summary, the heterogeneous graph aggregation method provided by the embodiments of this application can improve the accuracy of information recommendation for cold-start accounts and reduce the resource consumption of information recommendation.
  • The software modules of the data processing device 455 stored in the memory 450 may include:
  • the feature acquisition module 4551 is configured to obtain the feature of the object to be recommended corresponding to the object to be recommended, where the feature of the object to be recommended is obtained from the nonlinear mapping result of the target second-order information, the target second-order information is obtained by aggregating the object features respectively corresponding to at least one interactive object and the first information features respectively corresponding to at least one piece of recommendation information, each interactive object is an object that interacts with the object to be recommended, and the at least one piece of recommendation information is the full set of information converted by each interactive object;
  • the feature acquisition module 4551 is also configured to acquire the features of the information to be recommended corresponding to the information to be recommended, where the information to be recommended is any of the recommended information converted by the interactive object;
  • the information recommendation module 4552 is configured to recommend information for the object to be recommended based on the fusion result of the characteristics of the object to be recommended and the characteristics of the information to be recommended.
  • the feature acquisition module 4551 is also configured to: acquire the target second-order information corresponding to the object to be recommended; acquire the spatial distance between the target second-order information and designated center information, where the designated center information is determined from a plurality of pieces of second-order information that include the target second-order information; nonlinearly map the spatial distance based on a plurality of designated mapping parameters to obtain a plurality of second-order features to be fused, where each designated mapping parameter represents a mapping space range; and obtain a first nonlinear mapping result corresponding to the plurality of second-order features to be fused, and obtain the feature of the object to be recommended based on the first nonlinear mapping result, where the nonlinear mapping result corresponding to the target second-order information includes the first nonlinear mapping result.
  • the feature acquisition module 4551 is further configured to: acquire the interaction weight between the object to be recommended and the interactive object, where the interaction weight represents the degree of intimacy between the object to be recommended and the interactive object; acquire the conversion weight between the interactive object and the recommendation information, where the conversion weight represents the degree of conversion between the interactive object and the recommendation information; obtain a first fusion result of the interaction weight and the object feature, and a second fusion result of the conversion weight and the first information feature, thereby obtaining at least one first fusion result corresponding to the at least one interactive object and at least one second fusion result corresponding to the at least one piece of recommendation information; and combine the at least one first fusion result and the at least one second fusion result corresponding to each interactive object into the target second-order information corresponding to the object to be recommended.
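  • The aggregation of target second-order information can be sketched as a weighted sum. Treating each "fusion" as a scalar-times-vector product and the final "combination" as an element-wise sum is an assumption, as are the function name `aggregate_second_order` and its argument layout.

```python
def aggregate_second_order(interactions, conversions, object_feats, info_feats):
    """Aggregate target second-order information for one object to be recommended.

    interactions: {interactive object: interaction weight}
    conversions: {interactive object: {recommendation info: conversion weight}}
    object_feats: {interactive object: feature vector}
    info_feats: {recommendation info: feature vector}
    """
    dim = len(next(iter(object_feats.values())))
    total = [0.0] * dim
    for friend, w_int in interactions.items():
        # First fusion result: interaction weight times the object feature.
        for k, x in enumerate(object_feats[friend]):
            total[k] += w_int * x
        for info, w_conv in conversions.get(friend, {}).items():
            # Second fusion result: conversion weight times the information feature.
            for k, x in enumerate(info_feats[info]):
                total[k] += w_conv * x
    return total
```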
  • the data processing device 455 also includes an object judgment module 4553, configured to obtain the conversion identifier of the object to be recommended in the information library to be recommended, where the information library to be recommended includes the at least one piece of recommendation information converted by each interactive object.
  • the feature acquisition module 4551 is also configured to: when the conversion identifier indicates that the information library to be recommended includes at least one piece of converted information, aggregate the at least one second information feature corresponding to the at least one piece of converted information to obtain the target first-order information corresponding to the object to be recommended, where the converted information is recommendation information that the object to be recommended has converted, and the second information feature is the feature of the converted information; and obtain a second nonlinear mapping result corresponding to the target first-order information, and combine the second nonlinear mapping result and the first nonlinear mapping result into the feature of the object to be recommended.
  • the feature acquisition module 4551 is also configured to: combine the second nonlinear mapping result and the first nonlinear mapping result to obtain initial aggregation information; obtain a first combination weight that is negatively correlated with the initial aggregation information and positively correlated with the second nonlinear mapping result, and obtain a second combination weight corresponding to the first combination weight; fuse the first combination weight with the second nonlinear mapping result to obtain a third fusion result; fuse the second combination weight with the first nonlinear mapping result to obtain a fourth fusion result; and combine the third fusion result and the fourth fusion result to obtain the feature of the object to be recommended.
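  • One way to realize the two combination weights is sketched below. It assumes non-negative mapping results (for example, Gaussian kernel outputs), uses the share of the second nonlinear mapping result in the initial aggregation as the first combination weight, and takes the second combination weight as its complement. These choices and the name `combine_features` are assumptions that merely satisfy the stated correlations.

```python
def combine_features(first_result, second_result):
    """Weighted combination of the two nonlinear mapping results.

    first_result: first nonlinear mapping result (from second-order information)
    second_result: second nonlinear mapping result (from first-order information)
    Returns (first combination weight, second combination weight, combined feature).
    """
    # Initial aggregation information: element-wise sum of both results.
    initial = [f + s for f, s in zip(first_result, second_result)]
    total = sum(initial)
    # Assumed first combination weight: grows with the second result,
    # shrinks as the initial aggregation grows.
    w1 = sum(second_result) / total if total != 0 else 0.5
    w2 = 1.0 - w1  # assumed complementary second combination weight
    # Third fusion result (w1 * second_result) plus fourth (w2 * first_result).
    combined = [w1 * s + w2 * f for f, s in zip(first_result, second_result)]
    return w1, w2, combined
```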
  • the feature acquisition module 4551 is further configured to determine the first nonlinear mapping result as the feature of the object to be recommended when the conversion identifier indicates that the conversion object library is independent of the object to be recommended, where the conversion object library refers to the collection of objects that have converted the recommendation information in the information library to be recommended.
  • The feature of the object to be recommended and the feature of the information to be recommended are obtained through a specified heterogeneous graph. The data processing device 455 also includes a model training module 4554 configured to: construct an object interaction graph based on interaction records between at least two first objects, where the at least two first objects include the object to be recommended and the at least one interactive object; construct an object information conversion graph based on the conversion records of at least one second object for at least one piece of initial recommendation information, where the at least one piece of initial recommendation information includes the at least one piece of recommendation information converted by each interactive object; fuse the object interaction graph and the object information conversion graph based on the common objects between the at least two first objects and the at least one second object, to obtain a heterogeneous graph to be updated; and iteratively update the object vertices in the heterogeneous graph to be updated based on the nonlinear mapping result corresponding to the second-order information of each object vertex in the heterogeneous graph to be updated.
  • the model training module 4554 is also configured to: perform the following processing on each object vertex in the heterogeneous graph to be updated: update the object vertex based on the nonlinear mapping result corresponding to the second-order information of the object vertex; determine the updated heterogeneous graph to be updated as the current heterogeneous graph; perform an attention update on the edge weights in the current heterogeneous graph to obtain the edge weights to be updated; adaptively enhance the edge weights to be updated to obtain the target edge weights; aggregate the second-order information of each current object vertex in the current heterogeneous graph based on the target edge weights; and iteratively update the object vertices in the current heterogeneous graph based on the nonlinear mapping result corresponding to the second-order information of the current object vertices.
  • the model training module 4554 is also configured to perform the following processing on each current object vertex in the current heterogeneous graph: obtain at least one adjacent object vertex corresponding to the current object vertex; determine, based on the at least one adjacent object vertex, the attention interaction weight between the current object vertex and each adjacent object vertex; obtain at least one adjacent information vertex corresponding to the current object vertex; and determine, based on the at least one adjacent information vertex, the attention conversion weight between the current object vertex and each adjacent information vertex, where the edge weight to be updated is the attention interaction weight or the attention conversion weight.
  • the model training module 4554 is further configured to: obtain at least one edge weight to be updated that is adjacent to the target edge weight to be updated and of a different type from the target edge weight to be updated, where the target edge weight to be updated is any edge weight to be updated that is to be adaptively enhanced; and enhance the target edge weight to be updated based on the at least one edge weight to be updated, to obtain the target edge weight.
  • the information recommendation module 4552 is further configured to: determine, based on the fusion result of the feature of the object to be recommended and the feature of the information to be recommended, the conversion probability of the object to be recommended for the information to be recommended; when the information library to be recommended includes at least two pieces of information to be recommended, sort the at least two pieces of information to be recommended in descending order of the at least two corresponding conversion probabilities, to obtain a sequence of information to be recommended; select, in order, a specified number of pieces of information to be recommended from the sequence as the target information to be recommended; and recommend the target information to be recommended to the object to be recommended.
  • Embodiments of the present application provide a computer program product.
  • The computer program product includes a computer program or computer-executable instructions, which are stored in a computer-readable storage medium.
  • The processor of a computer device (called the data processing device) reads the computer program or computer-executable instructions from the computer-readable storage medium and executes them, so that the computer device performs the data processing method provided by the embodiments of the present application.
  • Embodiments of the present application provide a computer-readable storage medium in which a computer program or computer-executable instructions are stored. When the computer program or computer-executable instructions are executed by a processor, the processor performs the data processing method provided by the embodiments of the present application, for example, the data processing method shown in Figure 3a.
  • The computer-readable storage medium may be a memory such as FRAM, ROM, PROM, EPROM, EEPROM, flash memory, magnetic surface memory, optical disk, or CD-ROM; it may also be any of various devices that include one or any combination of the above memories.
  • A computer program or computer-executable instructions may take the form of a program, software, software module, script, or code, written in any form of programming language (including compiled or interpreted languages, or declarative or procedural languages), and may be deployed in any form, including as a stand-alone program or as a module, component, subroutine, or other unit suitable for use in a computing environment.
  • A computer program or computer-executable instructions may, but need not, correspond to a file in a file system, and may be stored as part of a file that holds other programs or data, for example, in one or more scripts in a HyperText Markup Language (HTML) document, in a single file dedicated to the program in question, or in multiple collaborative files (for example, files that store one or more modules, subroutines, or portions of code).
  • Executable instructions may be deployed for execution on one computer device (in which case the one computer device is the data processing device), on multiple computer devices located at one site (in which case the multiple computer devices located at one site are the data processing device), or on multiple computer devices distributed across multiple sites and interconnected by a communication network (in which case the multiple computer devices distributed across multiple sites and interconnected by the communication network are the data processing device).
  • The embodiments of this application involve relevant data such as interaction records and conversion records. User permission or consent needs to be obtained, and the collection, use and processing of the relevant data need to comply with the relevant laws, regulations and standards of the relevant countries and regions.
  • In summary, the embodiments of this application obtain the target second-order information corresponding to the object to be recommended and derive the feature of the object to be recommended from it. Because the target second-order information includes not only the interactions between objects but also the interactions between objects and recommendation information, it is a kind of heterogeneous information. Therefore, obtaining the feature of the object to be recommended by nonlinearly mapping the target second-order information establishes an accurate association between the object to be recommended and the recommendation information, so that whether to recommend any piece of recommendation information to the object to be recommended can be determined accurately, which improves the accuracy of the conversion probability. Thus, even for a cold-start object, the accuracy of information recommendation can be improved, which in turn reduces the resource consumption of information recommendation.

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Accounting & Taxation (AREA)
  • Development Economics (AREA)
  • Finance (AREA)
  • Strategic Management (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Entrepreneurship & Innovation (AREA)
  • General Business, Economics & Management (AREA)
  • Marketing (AREA)
  • Economics (AREA)
  • Game Theory and Decision Science (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

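A data processing method, apparatus, device, computer-readable storage medium, and computer program product, applicable to various information recommendation scenarios such as cloud technology, artificial intelligence, intelligent transportation, games, and in-vehicle applications. The data processing method includes: obtaining a feature of an object to be recommended corresponding to the object to be recommended, where the feature of the object to be recommended is obtained from a nonlinear mapping result of target second-order information, the target second-order information is obtained by aggregating object features respectively corresponding to at least one interactive object and first information features respectively corresponding to at least one piece of recommendation information, each interactive object is an object that interacts with the object to be recommended, and the at least one piece of recommendation information is the full set of information converted by each interactive object; obtaining a feature of information to be recommended corresponding to the information to be recommended, where the information to be recommended is any piece of recommendation information converted by the interactive objects; and performing information recommendation for the object to be recommended based on a fusion result of the feature of the object to be recommended and the feature of the information to be recommended.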

Description

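A Data Processing Method, Apparatus, Device, Computer-Readable Storage Medium, and Computer Program Product
Cross-Reference to Related Applications
This application is based on, and claims priority to, Chinese patent application No. 202210662836.2, filed on June 13, 2022, the entire contents of which are incorporated herein by reference.
Technical Field
This application relates to information recommendation technology in the field of artificial intelligence, and in particular to a data processing method, apparatus, device, computer-readable storage medium, and computer program product.
Background
A cold-start object is an object whose conversion count is zero at the time of information recommendation. Because a cold-start object has converted no recommended information, its vertex is an isolated vertex in the conversion bipartite graph between objects and recommended information. An isolated vertex has no edges in a graph neural network, so information cannot propagate to it effectively. As a result, the conversion probability cannot be determined in cold-start scenarios, which lowers the accuracy of information recommendation and in turn increases its resource consumption.
Summary
The embodiments of this application provide a data processing method, apparatus, device, computer-readable storage medium, and computer program product that can improve the accuracy of information recommendation and reduce its resource consumption.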
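The technical solutions of the embodiments of this application are implemented as follows.
An embodiment of this application provides a data processing method performed by a data processing device, the method including:
obtaining a feature of an object to be recommended, where the feature of the object to be recommended is obtained from a nonlinear mapping result of target second-order information, the target second-order information is obtained by aggregating the object features respectively corresponding to at least one interaction object and the first information features respectively corresponding to at least one piece of recommended information, the interaction object is an object that interacts with the object to be recommended, and the at least one piece of recommended information is the full set of information converted by each interaction object;
obtaining a feature of information to be recommended, where the information to be recommended is any piece of the recommended information converted by the interaction object; and
recommending information to the object to be recommended based on a fusion result of the feature of the object to be recommended and the feature of the information to be recommended.
An embodiment of this application provides a data processing apparatus, including:
a feature obtaining module, configured to obtain a feature of an object to be recommended, where the feature of the object to be recommended is obtained from a nonlinear mapping result of target second-order information, the target second-order information is obtained by aggregating the object features respectively corresponding to at least one interaction object and the first information features respectively corresponding to at least one piece of recommended information, the interaction object is an object that interacts with the object to be recommended, and the at least one piece of recommended information is the full set of information converted by each interaction object;
the feature obtaining module being further configured to obtain a feature of information to be recommended, where the information to be recommended is any piece of the recommended information converted by the interaction object; and
an information recommendation module, configured to recommend information to the object to be recommended based on a fusion result of the feature of the object to be recommended and the feature of the information to be recommended.
An embodiment of this application provides a data processing device, including:
a memory, configured to store a computer program or computer-executable instructions; and
a processor, configured to implement the data processing method provided in the embodiments of this application when executing the computer program or computer-executable instructions stored in the memory.
An embodiment of this application provides a computer-readable storage medium storing a computer program or computer-executable instructions that, when executed by a processor, implement the data processing method provided in the embodiments of this application.
An embodiment of this application provides a computer program product including a computer program or computer-executable instructions that, when executed by a processor, implement the data processing method provided in the embodiments of this application.
The embodiments of this application have at least the following beneficial effects. The target second-order information used to obtain the feature of the object to be recommended covers both interactions between objects and interactions between objects and recommended information, so it is heterogeneous information. Obtaining the feature by applying a nonlinear mapping to the target second-order information therefore establishes an accurate association between the object to be recommended and the recommended information. Based on that feature, it can be accurately determined whether to recommend any piece of recommended information to the object, which improves the accuracy of the conversion probability and reduces the resource consumption of information recommendation. Moreover, even when the object to be recommended is a cold-start object, recommendation accuracy improves and resource consumption falls.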
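Brief Description of the Drawings
Figure 1 is a schematic architecture diagram of the information recommendation system provided in an embodiment of this application;
Figure 2 is a schematic structural diagram of a server in Figure 1 provided in an embodiment of this application;
Figure 3a is a first schematic flowchart of the data processing method provided in an embodiment of this application;
Figure 3b is a second schematic flowchart of the data processing method provided in an embodiment of this application;
Figure 3c is a schematic flowchart of obtaining the feature of the object to be recommended provided in an embodiment of this application;
Figure 4a is a first schematic flowchart of model training provided in an embodiment of this application;
Figure 4b is a second schematic flowchart of model training provided in an embodiment of this application;
Figure 5 is a schematic diagram of an exemplary information recommendation flow provided in an embodiment of this application;
Figure 6 is a schematic diagram of an exemplary click bipartite graph provided in an embodiment of this application;
Figure 7 is a schematic diagram of an exemplary social graph provided in an embodiment of this application;
Figure 8 is a schematic diagram of an exemplary heterogeneous graph provided in an embodiment of this application;
Figure 9 is a schematic diagram of an exemplary heterogeneous information aggregation provided in an embodiment of this application;
Figure 10 is a schematic diagram of another exemplary heterogeneous information aggregation provided in an embodiment of this application;
Figure 11 is a schematic diagram of an exemplary weight update provided in an embodiment of this application;
Figure 12 is a schematic diagram of an exemplary model performance comparison provided in an embodiment of this application.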
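Detailed Description
To make the objectives, technical solutions, and advantages of this application clearer, this application is described in further detail below with reference to the drawings. The described embodiments should not be regarded as limiting this application, and all other embodiments obtained by a person of ordinary skill in the art without creative effort fall within the scope of protection of this application.
In the following description, "some embodiments" describes a subset of all possible embodiments. It is understood that "some embodiments" may be the same subset or different subsets of all possible embodiments, and that they may be combined with one another where no conflict arises.
In the following description, the terms "first" and "second" merely distinguish similar objects and do not imply a particular ordering. Where permitted, the specific order or sequence of "first" and "second" items may be interchanged, so that the embodiments described here can be implemented in an order other than the one illustrated or described.
Unless otherwise defined, all technical and scientific terms used in the embodiments of this application have the meanings commonly understood by a person skilled in the technical field of this application. The terms used here serve only to describe the embodiments and are not intended to limit this application.
Before the embodiments are described in further detail, the nouns and terms involved are explained. The following explanations apply to them.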
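1) Artificial intelligence (AI) refers to the theories, methods, technologies, and application systems that use a digital computer, or a machine controlled by a digital computer, to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use knowledge to obtain the best result.
2) Machine learning (ML) is an interdisciplinary field that draws on probability theory, statistics, approximation theory, convex analysis, algorithmic complexity theory, and other disciplines. It studies how computers simulate or implement human learning behavior to acquire new knowledge or skills and to reorganize existing knowledge structures so as to keep improving their own performance. Machine learning is the core of artificial intelligence and the fundamental way to make computers intelligent; it is applied throughout artificial intelligence. Machine learning typically includes techniques such as artificial neural networks, belief networks, reinforcement learning, transfer learning, and inductive learning.
3) An artificial neural network is a mathematical model that imitates the structure and function of a biological neural network. Exemplary structures in the embodiments of this application include the graph convolutional network (GCN), deep neural network (DNN), convolutional neural network (CNN), recurrent neural network (RNN), neural state machine (NSM), and phase-functioned neural network (PFNN). The specified heterogeneous graph and the heterogeneous graph to be updated referred to in the embodiments are models corresponding to an artificial neural network.
4) Cloud computing is a computing model that distributes computing tasks over a resource pool made up of a large number of computers, so that application systems can obtain computing power, storage space, and information services as needed. The network that supplies resources to the pool is called the "cloud". To users, the resources in the cloud appear infinitely expandable: they can be obtained at any time, used on demand, expanded at any time, and paid for by use. The data processing method provided in the embodiments of this application can be implemented through cloud computing.
5) Conversion probability (conversion rate, CVR) is the probability of a successful conversion. A successful conversion includes clicking, browsing, downloading, purchasing, running, and registering, so conversion probability includes click probability (click-through rate, CTR), browse probability, download probability, purchase probability, run probability, and registration probability. Examples are the probability that an account exposed to an advertisement clicks it, and the probability that such an account purchases the target resource.
6) A homogeneous graph is a graph with one type of vertex and one type of edge. The object interaction graph in the embodiments of this application is a homogeneous graph.
7) A heterogeneous graph is a graph in which the vertices, the edges, or both come in two or more types. The object-information conversion graph, the specified heterogeneous graph, and the heterogeneous graph to be updated in the embodiments of this application are all heterogeneous graphs.
8) A bipartite graph is a graph with two types of vertices whose edges exist only between vertices of different types. That is, the vertex set of a bipartite graph consists of two disjoint subsets, the two endpoints of every edge belong to different subsets, and vertices within the same subset are never adjacent. In the embodiments of this application, the object-information conversion graph is a bipartite graph made up of an object set and an information set, and an edge indicates that an object successfully converted a piece of information.
9) A social networking service (SNS) is the network structure obtained by connecting objects through their interactions. Interaction here means social behavior between objects, such as following, teaming up in a match, or exchanging virtual resources.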
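One way to recommend information is to derive a recommendation strategy from the historical conversion relationships between recommended information and objects, and then apply that strategy in the current recommendation pass. However, the objects being recommended to differ from one recommendation cycle to the next. The current object to be recommended may never have appeared in the historical conversion relationships; in other words, it is a cold-start object. A strategy derived from historical conversions therefore does not apply to it, so recommendation accuracy in cold-start scenarios is low and resource consumption is high.
Information recommendation can also be implemented with a GCN, for example neural graph collaborative filtering (NGCF) or the simplified GCN (LightGCN). NGCF builds a bipartite graph with objects and information as vertices and estimates the conversion probability through message propagation; LightGCN estimates the conversion probability after removing the feature transformation and nonlinear operations of NGCF. In a scenario that includes cold-start objects, however, the object to be recommended has converted no recommended information, so it is an isolated vertex in the bipartite graph. An isolated vertex has no edges in the GCN and information cannot propagate to it, so the conversion probability of the object for the recommended information cannot be determined. This lowers recommendation accuracy in cold-start scenarios and increases resource consumption.
Several further approaches target cold-start objects specifically. The first is an exploration and exploitation strategy. The second is transfer learning and meta-learning: the conversion behavior of the cold-start object toward recommended information in other recommendation scenarios is learned and adapted, through knowledge transfer, to the current scenario. The third is association-based recommendation on a knowledge graph (KG), using the similarity of recommended information on the KG; the information recommended to a cold-start object is the recommended information converted by objects similar to it on the KG. The fourth is a heterogeneous graph neural model that treats the "object-recommended information" conversion bipartite graph and the "object-object" social graph as subgraphs of a full graph; the full graph is the linear concatenation of the two subgraphs, and SNS prior knowledge is used to recommend to the cold-start object. Each approach has drawbacks. With exploration and exploitation, the cold-start object starts out with equal preference for every category of recommended information, so repeated trial and error is needed to discover its latent preferences; at the same time, a larger number (above a quantity threshold) of advertisements known to match account interests (for example, advertisements whose conversion probability exceeds a threshold) must be pushed to encourage the account to click on advertising material. When a knowledge graph or a social graph is fused with the "object-recommended information" conversion bipartite graph, the information on the different graph structures is heterogeneous, and linear operations cannot adequately measure the similarity between accounts. This lowers recommendation accuracy in cold-start scenarios and increases resource consumption.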
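On this basis, the embodiments of this application provide a data processing method, apparatus, device, computer-readable storage medium, and computer program product that can improve the accuracy of information recommendation and reduce its resource consumption. Exemplary applications of the data processing device are described next. The device may be implemented as any of various terminals, such as a smartphone, smart watch, notebook computer, tablet computer, desktop computer, smart home appliance, set-top box, smart in-vehicle device, portable music player, personal digital assistant, dedicated messaging device, smart voice interaction device, portable game device, or smart speaker; as a server; or as a combination of the two. The following describes an exemplary application in which the device is implemented as a server.
Referring to Figure 1, a schematic architecture diagram of the information recommendation system provided in an embodiment of this application: to support an information recommendation application, in information recommendation system 100, terminal 200 (terminals 200-1 and 200-2 are shown as examples) connects to server 400 (the data processing device) through network 300, which may be a wide area network, a local area network, or a combination of the two. System 100 also includes database 500, which provides data support to server 400. Figure 1 shows database 500 as independent of server 400, but it may also be integrated into server 400; the embodiments of this application do not limit this.
Terminal 200 is configured to display the target information to be recommended in a graphical interface (graphical interfaces 210-1 and 210-2 are shown as examples).
Server 400 is configured to: obtain a feature of an object to be recommended, where the feature is obtained from a nonlinear mapping result of target second-order information, the target second-order information is obtained by aggregating the object features respectively corresponding to at least one interaction object and the first information features respectively corresponding to at least one piece of recommended information, an interaction object is an object that interacts with the object to be recommended, and the at least one piece of recommended information is the full set of information converted by each interaction object; obtain a feature of information to be recommended, where the information to be recommended is any piece of recommended information converted by an interaction object; and, based on a fusion result of the two features, send to terminal 200 through network 300 the target information to be recommended to the object.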
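In some embodiments, server 400 may be an independent physical server, a server cluster or distributed system made up of multiple physical servers, or a cloud server that provides basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, content delivery network (CDN), and big data and artificial intelligence platforms. The terminal and the server may be connected directly or indirectly by wired or wireless communication; the embodiments of this application do not limit this.
Referring to Figure 2, a schematic structural diagram of a server in Figure 1: server 400 shown in Figure 2 includes at least one processor 410, memory 450, and at least one network interface 420. The components of server 400 are coupled together by bus system 440, which implements the connection and communication among them. In addition to a data bus, bus system 440 includes a power bus, a control bus, and a status signal bus. For clarity, all of these buses are labeled bus system 440 in Figure 2.
Processor 410 may be an integrated circuit chip with signal processing capability, such as a general-purpose processor, a digital signal processor (DSP), another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. The general-purpose processor may be a microprocessor or any conventional processor.
Memory 450 may be removable, non-removable, or a combination of the two. Exemplary hardware devices include solid-state memory, hard disk drives, and optical disc drives. Memory 450 optionally includes one or more storage devices physically remote from processor 410.
Memory 450 includes volatile memory or non-volatile memory, and may include both. The non-volatile memory may be read-only memory (ROM), and the volatile memory may be random access memory (RAM). Memory 450 as described in the embodiments of this application is intended to include any suitable type of memory.
In some embodiments, memory 450 can store data to support various operations. Examples of such data include programs, modules, and data structures, or subsets or supersets of them, as illustrated below.
Operating system 451 includes system programs for handling basic system services and performing hardware-related tasks, such as the framework layer, core library layer, and driver layer, and is used to implement basic services and handle hardware-based tasks.
Network communication module 452 is used to reach other computer devices through one or more (wired or wireless) network interfaces 420. Exemplary network interfaces 420 include Bluetooth, Wi-Fi, and universal serial bus (USB).
In some embodiments, the data processing apparatus provided in the embodiments of this application may be implemented in software. Figure 2 shows data processing apparatus 455 stored in memory 450. It may be software in the form of a program or plug-in and includes the following software modules: feature obtaining module 4551, information recommendation module 4552, object judgment module 4553, and model training module 4554. These modules are logical, so they may be combined or further split in any way according to the functions implemented. The function of each module is described below.
In some embodiments, the data processing apparatus may be implemented in hardware. As an example, it may be a processor in the form of a hardware decoding processor that is programmed to perform the data processing method provided in the embodiments of this application. Such a processor may use one or more application-specific integrated circuits (ASIC), DSPs, programmable logic devices (PLD), complex programmable logic devices (CPLD), field-programmable gate arrays (FPGA), or other electronic components.
The data processing method provided in the embodiments of this application is described below in connection with the exemplary applications and implementations of the data processing device. The method applies to information recommendation scenarios such as cloud technology, artificial intelligence, intelligent transportation, and in-vehicle systems.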
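Referring to Figure 3a, a first schematic flowchart of the data processing method provided in an embodiment of this application, the method is described with reference to the steps shown in Figure 3a.
Step 301: Obtain the feature of the object to be recommended.
In this embodiment, when recommending information to the object to be recommended, the data processing device first determines a feature representation of the object that is associated with recommended information. This is the feature of the object to be recommended. It is determined from at least one interaction object; each interaction object is an object that the object to be recommended has interacted with, and each has converted at least one piece of recommended information.
The feature of the object to be recommended may be determined in real time or in advance; the embodiments of this application do not limit this. To determine it, the data processing device first obtains the feature representation of each interaction object (the object feature) and of each piece of recommended information (the first information feature). It then aggregates these object features and first information features to obtain the target second-order information of the object to be recommended. Finally, it applies a nonlinear mapping to the target second-order information to obtain the feature of the object to be recommended. The device may also obtain the target second-order information and apply the nonlinear mapping iteratively. An object feature may be an embedding of the interaction object, a one-hot encoding of the interaction object, a feature representation of the interaction object's labels, and so on; the embodiments of this application do not limit this.
Note that the feature of the object to be recommended is the nonlinear mapping result of the target second-order information; the target second-order information is obtained by aggregating the object features of the at least one interaction object and the first information features of the at least one piece of recommended information; the interaction objects are the objects that interact with the object to be recommended; and the recommended information is the information converted by each interaction object. Nonlinear mapping here means processing that raises the dimension of the feature space, for example processing based on a kernel function such as a Gaussian kernel. After the mapping, the result lives in a higher-dimensional feature space than the input: the target second-order information is a low-dimensional feature and the feature of the object to be recommended is a high-dimensional feature. This lets the object to be recommended be associated effectively with each piece of recommended information and lets similarity between objects be determined accurately. In addition, the domain in which the object to be recommended interacts with the interaction objects may be the same as the domain in which the interaction objects converted their recommended information; for example, both may be the game domain, or both the instant messaging domain.
Step 302: Obtain the feature of the information to be recommended.
In this embodiment, the information to be recommended is any piece of recommended information converted by an interaction object, that is, any one of the at least one piece of recommended information converted by any interaction object. The data processing device obtains the feature representation of the information to be recommended, which is the feature of the information to be recommended.
This feature is used to determine whether the information to be recommended may be recommended to the object to be recommended. It may be an embedding of the information to be recommended, its one-hot encoding, a feature representation of its labels, and so on; the embodiments of this application do not limit this.
Step 303: Recommend information to the object to be recommended based on the fusion result of the feature of the object to be recommended and the feature of the information to be recommended.
In this embodiment, once the data processing device has both features, it uses them to decide whether to recommend the information to the object. It first fuses the two features and then processes the fusion result with an activation function (for example, the Sigmoid function) to obtain the probability that the object to be recommended converts the information to be recommended. Based on that probability it decides whether to recommend the information, which completes the recommendation to the object.
Because the feature of the object to be recommended is determined from the target second-order information, and that information aggregates the object features of the interaction objects with the first information features of the recommended information, the following are linked together: the interaction information between objects and the conversion information between objects and recommended information. And because the feature is obtained by nonlinearly mapping the target second-order information, objects and recommended information can interact effectively, which improves recommendation accuracy and reduces the resource consumption of information recommendation.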
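The fusion and activation in step 303 can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes the fusion is an inner product of two equal-length vectors, and the function names and the 0.5 threshold are hypothetical.

```python
import math


def conversion_probability(object_feature, info_feature):
    """Fuse two feature vectors by inner product (an assumption), then apply Sigmoid."""
    if len(object_feature) != len(info_feature):
        raise ValueError("feature vectors must have the same length")
    fused = sum(a * b for a, b in zip(object_feature, info_feature))
    # Numerically stable Sigmoid: avoid exp() overflow for large |fused|.
    if fused >= 0:
        return 1.0 / (1.0 + math.exp(-fused))
    e = math.exp(fused)
    return e / (1.0 + e)


def should_recommend(object_feature, info_feature, threshold=0.5):
    """Recommend only when the estimated conversion probability exceeds the threshold."""
    return conversion_probability(object_feature, info_feature) > threshold
```

Any other fusion (concatenation followed by a learned layer, for instance) would fit the description equally well; the inner product is chosen here only because it is the simplest option.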
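Referring to Figure 3b, a second schematic flowchart of the data processing method: as shown in Figure 3b, step 301 may be implemented through steps 3011 to 3014. That is, obtaining the feature of the object to be recommended includes steps 3011 to 3014, each described below.
Step 3011: Obtain the target second-order information of the object to be recommended.
In this embodiment, the data processing device may aggregate the object features of the at least one interaction object and the first information features of the at least one piece of recommended information directly. For example, the device obtains a first accumulation result by summing the object features of the interaction objects, obtains for each interaction object a second accumulation result by summing the first information features of its recommended information, and then sums the first accumulation result with the second accumulation results of all interaction objects to obtain the target second-order information.
The data processing device may also use weights in the aggregation. In that case, obtaining the target second-order information includes: obtaining the interaction weight between the object to be recommended and each interaction object, and the conversion weight between each interaction object and each piece of recommended information; obtaining a first fusion result of each interaction weight with the corresponding object feature, and a second fusion result of each conversion weight with the corresponding first information feature; and combining the first fusion results of all interaction objects with the second fusion results of the recommended information converted by each interaction object into the target second-order information.
The interaction weight is the weight between the object to be recommended and an interaction object and represents the intimacy between them. The data processing device may determine the intimacy from at least one of: number of interactions, interaction duration, interaction frequency, and interaction mode. The conversion weight is the weight between an interaction object and a piece of recommended information and represents the degree of conversion between them. The device may determine the degree of conversion from at least one of: number of conversions, conversion duration, conversion frequency, and conversion mode (for example, clicking, ordering, browsing, playing, or following). Fusing an interaction weight with the object feature of its interaction object gives a first fusion result, so the at least one interaction object yields at least one first fusion result. Fusing a conversion weight with the first information feature of its recommended information gives a second fusion result, so the at least one piece of recommended information yields at least one second fusion result; that is, each interaction object corresponds to at least one second fusion result. The results may be combined into the target second-order information by addition, concatenation, weighted fusion, and so on; the embodiments of this application do not limit this.
Note also the difference between the two variants. The first accumulation result is obtained by directly summing the object features of the interaction objects, whereas the first fusion results weight those object features by the intimacy between the object to be recommended and each interaction object before they are summed. The second accumulation result is obtained by directly summing the first information features of the recommended information, whereas the second fusion results weight those features by the degree of conversion between each interaction object and the recommended information before they are summed.
Obtaining the target second-order information with weights, and then deriving the feature of the object to be recommended from it, makes the target second-order information more accurate. This in turn makes the feature more accurate and better associated with the information to be recommended, which improves recommendation accuracy.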
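The weighted aggregation of step 3011 can be sketched as follows. This is only an illustration under stated assumptions: "fusion" is taken to be scalar multiplication and "combination" to be addition (the text also allows concatenation or weighted fusion), and the dictionary layout and function names are hypothetical.

```python
def weighted_sum(weighted_vectors, dim):
    """Sum weight * vector over (weight, vector) pairs of length dim."""
    total = [0.0] * dim
    for weight, vector in weighted_vectors:
        for j in range(dim):
            total[j] += weight * vector[j]
    return total


def second_order_info(interactions, conversions, object_features, info_features, dim):
    """
    interactions: {interaction_object_id: interaction_weight} for the object to be recommended
    conversions:  {interaction_object_id: {info_id: conversion_weight}}
    Returns the target second-order information as a vector of length dim.
    """
    terms = []
    for obj_id, interaction_weight in interactions.items():
        # First fusion result: interaction weight times the object feature.
        terms.append((interaction_weight, object_features[obj_id]))
        for info_id, conversion_weight in conversions.get(obj_id, {}).items():
            # Second fusion result: conversion weight times the first information feature.
            terms.append((conversion_weight, info_features[info_id]))
    return weighted_sum(terms, dim)
```

Setting every weight to 1 reduces this to the unweighted accumulation variant described first.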
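Step 3012: Obtain the spatial distance between the target second-order information and specified center information.
The data processing device has access to the specified center information, which is determined from multiple pieces of second-order information. These include the target second-order information and other second-order information, which may be the second-order information of each of the at least one interaction object. For example, the specified center information may be the mean of the multiple pieces of second-order information. The second-order information of an object is the aggregation of the features of the objects it has interacted with and the features of the information converted by each of those objects.
In this embodiment, the data processing device obtains the feature difference between the target second-order information and the specified center information and takes it as the spatial distance between them, for example the Euclidean distance.
Step 3013: Apply a nonlinear mapping to the spatial distance based on multiple specified mapping parameters to obtain multiple second-order features to be fused.
The data processing device has access to multiple specified mapping parameters. A specified mapping parameter represents the extent of a mapping space, for example a Gaussian bandwidth. The device applies a nonlinear mapping to the spatial distance with each specified mapping parameter, mapping the distance into mapping spaces of different extents. The multiple specified mapping parameters therefore yield multiple second-order features to be fused, each being the result of the nonlinear mapping under its own mapping parameter.
Step 3014: Obtain a first nonlinear mapping result corresponding to the multiple second-order features to be fused, and obtain the feature of the object to be recommended based on the first nonlinear mapping result.
In this embodiment, after obtaining the multiple second-order features to be fused, the data processing device integrates them to obtain the first nonlinear mapping result. The device may take the first nonlinear mapping result directly as the feature of the object to be recommended, or combine it with other information (for example, information the object itself has converted) to form that feature; the embodiments of this application do not limit this. The nonlinear mapping result corresponding to the target second-order information thus includes at least the first nonlinear mapping result.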
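Steps 3012 to 3014 can be sketched as a multi-bandwidth Gaussian mapping of a distance. The sketch rests on several assumptions: the center is the mean of the second-order vectors, the distance is Euclidean, each mapping is exp(-d² / (2σ²)), and "integration" keeps the per-bandwidth values as a list. The function names are hypothetical.

```python
import math


def mean_center(second_order_infos):
    """Specified center information, taken here as the mean of the second-order vectors."""
    count = len(second_order_infos)
    dim = len(second_order_infos[0])
    return [sum(vec[j] for vec in second_order_infos) / count for j in range(dim)]


def euclidean_distance(x, center):
    """Spatial distance between a second-order vector and the center."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, center)))


def multi_bandwidth_features(x, center, bandwidths):
    """Map the distance through one Gaussian kernel per bandwidth (mapping parameter)."""
    d = euclidean_distance(x, center)
    return [math.exp(-(d * d) / (2.0 * sigma * sigma)) for sigma in bandwidths]
```

A small bandwidth reacts only to vectors very close to the center, while a large one still separates distant vectors, which is the usual motivation for using several bandwidths at once.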
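In this embodiment, before the data processing device obtains the feature of the object to be recommended in step 301, the method further includes: obtaining a conversion identifier of the object to be recommended with respect to the library of information to be recommended.
The library of information to be recommended contains the at least one piece of recommended information converted by each of the at least one interaction object. The data processing device checks whether the object to be recommended has converted each piece of recommended information in the library and so obtains the conversion identifier, which indicates whether the object has ever converted recommended information in the library.
Accordingly, referring to Figure 3c, a schematic flowchart of obtaining the feature of the object to be recommended: as shown in Figure 3c, obtaining that feature based on the first nonlinear mapping result in step 3014 includes steps 30141A and 30142A, each described below.
Step 30141A: When the conversion identifier indicates that the library of information to be recommended includes at least one piece of converted information, aggregate the at least one second information feature corresponding to the converted information to obtain target first-order information of the object to be recommended.
When the conversion identifier indicates that the library includes at least one piece of information converted by the object to be recommended, the object has converted recommended information in the library, so it is a non-cold-start object. Each piece of converted information is a piece of recommended information in the library that the object has converted. In this case the association between the object and the recommended information can additionally be established through the information the object itself has converted. The data processing device therefore aggregates the second information features of the converted information to obtain the target first-order information of the object. A second information feature is the feature of a piece of converted information.
Step 30142A: Obtain a second nonlinear mapping result corresponding to the target first-order information, and combine the second nonlinear mapping result with the first nonlinear mapping result into the feature of the object to be recommended.
In this embodiment, the data processing device applies a nonlinear mapping to the target first-order information to obtain the second nonlinear mapping result. The process is similar to the nonlinear mapping of the target second-order information and is not repeated here. The device then combines the first and second nonlinear mapping results to obtain the feature of the object to be recommended. The combination may be addition, weighted addition, and so on. When the first nonlinear mapping result is combined with "other information" to obtain the feature, as mentioned in step 3014, that other information is the second nonlinear mapping result.
When the object to be recommended is a non-cold-start object, the feature is thus obtained from both the first nonlinear mapping result of the target second-order information and the second nonlinear mapping result of the target first-order information. This diversifies the data on which the feature is based, which improves the accuracy of the feature, improves recommendation accuracy, and reduces the resource consumption of information recommendation.
In this embodiment, combining the second nonlinear mapping result with the first nonlinear mapping result in step 30142A includes: combining the two results to obtain initial aggregation information; obtaining a first combination weight that is negatively correlated with the initial aggregation information and positively correlated with the second nonlinear mapping result, and obtaining the second combination weight corresponding to the first combination weight; and combining a third fusion result with a fourth fusion result to obtain the feature of the object to be recommended, where the third fusion result is the fusion of the first combination weight with the second nonlinear mapping result, and the fourth fusion result is the fusion of the second combination weight with the first nonlinear mapping result.
The data processing device may combine the first and second nonlinear mapping results by addition or a similar operation to obtain the initial aggregation information. The first and second combination weights are negatively correlated; for example, the first combination weight is 1 minus the second combination weight. The device may likewise combine the third and fourth fusion results by addition or a similar operation to obtain the feature of the object to be recommended.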
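One way to realize the combination weights is sketched below. It is an assumption-laden illustration: the first combination weight is taken as the share of the first-order mapping in the sum of both mappings (which makes it positively correlated with the first-order mapping and negatively correlated with the initial aggregate), both mappings are lists of non-negative kernel outputs, and the function names are hypothetical.

```python
def combination_weights(first_order_map, second_order_map):
    """
    first_order_map:  the second nonlinear mapping result (from first-order information)
    second_order_map: the first nonlinear mapping result (from second-order information)
    Returns (first combination weight, second combination weight).
    """
    first_total = sum(first_order_map)
    initial_aggregate = first_total + sum(second_order_map)
    alpha = first_total / initial_aggregate if initial_aggregate > 0 else 0.0
    return alpha, 1.0 - alpha


def combine_mappings(first_order_map, second_order_map):
    """Add the third and fourth fusion results element by element."""
    alpha, beta = combination_weights(first_order_map, second_order_map)
    return [alpha * a + beta * b for a, b in zip(first_order_map, second_order_map)]
```

With an all-zero first-order mapping the first weight is 0 and the result equals the second-order mapping, which matches the cold-start branch described next.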
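Still referring to Figure 3c, in this embodiment, obtaining the feature of the object to be recommended based on the first nonlinear mapping result in step 3014 alternatively includes step 30141B, described below.
Step 30141B: When the conversion identifier indicates that the conversion object library is independent of the object to be recommended, take the first nonlinear mapping result as the feature of the object to be recommended.
When the conversion identifier indicates that the conversion object library corresponding to the library of information to be recommended is independent of the object to be recommended, the object has never converted recommended information in that library and is therefore a cold-start object. In other words, the object does not belong to the conversion object library, which is the set of objects that have converted information in the library of information to be recommended. In this case the association between the object and the recommended information is obtained through the first nonlinear mapping result of the target second-order information.
In this embodiment, the feature of the object to be recommended and the feature of the information to be recommended are obtained from a specified heterogeneous graph. In this graph, a vertex is the feature of an object (the object to be recommended or one of the interaction objects) or the feature of a piece of recommended information (the recommended information of each interaction object), and an edge connects two objects or an object and a piece of recommended information. Edges may be weighted or unweighted. The weight of a weighted edge represents the degree of association between its two vertices, such as the intimacy between two objects or the degree of conversion between an object and a piece of recommended information. The feature of an object is obtained by aggregating the features of the recommended information converted by its associated objects.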
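Referring to Figure 4a, a first schematic flowchart of model training: as shown in Figure 4a, the specified heterogeneous graph is obtained through steps 305 to 309, each described below.
Step 305: Build an object interaction graph from the interaction records among at least two first objects.
The at least two first objects include the object to be recommended and the at least one interaction object. In the object interaction graph built by the data processing device, a vertex is the feature representation of a first object and an edge indicates that two first objects have interacted. When an edge is weighted, its weight represents the intimacy determined from the interaction information between the two first objects (for example, number, frequency, and type of interactions).
Step 306: Build an object-information conversion graph from the conversion records of at least one second object for at least one piece of initial recommended information.
The at least one piece of initial recommended information includes the recommended information converted by each interaction object. In the object-information conversion graph built by the data processing device, a vertex is the feature representation of a second object or of a piece of initial recommended information, and an edge indicates that a second object has converted a piece of initial recommended information. When an edge is weighted, its weight represents the degree of conversion determined from the second object's conversion information for that initial recommended information (for example, number, duration, frequency, and type of conversions).
Step 307: Fuse the object interaction graph and the object-information conversion graph, based on the objects common to the at least two first objects and the at least one second object, to obtain the heterogeneous graph to be updated.
In this embodiment, after obtaining both graphs, the data processing device first finds the objects common to the first objects and the second objects. It then joins, for each common object, the first objects linked to it by edges in the object interaction graph with the initial recommended information linked to it by edges in the object-information conversion graph. The result is the heterogeneous graph to be updated, which contains both the interaction relationships between a common object and the first objects it has interacted with and the conversion relationships between a common object and the initial recommended information it has converted.
Step 308: Iteratively update the object vertices of the heterogeneous graph to be updated, based on the nonlinear mapping result corresponding to the second-order information of each object vertex.
In this embodiment, some vertices of the heterogeneous graph to be updated are object vertices and others are information vertices. An object vertex is the feature representation of a first or second object; an information vertex is the feature representation of a piece of initial recommended information. The data processing device iteratively updates the object vertices to update the graph, and it updates each object vertex based on the nonlinear mapping result corresponding to that vertex's second-order information.
The second-order information of an object vertex covers the object vertices of the objects it has interacted with and the information vertices of the initial recommended information those objects have converted. Second-order information in the embodiments of this application is obtained by aggregating the features of the objects an object has interacted with and the features of the recommended information those objects have converted; the target second-order information is the second-order information of the object to be recommended.
Step 309: Take the iteratively updated heterogeneous graph as the specified heterogeneous graph.
In this embodiment, the data processing device iteratively updates the heterogeneous graph to be updated. When the updated graph meets a specified stopping condition, the device ends the iteration and takes the updated graph as the specified heterogeneous graph. The stopping condition is that the graph after the current iteration reaches a specified metric, for example accuracy above a specified accuracy, a loss value below a specified loss value, or an area under the curve (AUC) above a specified area.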
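Referring to Figure 4b, a second schematic flowchart of model training: in this embodiment, step 308 may be implemented through steps 3081 to 3086 (not all shown in the figure). That is, iteratively updating the object vertices of the heterogeneous graph to be updated, based on the nonlinear mapping result corresponding to the second-order information of each object vertex, includes steps 3081 to 3086, each described below.
Step 3081: For each object vertex of the heterogeneous graph to be updated, update the object vertex based on the nonlinear mapping result corresponding to its second-order information.
Step 3082: Take the heterogeneous graph whose update is complete as the current heterogeneous graph.
The process of obtaining the second-order information of each object vertex is similar to that of obtaining the target second-order information; the process of obtaining the corresponding nonlinear mapping result is similar to that for the target second-order information; and the process of updating an object vertex is similar to that of obtaining the feature of the object to be recommended from the nonlinear mapping result of the target second-order information. These are not repeated here.
Step 3083: Apply an attention update to the edge weights of the current heterogeneous graph to obtain edge weights to be updated.
In this embodiment, the attention update includes performing the following for each current object vertex of the current heterogeneous graph: obtaining the at least one adjacent object vertex of the current object vertex, an adjacent object vertex being an object vertex adjacent to it; determining, based on the at least one adjacent object vertex, the attention interaction weight between the current object vertex and each adjacent object vertex; obtaining the at least one adjacent information vertex of the current object vertex, an adjacent information vertex being an information vertex adjacent to it; and determining, based on the at least one adjacent information vertex, the attention conversion weight between the current object vertex and each adjacent information vertex. An edge weight to be updated is either an attention interaction weight or an attention conversion weight.
For each adjacent object vertex, the data processing device determines the attention interaction weight from the share that vertex holds among the at least one adjacent object vertex; that is, the attention interaction weight is the proportion of each adjacent object vertex among all adjacent object vertices. For each adjacent information vertex, the device determines the attention conversion weight from the share that vertex holds among the at least one adjacent information vertex; that is, the attention conversion weight is the proportion of each adjacent information vertex among all adjacent information vertices.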
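The "share among neighbors" in step 3083 can be sketched as a softmax over per-neighbor scores. The scoring function is an assumption: the sketch scores each neighbor by a LeakyReLU of its inner product with the current vertex, a common choice in graph attention; the slope 0.2 and the function names are hypothetical.

```python
import math


def leaky_relu(x, slope=0.2):
    """LeakyReLU activation with an assumed negative slope."""
    return x if x > 0 else slope * x


def attention_weights(current_vertex, neighbors):
    """
    current_vertex: feature vector of the current object vertex
    neighbors:      {neighbor_id: feature vector}, all of one type
                    (all object vertices, or all information vertices)
    Returns {neighbor_id: weight}; the weights are positive and sum to 1.
    """
    if not neighbors:
        return {}
    scores = {
        nid: leaky_relu(sum(a * b for a, b in zip(current_vertex, vec)))
        for nid, vec in neighbors.items()
    }
    peak = max(scores.values())  # subtract the max for numerical stability
    exps = {nid: math.exp(score - peak) for nid, score in scores.items()}
    normalizer = sum(exps.values())
    return {nid: value / normalizer for nid, value in exps.items()}
```

Calling the function once with the adjacent object vertices gives attention interaction weights, and once with the adjacent information vertices gives attention conversion weights.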
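Step 3084: Apply adaptive enhancement to the edge weights to be updated to obtain target edge weights.
In this embodiment, the adaptive enhancement includes: obtaining at least one edge weight to be updated that is adjacent to the target edge weight to be updated and of a different type from it, where the target edge weight to be updated is any edge weight to be updated that is to undergo adaptive enhancement; and enhancing the target edge weight to be updated based on the at least one edge weight to be updated, to obtain the target edge weight.
When the target edge weight to be updated is an attention interaction weight, the data processing device obtains the at least one attention conversion weight adjacent to it and adds each of them to the attention interaction weight to obtain a first weight superposition sum. The device then determines a first enhancement parameter that is negatively correlated with the first weight superposition sum and positively correlated with the attention conversion weight in question, and fuses the first enhancement parameter with that attention conversion weight to obtain a first enhanced weight. Finally, the device adds the attention interaction weight to the first enhanced weights of all the adjacent attention conversion weights. The result is the updated target edge weight to be updated, which is the target edge weight in the current heterogeneous graph.
When the target edge weight to be updated is an attention conversion weight, the data processing device obtains the at least one attention interaction weight adjacent to it and adds each of them to the attention conversion weight to obtain a second weight superposition sum. The device then determines a second enhancement parameter that is negatively correlated with the second weight superposition sum and positively correlated with the attention interaction weight in question, and fuses the second enhancement parameter with that attention interaction weight to obtain a second enhanced weight. Finally, the device adds the attention conversion weight to the second enhanced weights of all the adjacent attention interaction weights. The result is the updated target edge weight to be updated, which is the target edge weight in the current heterogeneous graph.
Step 3085: Aggregate the second-order information of each current object vertex of the current heterogeneous graph based on the target edge weights.
The process of obtaining the second-order information of each current object vertex from the target edge weights is similar to the process of obtaining the target second-order information from the interaction weights and conversion weights, and is not repeated here.
Step 3086: Iteratively update the object vertices of the current heterogeneous graph based on the nonlinear mapping result corresponding to the second-order information of each current object vertex.
In this embodiment, the process of iteratively updating the current heterogeneous graph is similar to that of iteratively updating the heterogeneous graph to be updated, and is not repeated here.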
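In this embodiment, recommending information to the object to be recommended in step 303 includes: determining, based on the fusion result of the feature of the object to be recommended and the feature of the information to be recommended, the conversion probability with which the object converts the information; when the library of information to be recommended contains at least two pieces of information to be recommended, sorting them in descending order of the object's conversion probability for each to obtain a sequence of information to be recommended; taking a specified number of pieces of information, selected in order from the sequence, as the target information to be recommended; and recommending the target information to be recommended to the object. The specified number is at least one.
The data processing device may also compare the conversion probability with a specified probability and recommend the information to the object when the conversion probability exceeds the specified probability.
In this embodiment, the iterative update of the heterogeneous graph to be updated may update only the object vertices or all vertices (object vertices and information vertices); the embodiments of this application do not limit this. The process of updating an information vertex is similar to that of updating an object vertex and is not repeated here.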
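The ranking and threshold rules above can be sketched together. This is an illustration only; the function name, the dictionary input, and the optional threshold argument are hypothetical.

```python
def select_recommendations(probabilities, count=1, min_probability=None):
    """
    probabilities:   {info_id: conversion probability for the object to be recommended}
    count:           the specified number of items to select (at least one)
    min_probability: optional specified probability; items at or below it are dropped
    Returns the selected info ids, highest conversion probability first.
    """
    if count < 1:
        raise ValueError("count must be at least one")
    ranked = sorted(probabilities.items(), key=lambda item: item[1], reverse=True)
    if min_probability is not None:
        ranked = [item for item in ranked if item[1] > min_probability]
    return [info_id for info_id, _ in ranked[:count]]
```

For example, with probabilities of 0.2, 0.9, and 0.5 for items a, b, and c and a count of 2, the function returns b followed by c.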
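An exemplary application of the embodiments of this application in a real scenario is described next. It is performed by a server (the data processing device) in the game domain. First, an "account-advertisement" click bipartite graph (the object-information conversion graph) is built from the historical clicks between accounts and advertisements in historical recommendation data, and an "account-account" social graph (the object interaction graph) is built from the interactions among accounts. The click bipartite graph and the social graph are then fused into a heterogeneous graph (the heterogeneous graph to be updated), and heterogeneous information is aggregated for the account vertices of that graph. This lets a cold-start account exchange information with advertisements and so improves recommendation accuracy for cold-start accounts.
Referring to Figure 5, a schematic diagram of an exemplary information recommendation flow: as shown in Figure 5, the flow consists of data collection stage 5-1, information aggregation stage 5-2, and advertisement recommendation stage 5-3.
In data collection stage 5-1, historical recommendation data for advertisements in the game domain is collected, the data on account clicks on advertisements (the conversion records) is extracted from it, and the "account-advertisement" click bipartite graph is built from that click data. In the click bipartite graph, a vertex is the vector representation (the feature representation) of an account (a second object) or of an advertisement (a piece of initial recommended information), an edge between vertices indicates that the account clicked the advertisement, and the edge weight represents the conversion relationship between the account and the advertisement; for example, the edge weight is positively correlated with the number of times the account clicked the advertisement or with the consumption duration.
Referring to Figure 6, a schematic diagram of an exemplary click bipartite graph: in click bipartite graph 6-1, vertices A, B, C, and D are vector representations of accounts and vertices a, b, c, and d are vector representations of advertisements. Taking the connection between vertex B and vertex a as an example, the account of vertex B clicked the advertisement of vertex a, so there is an edge between B and a, with weight $W_{B\text{-}a}$. Vertex A has no edge to any of a, b, c, or d and is an isolated vertex in graph 6-1, so the account of vertex A is a cold-start account (a cold-start object), whereas the accounts of vertices B, C, and D are non-cold-start accounts (non-cold-start objects).
In data collection stage 5-1, historical social data in the game domain (the interaction records) is also collected, and the "account-account" social graph is built from it. In the social graph, a vertex is the vector representation of an account (a first object), an edge between vertices indicates that the two accounts have interacted, for example by exchanging virtual resources, playing as a team, or chatting, and the edge weight represents the intimacy of the two accounts' interaction.
Referring to Figure 7, a schematic diagram of an exemplary social graph: in social graph 7-1, vertices A, B, C, and D are vector representations of accounts. Taking the connections from vertex A to vertices B, C, and D as an example, the account of vertex A has interacted with the accounts of B, C, and D, so there are edges from A to each of B, C, and D, with weights $W_{A\text{-}B}$, $W_{A\text{-}C}$, and $W_{A\text{-}D}$ respectively. The weight between vertex C and vertex D is $W_{C\text{-}D}$.
In information aggregation stage 5-2, the click bipartite graph and the social graph are first fused to obtain the heterogeneous graph (the heterogeneous graph to be updated). In the heterogeneous graph $G = \langle V, E, W \rangle$, the vertex set $V$ is the set of vector representations of accounts and advertisements, and the weight set $W$ is the set of scalar weights between accounts and between accounts and advertisements. Because the heterogeneous graph is obtained by fusing two subgraphs, the click bipartite graph $G_B$ and the social graph $G_S$, it can also be written as $G = \langle G_B, G_S \rangle$. Here $U$ is the full set of account vector representations and $u \in U$ is an individual account vector; $I$ is the full set of advertisement vector representations in the click bipartite graph $G_B$ and $i \in I$ is an individual advertisement vector. A vertex of the click bipartite graph is therefore written $u_m$ or $i_n$, and the weight of the edge between them (the conversion weight) is written $w_{m,n}$. A vertex of the social graph is written $u_m$, and the weight of the edge between two accounts (the interaction weight) is written $w_{m_1,m_2}$. Here $m$ is the account index and $n$ is the advertisement index.
Referring to Figure 8, a schematic diagram of an exemplary heterogeneous graph: heterogeneous graph 8-1 is obtained by fusing click bipartite graph 6-1 of Figure 6 with social graph 7-1 of Figure 7. In heterogeneous graph 8-1, vertices B, C, and D link vertex A to vertices a, b, c, and d, so the heterogeneous graph expresses the association between the cold-start account and the advertisements.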
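Next, heterogeneous information is aggregated for the account vertices of the heterogeneous graph, covering both cold-start accounts and non-cold-start accounts. For a cold-start account, the aggregation uses the second-order information of the vertex. For a non-cold-start account, it uses both the first-order information and the second-order information of the vertex. The first-order information is computed by formula (1).
In formula (1), the left-hand side is the first-order information of the vertex corresponding to account $u$; $w_{u,i}$ is the weight between account $u$ and advertisement $i$; and the remaining term is the vector representation of advertisement $i$.
The second-order information is computed by formula (2).
In formula (2), the left-hand side is the second-order information of the vertex corresponding to account $u$; one term is the vector representation of account $u'$; $w_{u,u'}$ is the weight between account $u$ and account $u'$; and $w_{u',i}$ is the weight between account $u'$ and advertisement $i$.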
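Formulas (1) and (2) are not reproduced in the text. The following is one reading that is consistent with the symbol definitions, offered as a sketch rather than the original notation. The names $h_u^{(1)}$, $h_u^{(2)}$, $e_i$, $e_{u'}$, $N_I(\cdot)$ (advertisements clicked by an account), and $N_U(\cdot)$ (accounts an account has interacted with) are assumptions, as is the choice of plain addition to combine the account terms and the advertisement terms in (2).

```latex
\begin{aligned}
h_u^{(1)} &= \sum_{i \in N_I(u)} w_{u,i}\, e_i \\
h_u^{(2)} &= \sum_{u' \in N_U(u)} \Bigl( w_{u,u'}\, e_{u'} + \sum_{i \in N_I(u')} w_{u',i}\, e_i \Bigr)
\end{aligned}
```

For a cold-start account $N_I(u)$ is empty, so $h_u^{(1)}$ is the zero vector and only $h_u^{(2)}$ carries information.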
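Referring to Figure 9, a schematic diagram of an exemplary heterogeneous information aggregation: for vertex A, the cold-start account in heterogeneous graph 8-1 of Figure 8, the second-order information can be written as [vertex A-vertex B-vertex a; vertex A-vertex C-vertex b; vertex A-vertex C-vertex c; vertex A-vertex D-vertex d], shown as solid lines in Figure 9.
Referring to Figure 10, a schematic diagram of another exemplary heterogeneous information aggregation: for vertex C, a non-cold-start account in heterogeneous graph 8-1 of Figure 8, the first-order information can be written as [vertex C-vertex b; vertex C-vertex c], shown as edges 10-1 and 10-2 in Figure 10, and the second-order information can be written as [vertex C-vertex A; vertex C-vertex D-vertex d], shown as edges 10-3, 10-4, and 10-5 in Figure 10.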
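The path listings for vertices A and C can be reproduced with a small sketch. The edge lists below are read off the descriptions of Figures 6 and 7; the adjacency-dictionary layout and the function names are hypothetical, and edge weights are omitted for brevity.

```python
SOCIAL_EDGES = [("A", "B"), ("A", "C"), ("A", "D"), ("C", "D")]
CLICK_EDGES = [("B", "a"), ("C", "b"), ("C", "c"), ("D", "d")]


def build_heterogeneous_graph(social_edges, click_edges):
    """Fuse the social graph and the click bipartite graph on their common accounts."""
    friends, clicks = {}, {}
    for u, v in social_edges:
        friends.setdefault(u, set()).add(v)
        friends.setdefault(v, set()).add(u)
        clicks.setdefault(u, set())
        clicks.setdefault(v, set())
    for account, ad in click_edges:
        clicks.setdefault(account, set()).add(ad)
        friends.setdefault(account, set())
    return friends, clicks


def is_cold_start(account, clicks):
    """A cold-start account has no click edge, i.e. it is isolated in the click graph."""
    return not clicks[account]


def first_order_paths(account, clicks):
    return [(account, ad) for ad in sorted(clicks[account])]


def second_order_paths(account, friends, clicks):
    """Account -> interacting account -> advertisement clicked by that account."""
    paths = []
    for friend in sorted(friends[account]):
        ads = sorted(clicks[friend])
        if ads:
            paths.extend((account, friend, ad) for ad in ads)
        else:
            paths.append((account, friend))  # the friend has clicked nothing
    return paths
```

Running `second_order_paths("A", ...)` on the fused graph lists the four paths of Figure 9, and the same call for "C" lists the two entries of Figure 10.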
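Because the information involved in heterogeneous information aggregation comes from two different subgraph structures, a multi-band Gaussian kernel function is used for the aggregation. The aggregation result $V_u$ is computed by formula (3).
In formula (3), $k$ is the index of a Gaussian bandwidth and $K$ is the set of Gaussian bandwidths (the multiple specified mapping parameters). The kernel term is a Gaussian kernel function, computed by formula (4), which is stated for one of its two arguments as an example. $1-\alpha$ is the second combination weight, and $\alpha$, the first combination weight, is computed by formula (5).
In formula (4), $\sigma_k$ is the $k$-th Gaussian bandwidth parameter, and the other symbol is the center of the Gaussian kernel function.
When heterogeneous information is aggregated for a cold-start account, no first-order information is involved, so $\alpha$ in formula (3) is 0.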
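Formulas (3) to (5) are not reproduced in the text. A form consistent with the description is sketched below; it should not be read as the original notation. The names $\kappa_k$, $c$, $h_u^{(1)}$ (first-order information), and $h_u^{(2)}$ (second-order information) are assumptions, as is the ratio form chosen for $\alpha$, which is simply one weight that grows with the first-order mapping and shrinks as the combined mapping grows.

```latex
\begin{aligned}
V_u &= \alpha \sum_{k \in K} \kappa_k\bigl(h_u^{(1)}\bigr)
     + (1-\alpha) \sum_{k \in K} \kappa_k\bigl(h_u^{(2)}\bigr) \\
\kappa_k(h) &= \exp\!\Bigl(-\frac{\lVert h - c \rVert^{2}}{2\sigma_k^{2}}\Bigr) \\
\alpha &= \frac{\sum_{k \in K} \kappa_k\bigl(h_u^{(1)}\bigr)}
              {\sum_{k \in K} \kappa_k\bigl(h_u^{(1)}\bigr) + \sum_{k \in K} \kappa_k\bigl(h_u^{(2)}\bigr)}
\end{aligned}
```

For a cold-start account the ratio is not evaluated; $\alpha$ is set to 0 directly, so $V_u$ reduces to the second-order term.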
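In this embodiment, each round of heterogeneous information aggregation is followed by a weight update. The heterogeneous graph derives from two subgraphs of different types (the click bipartite graph and the social graph), and subgraphs of different types carry heterogeneous information. An adaptive weighted attention mechanism is therefore used for the weight update: an attention mechanism first updates the weights within each subgraph separately, and an adaptive weighting mechanism then fuses the weights across the subgraphs. The attention update of the weights of edges that belong to the click bipartite graph is given by formula (6).
In formula (6), LeakyReLU is the activation function; Neighbor(I) is the set of vector representations of all advertisements clicked by account $u$, that is, the at least one adjacent information vertex of the current object vertex, with $i' \in$ Neighbor(I); and the remaining term is the vector representation of advertisement $i'$.
The attention update of the weights of edges that belong to the social graph is given by formula (7).
In formula (7), Neighbor(U) is the set of vector representations of all accounts that account $u$ has interacted with, that is, the at least one adjacent object vertex of the current object vertex, with $u' \in$ Neighbor(U); and the remaining term is the vector representation of account $u'$. The weight produced by formula (6) is the attention conversion weight, and the weight produced by formula (7) is the attention interaction weight.
The fusion of weights across subgraphs by the adaptive weighting mechanism is given by formulas (8) and (9).
In formulas (8) and (9), the attention weights being combined are the at least one edge weight to be updated; the two sums of weights are the first weight superposition sum and the second weight superposition sum; $\beta$ is the second enhancement parameter; and $\gamma$ is the first enhancement parameter.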
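As an example, referring to Figure 11, a schematic diagram of an exemplary weight update: when weight 11-1 ($W_{A\text{-}C}$) is updated by the attention mechanism, the update is based on vertex A and vertex D; when weight 11-1 is updated by the adaptive weighting mechanism, the update is based on the attention-updated weight 11-1, weight 11-2, and weight 11-3.
Iterating formulas (1) to (9) updates the heterogeneous graph and yields the final heterogeneous graph (the specified heterogeneous graph), which contains the final vector representations of the accounts and of the advertisements. In advertisement recommendation stage 5-3, the advertisement click probability $Y_{u,i}$ can then be determined from the vertex of account $u$ and the vertex of advertisement $i$ in the final heterogeneous graph, as in formula (10).
$Y_{u,i} = \mathrm{Sigmoid}(V_u * V_i)$            (10)
Here $V_u$ is the vertex of account $u$ in the final heterogeneous graph, $V_i$ is the vertex of advertisement $i$ in the final heterogeneous graph, and Sigmoid() is the activation function. $V_u * V_i$ is the fusion result of the feature of the object to be recommended and the feature of the information to be recommended.
The final heterogeneous graph is used to estimate the click probability of each advertisement (for example, $N$ advertisements, where $N$ is a positive integer). The advertisements are ranked by click probability, and the advertisement with the highest click probability is selected and recommended to the account, which completes the information recommendation.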
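The metrics obtained by the data processing method provided in the embodiments of this application and by a baseline model, when estimating advertisement click probability on the training, validation, and test data sets, are given in Table 1.
Table 1

Table 1 shows that the data processing method provided in the embodiments of this application outperforms the baseline model.
The following compares the data processing method provided in the embodiments of this application with several baseline models in deployment.
Referring to Figure 12, a schematic diagram of an exemplary model performance comparison: the horizontal axis is the application date (0507 to 0511) and the vertical axis is the performance metric (0.06 to 0.13). Curve 12-1 is the performance of baseline model 1, curve 12-2 that of baseline model 2, curve 12-3 that of baseline model 3, and curve 12-4 that of the data processing method provided in the embodiments of this application. Curves 12-1 to 12-4 show that the method outperforms baseline models 1 to 3 on the performance metric.
By fusing the social graph and the click bipartite graph into a heterogeneous graph and nonlinearly aggregating the account vertices of that graph with the edge weights, accounts and advertisements exchange information, so that even a cold-start account acquires an effective association with advertisements. In addition, the edge weights are updated by an adaptive attention mechanism while the account vertices are being updated, which improves the effect of the nonlinear aggregation. In summary, the heterogeneous graph aggregation method provided in the embodiments of this application improves the accuracy of information recommendation for cold-start accounts and reduces the resource consumption of information recommendation.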
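The following describes an exemplary structure of data processing apparatus 455 implemented as software modules. In some embodiments, as shown in Figure 2, the software modules of data processing apparatus 455 stored in memory 450 may include:
feature obtaining module 4551, configured to obtain a feature of an object to be recommended, where the feature of the object to be recommended is obtained from a nonlinear mapping result of target second-order information, the target second-order information is obtained by aggregating the object features respectively corresponding to at least one interaction object and the first information features respectively corresponding to at least one piece of recommended information, the interaction object is an object that interacts with the object to be recommended, and the at least one piece of recommended information is the full set of information converted by each interaction object;
feature obtaining module 4551 being further configured to obtain a feature of information to be recommended, where the information to be recommended is any piece of the recommended information converted by the interaction object; and
information recommendation module 4552, configured to recommend information to the object to be recommended based on a fusion result of the feature of the object to be recommended and the feature of the information to be recommended.
In this embodiment, feature obtaining module 4551 is further configured to: obtain the target second-order information of the object to be recommended; obtain the spatial distance between the target second-order information and specified center information, where the specified center information is determined from multiple pieces of second-order information that include the target second-order information; apply a nonlinear mapping to the spatial distance based on multiple specified mapping parameters to obtain multiple second-order features to be fused, where a specified mapping parameter represents the extent of a mapping space; and obtain a first nonlinear mapping result corresponding to the multiple second-order features to be fused and obtain the feature of the object to be recommended based on it, where the nonlinear mapping result corresponding to the target second-order information includes the first nonlinear mapping result.
In this embodiment, feature obtaining module 4551 is further configured to: obtain the interaction weight between the object to be recommended and the interaction object, where the interaction weight represents the intimacy between them; obtain the conversion weight between the interaction object and the recommended information, where the conversion weight represents the degree of conversion between them; obtain a first fusion result of the interaction weight with the object feature and a second fusion result of the conversion weight with the first information feature, yielding at least one first fusion result corresponding to the at least one interaction object and at least one second fusion result corresponding to the at least one piece of recommended information; and combine the at least one first fusion result and the at least one second fusion result of each interaction object into the target second-order information of the object to be recommended.
In this embodiment, data processing apparatus 455 further includes object judgment module 4553, configured to obtain a conversion identifier of the object to be recommended with respect to the library of information to be recommended, where the library includes the at least one piece of recommended information converted by each interaction object.
In this embodiment, feature obtaining module 4551 is further configured to: when the conversion identifier indicates that the library of information to be recommended includes at least one piece of converted information, aggregate the at least one second information feature corresponding to the converted information to obtain target first-order information of the object to be recommended, where converted information is recommended information that the object to be recommended has converted and a second information feature is the feature of a piece of converted information; and obtain a second nonlinear mapping result corresponding to the target first-order information and combine it with the first nonlinear mapping result into the feature of the object to be recommended.
In this embodiment, feature obtaining module 4551 is further configured to: combine the second nonlinear mapping result and the first nonlinear mapping result to obtain initial aggregation information; obtain a first combination weight that is negatively correlated with the initial aggregation information and positively correlated with the second nonlinear mapping result, and obtain the second combination weight corresponding to the first combination weight; fuse the first combination weight with the second nonlinear mapping result to obtain a third fusion result, and fuse the second combination weight with the first nonlinear mapping result to obtain a fourth fusion result; and combine the third and fourth fusion results to obtain the feature of the object to be recommended.
In this embodiment, feature obtaining module 4551 is further configured to: when the conversion identifier indicates that the conversion object library is independent of the object to be recommended, take the first nonlinear mapping result as the feature of the object to be recommended, where the conversion object library is the set of objects that have converted the recommended information in the library of information to be recommended.
In this embodiment, the feature of the object to be recommended and the feature of the information to be recommended are obtained from a specified heterogeneous graph, and data processing apparatus 455 further includes model training module 4554, configured to: build an object interaction graph from the interaction records among at least two first objects, where the at least two first objects include the object to be recommended and the at least one interaction object; build an object-information conversion graph from the conversion records of at least one second object for at least one piece of initial recommended information, where the at least one piece of initial recommended information includes the at least one piece of recommended information converted by each interaction object; fuse the object interaction graph and the object-information conversion graph, based on the objects common to the at least two first objects and the at least one second object, to obtain a heterogeneous graph to be updated; iteratively update the object vertices of the heterogeneous graph to be updated based on the nonlinear mapping result corresponding to the second-order information of each object vertex; and take the iteratively updated heterogeneous graph as the specified heterogeneous graph.
In this embodiment, model training module 4554 is further configured to: for each object vertex of the heterogeneous graph to be updated, update the object vertex based on the nonlinear mapping result corresponding to its second-order information; take the heterogeneous graph whose update is complete as the current heterogeneous graph; apply an attention update to the edge weights of the current heterogeneous graph to obtain edge weights to be updated; apply adaptive enhancement to the edge weights to be updated to obtain target edge weights; aggregate the second-order information of each current object vertex of the current heterogeneous graph based on the target edge weights; and iteratively update the object vertices of the current heterogeneous graph based on the nonlinear mapping result corresponding to the second-order information of each current object vertex.
In this embodiment, model training module 4554 is further configured to perform the following for each current object vertex of the current heterogeneous graph: obtain the at least one adjacent object vertex of the current object vertex; determine, based on the at least one adjacent object vertex, the attention interaction weight between the current object vertex and each adjacent object vertex; obtain the at least one adjacent information vertex of the current object vertex; and determine, based on the at least one adjacent information vertex, the attention conversion weight between the current object vertex and each adjacent information vertex, where an edge weight to be updated is the attention interaction weight or the attention conversion weight.
In this embodiment, model training module 4554 is further configured to: obtain at least one edge weight to be updated that is adjacent to the target edge weight to be updated and of a different type from it, where the target edge weight to be updated is any edge weight to be updated that is to undergo adaptive enhancement; and enhance the target edge weight to be updated based on the at least one edge weight to be updated, to obtain the target edge weight.
In this embodiment, information recommendation module 4552 is further configured to: determine, based on the fusion result of the feature of the object to be recommended and the feature of the information to be recommended, the conversion probability with which the object converts the information; when the library of information to be recommended contains at least two pieces of information to be recommended, sort them in descending order of the object's conversion probability for each to obtain a sequence of information to be recommended; and take a specified number of pieces of information, selected in order from the sequence, as the target information to be recommended.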
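An embodiment of this application provides a computer program product including a computer program or computer-executable instructions stored in a computer-readable storage medium. A processor of a computer device (the data processing device) reads the computer program or computer-executable instructions from the computer-readable storage medium and executes them, causing the computer device to perform the data processing method described above.
An embodiment of this application provides a computer-readable storage medium storing a computer program or computer-executable instructions that, when executed by a processor, cause the processor to perform the data processing method provided in the embodiments of this application, for example the method shown in Figure 3a.
In some embodiments, the computer-readable storage medium may be a memory such as FRAM, ROM, PROM, EPROM, EEPROM, flash memory, magnetic surface memory, an optical disc, or a CD-ROM, or any device that includes one or any combination of these memories.
In some embodiments, the computer program or computer-executable instructions may take the form of a program, software, a software module, a script, or code, written in any form of programming language (including compiled or interpreted languages, and declarative or procedural languages), and may be deployed in any form, including as a stand-alone program or as a module, component, subroutine, or other unit suitable for use in a computing environment.
As an example, the computer program or computer-executable instructions may, but need not, correspond to a file in a file system. They may be stored as part of a file that holds other programs or data, for example in one or more scripts in a Hypertext Markup Language (HTML) document, in a single file dedicated to the program in question, or in multiple coordinated files (for example, files that store one or more modules, subprograms, or portions of code).
As an example, the executable instructions may be deployed to run on one computer device (in which case that device is the data processing device), on multiple computer devices at one site (in which case those devices together are the data processing device), or on multiple computer devices distributed across multiple sites and interconnected by a communication network (in which case those devices together are the data processing device).
The embodiments of this application involve data such as interaction records and conversion records. When the embodiments are applied to a specific product or technology, user permission or consent must be obtained, and the collection, use, and processing of the data must comply with the relevant laws, regulations, and standards of the relevant countries and regions.
In summary, the embodiments of this application obtain the target second-order information of the object to be recommended and determine the feature of the object to be recommended from the nonlinear mapping result of that information. Because the target second-order information covers both interactions between objects and interactions between objects and recommended information, it is heterogeneous information. Obtaining the feature by nonlinearly mapping the target second-order information therefore establishes an accurate association between the object to be recommended and the recommended information, so that whether to recommend any piece of recommended information to the object can be determined accurately and the accuracy of the conversion probability improves. Consequently, even for a cold-start object, the accuracy of information recommendation improves and its resource consumption falls.
The above are merely embodiments of this application and are not intended to limit its scope of protection. Any modification, equivalent replacement, or improvement made within the spirit and scope of this application falls within its scope of protection.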

Claims (15)

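  1. A data processing method performed by a data processing device, the method comprising:
    obtaining a feature of an object to be recommended, wherein the feature of the object to be recommended is obtained from a nonlinear mapping result of target second-order information, the target second-order information is obtained by aggregating the object features respectively corresponding to at least one interaction object and the first information features respectively corresponding to at least one piece of recommended information, the interaction object is an object that interacts with the object to be recommended, and the at least one piece of recommended information is the full set of information converted by each interaction object;
    obtaining a feature of information to be recommended, wherein the information to be recommended is any piece of the recommended information converted by the interaction object; and
    recommending information to the object to be recommended based on a fusion result of the feature of the object to be recommended and the feature of the information to be recommended.
  2. The method according to claim 1, wherein obtaining the feature of the object to be recommended comprises:
    obtaining the target second-order information of the object to be recommended;
    obtaining a spatial distance between the target second-order information and specified center information, wherein the specified center information is determined from multiple pieces of second-order information that include the target second-order information;
    applying a nonlinear mapping to the spatial distance based on multiple specified mapping parameters to obtain multiple second-order features to be fused, wherein a specified mapping parameter represents the extent of a mapping space; and
    obtaining a first nonlinear mapping result corresponding to the multiple second-order features to be fused, and obtaining the feature of the object to be recommended based on the first nonlinear mapping result, wherein the nonlinear mapping result corresponding to the target second-order information includes the first nonlinear mapping result.
  3. The method according to claim 2, wherein obtaining the target second-order information of the object to be recommended comprises:
    obtaining an interaction weight between the object to be recommended and the interaction object, wherein the interaction weight represents the intimacy between the object to be recommended and the interaction object;
    obtaining a conversion weight between the interaction object and the recommended information, wherein the conversion weight represents the degree of conversion between the interaction object and the recommended information;
    obtaining a first fusion result of the interaction weight with the object feature and a second fusion result of the conversion weight with the first information feature, to obtain at least one first fusion result corresponding to the at least one interaction object and at least one second fusion result corresponding to the at least one piece of recommended information; and
    combining the at least one first fusion result and the at least one second fusion result corresponding to each interaction object into the target second-order information of the object to be recommended.
  4. The method according to claim 2 or 3, wherein before obtaining the feature of the object to be recommended, the method further comprises:
    obtaining a conversion identifier of the object to be recommended with respect to a library of information to be recommended, wherein the library of information to be recommended includes the at least one piece of recommended information converted by each interaction object;
    and wherein obtaining the feature of the object to be recommended based on the first nonlinear mapping result comprises:
    when the conversion identifier indicates that the library of information to be recommended includes at least one piece of converted information, aggregating at least one second information feature corresponding to the at least one piece of converted information to obtain target first-order information of the object to be recommended, wherein the converted information is recommended information that the object to be recommended has converted, and the second information feature is the feature of the converted information; and
    obtaining a second nonlinear mapping result corresponding to the target first-order information, and combining the second nonlinear mapping result with the first nonlinear mapping result into the feature of the object to be recommended.
  5. The method according to claim 4, wherein combining the second nonlinear mapping result with the first nonlinear mapping result into the feature of the object to be recommended comprises:
    combining the second nonlinear mapping result and the first nonlinear mapping result to obtain initial aggregation information;
    obtaining a first combination weight that is negatively correlated with the initial aggregation information and positively correlated with the second nonlinear mapping result, and obtaining a second combination weight corresponding to the first combination weight;
    fusing the first combination weight with the second nonlinear mapping result to obtain a third fusion result, and fusing the second combination weight with the first nonlinear mapping result to obtain a fourth fusion result; and
    combining the third fusion result and the fourth fusion result to obtain the feature of the object to be recommended.
  6. The method according to claim 4, wherein obtaining the feature of the object to be recommended based on the first nonlinear mapping result comprises:
    when the conversion identifier indicates that a conversion object library is independent of the object to be recommended, taking the first nonlinear mapping result as the feature of the object to be recommended, wherein the conversion object library is the set of objects that have converted the recommended information in the library of information to be recommended.
  7. The method according to claim 1, wherein the feature of the object to be recommended and the feature of the information to be recommended are obtained from a specified heterogeneous graph, and the specified heterogeneous graph is obtained by:
    building an object interaction graph from interaction records among at least two first objects, wherein the at least two first objects include the object to be recommended and the at least one interaction object;
    building an object-information conversion graph from conversion records of at least one second object for at least one piece of initial recommended information, wherein the at least one piece of initial recommended information includes the at least one piece of recommended information converted by each interaction object;
    fusing the object interaction graph and the object-information conversion graph, based on objects common to the at least two first objects and the at least one second object, to obtain a heterogeneous graph to be updated;
    iteratively updating the object vertices of the heterogeneous graph to be updated based on the nonlinear mapping result corresponding to the second-order information of each object vertex; and
    taking the iteratively updated heterogeneous graph to be updated as the specified heterogeneous graph.
  8. The method according to claim 7, wherein iteratively updating the object vertices of the heterogeneous graph to be updated based on the nonlinear mapping result corresponding to the second-order information of each object vertex comprises:
    for each object vertex of the heterogeneous graph to be updated, updating the object vertex based on the nonlinear mapping result corresponding to the second-order information of the object vertex;
    taking the heterogeneous graph to be updated whose update is complete as a current heterogeneous graph;
    applying an attention update to the edge weights of the current heterogeneous graph to obtain edge weights to be updated;
    applying adaptive enhancement to the edge weights to be updated to obtain target edge weights;
    aggregating the second-order information of each current object vertex of the current heterogeneous graph based on the target edge weights; and
    iteratively updating the object vertices of the current heterogeneous graph based on the nonlinear mapping result corresponding to the second-order information of the current object vertex.
  9. The method according to claim 8, wherein applying the attention update to the edge weights of the current heterogeneous graph to obtain the edge weights to be updated comprises:
    performing the following for each current object vertex of the current heterogeneous graph:
    obtaining at least one adjacent object vertex of the current object vertex;
    determining, based on the at least one adjacent object vertex, an attention interaction weight between the current object vertex and each adjacent object vertex;
    obtaining at least one adjacent information vertex of the current object vertex; and
    determining, based on the at least one adjacent information vertex, an attention conversion weight between the current object vertex and each adjacent information vertex, wherein the edge weight to be updated is the attention interaction weight or the attention conversion weight.
  10. The method according to claim 8 or 9, wherein applying adaptive enhancement to the edge weights to be updated to obtain the target edge weights comprises:
    obtaining at least one edge weight to be updated, wherein the at least one edge weight to be updated is adjacent to a target edge weight to be updated and of a different type from the target edge weight to be updated, and the target edge weight to be updated is any edge weight to be updated that is to undergo adaptive enhancement; and
    enhancing the target edge weight to be updated based on the at least one edge weight to be updated, to obtain the target edge weight.
  11. The method according to any one of claims 1 to 3 and 7 to 9, wherein recommending information to the object to be recommended based on the fusion result of the feature of the object to be recommended and the feature of the information to be recommended comprises:
    determining, based on the fusion result of the feature of the object to be recommended and the feature of the information to be recommended, a conversion probability with which the object to be recommended converts the information to be recommended;
    when a library of information to be recommended includes at least two pieces of information to be recommended, sorting the at least two pieces of information to be recommended in descending order of the at least two conversion probabilities of the object to be recommended for them, to obtain a sequence of information to be recommended;
    taking a specified number of pieces of information to be recommended, selected in order from the sequence of information to be recommended, as target information to be recommended; and
    recommending the target information to be recommended to the object to be recommended.
  12. A data processing apparatus, comprising:
    a feature obtaining module, configured to obtain a feature of an object to be recommended, wherein the feature of the object to be recommended is obtained from a nonlinear mapping result of target second-order information, the target second-order information is obtained by aggregating the object features respectively corresponding to at least one interaction object and the first information features respectively corresponding to at least one piece of recommended information, the interaction object is an object that interacts with the object to be recommended, and the at least one piece of recommended information is the full set of information converted by each interaction object;
    the feature obtaining module being further configured to obtain a feature of information to be recommended, wherein the information to be recommended is any piece of the recommended information converted by the interaction object; and
    an information recommendation module, configured to recommend information to the object to be recommended based on a fusion result of the feature of the object to be recommended and the feature of the information to be recommended.
  13. A data processing device, comprising:
    a memory, configured to store a computer program or computer-executable instructions; and
    a processor, configured to implement the data processing method according to any one of claims 1 to 11 when executing the computer program or computer-executable instructions stored in the memory.
  14. A computer-readable storage medium storing a computer program or computer-executable instructions that, when executed by a processor, implement the data processing method according to any one of claims 1 to 11.
  15. A computer program product comprising a computer program or computer-executable instructions that, when executed by a processor, implement the data processing method according to any one of claims 1 to 11.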
PCT/CN2023/088857 2022-06-13 2023-04-18 一种数据处理方法、装置、设备、计算机可读存储介质及计算机程序产品 WO2023241207A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US18/587,671 US20240211991A1 (en) 2022-06-13 2024-02-26 Data processing method, apparatus, and computer-readable storage medium

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210662836.2 2022-06-13
CN202210662836.2A CN114756762B (zh) 2022-06-13 2022-06-13 数据处理方法、装置、设备及存储介质

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US18/587,671 Continuation US20240211991A1 (en) 2022-06-13 2024-02-26 Data processing method, apparatus, and computer-readable storage medium

Publications (1)

Publication Number Publication Date
WO2023241207A1 true WO2023241207A1 (zh) 2023-12-21

Family

ID=82336354

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/088857 WO2023241207A1 (zh) 2022-06-13 2023-04-18 一种数据处理方法、装置、设备、计算机可读存储介质及计算机程序产品

Country Status (3)

Country Link
US (1) US20240211991A1 (zh)
CN (1) CN114756762B (zh)
WO (1) WO2023241207A1 (zh)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114756762B (zh) * 2022-06-13 2022-09-02 腾讯科技(深圳)有限公司 数据处理方法、装置、设备及存储介质
CN116450938A (zh) * 2023-04-07 2023-07-18 北京欧拉认知智能科技有限公司 一种基于图谱的工单推荐实现方法及***
CN117786094A (zh) * 2023-12-29 2024-03-29 北京基智科技有限公司 一种基于知识图谱的企业技术服务推荐方法及装置

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111046286A (zh) * 2019-12-12 2020-04-21 腾讯科技(深圳)有限公司 一种对象推荐方法、装置、以及计算机存储介质
CN111291266A (zh) * 2020-02-13 2020-06-16 腾讯科技(北京)有限公司 基于人工智能的推荐方法、装置、电子设备及存储介质
CN111414548A (zh) * 2020-05-09 2020-07-14 中国工商银行股份有限公司 对象推荐方法、装置、电子设备和介质
WO2021189976A1 (zh) * 2020-03-25 2021-09-30 平安科技(深圳)有限公司 一种产品信息推送方法、装置、设备及存储介质
CN114756762A (zh) * 2022-06-13 2022-07-15 腾讯科技(深圳)有限公司 数据处理方法、装置、设备、存储介质及程序产品

Family Cites Families (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2700026A4 (en) * 2011-04-19 2015-03-18 Nokia Corp METHOD AND APPARATUS FOR PRODUCING COLLABORATIVE FILTERING BASED ON ATTRIBUTES
CN103995866A (zh) * 2014-05-19 2014-08-20 北京邮电大学 一种基于链路预测的商品信息推送方法及装置
CN105320719B (zh) * 2015-01-16 2019-02-05 焦点科技股份有限公司 一种基于项目标签和图形关系的众筹网站项目推荐方法
CN106202519A (zh) * 2016-07-22 2016-12-07 桂林电子科技大学 一种结合用户评论内容和评分的项目推荐方法
CN106682114B (zh) * 2016-12-07 2020-10-27 广东工业大学 一种融合用户信任关系和评论信息的个性化推荐方法
US20180268318A1 (en) * 2017-03-17 2018-09-20 Adobe Systems Incorporated Training classification algorithms to predict end-user behavior based on historical conversation data
CN110837598B (zh) * 2019-11-11 2021-03-19 腾讯科技(深圳)有限公司 信息推荐方法、装置、设备及存储介质
US20210174164A1 (en) * 2019-12-09 2021-06-10 Miso Technologies Inc. System and method for a personalized search and discovery engine
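CN113536106A (zh) * 2020-11-23 2021-10-22 腾讯科技(深圳)有限公司 Method for determining information content to be recommended
CN112948668B (zh) * 2021-02-04 2023-03-03 深圳大学 Information recommendation method, electronic device, and storage medium
CN112883268B (zh) * 2021-02-22 2022-02-01 中国计量大学 Session recommendation method considering users' multiple interests and social influence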
CN113010802B (zh) * 2021-03-25 2022-09-20 华南理工大学 Implicit-feedback-oriented recommendation method based on multi-attribute interaction between users and items
CN114238750A (zh) * 2021-11-18 2022-03-25 浙江工业大学 Interactive visual recommendation method based on a heterogeneous network information embedding model

Also Published As

Publication number Publication date
US20240211991A1 (en) 2024-06-27
CN114756762B (zh) 2022-09-02
CN114756762A (zh) 2022-07-15

Similar Documents

Publication Publication Date Title
WO2023241207A1 (zh) Data processing method, apparatus, device, computer-readable storage medium, and computer program product
US11531867B2 (en) User behavior prediction method and apparatus, and behavior prediction model training method and apparatus
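CN113626719B (zh) Information recommendation method, apparatus, device, storage medium, and computer program product
CN105830049B (zh) Automated experimentation platform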
US9911143B2 (en) Methods and systems that categorize and summarize instrumentation-generated events
US20170090893A1 (en) Interoperability of Transforms Under a Unified Platform and Extensible Transformation Library of Those Interoperable Transforms
US20170300966A1 (en) Methods and systems that predict future actions from instrumentation-generated events
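TWI797099B (zh) Thing machine system and method
WO2023065859A1 (zh) Item recommendation method, apparatus, and storage medium
CN112559896B (zh) Information recommendation method, apparatus, device, and computer-readable storage medium
CN112085615A (zh) Training method and apparatus for a graph neural network
WO2023213157A1 (zh) Data processing method, apparatus, program product, computer device, and medium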
US11410203B1 (en) Optimized management of online advertising auctions
WO2023284516A1 (zh) Knowledge-graph-based information recommendation method, apparatus, device, medium, and product
Ryzko Modern big data architectures: a multi-agent systems perspective
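CN114417174B (zh) Content recommendation method, apparatus, device, and computer storage medium
CN112989182A (zh) Information processing method, apparatus, information processing device, and storage medium
CN114756768B (zh) Data processing method, apparatus, device, readable storage medium, and program product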
Davami et al. Improving the performance of mobile phone crowdsourcing applications
Abousalh‐Neto et al. Better together: Extending JMP® with open‐source software
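CN114662001A (zh) Resource interaction prediction model training method and apparatus, and resource recommendation method and apparatus
CN113656589B (zh) Object attribute determination method, apparatus, computer device, and storage medium
CN110414690A (zh) Method and apparatus for performing prediction using a machine learning model
CN116150429A (zh) Abnormal object identification method, apparatus, computing device, and storage medium
CN117251820A (zh) Data processing method, apparatus, computer device, and storage medium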

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application. Ref document number: 23822778; Country of ref document: EP; Kind code of ref document: A1