CN115495663A - Information recommendation method and device, electronic equipment and storage medium - Google Patents

Information recommendation method and device, electronic equipment and storage medium

Info

Publication number
CN115495663A
Authority
CN
China
Prior art keywords
output data
recommendation
preferred
recommendation model
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211275555.8A
Other languages
Chinese (zh)
Inventor
徐宝江
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Kangjian Information Technology Shenzhen Co Ltd
Original Assignee
Kangjian Information Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Kangjian Information Technology Shenzhen Co Ltd
Priority to CN202211275555.8A
Publication of CN115495663A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/004 Artificial life, i.e. computing arrangements simulating life
    • G06N3/006 Artificial life, i.e. computing arrangements simulating life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/06 Buying, selling or leasing transactions
    • G06Q30/0601 Electronic shopping [e-shopping]
    • G06Q30/0631 Item recommendations

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • Databases & Information Systems (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Finance (AREA)
  • Accounting & Taxation (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Computational Linguistics (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Biophysics (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Development Economics (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Strategic Management (AREA)
  • General Business, Economics & Management (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to the technical field of data processing and provides an information recommendation method, an information recommendation device, electronic equipment and a storage medium. The method comprises the following steps: acquiring user behavior characteristic data; extracting the user behavior characteristic data as a training set; iteratively training a recommendation model on the training set under different configuration parameters to obtain an output data set of the recommendation model under each configuration parameter; during the iterative training on the training set, performing optimization processing on each output data in the output data set to obtain the optimal output data of the recommendation model; and taking the configuration parameters corresponding to the optimal output data as the optimal configuration parameters of the recommendation model, and running the recommendation model under the optimal configuration parameters to generate recommendation information for the user. The resulting recommendation model can generate recommended content according to the personality and preferences of the user, predict the user's needs, recommend the content the user is most likely to like, and improve the recommendation success rate.

Description

Information recommendation method and device, electronic equipment and storage medium
Technical Field
The invention belongs to the technical field of data processing, and particularly relates to an information recommendation method and device, electronic equipment and a storage medium.
Background
With the continuous development of information technology, the internet has become an integral part of people's daily life. People carry out various activities on the internet every day, such as watching movies, shopping and reading news, but as the amount of information on the internet grows, it becomes more and more difficult for people to find the information that suits them best among the mass of information available. A common recommendation method is to rank items simply by sales volume, topic click volume or news reading volume, and then select the top N items to form a ranking list that is recommended to the user. However, this method has a major disadvantage: it can hardly form personalized recommendations for users, which results in a low recommendation success rate.
Disclosure of Invention
In view of the foregoing disadvantages of the prior art, an object of the present invention is to provide an information recommendation method, apparatus, electronic device and storage medium, for generating a recommendation model capable of predicting the needs of a user, generating recommended content according to the personality and preference of the user, and improving the success rate of information recommendation.
To achieve the above and other related objects, the present invention provides an information recommendation method, comprising: acquiring user behavior characteristic data; extracting the user behavior characteristic data as a training set; enabling a recommendation model to carry out iterative training through the training set under different configuration parameters to obtain an output data set of the recommendation model under each configuration parameter; in the iterative training process of the training set, performing optimization processing on each output data in the output data set to obtain the optimal output data of the recommendation model; and taking the configuration parameters corresponding to the optimal output data as the optimal configuration parameters of the recommendation model, and enabling the recommendation model to operate under the optimal configuration parameters to generate recommendation information for the user.
The invention provides an information recommendation device which comprises a data acquisition module, a training set acquisition module, an output data set acquisition module, an optimal output data acquisition module and an information recommendation module; the data acquisition module is used for acquiring user behavior characteristic data; the training set acquisition module is used for extracting the user behavior characteristic data as a training set; the output data set acquisition module enables a recommendation model to carry out iterative training through the training set under different configuration parameters to obtain an output data set of the recommendation model under each configuration parameter; the optimal output data acquisition module is used for performing optimization processing on each output data in the output data set in the iterative training process of the training set to obtain the optimal output data of the recommendation model; and the information recommendation module takes the configuration parameters corresponding to the optimal output data as the optimal configuration parameters of the recommendation model, and enables the recommendation model to operate under the optimal configuration parameters to generate recommendation information for the user.
The present invention provides an electronic device including a memory, a processor and a computer program which is stored on the memory and can run on the processor, wherein the processor implements the steps of the above information recommendation method when executing the computer program.
The present invention provides a computer-readable storage medium having stored thereon a computer program which, when being executed by a processor, carries out the steps of the above-mentioned information recommendation method.
As described above, the information recommendation method, apparatus, electronic device and storage medium according to the present invention have the following advantages:
Compared with the prior art, the invention optimizes the CatBoost recommendation model according to the user behavior characteristic data to obtain optimized output data, and trains the CatBoost recommendation model according to the optimized output data to generate the recommendation model. The generated recommendation model can generate recommended content according to the personality and preferences of the user, predict the user's needs, and recommend the content the user is most likely to like, which improves the recommendation success rate, relieves the user of the trouble of choosing from massive amounts of information, and improves the user's shopping experience.
Drawings
Fig. 1 is a schematic structural diagram of a terminal according to an embodiment of the invention.
Fig. 2 is a flowchart illustrating an information recommendation method according to an embodiment of the invention.
Fig. 3 is a flowchart illustrating an embodiment of extracting user behavior feature data according to the information recommendation method of the present invention.
Fig. 4 is a flowchart illustrating an embodiment of obtaining optimized output data according to the information recommendation method of the present invention.
Fig. 5 is a schematic structural diagram of an information recommendation device in an embodiment of the invention.
Description of the reference symbols
1. Terminal device
11. Processing unit
12. Memory device
121. Random access memory
122. Cache memory
123. Storage system
124. Program/utility tool
1241. Program module
13. Bus line
14. Input/output interface
15. Network adapter
2. External device
3. Display device
6. Information recommendation device
61. Data acquisition module
62. Training set acquisition module
63. Output data set acquisition module
64. Optimal output data acquisition module
65. Information recommendation module
S1 to S5 Steps
S21 to S22 Steps
S31 to S34 Steps
Detailed Description
The following description of the embodiments of the present invention is provided by way of specific examples, and other advantages and effects of the present invention will be readily apparent to those skilled in the art from the disclosure herein. The invention is capable of other and different embodiments and of being practiced or of being carried out in various ways, and its several details are capable of modification in various respects, all without departing from the spirit and scope of the present invention. It is to be noted that the features in the following embodiments and examples may be combined with each other without conflict.
It should be noted that the drawings provided in the following embodiments are only for illustrating the basic idea of the present invention, and the drawings only show the components related to the present invention rather than being drawn according to the number, shape and size of the components in actual implementation, and the type, quantity and proportion of each component in actual implementation may be changed arbitrarily, and the layout of the components may be more complicated.
Compared with the prior art, the information recommendation method, the information recommendation device, the electronic equipment and the storage medium optimize the CatBoost recommendation model according to the user behavior characteristic data to obtain optimized output data, and train the CatBoost recommendation model according to the optimized output data to generate the recommendation model. The generated recommendation model can generate recommended content according to the personality and preferences of the user, predict the user's needs, and recommend the content the user is most likely to like, which improves the recommendation success rate, relieves the user of the trouble of choosing from massive amounts of information, and improves the user's shopping experience.
The computer-readable storage medium of the present invention stores a computer program that implements the information recommendation method described below when executed by a processor. The storage medium includes any medium that can store program code, such as a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, a USB disk, a memory card, or an optical disk.
Any combination of one or more storage media may be employed. The storage medium may be a computer-readable signal medium or a computer-readable storage medium. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a RAM, a ROM, an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
A computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Computer program code for carrying out operations for aspects of the present invention may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C++ or the like and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
The present invention is described below with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the computer program instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable medium that can direct a computer, other programmable data processing apparatus, or other devices to function in a particular manner, such that the instructions stored in the computer-readable medium produce an article of manufacture including instructions which implement the function/act specified in the flowchart and/or block diagram block or blocks.
The computer program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other devices to cause a series of operational steps to be performed on the computer, other programmable apparatus or other devices to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide processes for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.
The electronic device of the invention comprises a memory, a processor and a computer program stored on the memory and executable on the processor.
Preferably, the memory comprises various media that can store program code, such as a ROM, a RAM, a magnetic disk, a USB disk, a memory card, or an optical disk.
The processor is connected with the memory and is used for executing the computer program stored in the memory so as to enable the electronic equipment to execute the steps of the information recommendation method.
Preferably, the processor may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; the processor may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
In an embodiment, the electronic device includes a terminal and/or a server.
Fig. 1 shows a block diagram of an exemplary terminal 1 suitable for implementing an embodiment of the invention.
The terminal 1 shown in fig. 1 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiment of the present invention.
As shown in fig. 1, the terminal 1 is in the form of a general purpose computing device. The components of the terminal 1 may include, but are not limited to: one or more processors or processing units 11, a memory 12, and a bus 13 that couples various system components including the memory 12 and the processing unit 11.
Bus 13 represents one or more of any of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures. By way of example, such architectures can include, but are not limited to, an Industry Standard Architecture (ISA) bus, a Micro Channel Architecture (MCA) bus, an enhanced ISA bus, a Video Electronics Standards Association (VESA) local bus, and a Peripheral Component Interconnect (PCI) bus.
The terminal 1 typically includes a variety of computer system readable media. These media may be any available media that can be accessed by terminal 1 and includes both volatile and nonvolatile media, removable and non-removable media.
Memory 12 may include computer system readable media in the form of volatile memory, such as Random Access Memory (RAM) 121 and/or cache memory 122. The terminal 1 may further include other removable/non-removable, volatile/nonvolatile computer system storage media. By way of example only, storage system 123 may be used to read from and write to non-removable, nonvolatile magnetic media (not shown in FIG. 1, commonly referred to as a "hard drive"). Although not shown in FIG. 1, a magnetic disk drive for reading from and writing to a removable, nonvolatile magnetic disk (e.g., a "floppy disk") and an optical disk drive for reading from or writing to a removable, nonvolatile optical disk (e.g., a CD-ROM, DVD-ROM, or other optical media) may be provided. In these cases, each drive may be connected to bus 13 by one or more data media interfaces. Memory 12 may include at least one program product having a set (e.g., at least one) of program modules that are configured to carry out the functions of embodiments of the invention.
Program/utility 124 having a set (at least one) of program modules 1241 may be stored in, for example, memory 12, such program modules 1241 including, but not limited to, an operating system, one or more application programs, other program modules, and program data, each of which examples or some combination thereof may comprise an implementation of a network environment. Program modules 1241 generally perform the functions and/or methodologies of embodiments of the invention as described herein.
The terminal 1 may also communicate with one or more external devices 2 (e.g., keyboard, pointing device, display 3, etc.), one or more devices that enable a user to interact with the terminal 1, and/or any device (e.g., network card, modem, etc.) that enables the terminal 1 to communicate with one or more other computing devices. Such communication may be through an input/output (I/O) interface 14. Also, the terminal 1 may communicate with one or more networks (e.g., a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network such as the internet) through the network adapter 15. As shown in fig. 1, the network adapter 15 communicates with the other modules of the terminal 1 via the bus 13. It should be understood that although not shown in the figures, other hardware and/or software modules may be used in conjunction with the terminal 1, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems, to name a few.
As shown in fig. 2, in an embodiment, the information recommendation method of the present invention includes the following steps:
S1, acquiring user behavior characteristic data;
S2, extracting the user behavior characteristic data as a training set;
S3, enabling the recommendation model to perform iterative training through the training set under different configuration parameters to obtain an output data set of the recommendation model under each configuration parameter;
S4, in the iterative training process of the training set, performing optimization processing on each output data in the output data set to obtain the optimal output data of the recommendation model;
S5, taking the configuration parameters corresponding to the optimal output data as the optimal configuration parameters of the recommendation model, and enabling the recommendation model to operate under the optimal configuration parameters to generate recommendation information for the user.
The following specifically describes the steps S1 to S5 in the information recommendation method according to the present embodiment.
S1, acquiring user behavior characteristic data.
In one embodiment, the user behavior feature data is obtained by setting tracking points (event instrumentation) in the application program. For example, the user behavior feature data is obtained by embedding tracking points in the pages of a shopping website.
The functionality of a subscription component (e.g., Kafka) may be used to subscribe to the desired user behavior feature data, which can then be obtained once the page holds or generates the subscribed data.
It should be noted that the user behavior feature data in step S1 includes, but is not limited to, a user ID, a birthday, an education level, a marital status, an income level, the date on which the customer registered with the company, the number of days since the customer's last purchase, whether the customer has complained in the past, the customer's cumulative past consumption amount, the number of discounted purchases, the number of purchases made via the company website, the number of purchases made using a catalog, the number of purchases made directly at a store, the number of recommended purchases, the number of visits to the company website in the last month, the tag category of purchased articles, the number of purchases per category, the time spent on the pages of purchased items per category, the bounce rate, and the like.
S2, extracting the user behavior characteristic data as a training set.
In an embodiment, as shown in fig. 3, the extracting the user behavior feature data includes:
S21, processing the user behavior characteristic data based on a data tool to obtain valid user behavior characteristic data.
In one embodiment, a data tool is used to judge whether the user behavior feature data is valid: incomplete and invalid user behavior feature data are eliminated, and the valid user behavior feature data is obtained by screening the user behavior feature data for anomalies.
In an embodiment, the user behavior feature data is extracted, cleaned and standardized with an Extract-Transform-Load (ETL) data tool, so that the collected user behavior feature data, which is scattered, disordered and inconsistent in format, is turned into valid user behavior feature data.
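The extract, clean and standardize pass described above might look roughly like the following minimal sketch. The record fields (`user_id`, `item_name`, `amount`, `timestamp`) and the validity rules are assumptions chosen for illustration; the patent does not specify a schema or a particular ETL tool.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical fields; the patent lists many more behavior features.
REQUIRED = ("user_id", "item_name", "amount", "timestamp")


@dataclass
class CleanEvent:
    user_id: str
    item_name: str
    amount: float
    timestamp: str  # date part only, e.g. "2022-10-01"


def clean(raw: dict) -> Optional[CleanEvent]:
    """Return a standardized record, or None if the raw record is incomplete or invalid."""
    # Incomplete records: any required field missing or empty.
    if any(raw.get(key) in (None, "") for key in REQUIRED):
        return None
    # Invalid records: amount is not a number or is negative.
    try:
        amount = float(raw["amount"])
    except (TypeError, ValueError):
        return None
    if amount < 0:
        return None
    # Standardize: trim whitespace, lower-case names, keep the date part only.
    return CleanEvent(
        user_id=str(raw["user_id"]).strip(),
        item_name=str(raw["item_name"]).strip().lower(),
        amount=round(amount, 2),
        timestamp=str(raw["timestamp"]).strip()[:10],
    )


def etl(raw_records):
    """Keep only the records that survive cleaning."""
    return [event for event in map(clean, raw_records) if event is not None]
```

In a real deployment this logic would more likely run inside the distributed stream-processing job than as a plain Python loop.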
S22, screening the valid user behavior characteristic data based on preset screening conditions, and extracting the user behavior characteristic data meeting the training requirements.
The preset screening conditions include, for example, time-based screening conditions and commodity name-based screening conditions.
In one embodiment, whether the effective user behavior feature data meets the training requirement is determined according to a preset screening condition. The screening condition of the user behavior feature data can be generated based on the selection condition input by the user, so that the user behavior feature data is filtered and screened to obtain the user behavior feature data meeting the training purpose requirement, and the part of the user behavior feature data is used as a training set.
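As an illustration of screening by preset conditions, the sketch below builds a filter from a time window and a commodity-name keyword and applies it to already-cleaned records. The function names and the record layout (`item_name`, `date`) are hypothetical and are not taken from the patent.

```python
from datetime import date


def make_filter(start: date, end: date, name_keyword: str):
    """Build a screening predicate from user-supplied selection conditions."""
    keyword = name_keyword.lower()

    def keep(record: dict) -> bool:
        in_window = start <= record["date"] <= end          # time-based condition
        name_match = keyword in record["item_name"].lower()  # commodity-name condition
        return in_window and name_match

    return keep


def build_training_set(valid_records, keep):
    """Keep only the valid records that satisfy the screening predicate."""
    return [record for record in valid_records if keep(record)]


# Example: purchases of vitamin products made during 2022.
records = [
    {"item_name": "Vitamin C Tablets", "date": date(2022, 9, 15)},
    {"item_name": "Vitamin D Drops", "date": date(2021, 1, 1)},
    {"item_name": "Face Mask", "date": date(2022, 9, 20)},
]
training_set = build_training_set(
    records, make_filter(date(2022, 1, 1), date(2022, 12, 31), "vitamin")
)
```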
The user behavior feature data is processed as a distributed data stream, where the ETL may use, but is not limited to, a Spark distributed computing cluster. Subscribed message data, i.e., the user behavior feature data, is received through the subscription component (Kafka) used with Spark, and the message data is processed in parallel by the Spark distributed computing cluster. An Extract-Transform-Load (ETL) data tool then filters abnormal data out of the received user behavior feature data, extracting, cleaning and standardizing the collected scattered, disordered and inconsistently formatted data to obtain valid data. Finally, a screening condition generated from the selection condition input by the user is applied to the valid user behavior feature data to form the user behavior feature data required by the training set.
S3, performing iterative training on the recommendation model through the training set under different configuration parameters to obtain an output data set of the recommendation model under each configuration parameter.
In this embodiment, configuration parameters are first set for the recommendation model, and the recommendation model is trained on the training set to obtain the output data corresponding to those configuration parameters; the configuration parameters of the recommendation model are then adjusted, and the recommendation model is trained on the training set again to obtain the output data corresponding to the adjusted configuration parameters; this process is repeated to obtain an output data set of the recommendation model under each configuration parameter.
In this embodiment, the recommendation model is preferably a CatBoost recommendation model. In this implementation, therefore, the recommendation model is trained under different configuration parameters, so that the parameters of the CatBoost model are optimized and the output data set is obtained. The configuration parameters of the recommendation model include, but are not limited to, one_hot_max_size, learning_rate, max_depth, l2_leaf_reg, random_strength, and the like.
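The train-under-each-configuration loop can be sketched as follows. To keep the example self-contained, `toy_train_fn` is a made-up stand-in that returns a loss directly from the parameters; in practice it would fit a CatBoost model with those parameters on the training set and return a validation metric. The grid values are likewise illustrative assumptions.

```python
from itertools import product


def sweep(param_grid: dict, train_fn):
    """Train once per configuration and collect (params, output) pairs."""
    names = list(param_grid)
    output_set = []
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        output_set.append((params, train_fn(params)))
    return output_set


def toy_train_fn(params: dict) -> float:
    # Stand-in for "train the model and measure its loss" (assumption, not real training).
    return (params["learning_rate"] - 0.1) ** 2 + 0.01 * abs(params["max_depth"] - 6)


grid = {
    "learning_rate": [0.03, 0.1, 0.3],
    "max_depth": [4, 6, 8],
    "l2_leaf_reg": [1.0, 3.0],
}
output_set = sweep(grid, toy_train_fn)

# The configuration with the best output becomes the model's configuration.
best_params, best_loss = min(output_set, key=lambda pair: pair[1])
```

An exhaustive grid is used here only because it is the simplest way to enumerate configurations; the patent itself selects configurations with a search algorithm rather than a full grid.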
In this embodiment, the CatBoost algorithm adopted in the CatBoost recommendation model comes from a machine learning library open-sourced in 2017 by the Russian search company Yandex, and is a member of the Boosting family of algorithms.
The CatBoost recommendation model may process an input training set using an algorithm based on the GBDT technique (the family that also includes XGBoost and LightGBM) to generate a plurality of output data that form an output data set. In the CatBoost recommendation model, the process of training the CatBoost recommendation model on a training set comprises the following steps:
First, all sample data in the training set are randomly ordered. Then, for a given value of a categorical feature, the feature of each sample is converted to a numerical value by taking the average of the class labels of the samples that precede it in the ordering and share that value, while a prior and a weight coefficient for the prior are added so as to reduce the noise caused by low-frequency values of the categorical feature.
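A minimal sketch of this ordered encoding is given below, assuming the commonly cited form (sum of preceding labels + weight × prior) / (count of preceding samples + weight). The function name and the default weight are illustrative; the actual CatBoost implementation differs in detail (for example, it uses several permutations).

```python
import random


def ordered_target_stats(categories, labels, prior, prior_weight=1.0, seed=0):
    """Encode a categorical feature using only the labels of samples seen earlier
    in a random permutation, smoothed by a prior."""
    n = len(categories)
    order = list(range(n))
    random.Random(seed).shuffle(order)  # random ordering of the samples

    label_sums, counts = {}, {}
    encoded = [0.0] * n
    for idx in order:
        cat = categories[idx]
        s = label_sums.get(cat, 0.0)
        k = counts.get(cat, 0)
        # Average of preceding labels for this category, pulled toward the prior.
        encoded[idx] = (s + prior_weight * prior) / (k + prior_weight)
        # Only now does this sample's own label become visible to later samples.
        label_sums[cat] = s + labels[idx]
        counts[cat] = k + 1
    return encoded
```

Because a sample's own label never enters its own encoding, the first occurrence of any category is encoded as the prior alone, which is what damps the noise from rare categories.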
Specifically, the CatBoost recommendation model generates the plurality of output data in two stages:
1) Selecting a tree structure and calculating the values of the leaf nodes once the tree structure is fixed. To select the optimal tree structure, the CatBoost algorithm builds a tree by enumerating different splits and calculating the values of the resulting leaf nodes;
2) Calculating scores for the resulting trees, and finally selecting the optimal split to obtain the output data.
In both stages, the values of the leaf nodes are calculated as approximations of a gradient step or a Newton step. In the CatBoost algorithm, the first stage uses an unbiased estimate of the gradient step, and the second stage is performed using the conventional GBDT scheme. The CatBoost algorithm replaces the gradient estimation method of the traditional algorithm with ordered boosting, which reduces the bias of the gradient estimate and improves the generalization ability of the model.
The CatBoost algorithm adopted in the CatBoost recommendation model is a fast, high-performance and scalable machine learning algorithm based on gradient-boosted symmetric decision trees; it has few parameters, supports categorical variables, and is a high-accuracy GBDT framework. Although good results can be obtained with the default parameters, the accuracy of CatBoost is still affected by some of its parameters.
In this embodiment, in order to obtain the optimal configuration parameters of the CatBoost recommendation model, the Grey Wolf Optimizer (GWO) algorithm is used to optimize the configuration parameters of the CatBoost recommendation model. The GWO algorithm is an intelligent optimization search algorithm inspired by the predatory behavior of gray wolves; its main characteristics are strong convergence performance, few parameters and ease of implementation. The GWO algorithm simulates the hierarchy and hunting mechanism of a gray wolf pack in nature, in which four types of gray wolves are used to simulate the leadership hierarchy. The specific process of optimizing the configuration parameters of the CatBoost recommendation model with the GWO algorithm is described in step S4 below.
Step S4, during the iterative training on the training set, performing optimization processing on each output data in the output data set to obtain the optimal output data of the recommendation model.
In an embodiment, as shown in fig. 4, the performing an optimization process on each output data in the output data set to obtain the optimal output data of the recommendation model includes:
step S41, setting first preferred output data, second preferred output data, third preferred output data and candidate output data in the output data set.
Specifically, in this embodiment, the setting of the first preferred output data, the second preferred output data, the third preferred output data and the candidate output data in the output data set includes:
1) Selecting a preset number of output data from the output data set.
In this embodiment, when the first preferred output data, the second preferred output data, the third preferred output data and the candidate output data are initialized, the performance of the recommendation model may be evaluated on a small amount of output data in order to determine them.
2) Ranking the recommendation performance of the recommendation model based on each output data.
The recommendation performance of the recommendation model is evaluated based on each output data: a plurality of performance indexes may be set, the corresponding performance indexes under each output data are calculated to obtain an evaluation result, and the recommendation performance of the recommendation model is ranked based on the evaluation result. Evaluating the recommendation performance of a recommendation model based on its output data is well known to those skilled in the art; any such evaluation method may be adopted in this embodiment, and it is not specifically limited herein.
3) Selecting, in descending order of the recommendation performance ranking, the output data for which the recommendation performance of the recommendation model ranks in the top three, and taking the corresponding output data as the first preferred output data, the second preferred output data and the third preferred output data respectively;
4) Taking any output data in the output data set other than the first preferred output data, the second preferred output data and the third preferred output data as the candidate output data.
That is, in this embodiment, the recommendation performance ranking may be performed on the recommendation model based on a small amount of output data, so as to initialize the first preferred output data, the second preferred output data, the third preferred output data, and the candidate output data in the output data set.
In the present embodiment, the method of setting the first preferred output data, the second preferred output data, the third preferred output data and the candidate output data in the output data set is not limited to the method described above; they may also be set manually in the output data set based on the output data and empirical rules.
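The initialization in items 1) to 4) above may be sketched as follows. The sketch assumes that each output data item is paired with a single scalar performance score for which higher is better, and that the preset number of items is simply the first ones in the set; both are illustrative assumptions, and the function name is hypothetical.

```python
from typing import List, Sequence, Tuple

Output = Tuple[dict, float]  # (configuration parameters, performance score)


def init_leaders(
    output_data_set: Sequence[Output], preset_number: int
) -> Tuple[Output, Output, Output, List[Output]]:
    """Pick the first, second and third preferred output data and the candidates."""
    # 1) Select a preset number of output data (here: the first ones, an assumption).
    subset = list(output_data_set[:preset_number])
    if len(subset) < 4:
        raise ValueError("need at least three leaders and one candidate")
    # 2) Rank by recommendation performance, best first.
    ranked = sorted(subset, key=lambda item: item[1], reverse=True)
    # 3) The top three become the first, second and third preferred output data.
    alpha, beta, delta = ranked[:3]
    # 4) Everything else is treated as candidate output data.
    candidates = ranked[3:]
    return alpha, beta, delta, candidates
```

Any other ranking criterion could be substituted by changing the sort key.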
Step S32, constructing a data relation equation based on the first preferred output data, the second preferred output data, the third preferred output data, the candidate output data and the remaining output data in the output data set.
To mathematically model the social hierarchy of gray wolves when designing the GWO algorithm, this embodiment takes the best solution as α. The second and third best solutions are named β and δ, respectively. The remaining candidate solutions are assumed to be ω. In the GWO algorithm, the hunting process is guided by α, β and δ, and the ω wolves follow these three wolves. Correspondingly, in this embodiment, first preferred output data, second preferred output data, third preferred output data and candidate output data are set in the output data set, denoted as first preferred output data α, second preferred output data β, third preferred output data δ and candidate output data ω; in the process of obtaining the optimal output data of the recommendation model, the candidate output data ω follow the guidance of the first preferred output data α, the second preferred output data β and the third preferred output data δ toward the optimal output data.
Under the lead of gray wolf α, gray wolf δ directs the pack to surround the prey, and each individual gray wolf tracks the prey position according to a mathematical model. Specifically, based on the first preferred output data, the second preferred output data, the third preferred output data, the candidate output data, and the remaining output data in the output data set, a data relation equation is constructed as follows:
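$$D_\alpha = |C_1 \cdot X_\alpha - X|, \quad D_\beta = |C_2 \cdot X_\beta - X|, \quad D_\delta = |C_3 \cdot X_\delta - X|$$
$$X_1 = X_\alpha - A_1 \cdot D_\alpha, \quad X_2 = X_\beta - A_2 \cdot D_\beta, \quad X_3 = X_\delta - A_3 \cdot D_\delta$$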
wherein $D_\alpha$, $D_\beta$ and $D_\delta$ respectively represent the vector differences between the first preferred output data, the second preferred output data and the third preferred output data and the remaining output data in the output data set; $X_\alpha$, $X_\beta$ and $X_\delta$ respectively represent the current space vector positions of the first preferred output data, the second preferred output data and the third preferred output data; $C_1$, $C_2$ and $C_3$ are random vectors, and $X$ is the space vector position of the current output data; $X_1$, $X_2$ and $X_3$ are the step sizes by which the space vector of the candidate output data advances toward the first preferred output data, the second preferred output data and the third preferred output data respectively; and $A_1$, $A_2$ and $A_3$ are the directions in which the candidate output data advance toward the first preferred output data, the second preferred output data and the third preferred output data respectively.
In this embodiment, the direction in which the space vector of the candidate output data advances toward the first preferred output data, the second preferred output data and the third preferred output data is controlled by a convergence factor, wherein the convergence factor decreases linearly over the iterative training of the recommendation model.
Specifically, one expression of the direction in which the space vector of the candidate output data advances toward the first preferred output data, the second preferred output data and the third preferred output data is: $A = 2\alpha \cdot r_1 - \alpha$, where α here denotes the convergence factor (distinct from the first preferred output data α), which decreases linearly with the number of iterations, e.g. from 2 to 0, and $r_1$ is a random number in [0, 1].
In one embodiment, one expression of the random vectors $C_1$, $C_2$ and $C_3$ is: $C = 2 \cdot r_2$, where $r_2$ is a random number in [0, 1].
Step S34, obtaining the optimal solution of the data relation equation, and taking the optimal solution as the optimal output data.
In one embodiment, obtaining the optimal solution of the data relation equation comprises: during the iterative training of the recommendation model, acquiring the candidate output data when the position of the space vector of the candidate output data no longer changes; and taking the candidate output data as the optimal solution of the data relation equation.
Specifically, in one embodiment, the space vector positions of the candidate output data are:
$$X(t+1) = \frac{X_1 + X_2 + X_3}{3}$$
where $t$ is the iteration number of the iterative training of the recommendation model.
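One update of a candidate position may be sketched as follows. The sketch assumes the standard Grey Wolf Optimizer update, in which the candidate moves to the average of three leader-guided positions, with fresh random numbers drawn for every dimension and every leader; the function and variable names are hypothetical.

```python
import random
from typing import List, Sequence

Vector = List[float]


def gwo_update(
    x: Sequence[float],
    leaders: Sequence[Sequence[float]],
    a: float,
    rng: random.Random,
) -> Vector:
    """Move one candidate position toward the three preferred positions.

    x       : current space vector position of the candidate output data
    leaders : positions of the first, second and third preferred output data
    a       : current value of the convergence factor
    """
    new_position: Vector = []
    for j in range(len(x)):
        guided = []
        for leader in leaders:
            r1, r2 = rng.random(), rng.random()
            A = 2.0 * a * r1 - a           # direction coefficient in [-a, a)
            C = 2.0 * r2                   # random coefficient in [0, 2)
            D = abs(C * leader[j] - x[j])  # distance to the (perturbed) leader
            guided.append(leader[j] - A * D)
        # New coordinate: average of the leader-guided positions.
        new_position.append(sum(guided) / len(guided))
    return new_position
```

When the convergence factor reaches 0, every `A` is 0 and the candidate lands on the mean of the three preferred positions, so the positions stop changing once the leaders themselves are stable.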
It should be noted that the wolves complete the hunt by attacking once the prey stops moving. To simulate approaching the prey, in this embodiment the value of the convergence factor α is gradually decreased, so the fluctuation range of the direction A in which the space vector of the candidate output data advances toward the first preferred output data, the second preferred output data and the third preferred output data also decreases. That is, during the iterative operation of the CatBoost recommendation model, as the value of the convergence factor α decreases linearly from 2 to 0, the corresponding value of A varies within the interval [-α, α]. When the value of A is within this interval, the next position of a wolf may be anywhere between its current position and the prey position, i.e. the next position of the candidate output data may be anywhere between its current position and the optimal output data. When |A| < 1, the wolf pack attacks the prey and may fall into a local optimum, i.e. the optimized output data are currently a local optimum. When |A| > 1, the wolves move away from the prey in the hope of finding a more suitable prey, i.e. the search aims for globally optimal output data.
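The effect of the linearly decreasing convergence factor on |A| may be illustrated by the following sketch. The linear schedule from 2 to 0 follows the text; the sampling-based estimate and the function names are assumptions made only for illustration.

```python
import random


def convergence_factor(t: int, max_iter: int) -> float:
    """Linearly decreasing convergence factor: 2 at t = 0, 0 at t = max_iter."""
    return 2.0 * (1.0 - t / max_iter)


def exploration_rate(t: int, max_iter: int, samples: int = 10_000, seed: int = 0) -> float:
    """Estimate the fraction of draws of A with |A| > 1 at iteration t."""
    rng = random.Random(seed)
    a = convergence_factor(t, max_iter)
    hits = 0
    for _ in range(samples):
        A = 2.0 * a * rng.random() - a  # A lies in [-a, a)
        if abs(A) > 1.0:
            hits += 1
    return hits / samples
```

Since A is bounded by the convergence factor, |A| > 1 can only occur while the factor is above 1, that is, during the first half of the iterations under this schedule; afterwards every move is an "attack" toward the preferred output data.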
Step S5, taking the configuration parameters corresponding to the optimal output data as the optimal configuration parameters of the recommendation model, and causing the recommendation model to operate under the optimal configuration parameters to generate recommendation information for the user.
It should be noted that, in this embodiment, configuration parameter optimization is performed on the CatBoost recommendation model based on the output data to find the optimal configuration parameters; the CatBoost recommendation model is then trained under the optimal configuration parameters to obtain the recommendation model; and finally the trained recommendation model is used to make decision responses to user input, so as to recommend personalized commodities to the user.
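Mapping an optimized space vector position back to configuration parameters may be sketched as follows. The embodiment does not state how positions are encoded; this sketch assumes, purely for illustration, that each coordinate lies in [0, 1] and is scaled linearly to a parameter range, with the ranges supplied by the caller.

```python
from typing import Dict, Sequence, Tuple

# name -> (low, high, is_integer); the concrete ranges are caller-supplied assumptions.
Bounds = Dict[str, Tuple[float, float, bool]]


def decode_position(position: Sequence[float], bounds: Bounds) -> Dict[str, float]:
    """Convert a position in the unit hypercube into configuration parameters."""
    if len(position) != len(bounds):
        raise ValueError("position and bounds must have the same length")
    config: Dict[str, float] = {}
    for coord, (name, (low, high, is_integer)) in zip(position, bounds.items()):
        unit = min(1.0, max(0.0, coord))  # clip positions that left the search box
        value = low + unit * (high - low)
        config[name] = int(round(value)) if is_integer else value
    return config
```

The resulting dictionary could then be passed as keyword arguments to whatever training routine is used for the final model.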
In this embodiment, optimization processing is performed, based on the GWO algorithm, on each output data in the output data set of the recommendation model to obtain the optimal output data of the recommendation model; the configuration parameters corresponding to the optimal output data are taken as the optimal configuration parameters of the recommendation model, and the CatBoost recommendation model is then trained under them, so that the recommendation model can generate recommended content according to the personality and preferences of each user. On the one hand, because the personalities and preferences of different users differ, the recommended content differs as well, so that more items can be recommended and profit can be obtained from more, smaller market segments. On the other hand, because recommendations are generated according to user preferences, the user's needs can be predicted and the content the user is most likely to like can be recommended, which relieves the user of having to choose from massive amounts of information, raises the probability of a successful recommendation, and improves the user's shopping experience.
It should be noted that the protection scope of the information recommendation method according to the present invention is not limited to the execution sequence of the steps listed in this embodiment, and all the solutions implemented by adding, subtracting, and replacing the steps in the prior art according to the principle of the present invention are included in the protection scope of the present invention.
As shown in fig. 5, in an embodiment, the invention provides an information recommendation apparatus 6, where the information recommendation apparatus 6 includes a data acquisition module 61, a training set acquisition module 62, an output data set acquisition module 63, an optimal output data acquisition module 64, and an information recommendation module 65.
The data acquisition module 61 is configured to acquire user behavior feature data.
The training set acquisition module 62 extracts the user behavior feature data as a training set.
The optimized output data acquisition module 63 performs optimization processing on the CatBoost recommendation model based on the training set and the GWO algorithm to obtain optimized output data.
The output data set acquisition module 63 causes the recommendation model to perform iterative training on the training set under different configuration parameters, so as to obtain an output data set of the recommendation model under each configuration parameter.
The optimal output data acquisition module 64 performs optimization processing on each output data in the output data set during the iterative training on the training set, so as to obtain the optimal output data of the recommendation model.
The information recommendation module 65 takes the configuration parameters corresponding to the optimal output data as the optimal configuration parameters of the recommendation model, and causes the recommendation model to operate under the optimal configuration parameters to generate recommendation information for the user.
It should be noted that the structures and principles of the data acquisition module 61, the training set acquisition module 62, the output data set acquisition module 63, the optimal output data acquisition module 64, and the information recommendation module 65 correspond one-to-one to the steps (step S1 to step S5) of the information recommendation method, and are therefore not described again here.
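The correspondence between the five modules and steps S1 to S5 may be pictured with the following sketch, in which each module is represented by a caller-supplied function. The class name, field names and function signatures are hypothetical and serve only to show the order in which the modules hand data to one another.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Tuple

Output = Tuple[dict, float]  # (configuration parameters, output data score)


@dataclass
class InformationRecommendationApparatus:
    acquire_data: Callable[[], Any]                            # module 61, step S1
    build_training_set: Callable[[Any], Any]                   # module 62, step S2
    build_output_data_set: Callable[[Any], List[Output]]       # module 63, step S3
    select_optimal_output: Callable[[List[Output]], Output]    # module 64, step S4
    recommend: Callable[[dict, Any], Any]                      # module 65, step S5

    def run(self) -> Any:
        """Execute the five modules in the order of steps S1 to S5."""
        raw = self.acquire_data()
        training_set = self.build_training_set(raw)
        output_data_set = self.build_output_data_set(training_set)
        best_config, _ = self.select_optimal_output(output_data_set)
        return self.recommend(best_config, training_set)
```

Whether the modules are realized as software functions, as here, or as separate hardware elements is a matter of implementation.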
It should be noted that the division of each module of the above apparatus is only a logical division, and all or part of the actual implementation may be integrated into one physical entity or may be physically separated. And these modules can be realized in the form of software called by processing element; or may be implemented entirely in hardware; and part of the modules can be realized in the form of calling software by the processing element, and part of the modules can be realized in the form of hardware. For example, the x module may be a processing element that is set up separately, or may be implemented by being integrated in a chip of the apparatus, or may be stored in a memory of the apparatus in the form of program code, and the function of the x module may be called and executed by a processing element of the apparatus. Other modules are implemented similarly. In addition, all or part of the modules can be integrated together or can be independently realized. The processing element described herein may be an integrated circuit having signal processing capabilities. In implementation, each step of the above method or each module above may be implemented by an integrated logic circuit of hardware in a processor element or an instruction in the form of software.
For example, the above modules may be one or more integrated circuits configured to implement the above methods, such as: one or more Application Specific Integrated Circuits (ASICs), or one or more Digital Signal Processors (DSPs), or one or more Field Programmable Gate Arrays (FPGAs), etc. For another example, when one of the above modules is implemented in the form of a processing element invoking program code, the processing element may be a general-purpose processor, such as a Central Processing Unit (CPU) or another processor capable of calling program code. For another example, these modules may be integrated together and implemented in the form of a System-On-a-Chip (SOC).
It should be noted that the information recommendation apparatus 6 of the present invention can implement the information recommendation method of the present invention, but the implementation apparatus of the information recommendation method of the present invention includes, but is not limited to, the structure of the information recommendation apparatus 6 described in this embodiment, and all the structural modifications and substitutions of the prior art made according to the principle of the present invention are included in the protection scope of the present invention.
In summary, the method and the device perform optimization processing on the CatBoost recommendation model according to the user behavior feature data to obtain optimized output data, and train the CatBoost recommendation model according to the optimized output data to generate the recommendation model, so that the generated recommendation model can generate recommended content according to the personality and preferences of the user, predict the needs of the user and recommend the content the user is most likely to like, which improves the recommendation success rate, relieves the user of having to choose from massive amounts of information, and improves the shopping experience of the user. Therefore, the invention effectively overcomes various defects in the prior art and has high industrial utilization value.
The foregoing embodiments are merely illustrative of the principles and utilities of the present invention and are not intended to limit the invention. Any person skilled in the art can modify or change the above-mentioned embodiments without departing from the spirit and scope of the present invention. Accordingly, it is intended that all equivalent modifications or changes which can be made by those skilled in the art without departing from the spirit and technical spirit of the present invention be covered by the claims of the present invention.

Claims (10)

1. An information recommendation method, characterized by comprising the steps of:
acquiring user behavior characteristic data;
extracting the user behavior characteristic data as a training set;
enabling a recommendation model to carry out iterative training through the training set under different configuration parameters to obtain an output data set of the recommendation model under each configuration parameter;
in the iterative training process of the training set, performing optimization processing on each output data in the output data set to obtain the optimal output data of the recommendation model;
and taking the configuration parameters corresponding to the optimal output data as the optimal configuration parameters of the recommendation model, and enabling the recommendation model to operate under the optimal configuration parameters to generate recommendation information for the user.
2. The information recommendation method according to claim 1, wherein the performing optimization processing on each output data in the output data set to obtain the optimal output data of the recommendation model comprises:
setting first preferred output data, second preferred output data, third preferred output data and candidate output data in the output data set;
constructing a data relation equation based on the first preferred output data, the second preferred output data, the third preferred output data, the candidate output data and the remaining output data in the output data set;
and obtaining an optimal solution of the data relation equation, and taking the optimal solution as optimal output data of the recommendation model.
3. The information recommendation method according to claim 2, wherein the data relation equation is constructed by:
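$$D_\alpha = |C_1 \cdot X_\alpha - X|, \quad D_\beta = |C_2 \cdot X_\beta - X|, \quad D_\delta = |C_3 \cdot X_\delta - X|$$
$$X_1 = X_\alpha - A_1 \cdot D_\alpha, \quad X_2 = X_\beta - A_2 \cdot D_\beta, \quad X_3 = X_\delta - A_3 \cdot D_\delta$$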
wherein $D_\alpha$, $D_\beta$ and $D_\delta$ respectively represent the vector differences between the first preferred output data, the second preferred output data and the third preferred output data and the remaining output data in the output data set; $X_\alpha$, $X_\beta$ and $X_\delta$ respectively represent the current space vector positions of the first preferred output data, the second preferred output data and the third preferred output data; $C_1$, $C_2$ and $C_3$ are random vectors, and $X$ is the space vector position of the current output data; $X_1$, $X_2$ and $X_3$ are the step sizes by which the space vector of the candidate output data advances toward the first preferred output data, the second preferred output data and the third preferred output data respectively; and $A_1$, $A_2$ and $A_3$ are the directions in which the candidate output data advance toward the first preferred output data, the second preferred output data and the third preferred output data respectively.
4. The information recommendation method of claim 3, wherein setting a first preferred output data, a second preferred output data, a third preferred output data and a candidate output data in the output data set comprises:
selecting a preset number of output data from the output data set;
performing recommendation performance ranking on the recommendation model based on each of the output data;
selecting, in descending order of the recommendation performance ranking, the output data for which the recommendation performance of the recommendation model ranks in the top three, and taking the corresponding output data as the first preferred output data, the second preferred output data and the third preferred output data respectively;
and taking any output data in the output data set except the first preferred output data, the second preferred output data and the third preferred output data as candidate output data.
5. The information recommendation method of claim 3, wherein said obtaining an optimal solution in said data relationship equation comprises:
in the iterative training process of the recommendation model, when the position of the space vector of the candidate output data is unchanged, the candidate output data is obtained;
and taking the candidate output data as the optimal solution in the data relation equation.
6. The information recommendation method according to claim 4, wherein the direction in which the space vector of the candidate output data advances toward the first preferred output data, the second preferred output data and the third preferred output data is controlled by a convergence factor; wherein the convergence factor decreases linearly with iterative training of the recommendation model.
7. The information recommendation method according to any one of claims 1 to 6, wherein the recommendation model is a CatBoost recommendation model.
8. An information recommendation apparatus, comprising: the system comprises a data acquisition module, a training set acquisition module, an output data set acquisition module, an optimal output data acquisition module and an information recommendation module;
the data acquisition module is used for acquiring user behavior characteristic data;
the training set acquisition module is used for extracting the user behavior characteristic data as a training set;
the output data set acquisition module enables a recommendation model to perform iterative training through the training set under different configuration parameters to obtain an output data set of the recommendation model under each configuration parameter;
the optimal output data acquisition module is used for performing optimization processing on each output data in the output data set in the iterative training process of the training set to obtain the optimal output data of the recommendation model;
and the information recommendation module takes the configuration parameters corresponding to the optimal output data as the optimal configuration parameters of the recommendation model, and enables the recommendation model to operate under the optimal configuration parameters to generate recommendation information for the user.
9. An electronic device, comprising: memory, processor and computer program stored on the memory and executable on the processor, characterized in that the processor implements the steps of the information recommendation method according to any one of claims 1 to 7 when executing the computer program.
10. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the information recommendation method according to any one of claims 1 to 7.
CN202211275555.8A 2022-10-18 2022-10-18 Information recommendation method and device, electronic equipment and storage medium Pending CN115495663A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211275555.8A CN115495663A (en) 2022-10-18 2022-10-18 Information recommendation method and device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211275555.8A CN115495663A (en) 2022-10-18 2022-10-18 Information recommendation method and device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN115495663A true CN115495663A (en) 2022-12-20

Family

ID=84474889

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211275555.8A Pending CN115495663A (en) 2022-10-18 2022-10-18 Information recommendation method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN115495663A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116501979A (en) * 2023-06-30 2023-07-28 北京水滴科技集团有限公司 Information recommendation method, information recommendation device, computer equipment and computer readable storage medium

Similar Documents

Publication Publication Date Title
Chi et al. k-pod: A method for k-means clustering of missing data
Pachter et al. Tropical geometry of statistical models
US8893076B2 (en) Configurable computation modules
Miao et al. Context‐based dynamic pricing with online clustering
WO2022057658A1 (en) Method and apparatus for training recommendation model, and computer device and storage medium
CN114048331A (en) Knowledge graph recommendation method and system based on improved KGAT model
CN107786943B (en) User grouping method and computing device
CN111723292B (en) Recommendation method, system, electronic equipment and storage medium based on graph neural network
CN110110233B (en) Information processing method, device, medium and computing equipment
CN116010684A (en) Article recommendation method, device and storage medium
CN111429161B (en) Feature extraction method, feature extraction device, storage medium and electronic equipment
CN110264277B (en) Data processing method and device executed by computing equipment, medium and computing equipment
CN112395487A (en) Information recommendation method and device, computer-readable storage medium and electronic equipment
KR101639656B1 (en) Method and server apparatus for advertising
Wang et al. CROWN: a context-aware recommender for web news
CN110377906A (en) Entity alignment schemes, storage medium and electronic equipment
CN115495663A (en) Information recommendation method and device, electronic equipment and storage medium
CN116738081B (en) Front-end component binding method, device and storage medium
CN111488517A (en) Method and device for training click rate estimation model
CN112069404A (en) Commodity information display method, device, equipment and storage medium
CN112650942A (en) Product recommendation method, device, computer system and computer-readable storage medium
CN116186541A (en) Training method and device for recommendation model
CN115630219A (en) Training method and device of recommendation model and computer equipment
Wang et al. Clustered coefficient regression models for poisson process with an application to seasonal warranty claim data
CN116127083A (en) Content recommendation method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination