CN113312554B - Method and device for evaluating recommendation system, electronic equipment and medium - Google Patents

Method and device for evaluating recommendation system, electronic equipment and medium

Info

Publication number
CN113312554B
Authority
CN
China
Prior art keywords
candidate
attribute
content
candidate objects
evaluated
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110662575.XA
Other languages
Chinese (zh)
Other versions
CN113312554A (en)
Inventor
耿林
陈洋
李洪岩
秦涛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN202110662575.XA
Publication of CN113312554A
Application granted
Publication of CN113312554B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The disclosure provides a method and a device for evaluating a recommendation system, electronic equipment and a medium, relates to the technical field of data processing, and particularly relates to intelligent recommendation technology. The implementation scheme is as follows: determining at least one content to be evaluated of a target object, wherein the at least one content to be evaluated is historical recommended content presented to the target object by a recommendation system; acquiring evaluation data of the target object for the at least one content to be evaluated; and evaluating the recommendation effect of the recommendation system according to the evaluation data.

Description

Method and device for evaluating recommendation system, electronic equipment and medium
Technical Field
The present disclosure relates to the field of data processing technology, and in particular, to an intelligent recommendation technology, and more particularly, to a method, an apparatus, an electronic device, a computer readable storage medium, and a computer program product for evaluating a recommendation system.
Background
A recommendation system is used for screening out content that may interest a user from massive data and pushing the content to the user. At present, recommendation systems are widely applied in various scenarios such as news recommendation, commodity recommendation, audio and video recommendation, advertisement delivery, and social friend recommendation. With the development of information technology, the data volume handled by a recommendation system keeps increasing and the recommendation algorithms adopted are becoming more complex and diverse, so a recommendation system faces great challenges in providing personalized recommendation services and may fail to accurately recommend content of interest to users. In order to optimize the recommendation system so that it can better provide personalized recommendation services for users, the recommendation effect of the recommendation system needs to be evaluated.
The approaches described in this section are not necessarily approaches that have been previously conceived or pursued. Unless otherwise indicated, it should not be assumed that any of the approaches described in this section qualify as prior art merely by virtue of their inclusion in this section. Similarly, the problems mentioned in this section should not be considered as having been recognized in any prior art unless otherwise indicated.
Disclosure of Invention
The present disclosure provides a method, apparatus, electronic device, computer-readable storage medium, and computer program product for evaluating a recommendation system.
According to an aspect of the present disclosure, there is provided a method for evaluating a recommendation system, including: determining at least one content to be evaluated of a target object, wherein the at least one content to be evaluated is historical recommended content presented to the target object by the recommendation system; acquiring evaluation data of the target object for the at least one content to be evaluated; and evaluating a recommendation effect of the recommendation system according to the evaluation data.
According to another aspect of the present disclosure, there is provided an apparatus for evaluating a recommendation system, including: a content determination module configured to determine at least one content to be evaluated of a target object, the at least one content to be evaluated being historical recommended content presented to the target object by the recommendation system; a data acquisition module configured to acquire, through an evaluation interface, evaluation data of the target object for the at least one content to be evaluated; and an evaluation module configured to evaluate a recommendation effect of the recommendation system according to the evaluation data.
According to another aspect of the present disclosure, there is provided an electronic device including: at least one processor; and a memory communicatively coupled to the at least one processor. The memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method for evaluating a recommender system described above.
According to another aspect of the present disclosure, a non-transitory computer-readable storage medium storing computer instructions is provided. The computer instructions are for causing a computer to perform the above-described method for evaluating a recommendation system.
According to another aspect of the present disclosure, a computer program product is provided, including a computer program. The computer program, when executed by a processor, implements the method for evaluating a recommendation system described above.
According to one or more embodiments of the present disclosure, the recommendation effect of the recommendation system is evaluated according to evaluation data provided by a user (i.e., a target object) for historical recommended content (i.e., content to be evaluated) presented to the user by the recommendation system. Because the evaluation data is the user's real experience feedback on the historical recommended content, evaluating the recommendation effect of the recommendation system according to the evaluation data can improve the accuracy of the evaluation result.
It should be understood that the description in this section is not intended to identify key or critical features of the embodiments of the disclosure, nor is it intended to be used to limit the scope of the disclosure. Other features of the present disclosure will become apparent from the following specification.
Drawings
The accompanying drawings illustrate exemplary embodiments and, together with the description, serve to explain exemplary implementations of the embodiments. The illustrated embodiments are for exemplary purposes only and do not limit the scope of the claims. Throughout the drawings, identical reference numerals designate similar, but not necessarily identical, elements.
FIG. 1 illustrates a schematic diagram of an exemplary system in which various methods described herein may be implemented, in accordance with an embodiment of the present disclosure;
FIG. 2 illustrates a flow chart of a method for evaluating a recommendation system, according to an embodiment of the present disclosure;
FIGS. 3A, 3B illustrate schematic diagrams of exemplary content presentation interfaces according to embodiments of the present disclosure;
FIGS. 4A, 4B illustrate schematic diagrams of exemplary data interfaces according to embodiments of the present disclosure;
FIG. 5 shows a block diagram of an apparatus for evaluating a recommendation system, according to an embodiment of the present disclosure; and
FIG. 6 illustrates a block diagram of an exemplary electronic device that can be used to implement embodiments of the present disclosure.
Detailed Description
Exemplary embodiments of the present disclosure are described below in conjunction with the accompanying drawings, which include various details of the embodiments of the present disclosure to facilitate understanding, and should be considered as merely exemplary. Accordingly, one of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
In the present disclosure, the use of the terms "first," "second," and the like to describe various elements is not intended to limit the positional relationship, timing relationship, or importance relationship of the elements, unless otherwise indicated, and such terms are merely used to distinguish one element from another. In some examples, a first element and a second element may refer to the same instance of the element, and in some cases, they may also refer to different instances based on the description of the context.
The terminology used in the description of the various illustrated examples in this disclosure is for the purpose of describing particular examples only and is not intended to be limiting. Unless the context clearly indicates otherwise, the elements may be one or more if the number of the elements is not specifically limited. Furthermore, the term "and/or" as used in this disclosure encompasses any and all possible combinations of the listed items.
Embodiments of the present disclosure will be described in detail below with reference to the accompanying drawings.
Fig. 1 illustrates a schematic diagram of an exemplary system 100 in which various methods and apparatus described herein may be implemented, in accordance with an embodiment of the present disclosure. Referring to fig. 1, the system 100 includes one or more client devices 101, 102, 103, 104, 105, and 106, a server 120, and one or more communication networks 110 coupling the one or more client devices to the server 120. Client devices 101, 102, 103, 104, 105, and 106 may be configured to execute one or more applications.
In embodiments of the present disclosure, the server 120 may run one or more services or software applications that enable execution of the method for evaluating the recommendation system.
In some embodiments, server 120 may also provide other services or software applications that may include non-virtual environments and virtual environments. In some embodiments, these services may be provided as web-based services or cloud services, for example, provided to users of client devices 101, 102, 103, 104, 105, and/or 106 under a software as a service (SaaS) model.
In the configuration shown in fig. 1, server 120 may include one or more components that implement the functions performed by server 120. These components may include software components, hardware components, or a combination thereof that are executable by one or more processors. A user operating client devices 101, 102, 103, 104, 105, and/or 106 may in turn utilize one or more client applications to interact with server 120 to utilize the services provided by these components. It should be appreciated that a variety of different system configurations are possible, which may differ from system 100. Accordingly, FIG. 1 is one example of a system for implementing the various methods described herein and is not intended to be limiting.
The user may evaluate the historical recommendation content presented to him by the recommendation system using client devices 101, 102, 103, 104, 105 and/or 106. The client device may provide an interface that enables a user of the client device to interact with the client device. The client device may also output information to the user via the interface. Although fig. 1 depicts only six client devices, those skilled in the art will appreciate that the present disclosure may support any number of any type of client devices.
Client devices 101, 102, 103, 104, 105, and/or 106 may include various types of computer devices, such as portable handheld devices, general purpose computers (such as personal computers and laptop computers), workstation computers, wearable devices, gaming systems, thin clients, various messaging devices, sensors or other sensing devices, and the like. These computer devices may run various types and versions of software applications and operating systems, such as Microsoft Windows, Apple iOS, UNIX-like operating systems, Linux, or Linux-like operating systems (e.g., Chrome OS); or include various mobile operating systems such as Microsoft Windows Mobile OS, iOS, Windows Phone, and Android. Portable handheld devices may include cellular telephones, smart phones, tablet computers, personal digital assistants (PDAs), and the like. Wearable devices may include head mounted displays and other devices. The gaming system may include various handheld gaming devices, internet-enabled gaming devices, and the like. The client device is capable of executing a variety of different applications, such as various Internet-related applications, communication applications (e.g., email applications), and Short Message Service (SMS) applications, and may use a variety of communication protocols.
Network 110 may be any type of network known to those skilled in the art that may support data communications using any of a number of available protocols, including but not limited to TCP/IP, SNA, IPX, etc. For example only, the one or more networks 110 may be a Local Area Network (LAN), an Ethernet-based network, a token ring, a Wide Area Network (WAN), the Internet, a virtual network, a Virtual Private Network (VPN), an intranet, an extranet, a Public Switched Telephone Network (PSTN), an infrared network, a wireless network (e.g., Bluetooth, Wi-Fi), and/or any combination of these and/or other networks.
The server 120 may include one or more general purpose computers, special purpose server computers (e.g., PC (personal computer) servers, UNIX servers, midrange servers), blade servers, mainframe computers, server clusters, or any other suitable arrangement and/or combination. The server 120 may include one or more virtual machines running a virtual operating system, or other computing architecture that involves virtualization (e.g., one or more flexible pools of logical storage devices that may be virtualized to maintain virtual storage devices of the server). In various embodiments, server 120 may run one or more services or software applications that provide the functionality described below.
The computing units in server 120 may run one or more operating systems including any of the operating systems described above as well as any commercially available server operating systems. Server 120 may also run any of a variety of additional server applications and/or middle tier applications, including HTTP servers, FTP servers, CGI servers, JAVA servers, database servers, etc.
In some implementations, server 120 may include one or more applications to analyze and consolidate data feeds and/or event updates received from users of client devices 101, 102, 103, 104, 105, and 106. Server 120 may also include one or more applications to display data feeds and/or real-time events via one or more display devices of client devices 101, 102, 103, 104, 105, and 106.
In some implementations, the server 120 may be a server of a distributed system or a server that incorporates a blockchain. The server 120 may also be a cloud server, or an intelligent cloud computing server or intelligent cloud host with artificial intelligence technology. A cloud server is a host product in a cloud computing service system that addresses the drawbacks of high management difficulty and weak service scalability found in traditional physical host and Virtual Private Server (VPS) services.
The system 100 may also include one or more databases 130. In some embodiments, these databases may be used to store data and other information. For example, one or more of databases 130 may be used to store information such as audio files and video files. The databases 130 may reside in a variety of locations. For example, the database used by the server 120 may be local to the server 120, or may be remote from the server 120 and may communicate with the server 120 via a network-based or dedicated connection. The databases 130 may be of different types. In some embodiments, the database used by server 120 may be, for example, a relational database. One or more of these databases may store, update, and retrieve data in response to commands.
In some embodiments, one or more of databases 130 may also be used by applications to store application data. The databases used by the application may be different types of databases, such as key value stores, object stores, or conventional stores supported by the file system.
The system 100 of fig. 1 may be configured and operated in various ways to enable application of the various methods and apparatus described in accordance with the present disclosure.
For purposes of embodiments of the present disclosure, in the example of fig. 1, client applications for content browsing, through which a user can browse content, may be included in client devices 101, 102, 103, 104, 105, and 106. The content browsed by the user can be news information, audio and video, commodity information and the like, and correspondingly, the client application can be a news information application, an audio and video entertainment application, a shopping application and the like. The client application may exist in the client device in a variety of ways. For example, the client application may be an application program that needs to be downloaded and installed before running, a website that is accessible through a browser, a lightweight applet that runs in a host application, and so on.
Accordingly, the server 120 may be a server corresponding to the client application for content browsing in the client device. The server 120 may include a service program that may provide a content browsing service to users based on content information stored in the database 130 (including the title, images, text, author, type, and interaction data (e.g., likes, comments, forwards) of the content). Further, the service program includes a recommendation system, which is capable of providing a personalized recommendation service to the user: it determines content (i.e., recommended content) that may be of interest to the user from among the stored plurality of contents according to related information of the user (e.g., attribute information, behavior information, etc.), and presents part or all of the determined recommended content to the user. Accordingly, the user can browse the content recommended by the recommendation system through the client application.
In some cases, the recommendation effect of the recommendation system can be evaluated, and the recommendation system can be optimized according to the evaluation result, so that it can better provide personalized recommendation services for users. In the related art, the recommendation effect is generally evaluated according to technical indexes such as the number of clicks on the recommended content by the user, the click-to-display ratio (i.e., the ratio of the number of clicks to the number of displays), and the browsing duration. Because the quality of recommended content is uneven, a title may be excessively exaggerated or distorted (colloquially, "clickbait") or may be inconsistent with the body text, so a user's clicking on or long browsing of recommended content does not mean that the user is satisfied with it. Technical indexes such as the number of clicks, the click-to-display ratio, and the browsing duration therefore cannot accurately reflect the user's real experience and subjective feeling about the recommended content, so evaluation results obtained from these indexes are inaccurate and have low confidence.
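As a small illustration of the click-to-display ratio defined above, the following sketch computes it from raw counts. The function name and the convention of returning 0.0 when nothing was displayed are assumptions made for illustration only and are not taken from the disclosure.

```python
def click_to_display_ratio(clicks: int, displays: int) -> float:
    """Ratio of the number of clicks to the number of displays.

    Returning 0.0 when there were no displays is an assumed convention.
    """
    if clicks < 0 or displays < 0:
        raise ValueError("counts must be non-negative")
    if displays == 0:
        return 0.0
    return clicks / displays
```

A high value of this ratio only shows that users clicked; as the paragraph above notes, it says nothing about whether they were satisfied with what they found.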
In order to accurately evaluate the recommendation effect of the recommendation system, in an embodiment of the present disclosure, the server 120 may perform the method 200 for evaluating the recommendation system, acquire evaluation data of historical recommendation contents that the user browses, and evaluate the recommendation effect of the recommendation system according to the evaluation data. The evaluation data is real experience feedback of the user on the historical recommended content, the recommendation effect of the recommendation system is evaluated according to the evaluation data, the accuracy of the evaluation result of the recommendation system can be improved, and clear guidance is provided for the optimization direction of the recommendation system.
Further, the server 120 may optimize the recommendation system according to the evaluation result obtained by performing the method for evaluating the recommendation system according to the embodiment of the present disclosure. For the optimized recommendation system, the recommendation system can be evaluated again according to the method for evaluating the recommendation system, and the recommendation system is optimized again based on the evaluation result. The evaluation and optimization process can be circularly executed for a plurality of times, so that the recommendation effect of the recommendation system can be continuously improved, and the recommendation system can better provide personalized recommendation service for users.
FIG. 2 illustrates a flow chart of a method 200 for evaluating a recommendation system, according to an embodiment of the present disclosure. The method 200 may be performed at a server (e.g., the server 120 shown in fig. 1), i.e., the subject of execution of the steps of the method 200 may be the server 120 shown in fig. 1. It is to be appreciated that in some embodiments, the method 200 may also be performed at a client device (e.g., client devices 101, 102, 103, 104, 105, and 106 shown in fig. 1). Further, the client device may upload the evaluation result of the recommendation system obtained by performing the method 200 to the server.
As shown in fig. 2, the method 200 includes: step 210, determining at least one content to be evaluated of a target object, wherein the content to be evaluated is a historical recommended content presented to the target object by a recommendation system; step 220, acquiring evaluation data of a target object aiming at the at least one content to be evaluated; and step 230, evaluating the recommendation effect of the recommendation system according to the evaluation data.
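The three steps of method 200 can be sketched as a small pipeline. This is only a minimal sketch under stated assumptions: the `Evaluation` record, the 1 to 5 score scale, the choice of the most recent items in step 210, and the use of a mean score in step 230 are all hypothetical and are not specified by this part of the disclosure.

```python
from collections.abc import Callable
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Evaluation:
    user_id: str
    content_id: str
    score: int  # assumed scale: 1 (very dissatisfied) to 5 (very satisfied)


def determine_contents(history: dict[str, list[str]], user_id: str,
                       limit: int = 3) -> list[str]:
    """Step 210: pick historical recommended content presented to the user.

    Taking the `limit` most recent items is an assumption for illustration.
    """
    return history.get(user_id, [])[-limit:]


def acquire_evaluations(user_id: str, content_ids: list[str],
                        ask: Callable[[str, str], int]) -> list[Evaluation]:
    """Step 220: collect the user's evaluation of each content item.

    `ask` stands in for whatever interface gathers the user's feedback.
    """
    return [Evaluation(user_id, cid, ask(user_id, cid)) for cid in content_ids]


def evaluate_effect(evaluations: list[Evaluation]) -> float:
    """Step 230: aggregate the evaluation data (mean score is an assumption)."""
    if not evaluations:
        raise ValueError("no evaluation data")
    return float(mean(e.score for e in evaluations))
```

The steps are kept as separate functions so that each one can be replaced independently, for example by swapping the aggregation in step 230 without touching how the content is chosen.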
According to the embodiment of the present disclosure, the recommendation effect of the recommendation system is evaluated according to the evaluation data provided by the user (i.e., the target object) for the historical recommended content (i.e., the content to be evaluated) presented to the user by the recommendation system. Because the evaluation data is the user's real experience feedback on the historical recommended content, evaluating the recommendation effect according to the evaluation data can improve the accuracy of the evaluation result and provide clear guidance for the optimization direction of the recommendation system.
The various steps of method 200 are described in detail below.
In step 210, at least one content to be evaluated of the target object is determined, the content to be evaluated being a historical recommended content presented to the target object by the recommendation system.
In embodiments of the present disclosure, the target object refers to a real user who participates in evaluating the recommendation system and who may be selected from a plurality of users using the recommendation system. Hereinafter, unless otherwise specified, a user who uses the recommendation system is referred to as a "candidate object", and a user who is selected from the plurality of users using the recommendation system to participate in evaluating it is referred to as a "target object". It should be appreciated that there may be one or more target objects, and that the set of target objects (including the one or more target objects) is a subset of the set of candidate objects (including the plurality of candidate objects).
According to some embodiments, the method 200 may further comprise a step 240 for determining a target object, the step 240 comprising: acquiring object attributes and liveness attributes of a plurality of candidate objects using a recommendation system; and determining a target object from the plurality of candidate objects according to the object attribute and the liveness attribute.
Object attributes refer to characteristics of the candidate itself, including but not limited to the sex, age, region of the candidate, etc.
The liveness attribute refers to the liveness of the candidate object on the recommender system, which may be determined, for example, based on the duration and/or frequency of use of the recommender system by the candidate object. The longer the candidate uses the recommendation system, the higher the frequency (i.e., the more frequently it is used), the greater the value of its liveness attribute. As described above, the recommendation system is typically part of the service program to which the client application corresponds. In some embodiments, the duration and frequency of candidate use of the client application may be used as the duration and frequency of candidate use of the recommendation system.
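One way to picture a liveness attribute that grows with both usage duration and usage frequency is a weighted sum followed by bucketing. The sketch below is illustrative only: the weights, the thresholds, and the level names are assumptions, since the disclosure states only that longer and more frequent use yields a greater liveness value.

```python
def liveness_value(duration_hours: float, sessions: int,
                   duration_weight: float = 1.0,
                   frequency_weight: float = 1.0) -> float:
    """Combine usage duration and usage frequency into one value.

    The weighted sum is an assumed formula; any function that increases
    with both inputs would satisfy the description above.
    """
    if duration_hours < 0 or sessions < 0:
        raise ValueError("usage figures must be non-negative")
    return duration_weight * duration_hours + frequency_weight * sessions


def liveness_level(value: float, high: float = 20.0,
                   moderate: float = 5.0) -> str:
    """Bucket a liveness value into a level (thresholds are hypothetical)."""
    if value >= high:
        return "highly active"
    if value >= moderate:
        return "moderately active"
    return "lightly active"
```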
It should be noted that, in the embodiments of the present disclosure, the acquisition, storage, and use of the object attributes and liveness attributes of candidate objects all comply with relevant laws and regulations and do not violate public order and good customs. The object attributes and liveness attributes of candidate objects (i.e., users) are acquired, stored, and used based on the users' authorization and consent. In addition, the object attributes and liveness attributes are desensitized (i.e., anonymized) in the course of being acquired, stored, and used.
A user's (i.e., a candidate object's) experience of using the recommendation system has a significant positive correlation with the user's liveness attribute. In the embodiments of the disclosure, the target object is determined from the plurality of candidate objects according to the object attributes and liveness attributes of the candidate objects, so that the composition of the target objects more closely resembles the real user population of the recommendation system, which improves the authenticity and accuracy of the evaluation result of the recommendation system.
According to some embodiments, the target object may be determined from the plurality of candidate objects further according to the following steps 242-246: step 242, sampling the plurality of candidate objects according to the object attribute to obtain a first object set; step 244, determining a second object set based on the first object set according to the liveness attribute, wherein the distribution condition of the liveness attribute of the candidate objects in the second object set is consistent with the distribution condition of the liveness attribute of the plurality of candidate objects; and step 246, taking the candidate objects in the second object set as target objects.
According to some embodiments, step 242 further comprises: dividing the plurality of candidate objects into a plurality of object groups according to object attributes, wherein each object group comprises at least one candidate object with the same object attribute; and performing stratified (hierarchical) sampling on the candidate objects in the plurality of object groups according to the number of candidate objects included in each of the plurality of object groups.
Generally, in a stratified sampling process, the number of candidate objects to be extracted from each object group may be determined first. Specifically, the number of candidate objects extracted from the $i$-th object group is $(n_i \cdot n)/\sum_i n_i$, where $n_i$ is the number of candidate objects included in the $i$-th object group, $n$ is the total number of target objects, and $\sum_i n_i$ is the total number of candidate objects. Subsequently, the corresponding number of candidate objects is randomly extracted from each object group, i.e., $(n_i \cdot n)/\sum_i n_i$ candidate objects are randomly extracted from the $i$-th object group.
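The proportional allocation described above can be sketched as follows. Because the per-group share is generally not an integer, some rounding rule is needed; the largest-remainder rounding used here is an assumption, as the disclosure does not say how fractional shares are handled. The function names are likewise hypothetical.

```python
import random


def proportional_allocation(group_sizes: dict, n: int) -> dict:
    """Number of objects to draw from each group: about n_i * n / sum(n_i).

    Integer parts are assigned first; the remaining draws go to the groups
    with the largest remainders (assumed rounding rule) so the total is n.
    """
    total = sum(group_sizes.values())
    if total == 0 or not 0 <= n <= total:
        raise ValueError("n must be between 0 and the number of candidates")
    shares = {g: divmod(size * n, total) for g, size in group_sizes.items()}
    allocation = {g: quotient for g, (quotient, _) in shares.items()}
    leftover = n - sum(allocation.values())
    by_remainder = sorted(shares, key=lambda g: shares[g][1], reverse=True)
    for g in by_remainder[:leftover]:
        allocation[g] += 1
    return allocation


def stratified_sample(groups: dict, n: int, rng: random.Random) -> list:
    """Randomly draw the allocated number of candidates from every group."""
    allocation = proportional_allocation(
        {g: len(members) for g, members in groups.items()}, n)
    sample = []
    for g, members in groups.items():
        sample.extend(rng.sample(members, allocation[g]))
    return sample
```

For instance, with group sizes 600, 300, and 100 and n = 10, the allocation is 6, 3, and 1, which matches the formula exactly because every share is an integer.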
For example, object attributes include gender (male/female), age (teenager/young/middle-aged/elderly), and region (first-line city/second-line city/third-line city/fourth-line city/fifth-line city), and accordingly, a plurality of candidate objects may be divided into (male, teenager, first-line city), (female, teenager, first-line city), (male, young, second-line city), (female, young, second-line city), and the like, for a total of $2 \times 4 \times 5 = 40$ object groups, the object groups including $n_1, n_2, n_3, \ldots, n_{40}$ candidate objects, respectively, so that the total number of candidate objects is $N = \sum_{i=1}^{40} n_i$. Setting the total number of target objects to $n$, it is necessary to randomly extract $(n_i \cdot n)/N$ candidate objects from the $i$-th ($i = 1, 2, 3, \ldots, 40$) object group.
By dividing the plurality of candidate objects into a plurality of object groups and performing stratified sampling on the candidate objects in the plurality of object groups according to the number of candidate objects included in each object group, the object attribute composition of the target objects can be made consistent with the object attribute composition of the plurality of candidate objects.
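The division into 40 object groups in this example can be reproduced by taking the Cartesian product of the three attribute value lists. The record layout (a dict with `user_id`, `gender`, `age`, and `region` keys) and the function name are assumptions for illustration.

```python
from itertools import product

GENDERS = ("male", "female")
AGES = ("teenager", "young", "middle-aged", "elderly")
REGIONS = ("first-line city", "second-line city", "third-line city",
           "fourth-line city", "fifth-line city")

# Every (gender, age, region) combination is one object group: 2 * 4 * 5 = 40.
GROUP_KEYS = list(product(GENDERS, AGES, REGIONS))


def group_candidates(candidates: list) -> dict:
    """Map each object group key to the ids of the candidates in that group."""
    groups = {key: [] for key in GROUP_KEYS}
    for candidate in candidates:
        key = (candidate["gender"], candidate["age"], candidate["region"])
        groups[key].append(candidate["user_id"])
    return groups
```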
According to some embodiments, step 244 further comprises: determining a first liveness attribute and a second liveness attribute, wherein the proportion of the candidate objects with the first liveness attribute in the first object set is larger than the proportion of the candidate objects with the first liveness attribute in the plurality of candidate objects, and the proportion of the candidate objects with the second liveness attribute in the first object set is smaller than the proportion of the candidate objects with the second liveness attribute in the plurality of candidate objects; removing a first candidate object with a first liveness attribute from the first object set; and adding a second candidate object having a second liveness attribute to the first object set, wherein the object attribute of the second candidate object is the same as the object attribute of the first candidate object.
The first candidate object and the second candidate object may be selected randomly or according to a certain rule. The object attribute of the second candidate object is the same as that of the first candidate object, which ensures that the distribution of object attributes in the first object set remains unchanged (i.e., always consistent with the distribution of object attributes of the plurality of candidate objects) after the first candidate object is removed from the first object set and the second candidate object is added to it.
The step of removing the first candidate object from the first object set and adding the second candidate object to the first object set may be repeated multiple times until the distribution of liveness attributes of the candidate objects in the first object set is consistent with the distribution of liveness attributes of the plurality of candidate objects, thereby obtaining the second object set.
For example, liveness attributes include highly active, moderately active, and slightly active. The proportions of highly active, moderately active, and slightly active candidate objects in the first object set are 0.4, 0.3, and 0.3 respectively, and the proportions of highly active, moderately active, and slightly active candidate objects in the plurality of candidate objects are 0.2, 0.5, and 0.3 respectively. Since the proportion of highly active candidate objects in the first object set (0.4) is greater than the proportion of highly active candidate objects in the plurality of candidate objects (0.2), highly active is the first liveness attribute; since the proportion of moderately active candidate objects in the first object set (0.3) is less than the proportion of moderately active candidate objects in the plurality of candidate objects (0.5), moderately active is the second liveness attribute. Accordingly, a highly active candidate object (i.e., a first candidate object) may be randomly removed from the first object set, and the object attribute of the first candidate object may be, for example, (male, teenager, first-tier city); a moderately active candidate object (i.e., a second candidate object) whose object attribute is likewise (male, teenager, first-tier city) is then randomly selected from the candidate objects that do not belong to the first object set and added to the first object set. The step of removing a highly active first candidate object from the first object set and adding a moderately active second candidate object to it may be repeated multiple times until the distribution of liveness attributes of the candidate objects in the first object set is consistent with the distribution of liveness attributes of the plurality of candidate objects, i.e., until the proportions of highly active, moderately active, and slightly active candidate objects in the first object set are 0.2, 0.5, and 0.3 respectively. The first object set whose liveness attribute distribution is consistent with that of the plurality of candidate objects is the second object set.
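The remove-and-add loop of this example can be sketched in Python as follows. The function name `rebalance`, the `attrs` and `liveness` dictionary keys, and the conversion of target proportions into head-counts by rounding are hypothetical choices for illustration. The sketch also stops early if no attribute-preserving swap remains, a case the description does not address.

```python
import random
from collections import Counter

def rebalance(first_set, pool, target_share, seed=0):
    """Swap candidates until liveness shares match target_share (sketch).

    first_set / pool: lists of dicts with "attrs" (object-attribute tuple)
    and "liveness"; pool holds candidates outside first_set.
    target_share: liveness level -> proportion among all candidates.
    """
    rng = random.Random(seed)
    current, pool = list(first_set), list(pool)
    # Target head-count per liveness level (rounding is an assumption).
    want = {k: round(v * len(current)) for k, v in target_share.items()}
    while True:
        have = Counter(c["liveness"] for c in current)
        over = [k for k in want if have[k] > want[k]]    # first liveness attributes
        under = [k for k in want if have[k] < want[k]]   # second liveness attributes
        if not over or not under:
            return current
        # A swap is valid only if both candidates share the same object attribute.
        swaps = [(out, new) for out in current if out["liveness"] in over
                 for new in pool
                 if new["liveness"] in under and new["attrs"] == out["attrs"]]
        if not swaps:
            return current  # no attribute-preserving swap is left
        out, new = rng.choice(swaps)
        current.remove(out)
        pool.remove(new)
        current.append(new)
        pool.append(out)

# Usage: shares 0.4 / 0.3 / 0.3 in the first set, target 0.2 / 0.5 / 0.3.
attrs = ("male", "teenager", "first-tier city")
first_set = ([{"attrs": attrs, "liveness": "high"}] * 4
             + [{"attrs": attrs, "liveness": "moderate"}] * 3
             + [{"attrs": attrs, "liveness": "slight"}] * 3)
pool = [{"attrs": attrs, "liveness": "moderate"}] * 5
second_set = rebalance(first_set, pool,
                       {"high": 0.2, "moderate": 0.5, "slight": 0.3})
shares = Counter(c["liveness"] for c in second_set)
```

Each swap moves one over-represented level and one under-represented level closer to their targets, so the loop terminates. Because a swap only exchanges candidates with identical object attributes, the object-attribute distribution of the set is untouched.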
Based on the target object determined in step 240, at least one content to be evaluated of the target object may be further determined.
The content to be evaluated is historical recommended content presented to the target object by the recommendation system. From the perspective of the target object, the content to be evaluated is historical recommended content browsed by the target object.
As described above, the recommendation system may determine content that may be of interest to the target object from a plurality of stored pieces of content and present some or all of the determined recommended content to the target object. It should be appreciated that the above act of "presenting" may be performed not by the recommendation system itself, but by a display of a client device as instructed by the recommendation system.
For example, the recommendation system determines 100 pieces of recommended content for target object A from among the stored pieces of content, and sorts the 100 pieces of recommended content in descending order of the user's likely degree of interest. The target object may initiate a recommendation request through a client application in the client device to request a certain quantity of recommended content. In response to the recommendation request, the recommendation system returns a corresponding quantity of recommended content to the client device, and the recommended content is presented to the target object by a display of the client device. Fig. 3A shows a schematic diagram of an exemplary content presentation interface 300A presented on a client device in accordance with an embodiment of the present disclosure. As shown in fig. 3A, four pieces of recommended content, namely recommended content 1 to recommended content 4, are presented in interface 300A. The target object may initiate a recommendation request by interactive means such as clicking (e.g., clicking on a preset area or control in interface 300A) or sliding (e.g., sliding up in interface 300A). In response to the recommendation request, the recommendation system further returns a quantity of recommended content, e.g., recommended content 5 and recommended content 6, to the client device, and recommended content 5 and recommended content 6 are presented to the target object via the display of the client device. The presentation interface 300B of recommended content 5 and recommended content 6 is shown in fig. 3B.
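The request-and-return flow in this example can be modelled with a small sketch. The class `RecommendationFeed` and its members are hypothetical names introduced here. The point of the sketch is that only the content actually returned and presented, not the full ranked list, forms the history from which the content to be evaluated is later drawn.

```python
class RecommendationFeed:
    """Serves a ranked list of recommended content in batches (sketch)."""

    def __init__(self, ranked_contents):
        self._ranked = list(ranked_contents)  # sorted by likely interest, high to low
        self._cursor = 0
        self.presented = []                   # history of content shown to the target object

    def next_batch(self, count):
        """Handle one recommendation request for `count` pieces of content."""
        batch = self._ranked[self._cursor:self._cursor + count]
        self._cursor += len(batch)
        self.presented.extend(batch)
        return batch

# Usage: 100 ranked items; the first request returns 4, the second returns 2.
feed = RecommendationFeed([f"recommended content {i}" for i in range(1, 101)])
first = feed.next_batch(4)
second = feed.next_batch(2)
```

After the two requests, `feed.presented` holds recommended content 1 to 6, while the remaining 94 items were never shown and would not be candidates for evaluation.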
According to some embodiments, the at least one content to be evaluated of the target object may be obtained according to the following steps: acquiring a browsing history screenshot of the target object; performing optical character recognition (OCR) on the browsing history screenshot to determine recommended content included in the browsing history screenshot; and taking the recommended content as the at least one content to be evaluated.
The browsing history screenshot of the target object may be, for example, a screenshot captured by the target object while browsing content. Typically, the browsing history screenshot includes only the title of each piece of recommended content and the background image of the title. According to some embodiments, after the titles of the recommended content in the browsing history screenshot are recognized through OCR, information such as the author, field, body text, and accompanying images of each piece of recommended content may be further obtained from a database according to its title, so as to provide a reference for the subsequent evaluation of the recommendation system.
According to other embodiments, the at least one content to be evaluated of the target object may also be obtained according to the following steps: obtaining an access log of a target object; and taking the recommended content in the access log as the at least one content to be evaluated.
The access log may be, for example, an access log of the target object to the recommender system, or an access log of the target object to a client application served by the recommender system. The access log records the content once browsed by the target object. By analyzing the access log, recommended contents which are browsed by the target object can be determined, and the recommended contents are used as at least one content to be evaluated of the target object.
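Extracting the browsed recommended content from an access log might look like the sketch below. The log format is purely an assumption for illustration (tab-separated timestamp, user identifier, event name, and content identifier, with a hypothetical `recommend_view` event). The description does not specify any log layout.

```python
def contents_to_evaluate(log_lines, user_id, event_name="recommend_view"):
    """Return the recommended content a user browsed, in first-seen order.

    Assumes a hypothetical tab-separated format:
    timestamp <TAB> user_id <TAB> event <TAB> content_id
    """
    seen, result = set(), []
    for line in log_lines:
        parts = line.rstrip("\n").split("\t")
        if len(parts) != 4:
            continue  # skip malformed lines
        _timestamp, uid, event, content_id = parts
        if uid == user_id and event == event_name and content_id not in seen:
            seen.add(content_id)
            result.append(content_id)
    return result

# Usage with a tiny in-memory log.
log = [
    "2021-06-01T10:00:00\tu1\trecommend_view\tc1",
    "2021-06-01T10:00:05\tu2\trecommend_view\tc9",
    "2021-06-01T10:00:09\tu1\tsearch\tc7",
    "2021-06-01T10:01:00\tu1\trecommend_view\tc2",
    "2021-06-01T10:02:00\tu1\trecommend_view\tc1",
]
to_evaluate = contents_to_evaluate(log, "u1")
```

Duplicates are dropped so that each piece of content is evaluated once, and events that do not correspond to recommended content (here, `search`) are ignored.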
It should be noted that, in the above embodiment, the access log of the target object is acquired, stored, and used based on the authorization and consent of the user. The acquisition, storage, use, and other handling of the access log of the target object all comply with the relevant laws and regulations and do not violate public order and good morals.
In step 220, evaluation data of the target object for the at least one content to be evaluated may be acquired.
According to some embodiments, step 220 further comprises: providing a data interface for evaluating the at least one content to be evaluated to a target object; and acquiring the evaluation data of the target object aiming at the at least one content to be evaluated through the data interface. In this embodiment, the data interface is generated for the content to be evaluated, and may be used to accurately obtain the evaluation data of the content to be evaluated by the target object, without obtaining the evaluation of the target object on other unrelated contents (such as non-recommended content, recommended content not presented to the target object, etc.), so as to improve the efficiency of obtaining the evaluation data, and further improve the efficiency of evaluating the recommendation system.
According to some embodiments, the evaluation data includes, but is not limited to, the satisfaction of the target object with each of the at least one content to be evaluated and reason information corresponding to the satisfaction. The satisfaction of the target object with the content to be evaluated may be represented by a set of preset numerical scores (e.g., integers 1-5), where a higher satisfaction of the target object with the content to be evaluated corresponds to a greater numerical score.
Fig. 4A shows a schematic diagram of an exemplary data interface 400A according to an embodiment of the present disclosure, where the data interface 400A is used to obtain evaluation data of the target object for content 1 to be evaluated.
As shown in fig. 4A, the top of the data interface 400A shows basic information of the content 1 to be evaluated, including a title 402 of the content 1 to be evaluated and accompanying images 404, 406, and 408 in the content 1 to be evaluated. The text box 410 is used to obtain a verification code, i.e., a CAPTCHA (Completely Automated Public Turing test to tell Computers and Humans Apart), in order to distinguish whether the target object is a real user or a computer program. The verification code may specifically be the first three words of the title of the content to be evaluated.
The evaluation data that the data interface 400A can acquire includes the satisfaction of the target object with the content 1 to be evaluated (corresponding to question 1 in fig. 4A) and reason information corresponding to the satisfaction (corresponding to questions 2 and 3 in fig. 4A).
For question 1, the data interface 400A shows a plurality of satisfaction options, and the target object may submit its satisfaction with the content 1 to be evaluated by selecting the radio button 412 corresponding to a satisfaction option. For question 2, the data interface 400A shows a plurality of options for reasons for satisfaction, and the target object may submit the reasons why it is satisfied with the content 1 to be evaluated by checking the check box 414 corresponding to a reason option. For question 3, the data interface 400A shows a plurality of options for reasons for dissatisfaction, and the target object may submit the reasons why it is dissatisfied with the content 1 to be evaluated by checking the check box 416 corresponding to a reason option.
Fig. 4B shows a schematic diagram of another exemplary data interface 400B according to an embodiment of the present disclosure, where the data interface 400B is configured to further obtain, in addition to the data interface 400A, evaluation data of the target object on other aspects of the content 1 to be evaluated. The data interface 400B and the data interface 400A may be the same data interface, i.e., they belong to the same page. For example, the target object may reach the data interface 400B shown in fig. 4B by performing a slide-up operation in the data interface 400A shown in fig. 4A.
As shown in fig. 4B, similarly to fig. 4A, the top of the data interface 400B shows the title 402 of the content 1 to be evaluated, the accompanying images 404, 406, and 408, and the text box 410 for acquiring a verification code.
The evaluation data that the data interface 400B can acquire includes the type that the target object considers the content 1 to be evaluated to have (corresponding to question 4 in fig. 4B), the domain to which the target object considers it to belong (corresponding to question 5 in fig. 4B), and the frequency with which the target object browses similar content (corresponding to question 6 in fig. 4B).
For question 4, the data interface 400B shows a plurality of type options, and the target object may submit the type it considers the content 1 to be evaluated to have by selecting the radio button 418 corresponding to a type option. For question 5, the data interface 400B shows two drop-down boxes 420, and the target object may select the domain to which it considers the content 1 to be evaluated to belong by clicking on the drop-down boxes 420. For question 6, the data interface 400B shows a plurality of frequency options, and the target object may submit the frequency with which it sees content identical or similar to the content 1 to be evaluated by selecting the radio button 422 corresponding to a frequency option.
In step 230, the recommendation effect of the recommendation system may be evaluated according to the evaluation data acquired in step 220.
The evaluation data acquired in step 220 may be abnormal, for example, the evaluation data may be randomly input by the target object, and such evaluation data cannot express the true feeling of the target object to the content to be evaluated. According to some embodiments, abnormal data in the evaluation data can be determined according to a preset abnormal judgment condition, the abnormal data is removed, and the recommendation effect of the recommendation system is evaluated according to the rest of the evaluation data, so that the accuracy of the evaluation result of the recommendation system is ensured.
The preset abnormality judgment condition may be, for example, whether the time taken by the target object to input the evaluation data is abnormal, whether the target object's evaluations of all the contents to be evaluated are the same, whether the evaluation data contains a logic error, or the like. If the target object takes too short a time (e.g., shorter than a preset time threshold) to input the evaluation data, or gives the same evaluation for all the contents to be evaluated (e.g., the same satisfaction or the same reason information for all the contents to be evaluated), or the evaluation data is clearly erroneous (e.g., the true type of a certain content to be evaluated is "image-text", but the type selected by the target object is "short video"), the evaluation data is judged to be abnormal data.
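The three example conditions can be sketched as a simple filter. The `Evaluation` record, the three-second threshold, and the choice to reject all of a target object's evaluation data when any condition is met are assumptions made for illustration. The description leaves the threshold and the granularity of rejection open.

```python
from dataclasses import dataclass
from typing import Dict, List

@dataclass
class Evaluation:
    content_id: str
    satisfaction: int     # preset numerical score, e.g. an integer 1-5
    selected_type: str    # content type chosen by the target object
    seconds_spent: float  # time taken to input this evaluation

def is_abnormal(evals: List[Evaluation], true_types: Dict[str, str],
                min_seconds: float = 3.0) -> bool:
    """Apply the three example abnormality conditions (illustrative sketch)."""
    if not evals:
        return False
    too_fast = any(e.seconds_spent < min_seconds for e in evals)
    all_same = len(evals) > 1 and len({e.satisfaction for e in evals}) == 1
    wrong_type = any(
        e.content_id in true_types and true_types[e.content_id] != e.selected_type
        for e in evals
    )
    return too_fast or all_same or wrong_type

def clean(data: Dict[str, List[Evaluation]], true_types: Dict[str, str]):
    """Keep only target objects whose evaluation data is not abnormal."""
    return {obj: evals for obj, evals in data.items()
            if not is_abnormal(evals, true_types)}

# Usage: only the "careful" target object survives cleaning.
true_types = {"c1": "image-text", "c2": "short video"}
data = {
    "careful": [Evaluation("c1", 4, "image-text", 12.0),
                Evaluation("c2", 2, "short video", 9.5)],
    "rushed": [Evaluation("c1", 5, "image-text", 0.8),
               Evaluation("c2", 3, "short video", 1.1)],
    "uniform": [Evaluation("c1", 3, "image-text", 10.0),
                Evaluation("c2", 3, "short video", 10.0)],
    "mismatch": [Evaluation("c1", 4, "short video", 10.0),
                 Evaluation("c2", 2, "short video", 10.0)],
}
kept = clean(data, true_types)
```

The "same evaluation" check here compares only satisfaction scores. Comparing reason information as well would follow the same pattern.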
After the abnormal data in the evaluation data is removed, the recommendation effect of the recommendation system can be evaluated according to the remaining evaluation data. According to some embodiments, the overall satisfaction degree of each target object on the recommendation system can be respectively determined according to the corresponding evaluation data; and determining the recommendation effect of the recommendation system according to the overall satisfaction degree of each target object.
For example, the recommendation effect S of the recommendation system may be calculated according to the following steps:
First, the overall satisfaction $s_i$ of target object $i$ with the recommendation system is calculated from its evaluation data, where $m_i$ is the number of contents to be evaluated for target object $i$, and $c_{ij}$ is the satisfaction of target object $i$ with the $j$-th content to be evaluated.
Then, the recommendation effect $S$ of the recommendation system is calculated from the overall satisfaction of each target object with the recommendation system, where $n$ is the number of target objects.
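The two steps above can be illustrated with a short sketch. The exact expressions are not spelled out in this text, so the sketch assumes the most natural reading of the definitions: $s_i$ is the arithmetic mean of the $c_{ij}$ over the $m_i$ contents evaluated by target object $i$, and $S$ is the arithmetic mean of the $s_i$ over the $n$ target objects. Both function names are hypothetical.

```python
def overall_satisfaction(scores):
    """s_i: mean satisfaction of one target object over its m_i contents
    (the use of an arithmetic mean is an assumption)."""
    return sum(scores) / len(scores)

def recommendation_effect(scores_by_object):
    """S: mean of the per-object overall satisfaction over n target objects
    (again assuming an arithmetic mean)."""
    per_object = [overall_satisfaction(s) for s in scores_by_object.values() if s]
    return sum(per_object) / len(per_object)

# Usage: target object "a" rated two contents, "b" rated one.
scores_by_object = {"a": [5, 3], "b": [2]}
effect = recommendation_effect(scores_by_object)
```

Averaging per target object first gives each target object equal weight regardless of how many contents it evaluated. Pooling all scores into a single mean would instead weight heavy evaluators more.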
According to some embodiments, the recommendation effect of the recommendation system on different types of content may be evaluated separately according to the evaluation data of a plurality of target objects for each content to be evaluated. For example, the recommendation effect of the recommendation system on content types such as image-text, short video, live broadcast, and advertisement is evaluated separately, so that content types with a poor recommendation effect can subsequently be optimized in a targeted manner.
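A per-type breakdown can be sketched by grouping satisfaction scores by content type. The function name, the `(content_type, satisfaction)` tuple layout, and the use of a mean as the per-type measure are assumptions for illustration.

```python
from collections import defaultdict

def effect_by_content_type(evaluations):
    """Mean satisfaction per content type (illustrative sketch).

    evaluations: iterable of (content_type, satisfaction) pairs pooled
    from the evaluation data of all target objects.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for content_type, satisfaction in evaluations:
        sums[content_type] += satisfaction
        counts[content_type] += 1
    return {t: sums[t] / counts[t] for t in sums}

# Usage: find the content type with the weakest recommendation effect.
evaluations = [("image-text", 4), ("image-text", 5), ("short video", 2),
               ("advertisement", 1), ("short video", 3)]
by_type = effect_by_content_type(evaluations)
worst = min(by_type, key=by_type.get)
```

Here the lowest-scoring type is `advertisement`, which would be the first candidate for targeted optimization.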
Fig. 5 shows a block diagram of an apparatus 500 for evaluating a recommendation system, according to an embodiment of the present disclosure. As shown in fig. 5, the apparatus 500 includes a content determination module 510, a data acquisition module 520, and an evaluation module 530.
The content determination module 510 may be configured to determine at least one content to be evaluated of a target object, the at least one content to be evaluated being historical recommended content presented to the target object by the recommendation system.
The data acquisition module 520 may be configured to acquire evaluation data of the target object for the at least one content to be evaluated.
The evaluation module 530 may be configured to evaluate a recommendation effect of the recommendation system according to the evaluation data.
According to the embodiment of the present disclosure, the recommendation effect of the recommendation system is evaluated according to the evaluation data of the user (i.e., the target object) on the historical recommended content (i.e., the content to be evaluated) presented to the user by the recommendation system. The evaluation data is the user's real experience feedback on the historical recommended content. Evaluating the recommendation effect of the recommendation system according to this evaluation data can improve the accuracy of the evaluation result and provide clear guidance on the direction in which the recommendation system should be optimized.
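The way the three modules cooperate can be sketched as a small pipeline. The class `EvaluationApparatus`, its field names, and the callable signatures are hypothetical. The stand-in callables in the usage example only show how data flows from content determination through data acquisition to evaluation.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List

@dataclass
class EvaluationApparatus:
    """Wires three pluggable modules together (illustrative sketch)."""
    determine_content: Callable[[str], List[str]]             # role of module 510
    acquire_data: Callable[[str, List[str]], Dict[str, int]]  # role of module 520
    evaluate: Callable[[Dict[str, Dict[str, int]]], float]    # role of module 530

    def run(self, target_objects: Iterable[str]) -> float:
        data = {}
        for obj in target_objects:
            contents = self.determine_content(obj)
            data[obj] = self.acquire_data(obj, contents)
        return self.evaluate(data)

# Usage with stand-in callables; real modules would read logs, show the
# data interface, and apply the chosen evaluation formula.
history = {"u1": ["c1", "c2"], "u2": ["c3"]}
answers = {("u1", "c1"): 5, ("u1", "c2"): 3, ("u2", "c3"): 2}

def mean_of_means(data):
    per_object = [sum(s.values()) / len(s) for s in data.values()]
    return sum(per_object) / len(per_object)

apparatus = EvaluationApparatus(
    determine_content=lambda obj: history[obj],
    acquire_data=lambda obj, contents: {c: answers[(obj, c)] for c in contents},
    evaluate=mean_of_means,
)
result = apparatus.run(["u1", "u2"])
```

Passing the modules in as callables mirrors the later remark that module boundaries are flexible: two of them could be merged, or one split, without changing `run`.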
According to some embodiments, the apparatus 500 further comprises an object determination module. The object determination module further includes an attribute acquisition unit and an object determination unit, wherein the attribute acquisition unit may be configured to acquire object attributes and liveness attributes of a plurality of candidate objects using the recommendation system; the object determination unit may be configured to determine the target object from the plurality of candidate objects according to the object attribute and the liveness attribute.
According to some embodiments, the liveness attribute is determined according to a duration and/or frequency with which candidate objects use the recommendation system.
According to some embodiments, the object determination unit may be further configured to: sampling the plurality of candidate objects according to the object attributes to obtain a first object set; determining a second object set based on the first object set according to the liveness attribute, wherein the distribution of liveness attributes of the candidate objects in the second object set is consistent with the distribution of liveness attributes of the plurality of candidate objects; and taking the candidate objects in the second object set as the target objects.
According to some embodiments, the object determination unit may be further configured to: dividing the plurality of candidate objects into a plurality of object groups according to the object attributes, wherein each object group comprises at least one candidate object with the same object attribute; and performing stratified sampling on the candidate objects in the plurality of object groups according to the number of candidate objects included in each object group in the plurality of object groups.
According to some embodiments, the object determination unit may be further configured to: determining a first liveness attribute and a second liveness attribute, wherein the proportion of the candidate objects with the first liveness attribute in the first object set is larger than the proportion of the candidate objects with the first liveness attribute in the plurality of candidate objects, and the proportion of the candidate objects with the second liveness attribute in the first object set is smaller than the proportion of the candidate objects with the second liveness attribute in the plurality of candidate objects; removing a first candidate object with the first liveness attribute from the first object set; and adding a second candidate object having the second liveness attribute to the first object set, wherein the object attribute of the second candidate object is the same as the object attribute of the first candidate object.
According to some embodiments, the content determination module 510 may be further configured to: acquiring a browsing history screenshot of a target object; performing optical character recognition on the browsing history screenshot to determine recommended content included in the browsing history screenshot; and taking the recommended content as the at least one content to be evaluated.
According to some embodiments, the content determination module 510 may be further configured to: obtaining an access log of the target object; and taking the recommended content in the access log as the at least one content to be evaluated.
According to some embodiments, the data acquisition module 520 may be further configured to: providing a data interface for evaluating the at least one content to be evaluated to the target object; and acquiring the evaluation data of the target object aiming at the at least one content to be evaluated through the data interface.
According to some embodiments, the evaluation data includes satisfaction of the target object with each of the at least one content to be evaluated and reason information corresponding to the satisfaction.
According to some embodiments, there are a plurality of target objects, and the evaluation module 530 may be further configured to: determining the overall satisfaction of each target object with the recommendation system according to the corresponding evaluation data; and determining the recommendation effect of the recommendation system according to the overall satisfaction of each target object.
According to some embodiments, the apparatus 500 further comprises a data cleansing module. The data cleansing module may be configured to: determining abnormal data in the evaluation data according to preset abnormal judgment conditions; and rejecting the abnormal data.
It should be appreciated that the various modules of the apparatus 500 shown in fig. 5 may correspond to the various steps in the method 200 described with reference to fig. 2. Thus, the operations, features, and advantages described above with respect to method 200 are equally applicable to apparatus 500 and the modules that it comprises. For brevity, certain operations, features and advantages are not described in detail herein.
Although specific functions are discussed above with reference to specific modules, it should be noted that the functions of the various modules discussed herein may be divided into multiple modules and/or at least some of the functions of the multiple modules may be combined into a single module. For example, the content determination module 510 and the data acquisition module 520 described above may be combined into a single module in some embodiments.
It should also be appreciated that various techniques may be described herein in the general context of software, hardware elements, or program modules. The various modules described above with respect to fig. 5 may be implemented in hardware or in hardware in combination with software and/or firmware. For example, the modules may be implemented as computer program code/instructions configured to be executed in one or more processors and stored in a computer-readable storage medium. Alternatively, these modules may be implemented as hardware logic/circuitry. For example, in some embodiments, one or more of the content determination module 510, the data acquisition module 520, and the evaluation module 530 may be implemented together in a System on Chip (SoC). The SoC may include an integrated circuit chip including one or more components of a processor (e.g., a central processing unit (Central Processing Unit, CPU), microcontroller, microprocessor, digital signal processor (Digital Signal Processor, DSP), etc.), memory, one or more communication interfaces, and/or other circuitry, and may optionally execute received program code and/or include embedded firmware to perform functions.
According to embodiments of the present disclosure, there is also provided an electronic device, a readable storage medium and a computer program product.
Referring to fig. 6, a block diagram of an electronic device 600 that may serve as a server or a client of the present disclosure will now be described. The electronic device 600 is an example of a hardware device that may be applied to aspects of the present disclosure. Electronic devices are intended to represent various forms of digital electronic computer devices, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other suitable computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular telephones, smartphones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be exemplary only, and are not meant to limit implementations of the disclosure described and/or claimed herein.
As shown in fig. 6, the device 600 includes a computing unit 601 that can perform various appropriate actions and processes according to a computer program stored in a Read Only Memory (ROM) 602 or a computer program loaded from a storage unit 608 into a Random Access Memory (RAM) 603. In the RAM 603, various programs and data required for the operation of the device 600 may also be stored. The computing unit 601, ROM 602, and RAM 603 are connected to each other by a bus 604. An input/output (I/O) interface 605 is also connected to bus 604.
Various components in the device 600 are connected to the I/O interface 605, including: an input unit 606, an output unit 607, a storage unit 608, and a communication unit 609. The input unit 606 may be any type of device capable of inputting information to the device 600. The input unit 606 may receive input numeric or character information and generate key signal inputs related to user settings and/or function control of the electronic device, and may include, but is not limited to, a mouse, a keyboard, a touch screen, a trackpad, a trackball, a joystick, a microphone, and/or a remote control. The output unit 607 may be any type of device capable of presenting information and may include, but is not limited to, a display, speakers, video/audio output terminals, vibrators, and/or printers. The storage unit 608 may include, but is not limited to, magnetic disks and optical disks. The communication unit 609 allows the device 600 to exchange information/data with other devices via a computer network, such as the internet, and/or various telecommunications networks, and may include, but is not limited to, a modem, a network card, an infrared communication device, a wireless communication transceiver, and/or a chipset, such as Bluetooth™ devices, 802.11 devices, WiFi devices, WiMax devices, cellular communication devices, and/or the like.
The computing unit 601 may be a variety of general and/or special purpose processing components having processing and computing capabilities. Some examples of computing unit 601 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various specialized Artificial Intelligence (AI) computing chips, various computing units running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, microcontroller, etc. The computing unit 601 performs the various methods and processes described above, such as the method 200 described above. For example, in some embodiments, the method 200 may be implemented as a computer software program tangibly embodied on a machine-readable medium, such as the storage unit 608. In some embodiments, part or all of the computer program may be loaded and/or installed onto the device 600 via the ROM 602 and/or the communication unit 609. One or more of the steps of the method 200 described above may be performed when a computer program is loaded into RAM 603 and executed by the computing unit 601. Alternatively, in other embodiments, computing unit 601 may be configured to perform method 200 by any other suitable means (e.g., by means of firmware).
Various implementations of the systems and techniques described above may be implemented in digital electronic circuitry, integrated circuit systems, field programmable gate arrays (FPGAs), application specific integrated circuits (ASICs), application specific standard products (ASSPs), systems on chip (SoCs), complex programmable logic devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include implementation in one or more computer programs that may be executed and/or interpreted on a programmable system including at least one programmable processor, which may be a special purpose or general-purpose programmable processor, that may receive data and instructions from, and transmit data and instructions to, a storage system, at least one input device, and at least one output device.
Program code for carrying out methods of the present disclosure may be written in any combination of one or more programming languages. These program code may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus such that the program code, when executed by the processor or controller, causes the functions/operations specified in the flowchart and/or block diagram to be implemented. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package, partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. The machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and pointing device (e.g., a mouse or trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic input, speech input, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include local area networks (LANs), wide area networks (WANs), and the Internet.
The computer system may include a client and a server. The client and server are typically remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.
It should be appreciated that steps may be reordered, added, or deleted using the various forms of flows shown above. For example, the steps recited in the present disclosure may be performed in parallel, sequentially, or in a different order, provided that the desired results of the technical solutions of the present disclosure are achieved; no limitation is imposed herein.
Although embodiments or examples of the present disclosure have been described with reference to the accompanying drawings, it is to be understood that the foregoing methods, systems, and apparatus are merely illustrative embodiments or examples, and that the scope of the present disclosure is not limited by these embodiments or examples but only by the granted claims and their equivalents. Various elements of the embodiments or examples may be omitted or replaced with equivalent elements. Furthermore, the steps may be performed in an order different from that described in the present disclosure. Further, various elements of the embodiments or examples may be combined in various ways. Importantly, as technology evolves, many of the elements described herein may be replaced by equivalent elements that appear after the present disclosure.

Claims (10)

1. A method for evaluating a recommendation system, comprising:
acquiring object attributes and liveness attributes of a plurality of candidate objects that use the recommendation system, wherein the liveness attributes are determined according to the duration and/or frequency with which the candidate objects use the recommendation system;
dividing the plurality of candidate objects into a plurality of object groups according to the object attributes, wherein each object group comprises at least one candidate object with the same object attribute;
performing stratified sampling on the candidate objects in the plurality of object groups according to the number of candidate objects included in each object group of the plurality of object groups, to obtain a first object set;
determining a second object set based on the first object set according to the liveness attribute, wherein the distribution condition of the liveness attribute of the candidate objects in the second object set is consistent with the distribution condition of the liveness attribute of the plurality of candidate objects, and determining the second object set based on the first object set comprises:
determining a first liveness attribute and a second liveness attribute, wherein the proportion of the candidate objects with the first liveness attribute in the first object set is larger than the proportion of the candidate objects with the first liveness attribute in the plurality of candidate objects, and the proportion of the candidate objects with the second liveness attribute in the first object set is smaller than the proportion of the candidate objects with the second liveness attribute in the plurality of candidate objects;
removing a first candidate object with the first liveness attribute from the first object set; and
adding a second candidate object with the second liveness attribute to the first object set, wherein the object attribute of the second candidate object is the same as the object attribute of the first candidate object, and the steps of removing the first candidate object from the first object set and adding the second candidate object to the first object set are performed repeatedly until the distribution of the liveness attribute of the candidate objects in the first object set is consistent with the distribution of the liveness attribute of the plurality of candidate objects, so as to obtain the second object set;
taking candidate objects in the second object set as target objects;
determining at least one content to be evaluated of the target object, wherein the at least one content to be evaluated is historical recommended content presented to the target object by the recommendation system;
acquiring evaluation data of the target object for the at least one content to be evaluated; and
evaluating the recommendation effect of the recommendation system according to the evaluation data.
2. The method of claim 1, wherein the determining at least one content to be evaluated of the target object comprises:
acquiring a browsing history screenshot of a target object;
performing optical character recognition on the browsing history screenshot to determine recommended content included in the browsing history screenshot; and
taking the recommended content as the at least one content to be evaluated.
3. The method of claim 1, wherein the determining at least one content to be evaluated of the target object comprises:
obtaining an access log of the target object; and
taking the recommended content in the access log as the at least one content to be evaluated.
4. The method according to any one of claims 1-3, wherein the acquiring evaluation data of the target object for the at least one content to be evaluated comprises:
providing, to the target object, a data interface for evaluating the at least one content to be evaluated; and
acquiring, through the data interface, the evaluation data of the target object for the at least one content to be evaluated.
5. The method according to any one of claims 1-3, wherein the evaluation data includes satisfaction of the target object with each of the at least one content to be evaluated and reason information corresponding to the satisfaction.
6. The method of claim 5, wherein there are a plurality of target objects, and the evaluating the recommendation effect of the recommendation system according to the evaluation data comprises:
determining, according to the corresponding evaluation data, the overall satisfaction of each target object with the recommendation system; and
determining the recommendation effect of the recommendation system according to the overall satisfaction of each target object.
7. The method according to any one of claims 1-3, further comprising:
determining abnormal data in the evaluation data according to a preset abnormality judgment condition; and
eliminating the abnormal data.
8. An apparatus for evaluating a recommendation system, comprising:
an attribute obtaining unit configured to obtain object attributes and liveness attributes of a plurality of candidate objects that use the recommendation system, wherein the liveness attributes are determined according to the duration and/or frequency with which the candidate objects use the recommendation system;
an object determination unit configured to:
divide the plurality of candidate objects into a plurality of object groups according to the object attributes, wherein each object group comprises at least one candidate object with the same object attribute;
perform stratified sampling on the candidate objects in the plurality of object groups according to the number of candidate objects included in each object group of the plurality of object groups, to obtain a first object set;
determine a second object set based on the first object set according to the liveness attribute, wherein the distribution of the liveness attribute of the candidate objects in the second object set is consistent with the distribution of the liveness attribute of the plurality of candidate objects, and determining the second object set based on the first object set comprises:
determining a first liveness attribute and a second liveness attribute, wherein the proportion of the candidate objects with the first liveness attribute in the first object set is larger than the proportion of the candidate objects with the first liveness attribute in the plurality of candidate objects, and the proportion of the candidate objects with the second liveness attribute in the first object set is smaller than the proportion of the candidate objects with the second liveness attribute in the plurality of candidate objects;
removing a first candidate object with the first liveness attribute from the first object set; and
adding a second candidate object with the second liveness attribute to the first object set, wherein the object attribute of the second candidate object is the same as the object attribute of the first candidate object, and the steps of removing the first candidate object from the first object set and adding the second candidate object to the first object set are performed repeatedly until the distribution of the liveness attribute of the candidate objects in the first object set is consistent with the distribution of the liveness attribute of the plurality of candidate objects, so as to obtain the second object set; and
take candidate objects in the second object set as target objects;
a content determination module configured to determine at least one content to be evaluated of a target object, the at least one content to be evaluated being a historical recommended content presented to the target object by the recommendation system;
a data acquisition module configured to acquire evaluation data of the target object for the at least one content to be evaluated; and
an evaluation module configured to evaluate the recommendation effect of the recommendation system according to the evaluation data.
9. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor, wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-7.
10. A non-transitory computer-readable storage medium storing computer instructions for causing a computer to perform the method of any one of claims 1-7.
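
By way of illustration only, the sampling steps recited in claim 1 can be sketched as follows. This is a minimal sketch under stated assumptions, not the patented implementation: the dictionary keys `id`, `object_attr`, and `liveness`, the function names, and the parameters `tolerance` and `max_swaps` are all hypothetical. The claims do not say how to decide that two distributions are "consistent", so the sketch assumes that every per-category proportion must lie within `tolerance` of the population proportion, and it assumes proportional allocation for the stratified sample.

```python
import random
from collections import Counter, defaultdict


def proportions(items):
    """Share of each liveness value among the given candidates."""
    counts = Counter(c["liveness"] for c in items)
    return {k: v / len(items) for k, v in counts.items()}


def stratified_sample(candidates, sample_size, rng):
    """First object set: sample each object group in proportion to its size."""
    groups = defaultdict(list)
    for c in candidates:
        groups[c["object_attr"]].append(c)
    sample = []
    for members in groups.values():
        # Assumption: proportional allocation, at least one object per group.
        quota = max(1, round(sample_size * len(members) / len(candidates)))
        sample.extend(rng.sample(members, min(quota, len(members))))
    return sample


def rebalance(sample, candidates, rng, tolerance=0.05, max_swaps=10_000):
    """Second object set: swap objects until liveness shares match the population."""
    target = proportions(candidates)
    current = list(sample)
    chosen = {c["id"] for c in current}
    for _ in range(max_swaps):
        now = proportions(current)
        gaps = {k: now.get(k, 0.0) - p for k, p in target.items()}
        if max(abs(g) for g in gaps.values()) <= tolerance:
            break  # assumed meaning of "consistent"
        over = max(gaps, key=gaps.get)   # first liveness attribute (over-represented)
        under = min(gaps, key=gaps.get)  # second liveness attribute (under-represented)
        swap = None
        for out in (c for c in current if c["liveness"] == over):
            # The replacement must share the removed object's object attribute.
            pool = [c for c in candidates
                    if c["id"] not in chosen
                    and c["liveness"] == under
                    and c["object_attr"] == out["object_attr"]]
            if pool:
                swap = (out, rng.choice(pool))
                break
        if swap is None:
            break  # no admissible replacement remains
        out, new = swap
        current.remove(out)
        chosen.discard(out["id"])
        current.append(new)
        chosen.add(new["id"])
    return current
```

Because each swap replaces an object with another object having the same object attribute, the per-group counts produced by the stratified sampling are left unchanged while the liveness shares move toward those of the full candidate population. A call such as `rebalance(stratified_sample(candidates, 100, rng), candidates, rng)` with `rng = random.Random(0)` would yield the target objects under these assumptions.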

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110662575.XA CN113312554B (en) 2021-06-15 2021-06-15 Method and device for evaluating recommendation system, electronic equipment and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110662575.XA CN113312554B (en) 2021-06-15 2021-06-15 Method and device for evaluating recommendation system, electronic equipment and medium

Publications (2)

Publication Number Publication Date
CN113312554A CN113312554A (en) 2021-08-27
CN113312554B CN113312554B (en) 2023-11-03

Family

ID=77378808

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110662575.XA Active CN113312554B (en) 2021-06-15 2021-06-15 Method and device for evaluating recommendation system, electronic equipment and medium

Country Status (1)

Country Link
CN (1) CN113312554B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116402249B (en) * 2023-03-06 2024-02-23 贝壳找房(北京)科技有限公司 Recommendation system overflow effect evaluation method, recommendation system overflow effect evaluation equipment and storage medium

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107729542A (en) * 2017-10-31 2018-02-23 咪咕音乐有限公司 A kind of information methods of marking and device and storage medium
WO2018130220A1 (en) * 2017-01-16 2018-07-19 广州市动景计算机科技有限公司 Message pushing method and device, and programmable device
CN108319611A (en) * 2017-01-17 2018-07-24 腾讯科技(深圳)有限公司 The methods of sampling and sampling apparatus
CN108804506A (en) * 2018-04-13 2018-11-13 北京猫眼文化传媒有限公司 A kind of the recommendation method, apparatus and electronic equipment of information
CN110297975A (en) * 2019-06-26 2019-10-01 北京百度网讯科技有限公司 Appraisal procedure, device, electronic equipment and the storage medium of Generalization bounds
CN112817870A (en) * 2021-02-26 2021-05-18 北京小米移动软件有限公司 Software testing method, device and medium
CN112925978A (en) * 2021-02-26 2021-06-08 北京百度网讯科技有限公司 Recommendation system evaluation method and device, electronic equipment and storage medium

Also Published As

Publication number Publication date
CN113312554A (en) 2021-08-27

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant