WO2021134944A1 - Mobile news client-based evaluation method and system therefor - Google Patents

Info

Publication number
WO2021134944A1
WO2021134944A1 PCT/CN2020/082285 CN2020082285W WO2021134944A1 WO 2021134944 A1 WO2021134944 A1 WO 2021134944A1 CN 2020082285 W CN2020082285 W CN 2020082285W WO 2021134944 A1 WO2021134944 A1 WO 2021134944A1
Authority
WO
WIPO (PCT)
Prior art keywords
content
evaluation
recommended
news
recommended content
Prior art date
Application number
PCT/CN2020/082285
Other languages
French (fr)
Chinese (zh)
Inventor
张丹
张涛
石霖
董晓飞
曹峰
孙明俊
Original Assignee
南京新一代人工智能研究院有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 南京新一代人工智能研究院有限公司
Priority to AU2020335019A (AU2020335019B2)
Publication of WO2021134944A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0639 Performance analysis of employees; Performance analysis of enterprise or organisation operations
    • G06Q10/06393 Score-carding, benchmarking or key performance indicator [KPI] analysis

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Human Resources & Organizations (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Educational Administration (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Strategic Management (AREA)
  • Physics & Mathematics (AREA)
  • Development Economics (AREA)
  • General Physics & Mathematics (AREA)
  • Economics (AREA)
  • Data Mining & Analysis (AREA)
  • Game Theory and Decision Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Marketing (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

The present invention relates to the field of Internet news information operation and service platforms, and discloses a mobile news client-based evaluation method and a system therefor. The method comprises: step 1: acquiring recommended content: simulating a user portrait to request data from a server, collecting the pushed content returned by the server, and storing it in a database; step 2: evaluating the recommended content along two dimensions, quality and timeliness; and step 3: feeding back a final evaluation result. Compared to the prior art, the present invention acquires personalized recommended content from the app side of the mobile news client, so that content recommended to different users can be collected in real time by simulating different user portraits; and it evaluates the credibility of content along the two dimensions of timeliness and quality, which reflects well the timeliness of news information, the coverage of information, and the quality of content.

Description

Evaluation method and system based on a mobile news client
Technical field
The invention belongs to the field of Internet news information operation and service platforms, and specifically relates to an evaluation method and system based on a mobile news client.
Background
In recent years, with the development of artificial intelligence technology, personalized recommendation algorithms have come into wide use in scenarios such as news push, live streaming, short-video push, and product recommendation. In news recommendation in particular, rich information resources, real-time push, and convenient social interaction are valued by a growing number of users. How to better evaluate a news recommendation system has therefore become a new research topic.
Existing technical solutions are generally as follows:
Content acquisition method: create as many threads as there are news clients to be collected, obtain the number of cores of the system's central processing unit (CPU), and bind each thread to a CPU core according to preset rules; store the data from all preset parsing queues on the same CPU core in the corresponding preset output queue; when an output instruction is received, transfer the data in the preset output queue to a preset database, so that news client data can be supervised on the basis of the data in that database.
Content evaluation method: the evaluation indicators of a recommendation system generally cover several dimensions: user experience, algorithm accuracy, and business goals.
This approach raises a number of problems, for example:
In content acquisition, directly fetching the column content of each news client makes it difficult to evaluate what the recommendation system recommends to each individual user;
In content evaluation, current indicators are all based on general-purpose recommendation algorithms, and there are no indicators designed specifically for evaluating news recommendation systems.
Summary of the invention
In view of these problems in the prior art, the purpose of the present invention is to propose an evaluation method and system based on a mobile news client that acquires the content personally recommended by the mobile news client, analyzes the recommended content in real time, and evaluates the credibility of the recommended content along multiple dimensions.
To achieve the above purpose, the technical solution adopted by the present invention is:
An evaluation method based on a mobile news client, comprising:
Step 1: acquire recommended content: simulate a user portrait to request data from the server, collect the pushed content returned by the server, and store it in a database;
Step 2: evaluate the recommended content along two dimensions, quality and timeliness;
Step 3: feed back the final result of the evaluation.
Further, step 1 is carried out as follows: user operation instructions, including refresh, read, and like, are defined in advance; the instructions are then simulated and sent via adb to a virtualized news client; following the instructions, the news client requests data from the server through simulated clicks, and the server returns the pushed personalized content to the news client; finally, a hook server obtains the recommended content from the news client and saves it to the database.
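The instruction-replay part of step 1 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the screen coordinates, the device serial `emulator-5554`, and the helper names are hypothetical. Only `adb -s <serial> shell input tap` and `input swipe` are standard Android tooling. The sketch builds the command lines without running them; passing one to `subprocess.run` would send it to a connected device.

```python
from typing import Dict, List

# Hypothetical screen coordinates for a 1080x1920 virtual device; real values
# depend on the layout of the news client under test.
LIKE_BUTTON = (980, 1800)
FIRST_ARTICLE = (540, 600)


def adb_input(serial: str, *args: str) -> List[str]:
    """Build an `adb shell input ...` command line for one device."""
    return ["adb", "-s", serial, "shell", "input", *args]


def build_command(serial: str, operation: str) -> List[str]:
    """Map a preset user operation (refresh, read, like) to an adb command."""
    operations: Dict[str, List[str]] = {
        # Pull-to-refresh: swipe from (540, 500) down to (540, 1500) over 300 ms.
        "refresh": adb_input(serial, "swipe", "540", "500", "540", "1500", "300"),
        # Read: tap the first article in the feed.
        "read": adb_input(serial, "tap", *map(str, FIRST_ARTICLE)),
        # Like: tap the like button on the article page.
        "like": adb_input(serial, "tap", *map(str, LIKE_BUTTON)),
    }
    return operations[operation]


# A simulated portrait is then a scripted sequence of operations.
script = ["refresh", "read", "like"]
commands = [build_command("emulator-5554", op) for op in script]
```

Keeping the portrait as a plain list of operation names means different portraits differ only in their scripts, not in the replay code.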
Further, content quality in step 2 is evaluated as follows. Undesirable-content recognition is performed first: using sequence labeling, a large amount of training data annotated with undesirable-content vocabulary is built and a model is trained; the trained model then makes predictions on the collected recommendation data to determine whether it contains undesirable words; if it does, the words are extracted and recorded; if it does not, "clickbait" ("title party") recognition is performed;
"Clickbait" recognition: a binary classification model is used; a large amount of training data is built with the title and the article body together as the model input, where pairs whose title matches the article are labeled 1 and pairs whose title does not match the article are labeled 0, and the classification model is trained; the title and body of each collected recommended item are then fed into the model to determine whether the item is "clickbait";
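The two-stage order described above (undesirable-content detection first, the clickbait check only when nothing was flagged) can be sketched as below. The two models are passed in as plain callables; `toy_tagger` and `toy_classifier` are stand-ins invented for illustration, not trained models, and all names here are assumptions.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class QualityResult:
    bad_words: List[str]  # undesirable words found by the sequence labeler
    clickbait: bool       # True when the title does not match the body

    @property
    def fails_quality(self) -> bool:
        return bool(self.bad_words) or self.clickbait


def assess_quality(
    title: str,
    body: str,
    find_bad_words: Callable[[str], List[str]],
    title_matches_body: Callable[[str, str], int],
) -> QualityResult:
    """Run undesirable-content detection first, then the clickbait check."""
    bad_words = find_bad_words(title + "\n" + body)
    if bad_words:
        # Undesirable words were found: record them and skip the second stage.
        return QualityResult(bad_words=bad_words, clickbait=False)
    # Label convention from the text: 1 = title matches body, 0 = mismatch.
    label = title_matches_body(title, body)
    return QualityResult(bad_words=[], clickbait=(label == 0))


# Stand-in models, for illustration only.
def toy_tagger(text: str) -> List[str]:
    return [w for w in ("scam", "gore") if w in text.lower()]


def toy_classifier(title: str, body: str) -> int:
    return 1 if any(word in body.lower() for word in title.lower().split()) else 0


result = assess_quality("Rates rise", "The central bank raised rates today.",
                        toy_tagger, toy_classifier)
```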
Timeliness in step 2 is evaluated as follows: the real-time performance of the system is evaluated by the recommendation rate of content from the last n days, with n = 3; that is, the number of recommended items produced within the last 72 hours as a share of the total number of recommended items. The production time of each recommended item is obtained, and the share of items produced within 72 hours is calculated.
Further, the timeliness index and the quality index obtained in step 2 are calculated, and their average is taken as the final result.
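For a single batch of collected items, the 72-hour share can be computed as in the sketch below. The function name and the handling of an empty batch are assumptions; the source only specifies the n = 3 window.

```python
from datetime import datetime, timedelta, timezone
from typing import Iterable


def recent_share(produced_at: Iterable[datetime], now: datetime, n_days: int = 3) -> float:
    """Fraction of items whose production time falls within the last n_days."""
    times = list(produced_at)
    if not times:
        return 0.0  # assumption: an empty batch counts as a zero share
    cutoff = now - timedelta(days=n_days)
    recent = sum(1 for t in times if cutoff <= t <= now)
    return recent / len(times)


now = datetime(2020, 3, 31, 12, 0, tzinfo=timezone.utc)
# Items produced 1, 30, 71 and 100 hours before `now`.
items = [now - timedelta(hours=h) for h in (1, 30, 71, 100)]
share = recent_share(items, now)  # 3 of the 4 items fall inside the 72-hour window
```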
The present invention also provides a system implementing the above evaluation method, comprising a recommended content acquisition module, a content credibility evaluation module, and an evaluation result feedback module;
The recommended content acquisition module comprises: a simulated user portrait data request module, which simulates user browsing, clicking, and sharing behavior to form a preset user portrait and, according to that portrait, requests data from the server through the news client;
a server push module: the server pushes personalized recommended news content to the news client;
a content collection module, which uses a hook server to obtain recommended content from the news client;
a data saving module, which saves the collected content to the database;
The content credibility evaluation module comprises: a timeliness evaluation module, which evaluates the real-time performance of the recommendation system by the recommendation rate of news from the last n days, n ≤ 3;
a quality evaluation module, which detects whether the recommended content contains undesirable content or "clickbait" content.
Compared with the prior art, the present invention has the following beneficial effects:
(1) Content acquisition: personalized recommended content is acquired from the app side of the mobile news client, so content recommended to different users can be acquired in real time by simulating different user portraits;
(2) Content evaluation: the credibility of content is evaluated along two dimensions, timeliness and quality, which reflects well the timeliness of news, the breadth of information coverage, and the quality of content.
(3) Evaluation results and detected undesirable content are fed back in real time, which makes the platform easier to manage.
Description of the drawings
Fig. 1 is a schematic diagram of a prior-art data collection method for news clients;
Fig. 2 is a schematic diagram of the evaluation system for the mobile news client of the present invention;
Fig. 3 is a schematic diagram of the recommended content acquisition module in the evaluation system of the present invention;
Fig. 4 is a flowchart of the method for acquiring recommended content in the evaluation method of the present invention;
Fig. 5 is a flowchart of the content evaluation method in the evaluation method of the present invention.
Detailed description
To help those skilled in the art better understand the present invention, it is described in detail below with reference to specific embodiments.
Fig. 1 is a schematic diagram of a prior-art data collection method for news clients: as many threads are created as there are news clients to be collected, the number of cores of the system's central processing unit (CPU) is obtained, and each thread is bound to a CPU core according to preset rules; the data from all preset parsing queues on the same CPU core is stored in the corresponding preset output queue, and when an output instruction is received the data in the preset output queue is transferred to a preset database, so that news client data can be supervised on the basis of the data in that database. This collection method directly fetches the column content of each news client, which makes it difficult to evaluate what the recommendation system recommends to each individual user.
As shown in Fig. 2, the present invention provides an evaluation method based on a mobile news client, comprising:
Step 1: acquire recommended content: simulate a user portrait to request data from the server, collect the pushed content returned by the server, and store it in a database;
Step 2: evaluate the recommended content along two dimensions, quality and timeliness;
Step 3: feed back the final result of the evaluation.
As shown in Fig. 3 and Fig. 4, recommended content is acquired in the evaluation method of the present invention as follows:
User operation instructions, including refresh, read, and like, are defined in advance; the instructions are then simulated and sent via adb to a virtualized news client; following the instructions, the news client requests data from the server through simulated clicks, and the server returns the pushed personalized content to the news client; finally, a hook server obtains the recommended content from the news client and saves it to the database.
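The final storage step can be sketched with an in-memory SQLite database. The table name, column set, and portrait identifiers are hypothetical; the source only states that collected content is saved to a database.

```python
import sqlite3
from typing import Iterable, Tuple

# (portrait_id, title, body, produced_at as ISO 8601 text)
Item = Tuple[str, str, str, str]


def save_items(conn: sqlite3.Connection, items: Iterable[Item]) -> int:
    """Create the table if needed, store collected items, return the row count."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS recommended_content (
               portrait_id  TEXT NOT NULL,
               title        TEXT NOT NULL,
               body         TEXT NOT NULL,
               produced_at  TEXT NOT NULL,
               collected_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
           )"""
    )
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT INTO recommended_content (portrait_id, title, body, produced_at) "
            "VALUES (?, ?, ?, ?)",
            list(items),
        )
    return conn.execute("SELECT COUNT(*) FROM recommended_content").fetchone()[0]


conn = sqlite3.connect(":memory:")
count = save_items(conn, [
    ("sports_fan", "Cup final tonight", "Kick-off is at 20:00.", "2020-03-31T08:00:00"),
    ("tech_reader", "New phone released", "The device ships in April.", "2020-03-30T10:30:00"),
])
```

Storing the portrait identifier and the production time with each item keeps the later per-portrait quality and timeliness counts a matter of simple queries.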
Fig. 5 is a flowchart of the content evaluation method in the evaluation method of the present invention. Content quality is evaluated as follows:
Undesirable-content recognition is performed first: using sequence labeling, a large amount of training data annotated with undesirable-content vocabulary is built and a model is trained; the trained model then makes predictions on the collected recommendation data to determine whether it contains undesirable words; if it does, the words are extracted and recorded; if it does not, "clickbait" ("title party") recognition is performed;
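The "extract and record" step after sequence labeling can be illustrated with a small decoder. The source does not name a tagging scheme, so the BIO tags `B-BAD`, `I-BAD`, and `O` used here are an assumption, and the tags in the example are written by hand rather than predicted by a model.

```python
from typing import List


def extract_flagged_spans(tokens: List[str], tags: List[str]) -> List[str]:
    """Collect token spans tagged B-BAD / I-BAD by a sequence labeler."""
    spans: List[str] = []
    current: List[str] = []
    for token, tag in zip(tokens, tags):
        if tag == "B-BAD":
            # A new span starts; close any span that is still open.
            if current:
                spans.append(" ".join(current))
            current = [token]
        elif tag == "I-BAD" and current:
            current.append(token)
        else:
            # Any other tag (or a stray I-BAD) ends the open span.
            if current:
                spans.append(" ".join(current))
            current = []
    if current:
        spans.append(" ".join(current))
    return spans


tokens = ["win", "a", "free", "prize", "now", "with", "fake", "cure"]
tags = ["O", "O", "B-BAD", "I-BAD", "O", "O", "B-BAD", "I-BAD"]
flagged = extract_flagged_spans(tokens, tags)
```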
"Clickbait" recognition: a binary classification model is used; a large amount of training data is built with the title and the article body together as the model input, where pairs whose title matches the article are labeled 1 and pairs whose title does not match the article are labeled 0, and the classification model is trained; the title and body of each collected recommended item are then fed into the model to determine whether the item is "clickbait"; finally, the quality index of the content is calculated by the formula:
[Formula image: Figure PCTCN2020082285-appb-000001]
where Q is the ratio, at a given moment, of the number of recommended items detected as failing the content quality requirements to the total number of recommended items; U is the total number of simulated user portraits; q_i is the number of items, among the recommended content obtained with user portrait i, that fail the content quality requirements; and ALL_i is the total number of recommended items obtained with user portrait i.
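The formula itself survives only as an image reference, so its exact form cannot be confirmed from the text. One reading that matches the stated definitions, offered as an assumption rather than as the patent's formula, pools the counts over all U portraits:

```latex
Q = \frac{\sum_{i=1}^{U} q_i}{\sum_{i=1}^{U} \mathrm{ALL}_i}
```

An equally plausible alternative is the mean of per-portrait ratios, $\frac{1}{U}\sum_{i=1}^{U} q_i / \mathrm{ALL}_i$; the two agree when every portrait receives the same number of items.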
Timeliness is evaluated as follows:
The real-time performance of the system is evaluated by the recommendation rate of content from the last n days, with n = 3; that is, the number of recommended items produced within the last 72 hours as a share of the total number of recommended items. The production time of each recommended item is obtained, and the share of items produced within 72 hours is calculated by the formula:
[Formula image: Figure PCTCN2020082285-appb-000002]
where T is the ratio of the number of recommended items produced within 72 hours to the total number of recommended items; U is the total number of simulated user portraits; t_i is the number of items, among the recommended content obtained with user portrait i, that fail the timeliness requirement; and ALL_i is the total number of recommended items obtained with user portrait i.
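As with the quality index, the formula is available only as an image reference. A pooled form consistent with the symbols defined above, stated as an assumption, would be:

```latex
T = \frac{\sum_{i=1}^{U} t_i}{\sum_{i=1}^{U} \mathrm{ALL}_i}
```

With $t_i$ taken literally as the count of items that fail the timeliness requirement, this ratio is the stale share, and the share produced within 72 hours would be $1 - T$. The text defines T as the within-72-hours share but $t_i$ as a failure count, and it does not say which reading the original formula uses.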
In step 3, the data obtained at each moment by simulating different user portraits is passed through the above evaluation methods to obtain the timeliness index and the quality index, and their average is taken as the final result and fed back to the monitoring platform.
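Step 3 can be sketched as a small aggregation over per-portrait counts. The pooled-ratio form of the two indices is an assumption (the original formulas are not reproduced in the text), and the portrait names and counts below are made up for illustration.

```python
from typing import Dict, Iterable, Tuple

# Per-portrait counts: (items failing quality, items failing timeliness, total items).
Counts = Tuple[int, int, int]


def pooled_ratio(numerators: Iterable[int], totals: Iterable[int]) -> float:
    """Sum of numerators over sum of totals; 0.0 when nothing was collected."""
    total = sum(totals)
    return sum(numerators) / total if total else 0.0


def evaluate(counts_by_portrait: Dict[str, Counts]) -> Dict[str, float]:
    """Compute the quality index, the timeliness index, and their average."""
    counts = list(counts_by_portrait.values())
    totals = [c[2] for c in counts]
    quality_index = pooled_ratio([c[0] for c in counts], totals)
    timeliness_index = pooled_ratio([c[1] for c in counts], totals)
    return {
        "Q": quality_index,
        "T": timeliness_index,
        "final": (quality_index + timeliness_index) / 2,
    }


report = evaluate({
    "sports_fan": (3, 5, 16),
    "tech_reader": (1, 3, 16),
})
# report == {"Q": 0.125, "T": 0.25, "final": 0.1875}
```

The resulting dictionary is what would be sent to the monitoring platform at each evaluation moment.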
The present invention also provides a system implementing the above news-client-based evaluation and monitoring method, comprising a recommended content acquisition module, a content credibility evaluation module, and an evaluation result feedback module;
The recommended content acquisition module comprises:
a simulated user portrait data request module, which simulates user browsing, clicking, and sharing behavior to form a preset user portrait and, according to that portrait, requests data from the server through the news client;
a server push module: the server pushes personalized recommended news content to the news client;
a content collection module, which uses a hook server to obtain recommended content from the news client;
a data saving module, which saves the collected content to the database;
The content credibility evaluation module comprises:
a timeliness evaluation module, which evaluates the real-time performance of the recommendation system by the recommendation rate of news from the last n days, n ≤ 3;
质量评估模块:检测推荐内容中是否含有不良内容或“标题党”内容。Quality assessment module: detect whether the recommended content contains bad content or "headline party" content.
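The two evaluation modules can be sketched for a single user portrait as follows. This is an illustration under stated assumptions, not the patented implementation: `Item`, `Detector`, and `evaluate_profile` are hypothetical names, the window length `n_days` is left to the caller, and the two detectors are plain callables standing in for the trained sequence-labeling and binary-classification models.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Callable, Sequence, Tuple


@dataclass(frozen=True)
class Item:
    title: str
    body: str
    produced_at: datetime  # production time of the recommended item


Detector = Callable[[Item], bool]  # stand-in for a trained model


def evaluate_profile(
    items: Sequence[Item],
    now: datetime,
    n_days: int,
    has_bad_content: Detector,
    is_clickbait: Detector,
) -> Tuple[float, float]:
    """Return (stale ratio, low-quality ratio) for one user portrait."""
    if not items:
        raise ValueError("no recommended items collected")
    window = timedelta(days=n_days)
    # Timeliness: an item is stale if it was produced more than n_days ago.
    stale = sum(1 for it in items if now - it.produced_at > window)
    # Quality: the undesirable-content check runs first; the clickbait check
    # runs only when the first one passes (short-circuit `or`).
    low_quality = sum(1 for it in items if has_bad_content(it) or is_clickbait(it))
    return stale / len(items), low_quality / len(items)
```

The per-portrait ratios returned here correspond to t_i / ALL_i and q_i / ALL_i in the index definitions.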
The news-client-based evaluation method disclosed in the present invention collects personalized recommended content on the mobile news client app side. It can simulate different user portraits and collect, in real time, the content recommended to different users. The recommended content for users of the same class (users with similar portraits) need not be collected in full, which makes the approach more flexible: the scale of the collected personalized content can be controlled freely and adjusted according to relevant rules. For content evaluation, the credibility of recommended news content is assessed along two dimensions, timeliness and quality, which reflects well both how current the news is and the quality of the article content.
The above embodiments only illustrate the technical idea of the present invention and do not limit its scope of protection. Any modification made to the technical solution in accordance with the technical idea proposed by the present invention falls within the scope of protection of the claims. Technologies not covered by the present invention can be implemented using existing technology.

Claims (6)

  1. An evaluation method based on a mobile news client, comprising:
    Step 1: obtaining recommended content: simulating a user portrait to request data from the server, collecting the pushed content returned by the server, and storing it in a database;
    Step 2: evaluating the recommended content along two dimensions, content quality and timeliness;
    Step 3: feeding back the final result of the evaluation.
  2. The evaluation method based on a mobile news client according to claim 1, wherein step 1 specifically comprises:
    first presetting user operation instructions, including refresh, read, and like; then simulating the user operation instructions and transmitting them via adb to a virtualized news client; the news client, according to the instructions, requests data from the server through simulated clicks, and the server returns the pushed personalized content to the news client; finally, a hook server is used to obtain the recommended content from the news client, and the recommended content is saved to the database.
  3. The evaluation method for a news client according to claim 2, wherein the content quality evaluation in step 2 specifically comprises:
    first performing undesirable-content recognition: using a sequence labeling approach, constructing a large amount of training data in which undesirable-content words are labeled, and training a model; using the trained model to predict on the obtained recommendation data and determine whether it contains undesirable-content words; if so, extracting and recording them; if not, performing clickbait recognition;
    the clickbait ("headline party") recognition: using a binary classification model, constructing a large amount of training data with the title and the article body together as model input, labeling samples whose title matches the article as 1 and samples whose title does not match the article as 0, and training the classification model; feeding the title and article body of the obtained recommendation data into the model to determine whether it is clickbait content;
    finally calculating the quality index of the content, with the formula:
    Figure PCTCN2020082285-appb-100001
    where Q is the ratio, at a given moment, of the number of recommended items detected as not meeting the content quality requirement to the total number of recommended items, U is the total number of simulated user portraits, q_i is the number of recommended items obtained by user portrait i that do not meet the content quality requirement, and ALL_i is the total number of recommended items obtained by user portrait i.
  4. The evaluation method for a news client according to claim 2, wherein the timeliness evaluation in step 2 specifically comprises:
    evaluating the real-time performance of the system through the content recommendation rate over the last n days, taking n=3, that is, the proportion of recommended items produced within the last 72 hours among all recommended items; obtaining the production time of the recommended content and calculating the proportion of data produced within 72 hours, with the formula:
    Figure PCTCN2020082285-appb-100002
    where T is the ratio of recommended items produced within the last 72 hours to the total number of recommended items, U is the total number of simulated user portraits, t_i is the number of recommended items obtained by user portrait i that do not meet the timeliness requirement, and ALL_i is the total number of recommended items obtained by user portrait i.
  5. The evaluation method for a news client according to claim 2, wherein step 3 specifically comprises:
    calculating the timeliness index and the quality index obtained in step 2, and taking the average value as the final result.
  6. An evaluation system for a news client, comprising a recommended content acquisition module, a content credibility evaluation module, and an evaluation result feedback module;
    the recommended content acquisition module includes:
    a simulated user portrait data request sub-module: simulates user browsing, clicking, and sharing behaviors to form a preset user portrait, and requests data from the server through the news client according to that preset user portrait;
    a server push sub-module: the server pushes personalized recommended news content to the news client;
    a content collection sub-module: uses a hook server to obtain the recommended content from the news client;
    a data saving sub-module: saves the collected content to a database;
    the content credibility evaluation module includes:
    a timeliness evaluation sub-module: evaluates the real-time performance of the recommendation system through the recommendation rate of news from the last n days, N<=3;
    a quality evaluation sub-module: detects whether the recommended content contains undesirable content or clickbait ("headline party") content;
    the evaluation result feedback module evaluates the data obtained at each moment by simulating different user portraits to obtain a timeliness index and a quality index, and feeds the average value back to the monitoring platform as the final result.
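The adb-driven simulation of refresh, read, and like operations recited in claim 2 might be prepared along the following lines. This sketch only builds the command lines and does not run them. The `adb -s <serial> shell input tap` and `input swipe` commands are standard adb usage, while the screen coordinates, the swipe duration, the `ACTIONS` table, and the `adb_command` helper are hypothetical values chosen for illustration and would depend on the actual client layout.

```python
from typing import Dict, List

# Hypothetical gestures for a 1080x1920 screen; real coordinates depend on the app UI.
ACTIONS: Dict[str, List[str]] = {
    "refresh": ["input", "swipe", "540", "600", "540", "1500", "300"],  # pull down
    "read": ["input", "tap", "540", "800"],    # open an article in the feed
    "like": ["input", "tap", "980", "1800"],   # press a like button
}


def adb_command(action: str, serial: str) -> List[str]:
    """Build the adb command line for one preset user operation."""
    if action not in ACTIONS:
        raise ValueError(f"unknown action: {action}")
    return ["adb", "-s", serial, "shell", *ACTIONS[action]]
```

On a machine with adb installed and a device or emulator attached, each returned list could be passed to `subprocess.run(cmd, check=True)`. Capturing the pushed content through the hook server is outside the scope of this sketch.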
PCT/CN2020/082285 2019-12-31 2020-03-31 Mobile news client-based evaluation method and system therefor WO2021134944A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU2020335019A AU2020335019B2 (en) 2019-12-31 2020-03-31 Evaluation method based on mobile news client and system thereof

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201911409113.6A CN111143688B (en) 2019-12-31 2019-12-31 Evaluation method and system based on mobile news client
CN201911409113.6 2019-12-31

Publications (1)

Publication Number Publication Date
WO2021134944A1 (en) 2021-07-08

Family

ID=70522566

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/082285 WO2021134944A1 (en) 2019-12-31 2020-03-31 Mobile news client-based evaluation method and system therefor

Country Status (3)

Country Link
CN (1) CN111143688B (en)
AU (1) AU2020335019B2 (en)
WO (1) WO2021134944A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117291690B (en) * 2023-11-23 2024-06-07 五五海淘(上海)科技股份有限公司 Intelligent product sales recommendation method based on data analysis

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103365902A (en) * 2012-03-31 2013-10-23 北大方正集团有限公司 Method and device for evaluating Internet News
US20140047118A1 (en) * 2008-09-29 2014-02-13 Amazon Technologies, Inc. Optimizing resource configurations
CN107818156A (en) * 2017-10-31 2018-03-20 广东思域信息科技有限公司 A kind of real time individual news recommends method and system
CN107968842A (en) * 2017-12-26 2018-04-27 百度在线网络技术(北京)有限公司 News push method, apparatus and equipment based on distributed system

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150046425A1 (en) * 2013-08-06 2015-02-12 Hsiu-Ping Lin Methods and systems for searching software applications
US9992690B2 (en) * 2013-10-11 2018-06-05 Textron Innovations, Inc. Placed wireless instruments for predicting quality of service
CN109684582A (en) * 2018-11-08 2019-04-26 张耀伦 A kind of evaluating system and method for information resources
CN110413890A (en) * 2019-07-29 2019-11-05 武汉匠楚科技有限公司 A kind of method that news recommender system polymerization news is presented

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
CHEN, MING: "Study and Implementation of Microblog Oriented Text Quality Evaluation and Classification", CHINESE MASTER’S THESES FULL-TEXT DATABASE, 1 November 2015 (2015-11-01), pages 1 - 68, XP055826746 *
HONG XIA: "The Research on Data Quality Analysis and Evaluation of Information System and the Application in Labor Mraket", CHINESE MASTER’S THESES FULL-TEXT DATABASE, 1 February 2009 (2009-02-01), pages 1 - 72, XP055826018 *
LIU JIN-HUI;CUI XIANG-YANG;YANG FAN;LIU LI-YAN: "Design and Implementation of a Combined Recommendation System Based on Spring Boot and User Portrait", ELECTRONIC COMPONENT AND INFORMATION TECHNOLOGY, vol. 3, no. 5, 31 May 2019 (2019-05-31), pages 24 - 29, XP055826751, ISSN: 2096-4455, DOI: 10.19772/j.cnki.2096-4455.2019.5.007 *

Also Published As

Publication number Publication date
CN111143688B (en) 2021-03-02
AU2020335019A1 (en) 2021-07-15
AU2020335019B2 (en) 2022-06-16
CN111143688A (en) 2020-05-12

Legal Events

Date Code Title Description
ENP Entry into the national phase. Ref document number: 2020335019; Country of ref document: AU; Date of ref document: 20200331; Kind code of ref document: A
121 Ep: the epo has been informed by wipo that ep was designated in this application. Ref document number: 20910306; Country of ref document: EP; Kind code of ref document: A1
NENP Non-entry into the national phase. Ref country code: DE
122 Ep: pct application non-entry in european phase. Ref document number: 20910306; Country of ref document: EP; Kind code of ref document: A1
122 Ep: pct application non-entry in european phase. Ref document number: 20910306; Country of ref document: EP; Kind code of ref document: A1
32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established. Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 12.01.2023)
122 Ep: pct application non-entry in european phase. Ref document number: 20910306; Country of ref document: EP; Kind code of ref document: A1