WO2017121076A1 - Information-pushing method and device - Google Patents

Information-pushing method and device

Publication number
WO2017121076A1
Authority
WO
WIPO (PCT)
Prior art keywords
information
website
identification
model
identification information
Application number
PCT/CN2016/087453
Other languages
French (fr)
Chinese (zh)
Inventor
岳爱珍
崔燕
杨自强
谭静
高显
赵辉
王私江
于倩
白霄骅
Original Assignee
百度在线网络技术(北京)有限公司
Application filed by 百度在线网络技术(北京)有限公司
Publication of WO2017121076A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/951 Indexing; Web crawling techniques

Definitions

  • The present application relates to the field of computer technologies, specifically to the field of Internet technologies, and in particular to an information push method and apparatus.
  • Existing information push approaches generally push candidate push information to the user directly, without attaching an identifier to it. The pushed items therefore carry nothing that distinguishes them from one another, and the user obtains information less efficiently.
  • The purpose of the present application is to propose an improved information push method and apparatus to solve the technical problems mentioned in the background section above.
  • The present application provides an information pushing method, the method comprising: acquiring candidate push information; determining, based on a pre-trained information identification model, identification information corresponding to the candidate push information; generating information to be pushed based on the candidate push information and the identification information corresponding to the candidate push information; and pushing the information to be pushed.
  • Determining, based on the pre-trained information identification model, the identification information corresponding to the candidate push information comprises: confirming the website from which the candidate push information is derived; searching for feature information of the website; importing the feature information into the pre-trained information identification model; and obtaining the identification information that the information identification model determines for the feature information of the website, and using it as the identification information corresponding to the candidate push information.
  • The feature information includes at least one of the following items of information about the website: server quantity information, domain name age information, ranking information, keyword ranking information, bounce rate information, external link count information, traffic information, weight information, and website organizer information.
  • The method further comprises a step of establishing the information identification model, comprising: obtaining sample data required to train the model, wherein the sample data includes feature information of a sample website and the determined identification information corresponding to the feature information of the sample website; predicting, based on an initial model, the identification information corresponding to the feature information of the sample website, and acquiring the predicted identification information, wherein the initial model is one of the following models: a support vector machine model, a decision tree model, a naive Bayesian model, or a logistic regression model; determining whether the identification information predicted by the initial model is consistent with the determined identification information corresponding to the feature information of the sample website; and, if not, using the feature information of the sample website and the determined identification information as training data for the initial model, and modifying the parameters of the initial model based on the training data to obtain the information identification model.
  • The identification information includes first identification information and second identification information, and determining, based on the pre-trained model, the identification information corresponding to the candidate push information includes: selecting one of the first identification information and the second identification information as the identification information corresponding to the candidate push information based on whether the retrieved registration information of the website includes a preset keyword; or selecting one of the first identification information and the second identification information as the identification information corresponding to the candidate push information based on whether the acquired user report information set includes information about the website.
  • The present application provides an information push apparatus, the apparatus including: an obtaining unit configured to acquire candidate push information; a determining unit configured to determine, based on a pre-trained information identification model, identification information corresponding to the candidate push information; a generating unit configured to generate information to be pushed based on the candidate push information and the identification information corresponding to the candidate push information; and a pushing unit configured to push the information to be pushed.
  • The determining unit includes: a website confirmation subunit configured to confirm the website from which the candidate push information is derived; a feature information search subunit configured to search for feature information of the website; a feature information import subunit configured to import the feature information into the pre-trained information identification model; and an identification information acquisition subunit configured to acquire the identification information that the information identification model determines for the feature information of the website, and to use it as the identification information corresponding to the candidate push information.
  • The feature information includes at least one of the following items of information about the website: server quantity information, domain name age information, ranking information, keyword ranking information, bounce rate information, external link count information, traffic information, weight information, and website organizer information.
  • The apparatus further includes an information identification model establishing unit, comprising: a sample data obtaining subunit configured to acquire feature information of a sample website and the determined identification information corresponding to the feature information of the sample website; a predicted identification information obtaining subunit configured to predict, based on an initial model, the identification information corresponding to the feature information of the sample website, and to obtain the predicted identification information, wherein the initial model is one of the following models: a support vector machine model, a decision tree model, a naive Bayesian model, or a logistic regression model; a predicted identification information judging subunit configured to judge whether the identification information predicted by the initial model is consistent with the determined identification information corresponding to the feature information of the sample website; and a parameter modification subunit configured to, when the judging subunit determines that the predicted identification information is inconsistent with the determined identification information, use the feature information of the sample website and the determined identification information as training data for the initial model, and modify the parameters of the initial model based on the training data to obtain the information identification model.
  • The identification information includes first identification information and second identification information, and the determining unit includes: a first selecting subunit configured to select one of the first identification information and the second identification information as the identification information corresponding to the candidate push information based on whether the retrieved registration information of the website includes a preset keyword; or a second selecting subunit configured to select one of the first identification information and the second identification information as the identification information corresponding to the candidate push information based on whether the acquired user report information set includes information about the website.
  • The information pushing method and apparatus acquire candidate push information, determine the identification information corresponding to the candidate push information based on the pre-trained information identification model, generate the information to be pushed based on the candidate push information and its corresponding identification information, and finally push the information to be pushed. The pushed items are thereby distinguished from one another by their identifiers, so the user obtains information more efficiently.
  • FIG. 1 is an exemplary system architecture diagram to which the present application can be applied;
  • FIG. 2 is a flow chart of one embodiment of an information push method according to the present application.
  • FIG. 3 is a schematic diagram of an application scenario of an information pushing method according to the present application.
  • FIG. 5 is a schematic structural diagram of an embodiment of an information pushing apparatus according to the present application.
  • FIG. 6 is a schematic structural diagram of a computer system suitable for implementing a terminal device or a server of an embodiment of the present application.
  • FIG. 1 shows an exemplary system architecture 100 to which an embodiment of the information pushing method or the information pushing device of the present application can be applied.
  • system architecture 100 can include terminal devices 101, 102, 103, network 104, and server 105.
  • the network 104 is used to provide a medium for communication links between the terminal devices 101, 102, 103 and the server 105.
  • Network 104 may include various types of connections, such as wired, wireless communication links, fiber optic cables, and the like.
  • the user can interact with the server 105 over the network 104 using the terminal devices 101, 102, 103 to receive or transmit messages and the like.
  • Various communication client applications such as a web browser application, a shopping application, a search application, a map application, an instant communication tool, a mailbox client, a social platform software, and the like, may be installed on the terminal devices 101, 102, and 103.
  • The terminal devices 101, 102, 103 may be various electronic devices that have a display screen and support web browsing, including but not limited to smart phones, tablets, e-book readers, MP3 (Moving Picture Experts Group Audio Layer III) players, MP4 (Moving Picture Experts Group Audio Layer IV) players, laptop computers, desktop computers, and the like.
  • the information pushing method provided by the embodiment of the present application is generally performed by the server 105. Accordingly, the information pushing device is generally disposed in the server 105.
  • the server 105 may also obtain candidate push information directly from other servers, push the information to be pushed to other servers, or the server 105 itself may store candidate push information.
  • In that case, the system architecture used may not involve the terminal devices 101, 102, 103 described above.
  • The number of terminal devices, networks, and servers in FIG. 1 is merely illustrative. Depending on implementation needs, there can be any number of terminal devices, networks, and servers.
  • the information pushing method includes the following steps:
  • Step 201 Acquire candidate push information.
  • The electronic device on which the information push method runs (for example, the server shown in FIG. 1) may obtain the candidate push information as follows: first, obtain a user's search request; then, query search result information based on the search request. The search result information may be used directly as the candidate push information, or a screening condition may be set according to actual needs, the search result information filtered, and the filtered search result information used as the candidate push information. For example, if the candidate push information needs to be timely, a time limit can be set so that only search results within that time limit are kept.
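  • The screening step described above can be illustrated with a minimal Python sketch. The `SearchResult` record, its field names, and the `select_candidates` helper are assumptions made for illustration; the application does not define a data format or a specific time limit.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class SearchResult:
    # Hypothetical record shape; not defined in the application.
    title: str
    url: str
    published_at: datetime


def select_candidates(results, now, max_age=None):
    """Return the search results to use as candidate push information.

    With max_age=None every result is kept (search results used directly);
    otherwise only results published within max_age of `now` are kept.
    """
    if max_age is None:
        return list(results)
    cutoff = now - max_age
    return [r for r in results if r.published_at >= cutoff]


now = datetime(2016, 6, 28, 12, 0)
results = [
    SearchResult("Fresh story", "http://news1.example/a", now - timedelta(hours=2)),
    SearchResult("Old story", "http://news2.example/b", now - timedelta(days=30)),
]
recent = select_candidates(results, now, max_age=timedelta(days=1))
print([r.title for r in recent])
```

  • Passing the current time in explicitly keeps the filter deterministic, which makes the time limit easy to test.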
  • The electronic device on which the information pushing method runs (for example, the server shown in FIG. 1) may also obtain search result information directly from a search server through a wired or wireless connection and use that search result information as the candidate push information.
  • The candidate push information may also be acquired according to the user's account information and historical push information. For example, if the user's account information records the industry in which the user works, news about that industry may be used as candidate push information.
  • Step 202 Determine identification information corresponding to the candidate push information based on the pre-trained information identification model.
  • the identification information includes at least one of the following: image information, text information, and sound information.
  • The identification information can be used to indicate whether the candidate push information is secure and trustworthy. For example, if the source of the candidate push information is the website of a government agency, the information is considered safe and trustworthy, and the corresponding identification information is positive, such as the word or image "good" or "top", or the letter "V"; further, "V1", "V2", and "V3" can be used to indicate the degree of credibility. If the source of the candidate push information is a website that has been reported, the information is considered not safe and trustworthy, and the corresponding identification information is negative, such as the word or image "non-premium" or "not recommended". If the candidate push information cannot be determined to correspond to positive identification information, its identification information may be left empty; empty identification information still contrasts with positive identification information.
  • the electronic device may query the feature information of the candidate push information in a preset database.
  • The candidate push information may be statistically analyzed and/or semantically analyzed to extract at least one keyword, such as an organization name or a web address, and the feature information corresponding to the keyword is then queried in a preset database.
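  • As a rough illustration of the keyword lookup, the sketch below covers only the web-address case: it pulls host names out of the text with a regular expression and looks them up in an in-memory dictionary standing in for the preset database. The pattern, the `PRESET_FEATURE_DB` contents, and the function names are assumptions, not part of the application.

```python
import re
from urllib.parse import urlparse

# Simple pattern for http(s) addresses embedded in text (an assumption).
URL_PATTERN = re.compile(r"https?://[^\s\"'<>]+")

# Hypothetical preset database: host name -> feature information.
PRESET_FEATURE_DB = {
    "news1.example": {"domain_age_years": 12, "external_link_count": 5400},
}


def extract_hosts(text):
    """Extract the distinct host names of all web addresses in the text."""
    hosts = []
    for url in URL_PATTERN.findall(text):
        host = urlparse(url).hostname  # lower-cased by urlparse
        if host and host not in hosts:
            hosts.append(host)
    return hosts


def lookup_features(text, db=PRESET_FEATURE_DB):
    """Return feature information for the first extracted host in db, else None."""
    for host in extract_hosts(text):
        if host in db:
            return db[host]
    return None


sample = "Full report at http://News1.example/story and https://other.example/x"
print(extract_hosts(sample))
print(lookup_features(sample))
```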
  • Alternatively, the website from which the candidate push information is derived may be obtained first. That website may then be searched for on a webmaster-tool website that provides website information, for example through an SEO (Search Engine Optimization) lookup tool, and the information on the search results page captured as the feature information.
  • The feature information is imported into the pre-trained information identification model. The identification information corresponding to the feature information is obtained according to the correspondence learned by the information identification model, and this identification information is the identification information corresponding to the candidate push information.
  • The feature information includes at least one of the following items of information about the website: server quantity information, domain name age information, ranking information, keyword ranking information, bounce rate information, external link count information, traffic information, weight information, and website organizer information.
  • The ranking information of the website may be the ranking of the website obtained from the Alexa ranking system.
  • Keyword ranking reflects how a page ranks in search engine results according to the relevance of characters, words, and phrases.
  • The natural ranking of keywords generally results from the search engine automatically analyzing and ranking all relevant crawled web pages.
  • The search engine's website will provide keyword ranking information of the website.
  • The external link count of a website refers to the number of links from other websites that point to the website.
  • Commonly used external link analysis tools can obtain the external link count of the website.
  • Website traffic is the number of visits to a website and is a measure of the number of users accessing a website and the number of pages viewed by the user.
  • the traffic information of the website may be historical traffic information or estimated traffic information.
  • The weight information of a website usually refers to a search engine's overall evaluation of the website. As an example, Baidu weight, Google's PR (PageRank, Google's web page ranking), or Sogou's SR (Sogou Rank) can be used.
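  • One way to hold the feature information listed above before importing it into a model is a small record type that flattens to a numeric vector. This is only a sketch: the field names, units, and the one-hot encoding of the organizer type are assumptions, and it presumes every feature is available even though the application requires only at least one.

```python
from dataclasses import dataclass

# Organizer natures named in the application; any other value encodes as all zeros.
ORGANIZER_TYPES = ("public institution", "government agency", "army", "social group")


@dataclass(frozen=True)
class WebsiteFeatures:
    server_count: int
    domain_age_years: float
    site_rank: int            # e.g. a ranking-system position
    keyword_rank: int
    bounce_rate: float        # fraction between 0 and 1
    external_link_count: int
    traffic: float            # historical or estimated visits
    weight: int               # e.g. a search-engine weight score
    organizer_type: str

    def to_vector(self):
        """Flatten to a list of floats: 8 numeric fields plus a one-hot organizer type."""
        numeric = [
            float(self.server_count), float(self.domain_age_years),
            float(self.site_rank), float(self.keyword_rank),
            float(self.bounce_rate), float(self.external_link_count),
            float(self.traffic), float(self.weight),
        ]
        one_hot = [1.0 if self.organizer_type == t else 0.0 for t in ORGANIZER_TYPES]
        return numeric + one_hot


features = WebsiteFeatures(3, 12.5, 1500, 4, 0.35, 5400, 120000.0, 7, "government agency")
print(features.to_vector())
```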
  • The website registration information of the website may be searched first, and one of the first identification information and the second identification information selected as the identification information corresponding to the candidate push information based on whether the website registration information includes a preset keyword.
  • For example, the website registration information of the website may be searched, and the information in the field recording the nature of the organizer captured to determine whether it includes one of the keywords "public institution", "government agency", "army", or "social group". If yes, the identification information corresponding to the candidate push information is determined to be the first identification information. The first identification information is positive identification information, which may be a word or image such as "excellent" or "top".
  • Alternatively, the user report information set may be obtained, and one of the first identification information and the second identification information selected as the identification information corresponding to the candidate push information based on whether the user report information set includes information about the website.
  • The user report information set may be historical user report information collected by the server. Each report includes the reported matter and information about the reported object, and the information about the reported object may be the website address or the name of the website organizer.
  • For example, a user reports that website 1 includes false content. After verification, website 1 does include false content, and the server records the report information. If the website from which the candidate push information originates is website 1, the identification information corresponding to the candidate push information is determined to be the second identification information. The second identification information is negative identification information, which may be a word or image such as "non-premium" or "not recommended", or may be empty.
  • Step 203 Generate information to be pushed based on the candidate push information and the identifier information corresponding to the candidate push information.
  • the electronic device may combine the candidate push information and the identifier information corresponding to the candidate push information as the information to be pushed. For example, when the identification information is image information, the corresponding image may be added to the designated portion of the candidate push information.
  • Step 204 Push the information to be pushed.
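  • Steps 201 to 204 can be sketched end to end as follows. The dictionary-based candidate format, the `push_information` helper, and the lookup table standing in for the pre-trained information identification model are all assumptions for illustration.

```python
def push_information(candidates, determine_identifier, push):
    """Run steps 202-204 over candidates already acquired in step 201."""
    pushed = []
    for candidate in candidates:
        identifier = determine_identifier(candidate)   # step 202: determine identification
        item = dict(candidate, identifier=identifier)  # step 203: combine into push item
        push(item)                                     # step 204: push
        pushed.append(item)
    return pushed


# Hypothetical stand-in for the pre-trained information identification model.
KNOWN_IDENTIFIERS = {"news1.example": "excellent"}


def lookup_identifier(candidate):
    # An empty string represents empty identification information.
    return KNOWN_IDENTIFIERS.get(candidate["site"], "")


outbox = []  # stands in for the delivery channel
candidates = [
    {"title": "Headline A", "site": "news1.example"},
    {"title": "Headline B", "site": "unknown.example"},
]
result = push_information(candidates, lookup_identifier, outbox.append)
print([(item["title"], item["identifier"]) for item in result])
```

  • Passing the identification and push functions in as parameters mirrors the separation between determining the identification information and delivering the combined item.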
  • FIG. 3 is a schematic diagram of an application scenario of the information pushing method according to the embodiment.
  • In this scenario, the user first initiates a search request with the search keyword "news". The information identification server then obtains the search result information as candidate push information in the background and extracts the feature information corresponding to the candidate push information. Next, the information identification server imports the feature information into the pre-trained information identification model and determines that the identification information corresponding to news website 1 and news website 2 in the candidate push information is positive identification information, here the word "excellent". The word "excellent" is combined with the candidate push information to generate the information to be pushed, which is finally pushed. When the user browses the search results and hovers over or clicks the word "excellent", part or all of the feature information can be displayed according to actual needs, for example in a floating window.
  • The method provided by the foregoing embodiment of the present application determines the identification information corresponding to the candidate push information and generates the information to be pushed based on the candidate push information and that identification information. The pushed items are thereby distinguished from one another by their identifiers, which makes the user more efficient in obtaining information.
  • the flow 400 of the information pushing method includes the following steps:
  • Step 401 Acquire candidate push information.
  • the candidate push information may be generated based on the search result information associated with the search operation, or generated based on the user's account information and historical push information.
  • Step 402 Obtain a website from which the candidate push information is derived.
  • When the candidate push information directly includes information about the website from which it is derived, the website name or the website address can be extracted by performing statistical analysis and/or semantic analysis on the candidate push information.
  • Step 403 Determine whether the nature of the website's organizer is a public institution, a government agency, an army, or a social group.
  • the website record information database may be queried for the nature of the sponsor corresponding to the website from which the candidate push information is derived, and Table 1 shows some records in the record information database.
  • Alternatively, the website from which the candidate push information is derived can be looked up on a website registration information query site, and the nature of the organizer corresponding to the website obtained by crawling. If the organizer of the website from which the candidate push information originates is a public institution, a government agency, an army, or a social group, the identification information corresponding to the candidate push information is determined to be positive identification information, and the flow proceeds to step 406; if not, the flow proceeds to step 404.
  • Step 404 Determine whether the record of the website from which the candidate push information is originated or the record of the organizer of the website is included in the violation behavior database.
  • The records in the violation behavior database may be obtained from users' report history, or from the national enterprise credit information publicity system or a published list of seriously illegal and untrustworthy enterprises. If the violation behavior database includes a record of the website from which the candidate push information originates, or a record of the organizer of that website, the identification information corresponding to the candidate push information is determined to be negative identification information, and the flow proceeds to step 406; if not, the flow proceeds to step 405.
  • Step 405 Acquire feature information according to the candidate push information, and import the feature information into the information identification model.
  • the steps to establish an information identification model include:
  • First, the sample data required for training the model is obtained. The sample data includes feature information of the sample website and the determined identification information corresponding to the feature information of the sample website.
  • The feature information of the sample website and the determined identification information corresponding to it may be obtained from a sample data set. The sample data set may be compiled manually, or may be built by taking websites whose identification information has already been determined and capturing the corresponding feature information from the search results page of a webmaster-tool website.
  • For example, if the official websites of the Fortune 500 companies have been identified as "excellent", those official websites are searched for on a webmaster-tool website and their feature information is captured from the search results page. The official websites of the Fortune 500 companies then serve as sample websites, their feature information serves as the feature information of the sample websites, and "excellent" serves as the determined identification information corresponding to that feature information.
  • The sample data required for training the model can also be obtained by compiling statistics on users' browsing records. For example, the identification information corresponding to websites that a large number of users visit repeatedly within a specific time period is determined to be "excellent", and such websites are used as sample websites.
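  • The browsing-statistics approach can be sketched as below. The application gives no thresholds, so `min_visits_per_user` and `min_repeat_users` are illustrative assumptions, as are the `(user_id, site, timestamp)` record format and the integer timestamps.

```python
from collections import Counter, defaultdict


def label_sites_from_browsing(visits, start, end,
                              min_visits_per_user=3, min_repeat_users=2):
    """Label as "excellent" the sites that many users visited repeatedly.

    visits: iterable of (user_id, site, timestamp) tuples.
    A user counts as a repeat visitor of a site after at least
    min_visits_per_user visits inside the [start, end] window.
    """
    visits_per_user_site = Counter()
    for user_id, site, timestamp in visits:
        if start <= timestamp <= end:
            visits_per_user_site[(user_id, site)] += 1

    repeat_users = defaultdict(int)
    for (user_id, site), count in visits_per_user_site.items():
        if count >= min_visits_per_user:
            repeat_users[site] += 1

    return {site: "excellent"
            for site, users in repeat_users.items()
            if users >= min_repeat_users}


visits = [
    ("u1", "a.example", 1), ("u1", "a.example", 2), ("u1", "a.example", 3),
    ("u2", "a.example", 2), ("u2", "a.example", 4), ("u2", "a.example", 5),
    ("u3", "b.example", 1),
    ("u3", "b.example", 99),  # outside the window, ignored
]
labels = label_sites_from_browsing(visits, start=0, end=10)
print(labels)
```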
  • Next, the identification information corresponding to the feature information of the sample website is predicted based on the initial model, and the predicted identification information is obtained, wherein the initial model is one of the following models: a support vector machine model, a decision tree model, a naive Bayesian model, or a logistic regression model.
  • It is then determined whether the predicted identification information is consistent with the determined identification information of the sample website. If not, the feature information of the sample website and the determined identification information corresponding to it are used as training data for the initial model, and the parameters of the initial model are modified based on the training data to obtain the information identification model.
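  • The predict, compare, and modify cycle can be illustrated with a small logistic regression model written in plain Python. This is a sketch under assumptions: the learning rate, epoch limit, two normalized features, and 0/1 labels (1 for positive identification) are illustrative choices, and the application does not prescribe this particular update rule.

```python
import math


def predict_proba(weights, bias, x):
    """Logistic regression probability that x has positive identification."""
    z = bias + sum(w * xi for w, xi in zip(weights, x))
    z = max(-60.0, min(60.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, x):
    return 1 if predict_proba(weights, bias, x) >= 0.5 else 0


def train_on_mismatches(samples, n_features, learning_rate=0.5, max_epochs=100):
    """Modify the parameters using only samples the current model mispredicts."""
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(max_epochs):
        # Compare predicted identification with the determined identification.
        mismatches = [(x, y) for x, y in samples if predict(weights, bias, x) != y]
        if not mismatches:
            break
        # Use the inconsistent samples as training data for a gradient step.
        for x, y in mismatches:
            error = y - predict_proba(weights, bias, x)
            weights = [w + learning_rate * error * xi for w, xi in zip(weights, x)]
            bias += learning_rate * error
    return weights, bias


# Hypothetical normalized features: (domain age, external link count).
samples = [([0.9, 0.8], 1), ([0.8, 0.9], 1), ([0.1, 0.2], 0), ([0.2, 0.1], 0)]
weights, bias = train_on_mismatches(samples, n_features=2)
print([predict(weights, bias, x) for x, _ in samples])
```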
  • For example, when the initial model is a support vector machine model, the LIBSVM software may be run.
  • The kernel function is set to a linear kernel.
  • With a linear kernel, the parameter that needs to be selected and tuned is the penalty parameter C.
  • The weight parameter scales the penalty parameter C for each class, so different weights can be set for positive and negative samples.
  • The penalty parameter C generally ranges from 0.0001 to 10000, and its value can be tuned on the training data described above.
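  • To make the roles of C and the class weights concrete, the sketch below enumerates a logarithmic grid over the stated range and evaluates the class-weighted soft-margin objective of a linear SVM. It is written in plain Python and does not call LIBSVM; the power-of-ten grid spacing, the function names, and the -1/+1 label encoding are assumptions.

```python
def c_grid(low_exp=-4, high_exp=4):
    """Candidate penalty values from 10**low_exp to 10**high_exp (0.0001 to 10000)."""
    return [10.0 ** k for k in range(low_exp, high_exp + 1)]


def per_class_penalty(c, class_weight):
    """Effective penalty per class: the class weight multiplied by C."""
    return {label: c * weight for label, weight in class_weight.items()}


def weighted_hinge_objective(w, b, samples, c, class_weight):
    """Linear soft-margin objective with class-weighted penalties.

    0.5 * ||w||^2 + sum_i C_{y_i} * max(0, 1 - y_i * (w . x_i + b)),
    where labels y_i are -1 or +1.
    """
    penalties = per_class_penalty(c, class_weight)
    regularizer = 0.5 * sum(wi * wi for wi in w)
    loss = 0.0
    for x, y in samples:
        margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
        loss += penalties[y] * max(0.0, 1.0 - margin)
    return regularizer + loss


samples = [([2.0, 0.0], 1), ([0.5, 0.0], -1)]
objective = weighted_hinge_objective([1.0, 0.0], 0.0, samples,
                                     c=1.0, class_weight={1: 1.0, -1: 2.0})
print(len(c_grid()), objective)
```

  • In this example the positive sample lies outside the margin and contributes nothing, while the misclassified negative sample has hinge loss 1.5 and is penalized at twice C.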
  • the feature information is acquired according to the candidate push information, and the feature information is imported into the information identification model; the information identification model finds the corresponding identification information according to the pre-trained correspondence relationship.
  • Step 406 Generate to-be-push information based on the candidate push information and the identifier information corresponding to the candidate push information.
  • Step 407 Push the information to be pushed.
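  • The decision order of steps 403 to 405 in flow 400 can be sketched as a short cascade. The `registry` and `violations` structures, the `extract_features` and `model_predict` callables, and the string constants are hypothetical stand-ins for the registration database, the violation behavior database, and the information identification model.

```python
POSITIVE = "positive"
NEGATIVE = "negative"
TRUSTED_ORGANIZER_TYPES = {
    "public institution", "government agency", "army", "social group",
}


def determine_identifier(site, registry, violations, extract_features, model_predict):
    """Return the identification information for a source website.

    registry:   site -> {"organizer": name, "organizer_type": nature}
    violations: set of site addresses and organizer names with violation records
    """
    record = registry.get(site, {})
    # Step 403: a trusted organizer nature yields positive identification.
    if record.get("organizer_type") in TRUSTED_ORGANIZER_TYPES:
        return POSITIVE
    # Step 404: a violation record for the site or its organizer yields negative.
    if site in violations or record.get("organizer") in violations:
        return NEGATIVE
    # Step 405: otherwise defer to the information identification model.
    return model_predict(extract_features(site))


registry = {
    "gov.example": {"organizer": "City Office", "organizer_type": "government agency"},
    "shop.example": {"organizer": "Acme Ltd", "organizer_type": "enterprise"},
}
violations = {"Acme Ltd"}


def stub_features(site):
    return [1.0]          # placeholder feature vector


def stub_model(features):
    return "from-model"   # placeholder model output


for site in ("gov.example", "shop.example", "blog.example"):
    print(site, determine_identifier(site, registry, violations, stub_features, stub_model))
```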
  • Identification information found to be erroneous, together with its corresponding feature data, can be used as new training data to retrain the information identification model and further improve its accuracy.
  • the flow 400 of the information push method in the present embodiment highlights the step of determining the identification information as compared with the embodiment corresponding to FIG. Therefore, the solution described in this embodiment can introduce more relevant data for determining the identification information, thereby realizing the determination of the identification information with higher accuracy and more effective information push.
  • The present application provides an embodiment of an information pushing apparatus, and the apparatus embodiment corresponds to the method embodiment shown in FIG. The apparatus can be used in a variety of electronic devices.
  • the information pushing apparatus 500 described in this embodiment includes an obtaining unit 501, a determining unit 502, a generating unit 503, and a pushing unit 504.
  • the obtaining unit 501 is configured to acquire candidate push information;
  • the determining unit 502 is configured to determine, according to the pre-trained information identification model, identifier information corresponding to the candidate push information;
  • the generating unit 503 is configured to generate the information to be pushed based on the candidate push information and the identification information corresponding to the candidate push information; and the pushing unit 504 is configured to push the information to be pushed.
  • the obtaining unit 501 of the information pushing device 500 can obtain candidate push information from the terminal or other server through a wired connection manner or a wireless connection manner.
  • after the obtaining unit 501 acquires the candidate push information, the determining unit 502 of the information pushing device 500, in which the information identification model has been trained in advance, can determine the identification information corresponding to the candidate push information based on that pre-trained model; the generating unit 503 may then generate the information to be pushed based on the candidate push information and the identification information corresponding to the candidate push information, and the pushing unit 504 may push the information to be pushed generated by the generating unit 503.
  • the determining unit 502 includes: a website confirmation subunit, configured to confirm the website from which the candidate push information originates; a feature information search subunit, configured to search for feature information of the website; a feature information import subunit, configured to import the feature information into the pre-trained information identification model; and an identification information acquisition subunit, configured to acquire the identification information that the information identification model determines for the feature information of the website, and to use it as the identification information corresponding to the candidate push information.
  • the feature information includes at least one of the following information of the website: server quantity information, domain name age information, ranking information, keyword ranking information, bounce rate information, external link count information, traffic information, weight information, and website organizer information.
  • the apparatus further includes an information identification model establishing unit, comprising: a sample data obtaining subunit, configured to acquire feature information of sample websites and the already determined identification information corresponding to that feature information; a predicted identification information obtaining subunit, configured to predict, based on an initial model, the identification information corresponding to the feature information of the sample websites, wherein the initial model is one of the following models: a support vector machine model, a decision tree model, a naive Bayesian model, or a logistic regression model; a predicted identification information judging subunit, configured to judge whether the identification information predicted by the initial model is consistent with the already determined identification information; and a parameter modification subunit, configured to, when the judging subunit finds the predicted and the already determined identification information inconsistent, use the feature information of the sample website and its already determined identification information as training data for the initial model, and modify the parameters of the initial model based on the training data to obtain the information identification model.
  • the identification information includes first identification information and second identification information, and the determining unit 502 includes: a first selection subunit, configured to select one of the first identification information and the second identification information as the identification information corresponding to the candidate push information, based on whether the retrieved filing information of the website includes a preset keyword; or a second selection subunit, configured to select one of the first identification information and the second identification information as the identification information corresponding to the candidate push information, based on whether the obtained set of user report information includes information about the website.
  • the above information pushing device 500 also includes some other well-known structures, such as processors and memories, which are not shown in FIG. 5 in order not to unnecessarily obscure the embodiments of the present disclosure.
  • referring to FIG. 6, a block diagram of a computer system 600 suitable for implementing a server of an embodiment of the present application is shown.
  • computer system 600 includes a central processing unit (CPU) 601 that can perform various appropriate actions and processes according to a program stored in read-only memory (ROM) 602 or a program loaded from storage portion 608 into random access memory (RAM) 603.
  • the RAM 603 also stores various programs and data required for the operation of the system 600.
  • the CPU 601, the ROM 602, and the RAM 603 are connected to each other through a bus 604.
  • An input/output (I/O) interface 605 is also coupled to bus 604.
  • the following components are connected to the I/O interface 605: an input portion 606 including a keyboard, a mouse, and the like; an output portion 607 including, for example, a cathode ray tube (CRT) or liquid crystal display (LCD) and a speaker; a storage portion 608 including a hard disk or the like; and a communication portion 609 including a network interface card such as a LAN card or a modem. The communication portion 609 performs communication processing via a network such as the Internet.
  • a drive 610 is also connected to the I/O interface 605 as needed.
  • a removable medium 611 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory or the like, is mounted on the drive 610 as needed so that a computer program read therefrom is installed into the storage portion 608 as needed.
  • an embodiment of the present disclosure includes a computer program product comprising a computer program tangibly embodied on a machine readable medium, the computer program comprising program code for executing the method illustrated in the flowchart.
  • the computer program can be downloaded and installed from the network via communication portion 609, and/or installed from removable media 611.
  • when the computer program is executed by the central processing unit (CPU) 601, the above-described functions defined in the method of the present application are performed.
  • each block in the flowchart or block diagram can represent a module, program segment, or portion of code, which includes one or more executable instructions for implementing the specified logical functions.
  • the functions noted in the blocks may also occur in a different order than that illustrated in the drawings. For example, two successively represented blocks may in fact be executed substantially in parallel, and they may sometimes be executed in the reverse order, depending upon the functionality involved.
  • each block of the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by a dedicated hardware-based system that performs the specified function or operation, or by a combination of dedicated hardware and computer instructions.
  • the units involved in the embodiments of the present application may be implemented by software or by hardware.
  • the described units may also be provided in a processor, which may be described, for example, as a processor including an obtaining unit, a determining unit, a generating unit, and a pushing unit.
  • in some cases the names of these units do not limit the units themselves; for example, the obtaining unit may also be described as “a unit that acquires candidate push information”.
  • the present application further provides a non-volatile computer storage medium, which may be the non-volatile computer storage medium included in the apparatus described in the foregoing embodiments, or a non-volatile computer storage medium that exists alone and is not assembled into a terminal.
  • the non-volatile computer storage medium stores one or more programs which, when executed by a device, cause the device to: obtain candidate push information; determine, based on the pre-trained information identification model, the identification information corresponding to the candidate push information; generate the information to be pushed based on the candidate push information and the identification information corresponding to the candidate push information; and push the information to be pushed.

Abstract

An information-pushing method and device, the method comprising: acquiring candidate information for pushing (201); determining, based on a pre-trained information identification model, identification information corresponding to the candidate information for pushing (202); generating information to be pushed on the basis of the candidate information for pushing and the identification information corresponding to the candidate information for pushing (203); and pushing the information to be pushed (204). The pushed information is thereby differentiated by its identification information, which enables users to acquire information in a more efficient manner.

Description

Information Push Method and Apparatus
Cross-Reference to Related Applications
The present application claims priority to Chinese Patent Application No. 201610029313.9, filed on January 15, 2016, the entire contents of which are incorporated into the present application as a whole.
Technical Field
The present application relates to the field of computer technologies, specifically to the field of Internet technologies, and in particular to an information push method and apparatus.
Background
With the development of Internet technology, the amount of information has grown geometrically. At the same time, information is unmanaged or poorly managed, and its release and dissemination are out of control, so that a large amount of false and useless information is produced, polluting the information environment and generating "information garbage".
However, existing information push approaches generally push the various candidate push information directly to the user without adding any identifier to it. As a result, the pushed items are not distinguished from one another by any identifier, and the user obtains information less efficiently.
Summary
The purpose of the present application is to propose an improved information push method and apparatus to solve the technical problems mentioned in the Background section above.
In a first aspect, the present application provides an information push method, the method comprising: acquiring candidate push information; determining, based on a pre-trained information identification model, identification information corresponding to the candidate push information; generating information to be pushed based on the candidate push information and the identification information corresponding to the candidate push information; and pushing the information to be pushed.
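The four steps of the method can be sketched roughly as follows. This is only an illustration under stated assumptions: `PushItem`, `push_pipeline`, and `toy_identify` are hypothetical names that do not appear in the application, and `toy_identify` is a trivial stand-in for the pre-trained information identification model.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class PushItem:
    content: str  # candidate push information
    label: str    # identification information attached to it


def push_pipeline(
    candidates: Iterable[str],
    identify: Callable[[str], str],
    send: Callable[[PushItem], None],
) -> List[PushItem]:
    """Acquire candidates, determine labels, generate items, and push them."""
    pushed = []
    for content in candidates:           # candidates have already been acquired
        label = identify(content)        # determine the identification information
        item = PushItem(content, label)  # generate the information to be pushed
        send(item)                       # push it
        pushed.append(item)
    return pushed


def toy_identify(content: str) -> str:
    # Hypothetical stand-in for the pre-trained model: trust one known host.
    return "V" if "example.gov" in content else ""


outbox: List[PushItem] = []
result = push_pipeline(
    ["notice from example.gov", "post from example.com"],
    toy_identify,
    outbox.append,
)
```

Passing the model and the delivery channel in as callables keeps the sketch independent of any particular classifier or transport.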
In some embodiments, determining, based on the pre-trained information identification model, the identification information corresponding to the candidate push information comprises: confirming the website from which the candidate push information originates; searching for feature information of the website and importing the feature information into the pre-trained information identification model; and acquiring the identification information that the information identification model determines for the feature information of the website, and using it as the identification information corresponding to the candidate push information.
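A minimal sketch of this lookup chain is shown below, assuming the candidate push information carries a URL. The feature table `FEATURE_DB`, its values, and the threshold rule in `toy_model` are invented for illustration; the application does not specify them.

```python
from typing import Callable, Dict, Optional
from urllib.parse import urlparse

# Hypothetical feature store keyed by host name (values are made up).
FEATURE_DB: Dict[str, Dict[str, float]] = {
    "example.gov": {"domain_age_years": 15, "bounce_rate": 0.2},
    "example.com": {"domain_age_years": 1, "bounce_rate": 0.8},
}


def source_website(url: str) -> Optional[str]:
    """Confirm the website the candidate push information comes from."""
    return urlparse(url).hostname


def label_for(url: str, model: Callable[[Dict[str, float]], str]) -> str:
    """Look up the website's features and let the model choose a label."""
    site = source_website(url)
    features = FEATURE_DB.get(site) if site else None
    if features is None:
        return ""  # no features found: leave the label empty
    return model(features)


def toy_model(features: Dict[str, float]) -> str:
    # Stand-in for the pre-trained model; the thresholds are arbitrary.
    old_enough = features["domain_age_years"] >= 5
    low_bounce = features["bounce_rate"] < 0.5
    return "V" if old_enough and low_bounce else ""
```

For example, `label_for("https://example.gov/notice/1", toy_model)` yields the positive label, while an unknown host yields an empty label.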
In some embodiments, the feature information includes at least one of the following information of the website: server quantity information, domain name age information, ranking information, keyword ranking information, bounce rate information, external link count information, traffic information, weight information, and website organizer information.
In some embodiments, the method further comprises a step of establishing the information identification model, comprising: acquiring the sample data required to train the model, wherein the sample data includes feature information of sample websites and the already determined identification information corresponding to that feature information; predicting, based on an initial model, the identification information corresponding to the feature information of the sample websites, wherein the initial model is one of the following models: a support vector machine model, a decision tree model, a naive Bayesian model, or a logistic regression model; judging whether the identification information predicted by the initial model is consistent with the already determined identification information; and, if not, using the feature information of the sample website and its already determined identification information as training data for the initial model, and modifying the parameters of the initial model based on the training data to obtain the information identification model.
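The predict, compare, and modify loop described here can be illustrated with a small mistake-driven linear classifier. Note that this perceptron-style update is a simplification chosen for brevity; it is not one of the four model types named in the application, and the two-dimensional feature vectors and labels below are invented sample data.

```python
from typing import List, Sequence, Tuple

# (feature vector, already determined label: 1 = trusted, 0 = not trusted)
Sample = Tuple[Sequence[float], int]


def predict(weights: Sequence[float], bias: float, features: Sequence[float]) -> int:
    """Predict a label from a weighted sum of the website features."""
    score = sum(w * x for w, x in zip(weights, features)) + bias
    return 1 if score > 0 else 0


def train(samples: List[Sample], epochs: int = 100, lr: float = 1.0):
    """Modify parameters only on samples whose prediction is inconsistent."""
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(epochs):
        mistakes = 0
        for features, known in samples:
            predicted = predict(weights, bias, features)
            if predicted != known:  # inconsistent: use as training data
                mistakes += 1
                delta = lr * (known - predicted)
                weights = [w + delta * x for w, x in zip(weights, features)]
                bias += delta
        if mistakes == 0:  # every prediction is consistent: stop
            break
    return weights, bias


# Invented, linearly separable sample data (two scaled features per website).
samples: List[Sample] = [
    ((0.9, 0.8), 1), ((0.1, 0.2), 0), ((0.8, 0.9), 1),
    ((0.2, 0.1), 0), ((0.7, 0.7), 1), ((0.3, 0.2), 0),
]
weights, bias = train(samples)
```

In practice one of the listed models, such as logistic regression, would replace this toy update rule; the control flow of predicting, checking consistency, and adjusting parameters stays the same.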
In some embodiments, the identification information includes first identification information and second identification information; and determining, based on the pre-trained model, the identification information corresponding to the candidate push information comprises: selecting one of the first identification information and the second identification information as the identification information corresponding to the candidate push information, based on whether the retrieved filing information of the website includes a preset keyword; or selecting one of the first identification information and the second identification information as the identification information corresponding to the candidate push information, based on whether the acquired set of user report information includes information about the website.
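The two selection rules might look like the sketch below. The label strings, the keyword list, and the direction of each choice (a keyword match gives the first label, a user report gives the second) are assumptions made for illustration; the paragraph above only states that one of the two labels is selected.

```python
from typing import Iterable, Set

FIRST_LABEL = "V"                 # assumed positive label
SECOND_LABEL = "not recommended"  # assumed negative label
PRESET_KEYWORDS = ("government", "university")  # hypothetical keywords


def select_by_filing(filing_info: str,
                     keywords: Iterable[str] = PRESET_KEYWORDS) -> str:
    """Choose a label by whether the filing information has a preset keyword."""
    if any(keyword in filing_info for keyword in keywords):
        return FIRST_LABEL
    return SECOND_LABEL


def select_by_reports(site: str, reported_sites: Set[str]) -> str:
    """Choose a label by whether the site appears in the user report set."""
    return SECOND_LABEL if site in reported_sites else FIRST_LABEL
```

For instance, `select_by_reports("bad.example.com", {"bad.example.com"})` returns the second label under these assumptions.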
In a second aspect, the present application provides an information push apparatus, the apparatus comprising: an obtaining unit configured to acquire candidate push information; a determining unit configured to determine, based on a pre-trained information identification model, identification information corresponding to the candidate push information; a generating unit configured to generate information to be pushed based on the candidate push information and the identification information corresponding to the candidate push information; and a pushing unit configured to push the information to be pushed.
In some embodiments, the determining unit comprises: a website confirmation subunit, configured to confirm the website from which the candidate push information originates; a feature information search subunit, configured to search for feature information of the website; a feature information import subunit, configured to import the feature information into the pre-trained information identification model; and an identification information acquisition subunit, configured to acquire the identification information that the information identification model determines for the feature information of the website, and to use it as the identification information corresponding to the candidate push information.
In some embodiments, the feature information includes at least one of the following information of the website: server quantity information, domain name age information, ranking information, keyword ranking information, bounce rate information, external link count information, traffic information, weight information, and website organizer information.
In some embodiments, the apparatus further comprises an information identification model establishing unit, comprising: a sample data obtaining subunit, configured to acquire feature information of sample websites and the already determined identification information corresponding to that feature information; a predicted identification information obtaining subunit, configured to predict, based on an initial model, the identification information corresponding to the feature information of the sample websites, wherein the initial model is one of the following models: a support vector machine model, a decision tree model, a naive Bayesian model, or a logistic regression model; a predicted identification information judging subunit, configured to judge whether the identification information predicted by the initial model is consistent with the already determined identification information; and a parameter modification subunit, configured to, when the judging subunit finds the predicted and the already determined identification information inconsistent, use the feature information of the sample website and its already determined identification information as training data for the initial model, and modify the parameters of the initial model based on the training data to obtain the information identification model.
In some embodiments, the identification information includes first identification information and second identification information; and the determining unit comprises: a first selection subunit, configured to select one of the first identification information and the second identification information as the identification information corresponding to the candidate push information, based on whether the retrieved filing information of the website includes a preset keyword; or a second selection subunit, configured to select one of the first identification information and the second identification information as the identification information corresponding to the candidate push information, based on whether the acquired set of user report information includes information about the website.
The information push method and apparatus provided by the present application acquire candidate push information, determine the identification information corresponding to the candidate push information based on a pre-trained information identification model, generate the information to be pushed based on the candidate push information and its corresponding identification information, and finally push the information to be pushed. The pushed information is thereby differentiated by its identification, so that the user obtains information more efficiently.
Brief Description of the Drawings
Other features, objects, and advantages of the present application will become more apparent from the following detailed description of non-limiting embodiments, made with reference to the accompanying drawings:
FIG. 1 is an exemplary system architecture diagram to which the present application can be applied;
FIG. 2 is a flowchart of one embodiment of an information push method according to the present application;
FIG. 3 is a schematic diagram of an application scenario of an information push method according to the present application;
FIG. 4 is a flowchart of still another embodiment of an information push method according to the present application;
FIG. 5 is a schematic structural diagram of one embodiment of an information push apparatus according to the present application;
FIG. 6 is a schematic structural diagram of a computer system suitable for implementing a terminal device or a server of an embodiment of the present application.
Detailed Description
The present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein merely explain the relevant invention and do not limit it. It should also be noted that, for ease of description, only the parts related to the relevant invention are shown in the drawings.
It should be noted that the embodiments of the present application and the features in the embodiments may be combined with each other provided they do not conflict. The present application is described in detail below with reference to the accompanying drawings and in conjunction with the embodiments.
FIG. 1 shows an exemplary system architecture 100 to which embodiments of the information push method or information push apparatus of the present application can be applied.
As shown in FIG. 1, the system architecture 100 may include terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 serves as the medium providing communication links between the terminal devices 101, 102, 103 and the server 105. The network 104 may include various connection types, such as wired or wireless communication links, or fiber optic cables.
A user may use the terminal devices 101, 102, 103 to interact with the server 105 over the network 104, to receive or send messages and the like. Various communication client applications may be installed on the terminal devices 101, 102, 103, such as web browser applications, shopping applications, search applications, map applications, instant messaging tools, mailbox clients, and social platform software.
The terminal devices 101, 102, 103 may be various electronic devices that have a display screen and support web browsing, including but not limited to smartphones, tablet computers, e-book readers, MP3 (Moving Picture Experts Group Audio Layer III) players, MP4 (Moving Picture Experts Group Audio Layer IV) players, laptop computers, and desktop computers.
It should be noted that the information push method provided by the embodiments of the present application is generally executed by the server 105; accordingly, the information push apparatus is generally provided in the server 105.
In some cases, the server 105 may also acquire candidate push information directly from other servers and push the information to be pushed to other servers, or the server 105 may itself store the candidate push information. In such cases, the system architecture used by the present application need not involve the terminal devices 101, 102, 103.
It should be understood that the numbers of terminal devices, networks, and servers in FIG. 1 are merely illustrative. There may be any number of terminal devices, networks, and servers, depending on implementation needs.
With continued reference to FIG. 2, a flow 200 of one embodiment of an information push method according to the present application is shown. The information push method includes the following steps:
Step 201: acquire candidate push information.
In this embodiment, if the candidate push information is generated from search result information, the electronic device on which the information push method runs (for example, the server shown in FIG. 1) may acquire the candidate push information as follows. First, it obtains the user's search request. Next, it queries the corresponding search result information based on the search request. At this point, the search result information may be used directly as the candidate push information, or filter conditions may be set according to actual needs, the search result information filtered, and the filtered search result information used as the candidate push information. For example, if the candidate push information needs to be timely, a time limit can be set and only the search results falling within that time limit retained.
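The time-limit filter mentioned above could be sketched as follows. The record layout (a dict with `title` and `published` fields) and the function name `filter_by_time_limit` are assumptions for illustration only.

```python
from datetime import datetime, timedelta
from typing import Dict, List


def filter_by_time_limit(results: List[Dict], now: datetime,
                         time_limit: timedelta) -> List[Dict]:
    """Keep only search results published within the time limit before `now`."""
    return [
        r for r in results
        if timedelta(0) <= now - r["published"] <= time_limit
    ]


now = datetime(2016, 1, 15, 12, 0)
results = [
    {"title": "fresh", "published": datetime(2016, 1, 15, 9, 0)},
    {"title": "stale", "published": datetime(2016, 1, 1, 9, 0)},
]
# With a one-day limit, only the result published three hours ago remains.
candidates = filter_by_time_limit(results, now, timedelta(days=1))
```

Passing `now` explicitly rather than reading the clock inside the function keeps the filter deterministic and easy to test.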
在本实施例中,信息推送方法运行于其上的电子设备(例如图1所示的服务器)也可以通过有线连接方式或者无线连接方式,直接从搜索服务器处获取搜索结果信息,将搜索结果信息作为候选推送信息。In this embodiment, the electronic device (for example, the server shown in FIG. 1) on which the information pushing method runs may also directly obtain search result information from the search server through a wired connection manner or a wireless connection manner, and the search result information is obtained. As candidate push information.
在本实施例中,候选推送信息也可以是根据用户的账户信息、历史推送信息而获取的。例如,用户的账户信息中记录了其工作的行业,则可将上述行业的行业动态信息作为候选推送信息。In this embodiment, the candidate push information may also be acquired according to the user's account information and historical push information. For example, if the user's account information records the industry in which it works, the industry dynamic information of the above-mentioned industries may be used as candidate push information.
步骤202,基于预先训练的信息标识模型确定与所述候选推送信息对应的标识信息。Step 202: Determine identification information corresponding to the candidate push information based on the pre-trained information identification model.
在本实施例中,标识信息包括以下至少一项:图像信息、文字信息、声音信息。标识信息可以用于指示候选推送信息是否安全可信。例如,候选推送信息的来源如果是政府机关的网站,则认为其安全可信,其对应的标识信息就是积极的,如,“优”、“顶级”的字样或图像,同样可以是字母“V”的字样或图像,进一步,可用“V1”、“V2”、“V3”的字样或图像表示其可信的程度;候选推送信息的来源如果是有过被举报记录的网站,则认为其不安全可信,其对应的标识信息就是消极的,如,“非优质”、“不推荐”的字样或图像。若未判断出候选推送信息对应的是积极的标识信息,也可以将其对应的标识信息设置为空,标识信息为空与积极的标识信息也可以形成对照。In this embodiment, the identification information includes at least one of the following: image information, text information, and sound information. The identification information can be used to indicate whether the candidate push information is secure and trustworthy. For example, if the source of the candidate push information is the website of the government agency, it is considered to be safe and trustworthy, and the corresponding identification information is positive, for example, the words "good" or "top" or the image may also be the letter "V". "" words or images, further, the words "V1", "V2", "V3" can be used to indicate the degree of credibility; if the source of the candidate push information is a website that has been reported, it is considered not to It is safe and trustworthy, and its corresponding identification information is negative, such as "non-premium", "not recommended" words or images. If it is not determined that the candidate push information corresponds to the positive identification information, the corresponding identification information may be set to be empty, and the identification information may be empty and the positive identification information may also be formed into a comparison.
上述电子设备得到候选推送信息后,可以在预设的数据库中查询上述候选推送信息的特征信息。具体的,可以先对候选推送信息进行统计分析和/或语义分析,提取至少一个关键词,例如组织机构名称,或网址,再基于关键信息在预设的数据库中查询关键信息对应的特征信息。当然,也可以先获取上述候选推送信息所来源的网站,在提供网站信息的站长工具类网站,通过SEO(Search Engine Optimization,搜索引擎优化)等查询工具,对上述候选推送信息所来源的网站进行 搜索操作,并抓取搜索结果页面中的信息作为特征信息。得到特征信息后,将特征信息导入预先训练的信息标识模型;按照信息标识模型预先训练好的对应关系,得到与上述特征信息对应的标识信息,与上述特征信息对应的标识信息即为与上述候选推送信息对应的标识信息。After obtaining the candidate push information, the electronic device may query the feature information of the candidate push information in a preset database. Specifically, the candidate push information may be statistically analyzed and/or semantically analyzed, and at least one keyword, such as an organization name or a web address, may be extracted, and then the feature information corresponding to the key information is queried in a preset database based on the key information. Of course, the website from which the candidate push information is derived may be obtained first, and the website from which the candidate push information is derived may be obtained through a search tool such as SEO (Search Engine Optimization) on the webmaster tool website that provides the website information. get on Search for the action and grab the information from the search results page as feature information. After obtaining the feature information, the feature information is imported into the pre-trained information identification model; according to the pre-trained correspondence relationship of the information identification model, the identification information corresponding to the feature information is obtained, and the identification information corresponding to the feature information is the candidate The identification information corresponding to the push information.
In this embodiment, the feature information includes at least one of the following items of information about the website: number of servers, domain name age, ranking, keyword ranking, bounce rate, number of external links, traffic, weight, and the organizer of the website.
The ranking information of the website may be obtained from the Alexa ranking system. Keyword ranking is a way of reflecting a web page's rank in search engine results according to its relevance to words and phrases. Natural (organic) keyword ranking generally reflects the search engine's automatic analysis and ranking of all relevant crawled pages; search engine websites usually provide keyword ranking information for a website. The bounce rate is the percentage of visits in which the user leaves after viewing only one page, relative to the total number of visits to a page or group of pages. For example, if 1000 different visitors enter a website through a given link within a certain period and 50 of them leave without viewing a second page, the bounce rate for that entry URL is 50/1000 = 5%. The number of external links is the number of links pointing to the website from other websites, and can be obtained with common backlink analysis tools. Website traffic is the volume of visits to a website, an indicator describing the number of users visiting it and the number of pages they view. The traffic information may be historical traffic or estimated traffic. The weight information of a website usually refers to a search engine's overall evaluation of the website; as examples, Baidu weight, Google's PR (PageRank), or Sogou's SR (Sogou Rank) may be used.
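The feature items and the bounce-rate arithmetic above can be illustrated with a minimal Python sketch. The `SiteFeatures` class and its field names are hypothetical and chosen only for illustration; the source does not prescribe any data layout.

```python
from dataclasses import dataclass


@dataclass
class SiteFeatures:
    """Hypothetical container for some of the website features listed above."""
    server_count: int
    domain_age_years: float
    ranking: int
    bounce_rate: float
    external_links: int
    weight: int


def bounce_rate(single_page_visits: int, total_visits: int) -> float:
    """Fraction of visits that ended after viewing only one page."""
    if total_visits <= 0:
        raise ValueError("total_visits must be positive")
    return single_page_visits / total_visits


# The worked example from the text: 50 of 1000 visitors leave immediately.
rate = bounce_rate(50, 1000)
print(f"{rate:.0%}")  # prints 5%
```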
In some optional implementations of this embodiment, the filing (registration) information of the website may first be searched, and one of the first identification information and the second identification information is selected as the identification information corresponding to the candidate push information, based on whether the filing information contains a preset keyword. For example, the filing record of the website may be searched and the information in the field for the nature of the organizer crawled, to determine whether it contains a keyword such as public institution, government agency, military, or social organization. If so, the identification information corresponding to the candidate push information is determined to be the first identification information. The first identification information is positive identification information, such as the words "Excellent" or "Top" or a corresponding image.
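A minimal sketch of the keyword check on a filing record follows. The record layout (`organizer_type` key), the English keyword strings, and the label text are assumptions made for illustration; a real filing database would define its own fields.

```python
from typing import Optional

# Assumed English renderings of the preset keywords named in the text.
TRUSTED_KEYWORDS = (
    "public institution",
    "government agency",
    "military",
    "social organization",
)


def label_from_filing(record: dict) -> Optional[str]:
    """Return the positive label if the organizer type matches a preset keyword.

    `record` is a hypothetical dict holding one crawled filing record.
    None means this check alone cannot decide the label.
    """
    organizer_type = record.get("organizer_type", "").lower()
    if any(keyword in organizer_type for keyword in TRUSTED_KEYWORDS):
        return "Excellent"
    return None
```

Returning `None` rather than a negative label keeps this check composable with other checks, such as the user-report lookup.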
In some optional implementations of this embodiment, a set of user report information may be obtained, and one of the first identification information and the second identification information is selected as the identification information corresponding to the candidate push information, based on whether the set contains information about the website. The set of user report information may be the historical user reports collected by the server. Each report includes the reported matter and the reported object, where the reported object may be a website URL or the name of the website's organizer. For example, a user reports that website 1 contains false content; after verification confirms this, the server records the report. If the candidate push information originates from website 1, the identification information corresponding to the candidate push information is determined to be the second identification information. The second identification information is negative identification information, such as the words "Non-premium" or "Not recommended" or a corresponding image, and may also be empty.
Step 203: generate information to be pushed based on the candidate push information and the identification information corresponding to the candidate push information.
In this embodiment, the electronic device may combine the candidate push information with its corresponding identification information to form the information to be pushed. For example, when the identification information is image information, the corresponding image may be added to a designated part of the candidate push information.
Step 204: push the information to be pushed.
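Steps 203 and 204 can be sketched as follows. The dictionary layout of a candidate and the in-memory `outbox` list standing in for the delivery channel are assumptions for illustration only.

```python
from typing import Optional


def generate_push_info(candidate: dict, label: Optional[str]) -> dict:
    """Step 203: attach the identification information to a copy of the candidate."""
    item = dict(candidate)          # leave the original candidate untouched
    item["label"] = label or ""     # an empty label is permitted by the text
    return item


def push(item: dict, outbox: list) -> None:
    """Step 204: hand the item to a (hypothetical) delivery channel."""
    outbox.append(item)


outbox: list = []
push(generate_push_info({"title": "News site 1", "url": "http://example.com"}, "Excellent"), outbox)
```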
Continuing with FIG. 3, which is a schematic diagram of an application scenario of the information pushing method according to this embodiment. In the application scenario of FIG. 3, the user first initiates a search request with the search keyword "news". The information identification server then obtains the search result information in the background as candidate push information and extracts the feature information corresponding to the candidate push information. Next, the information identification server feeds the feature information into the pre-trained information identification model and determines that the identification information corresponding to news website 1 and news website 2 among the candidate push information is positive; it combines the positive label "Excellent" with the candidate push information to generate the information to be pushed, and finally pushes that information. When the user browses the search results, hovering over or clicking the "Excellent" label may display some or all of the feature information as needed, for example in a floating window.
The method provided by the above embodiment of the present application determines the identification information corresponding to the candidate push information and generates the information to be pushed based on the candidate push information and that identification information. Pushed items are thereby distinguished from one another by their labels, so that users obtain information more efficiently.
With further reference to FIG. 4, a flow 400 of another embodiment of the information pushing method is shown. The flow 400 of the information pushing method includes the following steps:
Step 401: obtain candidate push information.
In this embodiment, the candidate push information may be generated based on search result information associated with a search operation, or generated based on the user's account information and historical push information.
Step 402: obtain the website from which the candidate push information originates.
Typically, the candidate push information directly contains information about the website from which it originates; the website name or URL can be extracted by performing statistical analysis and/or semantic analysis on the candidate push information.
Step 403: determine whether the organizer of the website is a public institution, a government agency, a military body, or a social organization.
In this embodiment, after the website from which the candidate push information originates has been obtained, the nature of the organizer corresponding to that website may be queried in a website filing information database. Table 1 shows some records in the filing information database.
Table 1: Some records in the filing information database
Figure PCTCN2016087453-appb-000001
Similarly, the website from which the candidate push information originates may be looked up on a website filing lookup site, and the nature of the organizer corresponding to the website obtained by crawling. If the organizer of that website is a public institution, a government agency, a military body, or a social organization, the identification information corresponding to the candidate push information is determined to be positive identification information, and the flow proceeds to step 406; otherwise, the flow proceeds to step 404.
As shown in Table 1, the organizer of the State Intellectual Property Office government website is a government agency, so the information provided by its website is highly reliable. Therefore, if the candidate push information originates from the State Intellectual Property Office government website, its corresponding identification information is determined to be positive identification information.
Step 404: determine whether the violation database contains a record of the website from which the candidate push information originates, or a record of the organizer of that website.
The records in the violation database may be obtained from users' historical reports, from the National Enterprise Credit Information Publicity System, or from published lists of seriously illegal and dishonest enterprises. If the violation database contains a record of the website from which the candidate push information originates, or a record of the organizer of that website, the identification information corresponding to the candidate push information is determined to be negative identification information, and the flow proceeds to step 406; otherwise, the flow proceeds to step 405.
Step 405: obtain feature information according to the candidate push information, and feed the feature information into the information identification model.
The steps of establishing the information identification model include:
First, obtain the sample data required for training the model, where the sample data includes the feature information of sample websites and the already determined identification information corresponding to that feature information.
The feature information of the sample websites and the corresponding determined identification information may be taken from a sample data set. The sample data set may be set manually, or may be obtained by taking websites whose identification information has already been determined and crawling their feature information from the search result pages of a webmaster-tool website.
For example, if the identification information for the official websites of Fortune Global 500 companies has been determined to be "Excellent", those official websites are searched for on a webmaster-tool website and their feature information is crawled from the search result pages. The official websites are used as the sample websites, their feature information as the feature information of the sample websites, and "Excellent" as the determined identification information corresponding to that feature information.
The sample data required for training the model may also be obtained from statistics on user browsing records. For example, the identification information for websites that a large number of users visit repeatedly within a specific time period is determined to be "Excellent", and such websites are used as sample websites.
Second, use an initial model to predict the identification information corresponding to the feature information of each sample website, and obtain the predicted identification information, where the initial model is one of the following models: a support vector machine model, a decision tree model, a naive Bayes model, or a logistic regression model.
Third, determine whether the identification information predicted by the initial model for the feature information of the sample website is consistent with the already determined identification information for that feature information.
If not, the feature information of the sample website and its determined identification information are used as training data for the initial model, and the parameters of the initial model are modified based on the training data to obtain the information identification model.
For example, suppose the feature information of a website can correspond to only two kinds of identification information, the first identification information or the second identification information, where the second identification information may be empty. When the initial model is a support vector machine model, the LIBSVM software may be run with the kernel function set to the linear kernel. The parameters to be selected and tuned for the linear kernel are the penalty parameter C and the weight parameter weight, where weight adjusts the value of C for each class. The weight may be set to the ratio of positive to negative samples (that is, the ratio of sample websites corresponding to the first identification information to sample websites corresponding to the second identification information). The penalty parameter C generally ranges from 0.0001 to 10000, and its value may be tuned according to the training data.
Fourth, obtain feature information according to the candidate push information and feed it into the information identification model; the model finds the identification information corresponding to the feature information according to the correspondence it has learned.
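The four steps above (predict, compare with the known label, update the parameters on a mismatch, then use the model on new features) can be sketched with a perceptron-style linear classifier. This is a deliberately simple stand-in chosen for brevity, not one of the four model types named in the text; the feature vectors, the +1/-1 label encoding, and the learning rate are all assumptions.

```python
def predict(w, b, x):
    """Return +1 (first identification information) or -1 (second)."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if score > 0 else -1


def train(samples, epochs=20, lr=0.1):
    """Mistake-driven training: parameters change only when the prediction
    disagrees with the already determined label of a sample website."""
    w = [0.0] * len(samples[0][0])
    b = 0.0
    for _ in range(epochs):
        mistakes = 0
        for x, y in samples:
            if predict(w, b, x) != y:          # step three: inconsistent?
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y                    # modify the model parameters
                mistakes += 1
        if mistakes == 0:                      # every sample is consistent
            break
    return w, b


# Hypothetical, already normalized two-feature samples.
samples = [([1.0, 0.0], 1), ([0.9, 0.1], 1), ([0.0, 1.0], -1), ([0.1, 0.9], -1)]
w, b = train(samples)
label = "Excellent" if predict(w, b, [0.8, 0.2]) == 1 else ""
```

In practice the features listed earlier (ranking, bounce rate, external links, and so on) would need scaling before being used in a linear model of this kind.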
Step 406: generate information to be pushed based on the candidate push information and the identification information corresponding to the candidate push information.
Step 407: push the information to be pushed.
After receiving the pushed information, a user who doubts the correctness of the identification information may report the error to the server, for example by clicking an error-report button. After collecting the erroneous identification information, the server may use it and its corresponding feature data as new training data to retrain the information identification model, further improving the accuracy of the model.
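The feedback loop described above might be sketched as a small buffer that accumulates error reports and triggers retraining. The `retrain` callback, the batch threshold, and the stored tuple layout are hypothetical; the source does not say when retraining happens or how a corrected label is derived from a report.

```python
class FeedbackBuffer:
    """Collects user error reports and hands them to a retraining callback."""

    def __init__(self, retrain, threshold=100):
        self.retrain = retrain        # hypothetical callable taking a list
        self.threshold = threshold    # assumed batch size before retraining
        self.pending = []

    def report_error(self, features, shown_label):
        """Record one report: the feature data and the label the user disputed."""
        self.pending.append((features, shown_label))
        if len(self.pending) >= self.threshold:
            self.retrain(list(self.pending))
            self.pending.clear()
```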
As can be seen from FIG. 4, compared with the embodiment corresponding to FIG. 2, the flow 400 of the information pushing method in this embodiment highlights the step of determining the identification information. The solution described in this embodiment can therefore draw on more data relevant to determining the identification information, achieving more accurate identification information and more effective information pushing.
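The order of checks in flow 400 (steps 403 to 405) can be summarized in a short sketch. The argument names, the set-based violation database, and the `model` callable are assumptions for illustration; only the order of the checks comes from the text.

```python
from typing import Callable

POSITIVE = "Excellent"
NEGATIVE = "Not recommended"
TRUSTED_TYPES = {"public institution", "government agency", "military", "social organization"}


def determine_label(site: str, organizer: str, organizer_type: str,
                    violation_db: set, model: Callable[[str], str]) -> str:
    """Apply the checks of flow 400 in order and return the label."""
    if organizer_type in TRUSTED_TYPES:                    # step 403
        return POSITIVE
    if site in violation_db or organizer in violation_db:  # step 404
        return NEGATIVE
    return model(site)                                     # step 405: fall back to the model
```

The rule-based checks come first, so the trained model is consulted only for websites that neither rule can decide.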
With further reference to FIG. 5, as an implementation of the methods shown in the above figures, the present application provides an embodiment of an information pushing apparatus. This apparatus embodiment corresponds to the method embodiment shown in FIG. 2, and the apparatus may be applied in various electronic devices.
As shown in FIG. 5, the information pushing apparatus 500 of this embodiment includes an obtaining unit 501, a determining unit 502, a generating unit 503, and a pushing unit 504. The obtaining unit 501 is configured to obtain candidate push information; the determining unit 502 is configured to determine, based on a pre-trained information identification model, the identification information corresponding to the candidate push information; the generating unit 503 is configured to generate information to be pushed based on the candidate push information and the identification information corresponding to the candidate push information; and the pushing unit 504 is configured to push the information to be pushed.
In this embodiment, the obtaining unit 501 of the information pushing apparatus 500 may obtain the candidate push information from a terminal or another server through a wired or wireless connection.
In this embodiment, the obtaining unit 501 obtains the candidate push information, and an information identification model has been trained in advance on the information pushing apparatus 500. The determining unit 502 can therefore determine the identification information corresponding to the candidate push information based on the pre-trained information identification model, the generating unit 503 can generate the information to be pushed based on the candidate push information and its corresponding identification information, and the pushing unit 504 can push the information generated by the generating unit 503.
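The division of the apparatus into four units can be mirrored in software as a class that wires four callables together. This is only one possible software reading of FIG. 5; the class name and the callable signatures are hypothetical.

```python
class InformationPushDevice:
    """Chains stand-ins for units 501 to 504 in the order given in the text."""

    def __init__(self, obtain, determine, generate, push):
        self.obtain = obtain          # unit 501: request -> candidate
        self.determine = determine    # unit 502: candidate -> label
        self.generate = generate      # unit 503: (candidate, label) -> item
        self.push = push              # unit 504: item -> delivery result

    def run(self, request):
        candidate = self.obtain(request)
        label = self.determine(candidate)
        item = self.generate(candidate, label)
        return self.push(item)


device = InformationPushDevice(
    obtain=lambda req: {"title": req},
    determine=lambda cand: "Excellent",
    generate=lambda cand, label: {**cand, "label": label},
    push=lambda item: item,
)
result = device.run("news")
```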
In some embodiments, the determining unit 502 includes: a website confirmation subunit, configured to confirm the website from which the candidate push information originates; a feature information search subunit, configured to search for the feature information of the website; a feature information import subunit, configured to feed the feature information into the pre-trained information identification model; and an identification information obtaining subunit, configured to obtain the identification information that the information identification model determines for the feature information of the website, and to use it as the identification information corresponding to the candidate push information.
In some embodiments, the feature information includes at least one of the following items of information about the website: number of servers, domain name age, ranking, keyword ranking, bounce rate, number of external links, traffic, weight, and the organizer of the website.
In some embodiments, the apparatus further includes an information identification model establishing unit, which includes: a sample data obtaining subunit, configured to obtain the feature information of sample websites and the already determined identification information corresponding to that feature information; a predicted identification information obtaining subunit, configured to use an initial model to predict the identification information corresponding to the feature information of the sample websites and to obtain the predicted identification information, where the initial model is one of the following models: a support vector machine model, a decision tree model, a naive Bayes model, or a logistic regression model; a predicted identification information judging subunit, configured to determine whether the identification information predicted by the initial model is consistent with the already determined identification information for the feature information of the sample website; and a parameter modification subunit, configured to, when the judging subunit determines that the two are inconsistent, use the feature information of the sample website and its determined identification information as training data for the initial model, and modify the parameters of the initial model based on the training data to obtain the information identification model.
In some embodiments, the identification information includes first identification information and second identification information, and the determining unit 502 includes: a first selection subunit, configured to select one of the first identification information and the second identification information as the identification information corresponding to the candidate push information, based on whether the retrieved filing information of the website contains a preset keyword; or a second selection subunit, configured to select one of the first identification information and the second identification information as the identification information corresponding to the candidate push information, based on whether the obtained set of user report information contains information about the website.
Those skilled in the art will understand that the information pushing apparatus 500 also includes other well-known structures, such as a processor and a memory. In order not to unnecessarily obscure the embodiments of the present disclosure, these well-known structures are not shown in FIG. 5.
Referring now to FIG. 6, a schematic structural diagram of a computer system 600 suitable for implementing a server of an embodiment of the present application is shown.
As shown in FIG. 6, the computer system 600 includes a central processing unit (CPU) 601, which can perform various appropriate actions and processes according to a program stored in a read-only memory (ROM) 602 or a program loaded from a storage portion 608 into a random access memory (RAM) 603. The RAM 603 also stores the various programs and data required for the operation of the system 600. The CPU 601, the ROM 602, and the RAM 603 are connected to one another through a bus 604. An input/output (I/O) interface 605 is also connected to the bus 604.
The following components are connected to the I/O interface 605: an input portion 606 including a keyboard, a mouse, and the like; an output portion 607 including a cathode ray tube (CRT), a liquid crystal display (LCD), or the like, as well as a speaker; a storage portion 608 including a hard disk or the like; and a communication portion 609 including a network interface card such as a LAN card or a modem. The communication portion 609 performs communication processing via a network such as the Internet. A drive 610 is also connected to the I/O interface 605 as needed. A removable medium 611, such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, is mounted on the drive 610 as needed, so that a computer program read from it can be installed into the storage portion 608 as needed.
In particular, according to embodiments of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, an embodiment of the present disclosure includes a computer program product comprising a computer program tangibly embodied on a machine-readable medium, the computer program containing program code for performing the methods shown in the flowcharts. In such an embodiment, the computer program may be downloaded and installed from a network through the communication portion 609, and/or installed from the removable medium 611. When the computer program is executed by the central processing unit (CPU) 601, the functions defined in the methods of the present application are performed.
The flowcharts and block diagrams in the figures illustrate the possible architecture, functionality, and operation of systems, methods, and computer program products according to various embodiments of the present application. In this regard, each block in a flowchart or block diagram may represent a module, a program segment, or a portion of code that contains one or more executable instructions for implementing the specified logical function. It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur in an order different from that noted in the figures. For example, two blocks shown in succession may in fact be executed substantially in parallel, or sometimes in the reverse order, depending on the functionality involved. It should further be noted that each block of the block diagrams and/or flowcharts, and combinations of such blocks, can be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
描述于本申请实施例中所涉及到的单元可以通过软件的方式实现,也可以通过硬件的方式来实现。所描述的单元也可以设置在处理器中,例如,可以描述为:一种处理器包括获取单元、确定单元、生成单元和推送单元。其中,这些单元的名称在某种情况下并不构成对该单元本身的限定,例如,获取单元还可以被描述为“获取候选推送信息的单元”。The units involved in the embodiments of the present application may be implemented by software or by hardware. The described unit may also be provided in the processor, for example, as a processor including an acquisition unit, a determination unit, a generation unit, and a push unit. The names of these units do not constitute a limitation on the unit itself in some cases. For example, the obtaining unit may also be described as “a unit that acquires candidate push information”.
In another aspect, the present application further provides a non-volatile computer storage medium. The non-volatile computer storage medium may be the one included in the apparatus described in the foregoing embodiments, or it may exist separately without being assembled into a terminal. The non-volatile computer storage medium stores one or more programs which, when executed by a device, cause the device to: acquire candidate push information; determine, based on a pre-trained information identification model, identification information corresponding to the candidate push information; generate information to be pushed based on the candidate push information and the identification information corresponding to the candidate push information; and push the information to be pushed.
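The four operations listed above (acquire, determine identification information, generate, push) can be pictured with a minimal Python sketch. Everything named here is a hypothetical illustration rather than the applicant's implementation: the `Candidate` and `PushItem` shapes, the use of the URL host as "the website", the `toy_features` lookup table, and the `toy_model` threshold rule that stands in for the pre-trained information identification model.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List
from urllib.parse import urlparse


@dataclass
class Candidate:
    """One piece of candidate push information (hypothetical shape)."""
    title: str
    url: str


@dataclass
class PushItem:
    """A candidate together with the identification information attached to it."""
    title: str
    url: str
    label: str


def source_website(candidate: Candidate) -> str:
    """Assumption: the source website is taken to be the host part of the URL."""
    return urlparse(candidate.url).netloc


def build_push_items(
    candidates: List[Candidate],
    lookup_features: Callable[[str], Dict[str, float]],
    model: Callable[[Dict[str, float]], str],
) -> List[PushItem]:
    """Determine a label for each candidate and generate the items to push."""
    items = []
    for c in candidates:
        site = source_website(c)          # website the candidate comes from
        features = lookup_features(site)  # feature information of that website
        label = model(features)           # identification information from the model
        items.append(PushItem(c.title, c.url, label))
    return items


def push(items: List[PushItem], send: Callable[[str], None] = print) -> None:
    """Push each item; `send` is a stand-in for a real delivery channel."""
    for item in items:
        send(f"[{item.label}] {item.title} ({item.url})")


# Hypothetical stand-ins so the sketch runs end to end.
_FEATURE_TABLE = {
    "old.example.com": {"domain_age_years": 12.0},
    "new.example.net": {"domain_age_years": 0.5},
}


def toy_features(site: str) -> Dict[str, float]:
    """Look up made-up feature values; unknown sites get a domain age of zero."""
    return _FEATURE_TABLE.get(site, {"domain_age_years": 0.0})


def toy_model(features: Dict[str, float]) -> str:
    """Placeholder rule, not a trained model: older domains get 'verified'."""
    return "verified" if features["domain_age_years"] >= 5 else "unverified"
```

Calling `push(build_push_items(candidates, toy_features, toy_model))` prints one labelled line per candidate. Passing the feature lookup and the model in as callables keeps the pipeline independent of how the model was trained.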
The above description is merely a preferred embodiment of the present application and an explanation of the technical principles employed. Those skilled in the art should understand that the scope of the invention referred to in the present application is not limited to technical solutions formed by the specific combination of the above technical features, and should also cover other technical solutions formed by any combination of the above technical features or their equivalents without departing from the inventive concept, for example, technical solutions formed by replacing the above features with (but not limited to) technical features having similar functions disclosed in the present application.

Claims (12)

  1. An information pushing method, characterized in that the method comprises:
    acquiring candidate push information;
    determining, based on a pre-trained information identification model, identification information corresponding to the candidate push information;
    generating information to be pushed based on the candidate push information and the identification information corresponding to the candidate push information; and
    pushing the information to be pushed.
  2. The method according to claim 1, characterized in that determining, based on the pre-trained information identification model, the identification information corresponding to the candidate push information comprises:
    identifying the website from which the candidate push information originates;
    searching for feature information of the website;
    importing the feature information into the pre-trained information identification model; and
    acquiring identification information that is determined by the information identification model and corresponds to the feature information of the website, and using the identification information corresponding to the feature information of the website as the identification information corresponding to the candidate push information.
  3. The method according to claim 2, characterized in that the feature information comprises at least one of the following items of information about the website: server quantity information, domain name age information, ranking information, keyword ranking information, bounce rate information, external link count information, traffic information, weight information, and information about the website's sponsoring organization.
  4. The method according to claim 1 or 2, characterized in that the method further comprises:
    a step of establishing the information identification model, comprising:
    acquiring sample data required for training the model, wherein the sample data comprises feature information of a sample website and already determined identification information corresponding to the feature information of the sample website;
    predicting, based on an initial model, the identification information corresponding to the feature information of the sample website, and acquiring the identification information predicted by the initial model, wherein the initial model is one of the following models: a support vector machine model, a decision tree model, a naive Bayes model, and a logistic regression model;
    determining whether the identification information predicted by the initial model for the feature information of the sample website is consistent with the already determined identification information corresponding to the feature information of the sample website; and
    if not, using the feature information of the sample website and the already determined identification information corresponding to it as training data for the initial model, and modifying parameters of the initial model based on the training data to obtain the information identification model.
  5. The method according to claim 2, characterized in that the identification information comprises first identification information and second identification information; and
    determining, based on the pre-trained model, the identification information corresponding to the candidate push information comprises:
    selecting one of the first identification information and the second identification information as the identification information corresponding to the candidate push information, based on whether the retrieved filing information of the website includes a preset keyword;
    or selecting one of the first identification information and the second identification information as the identification information corresponding to the candidate push information, based on whether an acquired set of user report information includes information about the website.
  6. An information pushing apparatus, characterized in that the apparatus comprises:
    an acquisition unit configured to acquire candidate push information;
    a determination unit configured to determine, based on a pre-trained information identification model, identification information corresponding to the candidate push information;
    a generation unit configured to generate information to be pushed based on the candidate push information and the identification information corresponding to the candidate push information; and
    a push unit configured to push the information to be pushed.
  7. The apparatus according to claim 6, characterized in that the determination unit comprises:
    a website confirmation subunit for identifying the website from which the candidate push information originates;
    a feature information search subunit for searching for feature information of the website;
    a feature information import subunit for importing the feature information into the pre-trained information identification model; and
    an identification information acquisition subunit for acquiring identification information that is determined by the information identification model and corresponds to the feature information of the website, and using the identification information corresponding to the feature information of the website as the identification information corresponding to the candidate push information.
  8. The apparatus according to claim 7, characterized in that the feature information comprises at least one of the following items of information about the website: server quantity information, domain name age information, ranking information, keyword ranking information, bounce rate information, external link count information, traffic information, weight information, and information about the website's sponsoring organization.
  9. The apparatus according to claim 6 or 7, characterized in that the apparatus further comprises:
    an information identification model establishing unit, comprising:
    a sample data acquisition subunit for acquiring sample data required for training the model, wherein the sample data comprises feature information of a sample website and already determined identification information corresponding to the feature information of the sample website;
    a predicted identification information acquisition subunit for predicting, based on an initial model, the identification information corresponding to the feature information of the sample website, and acquiring the identification information predicted by the initial model, wherein the initial model is one of the following models: a support vector machine model, a decision tree model, a naive Bayes model, and a logistic regression model;
    a predicted identification information judgment subunit for determining whether the identification information predicted by the initial model for the feature information of the sample website is consistent with the already determined identification information corresponding to the feature information of the sample website; and
    a parameter modification subunit for, when the predicted identification information judgment subunit determines that the identification information predicted by the initial model is inconsistent with the already determined identification information corresponding to the feature information of the sample website, using the feature information of the sample website and the already determined identification information corresponding to it as training data for the initial model, and modifying parameters of the initial model based on the training data to obtain the information identification model.
  10. The apparatus according to claim 7, characterized in that the identification information comprises first identification information and second identification information; and
    the determination unit comprises:
    a first selection subunit for selecting one of the first identification information and the second identification information as the identification information corresponding to the candidate push information, based on whether the retrieved filing information of the website includes a preset keyword;
    or a second selection subunit for selecting one of the first identification information and the second identification information as the identification information corresponding to the candidate push information, based on whether an acquired set of user report information includes information about the website.
  11. A device, comprising:
    a processor; and
    a memory,
    wherein the memory stores computer-readable instructions executable by the processor, and when the computer-readable instructions are executed, the processor performs an information pushing method, the method comprising:
    acquiring candidate push information;
    determining, based on a pre-trained information identification model, identification information corresponding to the candidate push information;
    generating information to be pushed based on the candidate push information and the identification information corresponding to the candidate push information; and
    pushing the information to be pushed.
  12. A non-volatile computer storage medium storing computer-readable instructions executable by a processor, wherein when the computer-readable instructions are executed by the processor, the processor performs an information pushing method, the method comprising:
    acquiring candidate push information;
    determining, based on a pre-trained information identification model, identification information corresponding to the candidate push information;
    generating information to be pushed based on the candidate push information and the identification information corresponding to the candidate push information; and
    pushing the information to be pushed.
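
For illustration only, and not as part of the claims, the model-building steps recited in claim 4 might be sketched as follows. The sketch picks logistic regression from the listed initial models and assumes a gradient-style correction applied only to samples whose predicted label disagrees with the known label; the claims say only that the parameters are modified based on such samples, so the update rule, the learning rate, the two-feature encoding, and the 0/1 labels are all assumptions.

```python
import math
from typing import List, Sequence, Tuple

# (feature vector of a sample website, known label: 1 or 0)
Sample = Tuple[Sequence[float], int]


def predict_proba(weights: List[float], bias: float, x: Sequence[float]) -> float:
    """Logistic-regression probability that the label is 1."""
    z = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights: List[float], bias: float, x: Sequence[float]) -> int:
    """Predicted label: 1 if the probability is at least 0.5, else 0."""
    return 1 if predict_proba(weights, bias, x) >= 0.5 else 0


def train(samples: List[Sample], epochs: int = 50, lr: float = 0.5):
    """Start from an all-zero initial model and correct it on mismatches only."""
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(epochs):
        mistakes = 0
        for x, y in samples:
            if predict(weights, bias, x) == y:
                continue  # prediction agrees with the known label: no update
            mistakes += 1
            # Mismatch: use this sample as training data and adjust parameters.
            error = y - predict_proba(weights, bias, x)
            weights = [w + lr * error * xi for w, xi in zip(weights, x)]
            bias += lr * error
        if mistakes == 0:
            break  # every sample is now predicted consistently
    return weights, bias


# Made-up sample data: (normalised domain age, bounce rate) -> label.
SAMPLES: List[Sample] = [
    ([0.9, 0.2], 1),
    ([0.8, 0.3], 1),
    ([0.1, 0.9], 0),
    ([0.2, 0.8], 0),
]

WEIGHTS, BIAS = train(SAMPLES)
```

After training, `predict(WEIGHTS, BIAS, features)` returns the label for a new website's features. A library classifier could replace the hand-written update; the loop is spelled out here only to mirror the predict, compare, and modify sequence of the claim.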
PCT/CN2016/087453 2016-01-15 2016-06-28 Information-pushing method and device WO2017121076A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201610029313.9 2016-01-15
CN201610029313.9A CN105718533A (en) 2016-01-15 2016-01-15 Information pushing method and device

Publications (1)

Publication Number Publication Date
WO2017121076A1 true WO2017121076A1 (en) 2017-07-20

Family

ID=56147623

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2016/087453 WO2017121076A1 (en) 2016-01-15 2016-06-28 Information-pushing method and device

Country Status (2)

Country Link
CN (1) CN105718533A (en)
WO (1) WO2017121076A1 (en)

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109559158A (en) * 2018-11-06 2019-04-02 北京奇虎科技有限公司 Promotion message put-on method, device, electronic equipment and readable storage medium storing program for executing
CN111177552A (en) * 2019-12-27 2020-05-19 绍兴市上虞区理工高等研究院 Scientific and technological achievement pushing method and device based on user requirements
CN111488517A (en) * 2019-01-29 2020-08-04 北京沃东天骏信息技术有限公司 Method and device for training click rate estimation model
CN111597453A (en) * 2020-03-31 2020-08-28 平安科技(深圳)有限公司 User image drawing method and device, computer equipment and computer readable storage medium
CN111949860A (en) * 2019-05-15 2020-11-17 北京字节跳动网络技术有限公司 Method and apparatus for generating a relevance determination model
CN112148937A (en) * 2020-10-12 2020-12-29 平安科技(深圳)有限公司 Method and system for pushing dynamic epidemic prevention knowledge
CN112766995A (en) * 2019-10-21 2021-05-07 招商证券股份有限公司 Article recommendation method and device, terminal device and storage medium
CN113724815A (en) * 2021-08-30 2021-11-30 平安国际智慧城市科技股份有限公司 Information pushing method and device based on decision grouping model

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110392155B (en) * 2018-04-16 2022-05-24 阿里巴巴集团控股有限公司 Notification message display and processing method, device and equipment
CN110059297B (en) * 2019-04-22 2020-09-29 上海松鼠课堂人工智能科技有限公司 Knowledge point learning duration prediction method, adaptive learning method and computer system

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101814083A (en) * 2010-01-08 2010-08-25 上海复歌信息科技有限公司 Automatic webpage classification method and system
CN101963966A (en) * 2009-07-24 2011-02-02 李占胜 Method for sorting search results by adding labels into search results
US20110125791A1 (en) * 2009-11-25 2011-05-26 Microsoft Corporation Query classification using search result tag ratios
US20120059838A1 (en) * 2010-09-07 2012-03-08 Microsoft Corporation Providing entity-specific content in response to a search query
CN102375952A (en) * 2011-10-31 2012-03-14 北龙中网(北京)科技有限责任公司 Method for displaying whether website is credibly checked in search engine result
CN103401835A (en) * 2013-07-01 2013-11-20 北京奇虎科技有限公司 Method and device for presenting safety detection results of microblog page

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101059818A (en) * 2007-06-26 2007-10-24 申屠浩 Method for reinforcing search engine result safety
CN102142033B (en) * 2010-05-20 2013-04-24 百度在线网络技术(北京)有限公司 Method and device for providing relative sub-link information in search result
CN105868290B (en) * 2012-03-29 2020-03-10 北京奇虎科技有限公司 Method and device for displaying search results
CN103810162B (en) * 2012-11-05 2017-12-12 腾讯科技(深圳)有限公司 The method and system of recommendation network information
CN103902888B (en) * 2012-12-24 2017-12-01 腾讯科技(深圳)有限公司 Method, service end and the system of website degree of belief automatic measure grading
CN103235821B (en) * 2013-04-27 2015-06-24 百度在线网络技术(北京)有限公司 Original content searching method and searching server
CN103399957A (en) * 2013-08-21 2013-11-20 百度在线网络技术(北京)有限公司 Searching method, system and engine as well as client
CN104504058B (en) * 2014-12-18 2018-10-09 北京奇虎科技有限公司 A kind of page display method and browser device
CN104735074A (en) * 2015-03-31 2015-06-24 江苏通付盾信息科技有限公司 Malicious URL detection method and implement system thereof

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109559158A (en) * 2018-11-06 2019-04-02 北京奇虎科技有限公司 Promotion message put-on method, device, electronic equipment and readable storage medium storing program for executing
CN111488517A (en) * 2019-01-29 2020-08-04 北京沃东天骏信息技术有限公司 Method and device for training click rate estimation model
CN111949860A (en) * 2019-05-15 2020-11-17 北京字节跳动网络技术有限公司 Method and apparatus for generating a relevance determination model
CN112766995A (en) * 2019-10-21 2021-05-07 招商证券股份有限公司 Article recommendation method and device, terminal device and storage medium
CN111177552A (en) * 2019-12-27 2020-05-19 绍兴市上虞区理工高等研究院 Scientific and technological achievement pushing method and device based on user requirements
CN111597453A (en) * 2020-03-31 2020-08-28 平安科技(深圳)有限公司 User image drawing method and device, computer equipment and computer readable storage medium
CN111597453B (en) * 2020-03-31 2024-05-07 平安科技(深圳)有限公司 User image drawing method, device, computer equipment and computer readable storage medium
CN112148937A (en) * 2020-10-12 2020-12-29 平安科技(深圳)有限公司 Method and system for pushing dynamic epidemic prevention knowledge
CN112148937B (en) * 2020-10-12 2023-07-25 平安科技(深圳)有限公司 Method and system for pushing dynamic epidemic prevention knowledge
CN113724815A (en) * 2021-08-30 2021-11-30 平安国际智慧城市科技股份有限公司 Information pushing method and device based on decision grouping model

Also Published As

Publication number Publication date
CN105718533A (en) 2016-06-29

Similar Documents

Publication Publication Date Title
WO2017121076A1 (en) Information-pushing method and device
US10936959B2 (en) Determining trustworthiness and compatibility of a person
US11797773B2 (en) Navigating electronic documents using domain discourse trees
US9704185B2 (en) Product recommendation using sentiment and semantic analysis
CN107908740B (en) Information output method and device
US10771424B2 (en) Usability and resource efficiency using comment relevance
US9720904B2 (en) Generating training data for disambiguation
US8200617B2 (en) Automatic mapping of a location identifier pattern of an object to a semantic type using object metadata
US9064212B2 (en) Automatic event categorization for event ticket network systems
WO2019149145A1 (en) Compliant report class sorting method and apparatus
US8984414B2 (en) Function extension for browsers or documents
US10134070B2 (en) Contextualized user recapture system
US20150324350A1 (en) Identifying Content Relationship for Content Copied by a Content Identification Mechanism
US20220292160A1 (en) Automated system and method for creating structured data objects for a media-based electronic document
KR102151322B1 (en) Information push method and device
US9705972B2 (en) Managing a set of data
CN113515687B (en) Logistics information acquisition method and device
CN109408725B (en) Method and apparatus for determining user interest
JP2020016960A (en) Estimation device, estimation method and estimation program
US20160110469A1 (en) Method of and system for determining creation time of a web resource
CN112771564A (en) Artificial intelligence engine that generates semantic directions for web sites to map identities for automated entity seeking
WO2016001723A1 (en) Method of and system for determining creation time of a web resource

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 16884634

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 16884634

Country of ref document: EP

Kind code of ref document: A1