CN108052613B - Method and device for generating page - Google Patents

Info

Publication number
CN108052613B
Authority
CN
China
Prior art keywords
search
information
search intention
intention
result
Prior art date
Legal status
Active
Application number
CN201711339545.5A
Other languages
Chinese (zh)
Other versions
CN108052613A (en
Inventor
李方明
邵英杰
吴家林
张一麟
***
Current Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN201711339545.5A
Publication of CN108052613A
Application granted
Publication of CN108052613B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/951 Indexing; Web crawling techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/958 Organisation or management of web site content, e.g. publishing, maintaining pages or automatic linking

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

Embodiments of the application disclose a method and a device for generating a page. One embodiment of the method comprises: in response to receiving search information, selecting a preset number of search results from the search results corresponding to the search information; and, in response to determining that the search information corresponds to an entity object, acquiring predetermined search intention information corresponding to the entity object, determining, among the preset number of search results, the number of search results that match the search intention indicated by that search intention information, and generating, based on the determined number and the preset number, a search result page that includes pre-generated description information for the entity object. This embodiment can accurately identify the user's search intention, so that the generated page is targeted.

Description

Method and device for generating page
Technical Field
The embodiment of the application relates to the technical field of computers, in particular to the technical field of internet, and particularly relates to a method and a device for generating a page.
Background
Searching has become a basic need of every Internet user. A potential search intention may lie behind each search request that a user sends, and many resources can provide search results for the same search information. Finding, among those resources, the one that best satisfies the user's search intention is of great importance to every Internet search company and is key to the competition among them.
Disclosure of Invention
The embodiment of the application provides a method and a device for generating a page.
In a first aspect, an embodiment of the present application provides a method for generating a page, including: in response to receiving the search information, selecting a preset number of search results from the search results corresponding to the search information; and in response to determining that the search information corresponds to the entity object, acquiring search intention information corresponding to the entity object determined in advance, determining the number of search results matched with the search intention indicated by the search intention information corresponding to the entity object in a preset number of search results, and generating a search result page comprising the description information generated in advance and aiming at the entity object based on the determined number and the preset number.
In some embodiments, determining the number of search results that match the search intent indicated by the search intent information corresponding to the entity object comprises: for each search result in a preset number of search results, obtaining abstract information of the search result, generating feature vectors of the abstract information and search intention information, and inputting the feature vectors generated for the search result into a pre-trained search intention recognition model to obtain a search intention recognition result, wherein the search intention recognition model is used for representing the corresponding relation between the feature vectors generated by the abstract information and the search intention information of the search result and the search intention recognition result, and the search intention recognition result is used for indicating whether the search result is matched with the search intention indicated by the search intention information.
In some embodiments, the search intention recognition model is trained by: acquiring a predetermined sample data set, wherein each sample data in the sample data set comprises search intention sample information, abstract sample information of a search result and a search intention identification result, and the search intention identification result comprises a matching identifier and a non-matching identifier; generating a feature vector corresponding to the search intention sample information and the abstract sample information in the sample data aiming at each sample data in the sample data set; and training to obtain a search intention recognition model by using a machine learning method and taking the feature vector generated for each sample data in the sample data set as input and the search intention recognition result in the sample data as output.
In some embodiments, generating a search results page including pre-generated description information for an entity object includes: and displaying the pre-generated description information aiming at the entity object in a target area of the search result page.
In some embodiments, generating a search result page including pre-generated description information for the entity object based on the determined number and the preset number includes: determining whether the ratio of the determined number to a preset number is greater than a preset ratio threshold; and if so, generating a search result page comprising pre-generated description information aiming at the entity object.
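The ratio test described above can be sketched as follows. This is a minimal illustration only; the function name `passes_ratio_test` and the handling of an empty selection are assumptions, not part of the application.

```python
def passes_ratio_test(matched: int, selected: int, ratio_threshold: float) -> bool:
    """Return True when matched / selected is greater than the preset ratio threshold.

    `matched` is the determined number of matching search results and
    `selected` is the preset number of selected search results.
    """
    if selected <= 0:
        # Assumption: with nothing selected there is no ratio to test.
        return False
    return matched / selected > ratio_threshold
```

Under this sketch, the description information would be included in the search result page only when the function returns True.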
In a second aspect, an embodiment of the present application provides an apparatus for generating a page, including: a selecting unit configured to, in response to receiving search information, select a preset number of search results from the search results corresponding to the search information; and a generating unit configured to, in response to determining that the search information corresponds to an entity object, acquire predetermined search intention information corresponding to the entity object, determine, among the preset number of search results, the number of search results that match the search intention indicated by the search intention information corresponding to the entity object, and generate, based on the determined number and the preset number, a search result page including pre-generated description information for the entity object.
In some embodiments, the generating unit is further configured to: for each search result in a preset number of search results, obtaining abstract information of the search result, generating feature vectors of the abstract information and search intention information, and inputting the feature vectors generated for the search result into a pre-trained search intention recognition model to obtain a search intention recognition result, wherein the search intention recognition model is used for representing the corresponding relation between the feature vectors generated by the abstract information and the search intention information of the search result and the search intention recognition result, and the search intention recognition result is used for indicating whether the search result is matched with the search intention indicated by the search intention information.
In some embodiments, the apparatus further comprises a training unit for training the search intention recognition model, the training unit comprising: an acquisition module configured to acquire a predetermined sample data set, wherein each sample data in the sample data set comprises search intention sample information, abstract sample information of a search result and a search intention identification result, and the search intention identification result comprises a matching identifier and a non-matching identifier; a generating module configured to generate, for each sample data in the sample data set, a feature vector corresponding to the search intention sample information and the abstract sample information in the sample data; and a training module configured to train, using a machine learning method, the search intention recognition model by taking the feature vector generated for each sample data in the sample data set as input and the search intention recognition result in the sample data as output.
In some embodiments, the generating unit is further configured to: and displaying the pre-generated description information aiming at the entity object in a target area of the search result page.
In some embodiments, the generating unit comprises: a determining module configured to determine whether a ratio of the determined number to a preset number is greater than a preset ratio threshold; a generating module configured to generate a search result page including pre-generated description information for the entity object in response to determining that the ratio is greater than the ratio threshold.
In a third aspect, an embodiment of the present application further provides an electronic device, including: one or more processors; and a storage device for storing one or more programs, wherein, when the one or more programs are executed by the one or more processors, the one or more processors implement the method for generating a page provided by the application.
In a fourth aspect, the present application further provides a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the method for generating a page provided in the present application.
According to the method and the device for generating a page provided by the embodiments of the application, a preset number of search results are selected from the search results corresponding to the search information in response to receiving the search information. Then, in response to determining that the search information corresponds to an entity object, the search intention information corresponding to the entity object is acquired, and the number of search results matching the search intention indicated by that search intention information is determined among the preset number of search results. Finally, a search result page including description information for the entity object is generated based on the preset number and the determined number of matching search results. The search result data corresponding to the search information is thereby used to determine the user's search intention accurately, so that the generated page is targeted.
Drawings
Other features, objects and advantages of the present application will become more apparent upon reading of the following detailed description of non-limiting embodiments thereof, made with reference to the accompanying drawings in which:
FIG. 1 is an exemplary system architecture diagram in which the present application may be applied;
FIG. 2 is a flow diagram of one embodiment of a method for generating a page according to the present application;
FIG. 3 is a schematic diagram of an application scenario of a method for generating a page according to the present application;
FIG. 4 is a flow diagram of yet another embodiment of a method for generating a page according to the present application;
FIG. 5 is a block diagram illustrating one embodiment of an apparatus for generating pages in accordance with the present application;
FIG. 6 is a schematic block diagram of a computer system suitable for use in implementing an electronic device according to embodiments of the present application.
Detailed Description
The present application will be described in further detail with reference to the following drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the relevant invention and not restrictive of the invention. It should be noted that, for convenience of description, only the portions related to the related invention are shown in the drawings.
It should be noted that the embodiments and features of the embodiments in the present application may be combined with each other without conflict. The present application will be described in detail below with reference to the embodiments with reference to the attached drawings.
Fig. 1 shows an exemplary system architecture 100 to which embodiments of the method for generating pages or the apparatus for generating pages of the present application may be applied.
As shown in fig. 1, the system architecture 100 may include user terminals 1011, 1012, 1013, a server 102, target terminals 1031, 1032, 1033, and networks 1041, 1042. The network 1041 serves to provide a medium for communication links between the user terminals 1011, 1012, 1013 and the server 102. The network 1042 serves to provide a medium for communication links between the server 102 and the target terminals 1031, 1032, 1033. The networks 1041, 1042 may comprise various connection types, such as wired communication links, wireless communication links, or fiber optic cables.
A user may interact with the server 102 via the network 1041 using user terminals 1011, 1012, 1013 to send the user's search information to the server 102 or to receive a search results page generated by the server 102, etc. The user terminals 1011, 1012, 1013 may have installed thereon various communication client applications, such as a web browser application, a shopping-type application, a search-type application, and the like.
Users may interact with server 102 via network 1042 using target terminals 1031, 1032, 1033 to send or receive information and the like. The target terminals 1031, 1032, 1033 may have installed thereon various communication client applications, such as a web browser application, a code editing application, a search application, and the like.
The user terminals 1011, 1012, 1013 and the target terminals 1031, 1032, 1033 may be various electronic devices having display screens and supporting information interaction, including but not limited to smart phones, tablet computers, e-book readers, laptop computers, desktop computers, and the like.
The server 102 may be a server that provides various services, such as a background server that provides support for search result pages displayed on the user terminals 1011, 1012, 1013. The background server may analyze and otherwise process received data such as the search information, and feed back a processing result (e.g., search result page data) to the user terminal.
It should be noted that the method for generating the page provided by the embodiment of the present application is generally performed by the server 102, and accordingly, the apparatus for generating the page is generally disposed in the server 102.
It should be understood that the number of user terminals, servers, target terminals and networks in fig. 1 is merely illustrative. There may be any number of user terminals, servers, target terminals, and networks, as desired for implementation.
With continued reference to FIG. 2, a flow 200 of one embodiment of a method for generating pages in accordance with the present application is shown. The method for generating the page comprises the following steps:
in step 201, in response to receiving the search information, a preset number of search results are selected from the search results corresponding to the search information.
In this embodiment, the electronic device on which the method for generating a page runs (for example, the server shown in fig. 1) may receive search information, through a wired or wireless connection, from the terminal that a user uses to search for information. If the search information is received, the electronic device may obtain the search results corresponding to the search information and then select a preset number (for example, ten) of search results from them; for example, it may select the first ten search results in descending order of presentation priority, or it may select search results at random. The search information may be at least one search keyword, and multiple search keywords may be separated by symbols such as enumeration commas or spaces, for example, "rose tea, efficacy"; the search information may also be a full sentence, for example, "I want to know what effects rose tea has". The electronic device may store the search results corresponding to the search information locally, in which case it may obtain them locally; alternatively, the electronic device may request the search results from a search server that stores them.
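As an illustration of the selection in step 201, the following minimal sketch assumes that the search results arrive already sorted from high to low presentation priority. The function name `select_results` and its parameters are hypothetical and are not defined by the application.

```python
import random
from typing import List, Optional, Sequence


def select_results(
    results: Sequence[str],
    preset_number: int = 10,
    randomize: bool = False,
    rng: Optional[random.Random] = None,
) -> List[str]:
    """Select up to `preset_number` search results.

    Assumption: `results` is ordered by presentation priority, highest first.
    """
    if randomize:
        # Random selection without replacement, capped at the number available.
        rng = rng or random.Random()
        return rng.sample(list(results), min(preset_number, len(results)))
    # Default: take the highest-priority results.
    return list(results[:preset_number])
```

If fewer results than the preset number are available, this sketch simply returns all of them; the application does not state how that case is handled.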
Step 202, determining whether the search information corresponds to an entity object.
In this embodiment, the entity object may include an entity or a concept. An entity is generally a thing that exists objectively and can be distinguished from other things; it can be a person, an object, or an abstract concept. A concept is an abstraction and generalization of the common essential features of perceived things, and is usually expressed and recorded as a word or phrase. A knowledge graph (a structured semantic knowledge base) is essentially a concept network in which nodes represent entities or concepts in the physical world; for example, the poet Li Bai and the poem "Quiet Night Thought" are both entity objects.
In this embodiment, the electronic device may determine whether the search information corresponds to an entity object, and if so, may execute step 203. Specifically, if the search information is a sentence, the electronic device may first perform word segmentation on the search information, and delete stop words in words obtained by word segmentation to obtain a search keyword; the electronic device may match the search information or the search keyword with the query fields of the entity objects in the locally stored semantic knowledge base, and if the search information or the search keyword is matched with the query fields, the entity object corresponding to the matched query field may be determined as the entity object corresponding to the search information.
As an example, if the search information is "bright moonlight before the bed", and the query values corresponding to the query field in the attribute-value pairs of the ancient poem "Quiet Night Thought" are "Quiet Night Thought", "bright moonlight before the bed", "it seems like frost on the ground", "raising my head, I gaze at the bright moon", and "lowering my head, I think of my hometown", it may be determined that the search information "bright moonlight before the bed" matches the query field of the ancient poem "Quiet Night Thought", and that the entity object corresponding to the search information is the ancient poem "Quiet Night Thought".
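The matching in step 202 can be sketched as follows. This is an illustration under stated assumptions: the in-memory `KNOWLEDGE_BASE`, the `STOP_WORDS` set, the whitespace-based segmentation and the English query strings are all hypothetical stand-ins for the locally stored semantic knowledge base and a real word segmenter.

```python
from typing import List, Optional

# Hypothetical knowledge base: entity object -> attribute-value pairs.
KNOWLEDGE_BASE = {
    "Quiet Night Thought": {
        "query": {"quiet night thought", "bright moonlight before the bed"},
        "search_intent": "poetry",
    },
}

# Hypothetical stop-word list used after segmentation.
STOP_WORDS = {"i", "want", "to", "know", "what", "the", "a", "of", "has"}


def extract_keywords(search_info: str) -> List[str]:
    """Whitespace segmentation plus stop-word removal (stand-in for a real segmenter)."""
    return [w for w in search_info.lower().split() if w not in STOP_WORDS]


def find_entity(search_info: str) -> Optional[str]:
    """Return the entity object whose query field matches the search information."""
    keywords = extract_keywords(search_info)
    # Try the raw search information, the joined keywords, then each keyword.
    candidates = [search_info.lower().strip(), " ".join(keywords), *keywords]
    for entity, attributes in KNOWLEDGE_BASE.items():
        if any(candidate in attributes["query"] for candidate in candidates):
            return entity
    return None
```

When `find_entity` returns None, the search information is treated as not corresponding to any entity object, and the later steps would not be executed.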
Step 203, obtaining the search intention information corresponding to the entity object determined in advance.
In this embodiment, if it is determined that the search information corresponds to an entity object, the electronic device may acquire the predetermined search intention information corresponding to the entity object. The electronic device may look up the search intention attribute field in the attribute-value pairs corresponding to the entity object and use the attribute value of that field as the search intention information corresponding to the entity object. The search intention information may also be referred to as the user's main requirement information, and may be used to indicate the category or domain of the search results that the user requires. For example, the search intention information corresponding to the entity object "Quiet Night Thought" may be poetry, the search intention information corresponding to the entity object "Li Bai" may be poetry, and the search intention information corresponding to the entity object "Qi Li Xiang" may be songs.
In this embodiment, the electronic device may label the search intention information corresponding to each entity object in advance. Specifically, for each entity object, the electronic device may first perform a search using the entity word corresponding to the entity object as the search information; then generate a feature vector for the summary information of each search result found; then input the generated feature vector into a pre-trained search intention prediction model to obtain the search intention information of the search result, where the search intention prediction model is used to characterize the correspondence between the feature vector of the summary information and the search intention information of the search result; and finally set the attribute value of the search intention attribute field corresponding to the entity object to the search intention information that occurs most frequently among the obtained search intention information. The entity word may be the name of the entity object; for example, the entity word of the poet Li Bai may be "Li Bai". The entity word may also be an alias of the entity object; for example, the entity word of Yaoqin may be "Yao Ming's daughter".
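The pre-labelling procedure above amounts to a majority vote over the predicted intentions of the search results. A minimal sketch follows; `toy_predict_intent` is a hypothetical keyword rule that stands in for the pre-trained search intention prediction model, and feature extraction is folded into it for brevity.

```python
from collections import Counter
from typing import Callable, Iterable, Optional


def toy_predict_intent(summary: str) -> str:
    """Hypothetical stand-in for the search intention prediction model."""
    return "song" if "lyrics" in summary.lower() else "poetry"


def label_entity_intent(
    summaries: Iterable[str],
    predict_intent: Callable[[str], str] = toy_predict_intent,
) -> Optional[str]:
    """Return the search intention that occurs most often among the summaries."""
    counts = Counter(predict_intent(summary) for summary in summaries)
    if not counts:
        return None  # No search results, so no intention can be labelled.
    return counts.most_common(1)[0][0]
```

The returned value would then be written to the search intention attribute field of the entity object.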
And step 204, determining the number of the search results matched with the search intention indicated by the search intention information corresponding to the entity object in the preset number of search results.
In this embodiment, the electronic device may determine, among the preset number of search results, the number of search results that match the search intention indicated by the search intention information corresponding to the entity object. Specifically, for each search result in the preset number of search results, the electronic device may first obtain the summary information of the search result, where the summary information is usually a short text summary that matches the search keyword; then extract a feature vector from the summary information and input it into the search intention prediction model to obtain the search intention information of the search result; and then determine whether the search intention information of the search result matches the search intention information corresponding to the entity object (e.g., whether the indicated search intentions are the same). Finally, the electronic device may count the search results whose search intention information matches the search intention information corresponding to the entity object.
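The counting in step 204 can be sketched as below. The predictor is passed in as a callable; the dictionary-based `lookup_predictor` used in the usage lines is a hypothetical stand-in for feature extraction plus the search intention prediction model.

```python
from typing import Callable, Dict, Iterable


def count_matching_results(
    summaries: Iterable[str],
    entity_intent: str,
    predict_intent: Callable[[str], str],
) -> int:
    """Count the summaries whose predicted intention equals the entity's intention."""
    return sum(1 for summary in summaries if predict_intent(summary) == entity_intent)


def lookup_predictor(table: Dict[str, str]) -> Callable[[str], str]:
    """Build a hypothetical predictor from a summary -> intention table."""
    return lambda summary: table.get(summary, "unknown")


# Usage with made-up summaries and intentions.
predict = lookup_predictor({"s1": "poetry", "s2": "dance", "s3": "poetry"})
matched = count_matching_results(["s1", "s2", "s3"], "poetry", predict)
```

Here "matching" is taken to mean that the two intentions are identical, which is the example the text itself gives.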
Step 205, based on the determined number and the preset number, generating a search result page including pre-generated description information for the entity object.
In this embodiment, the electronic device may generate a search result page including pre-generated description information for the entity object based on the number determined in step 204 and the preset number. In one implementation, the electronic device may determine whether the difference between the preset number and the determined number is smaller than a preset difference threshold and, if so, generate a search result page including the pre-generated description information for the entity object.
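The difference-threshold implementation of step 205 can be sketched as follows. Representing the page as a dictionary with "results" and "description" keys is purely an assumption made for illustration, as is the function name `build_result_page`.

```python
from typing import Dict, List


def build_result_page(
    results: List[str],
    matched: int,
    description: str,
    diff_threshold: int,
) -> Dict[str, object]:
    """Build a page; add the description when (preset number - matched) < threshold.

    Assumption: the preset number equals len(results), the selected results.
    """
    page: Dict[str, object] = {"results": list(results)}
    if len(results) - matched < diff_threshold:
        # Most selected results share the entity's intention, so show the description.
        page["description"] = description
    return page
```

For example, with eight selected results, six matches and a threshold of three, the difference is two and the description is included; with only two matches, the difference is six and it is omitted.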
In this embodiment, the electronic device may generate description information in advance for each entity object in each field, and store the generated description information locally. The description information may be an explanation of an entity, thing, phenomenon or similar concept. For example, the description information for the ancient poem "Quiet Night Thought" may be the original text of the work, the description information for the poet Li Bai may be a biographical profile of Li Bai, and the description information for the Fourier transform may be the Fourier transform formula, an explanation of the formula, and so on.
With continued reference to fig. 3, fig. 3 is a schematic diagram of an application scenario of the method for generating a page according to the present embodiment. In the application scenario of fig. 3, a user first inputs the search information "bright moonlight before the bed" 301 in a search box and clicks the search icon. The server may then select search results 303, 304, 305, 306, 307, 308, 309 and 310 from the search results corresponding to the search information 301. Next, the server may determine whether the search information 301 corresponds to an entity object; if it determines that the search information corresponds to the entity object ancient poem "Quiet Night Thought", it may obtain the search intention information of that poem, which is poetry. The server may then determine the search intention information of each of the eight search results (search results 303 to 310), for example by inputting the summary information of each search result into a pre-trained search intention prediction model to obtain the search intention information of that search result. In this way it determines that the search intention information corresponding to search results 303, 304, 306, 308, 309 and 310 is poetry, that corresponding to search result 305 is dance, and that corresponding to search result 307 is song, so the number of search results matching the search intention information "poetry" of the ancient poem "Quiet Night Thought" is 6. Finally, the server may determine that the difference 2 between the number 8 of selected search results and the determined number 6 of matching search results is less than the preset difference threshold 3, and may generate a search result page 311 including description information 302 for the ancient poem "Quiet Night Thought".
The method provided by the embodiment of the application determines the page content of the generated search result page by determining whether the search intention of the search result corresponding to the search information is matched with the search intention corresponding to the entity object corresponding to the search information, thereby accurately determining the search intention of the user and enabling the generated page to have pertinence.
With further reference to FIG. 4, a flow 400 of yet another embodiment of a method for generating a page is shown. The flow 400 of the method for generating a page includes the steps of:
step 401, in response to receiving the search information, selecting a preset number of search results from the search results corresponding to the search information.
Step 402, determining whether the search information corresponds to an entity object.
Step 403, obtaining search intention information corresponding to the predetermined entity object.
In the present embodiment, the operations of steps 401 to 403 are substantially the same as the operations of steps 201 to 203, and are not described herein again.
Step 404, for each search result in the preset number of search results, obtaining summary information of the search result, generating feature vectors of the summary information and the search intention information, and inputting the feature vectors generated for the search result into a pre-trained search intention recognition model to obtain a search intention recognition result.
In this embodiment, for each search result in the preset number of search results, the electronic device may first obtain summary information of the search result; then, feature vectors of the summary information and the search intention information can be generated; then, the feature vector generated for the search result can be input into a pre-trained search intention recognition model, and a search intention recognition result of the search result is obtained. Here, the obtained search intention recognition result may be used to indicate whether the search result matches the search intention indicated by the search intention information corresponding to the above-mentioned entity object. The search intention recognition result may include a matching identifier and a non-matching identifier, the matching identifier may be used to indicate that the search result matches the search intention indicated by the search intention information corresponding to the entity object, and the matching identifier may be represented by a preset first identifier, for example, 1 or T; the mismatch flag may be used to indicate that the search result does not match the search intention indicated by the search intention information corresponding to the entity object, and the mismatch flag may be represented by a preset second identifier, for example, 0 or F. It should be noted that the search intention identification model may be used to characterize the correspondence between the feature vectors generated by the summary information and the search intention information of the search result and the search intention identification result. As an example, the search intention recognition model may be a correspondence table in which correspondence between a plurality of feature vectors and the search intention recognition result is stored, which is prepared in advance by a technician based on statistics of a large number of feature vectors and the search intention recognition result. 
Here, the method of generating feature vectors from text information is a well-known technology widely studied and applied at present, and is not described herein again.
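As a non-limiting illustration of step 404, the following minimal sketch shows one way the per-result recognition could be arranged. The bag-of-words featurizer, the names `build_feature_vector`, `recognize` and `overlap_model`, and the sample vocabulary are all assumptions made for illustration; the text above does not prescribe a particular feature representation, and `overlap_model` merely stands in for the pre-trained search intention recognition model.

```python
from collections import Counter
from typing import Callable, Dict, List

MATCH, NO_MATCH = 1, 0  # preset first and second identifiers (1 and 0 in the text)


def build_feature_vector(summary: str, intent: str, vocabulary: List[str]) -> List[int]:
    """Hypothetical featurizer: word counts of the summary followed by
    word counts of the search intention information, over a fixed vocabulary."""
    summary_counts = Counter(summary.lower().split())
    intent_counts = Counter(intent.lower().split())
    return [summary_counts[w] for w in vocabulary] + [intent_counts[w] for w in vocabulary]


def recognize(
    results: List[Dict[str, str]],
    intent: str,
    vocabulary: List[str],
    model: Callable[[List[int]], int],
) -> List[int]:
    """Return one search intention recognition result per search result."""
    return [
        model(build_feature_vector(result["summary"], intent, vocabulary))
        for result in results
    ]


def overlap_model(vector: List[int]) -> int:
    """Stand-in for the pre-trained model (an assumption): report a match when
    some vocabulary word occurs in both the summary half and the intent half."""
    half = len(vector) // 2
    shared = any(s and i for s, i in zip(vector[:half], vector[half:]))
    return MATCH if shared else NO_MATCH


vocabulary = ["poem", "text", "song", "lyrics"]
results = [
    {"summary": "Full text of the poem"},
    {"summary": "Song lyrics and chords"},
]
labels = recognize(results, "poem text", vocabulary, overlap_model)
```

In this sketch the model is passed in as a callable, so a correspondence table or a trained classifier could be substituted without changing `recognize`.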
In some optional implementations of the embodiment, the electronic device may be trained in advance according to the following steps to obtain the search intention recognition model:
First, the electronic device may acquire a predetermined sample data set, where each sample data in the sample data set includes search intention sample information, summary sample information of a search result, and a search intention recognition result. The sample data in the sample data set may also be obtained as follows: first, the electronic device may select a second preset number (e.g., twelve) of search results from the search results obtained by searching with the entity word corresponding to a target entity object. The target entity object may be a predetermined entity object used for training, the target entity object may correspond to a target search intention, and the target search intention may be stored in a search intention attribute field corresponding to the target entity object; then, the electronic device may push the selected search results to a target terminal so that a user may use the target terminal to divide the selected search results into search results matching the target search intention and search results not matching the target search intention, where it is noted that a search result matching the target search intention may carry a matching identifier and a search result not matching the target search intention may carry a non-matching identifier; finally, the sample data set may be generated using the target search intention information for characterizing the target search intention, the summary information of the search results with the matching identifiers, and the summary information of the search results with the non-matching identifiers.
Then, for each sample data in the sample data set, the electronic device may generate a feature vector corresponding to the search intention sample information and the summary sample information in the sample data. For each matching sample data (sample data whose search intention recognition result is the matching identifier), the electronic device may determine the feature vector generated for the matching sample data as a first feature vector; for each non-matching sample data (sample data whose search intention recognition result is the non-matching identifier), the electronic device may determine the feature vector generated for the non-matching sample data as a second feature vector. Here, the method of generating feature vectors from text information is a well-known technology widely studied and applied at present, and is not described herein again.
Finally, the electronic device may use a machine learning method to train a search intention recognition model by taking the feature vector generated for each sample data in the sample data set as an input and taking the search intention recognition result in the sample data as an output. Specifically, the electronic device may use a classification model such as a Naive Bayesian Model (NBM) or a Support Vector Machine (SVM), take the first feature vector as a model input with the matching identifier as the corresponding model output, take the second feature vector as a model input with the non-matching identifier as the corresponding model output, and train the model by using a machine learning method to obtain the search intention recognition model.
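As a non-limiting illustration of the training steps above, the following minimal sketch fits a multinomial naive Bayes classifier with add-one smoothing on first (matching) and second (non-matching) feature vectors. The function names `train_naive_bayes` and `predict`, the choice of the multinomial variant, the smoothing, and the toy count vectors are assumptions made for illustration; the text only names the naive Bayesian model and the support vector machine as candidate classifiers.

```python
import math
from typing import Dict, List, Sequence, Tuple

MATCH, NO_MATCH = 1, 0  # matching and non-matching identifiers

Model = Dict[int, Tuple[float, List[float]]]


def train_naive_bayes(vectors: Sequence[Sequence[int]], labels: Sequence[int]) -> Model:
    """Fit per-class log priors and log likelihoods from count vectors."""
    n_features = len(vectors[0])
    model: Model = {}
    for label in (MATCH, NO_MATCH):
        rows = [v for v, y in zip(vectors, labels) if y == label]
        totals = [sum(col) for col in zip(*rows)] if rows else [0] * n_features
        denom = sum(totals) + n_features  # add-one smoothing over all features
        log_prior = math.log((len(rows) + 1) / (len(vectors) + 2))
        log_likelihoods = [math.log((t + 1) / denom) for t in totals]
        model[label] = (log_prior, log_likelihoods)
    return model


def predict(model: Model, vector: Sequence[int]) -> int:
    """Return the identifier whose class has the higher posterior score."""
    def score(label: int) -> float:
        log_prior, log_likelihoods = model[label]
        return log_prior + sum(x * ll for x, ll in zip(vector, log_likelihoods))
    return max((MATCH, NO_MATCH), key=score)


# Toy data (hypothetical): first feature vectors come from matching samples,
# second feature vectors from non-matching samples.
first_vectors = [[2, 1, 0, 0], [1, 1, 0, 0]]
second_vectors = [[0, 0, 1, 2], [0, 0, 2, 1]]
model = train_naive_bayes(
    first_vectors + second_vectors,
    [MATCH] * len(first_vectors) + [NO_MATCH] * len(second_vectors),
)
```

The trained `model` can then be applied to the feature vector of a new search result with `predict(model, vector)`.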
Step 405, determining the number of search results whose search intention recognition result is the matching identifier.
In this embodiment, after determining the search intention recognition result of each search result in step 404, the electronic device may determine the number of search results whose search intention recognition result is the matching identifier.
Step 406, determining whether the ratio of the determined number to the preset number is greater than a preset ratio threshold.
In this embodiment, the electronic device may determine whether a ratio of the determined number to the preset number is greater than a preset ratio threshold (e.g., 0.6), and if the ratio is greater than the ratio threshold, step 407 may be executed. As an example, if the number of search results of which the search intention recognition result is determined to be the matching identifier is 7, the preset number is 10, and the ratio threshold is 0.6, it may be determined that the ratio is 0.7, and it is determined that the ratio 0.7 is greater than the ratio threshold 0.6.
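As a non-limiting illustration, the counting of step 405 and the comparison of step 406 can be sketched as follows. The function name `exceeds_ratio_threshold` and the handling of an empty input are assumptions made for illustration; the threshold default of 0.6 is the example value given above.

```python
from typing import Sequence

MATCH = 1  # matching identifier


def exceeds_ratio_threshold(
    recognition_results: Sequence[int], ratio_threshold: float = 0.6
) -> bool:
    """Count matching results and compare their share with the threshold."""
    preset_number = len(recognition_results)
    if preset_number == 0:
        # Assumption: with no search results there is nothing to match.
        return False
    matched = sum(1 for r in recognition_results if r == MATCH)
    return matched / preset_number > ratio_threshold


# The worked example from the text: 7 matching results out of 10.
seven_of_ten = [1, 1, 1, 1, 1, 1, 1, 0, 0, 0]
show_description = exceeds_ratio_threshold(seven_of_ten)
```

Because the comparison is strictly "greater than", a ratio exactly equal to the threshold (for example 6 out of 10 with a threshold of 0.6) does not trigger step 407 in this sketch.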
Step 407, generating a search result page, and displaying the pre-generated description information for the entity object in the target area of the search result page.
In this embodiment, if the ratio is greater than the ratio threshold, the electronic device may generate a search result page, and display pre-generated description information for the entity object on a target area of the search result page. The target area may be a predetermined area for presenting description information for the entity object, and may be a top area of the search result presentation area in general.
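As a non-limiting illustration of step 407, the following minimal sketch renders a search result page in which the description information, when present, is placed in a target area above the result list. The HTML layout, the `target-area` element id, the function name `generate_search_result_page` and the `title`/`summary` keys are assumptions made for illustration; the text does not specify a page format.

```python
from html import escape
from typing import Dict, List, Optional


def generate_search_result_page(
    results: List[Dict[str, str]], description: Optional[str] = None
) -> str:
    """Build a simple HTML page; the description goes above the results."""
    parts = ["<html><body>"]
    if description is not None:
        # Target area: the top of the search result presentation area.
        parts.append(f'<div id="target-area">{escape(description)}</div>')
    parts.append("<ol>")
    for result in results:
        parts.append(f"<li>{escape(result['title'])}: {escape(result['summary'])}</li>")
    parts.append("</ol></body></html>")
    return "\n".join(parts)


page = generate_search_result_page(
    [{"title": "Result A", "summary": "B & C"}],
    description="Pre-generated description of the entity object",
)
```

Calling the function without a description yields an ordinary result page, which corresponds to the case where the ratio does not exceed the threshold.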
As can be seen from fig. 4, compared with the embodiment corresponding to fig. 2, the flow 400 of the method for generating a page in the present embodiment highlights steps 404 and 405 of determining the number of search results that match the search intention indicated by the search intention information corresponding to the entity object, and steps 406 and 407 of presenting description information for the entity object in the target area of the search result page in response to determining that the ratio of the determined number to the preset number is greater than the preset ratio threshold. Therefore, the scheme described in this embodiment can determine the search intention of the user more accurately, so that the generated page is more targeted.
With further reference to fig. 5, as an implementation of the method shown in the above-mentioned figures, the present application provides an embodiment of an apparatus for generating a page, where the embodiment of the apparatus corresponds to the embodiment of the method shown in fig. 2, and the apparatus may be specifically applied to various electronic devices.
As shown in fig. 5, the apparatus 500 for generating a page of the present embodiment includes: a selecting unit 501 and a generating unit 502. The selecting unit 501 is configured to select, in response to receiving search information, a preset number of search results from the search results corresponding to the search information; the generating unit 502 is configured to, in response to determining that the search information corresponds to an entity object, acquire predetermined search intention information corresponding to the entity object, determine, from among the preset number of search results, the number of search results that match the search intention indicated by the search intention information corresponding to the entity object, and generate, based on the determined number and the preset number, a search result page that includes pre-generated description information for the entity object.
In this embodiment, the specific processing of the selecting unit 501 and the generating unit 502 of the apparatus 500 for generating a page may refer to step 201, step 202, step 203, step 204, and step 205 in the corresponding embodiment of fig. 2.
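As a non-limiting illustration of how the selecting unit 501 and the generating unit 502 could cooperate, the following minimal sketch models each unit as a class. The class names, the dictionary-based page, the lookup tables `intents` and `descriptions`, the `matches` callable, taking the first N results as the selection, and dividing by the number of selected results are all assumptions made for illustration; the ratio test is one optional way of using the determined number and the preset number.

```python
from typing import Callable, Dict, List, Optional


class SelectingUnit:
    def __init__(self, preset_number: int) -> None:
        self.preset_number = preset_number

    def select(self, search_results: List[str]) -> List[str]:
        # Assumption: selection takes the first N results in their given order.
        return search_results[: self.preset_number]


class GeneratingUnit:
    def __init__(
        self,
        intents: Dict[str, str],
        descriptions: Dict[str, str],
        matches: Callable[[str, str], bool],
        ratio_threshold: float = 0.6,
    ) -> None:
        self.intents = intents            # entity word -> search intention information
        self.descriptions = descriptions  # entity word -> pre-generated description
        self.matches = matches            # stand-in for the recognition model
        self.ratio_threshold = ratio_threshold

    def generate(self, search_info: str, selected: List[str]) -> Dict[str, object]:
        page: Dict[str, object] = {"description": None, "results": selected}
        intent: Optional[str] = self.intents.get(search_info)
        if intent is None or not selected:
            return page  # search information does not correspond to an entity object
        matched = sum(1 for summary in selected if self.matches(summary, intent))
        if matched / len(selected) > self.ratio_threshold:
            page["description"] = self.descriptions.get(search_info)
        return page


class PageGenerationApparatus:
    def __init__(self, selecting_unit: SelectingUnit, generating_unit: GeneratingUnit) -> None:
        self.selecting_unit = selecting_unit
        self.generating_unit = generating_unit

    def handle(self, search_info: str, search_results: List[str]) -> Dict[str, object]:
        selected = self.selecting_unit.select(search_results)
        return self.generating_unit.generate(search_info, selected)


apparatus = PageGenerationApparatus(
    SelectingUnit(preset_number=3),
    GeneratingUnit(
        intents={"quiet night thought": "poem"},
        descriptions={"quiet night thought": "Original text of the work"},
        matches=lambda summary, intent: intent in summary,
    ),
)
page = apparatus.handle(
    "quiet night thought",
    ["poem text", "poem analysis", "movie listing", "poem author"],
)
```

Here two of the three selected summaries contain the intention word, so the ratio exceeds 0.6 and the description is attached to the page.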
In some optional implementations of this embodiment, for each search result in the preset number of search results, the generating unit 502 may first obtain summary information of the search result; then, it may generate a feature vector from the summary information and the search intention information; then, it may input the feature vector generated for the search result into a pre-trained search intention recognition model to obtain a search intention recognition result for the search result. Here, the obtained search intention recognition result may be used to indicate whether the search result matches the search intention indicated by the search intention information corresponding to the above-mentioned entity object. The search intention recognition result may be a matching identifier or a non-matching identifier. The matching identifier may be used to indicate that the search result matches the search intention indicated by the search intention information corresponding to the entity object, and may be represented by a preset first identifier, for example, 1 or T; the non-matching identifier may be used to indicate that the search result does not match the search intention indicated by the search intention information corresponding to the entity object, and may be represented by a preset second identifier, for example, 0 or F. It should be noted that the search intention recognition model may be used to characterize the correspondence between the feature vector generated from the summary information of the search result and the search intention information, on the one hand, and the search intention recognition result, on the other.
As an example, the search intention recognition model may be a correspondence table in which correspondence between a plurality of feature vectors and the search intention recognition result is stored, which is prepared in advance by a technician based on statistics of a large number of feature vectors and the search intention recognition result. Here, the method of generating feature vectors from text information is a well-known technology widely studied and applied at present, and is not described herein again.
In some optional implementations of this embodiment, the apparatus 500 for generating a page may further include a training unit 503 for training the search intention recognition model, and the training unit 503 may include an obtaining module 5031, a generating module 5032, and a training module 5033. The training unit 503 may train in advance to obtain the search intention recognition model according to the following steps:
First, the obtaining module 5031 may obtain a predetermined sample data set, where each sample data in the sample data set includes search intention sample information, summary sample information of a search result, and a search intention recognition result.
Thereafter, for each sample data in the sample data set, the generating module 5032 may generate a feature vector corresponding to the search intention sample information and the summary sample information in the sample data. For each matching sample data, the generating module 5032 may determine the feature vector generated for the matching sample data as a first feature vector; for each unmatched sample data, the generation module 5032 may determine the feature vector generated for the unmatched sample data as the second feature vector.
Finally, the training module 5033 may use a machine learning method to train the search intention recognition model by taking the feature vector generated for each sample data in the sample data set as an input and taking the search intention recognition result in the sample data as an output. Specifically, the training module 5033 may use a classification model such as a naive Bayes model or a support vector machine, take the first feature vector as a model input with the matching identifier as the corresponding model output, take the second feature vector as a model input with the non-matching identifier as the corresponding model output, and train the model by using a machine learning method to obtain the search intention recognition model.
In some optional implementation manners of this embodiment, the generating unit 502 may display pre-generated description information for the entity object on the target area of the search result page. The target area may be a predetermined area for presenting description information for the entity object, and may be a top area of the search result presentation area in general.
In some optional implementations of the present embodiment, the generating unit 502 may include a determining module 5021 and a generating module 5022. The determining module 5021 may determine whether a ratio of the determined number to the preset number is greater than a preset ratio threshold, and if the ratio is greater than the ratio threshold, the generating module 5022 may generate a search result page including pre-generated description information for the entity object. The above description information may be an explanation of a concept such as an entity, a thing, or a phenomenon. For example, the description information for the ancient poem "Quiet Night Thought" may be the original text of the work; the description information for the poet Li Bai may be a biographical profile of Li Bai; and the description information for the Fourier transform may be the Fourier transform formula, an explanation of the formula, and so on.
Referring now to FIG. 6, shown is a block diagram of a computer system 600 suitable for use with the electronic device implementing an embodiment of the present invention. The electronic device shown in fig. 6 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present application.
As shown in fig. 6, the computer system 600 includes a Central Processing Unit (CPU)601 that can perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM)602 or a program loaded from a storage section 608 into a Random Access Memory (RAM) 603. In the RAM 603, various programs and data necessary for the operation of the system 600 are also stored. The CPU 601, ROM 602, and RAM 603 are connected to each other via a bus 604. An input/output (I/O) interface 605 is also connected to bus 604.
The following components are connected to the I/O interface 605: an input portion 606 including a keyboard, a mouse, and the like; an output portion 607 including a display such as a Liquid Crystal Display (LCD) and a speaker; a storage section 608 including a hard disk and the like; and a communication section 609 including a network interface card such as a LAN card, a modem, or the like. The communication section 609 performs communication processing via a network such as the internet. A drive 610 is also connected to the I/O interface 605 as needed. A removable medium 611 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is mounted on the drive 610 as necessary, so that a computer program read out therefrom is installed in the storage section 608 as necessary.
In particular, according to an embodiment of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method illustrated in the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication section 609, and/or installed from the removable medium 611. The computer program performs the above-described functions defined in the method of the present application when executed by a Central Processing Unit (CPU) 601. It should be noted that the computer readable medium mentioned above in the present application may be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present application, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. 
In this application, however, a computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units described in the embodiments of the present invention may be implemented by software or hardware. The described units may also be provided in a processor, which may be described as: a processor including a selecting unit and a generating unit. The names of these units do not, in some cases, constitute a limitation on the units themselves. For example, the selecting unit may also be described as a "unit that selects a preset number of search results among search results corresponding to search information in response to receiving the search information".
As another aspect, the present application also provides a computer-readable medium, which may be contained in the apparatus described in the above embodiments; or may be present separately and not assembled into the device. The computer readable medium carries one or more programs which, when executed by the apparatus, cause the apparatus to: in response to receiving the search information, selecting a preset number of search results from the search results corresponding to the search information; and in response to determining that the search information corresponds to the entity object, acquiring search intention information corresponding to the entity object determined in advance, determining the number of search results matched with the search intention indicated by the search intention information corresponding to the entity object in a preset number of search results, and generating a search result page comprising the description information generated in advance and aiming at the entity object based on the determined number and the preset number.
The foregoing description covers only the preferred embodiments of the invention and illustrates the principles of the technology employed. It will be appreciated by those skilled in the art that the scope of the present invention is not limited to technical solutions formed by the specific combination of the above-mentioned features, but also encompasses other technical solutions formed by any combination of the above-mentioned features or their equivalents without departing from the scope of the invention as defined by the appended claims, for example, technical solutions formed by replacing the above features with (but not limited to) features having similar functions disclosed in the present invention.

Claims (12)

1. A method for generating a page, comprising:
in response to receiving search information, selecting a preset number of search results from the search results corresponding to the search information;
in response to determining that the search information corresponds to an entity object, acquiring search intention information corresponding to the entity object, which is determined in advance, determining the number of search results matched with the search intention indicated by the search intention information corresponding to the entity object from the preset number of search results, and generating a search result page including description information, which is generated in advance, for the entity object based on the determined number and the preset number;
wherein the search intention information of each entity object is obtained by:
searching by using the entity word corresponding to the entity object as search information; generating a feature vector corresponding to the summary information of each search result obtained by the searching; inputting the generated feature vector into a pre-trained search intention prediction model to obtain search intention information of the search result, wherein the search intention prediction model is used for representing a corresponding relation between the feature vector of the summary information and the search intention information of the search result; and setting the attribute value of the search intention attribute field corresponding to the entity object as the search intention information that occurs most frequently in the obtained search intention information.
2. The method of claim 1, wherein the determining a number of search results that match a search intent indicated by search intent information corresponding to the entity object comprises:
for each search result in the preset number of search results, obtaining abstract information of the search result, generating feature vectors of the abstract information and the search intention information, and inputting the feature vectors generated for the search result into a pre-trained search intention recognition model to obtain a search intention recognition result, wherein the search intention recognition model is used for representing the corresponding relation between the feature vectors generated by the abstract information and the search intention information of the search result and the search intention recognition result, and the search intention recognition result is used for indicating whether the search result is matched with the search intention indicated by the search intention information.
3. The method of claim 2, wherein the search intent recognition model is trained by:
acquiring a predetermined sample data set, wherein each sample data in the sample data set comprises search intention sample information, abstract sample information of a search result and a search intention identification result, and the search intention identification result comprises a matching identifier and a non-matching identifier;
generating a feature vector corresponding to the search intention sample information and the abstract sample information in the sample data aiming at each sample data in the sample data set;
and training to obtain a search intention recognition model by using a machine learning method and taking the feature vector generated for each sample data in the sample data set as input and the search intention recognition result in the sample data as output.
4. The method of claim 1, wherein the generating a search results page including pre-generated description information for the entity object comprises:
and displaying the pre-generated description information aiming at the entity object in a target area of the search result page.
5. The method according to one of claims 1 to 4, wherein generating a search result page including pre-generated description information for the entity object based on the determined number and the preset number comprises:
determining whether a ratio of the determined number to the preset number is greater than a preset ratio threshold;
and if so, generating a search result page comprising pre-generated description information aiming at the entity object.
6. An apparatus for generating a page, comprising:
a selecting unit configured to, in response to receiving search information, select a preset number of search results from the search results corresponding to the search information;
a generating unit configured to, in response to determining that the search information corresponds to an entity object, acquire search intention information corresponding to the entity object that is determined in advance, determine, among the preset number of search results, the number of search results that match a search intention indicated by the search intention information corresponding to the entity object, and generate, based on the determined number and the preset number, a search result page that includes description information for the entity object that is generated in advance;
wherein the search intention information of each entity object is obtained by:
searching by using the entity word corresponding to the entity object as search information; generating a feature vector corresponding to the summary information of each search result obtained by the searching; inputting the generated feature vector into a pre-trained search intention prediction model to obtain search intention information of the search result, wherein the search intention prediction model is used for representing a corresponding relation between the feature vector of the summary information and the search intention information of the search result; and setting the attribute value of the search intention attribute field corresponding to the entity object as the search intention information that occurs most frequently in the obtained search intention information.
7. The apparatus of claim 6, wherein the generating unit is further configured to:
for each search result in the preset number of search results, obtaining abstract information of the search result, generating feature vectors of the abstract information and the search intention information, and inputting the feature vectors generated for the search result into a pre-trained search intention recognition model to obtain a search intention recognition result, wherein the search intention recognition model is used for representing the corresponding relation between the feature vectors generated by the abstract information and the search intention information of the search result and the search intention recognition result, and the search intention recognition result is used for indicating whether the search result is matched with the search intention indicated by the search intention information.
8. The apparatus of claim 7, wherein the apparatus further comprises a training unit for training a search intention recognition model, the training unit comprising:
an acquisition module configured to acquire a predetermined sample data set, wherein each sample data in the sample data set comprises search intention sample information, abstract sample information of a search result and a search intention identification result, and the search intention identification result comprises a matching identifier and a non-matching identifier;
a generating module configured to generate, for each sample data in the sample data set, a feature vector corresponding to the search intention sample information and the abstract sample information in the sample data;
and a training module configured to train, by using a machine learning method, a search intention recognition model by taking the feature vector generated for each sample data in the sample data set as input and the search intention identification result in the sample data as output.
9. The apparatus of claim 6, wherein the generating unit is further configured to:
and displaying the pre-generated description information aiming at the entity object in a target area of the search result page.
10. The apparatus according to one of claims 6-9, wherein the generating unit comprises:
a determining module configured to determine whether a ratio of the determined number to the preset number is greater than a preset ratio threshold;
a generating module configured to generate a search result page including pre-generated description information for the entity object in response to determining that the ratio is greater than the ratio threshold.
11. An electronic device, comprising:
one or more processors;
a storage device storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any one of claims 1-5.
12. A computer-readable storage medium, on which a computer program is stored, which program, when being executed by a processor, carries out the method according to any one of claims 1-5.
CN201711339545.5A 2017-12-14 2017-12-14 Method and device for generating page Active CN108052613B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711339545.5A CN108052613B (en) 2017-12-14 2017-12-14 Method and device for generating page

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201711339545.5A CN108052613B (en) 2017-12-14 2017-12-14 Method and device for generating page

Publications (2)

Publication Number Publication Date
CN108052613A CN108052613A (en) 2018-05-18
CN108052613B true CN108052613B (en) 2021-12-31

Family

ID=62132891

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711339545.5A Active CN108052613B (en) 2017-12-14 2017-12-14 Method and device for generating page

Country Status (1)

Country Link
CN (1) CN108052613B (en)

Families Citing this family (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109165344A (en) * 2018-08-06 2019-01-08 百度在线网络技术(北京)有限公司 Method and apparatus for pushed information
CN109255036B (en) * 2018-08-31 2020-02-18 北京字节跳动网络技术有限公司 Method and apparatus for outputting information
CN109348275B (en) * 2018-10-30 2021-07-30 百度在线网络技术(北京)有限公司 Video processing method and device
CN111276136A (en) * 2018-12-04 2020-06-12 北京京东尚科信息技术有限公司 Method, apparatus, system, and medium for controlling electronic device
CN109684633B (en) * 2018-12-14 2023-05-16 北京百度网讯科技有限公司 Search processing method, device, equipment and storage medium
CN110543592B (en) * 2019-08-27 2022-04-01 北京百度网讯科技有限公司 Information searching method and device and computer equipment
CN110941765A (en) * 2019-12-04 2020-03-31 青梧桐有限责任公司 Search intention identification method, information search method and device and electronic equipment
CN111061754B (en) * 2019-12-10 2023-03-14 北京明略软件***有限公司 Family map determining method and device, electronic equipment and storage medium
CN111198971B (en) * 2020-01-15 2023-06-06 北京百度网讯科技有限公司 Searching method, searching device and electronic equipment
CN111324819B (en) * 2020-03-24 2021-07-30 北京字节跳动网络技术有限公司 Method and device for searching media content, computer equipment and storage medium
CN111522927B (en) * 2020-04-15 2023-07-14 北京百度网讯科技有限公司 Entity query method and device based on knowledge graph
CN111625680B (en) * 2020-05-15 2023-08-25 青岛聚看云科技有限公司 Method and device for determining search results
CN114580426A (en) * 2020-12-01 2022-06-03 阿里巴巴集团控股有限公司 User intention identification method, interaction method, device, system, equipment and medium
CN113486253B (en) * 2021-07-30 2024-03-19 抖音视界有限公司 Search result display method, device, equipment and medium

Citations (6)

Publication number Priority date Publication date Assignee Title
CN103020066A (en) * 2011-09-21 2013-04-03 北京百度网讯科技有限公司 Method and device for recognizing search demand
CN104838375A (en) * 2012-11-13 2015-08-12 微软技术许可有限责任公司 Intent-based presentation of search results
CN105095187A (en) * 2015-08-07 2015-11-25 广州神马移动信息科技有限公司 Search intention identification method and device
CN105677931A (en) * 2016-04-07 2016-06-15 北京百度网讯科技有限公司 Information search method and device
CN106096037A (en) * 2016-06-27 2016-11-09 北京百度网讯科技有限公司 Search result aggregation method and device based on artificial intelligence, and search engine
WO2017118427A1 (en) * 2016-01-07 2017-07-13 腾讯科技(深圳)有限公司 Webpage training method and device, and search intention identification method and device

Family Cites Families (1)

Publication number Priority date Publication date Assignee Title
CN102236663B (en) * 2010-04-30 2014-04-09 阿里巴巴集团控股有限公司 Query method, query system and query device based on vertical search

Also Published As

Publication number Publication date
CN108052613A (en) 2018-05-18

Similar Documents

Publication Publication Date Title
CN108052613B (en) Method and device for generating page
CN108153901B (en) Knowledge graph-based information pushing method and device
US11669579B2 (en) Method and apparatus for providing search results
US11151177B2 (en) Search method and apparatus based on artificial intelligence
CN107256267B (en) Query method and device
CN107679039B (en) Method and device for determining statement intention
CN106960030B (en) Information pushing method and device based on artificial intelligence
US9471874B2 (en) Mining forums for solutions to questions and scoring candidate answers
US11042542B2 (en) Method and apparatus for providing aggregate result of question-and-answer information
CN107241260B (en) News pushing method and device based on artificial intelligence
CN108256070B (en) Method and apparatus for generating information
CN111522927B (en) Entity query method and device based on knowledge graph
US20200045122A1 (en) Method and apparatus for pushing information
CN106919711B (en) Method and device for labeling information based on artificial intelligence
US20150309988A1 (en) Evaluating Crowd Sourced Information Using Crowd Sourced Metadata
US10095736B2 (en) Using synthetic events to identify complex relation lookups
US11244153B2 (en) Method and apparatus for processing information
US20160110364A1 (en) Realtime Ingestion via Multi-Corpus Knowledge Base with Weighting
CN110737824B (en) Content query method and device
CN112052297A (en) Information generation method and device, electronic equipment and computer readable medium
US20210004406A1 (en) Method and apparatus for storing media files and for retrieving media files
CN113590756A (en) Information sequence generation method and device, terminal equipment and computer readable medium
CN106549860B (en) Information acquisition method and device
CN110895587A (en) Method and device for determining target user
US20160124961A1 (en) Using Priority Scores for Iterative Precision Reduction in Structured Lookups for Questions

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant