CN112347365A - Target search information determination method and device - Google Patents

Target search information determination method and device

Info

Publication number
CN112347365A
Authority
CN
China
Prior art keywords
target
search
word
text
target search
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011334168.8A
Other languages
Chinese (zh)
Inventor
康战辉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN202011334168.8A priority Critical patent/CN112347365A/en
Publication of CN112347365A publication Critical patent/CN112347365A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G06F40/216 Parsing using statistical methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Evolutionary Computation (AREA)
  • Databases & Information Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Probability & Statistics with Applications (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application provides a method and a device for determining target search information, relating to the technical field of computers. The method comprises the following steps: in response to an operation of selecting target search information triggered by a target user for a target text, displaying at least one target search information set, wherein each target search information set comprises at least two different target search words, each target search word is partial text information in the target text, and the positions in the target text of any two target search words in each target search information set are discontinuous; and in response to a search operation triggered by the target user for any target search information set, searching based on the target search words in that target search information set. The method can meet the target user's requirement through a single search, which improves search efficiency, provides convenience for the target user, and improves the target user's experience.

Description

Target search information determination method and device
Technical Field
The present application relates to the field of computer technologies, and in particular, to a method and an apparatus for determining target search information.
Background
With the development of computer technology and network technology, search requirements are ubiquitous. A user can generally input a search statement reflecting own information acquisition requirements through an application providing information search service, and trigger the application to search in an information database available in a network according to the search statement to obtain and display a corresponding search result so as to acquire information meeting the requirements.
In the related art, besides the above search method, a user may search in a common reading scene, for example, after reading an article, the user may search for a word of interest.
Under a reading scene in the related art, the search requirement of a user on a single search word can be met, but the search requirement of the user on inquiring a plurality of search words in an article cannot be met.
Disclosure of Invention
The embodiment of the application provides a method and a device for determining target search information, which are used for meeting the search requirement of a user for inquiring a plurality of search terms in an article.
In one aspect, an embodiment of the present application provides a method for determining target search information, where the method includes:
responding to an operation of selecting target search information triggered by a target user aiming at a target text, and displaying at least one target search information set, wherein the target search information set comprises at least two different target search words, each target search word is partial text information in the target text, and the positions of any two target search words in each target search information set in the target text are discontinuous;
and responding to a search operation triggered by the target user aiming at any target search information set, and searching based on the target search words in the target search information set.
In one aspect, an embodiment of the present application provides an apparatus for determining target search information, including:
the target search information set display unit is used for responding to a target search information selection operation triggered by a target user aiming at a target text and displaying at least one target search information set, wherein the target search information set comprises at least two different target search words, each target search word is partial text information in the target text, and the positions of any two target search words in each target search information set in the target text are discontinuous;
and the searching unit is used for responding to a search operation triggered by the target user aiming at any target search information set and searching based on the target search words in the target search information set.
Optionally, the apparatus further includes a target search information set determining unit, where the target search information set determining unit is configured to:
obtaining a search word set to be selected, wherein the search word set to be selected comprises at least two target search words, and the position of each target search word in a target text is discontinuous;
determining the recommendation degree of the search word set to be selected based on a first historical frequency with which the search word set to be selected was searched in a first set time period and a second historical frequency with which each target search word was searched;
and if the recommendation degree is determined to be matched with the recommendation degree threshold value, the search word set to be selected is used as the target search information set.
Optionally, the target search information set determining unit is further configured to:
removing stop words in the target text to obtain a target text to be segmented;
performing word segmentation processing on a target text to be segmented to obtain a word segmentation result;
taking each noun in the word segmentation result as a search word to be selected;
determining the historical search frequency of each search word to be selected in the word segmentation result, which is searched in a second set time period, and taking each search word to be selected, of which the historical search frequency is greater than a search frequency threshold value, as a target search word.
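The word-selection steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the application: the stop-word list `STOP_WORDS`, the noun lookup `NOUNS` (a stand-in for a real part-of-speech tagger), the whitespace tokenization, and the search-log layout (one list of words per historical query) are all hypothetical.

```python
from collections import Counter

# Hypothetical resources; a real system would use a proper stop-word list
# and a part-of-speech tagger instead of these toy sets.
STOP_WORDS = {"the", "a", "of", "into", "and", "is", "are"}
NOUNS = {"patent", "invention", "design", "type"}


def select_target_search_words(text, search_log, freq_threshold):
    """Return nouns from `text` whose historical search frequency exceeds the threshold."""
    # Word segmentation (assumption: whitespace splitting with punctuation stripped).
    tokens = [t.strip(".,;").lower() for t in text.split()]
    # Remove stop words.
    tokens = [t for t in tokens if t and t not in STOP_WORDS]
    # Keep each noun once, in order of first appearance, as a candidate search word.
    candidates = []
    for token in tokens:
        if token in NOUNS and token not in candidates:
            candidates.append(token)
    # Historical search frequency: number of logged queries that contain the word.
    freq = Counter(word for query in search_log for word in set(query))
    return [word for word in candidates if freq[word] > freq_threshold]
```

Candidates that pass the threshold become the target search words from which search word sets are later assembled.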
Optionally, the target search information set determining unit is further configured to:
and determining that the similarity between the semantics of the search word set to be selected and the semantics of the target text is greater than a semantic similarity threshold.
Optionally, the target search information set determining unit is specifically configured to:
converting each target search word in the search word set to be selected into a corresponding N-dimensional word vector, and summing the i-th components of the word vectors of all the target search words to obtain the i-th component of a first semantic vector of the search word set to be selected, wherein N is greater than or equal to 1 and i ranges from 1 to N;
determining an N-dimensional word vector corresponding to each piece of text information in the target text, and summing the i-th components of these word vectors to obtain the i-th component of a second semantic vector of the target text;
and determining the similarity between the semantics of the search word set to be selected and the semantics of the target text according to the first semantic vector and the second semantic vector.
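The vector steps above can be illustrated with a short sketch. The text states only that the similarity is determined from the two semantic vectors; cosine similarity is used here as an assumption, and the helper names `sum_vectors` and `cosine_similarity` are hypothetical.

```python
import math


def sum_vectors(vectors):
    """Component-wise sum of equal-length word vectors (the semantic vector)."""
    n = len(vectors[0])
    if any(len(v) != n for v in vectors):
        raise ValueError("all word vectors must have the same dimension N")
    return [sum(v[i] for v in vectors) for i in range(n)]


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)
```

A search word set would then be kept only if `cosine_similarity(first, second)` exceeds the semantic similarity threshold, where `first` is the summed vector of the set and `second` is the summed vector of the target text.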
Optionally, the target search information set determining unit is specifically configured to:
obtaining a search log for the first set time period, determining from the search log a first historical frequency with which all target search words in the search word set to be selected were searched together, and determining a second historical frequency with which each target search word was searched in the search log;
and determining a first target value according to the product of the first historical frequency and the number of target search words in the search log, determining a second target value according to the product of the second historical frequencies of the target search words in the set, and determining the recommendation degree of the search word set to be selected according to the quotient of the first target value and the second target value.
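The quotient described above can be sketched as follows. This is an illustration under assumptions rather than the application's implementation: the search log is modelled as a list of queries, each a set of words, and the "number of target search words in the search log" is taken to be the number of logged queries (`log_size`). Both choices, and the function names, are hypothetical.

```python
from math import prod


def count_frequencies(search_log, candidate_set):
    """Return (first historical frequency, list of second historical frequencies)."""
    words = list(candidate_set)
    # First historical frequency: queries containing every word of the set.
    co_count = sum(1 for query in search_log if all(w in query for w in words))
    # Second historical frequency: queries containing each individual word.
    single_counts = [sum(1 for query in search_log if w in query) for w in words]
    return co_count, single_counts


def recommendation_degree(search_log, candidate_set):
    """Quotient of the first target value and the second target value."""
    co_count, single_counts = count_frequencies(search_log, candidate_set)
    log_size = len(search_log)  # assumption: size of the log in queries
    first_target = co_count * log_size
    second_target = prod(single_counts)
    if second_target == 0:
        return 0.0  # a word that was never searched gives no evidence
    return first_target / second_target
```

Under this reading, the quotient is large when the words are searched together more often than their individual frequencies would suggest, which is the same intuition as pointwise mutual information without the logarithm.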
Optionally, the target search information set determining unit is further configured to:
obtaining the operation times of search results obtained after the search word set to be selected is searched within the first set time period;
determining the historical click rate of the search word set to be selected according to the operation times corresponding to the search word set to be selected and the first historical frequency of the search word set to be selected;
and determining the recommendation degree of the search term set to be selected according to the quotient of the first target value and the second target value and the historical click rate of the search term set to be selected.
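A minimal sketch of the click-rate adjustment follows. The text does not say how the quotient and the historical click rate are combined; multiplying them is an assumption made here for illustration, and both function names are hypothetical.

```python
def historical_click_rate(operation_count, first_historical_frequency):
    """Share of searches for the set whose results were operated on (e.g. clicked)."""
    if first_historical_frequency == 0:
        return 0.0
    return operation_count / first_historical_frequency


def combined_recommendation(quotient, click_rate):
    """Combine the frequency quotient with the click rate (assumption: a product)."""
    return quotient * click_rate
```

For example, a set searched 4 times whose results were operated on 3 times has a click rate of 0.75, which scales its frequency-based quotient accordingly.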
In one aspect, an embodiment of the present application provides a computer device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor implements the steps of the target search information determination method when executing the program.
In one aspect, embodiments of the present application provide a computer-readable storage medium storing a computer program executable by a computer device, which when the program runs on the computer device, causes the computer device to execute a target search information determination method.
In one aspect, embodiments of the present application provide a computer program product comprising a computer program or instructions that, when executed, enable at least one processor to perform a target search information determination method.
In the embodiment of the application, at least one target search information set is displayed to a user by responding to the operation of selecting target search information triggered by a target user aiming at a target text, and each target search information set comprises at least two different target search words; that is to say, when a target user needs to search for a part of target search terms in a target text, what is displayed to the user is not any target search term in the target text, but a plurality of target search information sets, and the target user can select any target search information set for searching.
Further, in the embodiment of the present application, when a target user determines a target search information set that needs to be searched, a search operation for the target search information set is responded, and a search is performed based on a target search word in the target search information set.
That is to say, in the embodiment of the present application, when a target user needs to search for part of information in a target text, multiple target search information sets can be provided to the target user, and each target search information set includes at least two different target search terms, that is, a requirement of the target user for common search of the at least two target search terms in the target text can be met, so that a requirement of the target user can be met through one search, search efficiency is improved, convenience is provided for the target user, and user experience of the target user is improved.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings needed to be used in the description of the embodiments will be briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present application, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without inventive exercise.
Fig. 1 is a diagram illustrating target search information in the related art;
fig. 2 is a schematic view of an application scenario provided in an embodiment of the present application;
fig. 3 is a schematic view of an application scenario provided in an embodiment of the present application;
fig. 4 is a schematic flowchart of a target search information determining method according to an embodiment of the present application;
fig. 5 is a schematic structural diagram of a target search information determination apparatus according to an embodiment of the present application;
fig. 6 is a schematic structural diagram of a computer device according to an embodiment of the present application;
fig. 7 is a schematic structural diagram of a Word2Vec model according to an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the technical solutions in the embodiments of the present application will be described clearly and completely with reference to the accompanying drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application. In the present application, the embodiments and features of the embodiments may be arbitrarily combined with each other without conflict. Also, while a logical order is shown in the flow diagrams, in some cases, the steps shown or described may be performed in an order different than here.
In addition, it should be understood that the terms "system" and "network" in the embodiments of the present application may be used interchangeably. "At least one" means one or more, and "a plurality" means two or more. "And/or" describes the association relationship of the associated objects and indicates that three relationships are possible; for example, A and/or B may mean: A exists alone, A and B exist simultaneously, or B exists alone, wherein A and B may be singular or plural. The character "/" generally indicates that the former and latter associated objects are in an "or" relationship. "At least one of the following" or similar expressions refer to any combination of these items, including any combination of single or plural items. For example, at least one of a, b, or c may represent: a, b, c, a-b, a-c, b-c, or a-b-c, wherein a, b, and c may each be single or multiple. In addition, unless stated to the contrary, the ordinal numbers "first", "second", etc. in the embodiments of the present application are used to distinguish a plurality of objects and do not limit the sequence, timing, priority, or importance of the plurality of objects. For example, the first set of values and the second set of values are named only to distinguish different sets of values, not to indicate a difference in priority, degree of importance, or the like between the two sets of values.
For convenience of understanding, terms referred to in the embodiments of the present application are explained below.
Target text: any text information displayed in the interface, wherein the text information is composed of at least one word, and the word can be called a target search word; in this embodiment of the present application, the target text may be a short sentence in one article, that is, each short sentence constituting the article may be the target text, or in this embodiment of the present application, the target text is all text information displayed in the interface, and may be multiple articles, or may be part of information in one article, which is not limited herein.
Target user: a user who browses the target text. The target user may browse through any interface, and the content browsed is the target text.
Having introduced the above terms, the concepts of the present application will now be described based on the problems presented in the related art.
With the development of the internet and the development of mobile communication networks, information content is more and more abundant, and information searching methods are more and more.
A user can input a search statement reflecting own information acquisition requirements through an application providing information search service, and the application is triggered to search in an information database available in a network according to the search statement to obtain and display a corresponding search result so as to acquire information meeting the requirements.
For example, a user may input a search term in any search engine application, determine search content corresponding to the search term through a search engine, and display the search content in an interface.
In the related art, besides the above search method, a user may also search in a common reading scenario; for example, after reading an article, the user may search for a word of interest. For example, when a user reads an article in an instant messaging application, the user may become interested in a word in the article and want to search for it.
The specific search modes are various, the search can be performed through voice control, gesture control and the like, and the search operation can also be triggered after long pressing operation is performed on an article through a fingertip of a user. For example, when a user reads an article through an instant messaging application and is interested in a word in the article, the user triggers a search display interface by pressing the position of the word in the display interface, as shown in fig. 1 for example.
In fig. 1, an interface for a user to read an article is exemplarily shown, the content of the article displayed in the interface is the definition of a patent, and the user searches by long pressing the "invention" word; in fig. 1, after the user presses the "invention" word for a long time, a menu bar is displayed, the menu bar includes a search function menu, and the user can perform a search by clicking the search.
However, in the related art, when a user needs to search for a plurality of words of interest in a displayed article at the same time, the search requirement of the user cannot be met. For example, when a user wants to search for "invention" together with another word in the displayed article defining a patent, the user has to combine the words into a query manually and cannot perform the search through the menu bar that includes the search function menu.
Based on the problems in the related art, embodiments of the present application provide a method and an apparatus for determining target search information, where at least one target search information set may be displayed by responding to a target search information selection operation triggered by a target user for a target text, where the target search information set includes at least two different target search terms, each target search term is partial text information in the target text, and positions of any two target search terms in each target search information set in the target text are discontinuous.
Further, in response to a search operation triggered by the target user for any target search information set, a search is performed based on the target search words in that target search information set.
In the embodiment of the application, at least one target search information set is displayed to a user by responding to the operation of selecting target search information triggered by a target user aiming at a target text, and each target search information set comprises at least two different target search words; that is to say, when a target user needs to search for a part of target search terms in a target text, what is displayed to the user is not any target search term in the target text, but a plurality of target search information sets, and the target user can select any target search information set for searching.
Further, in the embodiment of the present application, when a target user determines a target search information set that needs to be searched, a search operation for the target search information set is responded, and a search is performed based on a target search word in the target search information set.
That is to say, in the embodiment of the present application, when a target user needs to search for part of information in a target text, multiple target search information sets can be provided to the target user, and each target search information set includes at least two different target search terms, that is, a requirement of the target user for common search of the at least two target search terms in the target text can be met, so that a requirement of the target user can be met through one search, search efficiency is improved, convenience is provided for the target user, and user experience of the target user is improved.
Having described the inventive concept of the present application, a system architecture applicable to the present application is described first. Referring to fig. 2, the system architecture includes at least one terminal device 201; a plurality of applications may run in the terminal device 201, and target text information may be displayed in at least one of the applications.
In the embodiment of the present application, a client of each application may be installed in the terminal device 201. The client in the terminal device 201 may be an application client such as a browser client store; the terminal device 201 may also run an applet for each application, which is an application that can be used without downloading and installing. In order to provide more diversified business services to the user, the developer may develop a corresponding applet for an application (e.g., an instant messaging application, a shopping application, a mail application, etc.) of the terminal device 201, and the applet may be embedded as a sub-application in the application of the terminal device 201, and the corresponding business service may be provided to the user by running the sub-application (i.e., the corresponding applet) in the application.
The client in the terminal device 201 is the client of each application; that is, each application can be run independently on the terminal device 201, and the state information data of each controllable object in the terminal device 201 is reported to the server 202.
The terminal device 201 may display at least one target search information set in response to a target search information selection operation triggered by a target user for a target text, where the target search information set includes at least two different target search words, each target search word is partial text information in the target text, and positions of any two target search words in each target search information set in the target text are discontinuous.
Further, the terminal device 201 may, in response to a search operation triggered by the target user for any target search information set, perform a search based on the target search words in that target search information set.
Optionally, in this embodiment of the application, the terminal device 201 may perform the search locally, that is, search in records stored locally in the terminal device 201. In another optional embodiment, the system architecture further includes a server 202: after responding to a search operation triggered by the target user for any target search information set, the terminal device 201 may send a search request to the server 202 and obtain the search content determined by the server 202.
Further, in this embodiment of the application, after responding to a target search information selection operation triggered by the target user for the target text, the terminal device 201 may send a target search information request to the server 202, and the server 202 sends the determined target search information set to the terminal device 201 for display.
In the present embodiment, server 202 is also capable of determining a target set of search information in a target text.
Further, in this embodiment of the application, the information of the target search information set sent by the server 202 to the terminal device 201 may be determined by the server 202 after receiving the target search information request, or may be determined by the server 202 before receiving the target search information request, which is not described herein again.
Terminal device 201 may include one or more processors 2011, memory 2012, I/O interfaces 2013 that interact with server 202, and a display panel 2014, among others. The terminal device 201 may be, but is not limited to, a smart phone, a tablet computer, a notebook computer, a desktop computer, a smart speaker, a smart watch, and the like.
The server 202 is a device providing computing power, and the server 202 may include one or more processors 2021, a memory 2022, an I/O interface 2023 interacting with the terminal device 201, and the like. In addition, the server 202 may also be configured with a database 2024. The server 202 may be an independent physical server, a server cluster or a distributed system formed by a plurality of physical servers, or a cloud server providing basic cloud computing services such as a cloud service, a cloud database, cloud computing, a cloud function, cloud storage, a network service, cloud communication, a middleware service, a domain name service, a security service, a Content Delivery Network (CDN), and a big data and artificial intelligence platform. The terminal device 201 and the server 202 may be directly or indirectly connected through wired or wireless communication, and the application is not limited herein.
Illustratively, the system architecture in the embodiment of the present application includes a plurality of terminal devices 201, each terminal device 201 may run an instant messaging application, and the instant messaging application has a reading function, so that a user can browse target text information through the instant messaging application.
When a target user triggers the operation of selecting target search information for the target text, the terminal device 201 sends a target search information obtaining request to the server 202, where the server 202 is the instant messaging application server. After receiving the request, the server 202 determines a target search information set for the target text, and the terminal device 201 displays the target search information set.
Fig. 3 illustrates this process. In fig. 3, after the user triggers the selection operation on the target text "patent classification into three types of invention, and design", the terminal device 201 sends a target search information acquisition request to the server 202.
The server 202 determines a target search information set for the target text upon receiving the target search information obtaining request, and the terminal device 201 displays the target search information set.
In fig. 3, the target search information set is displayed as "patent, invention", "patent, design", and the like.
And when the target user clicks any one of the target search information sets, searching for the target search words in the target search information set.
Of course, the above embodiments only describe one target search information determining method by way of example, and there are other target search information determining methods, which are not limited herein.
Based on the design concept and the application scenario, a method for determining target search information according to an embodiment of the present application is specifically described below.
As shown in fig. 4, an embodiment of the present application provides a method for determining target search information, which specifically includes:
Step S401: in response to an operation of selecting target search information triggered by a target user for a target text, displaying at least one target search information set, wherein the target search information set comprises at least two different target search words, each target search word is partial text information in the target text, and the positions in the target text of any two target search words in each target search information set are discontinuous.
In this embodiment of the application, an execution subject for executing the target search information determination method may be a target search information determination device, and the device may be any device in the terminal device, may also be any device in the server, or may be an application running in the terminal device or the server, which is not limited herein.
In the embodiment of the present application, the operation of selecting target search information for a target text can be triggered in various ways. It may be triggered by voice, for example, the target user says "select target search information"; it may be triggered by a gesture; or the target user may trigger it through a set control, or a combination of set controls, in the terminal device.
In yet another alternative embodiment, the target user triggers the operation of selecting the target search information by pressing the target text for a set time. For example, when the target user continuously presses the target text for 5 s, the operation of selecting the target search information is triggered for that target text. Of course, 5 s is only an exemplary value, and the set time may be set in other ways, which are not described herein again.
In the embodiment of the application, the target text may be the entire content of an article browsed by the user, that is, all text information actually displayed in the display interface is the target text; the target text may also be part of the content of the article browsed by the user, for example, a certain short sentence or a certain piece of text information actually displayed in the display interface. Of course, there are other ways to define the target text, which are not described herein.
In this embodiment of the present application, in response to the operation of selecting target search information triggered by the target user for the target text, at least one target search information set can be displayed. Specifically, the target search information set includes at least two different target search words, each target search word is partial text information in the target text, and the positions in the target text of any two target search words in each target search information set are not continuous.
That is to say, in the embodiment of the present application, when a target user needs to search for target search information of interest in a target text, a determined target search information set is displayed, and the target search information set is recommended to the target user as recommended content.
Specifically, the number of the target search information sets displayed to the target user is not limited, and at least one target search information set is displayed, where the number of the displayed target search information sets may be set by default, or may be set by the target user during browsing, and is not described herein again.
In the embodiment of the present application, the target search information set includes at least two different target search words, so that the target user can perform a joint search based on them. The number of target search words in a given target search information set is not limited, provided that there are at least two; the specific number may be set by default or set by the target user during browsing, and is not described herein again.
In the embodiment of the application, since the operation of selecting the target search information is triggered by the target user for the target text, each search word in the target search information set is also part of the text information in the target text.
In the embodiment of the present application, the target search term may be a single character, or may be a term formed by a plurality of characters. Illustratively, the target search term may be a single character of the word "patent" (专利), such as "专" or "利", and the target search term may also be the whole word "patent". Of course, the target search term has other definition modes, which are not described herein.
Further, since the related art can already search for text information selected by a user, which is continuous text, in the embodiment of the present application the positions in the target text of any two target search terms in each target search information set are not continuous.
Illustratively, the text information browsed by the target user is composed of a plurality of short sentences, and each short sentence can be a target text corresponding to the target user; specifically, the text information browsed by the target user is composed of a short sentence A, a short sentence B and a short sentence C, and when the target user triggers an operation of selecting the target search information for the short sentence C, a target search information set for the short sentence C is displayed.
Specifically, the short sentence C includes a target search word 1, a target search word 2, a target search word 3, and a target search word 4, which appear in the short sentence C in that order.
Each target search information set displayed for the short sentence C is composed of at least two discontinuous target search words. Exemplarily, the target search information sets displayed for the short sentence C include a set 1, a set 2, and a set 3: the content of the set 1 is the target search word 1 and the target search word 3; the content of the set 2 is the target search word 1 and the target search word 4; and the content of the set 3 is the target search word 2 and the target search word 4.
Of course, the above is only an exemplary method for displaying the target search information, and there are other methods for displaying the target search information, which are not described herein again.
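By way of a non-limiting illustration, the enumeration of sets 1 to 3 for the short sentence C may be sketched as follows. The sketch assumes that two target search words are "continuous" when they are neighbours in the ordered word sequence; the function name `discontinuous_sets` and the placeholder word labels are hypothetical and do not appear in the embodiment.

```python
from itertools import combinations


def discontinuous_sets(words, size=2):
    """Return every combination of `size` words in which no two chosen
    words are neighbours in the original ordered sequence."""
    result = []
    for idx in combinations(range(len(words)), size):
        # `idx` is sorted ascending, so checking consecutive picks is enough.
        if all(b - a > 1 for a, b in zip(idx, idx[1:])):
            result.append(tuple(words[i] for i in idx))
    return result


# Short sentence C with its four target search words, in order of appearance.
phrase_c = ["word 1", "word 2", "word 3", "word 4"]
sets = discontinuous_sets(phrase_c)
print(sets)
# [('word 1', 'word 3'), ('word 1', 'word 4'), ('word 2', 'word 4')]
```

The three tuples correspond to the set 1, the set 2, and the set 3 described above; with only four words, no set of three mutually discontinuous words exists.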
In the embodiment of the present application, in order to be able to display at least one target search information set in time after responding to a target search information selection operation triggered by a target user for a target text, it is further necessary to determine the target search information set before responding to the target search information selection operation triggered by the target user for the target text.
In the embodiment of the present application, an execution subject for determining the target search information set is not limited, and the execution subject may be a target search information determination device, and the device may be any device in the terminal device, may also be any device in the server, or may be an application running in the terminal device or the server, which is not limited herein.
The following describes how to determine a target search information set without limiting the execution subject.
In the embodiment of the application, when a target search information set corresponding to a target text is determined, a search word set to be selected is required to be obtained first, the search word set to be selected includes at least two target search words, and the position of each target search word in the target text is discontinuous.
In the embodiment of the application, each to-be-selected search term set includes at least two target search terms, and the number of the target search terms is not limited herein, where the target search terms are text information with discontinuous positions in a target text.
In the embodiment of the application, in order to meet the search requirement of the target user, search correlation should exist among the target search words in a displayed target search information set, so that the set meets the target user's need to search several pieces of information in the target text together. When search correlation exists among the target search words in a to-be-selected search word set, or the search correlation satisfies a search correlation condition, the to-be-selected search word set can be determined to be a target search information set, and the target search words in it can meet the target user's requirement of simultaneously searching at least two discontinuous pieces of text information in the target text.
In the embodiment of the application, it is first determined whether the to-be-selected search word set is a target search information set, specifically by determining the search correlations between different target search words in the to-be-selected search word set.
Before describing how the search correlation between different target search words is determined, the target search words in the target text need to be determined first.
In the embodiment of the application, punctuation and other stop words in the target text are unlikely to be used as target search words, so all punctuation and other stop words in the target text can be removed.
Target search words are then determined in the target text from which punctuation and stop words have been removed; in the embodiment of the application, for convenience of description, this text is referred to as the target text to be segmented.
In the embodiment of the application, all words in the target text to be segmented can be used as target search words, or only the words in the target text to be segmented that meet a search condition can be used as target search words.
In an optional embodiment, the search condition may be determined according to the type of the word searched by the target user, for example, the type of the word searched by the target user is a noun, so that the word segmentation processing may be performed on the target text to be word segmented to obtain a word segmentation result; and taking each noun in the word segmentation result as a search word to be selected.
Of course, the above embodiment is only an optional method for determining the search term to be selected, and the search term to be selected in the target text to be segmented may also be determined based on the set field term or the set context term searched by the target user as the search condition.
For example, if the target user needs to search for medical nouns, all medical nouns in the target text to be segmented can be used as search words to be selected; or, if the target user needs to search for addresses, all words representing addresses in the target text to be segmented can be used as search words to be selected.
Of course, there are other methods for determining the search term to be selected in the target text to be segmented, which are not described herein again.
Further, in the embodiment of the application, after the search term to be selected is obtained, it is also required to determine whether the search term to be selected is a common search term, and if the search term is not a common search term, the search term is considered to not meet the search requirement of the user.
In an optional embodiment, the historical search frequency of each search term to be selected in the segmentation result, which is searched within the second set time period, is determined, and each search term to be selected, of which the historical search frequency is greater than the search frequency threshold value, is used as the target search term.
In the above-described embodiment, the second set time period refers to time period information that can be defined, and the time period may be arbitrarily set time period information, for example, 5 minutes, 10 minutes, 1 hour, 24 hours, and the like, and is not limited to specific time period information.
In this embodiment of the present application, after word segmentation is performed on the target text to be segmented to obtain the word segmentation result, the historical search frequency with which each noun in the word segmentation result was searched within the second set time period is determined. Optionally, this historical search frequency is determined according to the search records generated within the second set time period.
If the historical search frequency of a certain noun is determined to be greater than the search frequency threshold, the corresponding noun can be used as the target search word, and if the historical search frequency of a certain noun is determined to be not greater than the search frequency threshold, the corresponding noun can be considered as incapable of being used as the target search word.
Of course, the above embodiments are only an optional method for determining whether each word in the word segmentation result can be used as a target search word, and other methods exist, which are not described herein again.
The following exemplarily describes a method for determining each target search term from the target text, in combination with the processing procedures of the above steps.
Specifically, in the embodiment of the present application, the target text is "a patent, and literally means exclusive rights and benefits". The punctuation and stop words in the target text are removed first, and the obtained target text to be segmented is "the patent literally means exclusive rights and benefits".
Word segmentation is then carried out by using a publicly available word segmentation tool (such as jieba), and the obtained word segmentation result is patent/from/literal/upper/yes/finger/exclusive/right/benefit/.
Further, the nouns "patent", "literal", "exclusive", "right", and "benefit" in the above word segmentation result are used as the search words to be selected.
Further, search records within 1 month are obtained, and the search frequency of each search word to be selected is determined based on the search records. If the search frequencies of "literal" and "exclusive" are not greater than the search frequency threshold, the other nouns, namely "patent", "right", and "benefit", are used as target search words.
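As a hedged illustration only, the noun filtering and search-frequency filtering described above may be sketched as follows. The sketch takes an already segmented and part-of-speech-tagged word list as input, so it does not call any particular word segmentation tool; the tag "n" for nouns, the search counts, and the threshold value are all assumptions made for the example rather than values given in the embodiment.

```python
def select_target_words(tagged_words, search_counts, threshold):
    """Keep the nouns whose historical search count exceeds `threshold`.

    tagged_words  : list of (word, part_of_speech) pairs from any segmenter
    search_counts : mapping word -> times searched in the set time period
    """
    # Step 1: nouns in the segmentation result become candidate search words.
    candidates = [word for word, pos in tagged_words if pos == "n"]
    # Step 2: candidates searched more often than the threshold become targets.
    return [w for w in candidates if search_counts.get(w, 0) > threshold]


# Segmentation result from the example; the tags are assumed for illustration.
tagged = [
    ("patent", "n"), ("from", "p"), ("literal", "n"), ("upper", "f"),
    ("yes", "v"), ("finger", "v"), ("exclusive", "n"), ("right", "n"),
    ("benefit", "n"),
]
# Hypothetical search counts derived from one month of search records.
counts = {"patent": 900, "literal": 3, "exclusive": 5, "right": 400, "benefit": 350}

targets = select_target_words(tagged, counts, threshold=100)
print(targets)  # ['patent', 'right', 'benefit']
```

With these assumed counts, "literal" and "exclusive" fall below the threshold and are discarded, which mirrors the outcome described in the example.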
After each target search word is determined, each to-be-selected search word set can be determined based on the target search words; specifically, target search words whose positions in the target text are discontinuous are combined to form a to-be-selected search word set.
Further, in the embodiment of the present application, in order to ensure that the semantics of the formed to-be-selected search word set are the same as or similar to the semantics of the target text, that is, to ensure that the semantics of the to-be-selected search word set do not drift from those of the target text, after each to-be-selected search word set is determined, it is further required to determine that the similarity between the semantics of the to-be-selected search word set and the semantics of the target text is greater than the semantic similarity threshold.
In the embodiment of the present application, there are various methods for determining that the similarity between the semantics of the to-be-selected search word set and the semantics of the target text is greater than the semantic similarity threshold, for example, converting the semantics of the to-be-selected search word set into vectors, and determining that the similarity between the semantics of the to-be-selected search word set and the semantics of the target text is greater than the semantic similarity threshold according to the distance between the vectors.
Specifically, in the embodiment of the application, each target search word in the to-be-selected search word set is converted into a corresponding N-dimensional word vector, and the i-th components of the word vectors of all target search words are added to obtain the i-th component of a first semantic vector of the to-be-selected search word set, where N is greater than or equal to 1 and 1 ≤ i ≤ N; an N-dimensional word vector corresponding to each piece of text information in the target text is determined, and the i-th components of the word vectors of all pieces of text information are added to obtain the i-th component of a second semantic vector of the target text; and the similarity between the semantics of the to-be-selected search word set and the semantics of the target text is determined according to the first semantic vector and the second semantic vector.
That is to say, the to-be-selected search word set comprises a plurality of target search words, each target search word is converted into an N-dimensional word vector, and the N-dimensional word vectors are then added dimension by dimension to obtain the first semantic vector of the to-be-selected search word set; on the same principle, the word vectors of the pieces of text information in the target text are added dimension by dimension to obtain the second semantic vector of the target text.
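The dimension-wise addition described above can be written compactly as follows. The symbols are introduced here only for illustration and are not part of the embodiment: $Q$ denotes the to-be-selected search word set, $T$ denotes the pieces of text information in the target text, and $e(w) \in \mathbb{R}^N$ denotes the N-dimensional word vector of $w$ with components $e_i(w)$.

```latex
\begin{aligned}
(v_Q)_i &= \sum_{w \in Q} e_i(w), \qquad 1 \le i \le N
  && \text{(first semantic vector)} \\
(v_T)_i &= \sum_{t \in T} e_i(t), \qquad 1 \le i \le N
  && \text{(second semantic vector)}
\end{aligned}
```

Both $v_Q$ and $v_T$ are N-dimensional, so any vector similarity measure can be applied to them directly.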
In the embodiment of the application, the similarity between the semantics of the selected search term set and the semantics of the target text can be determined through a Word vector transformation Word2Vec model.
Specifically, the Word2Vec model is a neural network model that is trained to reconstruct the linguistic contexts of words. By using a dictionary, each target search word in the to-be-selected search word set can be represented through the Word2Vec model as an N-dimensional semantic vector; similarly, each word in the target text can be converted into an N-dimensional semantic vector through the Word2Vec model, and these vectors are added to obtain the N-dimensional semantic vector corresponding to the target text.
The following exemplarily introduces the Word2Vec model in conjunction with fig. 7, where the Word2Vec model in fig. 7 includes an input layer, a hidden layer, and an output layer.
First, in the input layer, the input words or text information are one-hot encoded based on a vocabulary obtained by training. Assume that 10000 unique words are extracted from the training documents to form the vocabulary. Each of the 10000 words is one-hot encoded into a 10000-dimensional vector in which every dimension takes the value 0 or 1. If the word "ants" appears at the 3rd position in the vocabulary, the vector of "ants" is a 10000-dimensional vector whose third dimension is 1 and whose other dimensions are 0, that is, ants = (0, 0, 1, 0, ..., 0).
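For illustration, the one-hot encoding described above may be sketched with a toy vocabulary of five words instead of 10000; the vocabulary contents and the helper name `one_hot` are assumptions made for the example.

```python
def one_hot(word, vocab):
    """Return a vector with 1 at the position of `word` in `vocab`, else 0."""
    vec = [0] * len(vocab)
    vec[vocab.index(word)] = 1
    return vec


# Toy vocabulary; "ants" is deliberately placed at the 3rd position.
vocab = ["apple", "bee", "ants", "cat", "dog"]
ants = one_hot("ants", vocab)
print(ants)  # [0, 0, 1, 0, 0]
```

The vector has as many dimensions as the vocabulary has words, and exactly one dimension is 1.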
In the embodiment of the application, in the hidden layer, all one-hot codes are respectively multiplied by a shared input weight matrix to obtain initial vectors, and then the initial vectors are added to calculate the average to be used as the hidden layer vector.
In the output layer, the obtained hidden layer vector is subjected to a linear transformation followed by an activation function (softmax in the standard Word2Vec model) to obtain a probability distribution, and the word or text information with the maximum probability is the prediction result of the output layer.
If the input of the Word2Vec model is a 10000-dimensional vector, the output is also a 10000-dimensional vector, which contains 10000 probabilities, and each probability represents the probability of the current Word being an output Word or text message in the input sample (i.e. the input Word or text message).
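The three layers described above may be illustrated by the following minimal forward pass. It is a sketch under stated assumptions, not the trained model of the embodiment: the weights are random, the vocabulary has 6 words and the hidden layer has 4 dimensions, softmax is assumed as the activation function, and the names `cbow_forward`, `w_in`, and `w_out` are hypothetical.

```python
import math
import random


def softmax(scores):
    """Convert raw scores into a probability distribution."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cbow_forward(context_ids, w_in, w_out):
    """w_in is a V x N input weight matrix, w_out an N x V output matrix."""
    n = len(w_in[0])
    v = len(w_out[0])
    # Multiplying a one-hot vector by w_in simply selects one row of w_in.
    rows = [w_in[i] for i in context_ids]
    # Hidden layer: average of the selected initial vectors.
    hidden = [sum(r[d] for r in rows) / len(rows) for d in range(n)]
    # Output layer: linear transformation followed by softmax.
    scores = [sum(hidden[d] * w_out[d][j] for d in range(n)) for j in range(v)]
    return softmax(scores)


random.seed(0)
V, N = 6, 4  # toy vocabulary size and hidden dimension
w_in = [[random.uniform(-1, 1) for _ in range(N)] for _ in range(V)]
w_out = [[random.uniform(-1, 1) for _ in range(V)] for _ in range(N)]

probs = cbow_forward([0, 2, 3], w_in, w_out)
predicted = max(range(V), key=probs.__getitem__)  # index of most likely word
```

The output has one probability per vocabulary word, matching the statement that a 10000-word vocabulary yields a 10000-dimensional output. After training, the rows of `w_in` serve as the N-dimensional word vectors.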
The similarity between the semantics of the search term set and the semantics of the target text may then be determined based on any method for determining the distance between vectors, and optionally, the cosine distance, the euclidean distance, or other methods for determining the similarity between vectors may be used, which is not limited herein.
Illustratively, the to-be-selected search term set comprises a target search term A and a target search term B. The target search term A is converted into a 32-dimensional vector through the Word2Vec model, and the target search term B is likewise converted into a 32-dimensional vector. The 1st component of the vector of the target search term A is added to the 1st component of the vector of the target search term B, the 2nd component of A is added to the 2nd component of B, and so on, until the 32nd component of A is added to the 32nd component of B, to obtain the first semantic vector.
Based on the same principle, the target text comprises the target search term A, the target search term B, a target search term C, and a target search term D, and each of the four target search terms is converted into a 32-dimensional vector through the Word2Vec model.
The 1st components of the vectors of the target search terms A, B, C, and D are added together, the 2nd components of the four vectors are added together, and so on, until the 32nd components of the four vectors are added together, to obtain the second semantic vector.
The cosine distance between the first semantic vector and the second semantic vector is then determined, and the similarity between the semantics of the to-be-selected search word set and the semantics of the target text is determined according to the cosine distance. If the similarity is not greater than the semantic similarity threshold, it is determined that the to-be-selected search word set does not meet the search requirement of the target user, and the to-be-selected search word set is deleted; otherwise, it is further determined whether the to-be-selected search word set is the target search information set.
Having described how the to-be-selected search term set is determined, it is also necessary to determine whether the to-be-selected search term set is the target search information set.
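The semantic check described in this example may be sketched as follows, using 3-dimensional toy vectors in place of the 32-dimensional Word2Vec vectors. The embedding values, the threshold of 0.8, and the helper names `sum_vectors`, `cosine_similarity`, and `keep_candidate` are assumptions made purely for illustration.

```python
import math


def sum_vectors(vectors):
    """Add equally sized vectors component by component."""
    return [sum(components) for components in zip(*vectors)]


def cosine_similarity(u, v):
    """Cosine of the angle between u and v (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def keep_candidate(candidate_words, text_words, embeddings, threshold):
    """True if the candidate set is semantically close enough to the text."""
    first = sum_vectors([embeddings[w] for w in candidate_words])   # set vector
    second = sum_vectors([embeddings[w] for w in text_words])       # text vector
    return cosine_similarity(first, second) > threshold


# Toy word vectors standing in for Word2Vec output.
embeddings = {
    "A": [1.0, 0.0, 0.0],
    "B": [0.0, 1.0, 0.0],
    "C": [0.0, 0.0, 1.0],
    "D": [1.0, 1.0, 0.0],
}
text = ["A", "B", "C", "D"]

print(keep_candidate(["A", "D"], text, embeddings, threshold=0.8))  # True
print(keep_candidate(["A", "C"], text, embeddings, threshold=0.8))  # False
```

With these toy values, the set {A, D} has a cosine similarity of about 0.89 with the text and is kept, whereas {A, C} scores about 0.71 and is deleted.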
In the embodiment of the application, whether the set of search terms to be selected is the set of target search information is determined by determining the search relevance between different target search terms in the set of search terms to be selected.
In the embodiment of the present application, there are various methods for determining the search correlation between different target search terms in the set of search terms to be selected, and several methods for determining the search correlation between different target search terms in the set of search terms to be selected are described below in an exemplary manner.
In an alternative embodiment, the search correlation between different target search terms may be determined based on the semantic correlation between different target search terms, and it may be considered that when the semantic correlation between different target search terms satisfies the set correlation condition, it may indicate that the search correlation between different target search terms is also satisfied.
Specifically, in the embodiment of the present application, different target search terms may be vectorized to obtain a word vector corresponding to each target search term, and the similarity between the word vectors is then determined, where the similarity reflects the semantic relevance.
Further, in the embodiment of the present application, the higher the similarity, the higher the semantic relevance can be determined to be; the lower the similarity, the lower the semantic relevance can be determined to be.
In the embodiment of the present application, there are various methods for determining the similarity between vectors, and in an optional method, the euclidean distance between different vectors is determined, and when the euclidean distance is greater than a set distance, it may be determined that the similarity between two vectors is small; when the euclidean distance is smaller than the set distance, it can be determined that the similarity between the two vectors is large.
In another alternative method, the similarity between different vectors may be measured by a manhattan distance, a chebyshev distance, a mahalanobis distance, and the like, which is not described in detail herein.
In yet another alternative method, the similarity between different vectors is determined by the correlation coefficient between the vectors, and the larger the absolute value of the correlation coefficient, the higher the similarity between the two vectors.
Of course, the above methods are only some optional methods for determining the similarity between different vectors, and other methods exist, which are not described herein again.
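Several of the measures mentioned above may be sketched as follows for two example vectors. The Pearson coefficient is assumed here as the correlation coefficient, which the embodiment does not specify; the Mahalanobis distance is omitted because it additionally requires a covariance matrix. The example vectors are arbitrary.

```python
import math


def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def manhattan(u, v):
    return sum(abs(a - b) for a, b in zip(u, v))


def chebyshev(u, v):
    return max(abs(a - b) for a, b in zip(u, v))


def pearson(u, v):
    """Pearson correlation coefficient (0.0 if either vector is constant)."""
    mean_u = sum(u) / len(u)
    mean_v = sum(v) / len(v)
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    dev_u = math.sqrt(sum((a - mean_u) ** 2 for a in u))
    dev_v = math.sqrt(sum((b - mean_v) ** 2 for b in v))
    if dev_u == 0 or dev_v == 0:
        return 0.0
    return cov / (dev_u * dev_v)


u = [1.0, 2.0, 3.0]
v = [2.0, 4.0, 7.0]
print(euclidean(u, v), manhattan(u, v), chebyshev(u, v), pearson(u, v))
```

For the distance measures, a smaller value indicates a higher similarity; for the correlation coefficient, a larger absolute value indicates a higher similarity.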
In addition to determining the search relevance between different target search terms as in the above embodiments, there is another method for determining the search relevance between different target search terms.
Specifically, in the embodiment of the application, the recommendation degree of the search term set to be selected is determined based on a first historical frequency of searching the search term set to be selected in a first set time period and a second historical frequency of searching each target search term; and if the recommendation degree is determined to be matched with the recommendation degree threshold value, the search word set to be selected is used as the target search information set.
The first set time in the above embodiment refers to any set time period, for example, may be 1 day, or may be 10 days, or may be 30 days, and the like, and is not limited specifically herein.
In the above embodiment, the recommendation degree of the to-be-selected search term set is determined based on the frequency with which the different target search terms are searched together and the frequencies with which they are searched separately. The recommendation degree thus characterizes the search correlation between the different target search terms, and a higher recommendation degree indicates a higher search correlation.
Optionally, in this embodiment of the present application, the first historical frequency and the second historical frequency may be determined based on search records, that is, all search records generated within a first set time period are obtained first, and based on the search records, the first historical frequency at which the set of search terms to be selected is searched and the second historical frequency at which each target search term is searched are determined.
Optionally, the search record may be a search record generated by a target user in one application with a search function, or a search record generated by a target user in multiple applications with a search function, or a search record generated by multiple users in the same application with a search function, or a search record generated by different users in different applications with a search function, which is not limited herein.
Further, in the embodiment of the present application, the recommendation degree of the to-be-selected search term set may be determined through pointwise mutual information (PMI), a co-occurrence measure. Specifically, the definition of PMI is explained by taking as an example a to-be-selected search term set that includes two target search terms, as shown in formula 1:
$\mathrm{PMI}(x; y) = \log \dfrac{p(x, y)}{p(x)\,p(y)}$ (formula 1)
in formula 1, x represents a first target search word in the set of search words, y represents a second target search word in the set of search words, p (x, y) represents the frequency with which two target search words are searched simultaneously, p (x) represents the frequency with which the first target search word is searched alone, and p (y) represents the frequency with which the second target search word is searched alone.
From the knowledge in probability, if x and y in formula 1 are not related, the relationship between p (x, y) and p (x), p (y) can be characterized by formula 2, specifically:
$p(x, y) = p(x)\,p(y)$ (formula 2)
The larger the correlation between x and y, the larger the ratio of p(x, y) to p(x)p(y).
As can be seen from the above embodiment, as the result in formula 1 is closer to 1, it indicates that the recommendation degree is higher, so that the search correlation between different target search terms can be determined based on formula 1.
For example, in this embodiment of the application, the to-be-selected search term set includes two target search terms, namely a search term A and a search term B. In the search records within the first set time period, the search term A and the search term B are searched together 5000 times, the search term A is searched 6000 times as an independent search term, and the search term B is searched 1000 times as an independent search term. Based on formula 1, the recommendation degree may then be determined to be
$\mathrm{PMI}(A; B) = \log \dfrac{5000}{6000 \times 1000}$
Further, in the embodiment of the application, when determining the frequencies of the search term A and the search term B, the total number of search terms searched within the first set time period may be taken into account, so that the search frequencies within the first set time period are represented more accurately.
In particular, the recommendation degree may be characterized as
$\mathrm{PMI}(A; B) = \log \dfrac{5000/N}{(6000/N) \times (1000/N)}$
where N indicates that N search terms have been searched within the first set time period.
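As a hedged sketch, the PMI-based recommendation degree may be computed from search counts as follows. The sketch assumes the standard PMI definition with a natural logarithm, and the value of N (one million searches) is a hypothetical figure chosen only to make the example runnable; the function name `pmi` is likewise an assumption.

```python
import math


def pmi(joint_count, count_x, count_y, total):
    """Pointwise mutual information computed from raw search counts.

    joint_count : times x and y were searched together
    count_x     : times x was searched
    count_y     : times y was searched
    total       : total number of searches in the period (N)
    """
    p_xy = joint_count / total
    p_x = count_x / total
    p_y = count_y / total
    return math.log(p_xy / (p_x * p_y))


# Counts from the example; N = 1,000,000 is assumed for illustration.
score = pmi(5000, 6000, 1000, total=1_000_000)
print(round(score, 3))
```

When the two search terms are unrelated, p(x, y) equals p(x)p(y) and the value is 0; the more often the terms are searched together relative to their individual frequencies, the larger the value.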
Of course, the above is only an exemplary method for determining the search correlation between different target search terms in the set of search terms to be selected, and there are other methods, which are not described herein again.
In the embodiment of the application, in addition to the search correlation among different target search terms in the to-be-selected search term set, the recommendation degree of the to-be-selected search term set can also be determined from the operations performed after each to-be-selected search term set is searched.
In this embodiment of the application, the operation performed after the search refers to an operation performed on the search information by the target user or another user after the target user or another user performs the search to obtain the search information, where the operation may be a click operation or a sharing operation, and the operation is not specifically limited herein.
In summary, in the embodiment of the present application, the number of operations performed on the search results obtained after the to-be-selected search term set was searched within the first set time period is obtained; the historical click rate of the to-be-selected search term set is determined according to this number of operations and the first historical frequency with which the to-be-selected search term set was searched; and after the historical click rate is determined, the recommendation degree of the to-be-selected search term set is determined in combination with the search correlation among different target search terms.
For example, in the embodiment of the present application, a specific method for determining the historical click rate of the to-be-selected search term set according to the operation times corresponding to the to-be-selected search term set and the first historical frequency of the to-be-selected search term set being searched may be shown in formula 3, and specifically is:
$$\mathrm{ctr}(Q) = \frac{\text{number of clicks}(Q)}{\text{number of searches}(Q)} \qquad \text{(Formula 3)}$$
Here, ctr(Q) may represent the historical click rate of each target search term in the search term set to be selected, or the historical click rate of all target search terms in the search term set to be selected taken together.
If ctr(Q) represents the historical click rate of all target search terms in the search term set to be selected, then the number of clicks (Q) in formula 3 is the total number of clicks for all target search terms in the set, i.e. the sum of the operation times, and the number of searches (Q) is the total number of searches for all target search terms in the set.
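A minimal sketch of formula 3 in its aggregated reading (total clicks over total searches for all target search terms in the set) could look as follows. The function name `historical_ctr` and the dictionary inputs are hypothetical, and returning 0.0 when nothing was searched is an assumption the text does not address.

```python
def historical_ctr(click_counts, search_counts):
    """Aggregate click rate: total clicks / total searches over all terms in the set.

    click_counts / search_counts: dicts mapping each target search term to its
    number of clicks and number of searches in the period (assumed format).
    """
    total_clicks = sum(click_counts.values())
    total_searches = sum(search_counts.values())
    if total_searches == 0:
        return 0.0  # assumed convention when the set was never searched
    return total_clicks / total_searches
```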
After the historical click rate of the search term set to be selected is determined, the recommendation degree of the search term set to be selected can be determined by combining the search correlation among different target search terms in the search term set to be selected.
Specifically, in the embodiment of the present application, ctr(Q) obtained from formula 3 and the PMI obtained from formula 1 may be combined to determine the recommendation degree of the search term set to be selected.
Optionally, the sum of ctr(Q) and the PMI in formula 1 may be used as the recommendation degree of the search term set to be selected.
Or the product of ctr(Q) and the PMI in formula 1 may be used as the recommendation degree of the search term set to be selected.
Or the average of ctr(Q) and the PMI in formula 1 may be used as the recommendation degree of the search term set to be selected.
Of course, there are other methods for determining the recommendation degree of the search term set to be selected by combining ctr(Q) and the PMI in formula 1, which are not described herein.
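The three combination options listed above (sum, product, average) can be sketched as a single helper. The function name `recommendation_degree` and the `mode` parameter are hypothetical; the text does not state which option is preferred.

```python
def recommendation_degree(ctr, pmi, mode="sum"):
    """Combine the historical click rate and the PMI-style score."""
    if mode == "sum":
        return ctr + pmi
    if mode == "product":
        return ctr * pmi
    if mode == "mean":
        return (ctr + pmi) / 2
    raise ValueError(f"unknown combination mode: {mode}")
```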
In the embodiment of the application, after the recommendation degree of the search word set to be selected is determined, whether each search word set is a target search information set can be determined based on its recommendation degree; in an alternative embodiment, if the recommendation degree of a search word set matches the recommendation degree threshold, that search word set may be regarded as a target search information set.
In this embodiment of the application, the recommendation degree of a search term set matching the recommendation degree threshold means that the recommendation degree is greater than the recommendation degree threshold, or that the difference between the recommendation degree and the recommendation degree threshold is smaller than a set threshold, which is not limited herein.
In the embodiment of the application, all target search information sets may be displayed; alternatively, a display quantity threshold may be set, the target search information sets may be sorted by recommendation degree from high to low, and the number of target search information sets matching the display quantity threshold may be displayed to the target user.
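The selection and display step can be sketched as a filter followed by a sort. This sketch assumes the "greater than the threshold" reading of matching; the function name `select_sets_to_display` and the `(set, degree)` tuple format are hypothetical.

```python
def select_sets_to_display(scored_sets, degree_threshold, display_limit=None):
    """Keep sets whose recommendation degree exceeds the threshold, best first.

    scored_sets: list of (term_set, recommendation_degree) tuples (assumed format).
    display_limit: optional display quantity threshold; None shows every match.
    """
    matched = [item for item in scored_sets if item[1] > degree_threshold]
    matched.sort(key=lambda item: item[1], reverse=True)
    if display_limit is None:
        return matched
    return matched[:display_limit]
```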
Step S402, in response to a search operation triggered by the target user for any target search information set, searching based on the target search words in that target search information set.
In the embodiment of the application, after the target user selects a target search information set, the target user triggers a search operation for that target search information set, and a search is then performed based on each target search word in the target search information set.
In this embodiment of the application, the search based on the target search words in the target search information set may be performed by a terminal, or may be performed by a server in communication with the terminal, which is not described herein again.
In the embodiment of the application, at least one target search information set is displayed to a user by responding to the operation of selecting target search information triggered by a target user aiming at a target text, and each target search information set comprises at least two different target search words; that is to say, when a target user needs to search for a part of target search terms in a target text, what is displayed to the user is not any target search term in the target text, but a plurality of target search information sets, and the target user can select any target search information set for searching.
Further, in the embodiment of the present application, when a target user determines a target search information set that needs to be searched, a search operation for the target search information set is responded, and a search is performed based on a target search word in the target search information set.
That is to say, in the embodiment of the present application, when a target user needs to search for part of information in a target text, multiple target search information sets can be provided to the target user, and each target search information set includes at least two different target search terms, that is, a requirement of the target user for common search of the at least two target search terms in the target text can be met, so that a requirement of the target user can be met through one search, search efficiency is improved, convenience is provided for the target user, and user experience of the target user is improved.
Based on the same technical concept, an embodiment of the present application provides a target search information determining apparatus 500, as shown in fig. 5, including:
a target search information set display unit 501, configured to display at least one target search information set in response to a target search information selection operation triggered by a target user for a target text, where the target search information set includes at least two different target search terms, each target search term is partial text information in the target text, and positions of any two target search terms in each target search information set in the target text are discontinuous;
the searching unit 502 is configured to perform a search based on the target search words in a target search information set in response to a search operation triggered by the target user for any target search information set.
Optionally, the apparatus 500 further includes a target search information set determining unit 503, where the target search information set determining unit 503 is configured to:
obtaining a search word set to be selected, wherein the search word set to be selected comprises at least two target search words, and the position of each target search word in a target text is discontinuous;
determining the recommendation degree of the search term set to be selected based on the first historical frequency of the search term set to be selected in the first set time period and the second historical frequency of each target search term to be searched;
and if the recommendation degree is determined to be matched with the recommendation degree threshold value, the search word set to be selected is used as the target search information set.
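As a hedged sketch of how candidate sets with discontinuous positions might be enumerated, the following builds two-word candidate sets from target search words tagged with their token index in the target text. Reading "discontinuous" as "not adjacent tokens" is an assumption, as are the function name `candidate_pairs`, the `min_gap` parameter, and the restriction to pairs.

```python
from itertools import combinations


def candidate_pairs(indexed_words, min_gap=2):
    """Return pairs of different words whose token positions are not adjacent.

    indexed_words: list of (word, token_index) tuples (assumed format).
    min_gap: minimum index distance for two words to count as discontinuous.
    """
    pairs = []
    for (word_a, pos_a), (word_b, pos_b) in combinations(indexed_words, 2):
        if word_a != word_b and abs(pos_a - pos_b) >= min_gap:
            pairs.append((word_a, word_b))
    return pairs
```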
Optionally, the target search information set determining unit 503 is further configured to:
removing stop words in the target text to obtain a target text to be segmented;
performing word segmentation processing on a target text to be segmented to obtain a word segmentation result;
taking each noun in the word segmentation result as a search word to be selected;
determining the historical search frequency of each search word to be selected in the word segmentation result, which is searched in a second set time period, and taking each search word to be selected, of which the historical search frequency is greater than a search frequency threshold value, as a target search word.
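The four steps above can be sketched as follows. For simplicity, the sketch segments the text first and then drops stop words, which reverses the order of the first two steps; whitespace splitting stands in for a real word segmenter. The function name `extract_target_search_words`, the `is_noun` callback, and the `search_counts` dictionary are hypothetical stand-ins for a part-of-speech tagger and a search log.

```python
def extract_target_search_words(text, stop_words, is_noun, search_counts,
                                frequency_threshold, segment=str.split):
    """Return nouns from the text whose historical search frequency exceeds the threshold.

    stop_words: set of lower-cased stop words (assumed format).
    is_noun: callable deciding whether a token is a noun (placeholder for a tagger).
    search_counts: dict of searches per word in the second set time period.
    """
    # Segment, then discard stop words (simplified ordering, see lead-in).
    tokens = [tok for tok in segment(text) if tok.lower() not in stop_words]
    # Every remaining noun is a search word to be selected.
    candidates = [tok for tok in tokens if is_noun(tok)]
    targets, seen = [], set()
    for word in candidates:
        if word not in seen and search_counts.get(word, 0) > frequency_threshold:
            seen.add(word)
            targets.append(word)
    return targets
```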
Optionally, the target search information set determining unit 503 is further configured to:
and determining that the similarity between the semantics of the search word set to be selected and the semantics of the target text is greater than a semantic similarity threshold.
Optionally, the target search information set determining unit 503 is specifically configured to:
converting each target search word in the search word set to be selected into a corresponding N-dimensional word vector, and adding the i-th dimension of the word vectors of all target search words to obtain the i-th dimension of a first semantic vector of the search word set to be selected, wherein N is greater than or equal to 1 and 1 ≤ i ≤ N;
determining an N-dimensional word vector corresponding to each piece of text information in the target text, and adding the i-th dimension of the word vectors of all pieces of text information to obtain the i-th dimension of a second semantic vector of the target text;
and determining the similarity between the semantics of the to-be-selected search word set and the target text semantics according to the first semantic vector and the second semantic vector.
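A minimal sketch of the semantic comparison: word vectors are summed dimension by dimension to form each semantic vector, and the two semantic vectors are then compared. The text does not name the similarity measure, so cosine similarity is an assumption here; the function names `sum_vectors` and `cosine_similarity` are hypothetical.

```python
from math import sqrt


def sum_vectors(vectors):
    """Add N-dimensional word vectors dimension by dimension."""
    return [sum(dims) for dims in zip(*vectors)]


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0
```

The first semantic vector would be `sum_vectors` over the word vectors of the candidate set, the second over the word vectors of the target text, and the candidate set would be kept when the similarity exceeds the semantic similarity threshold.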
Optionally, the target search information set determining unit 503 is specifically configured to:
obtaining a search log for the first set time period, determining the first historical frequency with which all target search words in the search term set to be selected are searched together in the search log, and determining the second historical frequency with which each target search word is searched in the search log;
and determining a first target value according to the product of the first historical frequency and the number of the target search words in the search log, determining a second target value according to the product of the second historical frequencies of the target search words in the set, and determining the recommendation degree of the search word set to be selected according to the quotient of the first target value and the second target value.
Optionally, the target search information set determining unit 503 is further configured to:
obtaining the operation times of search results obtained after the search word set to be selected is searched within the first set time period;
determining the historical click rate of the search word set to be selected according to the operation times corresponding to the search word set to be selected and the first historical frequency of the search word set to be selected;
and determining the recommendation degree of the search term set to be selected according to the quotient of the first target value and the second target value and the historical click rate of the search term set to be selected.
Based on the same technical concept, the embodiment of the present application provides a computer device, as shown in fig. 6, including at least one processor 601 and a memory 602 connected to the at least one processor, where a specific connection medium between the processor 601 and the memory 602 is not limited in the embodiment of the present application, and the processor 601 and the memory 602 are connected through a bus in fig. 6 as an example. The bus may be divided into an address bus, a data bus, a control bus, etc.
In the embodiment of the present application, the memory 602 stores instructions executable by the at least one processor 601, and the at least one processor 601 may execute the steps included in the foregoing target search information determining method by executing the instructions stored in the memory 602.
The processor 601 is a control center of the computer device, and may connect various parts of the computer device by using various interfaces and lines, and create a virtual machine by running or executing instructions stored in the memory 602 and calling data stored in the memory 602. Optionally, the processor 601 may include one or more processing units, and the processor 601 may integrate an application processor and a modem processor, wherein the application processor mainly handles an operating system, a user interface, an application program, and the like, and the modem processor mainly handles wireless communication. It will be appreciated that the modem processor described above may not be integrated into the processor 601. In some embodiments, the processor 601 and the memory 602 may be implemented on the same chip, or in some embodiments, they may be implemented separately on separate chips.
The processor 601 may be a general-purpose processor, such as a Central Processing Unit (CPU), a digital signal processor, an Application Specific Integrated Circuit (ASIC), a field programmable gate array or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or any combination thereof, configured to implement or perform the methods, steps, and logic blocks disclosed in the embodiments of the present Application. A general purpose processor may be a microprocessor or any conventional processor or the like. The steps of a method disclosed in connection with the embodiments of the present application may be directly implemented by a hardware processor, or may be implemented by a combination of hardware and software modules in a processor.
The memory 602, which is a non-volatile computer-readable storage medium, may be used to store non-volatile software programs, non-volatile computer-executable programs, and modules. The memory 602 may include at least one type of storage medium, and may include, for example, a flash memory, a hard disk, a multimedia card, a card-type memory, a Random Access Memory (RAM), a Static Random Access Memory (SRAM), a Programmable Read Only Memory (PROM), a Read Only Memory (ROM), an Electrically Erasable Programmable Read Only Memory (EEPROM), a magnetic memory, a magnetic disk, an optical disk, and so on. The memory 602 may also be, without limitation, any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer. The memory 602 in the embodiments of the present application may also be circuitry or any other device capable of performing a storage function for storing program instructions and/or data.
Based on the same inventive concept, embodiments of the present application provide a computer-readable storage medium storing a computer program executable by a computer device, which, when the program is run on the computer device, causes the computer device to perform the steps of the target search information determination method.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process, such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present application without departing from the spirit and scope of the application. Thus, if such modifications and variations of the present application fall within the scope of the claims of the present application and their equivalents, the present application is intended to include such modifications and variations as well.

Claims (10)

1. A method for determining target search information, the method comprising:
responding to an operation of selecting target search information triggered by a target user aiming at a target text, and displaying at least one target search information set, wherein the target search information set comprises at least two different target search words, each target search word is partial text information in the target text, and the positions of any two target search words in each target search information set in the target text are discontinuous;
and responding to a search operation triggered by the target user aiming at any target search information set, and searching based on the target search words in the target search set.
2. The method of claim 1, wherein before responding to the operation of selecting target search information triggered by the target user for the target text, the method further comprises:
obtaining a search word set to be selected, wherein the search word set to be selected comprises at least two target search words, and the position of each target search word in the target text is discontinuous;
determining the recommendation degree of the search word set to be selected based on the first historical frequency of searching the search word set to be selected in a first set time period and the second historical frequency of searching each target search word;
and if the recommendation degree is determined to match the recommendation degree threshold, taking the search word set to be selected as the target search information set.
3. The method of claim 2, wherein before obtaining the set of search terms to be selected, the method further comprises:
removing stop words in the target text to obtain a target text to be segmented;
performing word segmentation processing on the target text to be word segmented to obtain a word segmentation result;
taking each noun in the word segmentation result as the search word to be selected;
determining the historical search frequency of each search word to be selected in the word segmentation result, which is searched in a second set time period, and taking each search word to be selected, of which the historical search frequency is greater than a search frequency threshold value, as the target search word.
4. The method of claim 3, wherein after obtaining the set of search terms to be selected, further comprising:
and determining that the similarity between the semantics of the search word set to be selected and the semantics of the target text is greater than a semantic similarity threshold value.
5. The method of claim 4, wherein the determining the similarity between the semantics of the set of search terms to be selected and the target text semantics comprises:
converting each target search word in the search word set to be selected into a corresponding N-dimensional word vector, and adding the ith-dimensional word vector of each target search word to obtain a first semantic vector of the search word set to be selected, wherein N is greater than or equal to 1, and i belongs to N;
determining an N-dimensional word vector corresponding to each text message in the target text, and adding the ith-dimensional word vector of each text message to obtain a second semantic vector of the target text;
and determining the similarity between the semantics of the search word set to be selected and the target text semantics according to the first semantic vector and the second semantic vector.
6. The method according to claim 2, wherein the determining the recommendation degree of the search term set to be selected based on a first historical frequency of the search term set to be selected being searched within a first set time period and a second historical frequency of each of the target search terms being searched comprises:
obtaining a search log in the first set time period, determining the first historical frequency of searching all the target search words in the search log in the set of words to be searched at the same time, and determining the second historical frequency of searching each target search word in the search log;
determining a first target value according to the product of the first search times and the number of the target search words in the search log, determining a second target value according to the product of the second historical frequency of searching each target search word in the set, and determining the recommendation degree of the search word set to be selected according to the quotient of the first target value and the second target value.
7. The method of claim 6, wherein determining a first target value according to a product of the first number of searches and a number of target search terms in the search log, and determining a second target value according to a product of the second historical frequency with which each of the target search terms in the set is searched further comprises:
obtaining the operation times of search results obtained after the search word set to be selected is searched within the first set time period;
determining the historical click rate of the search word set to be selected according to the operation times corresponding to the search word set to be selected and the first historical frequency of the search word set to be selected;
and determining the recommendation degree of the search term set to be selected according to the quotient of the first target value and the second target value and the historical click rate of the search term set to be selected.
8. A target search information determination apparatus, characterized by comprising:
the target search information set display unit is used for responding to a target search information selection operation triggered by a target user aiming at a target text and displaying at least one target search information set, wherein the target search information set comprises at least two different target search words, each target search word is partial text information in the target text, and the positions of any two target search words in each target search information set in the target text are discontinuous;
and the searching unit is used for responding to a searching operation triggered by the target user aiming at any target searching information set and searching based on the target searching words in the target searching set.
9. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the steps of the method of any one of claims 1 to 7 are performed by the processor when the program is executed.
10. A computer-readable storage medium, in which a computer program is stored which is executable by a computer device, and which, when run on the computer device, causes the computer device to carry out the steps of the method as claimed in any one of claims 1 to 7.
CN202011334168.8A 2020-11-25 2020-11-25 Target search information determination method and device Pending CN112347365A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011334168.8A CN112347365A (en) 2020-11-25 2020-11-25 Target search information determination method and device

Publications (1)

Publication Number Publication Date
CN112347365A true CN112347365A (en) 2021-02-09

Family

ID=74364751

Country Status (1)

Country Link
CN (1) CN112347365A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113360769A (en) * 2021-06-28 2021-09-07 北京百度网讯科技有限公司 Information query method and device, electronic equipment and storage medium
CN113360769B (en) * 2021-06-28 2024-02-09 北京百度网讯科技有限公司 Information query method, device, electronic equipment and storage medium
CN114722179A (en) * 2022-04-26 2022-07-08 国信专达(杭州)科技有限公司 Retrieval analysis and data fusion method based on information tracing
CN114722179B (en) * 2022-04-26 2023-07-04 国信专达(杭州)科技有限公司 Retrieval analysis and data fusion method based on information tracing

Legal Events

Date Code Title Description
PB01 Publication
REG Reference to a national code (Ref country code: HK; Ref legal event code: DE; Ref document number: 40038749; Country of ref document: HK)

SE01 Entry into force of request for substantive examination