CN105159898B - Search method and apparatus
- Publication number: CN105159898B
- Application number: CN201410262143.XA
Classifications
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Landscapes
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
An embodiment of the present invention provides a search method and apparatus. The method includes: upon receiving an original query string submitted by a first user, searching with the original query string to obtain matching network information; determining, according to the network information, whether the original query string is a query string with multiple query intentions; if so, rewriting the original query string, according to each query intention, into multiple first query strings, each having one of the query intentions; for each first query string, finding second users who have the same or a similar query intention as that first query string, where each second user has community information; and combining the network information and the community information of the second users into a search result. The embodiment spares the first user from repeatedly and tediously filtering massive amounts of network information by hand, reduces the time and effort the first user spends, and greatly improves the efficiency, quality, and capacity of information acquisition.
Description
Technical field
The present invention relates to the technical field of search, and in particular to a search method and a search apparatus.
Background
With the rapid development of networks, the amount of network information has increased sharply. To find the network information they need among this mass of information, users typically search with a search engine.
A search engine is a system that automatically collects information from the Internet, organizes it, and makes it available for users to query. Network information is vast, varied, and unordered: individual pieces of network information are like islands scattered across a vast sea, and hyperlinks are the bridges crisscrossing between those islands. A search engine draws a clear information map that users can consult at any time.
However, the gap between the growth rate of network information and people's ability to obtain the information they need is increasingly prominent. The excess of network information forces users to filter results manually when searching, which costs considerable time and effort and makes searching network information inefficient.
Summary of the invention
The technical problem to be solved by embodiments of the present invention is to provide a search method that improves the search efficiency of network information.
Correspondingly, embodiments of the present invention also provide a search apparatus to ensure the implementation and application of the above method.
To solve the above problem, an embodiment of the present invention discloses a search method, comprising:
upon receiving an original query string submitted by a first user, searching with the original query string to obtain matching network information;
determining, according to the network information, whether the original query string is a query string with multiple query intentions; if so, rewriting the original query string, according to each query intention, into multiple first query strings, each having one of the query intentions;
for each first query string, finding second users who have the same or a similar query intention as that first query string, where each second user has community information;
combining the network information and the community information of the second users into a search result.
Preferably, the step of determining whether the original query string is a query string with multiple query intentions includes:
obtaining first feature network information matching the original query string, where the first feature network information includes the top N network information items by ranking and/or the top M network information items by historical click count;
obtaining second feature network information matching other query strings, where the second feature network information includes the top A network information items by ranking and/or the top B network information items by historical click count;
determining whether the first feature network information includes at least two pieces of second feature network information; if so, determining that the original query string is a query string with multiple query intentions; where M, N, A, and B are positive integers.
Preferably, the step of determining whether the original query string is a query string with multiple query intentions includes:
looking up the entity classes corresponding to the original query string in a preset knowledge base;
when there are two or more entity classes, determining that the original query string is a query string with multiple query intentions.
Preferably, the step of determining whether the original query string is a query string with multiple query intentions includes:
looking up the feature words associated with the original query string in a preset knowledge base;
determining whether the number of occurrences of each feature word in webpages across the whole web exceeds a preset quantity threshold; if so, classifying the feature words using the entity classes of the knowledge base;
when at least two classes are obtained, determining that the original query string is a query string with multiple query intentions.
Preferably, the step of finding second users who have the same or a similar query intention as the first query string includes:
obtaining the first query intention information corresponding to each first query string of the first user, and the second query intention information of the second user;
calculating the similarity between each piece of first query intention information and the second query intention information;
when the similarity is greater than a preset similarity threshold, determining that the first query string and the second user have the same or a similar query intention.
Preferably, the first query intention information includes a first feature vector, and the first feature vector is determined according to the first query string;
the second query intention information includes a second feature vector, and the second feature vector is determined according to a second query string;
where the second query string is a query string previously submitted by the second user.
Preferably, the first feature vector includes at least one of the following:
a feature vector of the first query string, a feature vector associated with the segmented words of the first query string, and a feature vector associated with the network information matching the first query string;
the second feature vector includes at least one of the following:
a feature vector of the second query string, a feature vector associated with the segmented words of the second query string, and a feature vector associated with the network information matching the second query string.
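The similarity test between query intention information can be illustrated with a minimal Python sketch. The source does not name a similarity measure, so cosine similarity over sparse feature vectors is an assumption here, and the function names `cosine_similarity` and `users_with_similar_intent` are hypothetical.

```python
import math

def cosine_similarity(v1, v2):
    """Cosine similarity of two sparse vectors given as {feature: weight} dicts.

    Cosine similarity is an assumed measure; the source only requires
    "a similarity" compared against a preset threshold.
    """
    dot = sum(weight * v2.get(feature, 0.0) for feature, weight in v1.items())
    norm1 = math.sqrt(sum(w * w for w in v1.values()))
    norm2 = math.sqrt(sum(w * w for w in v2.values()))
    if norm1 == 0.0 or norm2 == 0.0:
        return 0.0
    return dot / (norm1 * norm2)

def users_with_similar_intent(first_vector, second_vectors, threshold):
    """Return the users whose second feature vector is similar enough.

    first_vector: feature vector of one first query string.
    second_vectors: {user: feature vector of that user's earlier query string}.
    threshold: the preset similarity threshold.
    """
    return sorted(
        user for user, vector in second_vectors.items()
        if cosine_similarity(first_vector, vector) > threshold
    )
```

In this sketch each first query string would be compared separately against the candidate users, so a user can match one query intention without matching the others.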
Preferably, the step of combining the network information and the community information of the second users into a search result includes:
calculating the association closeness between the first user and each second user under each query intention;
ranking the community information of the second users according to the association closeness;
combining the network information and the ranked community information of the second users into a search result.
Preferably, the step of calculating the association closeness between the first user and the second user under each query intention includes:
assigning corresponding weights to the similarity between the first query intention information and the second query intention information under each query intention, and/or the association information between the first user and the second user, and/or the second user's historical operation information for the second query intention;
summing the weighted similarity between the first query intention information and the second query intention information, and/or the weighted association information between the first user and the second user, and/or the weighted historical operation information of the second user for the second query intention, to obtain the association closeness between the first user and the second user under each query intention.
Preferably, the association information between the first user and the second user includes at least one of the following:
the average number of contacts within a preset time period, the average contact duration within a preset time period, the number of common friends, and the place of residence;
the second user's historical operation information for the second query intention includes at least one of the following:
the number of searches corresponding to the second query intention, the browsing duration of the network information corresponding to the second query intention, and the number of consecutive days of searching corresponding to the second query intention.
Preferably, the first user and the second user have a community friend relationship.
An embodiment of the present invention also discloses a search apparatus, comprising:
a network information search module, configured to, upon receiving an original query string submitted by a first user, search with the original query string to obtain matching network information;
a multi-intent judgment module, configured to determine, according to the network information, whether the original query string is a query string with multiple query intentions, and if so, to invoke a query string rewriting module;
the query string rewriting module, configured to rewrite the original query string, according to each query intention, into multiple first query strings, each having one of the query intentions;
a user search module, configured to find, for each first query string, second users who have the same or a similar query intention as that first query string, where each second user has community information;
a search result synthesis module, configured to combine the network information and the community information of the second users into a search result.
Preferably, the multi-intent judgment module includes:
a first feature network information acquisition submodule, configured to obtain first feature network information matching the original query string, where the first feature network information includes the top N network information items by ranking and/or the top M network information items by historical click count;
a second feature network information acquisition submodule, configured to obtain second feature network information matching other query strings, where the second feature network information includes the top A network information items by ranking and/or the top B network information items by historical click count;
a feature network information judging submodule, configured to determine whether the first feature network information includes at least two pieces of second feature network information, and if so, to invoke a first decision submodule;
the first decision submodule, configured to determine that the original query string is a query string with multiple query intentions; where M, N, A, and B are positive integers.
Preferably, the multi-intent judgment module includes:
an entity class lookup submodule, configured to look up the entity classes corresponding to the original query string in a preset knowledge base;
a second decision submodule, configured to determine, when there are two or more entity classes, that the original query string is a query string with multiple query intentions.
Preferably, the multi-intent judgment module includes:
a feature word lookup submodule, configured to look up the feature words associated with the original query string in a preset knowledge base;
a quantity judging submodule, configured to determine whether the number of occurrences of each feature word in webpages across the whole web exceeds a preset quantity threshold, and if so, to invoke a classification submodule;
the classification submodule, configured to classify the feature words using the entity classes of the knowledge base;
a third decision submodule, configured to determine, when at least two classes are obtained, that the original query string is a query string with multiple query intentions.
Preferably, the user search module includes:
a query intention information acquisition submodule, configured to obtain the first query intention information corresponding to each first query string of the first user, and the second query intention information of the second user;
a query intention information similarity calculation submodule, configured to calculate the similarity between each piece of first query intention information and the second query intention information;
a judging submodule, configured to determine, when the similarity is greater than a preset similarity threshold, that the first query string and the second user have the same or a similar query intention.
Compared with the prior art, embodiments of the present invention have the following advantages:
In embodiments of the present invention, a search is performed with the original query string submitted by the first user to obtain matching network information. When the original query string is determined to be a query string with multiple query intentions, it is rewritten into multiple first query strings, each having one of those query intentions; second users who have the same or a similar query intention as the first user are found; and the network information and the second users' community information are combined into a search result. Thus, when the first user's need is unclear, the user's community friends are screened by topic category, according to each category of need, by analyzing search logs, which yields the second users most relevant to each topic category. The first user's search need is thereby subdivided, so that contacts with needs similar to the current user's can be recommended even when the search need is not explicit. This spares the first user from repeatedly and tediously filtering massive amounts of network information by hand, reduces the time and effort the first user spends, reduces system resource consumption on the user device and the website, reduces network bandwidth usage, and greatly improves the efficiency, quality, and capacity of information acquisition.
Brief description of the drawings
Fig. 1 is a flowchart of the steps of an embodiment of a search method of the present invention;
Fig. 2 is an example display of community information of the present invention;
Fig. 3 is a structural block diagram of an embodiment of a search apparatus of the present invention.
Detailed description
To make the above objects, features, and advantages of the present invention clearer and easier to understand, the present invention is described in further detail below with reference to the accompanying drawings and specific embodiments.
Referring to Fig. 1, a flowchart of the steps of an embodiment of a search method of the present invention is shown, which may specifically include the following steps:
Step 101: upon receiving an original query string submitted by a first user, search with the original query string to obtain matching network information.
In embodiments of the present invention, the first user may log in to a first client and submit the original query string through it, requesting a search for network information that matches the original query string.
Upon receiving the original query string submitted by the first user, network information can be quickly looked up in an index database according to the original query string, and the relevance between the network information and the query can be evaluated so that the results to be output are ranked.
Taking a search engine as an example, the search flow is divided into two parts: the front-end user request process and the back-end data preparation process.
I. Front-end user request process:
1. Receive request: receive the query string that the user enters in the search engine;
2. Query analysis: perform word segmentation on the query string;
3. Retrieval: according to the segmentation result, look up candidate network information related to the segmentation result in a pre-built inverted index;
4. Ranking: rank the candidate network information along dimensions such as content relevance and timeliness;
5. Display: present the ranked webpages on the search engine results page.
II. Back-end data preparation process:
1. Webpage crawling: use crawler technology to fetch network information from the Internet by following the links between webpages, and save it.
2. Index building: analyze the saved network information, for example by performing word segmentation on webpage titles and body text, and build an inverted index from the segmentation results for use by the front-end user request process.
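The index-building and retrieval steps above can be sketched in a few lines of Python. This is a simplified illustration, not the search engine's actual implementation: whitespace splitting stands in for real word segmentation, ranking counts matched query terms instead of scoring relevance and timeliness, and the names `tokenize`, `build_inverted_index`, and `search` are hypothetical.

```python
from collections import defaultdict

def tokenize(text):
    """Stand-in for word segmentation: lowercase and split on whitespace."""
    return text.lower().split()

def build_inverted_index(pages):
    """Back end: map each term to the set of URLs whose text contains it.

    pages: {url: page text}, as saved by a crawler.
    """
    index = defaultdict(set)
    for url, text in pages.items():
        for term in tokenize(text):
            index[term].add(url)
    return index

def search(index, query):
    """Front end: segment the query, retrieve candidates, and rank them.

    Ranking here is simply the number of distinct query terms a page
    matches, with the URL as a tie-breaker for deterministic output.
    """
    scores = defaultdict(int)
    for term in set(tokenize(query)):
        for url in index.get(term, ()):
            scores[url] += 1
    return sorted(scores, key=lambda url: (-scores[url], url))
```

Building the index once in the back end lets each front-end request avoid scanning page text, which is the point of the two-part flow.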
Step 102: determine, according to the network information, whether the original query string is a query string with multiple query intentions; if so, perform step 103.
Each search request issued by a user may imply a potential query intention. When the original query string is associated with multiple query intentions, the user's need is unclear.
For example, if the original query string submitted by the user is "Demi-Gods and Semi-Devils", there may be three potential needs: the film "Demi-Gods and Semi-Devils", the TV series "Demi-Gods and Semi-Devils", and the game "Demi-Gods and Semi-Devils". The original query string "Demi-Gods and Semi-Devils" can be rewritten on this basis.
In a preferred embodiment of the present invention, step 102 may include the following sub-steps:
Sub-step S11: obtain first feature network information matching the original query string; the first feature network information may include the top N network information items by ranking and/or the top M network information items by historical click count.
Sub-step S12: obtain second feature network information matching other query strings; the second feature network information may include the top A network information items by ranking and/or the top B network information items by historical click count.
Sub-step S13: determine whether the first feature network information includes at least two pieces of second feature network information; if so, perform sub-step S14.
Sub-step S14: determine that the original query string is a query string with multiple query intentions.
It should be noted that M, N, A, and B may all be positive integers.
In a specific implementation, whether a query string is a multi-intent query string is determined by analyzing the search results of the query string (that is, the network information matching the query string) and users' search logs.
Further, search log statistics can be used to obtain, for every query string, the top N network information items in its search results, for example the top 10 URLs (Uniform Resource Locators), and the M most-clicked network information items, for example the 10 most-clicked URLs. If the top N ranked network information items and the M most-clicked network information items of query string a include the top A ranked network information items and the B most-clicked network information items of query string b and of query string c, query string a can be considered a multi-intent query string with two categories of user need: one related to query string b and the other related to query string c.
For example, the top-ranked network information and/or the most-clicked network information shown in Table 1 is obtained from search log statistics.
Table 1: List of top-ranked network information and/or most-clicked network information
The following can be derived from Table 1:
the top-ranked network information for "Demi-Gods and Semi-Devils" includes the top-ranked network information for "Dragon Oath game" and for "Demi-Gods and Semi-Devils TV series";
the most-clicked network information for "Demi-Gods and Semi-Devils" includes the most-clicked network information for "Dragon Oath game" and for "Demi-Gods and Semi-Devils TV series".
It follows that, when a user searches, the original query string "Demi-Gods and Semi-Devils" can carry two query needs: "Dragon Oath game" and "Demi-Gods and Semi-Devils TV series".
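Sub-steps S11 to S14 can be sketched in Python as a containment check over sets of URLs. Treating "includes" as full containment of another query string's feature URLs is an assumption; an implementation might instead accept a partial overlap. The function names and the placeholder URLs used with them are hypothetical.

```python
def feature_urls(top_ranked, most_clicked):
    """Feature network information for one query string: the union of its
    top-ranked URLs and its most-clicked URLs."""
    return set(top_ranked) | set(most_clicked)

def covered_queries(original, features):
    """Other query strings whose feature URLs all appear among the original's.

    features: {query string: set of feature URLs}.
    Full containment is an assumed reading of "includes".
    """
    base = features[original]
    return sorted(
        query for query, urls in features.items()
        if query != original and urls and urls <= base
    )

def is_multi_intent(original, features):
    """S13/S14: multi-intent if at least two other query strings are covered."""
    return len(covered_queries(original, features)) >= 2
```

The covered query strings returned by `covered_queries` also serve as candidate rewrites, since each one names a narrower need behind the original query string.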
In embodiments of the present invention, the query needs of each query string can be computed in advance and used to build a multi-intent dictionary such as the one shown in Table 2. Upon receiving an original query string submitted by a user, the original query string is looked up in the multi-intent dictionary; if it is found, the original query string is determined to be a query string with multiple query intentions.
Table 2: List of multi-intent query strings
In a preferred embodiment of the present invention, step 102 may include the following sub-steps:
Sub-step S21: look up the entity classes corresponding to the original query string in a preset knowledge base.
Sub-step S22: when there are two or more entity classes, determine that the original query string is a query string with multiple query intentions.
In embodiments of the present invention, the entity classes corresponding to the original query string can be obtained by looking them up in the knowledge base, and the user's query needs can be classified accordingly.
For example, a user searches for "Naruto", and the knowledge base lookup shows that "Naruto" has two entity classes: manga and anime. According to the classes in the knowledge base, the user's need can be divided into two categories, "manga Naruto" and "anime Naruto", and the original query string "Naruto" can be rewritten on this basis.
It should be noted that a knowledge base, in knowledge engineering, is a structured, easy-to-operate, easy-to-use, and comprehensively organized cluster of knowledge. It is a set of interrelated knowledge pieces that are stored, organized, managed, and used in computer memory with one or more knowledge representation schemes, to meet the needs of solving problems in one or more fields. These knowledge pieces include theoretical knowledge and factual data related to the field and heuristic knowledge derived from expert experience, such as the relevant definitions, theorems, and algorithms of a field, as well as common-sense knowledge.
An entity may correspond to a specific individual; for example, in the celebrity category it may be Liu Dehua, Zhang Baizhi, or Lin Qingxia. An entity may also be a broader individual that represents a category, such as person, film star, or singer.
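The knowledge-base lookup of sub-steps S21 and S22, followed by a rewrite per entity class, can be sketched in Python as below. The in-memory dictionary `SAMPLE_KNOWLEDGE_BASE`, its entries, and the "class plus query string" rewrite format are illustrative assumptions; a real knowledge base would be far richer.

```python
# Hypothetical, hand-written sample data standing in for a knowledge base.
SAMPLE_KNOWLEDGE_BASE = {
    "Naruto": ["manga", "anime"],
    "Liu Dehua": ["celebrity"],
}

def entity_classes(query, knowledge_base):
    """S21: look up the entity classes recorded for the query string."""
    return knowledge_base.get(query, [])

def rewrite_by_entity_class(query, knowledge_base):
    """S22 plus rewriting: one first query string per entity class.

    A query string with fewer than two entity classes is not treated as
    multi-intent and is returned unchanged.
    """
    classes = entity_classes(query, knowledge_base)
    if len(classes) < 2:
        return [query]
    return [f"{entity_class} {query}" for entity_class in classes]
```

For instance, `rewrite_by_entity_class("Naruto", SAMPLE_KNOWLEDGE_BASE)` yields one rewritten query string for the manga class and one for the anime class.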
In a preferred embodiment of the present invention, step 102 may include the following sub-steps:
Sub-step S31: look up the feature words associated with the original query string in a preset knowledge base.
Sub-step S32: determine whether the number of occurrences of each feature word in webpages across the whole web exceeds a preset quantity threshold; if so, perform sub-step S33.
Sub-step S33: classify the feature words using the entity classes of the knowledge base.
Sub-step S34: when at least two classes are obtained, determine that the original query string is a query string with multiple query intentions.
In embodiments of the present invention, the user's need can be determined by combining the knowledge base with webpages on the Internet.
In a specific implementation, several feature words can first be extracted from the entity content of the knowledge base; the degree of association between each feature word and the entity, and the strength of the corresponding need, are then obtained by analyzing Internet webpages; finally, a quantity threshold is chosen to determine the final classification of the original query string.
For example, if the original query string submitted by the user is "Tsinghua University", feature words such as "undergraduate education", "graduate education", "Second Gate", "Moonlight over the Lotus Pond", "First Dining Hall", and "Jinchunyuan Dining Hall" can be extracted from the entity content for "Tsinghua University" in the knowledge base. Internet webpages are then analyzed to count the number of webpages in which pairs such as [Tsinghua University, undergraduate education], [Tsinghua University, graduate education], and [Tsinghua University, Second Gate] appear together. The more webpages in which a pair appears together, the more closely that feature word is associated with "Tsinghua University". Feature words whose counts exceed the preset quantity threshold can be treated as potential needs of the user.
The feature words are then classified according to the classification system of the knowledge base.
For example, "undergraduate education" and "graduate education" belong to the admissions class; "Second Gate" and "Moonlight over the Lotus Pond" belong to the scenic spot class; "First Dining Hall" and "Jinchunyuan Dining Hall" belong to the dining class.
Finally, the needs related to "Tsinghua University" are divided into three classes: admissions needs, scenic spot needs, and dining needs.
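Sub-steps S31 to S34 can be sketched in Python as a filter on co-occurrence counts followed by grouping. The inputs are assumed to have been prepared elsewhere: the feature words extracted from the knowledge base, the number of webpages in which each feature word appears together with the query string, and the knowledge-base class of each feature word. All names and any numbers passed in are hypothetical.

```python
def classify_feature_words(feature_words, cooccurrence_counts, word_class, threshold):
    """Keep feature words seen with the query on more than `threshold`
    webpages (S32), then group them by knowledge-base class (S33).

    feature_words: feature words extracted from the knowledge base.
    cooccurrence_counts: {feature word: number of co-occurring webpages}.
    word_class: {feature word: class in the knowledge base}.
    Returns {class: [feature words]}.
    """
    classes = {}
    for word in feature_words:
        if cooccurrence_counts.get(word, 0) <= threshold:
            continue  # association with the query string is too weak
        word_category = word_class.get(word)
        if word_category is None:
            continue  # no class recorded for this feature word
        classes.setdefault(word_category, []).append(word)
    return classes

def has_multiple_intents(classes):
    """S34: multi-intent when at least two classes remain."""
    return len(classes) >= 2
```

Each remaining class corresponds to one candidate query intention, so the keys of the returned dictionary can drive the subsequent rewriting step.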
Step 103: rewrite the original query string, according to each query intention, into multiple first query strings, each having one of the query intentions.
After multi-intent analysis, one ambiguous query need can be converted into multiple definite query needs; that is, each first query string can be a query string with a definite query need.
For example, the original query string "Demi-Gods and Semi-Devils" can be rewritten as the first query strings "Demi-Gods and Semi-Devils film", "Demi-Gods and Semi-Devils TV series", and "Dragon Oath game".
Step 104: for each first query string, find second users who have the same or a similar query intention as that first query string; each second user may have community information.
In embodiments of the present invention, the query intention matching each first query string can be identified from the original query string, and then, for each query intention, the second users who match the query intention of the corresponding first query string are found.
For example, if the original query string is "Demi-Gods and Semi-Devils", the second users found may fall into three groups: those who have the same or a similar query intention as the first query strings "Demi-Gods and Semi-Devils film", "Demi-Gods and Semi-Devils TV series", and "Dragon Oath game", respectively. This ensures that contacts with needs similar to the current user's can be recommended even when the search need is unclear.
In a specific implementation, the first user and the second user may have a community friend relationship. In that case, embodiments of the present invention can associate a social account, such as an instant messaging account or a registered account on various websites (forums, message boards, portals, and so on). The community friend relationships of the first user can be obtained through the associated social account, and matching second users are searched for among the first user's community friends.
It should be noted that the community friend relationship may include one or more levels of friend relationship. For example, first-level friends may be the current user's friends, second-level friends may be the friends of the current user's friends, and so on. Embodiments of the present invention are not limited in this respect.
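Collecting friends across one or more levels can be sketched in Python as a breadth-first traversal of a friend graph. The adjacency-list representation and the name `friends_within` are assumptions made for illustration; the source does not describe how friend relationships are stored.

```python
from collections import deque

def friends_within(graph, user, max_level):
    """Return {friend: level} for every user reachable from `user` within
    `max_level` friend hops (level 1 = direct friends, level 2 = friends
    of friends, and so on).

    graph: {user: list of that user's friends}.
    """
    levels = {user: 0}
    queue = deque([user])
    while queue:
        current = queue.popleft()
        if levels[current] == max_level:
            continue  # do not expand beyond the requested level
        for friend in graph.get(current, ()):
            if friend not in levels:
                levels[friend] = levels[current] + 1
                queue.append(friend)
    del levels[user]  # the starting user is not their own friend
    return levels
```

The returned levels could restrict the candidate second users to community friends, and could also feed into how closely a candidate is considered to be associated with the first user.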
Of course, the first user and the second user may also have no community friend relationship; that is, the second user may be a stranger to the first user. In that case, embodiments of the present invention can search for matching second users globally.
The second user may have community information. A community may be an interrelated collective of social life formed by several social groups or social organizations gathered in a certain field, such as a forum, a microblog, a message board, a portal, or an instant messaging system. Community information may include the user's avatar, user name, user ID, address, and so on.
In a preferred embodiment of the present invention, step 104 may include the following sub-steps:
Sub-step S41: obtain the first query intention information corresponding to each first query string of the first user, and the second query intention information of the second user.
The first query intention information may be information identifying the query intention that corresponds to one subdivided topic category when the first user's query intention is unclear; the second query intention information may be information identifying the second user's query intention.
In a preferred example of an embodiment of the present invention, the first query intention information may include a first feature vector, and the second query intention information may include a second feature vector.
The first feature vector may be vector information identifying the query intention of the first user, and the second feature vector may be vector information identifying the query intention of the second user.
In this example:
The first feature vector included in the first query intention information may be determined from the corresponding first query word string.
The second feature vector included in the second query intention information may be determined from a second query word string.
The second query word string may be a query word string previously submitted by the second user.
In this example, features representing the query intention of a query word string can be found by analyzing the query word string, the search results, and the search log. Feature values are then calculated so that the query word string is represented as a feature vector.
The feature vectors related to the query intention of a query word string can be divided into three categories: the first is the feature vector of the query word string itself, the second is the feature vector associated with the word segments of the query word string, and the third is the feature vector associated with the network information matched by the query word string. Any of these feature vectors can be used to represent the query intention of the query word string.
In a specific implementation, the first feature vector may include at least one of the following:
the first query word string, a feature vector associated with the word segments of the first query word string, and a feature vector associated with the network information matched by the first query word string.
The second feature vector may include at least one of the following:
the second query word string, a feature vector associated with the word segments of the second query word string, and a feature vector associated with the network information matched by the second query word string.
In a preferred example of an embodiment of the present invention, the feature vector associated with the word segments of the first query word string may include at least one of the following:
synonymous word strings of the first query word string, the word segments of the first query word string, the parts of speech of those word segments, the synonyms of those word segments, and the importance of those word segments.
The feature vector associated with the network information matched by the first query word string may include at least one of the following:
the titles of the network information matched by the first query word string, the webpage identifiers of that network information, the history click information of that network information, and other query word strings associated with the first query word string.
The feature vector associated with the word segments of the second query word string may include at least one of the following:
synonymous word strings of the second query word string, the word segments of the second query word string, the parts of speech of those word segments, the synonyms of those word segments, and the importance of those word segments.
The feature vector associated with the network information matched by the second query word string may include at least one of the following:
the titles of the network information matched by the second query word string, the webpage identifiers of that network information, the history click information of that network information, and other query word strings associated with the second query word string.
Examples of the first/second feature vectors are as follows:
1. The query word string itself.
For example, the rewritten first query word string "the semi-gods and the semi-devils TV play" itself.
2. Synonymous word strings of the query word string.
In this example, synonymous word strings of a query word string can be looked up in a synonym dictionary prepared in advance. For example, "the semi-gods and the semi-devils" and "new the semi-gods and the semi-devils" are synonyms, and "new the semi-gods and the semi-devils" and "the semi-gods and the semi-devils Zhong Hanliang version" are synonyms (synonyms of this kind change with actual circumstances; the title is always synonymous with its newest version).
3. The segmented terms of the query word string.
In this example, the query word string can be segmented to obtain terms. For example, segmenting the query word string "the semi-gods and the semi-devils TV play" yields two terms: [the semi-gods and the semi-devils, TV play].
4. The parts of speech of the segmented terms of the query word string.
In this example, part-of-speech analysis can be performed on the segmented terms to obtain the part of speech of each term. For example, the parts of speech of the segmented terms [the semi-gods and the semi-devils, TV play] are [noun, noun].
5. The synonyms of the segmented terms of the query word string.
In this example, synonyms of the segmented terms can be looked up in a synonym dictionary prepared in advance. For example, the synonyms of the segmented terms [the semi-gods and the semi-devils, TV play] are [the semi-gods and the semi-devils, serial].
6. The importance of the segmented terms of the query word string.
In this example, the TF (Term Frequency) and IDF (Inverse Document Frequency) of each segmented term can be obtained by compiling statistics from the search log. TF-IDF is a statistical method for assessing how important a word is to one document in a document set or corpus. The importance of a word increases in proportion to the number of times it appears in the document, but decreases in inverse proportion to the frequency with which it appears in the corpus. The importance of each segmented term can therefore be represented by its TF-IDF value in this example. For example, among the segmented terms ["the semi-gods and the semi-devils", "TV play"], the TF-IDF value of "the semi-gods and the semi-devils" is higher than that of "TV play", so "the semi-gods and the semi-devils" is more important than "TV play" and carries more information.
7. The titles of the network information matched by the query word string.
In this example, the titles of the network information may refer to the titles of the top N (N is a positive integer, such as 10) search results returned by the search engine for the query word string, and can be used to locate text and keywords related to the query word string. For example, for a search for "the semi-gods and the semi-devils", the first three titles of the returned search results are "New the semi-gods and the semi-devils Zhong Hanliang version (all 42 episodes) watch online - ** Film and TV", "the semi-gods and the semi-devils (2013) - the semi-gods and the semi-devils (2013) complete series (1-42, complete) - ** Video", and "the semi-gods and the semi-devils _ episode plots - ** Net".
8. The webpage identifiers of the network information matched by the query word string.
In this example, a webpage identifier may be information that uniquely identifies a webpage, such as a Uniform Resource Identifier (URI); a URI may in turn be a Uniform Resource Locator (URL), a Uniform Resource Name (URN), and so on. Specifically, the identifiers may be the URLs of the top M (M is a positive integer, such as 10) items of network information in the search results, and can be used to locate the web addresses and websites related to the query word string. For example, for a search for "the semi-gods and the semi-devils", the first three URLs of the search results are:
"http://kan.***.com/search/ keyword=%E5%A4%A9%E9%BE%99%E5%85%AB%E9%83%A8";
"http://tv.***.com/s2013/tlbbwsj2013/";
"http://www.***.com/drama/KysdNWU=/episode".
9. The history click information of the network information matched by the query word string.
In this example, the history click information may be statistics on how users who searched for the query word string clicked on the search results. User behavior is thus used to measure which network information is more important and more relevant to the query word string. For example, users searched for "the semi-gods and the semi-devils" 10000 times, and the clicks on the first three URLs are shown in Table 3.
Table 3, history click information table
Table 3 shows that the URL of the second item of network information is more relevant to the query word string.
10. Other query word strings associated with the query word string.
In this example, it is possible to look up which other query word strings were also searched for by the users who submitted the query word string; these can be used to express concepts related to the query word string. For example, users who searched for "18th National Congress" also searched for "Two Sessions", "spirit of the Party's 18th National Congress", and so on.
Of course, the above first/second feature vectors are only examples. When implementing the embodiments of the present invention, other first/second feature vectors may be set according to actual conditions, and those skilled in the art may also adopt other first/second feature vectors according to actual needs; the embodiments of the present invention are not limited in this respect.
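The term importance in example 6 can be sketched with a basic TF-IDF calculation. The sketch below assumes a tiny, made-up search log of already segmented queries and uses the plain `tf * log(N / df)` variant; the patent does not specify which TF-IDF formula is used, and the function name `tf_idf` is hypothetical.

```python
import math

def tf_idf(term, document, corpus):
    """TF-IDF of term in one segmented document, relative to a corpus of segmented documents."""
    if not document:
        return 0.0
    tf = document.count(term) / len(document)
    df = sum(1 for doc in corpus if term in doc)
    if df == 0:
        return 0.0  # term never seen in the corpus
    return tf * math.log(len(corpus) / df)

SEMI = "the semi-gods and the semi-devils"
TV = "TV play"

# Hypothetical search log: each entry is one query after word segmentation.
corpus = [
    [SEMI, TV],
    ["palace", TV],
    ["legend", TV],
    [SEMI, "game"],
]
query = [SEMI, TV]
scores = {term: tf_idf(term, query, corpus) for term in query}
print(scores)
```

With this made-up log, "TV play" appears in more logged queries than "the semi-gods and the semi-devils", so it receives the lower score, matching the ordering described in example 6.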
Sub-step S42: calculate the similarity between the first query intention information and the second query intention information.
In a specific implementation, query word strings can be clustered according to the similarity of their query intentions.
In a preferred example of an embodiment of the present invention, sub-step S42 may further include the following sub-step:
Sub-step S421: calculate the similarity between the first feature vector and the second feature vector.
In this example, for the feature vectors determined from the query word strings, a clustering algorithm (such as a hierarchical clustering algorithm or the k-means algorithm) can be used to calculate the similarity, and the query word strings are then divided into categories according to the similarity.
For example, in Table 4, the first feature vector corresponding to the first query word string "the semi-gods and the semi-devils TV play" and the second feature vector corresponding to the second query word string "the semi-gods and the semi-devils Zhong Hanliang version" have the following parts in common:
1. The segmented terms of the two query word strings share one high-importance term, namely "the semi-gods and the semi-devils".
2. The first three items of network information in the search results are the same; that is, the titles and the webpage identifiers of the network information matched by the query word strings are the same.
3. The other query word strings associated with "the semi-gods and the semi-devils TV play" include "the semi-gods and the semi-devils Zhong Hanliang version", and both contain the same query word "the semi-gods and the semi-devils".
Table 4, feature vector contrast table
During clustering with the clustering algorithm, these common parts can be quantified to calculate the similarity between the first feature vector and the second feature vector.
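One way to quantify the common parts is sketched below. It is an illustration rather than the patent's method: each feature is treated as a set, the overlap is measured with the Jaccard index, and the per-feature weights, feature names, and placeholder result identifiers are assumptions.

```python
def jaccard(a, b):
    """Jaccard overlap of two collections treated as sets."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def feature_similarity(first, second, weights):
    """Weighted average of per-feature Jaccard overlaps."""
    total = sum(weights.values())
    return sum(
        w * jaccard(first.get(name, ()), second.get(name, ()))
        for name, w in weights.items()
    ) / total

SEMI = "the semi-gods and the semi-devils"

# Hypothetical feature vectors; "result 1".."result 3" stand in for the top three results.
first_vector = {
    "terms": {SEMI, "TV play"},
    "top_results": {"result 1", "result 2", "result 3"},
}
second_vector = {
    "terms": {SEMI, "Zhong Hanliang version"},
    "top_results": {"result 1", "result 2", "result 3"},
}
weights = {"terms": 1.0, "top_results": 1.0}  # assumed equal weights

similarity = feature_similarity(first_vector, second_vector, weights)
print(round(similarity, 3))
```

The shared term and the identical top results both raise the score, which mirrors items 1 and 2 in the comparison above.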
Sub-step S43: when the similarity is greater than a preset similarity threshold, judge that the first query word string and the second user have the same or similar query intention.
In a specific implementation, when the similarity exceeds the preset similarity threshold, the first query word string and the second query word string can be clustered into one category; that is, the first user's query intention after this subdivision is the same as or similar to that of the second user.
The more similar the first feature vector and the second feature vector are, the more likely the first query word string and the second query word string are to be clustered into one category, and the more similar, or even identical, the first user's query intention after this subdivision is to the query intention of the second user.
For example, the first query word string "the semi-gods and the semi-devils TV play" and the second query word string "the semi-gods and the semi-devils Zhong Hanliang version" can be clustered into one category, and the first query word string "loan application" and the second query word string "loan application process" can be clustered into one category.
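The threshold-based grouping can be sketched as a simple greedy pass over the query word strings. The similarity scores and the threshold of 0.5 are made-up values, and this single-pass grouping only stands in for the hierarchical or k-means clustering mentioned in the text.

```python
def group_by_threshold(items, similarity, threshold):
    """Place each item in the first group containing a sufficiently similar member."""
    groups = []
    for item in items:
        for group in groups:
            if any(similarity(item, other) > threshold for other in group):
                group.append(item)
                break
        else:
            groups.append([item])  # no similar group found, start a new one
    return groups

a = "the semi-gods and the semi-devils TV play"
b = "the semi-gods and the semi-devils Zhong Hanliang version"
c = "loan application"
d = "loan application process"

# Hypothetical pairwise similarities; unlisted pairs count as 0.0.
scores = {frozenset((a, b)): 0.67, frozenset((c, d)): 0.8}

def lookup(x, y):
    return scores.get(frozenset((x, y)), 0.0)

groups = group_by_threshold([a, b, c, d], lookup, threshold=0.5)
print(groups)
```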
In a specific implementation, after a user performs a query, the correspondence between the user and the user's query intention, and the correspondence between the query word string/feature vector and its query intention, can be saved, so that second users having the same or similar query intention as each first query word string can be looked up later.
For example, the correspondences can be saved in the format shown in Table 5.
Table 5, user-query intention, inquiry word string/feature vector-query intention corresponding lists
When searching for second users having the same or similar query intention as the first user, the second users are calculated from the saved user-query intention and query word string/feature vector-query intention correspondence lists together with the first feature vector of the first user.
The specific calculation steps are as follows:
1. Determine the first feature vector A of the first user.
2. Calculate the similarity between A and each of the feature vectors A1, A2, ..., An (n is a positive integer) in the user-query intention and query word string/feature vector-query intention correspondence lists, and find the query intention i corresponding to the feature vector Ai (i is a positive integer) with the highest similarity.
3. Using the query intention i obtained in step 2, find the second users of query intention i in the user-query intention and query word string/feature vector-query intention correspondence lists.
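The three steps above can be sketched as follows. The sketch assumes the feature vectors have already been turned into numeric lists and uses cosine similarity as a stand-in measure; the intent labels, user names, and vector values are hypothetical, since Table 5 is not reproduced here.

```python
import math

def cosine(a, b):
    """Cosine similarity of two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def find_second_users(first_vector, intent_vectors, intent_users):
    """Step 2: pick the most similar stored vector; step 3: return that intent's users."""
    best_intent = max(
        intent_vectors,
        key=lambda intent: cosine(first_vector, intent_vectors[intent]),
    )
    return best_intent, intent_users.get(best_intent, [])

# Hypothetical stored correspondences (feature vector -> intent, intent -> users).
intent_vectors = {"query intention 1": [1.0, 0.0, 1.0], "query intention 2": [0.0, 1.0, 0.0]}
intent_users = {"query intention 1": ["user 1", "user 2"], "query intention 2": ["user 3"]}

first_vector = [1.0, 0.0, 0.8]  # step 1: feature vector A of the first user
intent, second_users = find_second_users(first_vector, intent_vectors, intent_users)
print(intent, second_users)
```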
Step 105: synthesize the network information and the community information corresponding to the second user into a search result.
In the embodiment of the present invention, the network information and the community information of the second user can be taken together as the final search result.
In a preferred embodiment of the present invention, step 105 may include the following sub-steps:
Sub-step S51: calculate the association closeness between the first user and the second user under each query intention.
In the embodiment of the present invention, the factors that affect the association closeness between the first user and the second user may include three parts: the first is the similarity of the query intentions, the second is the familiarity between the first user and the second user, and the third is the familiarity of the second user with the query intention.
In a preferred example of an embodiment of the present invention, sub-step S51 may further include the following sub-steps:
Sub-step S511: for each subdivision category corresponding to the original query word string, i.e., under each query intention, configure corresponding weights for the similarity between the first query intention information and the second query intention information, and/or the association information between the first user and the second user, and/or the historical operation information of the second user on the second query intention.
Sub-step S512: sum the weighted similarity between the first query intention information and the second query intention information, and/or the weighted association information between the first user and the second user, and/or the weighted historical operation information of the second user on the second query intention, to obtain the association closeness between the first user and the second user under each query intention.
In this example, the numerical value of each factor (the similarity between the first query intention information and the second query intention information, and/or the association information between the first user and the second user, and/or the historical operation information of the second user on the second query intention) can be obtained by analyzing historical data and the search log. Weights are then configured according to actual needs and experience; for example, the higher the importance of a factor, the larger its weight may be. Finally, the factors are combined by weighted calculation to obtain the association closeness.
In practical applications, the similarity between the first query intention information and the second query intention information can be calculated in step 104. The more similar the query word strings are, the more similar the query intentions are.
For example, in the subdivision category "TV play" corresponding to the original query word string "the semi-gods and the semi-devils", second user A searched for "the semi-gods and the semi-devils TV play complete series" and second user B searched for "the semi-gods and the semi-devils introduction". The query intention of second user A is closer to that of the first user than the query intention of second user B, so the association closeness of second user A is greater than that of second user B.
In a specific implementation, the association information between the first user and the second user may include at least one of the following:
the average number of contacts within a preset time period, the average contact duration within a preset time period, the number of common friends, and the place of residence.
In this example, the association information can identify the familiarity between the first user and the second user: the more frequently the second user is contacted, the higher the familiarity and the higher the association closeness.
The historical operation information of the second user on the second query intention may include at least one of the following:
the number of searches corresponding to the second query intention, the history click count of the network information matching the second query intention, the browsing duration of the network information corresponding to the second query intention, and the number of consecutive search days corresponding to the second query intention.
In this example, the historical operation information can identify how well the second user understands the query intention: the more time the second user has spent on the query intention, the more familiar the second user is with it and the deeper the understanding, and the higher the association closeness.
The number of searches corresponding to the second query intention can be found in the user-query intention and query word string/feature vector-query intention correspondence lists shown in Table 5; for example, the ranking by number of searches for query intention 1 may be user 2 > user 1.
The history click count of the network information matching the second query intention can be obtained from the search log as the number of clicks by the second user for the second query word string. The more clicks there are, the more webpages and content the second user has browsed, and the more familiar the second user is with the second query intention.
The browsing duration of the network information corresponding to the second query intention can be obtained from search log statistics as the amount of time the second user spent browsing webpages related to the second query word string. The longer the browsing time, the more familiar the second user is with the second query intention.
The number of consecutive search days corresponding to the second query intention can be obtained from search log statistics as the number of consecutive days on which the second user queried the same query intention. The more days and the longer the duration, the more familiar the second user is with the second query intention. For example, if second user A searched for "Japan travel" continuously for a month and second user B searched for "Japan travel" continuously for three days, second user A can be considered more familiar with the query intention "Japan travel" than second user B.
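The consecutive-days statistic can be sketched as a longest-streak calculation over the dates on which a user searched for a given query intention. The dates below are made up to match the month versus three days example, and the function name `longest_search_streak` is hypothetical.

```python
from datetime import date, timedelta

def longest_search_streak(search_dates):
    """Length in days of the longest run of consecutive calendar days in search_dates."""
    best = current = 0
    previous = None
    for day in sorted(set(search_dates)):
        if previous is not None and day - previous == timedelta(days=1):
            current += 1
        else:
            current = 1  # streak broken or first day seen
        best = max(best, current)
        previous = day
    return best

start = date(2014, 6, 1)  # arbitrary start date for the illustration
user_a_dates = [start + timedelta(days=i) for i in range(30)]  # searched daily for a month
user_b_dates = [start + timedelta(days=i) for i in range(3)]   # searched for three days

streak_a = longest_search_streak(user_a_dates)
streak_b = longest_search_streak(user_b_dates)
print(streak_a, streak_b)
```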
For example, for the first query word string "the semi-gods and the semi-devils TV play", there are three second users with the same or similar query intention, namely second user A, second user B, and second user C; the factors affecting their association closeness are shown in Table 6.
Table 6, association degree contrast table closely
Compared with second user C, second user A contacts the first user less frequently but is more familiar with the query intention. Compared with second user B, second user C contacts the first user more frequently and is more familiar with the query intention.
According to sub-step S51, the second users having the same or similar query intention as the first query word strings "the semi-gods and the semi-devils film" and "dragon oath game" can be obtained in the same way.
Sub-step S52: sort the community information corresponding to the second users according to the association closeness.
In this example, the sorting can be from high to low association closeness, i.e., descending order; of course, the sorting can also be from low to high association closeness, i.e., ascending order, and the embodiments of the present invention are not limited in this respect.
For example, with the association closeness values shown in Table 6, 155 > 135 > 117.2, the resulting order of the second users is:
second user A > second user C > second user B.
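The descending sort of sub-step S52 can be sketched with the three closeness values quoted above. The mapping of values to users follows the stated ordering (A, then C, then B); the dictionary layout is an assumption.

```python
# Association closeness per second user, taken from the values and ordering quoted above.
closeness = {
    "second user A": 155,
    "second user B": 117.2,
    "second user C": 135,
}

# Sort user names from highest to lowest association closeness.
ranked = sorted(closeness, key=closeness.get, reverse=True)
print(ranked)
```

Passing `reverse=False` instead would give the ascending order that the text also allows.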
Sub-step S53: synthesize the network information and the sorted community information corresponding to the second users into the search result.
After the search result has been synthesized, the sorted community information of the second users can be presented to the first user in the client together with the network information, for example by showing the avatar of each second user to the right of the network information corresponding to the first query word string, so that the first user can communicate with them.
As shown in Fig. 2, the community information of the second users, including their avatars and names, can be displayed according to the different query intentions, i.e., TV play demand, game demand, and film demand. For example, according to Table 5, when the original query word string is "the semi-gods and the semi-devils", user 1 and user 2 are displayed under the corresponding subdivision category "TV play" and user 3 is displayed under the subdivision category "game"; when the original query word string is "Tsinghua University", user 4 is displayed under the corresponding subdivision category "enrollment" and user 5 is displayed under the subdivision category "tourism".
As another example, among the instant messaging friends of the current user, friend A has researched "enrollment" information about Tsinghua University through the search engine, and friend B has researched "tourism" information about Tsinghua University through the search engine. When the current user enters "Tsinghua University", the avatars of friend A and friend B can be attached under the category labels "enrollment" and "tourism", respectively, on the right side of the search results page. In this way, when the current user's search need is unclear, the related users among the community friends are subdivided, so that the current user can select a friend in the corresponding subdivision category and then communicate with that friend.
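The grouping of second users under category labels can be sketched as follows, using the user and category names from the "the semi-gods and the semi-devils" example above. The input format (a list of category and user pairs) and the function name are assumptions.

```python
from collections import defaultdict

def group_for_display(matches):
    """Group second users under the subdivision category they matched, keeping input order."""
    panels = defaultdict(list)
    for category, user in matches:
        panels[category].append(user)
    return dict(panels)

# Matches for the original query word string "the semi-gods and the semi-devils".
matches = [
    ("TV play", "user 1"),
    ("TV play", "user 2"),
    ("game", "user 3"),
]
panels = group_for_display(matches)
print(panels)
```

If the matches are passed in already sorted by association closeness, each category panel keeps that order.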
With the embodiment of the present invention, when the search result is synthesized, an entry object of the communication software used to communicate with the second user can be constructed for the community information of the second user; the first user can trigger the entry object, for example by a mouse click, to carry out instant messaging with the second user directly.
Of course, after obtaining the community information of the second user, the first user can also communicate with the second user by other means.
For example, if the community information of the second user includes an email address, the first user can obtain an Outlook (an application for sending, receiving, writing, and managing email) entry for the second user and send mail to that email address.
As another example, if the community information of the second user includes a user name or user ID, the first user can find the second user through the corresponding instant messaging tool or website (such as a forum, discussion bar, or portal website) and communicate with the second user.
In other embodiments, the user can search in a mobile client and submit the original query word string wirelessly to obtain matching wireless network information. When the wireless server judges that the original query word string is a query word string with multiple query intentions, it rewrites the original query word string into multiple first query word strings each having one of the query intentions, searches for the second users of the corresponding query intentions, synthesizes the network information and the community information of the second users into a wireless search result, and returns it to the mobile client. The user then directly calls the corresponding instant messaging software in the mobile client to communicate with the selected second user.
A traditional search engine can only search network information. When the current user enters a query word string on a community website such as a microblog or forum, the community website can return users and microblog posts/forum posts related to the query word string. However, the users returned by a search on a community website are obtained by matching the query word string against community information (mainly user names); the user's search need is not subdivided, and users with similar needs cannot be obtained.
In the embodiment of the present invention, a search is performed with the original query word string submitted by the first user to obtain the matching network information. When the original query word string is judged to be a query word string with multiple query intentions, it is rewritten into multiple first query word strings each having one of the query intentions, the second users having the same or similar query intention as each first query word string are searched for, and the network information and the community information of the second users are synthesized into the search result. In this way, when the first user's need is unclear, the user's community friends are screened by subject category, according to the various classified needs, through analysis of the search log, yielding the second users most relevant to each subject category. The first user's search need is thereby subdivided, and contacts with needs similar to the current user's can be recommended even when the search need is unclear. This spares the first user from repeatedly and laboriously filtering massive amounts of network information by hand, reduces the time and effort the first user spends, reduces the system resource consumption of the user equipment and the website, reduces the occupation of network bandwidth, and improves the efficiency, quality, and capacity of information acquisition.
Referring to Fig. 3, a structural block diagram of an embodiment of a search apparatus of the present invention is shown, which may specifically include the following modules:
Network information search module 301, configured to, when the original query word string submitted by the first user is received, search with the original query word string to obtain the matching network information;
Multi-query-intention judgment module 302, configured to judge, according to the network information, whether the original query word string is a query word string with multiple query intentions, and if so, call the query word string rewriting module 303;
Query word string rewriting module 303, configured to rewrite the original query word string, according to each query intention, into multiple first query word strings each having that query intention;
User searching module 304, configured to search, according to the network information, for the second users having the same or similar query intention as each first query word string, where the second users have community information;
Search result synthesis module 305, configured to synthesize the network information and the community information corresponding to the second users into the search result.
In a preferred embodiment of the present invention, the multi-query-intention judgment module 302 may include the following submodules:
First feature network information acquisition submodule, configured to obtain the first feature network information matched by the original query word string; the first feature network information includes the top N items of network information by ranking and/or the top M items of network information by history click count;
Second feature network information acquisition submodule, configured to obtain the second feature network information matched by other query word strings; the second feature network information includes the top A items of network information by ranking and/or the top B items of network information by history click count;
Feature network information judging submodule, configured to judge whether the first feature network information includes at least two items of second feature network information, and if so, call the first decision submodule;
First decision submodule, configured to determine that the original query word string is a query word string with multiple query intentions; where M, N, A, and B are positive integers.
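The check performed by these submodules can be sketched as below. This is one possible reading only: the original query word string is treated as multi-intent when its feature results contain the feature results of at least two other query word strings. The URLs under `example` hosts, the query labels, and the function name are hypothetical.

```python
def is_multi_intent(first_feature_urls, second_feature_by_query, min_matches=2):
    """True if the original query's feature results overlap those of at least min_matches other queries."""
    first = set(first_feature_urls)
    matched = [
        query
        for query, urls in second_feature_by_query.items()
        if first & set(urls)
    ]
    return len(matched) >= min_matches

# Hypothetical top results for the original query word string.
first_feature_urls = [
    "tv.example/semi-gods",
    "game.example/dragon-oath",
    "movie.example/semi-gods",
]

# Hypothetical top results for other, more specific query word strings.
second_feature_by_query = {
    "the semi-gods and the semi-devils TV play": ["tv.example/semi-gods"],
    "dragon oath game": ["game.example/dragon-oath"],
}

multi = is_multi_intent(first_feature_urls, second_feature_by_query)
print(multi)
```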
In a preferred embodiment of the present invention, the multi-query-intention judgment module 302 may include the following submodules:
Entity category searching submodule, configured to search a preset knowledge base for the entity categories corresponding to the original query word string;
Second decision submodule, configured to determine that the original query word string is a query word string with multiple query intentions when there are more than two entity categories.
In a preferred embodiment of the present invention, the multi-query-intention judgment module 302 may include the following submodules:
Feature word searching submodule, configured to search a preset knowledge base for the feature words associated with the original query word string;
Quantity judging submodule, configured to judge whether the number of occurrences of the feature words in webpages across the whole network exceeds a preset quantity threshold, and if so, call the classification submodule;
Classification submodule, configured to classify the feature words using the entity categories of the knowledge base;
Third decision submodule, configured to determine that the original query word string is a query word string with multiple query intentions when at least two classifications are obtained.
In a preferred embodiment of the present invention, the user search module 304 may include the following submodules:
A query intention information acquisition submodule, configured to obtain the first query intention information corresponding to each first query string of the first user and the second query intention information of the second user;
A query intention information similarity calculation submodule, configured to calculate the similarity between each item of first query intention information and the second query intention information;
A judging submodule, configured to judge that the first query string and the second user have the same or a similar query intention when the similarity is greater than a preset similarity threshold.
In a preferred embodiment of the present invention, the first query intention information may include a first feature vector, and the first feature vector may be determined according to the first query string;
The second query intention information may include a second feature vector, and the second feature vector may be determined according to a second query string;
Wherein the second query string is a query string previously submitted by the second user.
In a preferred embodiment of the present invention, the query intention information similarity calculation submodule may include the following submodule:
A feature vector similarity calculation submodule, configured to calculate the similarity between the first feature vector and the second feature vector.
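The text does not name a particular similarity measure. As one possible illustration, the sketch below uses cosine similarity over sparse feature vectors stored as dictionaries and then applies the preset similarity threshold. The choice of cosine similarity, the vector layout, the threshold value and all names are assumptions.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two sparse vectors given as {feature: weight}."""
    dot = sum(w * v.get(k, 0.0) for k, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # an empty vector is treated as dissimilar to everything
    return dot / (norm_u * norm_v)


def has_similar_intent(first_vec, second_vec, threshold=0.5):
    """True when the similarity is greater than the preset threshold."""
    return cosine_similarity(first_vec, second_vec) > threshold


# Hypothetical feature vectors for a first and a second query string.
first = {"phone": 1.0, "price": 0.5}
second = {"phone": 0.8, "review": 0.3}
print(has_similar_intent(first, second))
```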
In a preferred example of an embodiment of the present invention, the first feature vector may include at least one of the following:
The first query string, a feature vector associated with the word segments of the first query string, and a feature vector associated with the network information matching the first query string;
The second feature vector may include at least one of the following:
The second query string, a feature vector associated with the word segments of the second query string, and a feature vector associated with the network information matching the second query string.
In a preferred example of an embodiment of the present invention, the feature vector associated with the word segments of the first query string may include at least one of the following:
A synonymous string of the first query string, the word segments of the first query string, the parts of speech of the word segments of the first query string, synonyms of the word segments of the first query string, and the importance of the word segments of the first query string;
The feature vector associated with the network information matching the first query string may include at least one of the following:
The title of the network information matching the first query string, the web page identifier of the network information matching the first query string, the historical click information of the network information matching the first query string, and other query strings associated with the first query string;
The feature vector associated with the word segments of the second query string may include at least one of the following:
A synonymous string of the second query string, the word segments of the second query string, the parts of speech of the word segments of the second query string, synonyms of the word segments of the second query string, and the importance of the word segments of the second query string;
The feature vector associated with the network information matching the second query string may include at least one of the following:
The title of the network information matching the second query string, the web page identifier of the network information matching the second query string, the historical click information of the network information matching the second query string, and other query strings associated with the second query string.
In a preferred embodiment of the present invention, the search result synthesis module 305 may include the following submodules:
An association closeness calculation submodule, configured to calculate the association closeness between the first user and the second user under each query intention;
A community information sorting submodule, configured to sort the community information of the second user according to the association closeness;
A synthesis submodule, configured to synthesize the network information and the sorted community information of the second user into a search result.
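The sorting and synthesis steps can be sketched as follows, assuming the association closeness of each second user has already been computed and that a search result is simply a pair of lists. The record layout, the descending sort order and the names are assumptions made for illustration.

```python
def synthesize_results(network_info, community_info, closeness):
    """Sort community information by association closeness and combine it
    with the network information into one search result.

    network_info: list of matched network information items.
    community_info: list of dicts, each with a "user" key naming the
        second user the item belongs to.
    closeness: dict mapping a second user to an association closeness score.
    """
    ranked = sorted(community_info,
                    key=lambda item: closeness.get(item["user"], 0.0),
                    reverse=True)  # closest second users first
    return {"network": list(network_info), "community": ranked}


# Hypothetical inputs.
pages = ["page A", "page B"]
posts = [{"user": "u1", "info": "post 1"}, {"user": "u2", "info": "post 2"}]
scores = {"u1": 0.2, "u2": 0.9}
print(synthesize_results(pages, posts, scores))
```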
In a preferred embodiment of the present invention, the association closeness calculation submodule may include the following submodules:
A weight configuration submodule, configured to configure corresponding weights for, under each query intention, the similarity between the first query intention information and the second query intention information, and/or the association information between the first user and the second user, and/or the second user's historical operation information record for the second query intention;
A summation submodule, configured to sum the weighted similarity between the first query intention information and the second query intention information, and/or the weighted association information between the first user and the second user, and/or the weighted historical operation information of the second user for the second query intention, to obtain the association closeness between the first user and the second user under each query intention.
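A weighted sum of this kind can be sketched as below. The sketch assumes each of the three factors has already been reduced to a single numeric score, and the weight values are placeholders chosen only for illustration; neither the scores nor the weights come from the patent text.

```python
def association_closeness(similarity, relation_score, history_score,
                          weights=(0.5, 0.3, 0.2)):
    """Weighted sum of the three factors for one query intention.

    similarity: similarity between first and second query intention info.
    relation_score: score derived from the association information between
        the first user and the second user.
    history_score: score derived from the second user's historical
        operations for the second query intention.
    weights: hypothetical weights for the three factors, in that order.
    """
    w_sim, w_rel, w_hist = weights
    return (w_sim * similarity
            + w_rel * relation_score
            + w_hist * history_score)


# A factor that is not used can be passed as 0.0, matching the "and/or" wording.
print(association_closeness(0.8, 0.5, 0.0))
```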
In a preferred embodiment of the present invention, the association information between the first user and the second user may include at least one of the following:
The average number of contacts within a preset time period, the average contact duration within a preset time period, the number of common friends, and the place of residence;
The historical operation information of the second user for the second query intention may include at least one of the following:
The number of searches corresponding to the second query intention, the browsing duration of the network information corresponding to the second query intention, and the number of consecutive days of searching corresponding to the second query intention.
In a preferred embodiment of the present invention, the first user and the second user may have a community friend relationship.
The search method and the search apparatus provided by the present invention have been described in detail above. Specific examples are used herein to illustrate the principle and implementation of the invention, and the description of the above embodiments is intended only to help in understanding the method of the invention and its core idea. For those of ordinary skill in the art, the specific implementation and the scope of application may vary according to the idea of the invention. In summary, the content of this specification should not be construed as limiting the present invention.
Claims (22)
1. A search method, characterized by comprising:
When an original query string submitted by a first user is received, searching with the original query string to obtain matching network information;
Judging, according to the network information, whether the original query string is a query string with multiple query intentions; if so, rewriting the original query string, according to each query intention, into multiple first query strings each having the respective query intention;
Searching, according to each first query string, for a second user having the same or a similar query intention as that first query string; wherein the second user has community information;
Synthesizing the network information and the community information corresponding to the second user into a search result;
Constructing, according to the second user in the search result and the community information corresponding to the second user, an entry object of communication software for communicating with the second user, the first user carrying out instant messaging directly with the second user through the entry object.
2. The method according to claim 1, characterized in that the step of judging whether the original query string is a query string with multiple query intentions comprises:
Obtaining first feature network information matching the original query string; the first feature network information includes the top N highest-ranked items of network information and/or the top M items of network information with the most historical clicks;
Obtaining second feature network information matching other query strings; the second feature network information includes the top A highest-ranked items of network information and/or the top B items of network information with the most historical clicks;
Judging whether the first feature network information includes at least two items of the second feature network information; if so, determining that the original query string is a query string with multiple query intentions; wherein M, N, A and B are positive integers.
3. The method according to claim 1, characterized in that the step of judging whether the original query string is a query string with multiple query intentions comprises:
Looking up, in a preset knowledge base, the entity classes corresponding to the original query string;
When there are more than two entity classes, determining that the original query string is a query string with multiple query intentions.
4. The method according to claim 1, characterized in that the step of judging whether the original query string is a query string with multiple query intentions comprises:
Looking up, in a preset knowledge base, the feature words associated with the original query string;
Judging whether the quantity of the feature words in web pages across the whole network exceeds a preset quantity threshold; if so, classifying the feature words using the entity classes of the knowledge base;
When at least two classifications are obtained, determining that the original query string is a query string with multiple query intentions.
5. The method according to claim 1, 2, 3 or 4, characterized in that the step of searching for a second user having the same or a similar query intention as the first query string comprises:
Obtaining the first query intention information corresponding to each first query string of the first user and the second query intention information corresponding to each second query string of the second user;
Calculating the similarity between the first query intention information and the second query intention information;
When the similarity is greater than a preset similarity threshold, judging that the first query string and the second user have the same or a similar query intention.
6. The method according to claim 5, characterized in that the first query intention information includes a first feature vector, the first feature vector being determined according to the first query string;
The second query intention information includes a second feature vector, the second feature vector being determined according to the second query string;
Wherein the second query string is a query string previously submitted by the second user.
7. The method according to claim 6, characterized in that the first feature vector comprises at least one of the following:
The first query string, a feature vector associated with the word segments of the first query string, and a feature vector associated with the network information matching the first query string;
The second feature vector comprises at least one of the following:
The second query string, a feature vector associated with the word segments of the second query string, and a feature vector associated with the network information matching the second query string.
8. The method according to claim 5, characterized in that the step of synthesizing the network information and the community information corresponding to the second user into a search result comprises:
Calculating the association closeness between the first user and the second user under each query intention;
Sorting the community information corresponding to the second user according to the association closeness;
Synthesizing the network information and the sorted community information corresponding to the second user into a search result.
9. The method according to claim 8, characterized in that the step of calculating the association closeness between the first user and the second user under each query intention comprises:
Configuring corresponding weights for, under each query intention, the similarity between the first query intention information and the second query intention information, and/or the association information between the first user and the second user, and/or the second user's historical operation information record for the second query intention;
Summing the weighted similarity between the first query intention information and the second query intention information, and/or the weighted association information between the first user and the second user, and/or the weighted historical operation information of the second user for the second query intention, to obtain the association closeness between the first user and the second user under each query intention.
10. The method according to claim 9, characterized in that the association information between the first user and the second user comprises at least one of the following:
The average number of contacts within a preset time period, the average contact duration within a preset time period, the number of common friends, and the place of residence;
The historical operation information of the second user for the second query intention comprises at least one of the following:
The number of searches corresponding to the second query intention, the browsing duration of the network information corresponding to the second query intention, and the number of consecutive days of searching corresponding to the second query intention.
11. The method according to claim 1, 2, 3, 4, 6, 7, 9 or 10, characterized in that the first user and the second user have a community friend relationship.
12. A search apparatus, characterized by comprising:
A network information search module, configured to, when an original query string submitted by a first user is received, search with the original query string to obtain matching network information;
A multi-query-intention judgment module, configured to judge, according to the network information, whether the original query string is a query string with multiple query intentions and, if so, to call the query string rewriting module;
A query string rewriting module, configured to rewrite the original query string, according to each query intention, into multiple first query strings each having the respective query intention;
A user search module, configured to search, according to each first query string, for a second user having the same or a similar query intention as that first query string; wherein the second user has community information;
A search result synthesis module, configured to synthesize the network information and the community information corresponding to the second user into a search result;
An entry object construction module, configured to construct, according to the second user in the search result and the community information corresponding to the second user, an entry object of communication software for communicating with the second user, the first user carrying out instant messaging directly with the second user through the entry object.
13. The apparatus according to claim 12, characterized in that the multi-query-intention judgment module comprises:
A first feature network information acquisition submodule, configured to obtain first feature network information matching the original query string; the first feature network information includes the top N highest-ranked items of network information and/or the top M items of network information with the most historical clicks;
A second feature network information acquisition submodule, configured to obtain second feature network information matching other query strings; the second feature network information includes the top A highest-ranked items of network information and/or the top B items of network information with the most historical clicks;
A feature network information judging submodule, configured to judge whether the first feature network information includes at least two items of the second feature network information and, if so, to call the first determination submodule;
A first determination submodule, configured to determine that the original query string is a query string with multiple query intentions; wherein M, N, A and B are positive integers.
14. The apparatus according to claim 12, characterized in that the multi-query-intention judgment module comprises:
An entity class lookup submodule, configured to look up, in a preset knowledge base, the entity classes corresponding to the original query string;
A second determination submodule, configured to determine that the original query string is a query string with multiple query intentions when there are more than two entity classes.
15. The apparatus according to claim 12, characterized in that the multi-query-intention judgment module comprises:
A feature word lookup submodule, configured to look up, in a preset knowledge base, the feature words associated with the original query string;
A quantity judging submodule, configured to judge whether the quantity of the feature words in web pages across the whole network exceeds a preset quantity threshold and, if so, to call the classification submodule;
A classification submodule, configured to classify the feature words using the entity classes of the knowledge base;
A third determination submodule, configured to determine that the original query string is a query string with multiple query intentions when at least two classifications are obtained.
16. The apparatus according to claim 12, 13, 14 or 15, characterized in that the user search module comprises:
A query intention information acquisition submodule, configured to obtain the first query intention information corresponding to each first query string of the first user and the second query intention information corresponding to each second query string of the second user;
A query intention information similarity calculation submodule, configured to calculate the similarity between the first query intention information and the second query intention information;
A judging submodule, configured to judge that the first query string and the second user have the same or a similar query intention when the similarity is greater than a preset similarity threshold.
17. The apparatus according to claim 16, characterized in that the first query intention information includes a first feature vector, the first feature vector being determined according to the first query string;
The second query intention information includes a second feature vector, the second feature vector being determined according to the second query string;
Wherein the second query string is a query string previously submitted by the second user.
18. The apparatus according to claim 17, characterized in that the first feature vector comprises at least one of the following:
The first query string, a feature vector associated with the word segments of the first query string, and a feature vector associated with the network information matching the first query string;
The second feature vector comprises at least one of the following:
The second query string, a feature vector associated with the word segments of the second query string, and a feature vector associated with the network information matching the second query string.
19. The apparatus according to claim 16, characterized in that the search result synthesis module comprises:
An association closeness calculation submodule, configured to calculate the association closeness between the first user and the second user under each query intention;
A community information sorting submodule, configured to sort the community information of the second user according to the association closeness;
A synthesis submodule, configured to synthesize the network information and the sorted community information of the second user into a search result.
20. The apparatus according to claim 19, characterized in that the association closeness calculation submodule comprises:
A weight configuration submodule, configured to configure corresponding weights for, under each query intention, the similarity between the first query intention information and the second query intention information, and/or the association information between the first user and the second user, and/or the second user's historical operation information record for the second query intention;
A summation submodule, configured to sum the weighted similarity between the first query intention information and the second query intention information, and/or the weighted association information between the first user and the second user, and/or the weighted historical operation information of the second user for the second query intention, to obtain the association closeness between the first user and the second user under each query intention.
21. The apparatus according to claim 20, characterized in that the association information between the first user and the second user comprises at least one of the following:
The average number of contacts within a preset time period, the average contact duration within a preset time period, the number of common friends, and the place of residence;
The historical operation information of the second user for the second query intention may include at least one of the following:
The number of searches corresponding to the second query intention, the browsing duration of the network information corresponding to the second query intention, and the number of consecutive days of searching corresponding to the second query intention.
22. The apparatus according to claim 12, 13, 14, 15, 17, 18, 20 or 21, characterized in that the first user and the second user have a community friend relationship.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410262143.XA CN105159898B (en) | 2014-06-12 | 2014-06-12 | A kind of method and apparatus of search |
Publications (2)
Publication Number | Publication Date |
---|---|
CN105159898A (en) | 2015-12-16 |
CN105159898B (en) | 2019-11-26 |
Family
ID=54800755
Family Applications (1)
Application Number | Status | Publication | Priority Date | Filing Date | Title |
---|---|---|---|---|---|
CN201410262143.XA | Active | CN105159898B (en) | 2014-06-12 | 2014-06-12 | A kind of method and apparatus of search |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN105159898B (en) |
Families Citing this family (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106951422B (en) * | 2016-01-07 | 2021-05-28 | 腾讯科技(深圳)有限公司 | Webpage training method and device, and search intention identification method and device |
CN106021516A (en) * | 2016-05-24 | 2016-10-12 | 百度在线网络技术(北京)有限公司 | Search method and device |
CN106971004B (en) * | 2017-04-26 | 2021-04-06 | 百度在线网络技术(北京)有限公司 | Search result providing method and device |
CN108182290B (en) * | 2018-01-30 | 2022-03-25 | 深圳市富途网络科技有限公司 | Estimation method for community content hot sequencing |
CN109543026A (en) * | 2018-12-12 | 2019-03-29 | 广东小天才科技有限公司 | Analytic content acquisition method of mathematical formula and family education equipment |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101136869A (en) * | 2006-08-30 | 2008-03-05 | 高鹏 | Method for generating search intention based contacts group of instant communication system |
CN102016845A (en) * | 2008-04-29 | 2011-04-13 | 微软公司 | Social network powered query refinement and recommendations |
CN102402589A (en) * | 2011-10-26 | 2012-04-04 | 北京百度网讯科技有限公司 | Method and equipment for providing reference research information related to research request |
CN102456054A (en) * | 2010-10-28 | 2012-05-16 | 腾讯科技(深圳)有限公司 | Searching method and system |
CN103942198A (en) * | 2013-01-18 | 2014-07-23 | 佳能株式会社 | Method and device for mining intentions |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||