CN106610972A - Query rewriting method and apparatus - Google Patents
- Publication number
- CN106610972A (application CN201510689095.7A)
- Authority
- CN
- China
- Prior art keywords
- sample
- word
- vector
- expansion word
- keyword
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2453—Query optimisation
- G06F16/24534—Query rewriting; Transformation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/242—Query formulation
- G06F16/2433—Query languages
- G06F16/2448—Query languages for particular applications; for extensibility, e.g. user defined types
Abstract
The invention provides a query rewriting method and apparatus. The method may comprise: receiving a search keyword input by a user; selecting an expansion word corresponding to the search keyword, where the similarity between the semantic vectors to which the expansion word and the search keyword respectively correspond in a semantic vector space of a preset dimension reaches a preset similarity; and rewriting the search keyword to the selected expansion word. With this technical scheme, query rewriting can be realized with reference to semantics, improving both the pushed-word coverage rate and the rewriting accuracy.
Description
Technical field
The present application relates to the field of search technology, and in particular to a query rewriting method and apparatus.
Background art
Users rely on search functions in many scenarios. When performing a search operation, a user may input any search keyword, and a search engine provides the corresponding search results.
However, the search keywords that users input are often rather arbitrary and may not directly reflect the user's actual intention, so the search results can fail to meet the user's actual needs.
Summary of the invention
In view of this, the present application provides a query rewriting method and apparatus that realize query rewriting with reference to semantics, helping to improve the pushed-word coverage rate and the rewriting accuracy.
To achieve the above object, the present application provides the following technical scheme:
According to a first aspect of the present application, a query rewriting method is proposed, including:
receiving a search keyword input by a user;
selecting an expansion word corresponding to the search keyword, where the similarity between the semantic vectors to which the expansion word and the search keyword respectively correspond in a semantic vector space of a preset dimension reaches a preset similarity; and
rewriting the search keyword to the selected expansion word.
According to a second aspect of the present application, a query rewriting apparatus is proposed, including:
a receiving unit, which receives a search keyword input by a user;
a selecting unit, which selects an expansion word corresponding to the search keyword, where the similarity between the semantic vectors to which the expansion word and the search keyword respectively correspond in a semantic vector space of a preset dimension reaches a preset similarity; and
a rewriting unit, which rewrites the search keyword to the selected expansion word.
As can be seen from the above technical scheme, by mapping the search keyword and the expansion word to semantic vectors in a semantic vector space, the present application lets the similarity between the semantic vectors express the semantic relatedness between the search keyword and the expansion word, which simplifies the semantic comparison process and improves the accuracy of query rewriting. Meanwhile, because relatedness is determined semantically, there is no longer any requirement for textual similarity between the search keyword and the expansion word, which helps to improve the pushed-word coverage rate.
Description of the drawings
Fig. 1 is a flowchart of a query rewriting method according to an exemplary embodiment of the present application;
Fig. 2 is a schematic diagram of query rewriting according to an exemplary embodiment of the present application;
Fig. 3 is a schematic diagram of another kind of query rewriting according to an exemplary embodiment of the present application;
Fig. 4 is a flowchart of another query rewriting method according to an exemplary embodiment of the present application;
Fig. 5 is a flowchart of a sample training process for realizing query rewriting according to an exemplary embodiment of the present application;
Fig. 6 is a schematic diagram of yet another kind of query rewriting according to an exemplary embodiment of the present application;
Fig. 7 is a schematic structural diagram of an electronic device according to an exemplary embodiment of the present application;
Fig. 8 is a block diagram of a query rewriting apparatus according to an exemplary embodiment of the present application.
Detailed description of embodiments
As described in the background section, the search keywords that users input are often arbitrary and frequently fail to reflect their true intention, causing the search results not to meet the users' actual needs. To solve this technical problem, the related art proposes QR (query rewriting) processing: the search keyword input by the user is analyzed and automatically replaced with an expansion word that can reflect the user's actual intention.
The related art proposes several technical means of realizing QR, mainly including:
(1) Based on text similarity. Specifically, the text similarity between the search keyword and a candidate expansion word is calculated by means such as TF-IDF (term frequency-inverse document frequency), and the expansion word corresponding to the search keyword is determined accordingly. However, this approach cannot compute the similarity between a search keyword and an expansion word that share no co-occurring terms (for example, it cannot determine the similarity between "apple" and "iphone"); and when the same word has multiple senses, it easily produces bad expansion words that do not meet the user's actual needs (for example, pairing "apple fruit basket" with "apple phone").
(2) Based on semantic rules. Specifically, semantic rules are established, and expansion words that satisfy those rules are selected. It should be noted that such rules cannot truly compare the semantics of the search keyword and the expansion word: they rest solely on the developers' current understanding, which is highly limited, so both the accuracy and the pushed-word coverage rate are very low; moreover, the rules must be continually maintained and new rules developed, so the cost is very high and the actual effect is unsatisfactory.
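The co-occurrence limitation of approach (1) can be illustrated with a minimal bag-of-words cosine similarity; this is a simplified stand-in for TF-IDF weighting, and the token lists are invented for illustration:

```python
from collections import Counter
import math

def cosine_bow(a_tokens, b_tokens):
    # Bag-of-words cosine similarity; a simplified stand-in for a
    # TF-IDF-weighted text-similarity score.
    a, b = Counter(a_tokens), Counter(b_tokens)
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Sharing the ambiguous token "apple" yields a nonzero score even though
# the meanings differ ...
ambiguous = cosine_bow(["apple", "phone"], ["apple", "fruit", "basket"])
# ... while the semantically related pair with no co-occurring token scores 0.
related = cosine_bow(["apple", "phone"], ["iphone", "6"])
```

Any purely textual score, however weighted, shows the same two failure modes.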
The present application therefore improves on the query rewriting methods of the related art to solve the technical problems described above. The following embodiments further illustrate the present application:
Fig. 1 is a flowchart of a query rewriting method according to an exemplary embodiment of the present application. As shown in Fig. 1, the method may include the following steps:
Step 102: receive a search keyword input by a user.
Step 104: select an expansion word corresponding to the search keyword, where the similarity between the semantic vectors to which the expansion word and the search keyword respectively correspond in a semantic vector space of a preset dimension reaches a preset similarity.
In this embodiment, by mapping the search keyword and the expansion word respectively into a semantic vector space, an actual semantic comparison between the two can be realized, instead of being restricted, as in the related art, to a merely literal text-similarity comparison; this helps to improve the accuracy of the pushed words. At the same time, because it is the actual semantics of each word that are compared, the approach is not limited by the developers' understanding or by the accuracy of hand-set semantic rules, and it requires no later maintenance.
Step 106: rewrite the search keyword to the selected expansion word.
As can be seen from the above embodiment, by mapping the search keyword and the expansion word to semantic vectors in a semantic vector space, the present application lets the similarity between the semantic vectors express the semantic relatedness between the search keyword and the expansion word, which simplifies the semantic comparison process and improves the accuracy of query rewriting. Meanwhile, because relatedness is determined semantically, there is no longer any requirement for textual similarity between the search keyword and the expansion word, which helps to improve the pushed-word coverage rate.
1. QR principle
As the embodiment shown in Fig. 1 makes clear, in the technical scheme of the present application the realization of QR depends on mapping the search keyword and the expansion word respectively to semantic vectors in a semantic vector space, so that the semantic relatedness between the search keyword and the expansion word can be determined by comparing the semantic vectors.
To realize the above mapping, as shown in Fig. 2, a neural network algorithm can map a search keyword or an expansion word into the semantic vector space to obtain the corresponding semantic vector. For example, when the search keyword input by the user is "apple phone", mapping "apple phone" into the semantic vector space yields a corresponding semantic vector 1, say X. When there is a candidate word "iphone6", suppose that mapping "iphone6" into the semantic vector space yields a corresponding semantic vector 2, say Y. If the similarity between vector X and vector Y reaches the preset similarity, the candidate word "iphone6" is considered to have a high semantic relatedness to the search keyword "apple phone"; the candidate word "iphone6" can therefore serve as the expansion word corresponding to the search keyword "apple phone", and the search keyword "apple phone" is rewritten to "iphone6".
When mapping a search keyword or expansion word into the semantic vector space to obtain the corresponding semantic vector, one exemplary embodiment maps the search keyword or expansion word directly to the corresponding semantic vector. In another exemplary embodiment, as shown in Fig. 3, the process may include: mapping each of the word segments that make up the search keyword or expansion word into the semantic vector space by a neural network algorithm to obtain the corresponding segment vectors; and then combining, according to a preset strategy, the segment vectors corresponding to all segments of the search keyword or expansion word, taking the resulting whole-term vector as the semantic vector described above. Mapping each segment to its own segment vector helps to reduce the complexity of the processing.
For example, when the search keyword input by the user is "apple phone", suppose that after word segmentation the corresponding segments include segment 11, "apple", and segment 12, "phone". Mapping each segment into the semantic vector space then yields the corresponding segment vectors: segment vector 31, i.e. X1, for the segment "apple", and segment vector 32, i.e. X2, for the segment "phone". Similarly, suppose there is a candidate word "iphone6"; segmenting it yields segment 21, "iphone", and segment 22, "6", and mapping each segment into the semantic vector space yields segment vector 41, i.e. Y1, for the segment "iphone" and segment vector 42, i.e. Y2, for the segment "6".
Then, all segment vectors corresponding to the search keyword "apple phone" (i.e. segment vector 31 "X1" and segment vector 32 "X2") are combined according to the preset strategy into a corresponding whole-term vector 1, say X; likewise, all segment vectors corresponding to the candidate word "iphone6" (i.e. segment vector 41 "Y1" and segment vector 42 "Y2") are combined into a corresponding whole-term vector 2, say Y. In this way, the semantic relatedness analysis between the search keyword "apple phone" and the candidate word "iphone6" is converted into a similarity analysis between whole-term vector 1 "X" and whole-term vector 2 "Y".
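The segment-then-combine mapping above can be sketched as follows; the lookup table stands in for the neural-network mapping, the element-wise mean stands in for the preset strategy, and all vector values are invented:

```python
# Hypothetical segment-vector table standing in for the neural-network
# mapping from segments to segment vectors; all values are invented.
SEGMENT_VECTORS = {
    "apple":  [0.8, 0.1, 0.2],   # segment vector 31, X1
    "phone":  [0.6, 0.3, 0.1],   # segment vector 32, X2
    "iphone": [0.7, 0.2, 0.2],   # segment vector 41, Y1
    "6":      [0.7, 0.2, 0.1],   # segment vector 42, Y2
}

def whole_term_vector(segments, table=SEGMENT_VECTORS):
    # Combine segment vectors by element-wise mean (one possible preset
    # strategy; the application does not fix the strategy).
    vecs = [table[s] for s in segments]
    return [sum(col) / len(vecs) for col in zip(*vecs)]

X = whole_term_vector(["apple", "phone"])   # whole-term vector 1
Y = whole_term_vector(["iphone", "6"])      # whole-term vector 2
```

The relatedness question then reduces to comparing X and Y with whatever similarity measure the system uses.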
Clearly there is no literal text similarity at all between the search keyword "apple phone" and the word "iphone6", and setting up semantic rules between the two would be very difficult, so the technical schemes of the related art can hardly realize such QR processing accurately. In the present application, by mapping the search keyword and the candidate word respectively to whole-term vector 1 (which can serve as the semantic vector of the search keyword) and whole-term vector 2 (which can serve as the semantic vector of the candidate word) in the semantic vector space, the hard problem of determining the semantic relatedness between the search keyword and the candidate word is converted into the relatively simple similarity comparison between whole-term vector 1 and whole-term vector 2, so that QR processing can be performed more accurately and conveniently to determine the expansion word corresponding to the search keyword.
2. QR processing based on sample training
So that every word can be mapped correctly into the semantic vector space — that is, every segment can be correctly mapped to a segment vector in the semantic vector space and then combined into the whole-term vector of the corresponding word (which can serve as that word's semantic vector) — sample training can be used to obtain in advance the segment vector of every possible segment in the semantic vector space. The technical scheme of the present application is described in detail below, in the order in which sample training and QR processing are performed.
Fig. 4 is a flowchart of another query rewriting method according to an exemplary embodiment of the present application. As shown in Fig. 4, the method may include the following steps:
Step 402: extract training samples.
In one case, users' historical behavior can to a great extent reflect the semantic relatedness between search keywords and expansion words, so suitable training samples can be chosen on the basis of that historical behavior. For example, a training sample may include a historical search keyword extracted from the historical search-and-click logs together with the history expansion word of the clicked business object. For instance, when a user historically input the search keyword "apple phone" and clicked a certain business object in the search results, and the history expansion word corresponding to that business object was "iphone6", the historical search keyword "apple phone" can serve as the sample search keyword and the history expansion word "iphone6" as the sample expansion word.
In another case, corresponding training samples can be obtained from data or information related to the clicked business object. For example, training samples may come from:
1) A historical search keyword extracted from the historical search-and-click logs, paired with a predicted expansion word extracted from the displayed content of the clicked business object. For example, when the historical search keyword is "apple phone" and the displayed content of the clicked business object also contains "iphone6P", the word "iphone6P" is considered to have a high semantic relatedness to the historical search keyword "apple phone" and is therefore taken as a predicted expansion word. Here the historical search keyword "apple phone" can serve as the sample search keyword and the predicted expansion word "iphone6P" as the sample expansion word.
2) A history expansion word extracted from the historical search-and-click logs, paired with a predicted search keyword extracted from the displayed content of the clicked business object. For example, when the history expansion word is "iphone6" and the displayed content of the clicked business object also contains "latest apple model", the phrase "latest apple model" is considered to have a high semantic relatedness to the history expansion word "iphone6" and is therefore taken as a predicted search keyword. Here the predicted search keyword "latest apple model" can serve as the sample search keyword and the history expansion word "iphone6" as the sample expansion word.
3) A predicted search keyword and a predicted expansion word both extracted from the displayed content of the clicked business object. For example, when the historical search keyword is "apple phone" and the history expansion word corresponding to the clicked business object is "iphone6", if the displayed content of that business object also contains "latest apple model" and "iphone6P", and the phrase "latest apple model" is considered to have a high semantic relatedness to the word "iphone6P", then "latest apple model" is taken as a predicted search keyword and "iphone6P" as a predicted expansion word. Here the predicted search keyword "latest apple model" can serve as the sample search keyword and the predicted expansion word "iphone6P" as the sample expansion word.
In yet another case, on the basis of their own knowledge and judgment, users can actively create search keywords and corresponding expansion words that they consider highly semantically related; the user-created search keyword then serves as the sample search keyword and the user-created expansion word as the sample expansion word.
Of course, the three cases above, together with the three specific implementations within the second case, can be considered to enumerate five sources of training samples. Any one or more of these implementations can be chosen as the source of training samples in the technical scheme of the present application. Alternatively, some of them can be treated as required implementations and the rest as optional supplements — for example, the first case as a required implementation and the other four implementations as optional supplements.
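Source (1) — pairing a historical search keyword with the history expansion word of the clicked business object — can be sketched as follows; the log-record layout and field names are assumptions made purely for illustration:

```python
# Hypothetical click-log records; the field names are invented.
click_log = [
    {"query": "apple phone",   "clicked_item_expansion": "iphone6"},
    {"query": "running shoes", "clicked_item_expansion": "sneakers"},
]

def extract_samples(log):
    # Each record yields one (sample search keyword, sample expansion word)
    # pair, i.e. one sample feature phrase for step 502 below.
    return [(rec["query"], rec["clicked_item_expansion"]) for rec in log]

samples = extract_samples(click_log)
```

The other four sources would differ only in which log fields (or user-created entries) supply the two sides of each pair.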
Step 404: train the segment vectors.
The training process of the segment vectors in this step is described in detail with reference to Fig. 5 and Fig. 6, where Fig. 5 is a flowchart of a sample training process for realizing query rewriting according to an exemplary embodiment of the present application, and Fig. 6 is a schematic diagram of yet another kind of query rewriting according to an exemplary embodiment of the present application. As shown in Fig. 5, the sample training process may include the following steps:
Step 502: obtain sample feature phrases.
In this embodiment, because the training samples extracted in step 402 pair each sample search keyword with a sample expansion word, each mutually corresponding pair of a sample search keyword and a sample expansion word is taken as one sample feature phrase, and the sample search keyword and the sample expansion word each serve as a sample feature word within that phrase.
Step 504A: perform word segmentation on the sample search keyword in the sample feature phrase to obtain all segments of the sample search keyword.
As shown in Fig. 6, segmenting the sample search keyword yields, for example, sample segment 11' and sample segment 12'. Assuming the sample search keyword is "apple phone", sample segment 11' may be "apple" and sample segment 12' may be "phone".
Step 506A: generate sample segment vectors.
In this embodiment, for sample segments 11' and 12' above, the corresponding sample segment vectors 31' and 32' are generated respectively. For example, suppose sample segment vector 31' is X1 and sample segment vector 32' is X2; when the semantic vector space has n dimensions, X1 and X2 are n-dimensional vectors, e.g. X1 = {x11, x12, x13, ..., x1n} and X2 = {x21, x22, x23, ..., x2n}.
Because each sample segment vector will subsequently be trained, no requirement is placed here on its concrete value in any dimension, as long as every sample segment vector has n dimensions. For example, each sample segment vector can be generated by random initialization, with a random number in every dimension; that is, the values xi1, xi2, ..., xin of any sample segment vector Xi in its dimensions are random values.
Step 508A: generate the sample whole-term vector.
In this embodiment, the segments of the sample search keyword correspond respectively to sample segment vector 31', sample segment vector 32', and so on; combining all these sample segment vectors according to a preset strategy yields the sample whole-term vector 1' corresponding to the sample search keyword. The present application does not restrict this preset strategy, as long as it is repeatably feasible and the generated sample whole-term vector 1' has the same dimension as the sample segment vectors (i.e. the n dimensions above); any such strategy can be applied in the technical scheme of the present application.
For example, the values of all sample segment vectors corresponding to the sample search keyword in each dimension can be computed according to a preset algorithm implementing the preset strategy, to obtain the value of sample whole-term vector 1' in each dimension. The preset algorithm may be an averaging algorithm, a weighted-averaging algorithm, or the like; the present application places no restriction on it.
For instance, when the sample search keyword corresponds to sample segment vectors 31' and 32', i.e. vectors X1 and X2, and the preset algorithm is the averaging algorithm, averaging the values of sample segment vectors 31' and 32' in each dimension yields the corresponding sample whole-term vector 1' as X' = {x1', x2', ..., xn'}, where x1' = (x11 + x21)/2, x2' = (x12 + x22)/2, ..., xn' = (x1n + x2n)/2.
Of course, the generation of sample whole-term vector 1' can also be made easier to operate as follows: when the semantic vector space has n dimensions, the n-dimensional segment vectors corresponding to the m segments that make up any feature word form an m × n feature matrix in the semantic vector space; the m elements in each column of the feature matrix are computed according to the preset algorithm to obtain the value of the whole-term vector of that feature word in the corresponding dimension; and the per-column results are combined into the n-dimensional whole-term vector of that feature word.
For instance, when the sample search keyword corresponds to sample segment vectors 31' and 32', each of 9 dimensions, i.e. m = 2 and n = 9, the feature matrix Wx formed by sample segment vectors 31' and 32' has X1 = {x11, ..., x19} as its first row and X2 = {x21, ..., x29} as its second row.
Then, the 2 (m = 2) elements in each column of the feature matrix Wx are computed according to the preset algorithm, yielding the sample whole-term vector 1', i.e. X' = {x1', x2', ..., x9'}. If the preset algorithm is the averaging algorithm, then x1' = (x11 + x21)/2, x2' = (x12 + x22)/2, ..., x9' = (x19 + x29)/2. If the preset algorithm is the weighted-averaging algorithm, the value of sample whole-term vector 1' in each dimension can be computed as x1' = x11 × a1 + x21 × a2, x2' = x12 × b1 + x22 × b2, ..., x9' = x19 × i1 + x29 × i2, where a1, a2, etc. are the weights of the respective elements. In the weighted-averaging algorithm, the weight of each element in a column can be positively correlated with the term frequency of the segment corresponding to that element; for example, the weights can be obtained by the TF-IDF algorithm, though of course the present application places no restriction on this.
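The column-wise combination of the m × n feature matrix can be sketched as follows, covering both the plain average and a weighted average (weights such as TF-IDF values supplied by the caller); the matrix values are invented:

```python
def combine_columns(matrix, weights=None):
    # Collapse an m x n feature matrix to an n-dimensional whole-term
    # vector by taking the (optionally weighted) mean of each column.
    m = len(matrix)
    if weights is None:
        weights = [1.0 / m] * m  # plain averaging algorithm
    return [sum(w * row[j] for w, row in zip(weights, matrix))
            for j in range(len(matrix[0]))]

# m = 2 segment vectors of dimension n = 9 (toy values for X1 and X2).
Wx = [
    [1, 2, 3, 4, 5, 6, 7, 8, 9],   # X1
    [9, 8, 7, 6, 5, 4, 3, 2, 1],   # X2
]
averaged = combine_columns(Wx)                       # averaging algorithm
weighted = combine_columns(Wx, weights=[0.7, 0.3])   # e.g. TF-IDF weights
```

With unit-sum weights the plain average is just the special case where every segment weighs 1/m.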
Analogously to steps 504A-508A, in steps 504B, 506B and 508B the corresponding sample segment vectors (such as sample segment vectors 41' and 42' shown in Fig. 6) are generated for all segments of the sample expansion word (such as sample segments 21' and 22' shown in Fig. 6), and all these sample segment vectors are combined, according to the preset strategy described above, into the corresponding sample whole-term vector 2', say Y'.
Step 510: train the samples.
In this embodiment, the similarity between sample whole-term vector 1' and sample whole-term vector 2' is calculated; suppose this is the initial similarity Z1. When the sample feature phrases were obtained in step 502, a preset association degree Z was set between the sample search keyword and the sample expansion word of each sample feature phrase; Z expresses the actual semantic relatedness between that sample search keyword and that sample expansion word. Because the values of the sample segment vectors generated in steps 506A and 506B are arbitrary in every dimension, the initial similarity Z1 between sample whole-term vector 1' and sample whole-term vector 2' usually does not match the preset association degree Z.
Therefore, with the preset association degree Z as the target, the sample segment vectors corresponding to the sample search keyword and the sample expansion word in the sample feature phrase — i.e. sample segment vectors 31', 32', 41' and 42' shown in Fig. 6 — can be trained by a neural network algorithm. Changing the value of each sample segment vector in each dimension changes the values of the corresponding sample whole-term vectors 1' and 2' in each dimension, and with them the similarity between the two, so that this similarity moves step by step from the initial similarity Z1 toward the preset association degree Z. When the similarity matches Z (equal to it, or differing from it by less than a preset value), training is considered complete.
Based on the above principle, a loss function can be established for the training operation, for example a squared-error loss of the form loss = (target − output)², where the target is the preset association degree Z described above and the output is the similarity between sample whole-term vector 1' and sample whole-term vector 2', whose initial value is the initial similarity Z1 described above.
In this way, back-propagation continually updates the hidden variables and activation-layer parameters of each layer of the neural network, together with the term vectors, until the loss function is minimized; the similarity between sample whole-term vector 1' and sample whole-term vector 2' then matches the preset association degree Z.
The preset association degree Z can be derived from the number of clicks, number of views, click ratio, view ratio, etc. of the corresponding sample feature phrase; for example, the higher the clicks/click ratio or views/view ratio, the larger the value of the corresponding preset association degree Z, indicating a higher semantic relatedness between the corresponding sample search keyword and sample expansion word. Of course, the preset association degree Z can also be determined from other parameters; the present application places no restriction on this.
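The training loop of step 510 can be sketched as follows under stated assumptions: cosine similarity as the similarity measure, the element-wise mean as the preset strategy, a squared-error loss, and finite-difference gradient updates standing in for back-propagation through a neural network; all vector values are invented:

```python
import math

def cosine(x, y):
    dot = sum(a * b for a, b in zip(x, y))
    nx = math.sqrt(sum(a * a for a in x))
    ny = math.sqrt(sum(b * b for b in y))
    return dot / (nx * ny)

def mean(vecs):
    return [sum(col) / len(vecs) for col in zip(*vecs)]

def loss(segs_q, segs_e, target):
    # Squared error between the preset association degree Z (target) and
    # the similarity of the two sample whole-term vectors (output).
    return (target - cosine(mean(segs_q), mean(segs_e))) ** 2

def train(segs_q, segs_e, target, lr=0.05, steps=600, eps=1e-4):
    # Nudge every segment-vector component to shrink the loss; central
    # finite differences stand in for back-propagation, for illustration.
    for _ in range(steps):
        for vecs in (segs_q, segs_e):
            for v in vecs:
                for j in range(len(v)):
                    old = v[j]
                    v[j] = old + eps
                    up = loss(segs_q, segs_e, target)
                    v[j] = old - eps
                    down = loss(segs_q, segs_e, target)
                    v[j] = old - lr * (up - down) / (2 * eps)
    return loss(segs_q, segs_e, target)

# Randomly-initialised toy segment vectors for the sample search keyword
# (two segments) and the sample expansion word (two segments).
q = [[0.9, -0.2, 0.1], [0.4, 0.8, -0.3]]
e = [[-0.5, 0.6, 0.2], [0.1, -0.7, 0.9]]
final_loss = train(q, e, target=0.9)
```

After training, the whole-term similarity sits near the preset association degree Z, and the adjusted segment vectors are the trained segment vectors of step 512.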
Step 512A: obtain the segment vectors.
In this embodiment, as shown in Fig. 6, after the similarity training between sample whole-term vector 1' and sample whole-term vector 2' is completed, the sample segment vectors corresponding to the sample search keyword are determined to have been trained into the corresponding segment vectors: sample segment vector 31' is trained into segment vector 31 (not shown) and sample segment vector 32' into segment vector 32 (not shown). Correspondingly, after training, sample whole-term vector 1' and sample whole-term vector 2' respectively become the whole-term vector 1 and whole-term vector 2 shown in Fig. 6.
Step 512B: obtain the segment vectors.
In this embodiment, as shown in Fig. 6, after the similarity training between sample whole-term vector 1' and sample whole-term vector 2' is completed, the sample segment vectors corresponding to the sample expansion word are determined to have been trained into the corresponding segment vectors: sample segment vector 41' is trained into segment vector 41 (not shown) and sample segment vector 42' into segment vector 42 (not shown). Correspondingly, after training, sample whole-term vector 1' and sample whole-term vector 2' respectively become the whole-term vector 1 and whole-term vector 2 shown in Fig. 6.
Step 406: combine whole-word vectors as the semantic vectors of the corresponding words.
In this embodiment, the training samples extracted in step 402 include many sample feature phrases. After each sample feature phrase is processed by the embodiment shown in Fig. 5, the word-segmentation result set consisting of the sample participles corresponding to all sample feature words can be obtained, and the sample participle vector corresponding to each sample participle in the set has been trained into the corresponding participle vector.
When whole-word vectors are combined in step 406, not only are the whole-word vectors corresponding to the sample feature words obtained by combination; when the sample participles in the word-segmentation result set can be arbitrarily combined into non-sample feature words, the whole-word vectors corresponding to those non-sample feature words are also obtained by combining the corresponding participle vectors. A non-sample feature word may be an alternative word, such as a bid word (Bidword) purchased by a merchant, or a search keyword that a user might input.
For example, suppose the sample participles of the sample search keyword "iPhone" and the sample expansion word "iphone6", together with the sample participle vectors and the participle vectors obtained by training, are as shown in Table 1 below. Then, besides combining participle vector P1 and participle vector P2 to obtain the semantic vector corresponding to the sample search keyword "iPhone", and combining participle vector Q1 and participle vector Q2 to obtain the semantic vector corresponding to the sample expansion word "iphone6", semantic vectors corresponding to words such as "apple iphone" can also be obtained by arbitrary combinations of the individual sample participles.
Table 1
It should be noted that:
When combining participle vectors to obtain a whole-word vector, the "preset strategy" used in the training process of step 404 should be followed; that is, it must be consistent with the "preset strategy" used in steps 508A and 508B for combining sample participle vectors into sample whole-word vectors, such as taking the plain average or the weighted average of the values of all participle vectors in the same dimension.
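The two combination strategies just mentioned (plain average and weighted average over each dimension) can be sketched as follows; the function names are illustrative, and the frequency weighting follows the positive-correlation rule stated later in the text:

```python
import numpy as np

def combine_mean(seg_vecs):
    """Plain-average preset strategy: average the participle vectors
    dimension by dimension to obtain the whole-word vector."""
    return np.mean(np.asarray(seg_vecs, dtype=float), axis=0)

def combine_weighted(seg_vecs, freqs):
    """Weighted-average preset strategy: each participle's weight is
    proportional to its occurrence frequency."""
    w = np.asarray(freqs, dtype=float)
    return (w / w.sum()) @ np.asarray(seg_vecs, dtype=float)
```

For two 2-dimensional participle vectors [1, 2] and [3, 4], `combine_mean` yields the whole-word vector [2, 3]; the same m participles in an n-dimensional space form the m × n feature matrix that the preset strategy reduces column by column.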
When calculating the similarity between two vectors, various approaches are in fact available. For example, the similarity of the two vectors themselves can be computed directly, such as the cosine distance or the Pearson correlation coefficient; alternatively, the degree of association between the corresponding search keyword and expansion word can be compared by mapping through a neural network layer. Other approaches may also be adopted, and this application imposes no limitation on this.
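The two direct similarity measures named above can be sketched in a few lines (a minimal sketch; a production system would use a vetted library implementation):

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors, in [-1, 1]."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def pearson_correlation(a, b):
    """Pearson correlation coefficient: cosine similarity of the
    mean-centred vectors."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return cosine_similarity(a - a.mean(), b - b.mean())
```

Note that Pearson correlation is simply cosine similarity after subtracting each vector's mean, which is why the two measures often behave similarly on centred embeddings.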
Step 408: generate a QR (query rewriting) list.
Step 410: perform QR processing.
In this embodiment, the correspondence between predefined search keywords and expansion words is recorded in the QR list; for each pair of search keyword and expansion word in the correspondence, the similarity between their respective semantic vectors in the semantic vector space reaches the preset similarity.
Therefore, for a search keyword actually input by a user, it is only necessary to look up and extract the corresponding word from the QR list, and that word can be used as the expansion word of the search keyword. Since the expansion word is guaranteed to have a high semantic relevance to the search keyword, accurate QR processing can be achieved and the user's search need can be met.
Of course, the search keyword input by the user may be absent from the QR list, or a QR list may not have been built in advance. In that case, the search keyword can be segmented into participles; according to the sample participles in the above word-segmentation result set corresponding to the obtained participles, the participle vectors corresponding to those sample participles are combined into the semantic vector of the search keyword. This semantic vector is then compared with the semantic vectors of alternative words, and the alternative words whose similarity to it reaches the preset similarity are chosen as the expansion words of the search keyword.
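The lookup-then-fallback flow described above can be sketched as follows. All names (`expand_query`, the dictionary layouts, using plain averaging as the preset strategy) are illustrative assumptions, not from the patent text:

```python
import numpy as np

def expand_query(query, qr_list, seg_vectors, candidate_vecs, threshold, segment):
    """Return expansion words for `query`: first try the precomputed QR list;
    if absent, compose a semantic vector from the query's participle vectors
    and compare it against candidate (e.g. bid-word) vectors."""
    if query in qr_list:
        return qr_list[query]
    segs = [s for s in segment(query) if s in seg_vectors]
    if not segs:
        return []
    # preset strategy: average the participle vectors per dimension
    qvec = np.mean([seg_vectors[s] for s in segs], axis=0)

    def cos(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

    return [w for w, v in candidate_vecs.items() if cos(qvec, v) >= threshold]
```

A Chinese word segmenter would replace the `segment` callback in practice; `str.split` works for whitespace-delimited toy input.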
Further, in step 410 it can be ensured that the expansion word obtained by QR processing belongs to the same business object category as the search keyword. For example, when the user inputs "iPhone", the business object category of the search keyword is automatically recognized as "electronic product", and QR processing yields expansion words such as "iphone6" under the "electronic product" category rather than expansion words such as "iPhone model" under the "handicraft" category. The business object category of a search keyword can be determined by obtaining the user's historical behavior data; for example, the historical behavior data may include various data such as the user's historical searches, historical views, historical clicks, historical favorites, and historical purchases.
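A minimal sketch of the category constraint above, assuming the dominant category in the user's historical behavior data is taken as the query's category (the majority-vote choice and all names are illustrative assumptions):

```python
from collections import Counter

def infer_category(history_categories):
    """Pick the most frequent business-object category in the user's
    historical behavior data (searches, views, clicks, purchases, ...)."""
    if not history_categories:
        return None
    return Counter(history_categories).most_common(1)[0][0]

def filter_expansions(expansions, category_of, query_category):
    """Keep only expansion words in the same category as the query."""
    return [w for w in expansions if category_of.get(w) == query_category]
```

With a history dominated by "electronics", an expansion list is reduced to electronics items, matching the "iphone6" vs. "iPhone model" example in the text.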
Fig. 7 shows a schematic structural diagram of an electronic device according to an exemplary embodiment of the application. Referring to Fig. 7, at the hardware level the electronic device includes a processor, an internal bus, a network interface, memory, and non-volatile storage, and of course may also include hardware required for other services. The processor reads the corresponding computer program from the non-volatile storage into memory and runs it, forming the query rewriting apparatus at the logical level. Of course, besides the software implementation, the application does not exclude other implementations, such as logic devices or a combination of software and hardware; that is, the executor of the following processing flow is not limited to logical units and may also be hardware or logic devices.
Referring to Fig. 8, in a software implementation the query rewriting apparatus may include a receiving unit, a choosing unit, and a rewriting unit, wherein:
the receiving unit receives a search keyword input by a user;
the choosing unit chooses an expansion word corresponding to the search keyword, wherein the similarity between the respective semantic vectors of the expansion word and the search keyword in a semantic vector space of a preset dimension reaches a preset similarity;
the rewriting unit rewrites the search keyword into the chosen expansion word.
Optionally, the choosing unit is specifically configured to:
retrieve a predefined correspondence between search keywords and expansion words, wherein for each pair of search keyword and expansion word recorded in the correspondence, the similarity between their respective semantic vectors in the semantic vector space reaches the preset similarity;
obtain the expansion word corresponding to the search keyword from the correspondence.
Optionally, the semantic vector is obtained by mapping the corresponding search keyword or expansion word into the semantic vector space through a neural network algorithm.
Optionally, a search keyword or an expansion word is mapped into the semantic vector space to obtain the corresponding semantic vector in the following manner:
all participles constituting the search keyword or expansion word are respectively mapped into the semantic vector space through a neural network algorithm to obtain the corresponding participle vectors; the participle vectors respectively corresponding to all participles constituting the search keyword or expansion word are combined according to a preset strategy, and the resulting whole-word vector is taken as the semantic vector.
Optionally, the participles corresponding to the participle vectors belong to the word-segmentation result set corresponding to all sample feature words serving as training samples, wherein a sample feature word is a sample search keyword or a sample expansion word, and each sample search keyword together with each associated sample expansion word constitutes a sample feature phrase with a preset degree of association.
Further, when each participle in the word-segmentation result set corresponds to a sample participle vector whose values in the dimensions of the semantic vector space are arbitrary initial values, the sample participle vectors respectively corresponding to all participles constituting any sample feature word are combined, according to the preset strategy, into the sample whole-word vector corresponding to that sample feature word; a corresponding initial similarity then exists between the sample whole-word vectors respectively corresponding to the sample search keyword and the sample expansion word in any sample feature phrase.
With the preset degree of association corresponding to the sample feature phrase as the target, the sample participle vectors corresponding to the sample whole-word vectors of the sample search keyword and the sample expansion word in the sample feature phrase are trained by the neural network algorithm. If the training result changes the initial similarity so that it matches the preset degree of association, it is determined that all participles corresponding to the sample feature phrase have been mapped into the semantic vector space, and the sample participle vectors corresponding to the sample whole-word vectors of the sample search keyword and the sample expansion word have been trained into the participle vectors corresponding to the respective participles.
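The training target just described — nudge the sample participle vectors until the similarity of the composed whole-word vectors matches the preset degree of association — can be sketched with a toy gradient descent. Finite-difference gradients stand in for the back-propagation the text describes, plain averaging stands in for the preset strategy, and all names are illustrative:

```python
import numpy as np

def compose(seg_vecs):
    # preset strategy: average the sample participle vectors per dimension
    return np.mean(seg_vecs, axis=0)

def cos_sim(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def train_pair(kw_segs, ex_segs, target_z, lr=0.1, steps=600, eps=1e-4):
    """Adjust the participle vectors of one sample feature phrase until the
    similarity of the composed whole-word vectors matches the preset degree
    of association target_z (toy stand-in for back-propagation)."""
    params = np.concatenate([kw_segs.ravel(), ex_segs.ravel()])
    k_size, k_shape, e_shape = kw_segs.size, kw_segs.shape, ex_segs.shape

    def loss(p):
        k = p[:k_size].reshape(k_shape)
        e = p[k_size:].reshape(e_shape)
        return (cos_sim(compose(k), compose(e)) - target_z) ** 2

    for _ in range(steps):
        base = loss(params)
        grad = np.zeros_like(params)
        for i in range(params.size):   # numeric gradient: toy scale only
            params[i] += eps
            grad[i] = (loss(params) - base) / eps
            params[i] -= eps
        params -= lr * grad
    return params[:k_size].reshape(k_shape), params[k_size:].reshape(e_shape)
```

A real implementation would minimize this loss over all sample feature phrases jointly, with analytic gradients flowing through the hidden and activation layers as the embodiment describes.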
Optionally, the training samples come from at least one of the following:
a historical search keyword extracted from historical search click logs and the history expansion word corresponding to the clicked business object;
the historical search keyword and a predicted expansion word extracted from the display content of the clicked business object;
the history expansion word and a predicted search keyword extracted from the display content of the clicked business object;
a predicted search keyword and a predicted expansion word extracted from the display content of the clicked business object;
a user-created search keyword and a user-created expansion word;
wherein the historical search keywords, the predicted search keywords, and the user-created search keywords serve as sample search keywords, and the history expansion words, the predicted expansion words, and the user-created expansion words serve as sample expansion words.
Optionally, the preset strategy includes:
when the semantic vector space has n dimensions, composing the n-dimensional participle vectors, in the semantic vector space, of all m participles constituting any word into a feature matrix of size m × n;
calculating the m elements of each column of the feature matrix according to a preset algorithm to obtain the value of the whole-word vector of the word in the corresponding dimension;
combining the calculation results of the columns into an n-dimensional whole-word vector as the semantic vector of the word in the semantic vector space.
Optionally, the preset algorithm is either of the following:
an averaging algorithm;
a weighted-averaging algorithm, in which the weight of each element in the same column is positively correlated with the occurrence frequency of the participle corresponding to that element.
Optionally, the expansion word belongs to the same business object category as the search keyword.
Optionally, the apparatus further includes:
an acquiring unit, which obtains the user's historical behavior data;
a determining unit, which determines, according to the historical behavior data, the business object category to which the search keyword belongs.
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, a network interface, and memory.
The memory may include forms of volatile memory in computer-readable media, such as random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash RAM. Memory is an example of a computer-readable medium.
Computer-readable media include permanent and non-permanent, removable and non-removable media, and may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory media, such as modulated data signals and carrier waves.
It should also be noted that the terms "comprise", "include", and any variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or device including a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or device. Absent further limitation, an element defined by the phrase "including a ..." does not exclude the existence of other identical elements in the process, method, article, or device that includes the element.
The above are merely preferred embodiments of the application and are not intended to limit it; any modification, equivalent replacement, or improvement made within the spirit and principles of the application shall fall within the scope of its protection.
Claims (20)
1. A query rewriting method, characterized by comprising:
receiving a search keyword input by a user;
choosing an expansion word corresponding to the search keyword, wherein the similarity between the respective semantic vectors of the expansion word and the search keyword in a semantic vector space of a preset dimension reaches a preset similarity;
rewriting the search keyword into the chosen expansion word.
2. The method according to claim 1, characterized in that choosing the expansion word corresponding to the search keyword comprises:
retrieving a predefined correspondence between search keywords and expansion words, wherein for each pair of search keyword and expansion word recorded in the correspondence, the similarity between their respective semantic vectors in the semantic vector space reaches the preset similarity;
obtaining the expansion word corresponding to the search keyword from the correspondence.
3. The method according to claim 1, characterized in that the semantic vector is obtained by mapping the corresponding search keyword or expansion word into the semantic vector space through a neural network algorithm.
4. The method according to claim 3, characterized in that a search keyword or an expansion word is mapped into the semantic vector space to obtain the corresponding semantic vector in the following manner:
mapping, through a neural network algorithm, all participles constituting the search keyword or expansion word into the semantic vector space respectively to obtain the corresponding participle vectors; combining, according to a preset strategy, the participle vectors respectively corresponding to all participles constituting the search keyword or expansion word, and taking the resulting whole-word vector as the semantic vector.
5. The method according to claim 4, characterized in that the participles corresponding to the participle vectors belong to the word-segmentation result set corresponding to all sample feature words serving as training samples, wherein a sample feature word is a sample search keyword or a sample expansion word, and each sample search keyword together with each associated sample expansion word constitutes a sample feature phrase with a preset degree of association;
and in that, when each participle in the word-segmentation result set corresponds to a sample participle vector whose values in the dimensions of the semantic vector space are arbitrary initial values, the sample participle vectors respectively corresponding to all participles constituting any sample feature word are combined, according to the preset strategy, into the sample whole-word vector corresponding to that sample feature word, and a corresponding initial similarity exists between the sample whole-word vectors respectively corresponding to the sample search keyword and the sample expansion word in any sample feature phrase;
wherein, when, with the preset degree of association corresponding to the sample feature phrase as the target, the sample participle vectors corresponding to the sample whole-word vectors of the sample search keyword and the sample expansion word in the sample feature phrase are trained by the neural network algorithm, if the training result changes the initial similarity so that it matches the preset degree of association, it is determined that all participles corresponding to the sample feature phrase have been mapped into the semantic vector space, and the sample participle vectors corresponding to the sample whole-word vectors of the sample search keyword and the sample expansion word have been trained into the participle vectors corresponding to the respective participles.
6. The method according to claim 5, characterized in that the training samples come from at least one of the following:
a historical search keyword extracted from historical search click logs and the history expansion word corresponding to the clicked business object;
the historical search keyword and a predicted expansion word extracted from the display content of the clicked business object;
the history expansion word and a predicted search keyword extracted from the display content of the clicked business object;
a predicted search keyword and a predicted expansion word extracted from the display content of the clicked business object;
a user-created search keyword and a user-created expansion word;
wherein the historical search keywords, the predicted search keywords, and the user-created search keywords serve as sample search keywords, and the history expansion words, the predicted expansion words, and the user-created expansion words serve as sample expansion words.
7. The method according to claim 4, characterized in that the preset strategy comprises:
when the semantic vector space has n dimensions, composing the n-dimensional participle vectors, in the semantic vector space, of all m participles constituting any word into a feature matrix of size m × n;
calculating the m elements of each column of the feature matrix according to a preset algorithm to obtain the value of the whole-word vector of the word in the corresponding dimension;
combining the calculation results of the columns into an n-dimensional whole-word vector as the semantic vector of the word in the semantic vector space.
8. The method according to claim 7, characterized in that the preset algorithm is either of the following:
an averaging algorithm;
a weighted-averaging algorithm, in which the weight of each element in the same column is positively correlated with the occurrence frequency of the participle corresponding to that element.
9. The method according to claim 1, characterized in that the expansion word belongs to the same business object category as the search keyword.
10. The method according to claim 9, characterized by further comprising:
obtaining historical behavior data of the user;
determining, according to the historical behavior data, the business object category to which the search keyword belongs.
11. A query rewriting apparatus, characterized by comprising:
a receiving unit, which receives a search keyword input by a user;
a choosing unit, which chooses an expansion word corresponding to the search keyword, wherein the similarity between the respective semantic vectors of the expansion word and the search keyword in a semantic vector space of a preset dimension reaches a preset similarity;
a rewriting unit, which rewrites the search keyword into the chosen expansion word.
12. The apparatus according to claim 11, characterized in that the choosing unit is specifically configured to:
retrieve a predefined correspondence between search keywords and expansion words, wherein for each pair of search keyword and expansion word recorded in the correspondence, the similarity between their respective semantic vectors in the semantic vector space reaches the preset similarity;
obtain the expansion word corresponding to the search keyword from the correspondence.
13. The apparatus according to claim 11, characterized in that the semantic vector is obtained by mapping the corresponding search keyword or expansion word into the semantic vector space through a neural network algorithm.
14. The apparatus according to claim 13, characterized in that a search keyword or an expansion word is mapped into the semantic vector space to obtain the corresponding semantic vector in the following manner:
mapping, through a neural network algorithm, all participles constituting the search keyword or expansion word into the semantic vector space respectively to obtain the corresponding participle vectors; combining, according to a preset strategy, the participle vectors respectively corresponding to all participles constituting the search keyword or expansion word, and taking the resulting whole-word vector as the semantic vector.
15. The apparatus according to claim 14, characterized in that the participles corresponding to the participle vectors belong to the word-segmentation result set corresponding to all sample feature words serving as training samples, wherein a sample feature word is a sample search keyword or a sample expansion word, and each sample search keyword together with each associated sample expansion word constitutes a sample feature phrase with a preset degree of association;
and in that, when each participle in the word-segmentation result set corresponds to a sample participle vector whose values in the dimensions of the semantic vector space are arbitrary initial values, the sample participle vectors respectively corresponding to all participles constituting any sample feature word are combined, according to the preset strategy, into the sample whole-word vector corresponding to that sample feature word, and a corresponding initial similarity exists between the sample whole-word vectors respectively corresponding to the sample search keyword and the sample expansion word in any sample feature phrase;
wherein, when, with the preset degree of association corresponding to the sample feature phrase as the target, the sample participle vectors corresponding to the sample whole-word vectors of the sample search keyword and the sample expansion word in the sample feature phrase are trained by the neural network algorithm, if the training result changes the initial similarity so that it matches the preset degree of association, it is determined that all participles corresponding to the sample feature phrase have been mapped into the semantic vector space, and the sample participle vectors corresponding to the sample whole-word vectors of the sample search keyword and the sample expansion word have been trained into the participle vectors corresponding to the respective participles.
16. The apparatus according to claim 15, characterized in that the training samples come from at least one of the following:
a historical search keyword extracted from historical search click logs and the history expansion word corresponding to the clicked business object;
the historical search keyword and a predicted expansion word extracted from the display content of the clicked business object;
the history expansion word and a predicted search keyword extracted from the display content of the clicked business object;
a predicted search keyword and a predicted expansion word extracted from the display content of the clicked business object;
a user-created search keyword and a user-created expansion word;
wherein the historical search keywords, the predicted search keywords, and the user-created search keywords serve as sample search keywords, and the history expansion words, the predicted expansion words, and the user-created expansion words serve as sample expansion words.
17. The apparatus according to claim 14, characterized in that the preset strategy comprises:
when the semantic vector space has n dimensions, composing the n-dimensional participle vectors, in the semantic vector space, of all m participles constituting any word into a feature matrix of size m × n;
calculating the m elements of each column of the feature matrix according to a preset algorithm to obtain the value of the whole-word vector of the word in the corresponding dimension;
combining the calculation results of the columns into an n-dimensional whole-word vector as the semantic vector of the word in the semantic vector space.
18. The apparatus according to claim 17, characterized in that the preset algorithm is either of the following:
an averaging algorithm;
a weighted-averaging algorithm, in which the weight of each element in the same column is positively correlated with the occurrence frequency of the participle corresponding to that element.
19. The apparatus according to claim 11, characterized in that the expansion word belongs to the same business object category as the search keyword.
20. The apparatus according to claim 19, characterized by further comprising:
an acquiring unit, which obtains the user's historical behavior data;
a determining unit, which determines, according to the historical behavior data, the business object category to which the search keyword belongs.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510689095.7A CN106610972A (en) | 2015-10-21 | 2015-10-21 | Query rewriting method and apparatus |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510689095.7A CN106610972A (en) | 2015-10-21 | 2015-10-21 | Query rewriting method and apparatus |
Publications (1)
Publication Number | Publication Date |
---|---|
CN106610972A true CN106610972A (en) | 2017-05-03 |
Family
ID=58610888
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201510689095.7A Pending CN106610972A (en) | 2015-10-21 | 2015-10-21 | Query rewriting method and apparatus |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106610972A (en) |
Cited By (31)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107679119A (en) * | 2017-09-19 | 2018-02-09 | 北京京东尚科信息技术有限公司 | The method and apparatus for generating brand derivative words |
CN107862015A (en) * | 2017-10-30 | 2018-03-30 | 北京奇艺世纪科技有限公司 | A kind of crucial word association extended method and device |
CN108121697A (en) * | 2017-11-16 | 2018-06-05 | 北京百度网讯科技有限公司 | Method, apparatus, equipment and the computer storage media that a kind of text is rewritten |
CN108182200A (en) * | 2017-11-29 | 2018-06-19 | 有米科技股份有限公司 | Keyword expanding method and device based on semantic similarity |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102253982A (en) * | 2011-06-24 | 2011-11-23 | 北京理工大学 | Query suggestion method based on query semantics and click-through data |
CN103020164A (en) * | 2012-11-26 | 2013-04-03 | 华北电力大学 | Semantic search method based on multi-semantic analysis and personalized sequencing |
CN104375989A (en) * | 2014-12-01 | 2015-02-25 | 国家电网公司 | Natural language text keyword association network construction system |
CN104391963A (en) * | 2014-12-01 | 2015-03-04 | 北京中科创益科技有限公司 | Method for constructing correlation networks of keywords of natural language texts |
CN104765769A (en) * | 2015-03-06 | 2015-07-08 | 大连理工大学 | Short text query expansion and indexing method based on word vector |
CN104933183A (en) * | 2015-07-03 | 2015-09-23 | 重庆邮电大学 | Query term rewriting method combining word vector model and naive Bayes |
- 2015-10-21: CN application CN201510689095.7A filed, published as CN106610972A, status Pending
Cited By (42)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10650102B2 (en) * | 2017-06-19 | 2020-05-12 | Beijing Baidu Netcom Science And Technology Co., Ltd. | Method and apparatus for generating parallel text in same language |
CN107679119A (en) * | 2017-09-19 | 2018-02-09 | 北京京东尚科信息技术有限公司 | Method and apparatus for generating brand derivative words |
CN107679119B (en) * | 2017-09-19 | 2020-06-30 | 北京京东尚科信息技术有限公司 | Method and device for generating brand derivative words |
CN110019646A (en) * | 2017-10-12 | 2019-07-16 | 北京京东尚科信息技术有限公司 | Method and apparatus for establishing an index |
CN107862015A (en) * | 2017-10-30 | 2018-03-30 | 北京奇艺世纪科技有限公司 | Keyword association expansion method and device |
CN108121697A (en) * | 2017-11-16 | 2018-06-05 | 北京百度网讯科技有限公司 | Text rewriting method, apparatus, device, and computer storage medium |
CN108182200B (en) * | 2017-11-29 | 2020-10-23 | 有米科技股份有限公司 | Keyword expansion method and device based on semantic similarity |
CN108182200A (en) * | 2017-11-29 | 2018-06-19 | 有米科技股份有限公司 | Keyword expansion method and device based on semantic similarity |
CN108733766A (en) * | 2018-04-17 | 2018-11-02 | 腾讯科技(深圳)有限公司 | Data query method, apparatus, and readable medium |
CN108710607B (en) * | 2018-04-17 | 2022-04-19 | 达而观信息科技(上海)有限公司 | Text rewriting method and device |
CN108710607A (en) * | 2018-04-17 | 2018-10-26 | 达而观信息科技(上海)有限公司 | Text rewriting method and device |
CN108776901A (en) * | 2018-04-27 | 2018-11-09 | 微梦创科网络科技(中国)有限公司 | Advertisement recommendation method and system based on search terms |
CN108776901B (en) * | 2018-04-27 | 2021-01-15 | 微梦创科网络科技(中国)有限公司 | Advertisement recommendation method and system based on search terms |
CN108647349A (en) * | 2018-05-15 | 2018-10-12 | 优视科技有限公司 | Content recommendation method, device, and terminal device |
CN108874773A (en) * | 2018-05-31 | 2018-11-23 | 平安医疗科技有限公司 | Keyword addition method, apparatus, computer equipment, and storage medium |
CN108874773B (en) * | 2018-05-31 | 2023-04-18 | 平安医疗科技有限公司 | Keyword addition method and device, computer equipment, and storage medium |
CN110750617A (en) * | 2018-07-06 | 2020-02-04 | 北京嘀嘀无限科技发展有限公司 | Method and system for determining relevance between input text and interest points |
CN110889050A (en) * | 2018-09-07 | 2020-03-17 | 北京搜狗科技发展有限公司 | Method and device for mining generic brand words |
CN110909217A (en) * | 2018-09-12 | 2020-03-24 | 北京奇虎科技有限公司 | Method and device for realizing search, electronic equipment and storage medium |
CN110909021A (en) * | 2018-09-12 | 2020-03-24 | 北京奇虎科技有限公司 | Construction method and device of query rewriting model and application thereof |
CN109063204A (en) * | 2018-09-14 | 2018-12-21 | 郑州云海信息技术有限公司 | Log query method, device, equipment, and storage medium based on artificial intelligence |
CN110990578A (en) * | 2018-09-30 | 2020-04-10 | 北京奇虎科技有限公司 | Method and device for constructing rewriting model |
CN110969024A (en) * | 2018-09-30 | 2020-04-07 | 北京奇虎科技有限公司 | Query statement rewriting method and device |
CN109460458A (en) * | 2018-10-29 | 2019-03-12 | 清华大学 | Query rewriting intention prediction method and apparatus |
CN109657145A (en) * | 2018-12-20 | 2019-04-19 | 拉扎斯网络科技(上海)有限公司 | Merchant search method and device, electronic equipment, and computer-readable storage medium |
CN111353016A (en) * | 2018-12-24 | 2020-06-30 | 阿里巴巴集团控股有限公司 | Text processing method and device |
CN111353016B (en) * | 2018-12-24 | 2023-04-18 | 阿里巴巴集团控股有限公司 | Text processing method and device |
CN109766537A (en) * | 2019-01-16 | 2019-05-17 | 北京未名复众科技有限公司 | Study-abroad document writing method, device, and electronic equipment |
CN110222147A (en) * | 2019-05-15 | 2019-09-10 | 北京百度网讯科技有限公司 | Label expansion method, device, computer equipment, and storage medium |
CN110287284B (en) * | 2019-05-23 | 2021-07-06 | 北京百度网讯科技有限公司 | Semantic matching method, device and equipment |
CN110287284A (en) * | 2019-05-23 | 2019-09-27 | 北京百度网讯科技有限公司 | Semantic matching method, device and equipment |
CN112149005A (en) * | 2019-06-27 | 2020-12-29 | 腾讯科技(深圳)有限公司 | Method, apparatus, device and readable storage medium for determining search results |
CN112149005B (en) * | 2019-06-27 | 2023-09-01 | 腾讯科技(深圳)有限公司 | Method, apparatus, device and readable storage medium for determining search results |
CN110334277A (en) * | 2019-06-28 | 2019-10-15 | 北京天眼查科技有限公司 | User search behavior recognition method and device |
CN110362652A (en) * | 2019-07-19 | 2019-10-22 | 辽宁工程技术大学 | Spatial keyword Top-K query method based on space-semantic-numerical correlation |
CN110362652B (en) * | 2019-07-19 | 2022-11-22 | 辽宁工程技术大学 | Space keyword Top-K query method based on space-semantic-numerical correlation |
CN111506701A (en) * | 2020-03-25 | 2020-08-07 | 中国平安财产保险股份有限公司 | Intelligent query method and related device |
WO2021196934A1 (en) * | 2020-04-02 | 2021-10-07 | 深圳壹账通智能科技有限公司 | Question recommendation method and apparatus based on field similarity calculation, and server |
CN111666292A (en) * | 2020-04-24 | 2020-09-15 | 百度在线网络技术(北京)有限公司 | Similarity model establishing method and device for retrieving geographic positions |
CN111666292B (en) * | 2020-04-24 | 2023-05-26 | 百度在线网络技术(北京)有限公司 | Similarity model establishment method and device for retrieving geographic position |
US11836174B2 (en) | 2020-04-24 | 2023-12-05 | Baidu Online Network Technology (Beijing) Co., Ltd. | Method and apparatus of establishing similarity model for retrieving geographic location |
CN112559686B (en) * | 2020-12-11 | 2023-10-27 | 北京百度网讯科技有限公司 | Information retrieval method and device and electronic equipment |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN106610972A (en) | Query rewriting method and apparatus | |
US9317569B2 (en) | Displaying search results with edges/entity relationships in regions/quadrants on a display device | |
Roshdi et al. | Information retrieval techniques and applications | |
TW201942826A (en) | Payment mode recommendation method and device and equipment | |
CN110619044B (en) | Emotion analysis method, system, storage medium and equipment | |
CN107492008A (en) | Information recommendation method, device, server and computer-readable storage medium | |
CN110737756B (en) | Method, apparatus, device and medium for determining answer to user input data | |
CN106897262A (en) | Text classification method and device, and processing method and apparatus | |
JP2016212838A (en) | Discovery information system, method, and computer program | |
WO2021146388A1 (en) | Systems and methods for providing answers to a query | |
CN111813930B (en) | Similar document retrieval method and device | |
CN112988980B (en) | Target product query method and device, computer equipment and storage medium | |
JP2015500525A (en) | Method and apparatus for information retrieval | |
US20230385317A1 (en) | Information Retrieval Method, Related System, and Storage Medium | |
Peng et al. | Hierarchical visual-textual knowledge distillation for life-long correlation learning | |
Wang et al. | A differential evolution approach to feature selection and instance selection | |
Yildiz et al. | Improving word embedding quality with innovative automated approaches to hyperparameters | |
GB2568575A (en) | Document search using grammatical units | |
Collarana et al. | A question answering system on regulatory documents | |
Köksal et al. | Improving automated Turkish text classification with learning‐based algorithms | |
CN109101512A (en) | Law database construction method, and law data query method and device | |
Du et al. | Topic-grained text representation-based model for document retrieval | |
Ruambo et al. | Towards enhancing information retrieval systems: A brief survey of strategies and challenges | |
CN110704613B (en) | Vocabulary database construction and query method, database system, equipment and medium | |
Menon et al. | Gmm-based document clustering of knowledge graph embeddings |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
Application publication date: 2017-05-03 |