CN107145555A - A fuzzy sentence search method based on word segmentation - Google Patents
A fuzzy sentence search method based on word segmentation
- Publication number: CN107145555A
- Application number: CN201710296379.9A
- Authority: CN (China)
- Prior art keywords: word segmentation, keyword, original text, word, original
- Legal status: Granted (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/953—Querying, e.g. by the use of web search engines
- G06F16/9535—Search customisation based on user profiles and personalisation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/3331—Query processing
- G06F16/3332—Query translation
- G06F16/3334—Selection or weighting of terms from queries, including natural language queries
Landscapes
- Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Artificial Intelligence (AREA)
- Computational Linguistics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses a fuzzy sentence search method based on word segmentation. The method segments the original text into words and records the start position of each segment; merges repeated words, recording every start position at which each repeated word occurs; segments the keyword, denoting the number of keyword segments as i and the number of keyword segments that occur at least once in the original text as w; and computes the occurrence rate p. If p is greater than a preset value, the segmented original text is searched using the keyword segments to obtain the positions of the keyword segments in the original text; if p is less than the preset value, the search exits. The method then computes the distance d between the positions of the keyword segments in the original text and checks whether the difference k between d and the length of the corresponding keyword segment is within an allowed range; if it is, a fuzzy-search result is matched. The invention can retrieve sentences that have been obfuscated or whose word order has been changed, gives more accurate results, and improves retrieval efficiency.
Description
Technical field
The present invention relates to a fuzzy sentence search method based on word segmentation and belongs to the field of information security technology.
Background art
Today, as networks grow ever more developed, beneficial information and information that causes instability are both proliferating. For the sake of netizens' ideological well-being and of social harmony, in many public-facing settings some content must pass review before it can be displayed. In the early days of network review, all checking was done manually. Manual review is accurate and intelligent, but its throughput is negligible compared with the speed at which text is produced online. Automatic keyword filtering is used at the present stage, but its accuracy is very low: it easily filters out information that is not harmful, and it cannot guarantee that all related harmful content is filtered out.
At present there are two common text search algorithms: search based on word segmentation, and search based on multi-pattern matching. Segmentation-based search first segments the original text into words and then matches the keyword to be searched against them, but it can only search for words and cannot match sentences. Multi-pattern-matching search, which uses algorithms such as ACBM and Wu-Manber, can search for both words and sentences, but it cannot analyze semantics.
Neither algorithm can handle the following case. Take the keyword "somewhere taxi stops doing business": if the original text reads "stop doing business somewhere taxi" or "taxi in somewhere stops doing business", the keyword will not be found. The keyword can be shortened or altered to widen the search, but the results then become very noisy, there are too many variants to cover, the shortened or altered keyword may be incomplete, and search speed drops considerably.
Both the segmentation-based search and the multi-pattern-matching search are therefore deficient: neither can search for sentences that have been obfuscated or whose word order has been changed.
Summary of the invention
The object of the present invention is to provide a fuzzy sentence search method based on word segmentation, so as to solve the problem that existing methods cannot search for sentences that have been obfuscated or whose word order has been changed.
The technical solution of the present invention is as follows. A fuzzy sentence search method based on word segmentation comprises the following steps:
Step 1: Segment the original text in the scope to be searched, and record the start position of each segment in the original text.
Step 2: After segmenting the original text, merge repeated words, and record every start position in the original text of each repeated word.
Step 3: Segment the keyword to be searched. Denote the number of keyword segments as i, and the number of keyword segments that occur at least once in the original text in the scope to be searched as w.
Step 4: Compute the occurrence rate p of the keyword segments in the original text, p = w/i. If p is greater than a preset value, search the segmented original text using the keyword segments and obtain the positions at which each keyword segment occurs in the original text. If p is less than the preset value, exit the search.
Step 5: Compute the distance d between the positions at which different keyword segments occur in the original text, and check whether the difference k between d and the length of the corresponding keyword segment is within an allowed range. If k is within the allowed range, a fuzzy-search result is matched.
Note that the segmentation can use existing algorithms, such as segmentation based on string matching, segmentation based on understanding, and segmentation based on statistics; segmentation methods belong to the prior art.
In the method, the start position is the position of the first character of a segment in the original text; characters of the original text in the scope to be searched are numbered from 0. For example, "stop doing business somewhere taxi" is segmented into three segments, "stop doing business", "somewhere", and "taxi", and the position of the segment "stop doing business" is 0.
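A minimal sketch of how such start positions could be computed, assuming the segments are contiguous and that positions are counted in UTF-8 bytes, which is how the offsets in the embodiment further on behave. The function name `byte_offsets` is hypothetical, and the Chinese strings are an illustrative reconstruction chosen only so that every character occupies 3 bytes; they are not taken from the source.

```python
def byte_offsets(tokens):
    """Pair each token with its UTF-8 byte offset.

    Assumes the tokens are contiguous and the text starts at offset 0.
    """
    result = []
    offset = 0
    for token in tokens:
        result.append((token, offset))
        # Advance by the encoded length of the token, not its character count.
        offset += len(token.encode("utf-8"))
    return result


# Hypothetical three-segment sentence; each CJK character is 3 bytes in UTF-8.
tokens = ["停运", "某地", "出租车"]
print(byte_offsets(tokens))
```

A real segmenter may emit overlapping segments (the embodiment lists several segments sharing one offset), in which case the offsets would have to come from the segmenter itself rather than from a running sum.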
In the method, the original text in the scope to be searched and the keyword to be searched both use the UTF-8 encoding. UTF-8 is a variable-length character encoding for Unicode. A character is encoded into a number of bytes that depends on the size of its code point: common English letters take 1 byte, Chinese characters typically take 3 bytes, and only very rare characters take more (UTF-8 as originally designed allowed up to 6 bytes per character; the current standard limits it to 4). UTF-8 can be read and written quickly using bit masks and shift operations. Comparing UTF-8 strings byte by byte with strcmp() gives the same ordering as comparing the corresponding wide-character strings with wcscmp() when each wide character holds one code point, which makes sorting easier.
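The two properties of UTF-8 mentioned here, variable byte length and byte-wise ordering that agrees with code point ordering, can be checked with a short sketch. The helper name `utf8_len` and the sample characters are illustrative assumptions.

```python
def utf8_len(ch):
    """Number of bytes used to encode one character in UTF-8."""
    return len(ch.encode("utf-8"))


# Latin letter, accented letter, common Chinese character, emoji.
samples = ["a", "\u00e9", "\u4e2d", "\U0001F600"]
for ch in samples:
    print(repr(ch), utf8_len(ch))

# Python compares str values by code point; sorting by the encoded bytes
# yields the same order, which is the property the text relies on.
words = ["\u4e2d", "a", "\U0001F600", "\u00e9"]
by_code_point = sorted(words)
by_utf8_bytes = sorted(words, key=lambda s: s.encode("utf-8"))
print(by_code_point == by_utf8_bytes)
```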
In step 3, a keyword segment that occurs at least once in the original text is recorded as 1, and a keyword segment that does not occur is recorded as 0; w is the number of keyword segments that occur at least once. Note that w is not the number of times the keyword segments occur in the original text, but the number of keyword segments that occur in it at all. For example, "stop doing business somewhere taxi" is segmented into "stop doing business", "somewhere", and "taxi". If "stop doing business" never occurs in the original text to be searched while "somewhere" and "taxi" both occur, then w is 2 no matter how many times "somewhere" and "taxi" occur.
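The distinction between presence and frequency can be sketched as follows. The function name `count_present` and the sample segment lists are hypothetical; the English glosses stand in for the original segments.

```python
def count_present(keyword_segments, text_segments):
    """Return w: how many keyword segments occur at least once in the text."""
    present = set(text_segments)
    # Each keyword segment contributes 1 if present and 0 otherwise,
    # regardless of how often it occurs in the text.
    return sum(1 for segment in keyword_segments if segment in present)


keyword_segments = ["stop doing business", "somewhere", "taxi"]
# "somewhere" and "taxi" occur repeatedly; "stop doing business" never occurs.
text_segments = ["somewhere", "taxi", "somewhere", "taxi", "taxi"]

w = count_present(keyword_segments, text_segments)
print(w)
```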
In step 4, the occurrence rate p is greater than or equal to 0 and less than or equal to 1. It is the ratio of the number of keyword segments that occur in the original text in the scope to be searched to the total number of keyword segments. A keyword segment that occurs several times in the original text still counts as having occurred, and adds only 1 to the number of keyword segments that occur. The preset value is also between 0 and 1 inclusive; the closer it is to 1, the larger the occurrence rate that is required. When p is greater than the preset value, it is worth searching the segmented original text using the keyword segments. For example, "stop doing business somewhere taxi" is segmented into "stop doing business", "somewhere", and "taxi", so i is 3. If "stop doing business" never occurs in the original text while "somewhere" and "taxi" both occur, then w is 2 and p is 2/3. Because w cannot exceed the number of segments, 3, p always lies between 0 and 1. With a preset value of 3/4, p = 2/3 does not exceed the preset value, so the search is not run; with a lower preset value such as 1/2, p = 2/3 exceeds it and the segmented original text is searched using the keyword segments. If neither "stop doing business" nor "somewhere" occurs and only "taxi" occurs, then p is 1/3, which is less than the preset value of 3/4, so there is no need to search the segmented original text. The preset value is thus chosen according to the required retrieval accuracy: the higher the preset value, the higher the accuracy of the overall search.
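The occurrence-rate gate can be sketched with exact fractions. The function name `should_search` is hypothetical, and treating p equal to the preset value as "do not search" is an assumption, since the text only specifies the greater-than and less-than cases.

```python
from fractions import Fraction


def should_search(w, i, preset):
    """Return (p, decision) where p = w / i and decision is True when p > preset."""
    p = Fraction(w, i)
    return p, p > preset


# Two of three keyword segments occur: 2/3 is below 3/4 but above 1/2.
print(should_search(2, 3, Fraction(3, 4)))
print(should_search(2, 3, Fraction(1, 2)))
# Only one of three segments occurs: 1/3 is below 3/4.
print(should_search(1, 3, Fraction(3, 4)))
# All three segments occur: 1 is above 3/4.
print(should_search(3, 3, Fraction(3, 4)))
```

Using `Fraction` avoids floating-point rounding when p and the preset value are close.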
In step 5, if the difference k between the distance d and the length of the corresponding keyword segment exceeds the allowed range, the result is not displayed. The allowed range is determined from the difference k and the length of the corresponding keyword segment. When d equals the length of the corresponding keyword segment, so that k is zero, the original text shows no obfuscation or word reordering at that point. When d differs from that length, the keyword may have been obfuscated or reordered in the original text; the larger the difference, the less likely it is that the text is an obfuscated or reordered form of the keyword. If k exceeds the allowed range, the original text in the scope to be searched contains nothing identical or similar to the search keyword, and no result is shown. If k is within the allowed range, the original text contains content identical or similar to the keyword, and results that have been obfuscated or reordered are retrieved as well. This is explained in more detail in the embodiment below.
In step 5, "different keyword segments" means the distinct segments obtained by segmenting the keyword, and d is the distance between the positions at which these segments occur in the original text. For example, "stop doing business somewhere taxi" is segmented into "stop doing business", "somewhere", and "taxi". Because the original text to be searched may have been obfuscated or reordered, the distance d between the positions in the original text must be computed for every pair among the three segments.
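A minimal sketch of the pairwise distance computation, assuming positions are UTF-8 byte offsets (0, 6, and 12, as in the worked example of the embodiment) and that, when a segment occurs more than once, the smallest distance between any two occurrences is taken. The function name `pairwise_distances` is hypothetical.

```python
from itertools import combinations


def pairwise_distances(positions):
    """Map each pair of segments to the smallest distance between their occurrences."""
    distances = {}
    for a, b in combinations(positions, 2):
        distances[(a, b)] = min(
            abs(pos_a - pos_b) for pos_a in positions[a] for pos_b in positions[b]
        )
    return distances


# One occurrence of each segment, at assumed byte offsets.
positions = {
    "stop doing business": [0],
    "somewhere": [6],
    "taxi": [12],
}
for pair, d in pairwise_distances(positions).items():
    print(pair, d)
```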
In the method, segmentation means splitting a sentence into words and phrases. The keyword need not be split into phrases; a segment may be a single word or several words.
The method has the following advantages. It segments the original text in the scope to be searched and records the start position of each segment; merges repeated words and records every start position of each; segments the keyword, giving i keyword segments of which w occur at least once in the original text; computes the occurrence rate p = w/i and searches the segmented original text only when p is greater than the preset value, exiting otherwise; and then compares the distance d between keyword segments in the original text with the length of the corresponding segment, accepting a fuzzy match when the difference k is within the allowed range. The method can thereby retrieve sentences that have been obfuscated or whose word order has been changed, gives more accurate results, improves retrieval efficiency, effectively protects the security of national information, and promotes social harmony and stability.
Brief description of the drawings
Fig. 1 is a flow chart of the fuzzy sentence search method based on word segmentation;
Fig. 2 is an outline of the algorithm of the fuzzy sentence search method based on word segmentation.
Detailed description of the embodiments
The following embodiment illustrates the present invention but does not limit its scope.
As shown in Fig. 1, a fuzzy sentence search method based on word segmentation comprises the following steps:
S1: Segment the original text in the scope to be searched, and record the start position of each segment in the original text.
S2: After segmenting the original text, merge repeated words, and record every start position in the original text of each repeated word.
S3: Segment the keyword to be searched. Denote the number of keyword segments as i, and the number of keyword segments that occur at least once in the original text in the scope to be searched as w.
S4: Compute the occurrence rate p of the keyword segments in the original text, p = w/i. If p is greater than a preset value, search the segmented original text using the keyword segments and obtain the positions at which each keyword segment occurs in the original text. If p is less than the preset value, exit the search.
S5: Compute the distance d between the positions at which different keyword segments occur in the original text, and check whether the difference k between d and the length of the corresponding keyword segment is within an allowed range. If k is within the allowed range, a fuzzy-search result is matched.
The method is described in more detail below, taking as the original text "stop doing business somewhere taxi, I love somewhere scenic outlook, on the scenic outlook the sun rises, taxi in somewhere stops doing business" and as the search keyword "somewhere taxi stops doing business".
First step: Segment the original text. The segmentation result is:
[
{"word": "stop doing business", "offset": 0}, {"word": "somewhere", "offset": 6}, {"word": "taxi", "offset": 12},
{"word": "hire a car", "offset": 15}, {"word": "hire out", "offset": 12}, {"word": ",", "offset": 21},
{"word": "I", "offset": 24}, {"word": "love", "offset": 27}, {"word": "somewhere", "offset": 30},
{"word": "view", "offset": 36}, {"word": "scenic outlook", "offset": 36}, {"word": ",", "offset": 45},
{"word": "view", "offset": 48}, {"word": "scenic outlook", "offset": 48}, {"word": "on", "offset": 57},
{"word": "sun", "offset": 60}, {"word": "sun rises", "offset": 60}, {"word": ",", "offset": 69},
{"word": "somewhere", "offset": 72}, {"word": "'s", "offset": 78}, {"word": "taxi", "offset": 81},
{"word": "hire a car", "offset": 84}, {"word": "hire out", "offset": 81}, {"word": "stop doing business", "offset": 90}
]
Here word is a segment and offset is the position of that segment in the original text.
Second step: After the original text is segmented, some words occur more than once; for example, "somewhere" occurs at offset = 6, offset = 72, and other positions. The segmentation result is therefore processed to merge repeated words. The merged result is:
{
hire out: [12, 81],
stop doing business: [0, 90],
love: [27],
on: [57],
somewhere: [6, 30, 72],
taxi: [12, 81],
hire a car: [15, 84],
I: [24],
",": [21, 45, 69],
view: [36, 48],
scenic outlook: [36, 48],
sun: [60],
sun rises: [60],
's: [78]
}
To the left of each ":" is the word; to the right is its array of offsets.
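The merge in the second step amounts to building an inverted index from word to offsets. A minimal sketch, in which the function name `merge_tokens` is hypothetical, English glosses stand in for the original segments, and only a subset of the segments is listed:

```python
from collections import defaultdict


def merge_tokens(tokens):
    """Group (word, offset) pairs into a word -> list-of-offsets mapping."""
    index = defaultdict(list)
    for word, offset in tokens:
        index[word].append(offset)
    return dict(index)


# A subset of the (word, offset) pairs from the first step.
tokens = [
    ("stop doing business", 0),
    ("somewhere", 6),
    ("taxi", 12),
    ("somewhere", 30),
    ("somewhere", 72),
    ("taxi", 81),
    ("stop doing business", 90),
]

index = merge_tokens(tokens)
for word, offsets in index.items():
    print(word, offsets)
```

Keeping the offsets in a list per word means later steps can look up every occurrence of a keyword segment with a single dictionary access.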
Third step: Segment the keyword as well. The segmentation result is:
["stop doing business", "somewhere", "taxi"]
No offsets need to be recorded for the keyword.
Fourth step: Search the segmented original text using the keyword segments. The result is:
[
stop doing business: [0, 90],
somewhere: [6, 30, 72],
taxi: [12, 81]
]
This gives the positions at which the keyword segments occur in the original text. An occurrence rate can be set at this point:
occurrence rate = number of keyword segments that occur in the original text / number of keyword segments. The occurrence rate is therefore a number between 0 and 1 inclusive, and the next step is entered only if the actual occurrence rate is greater than the configured one. If the configured occurrence rate is 0.75, the occurrence rate in this example is 3/3 = 1; since 1 > 0.75, the next step can proceed.
If instead the keyword is "sell cigarette", its segments are ["sell", "cigarette"], whose occurrence rate in the given original text is 0/2 = 0, so the search exits.
Fifth step: Compare the offsets in the result of the fourth step and compute the shortest distances.
Because UTF-8 encoding is used, the lengths of the segments in the result of the fourth step are:
[
stop doing business: 6,
somewhere: 6,
taxi: 9
]
Comparing the offsets gives a first group:
[
stop doing business: 0,
somewhere: 6,
taxi: 12
]
The shortest distance between adjacent offsets in this group is 6, which equals the length of the preceding segment, so the group qualifies and the keyword is matched at position 0.
The second group is:
[
stop doing business: 90,
somewhere: 72,
taxi: 81
]
In this group the shortest distance between "taxi" and "stop doing business" is 9, which equals the length of "taxi" and qualifies. The shortest distance between "somewhere" and "taxi", however, is also 9, whereas it would be 6 if the two segments were adjacent. To handle this, a maximum allowed difference between two words can be configured. If it is set to 5, then 9 - 6 = 3 < 5, so the group qualifies, and the keyword is likewise matched at position 72.
The matched results are therefore "stop doing business somewhere taxi" and "taxi in somewhere stops doing business", which achieves fuzzy matching and solves the problem that traditional algorithms cannot retrieve text that has been obfuscated or whose word order has been changed.
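The fifth step can be sketched as follows. This is one possible reading rather than the patent's definitive procedure: it picks one occurrence per keyword segment, orders the picks by offset, and accepts the group when every gap k = d - length lies between 0 and the allowed maximum. The function name `fuzzy_matches`, the parameter `max_gap`, the brute-force enumeration, and the inclusive upper bound are all assumptions.

```python
from itertools import product

# Offsets of each keyword segment in the original text (fourth step).
hits = {
    "stop doing business": [0, 90],
    "somewhere": [6, 30, 72],
    "taxi": [12, 81],
}
# UTF-8 byte lengths of the segments (fifth step).
lengths = {"stop doing business": 6, "somewhere": 6, "taxi": 9}


def fuzzy_matches(hits, lengths, max_gap):
    """Return start offsets of groups whose inter-segment gaps are all within max_gap."""
    matches = set()
    words = list(hits)
    for combo in product(*(hits[word] for word in words)):
        # Order the chosen occurrences by where they appear in the text.
        placed = sorted(zip(combo, words))
        accepted = True
        for (pos_a, word_a), (pos_b, _) in zip(placed, placed[1:]):
            d = pos_b - pos_a        # distance between the two occurrences
            k = d - lengths[word_a]  # bytes lying between the two segments
            if not 0 <= k <= max_gap:
                accepted = False
                break
        if accepted:
            matches.add(placed[0][0])
    return sorted(matches)


print(fuzzy_matches(hits, lengths, max_gap=5))
```

With the data above, the group at offsets 0, 6, 12 has k = 0 for both gaps, and the group at 72, 81, 90 has k = 3 and k = 0. Enumerating every combination is exponential in the number of segments; a production implementation would likely scan sorted offsets instead.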
Fig. 2 makes the idea of the method easier to follow. The original text and the keyword are segmented first. The method then checks whether the keyword segments occur in the segmented original text, that is, it computes the occurrence rate and compares it with the preset value to decide whether to proceed. If it proceeds, it compares the distance d between the positions at which different keyword segments occur in the original text with the length of the corresponding segment, and displays the result only if the difference k is within the allowed range.
Although the present invention has been described in detail above by way of a general description and a specific embodiment, modifications or improvements can be made on the basis of the invention, as will be apparent to those skilled in the art. Such modifications or improvements, made without departing from the spirit of the invention, fall within the scope of protection claimed.
Claims (8)
1. A fuzzy sentence search method based on word segmentation, characterized in that the search method comprises the following steps:
Step 1: Segment the original text in the scope to be searched, and record the start position of each segment in the original text.
Step 2: After segmenting the original text, merge repeated words, and record every start position in the original text of each repeated word.
Step 3: Segment the keyword to be searched. Denote the number of keyword segments as i, and the number of keyword segments that occur at least once in the original text in the scope to be searched as w.
Step 4: Compute the occurrence rate p of the keyword segments in the original text, p = w/i. If p is greater than a preset value, search the segmented original text using the keyword segments and obtain the positions at which each keyword segment occurs in the original text. If p is less than the preset value, exit the search.
Step 5: Compute the distance d between the positions at which different keyword segments occur in the original text, and check whether the difference k between d and the length of the corresponding keyword segment is within an allowed range. If k is within the allowed range, a fuzzy-search result is matched.
2. The fuzzy sentence search method based on word segmentation according to claim 1, characterized in that the start position is the position of the first character of a segment, and characters of the original text in the scope to be searched are numbered from 0.
3. The fuzzy sentence search method based on word segmentation according to claim 1, characterized in that the original text in the scope to be searched and the keyword to be searched use the UTF-8 encoding.
4. The fuzzy sentence search method based on word segmentation according to claim 1, characterized in that, in step 3, a keyword segment that occurs at least once in the original text is recorded as 1, a keyword segment that does not occur in the original text is recorded as 0, and w is the number of keyword segments that occur at least once.
5. The fuzzy sentence search method based on word segmentation according to claim 1, characterized in that, in step 4, the occurrence rate p is greater than or equal to 0 and less than or equal to 1; the occurrence rate is the ratio of the number of keyword segments that occur in the original text in the scope to be searched to the number of keyword segments; a keyword segment that occurs several times in the original text counts as occurring, and adds 1 to the number of keyword segments that occur; the preset value is greater than or equal to 0 and less than or equal to 1; the closer the preset value is to 1, the larger the required occurrence rate of the keyword segments in the original text; and when p is greater than the preset value, it is necessary to search the segmented original text using the keyword segments.
6. The fuzzy sentence search method based on word segmentation according to claim 1, characterized in that, in step 5, if the difference k between the distance d and the length of the corresponding keyword segment exceeds the allowed range, the result is not displayed; the allowed range is determined from the difference k and the length of the corresponding keyword segment; when d equals the length of the corresponding keyword segment, the original text shows no obfuscation or word reordering; when d differs from that length, the keyword has been obfuscated or reordered in the original text; and the larger the difference, the less likely it is that obfuscation or reordering of the keyword is present in the original text.
7. The fuzzy sentence search method based on word segmentation according to claim 1, characterized in that, in step 5, the different keyword segments are the distinct segments obtained by segmenting the keyword, and d is the distance between the positions at which the different segments occur in the original text.
8. The fuzzy sentence search method based on word segmentation according to claim 1, characterized in that segmentation means splitting a sentence into words and phrases.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710296379.9A CN107145555B (en) | 2017-04-28 | 2017-04-28 | A fuzzy sentence search method based on word segmentation |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710296379.9A CN107145555B (en) | 2017-04-28 | 2017-04-28 | A fuzzy sentence search method based on word segmentation |
Publications (2)
Publication Number | Publication Date |
---|---|
CN107145555A (en) | 2017-09-08 |
CN107145555B (en) | 2019-08-02 |
Family
ID=59774339
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710296379.9A Active CN107145555B (en) | A fuzzy sentence search method based on word segmentation | 2017-04-28 | 2017-04-28 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107145555B (en) |
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8315998B1 (en) * | 2003-04-28 | 2012-11-20 | Verizon Corporate Services Group Inc. | Methods and apparatus for focusing search results on the semantic web |
CN101196898A (en) * | 2007-08-21 | 2008-06-11 | 新百丽鞋业(深圳)有限公司 | Method for applying phrase index technology into internet search engine |
CN102314464A (en) * | 2010-07-07 | 2012-01-11 | 北京亮点时间科技有限公司 | Lyrics searching method and lyrics searching engine |
CN104376115A (en) * | 2014-12-01 | 2015-02-25 | 北京奇虎科技有限公司 | Fuzzy word determining method and device based on global search |
CN105045852A (en) * | 2015-07-06 | 2015-11-11 | 华东师范大学 | Full-text search engine system for teaching resources |
Non-Patent Citations (1)
Title |
---|
周俊: "一种不良文本过滤方法", 《中国优秀硕士学位论文全文数据库 信息科技辑》 * |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110377706A (en) * | 2019-07-25 | 2019-10-25 | 腾讯科技(深圳)有限公司 | Search statement method for digging and equipment based on deep learning |
CN110377706B (en) * | 2019-07-25 | 2022-10-14 | 腾讯科技(深圳)有限公司 | Search sentence mining method and device based on deep learning |
CN112069812A (en) * | 2020-08-28 | 2020-12-11 | 喜大(上海)网络科技有限公司 | Word segmentation method, device, equipment and computer storage medium |
CN112069812B (en) * | 2020-08-28 | 2024-05-03 | 喜大(上海)网络科技有限公司 | Word segmentation method, device, equipment and computer storage medium |
CN113177061A (en) * | 2021-05-25 | 2021-07-27 | 马上消费金融股份有限公司 | Searching method and device and electronic equipment |
CN113177061B (en) * | 2021-05-25 | 2023-05-16 | 马上消费金融股份有限公司 | Searching method and device and electronic equipment |
Also Published As
Publication number | Publication date |
---|---|
CN107145555B (en) | 2019-08-02 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||