CN107145555A - Fuzzy sentence search method based on word segmentation - Google Patents

Fuzzy sentence search method based on word segmentation

Info

Publication number
CN107145555A
Authority
CN
China
Prior art keywords
word segmentation
keyword
original text
word
original
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201710296379.9A
Other languages
Chinese (zh)
Other versions
CN107145555B (en)
Inventor
常帅
邓皓钟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing An Information Technology Co Ltd
Original Assignee
Beijing An Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing An Information Technology Co Ltd
Priority to CN201710296379.9A
Publication of CN107145555A
Application granted
Publication of CN107145555B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/3332 Query translation
    • G06F16/3334 Selection or weighting of terms from queries, including natural language queries

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a fuzzy sentence search method based on word segmentation. The method segments the original text and records the starting position of each segment; merges words that occur repeatedly and records the starting positions of each repeated word; segments the keyword, denoting the number of keyword segments as i and the number of keyword segments that occur at least once in the original text as w; and calculates the occurrence rate p. If p is greater than a preset value, the segmentation result of the original text is searched using the keyword segments to obtain the positions of the keyword segments in the original text; if p is less than the preset value, the search exits. The method then calculates the distance d between the positions of the keyword segments in the original text and checks whether the difference k between d and the corresponding keyword segment's own length lies within an allowed numerical range; if it does, a fuzzy search result is matched. The invention can retrieve sentences that have been obfuscated or whose word order has been changed, gives more accurate retrieval results, and improves retrieval efficiency.

Description

Fuzzy sentence search method based on word segmentation
Technical field
The present invention relates to a fuzzy sentence search method based on word segmentation and belongs to the field of information security technology.
Background
As networks become ever more developed, useful information spreads alongside information that causes instability. For the ideological health of internet users and for social harmony, some content in many public-facing venues must pass review before it can be displayed. In the early days of online review, all review was manual. Manual review is accurate and intelligent, but its throughput is negligible compared with the rate at which online text is produced. Automatic keyword filtering is used at present, but its accuracy is low: it readily filters out information that is not harmful, and it cannot guarantee that all related harmful content is filtered out.
At present there are two common text search algorithms: text search based on word segmentation, and text search based on multi-pattern matching. Segmentation-based search first segments the original text and then matches the keywords to be searched against the segments; this algorithm can search only for words and cannot match sentences. Search based on multi-pattern matching uses algorithms such as AC-BM and Wu-Manber; it can search for both words and sentences, but it cannot analyze semantics.
There is one case that neither algorithm can match. Take the keyword "somewhere taxi stops doing business": if the original text instead reads "stop doing business somewhere taxi" or "somewhere's taxi stops doing business", the keyword is not found. The text can still be searched by shortening or altering the keyword, but the search results then become very noisy, there are too many cases to cover, the shortened or altered keywords may be incomplete, and search speed drops substantially.
Both text search based on word segmentation and text search based on multi-pattern matching therefore fall short: neither can search for sentences that have been obfuscated or whose word order has been changed.
Summary of the invention
The object of the invention is to provide a fuzzy sentence search method based on word segmentation, to solve the problem that existing methods cannot search for sentences that have been obfuscated or whose word order has been changed.
The technical solution by which the invention solves the above technical problem is as follows. A fuzzy sentence search method based on word segmentation comprises the following steps:
Step 1: Segment the original text within the range to be searched, and record the starting position of each segment in the original text.
Step 2: After segmenting the original text, merge the words that occur repeatedly, and record the starting positions in the original text of each repeated word.
Step 3: Segment the keyword to be retrieved. Denote the number of keyword segments as i, and the number of keyword segments that occur at least once in the original text within the range to be searched as w.
Step 4: Calculate the occurrence rate p of the keyword segments in the original text within the range to be searched, p = w/i. If p is greater than the preset value, search the segmentation result of the original text using the segmentation result of the keyword to obtain the positions at which each keyword segment occurs in the original text; if p is less than the preset value, exit the search.
Step 5: Calculate the distance d between the positions at which different keyword segments occur in the original text, and check whether the difference k between d and the corresponding keyword segment's own length lies within the allowed numerical range; if k lies within the allowed range, a fuzzy search result is matched.
Note that segmentation can use existing algorithms, such as segmentation based on string matching, segmentation based on understanding, and segmentation based on statistics; segmentation methods belong to the prior art.
In the method, the starting position is the position of the first character of a segment, and character positions in the original text within the range to be searched are numbered from 0. Taking "stop doing business somewhere taxi" as an example, the text segments into three results, "stop doing business", "somewhere" and "taxi", and the position of the segment "stop doing business" is 0.
In the method, the original text within the range to be searched and the keyword to be retrieved both use the UTF-8 encoding. UTF-8 is a variable-length character encoding for Unicode. Each character is encoded in a number of bytes that depends on its code point: 1 to 4 bytes in the current standard (the original design allowed up to 6). Common English letters are encoded in 1 byte, Chinese characters usually take 3 bytes, and only rare characters need more. UTF-8 can be read and written quickly using bit masks and shift operations. When strings are compared, strcmp() on UTF-8 strings gives the same ordering as wcscmp() on the corresponding wide-character strings, which makes sorting easier.
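The starting positions described above can be illustrated with a minimal sketch. It assumes the segments are contiguous and non-overlapping and that positions are counted in UTF-8 bytes, as the embodiment's offsets suggest; the function name `byte_offsets` and the sample strings are hypothetical placeholders, not text from the patent.

```python
def byte_offsets(segments):
    """Return (segment, start) pairs, counting UTF-8 bytes from 0.

    Assumes the segments are contiguous and do not overlap.
    """
    result = []
    position = 0
    for segment in segments:
        result.append((segment, position))
        # Advance by the encoded length of the segment, not its character count.
        position += len(segment.encode("utf-8"))
    return result


# Placeholder text: two 2-character words and one 3-character word.
# Each Chinese character here takes 3 bytes in UTF-8.
segments = ["中文", "分词", "搜索法"]
offsets = byte_offsets(segments)
print(offsets)
```

Counting bytes rather than characters is what makes a two-character Chinese word 6 units long and a three-character word 9 units long.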
In step 3, a keyword segment that occurs at least once in the original text is recorded as 1, and a keyword segment that does not occur in the original text is recorded as 0; w is the number of keyword segments that occur at least once. Note that w is not the number of times keyword segments occur in the original text, but the number of keyword segments that occur in it at all. Take "stop doing business somewhere taxi" as an example, which segments into "stop doing business", "somewhere" and "taxi". If "stop doing business" never occurs in the original text to be searched while "somewhere" and "taxi" both occur, then w is 2, no matter how many times "somewhere" and "taxi" occur.
In step 4, the occurrence rate p is between 0 and 1 inclusive. The occurrence rate is the ratio of the number of keyword segments that occur in the original text within the range to be searched to the total number of keyword segments. A keyword segment that occurs several times in the original text counts only as having occurred, so it adds 1 to the count of segments that occur. The preset value is also between 0 and 1 inclusive; the closer it is to 1, the higher the occurrence rate of the keyword segments in the original text must be. Only when p is greater than the preset value is it necessary to search the segmentation result of the original text using the segmentation result of the keyword. Take "stop doing business somewhere taxi" as an example, which segments into "stop doing business", "somewhere" and "taxi". If "stop doing business" never occurs in the original text to be searched while "somewhere" and "taxi" both occur, then w is 2, the keyword has 3 segments, and p = 2/3. Because w cannot exceed the number of segments, 3, p lies between 0 and 1. If the preset value is 3/4, p = 2/3 does not exceed it, so no search is needed; this case proceeds to the search only when the preset value is below 2/3. If neither "stop doing business" nor "somewhere" occurs in the original text and only "taxi" occurs, then p = 1/3, which is also less than the preset value 3/4, so the segmentation result of the original text again need not be searched. The preset value is thus chosen according to the required retrieval accuracy: the larger the preset value, the more accurate the search overall.
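As an illustration of the occurrence rate, the sketch below computes w, i and p = w/i and applies the threshold test. The function name `occurrence_rate` and the sample segment lists are assumptions made for the example, and it assumes the keyword has at least one segment and no duplicate segments.

```python
from fractions import Fraction


def occurrence_rate(keyword_segments, text_segments):
    """Return p = w / i as an exact fraction.

    i is the number of keyword segments; w counts the keyword segments
    that occur at least once in the text, however often they repeat.
    """
    present = set(text_segments)
    i = len(keyword_segments)
    w = sum(1 for segment in keyword_segments if segment in present)
    return Fraction(w, i)


keyword = ["stop doing business", "somewhere", "taxi"]
# "somewhere" and "taxi" occur repeatedly; "stop doing business" never occurs.
text = ["somewhere", "taxi", "somewhere", "love", "taxi"]

p = occurrence_rate(keyword, text)
threshold = Fraction(3, 4)
proceed = p > threshold  # the search continues only when p exceeds the preset value
print(p, proceed)
```

Using `Fraction` keeps the comparison exact: 2/3 is below 3/4, so with this preset value the search would exit.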
In step 5, if the difference k between the distance d and the corresponding keyword segment's own length exceeds the allowed numerical range, the result is not displayed. The allowed range is determined from the distance d and the corresponding keyword segment's own length. If d equals the segment's own length, the original text shows no obfuscation or change in wording at that point; a numerical difference between d and the segment's own length indicates that the keyword appears in the original text with obfuscation or changed wording; and the larger the numerical difference, the less likely it is that the passage is an obfuscated or reordered form of the keyword. When k exceeds the allowed range, the original text within the range to be searched contains nothing identical or similar to the search keyword, so there is no result to display. When k lies within the allowed range, the original text contains content identical or similar to the search keyword, and the results, including those that have been obfuscated or reordered, are retrieved one by one. The embodiment below explains this in more detail.
The different keyword segments in step 5 are the distinct segments obtained by segmenting the keyword, and d is the distance between the positions at which these segments occur in the original text. Taking "stop doing business somewhere taxi" as an example, the text segments into "stop doing business", "somewhere" and "taxi". Because the original text to be searched may have been obfuscated or had its word order changed, the distance d between occurrence positions must be calculated pairwise among the three segments.
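The check on d and k can be sketched as follows. The patent does not fix the exact form of the allowed range, so this sketch assumes it is 0 <= k < max_difference; the function names and the third sample call are illustrative assumptions.

```python
def difference_k(d, segment_length):
    """k is the gap left over after the segment's own length is subtracted from d."""
    return d - segment_length


def within_allowed_range(d, segment_length, max_difference):
    """Assumed form of the allowed range: 0 <= k < max_difference."""
    k = difference_k(d, segment_length)
    return 0 <= k < max_difference


# Segment of length 6 followed directly by the next segment: k = 0.
print(within_allowed_range(6, 6, 5))
# Segment of length 6 with the next segment 9 units away: k = 3.
print(within_allowed_range(9, 6, 5))
# Segment of length 9 with the next segment 18 units away: k = 9.
print(within_allowed_range(18, 9, 5))
```

A larger `max_difference` tolerates more inserted material between segments, at the cost of admitting looser matches.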
In the method, segmentation means splitting a sentence into words and phrases. The splitting of a keyword is not limited to phrases; a keyword may consist of a single word or of several words.
The method has the following advantages. The original text within the range to be searched is segmented and the starting position of each segment in the original text is recorded. After segmentation, words that occur repeatedly are merged and the starting positions of each repeated word are recorded. The keyword to be retrieved is segmented; the number of keyword segments is denoted i, and the number of keyword segments that occur at least once in the original text is denoted w. The occurrence rate p = w/i is calculated: if p is greater than the preset value, the segmentation result of the original text is searched using the segmentation result of the keyword to obtain the positions at which each keyword segment occurs; if p is less than the preset value, the search exits. The distance d between the positions at which different keyword segments occur is calculated, and the difference k between d and the corresponding keyword segment's own length is checked against the allowed numerical range; if k lies within the range, a fuzzy search result is matched. The method can thus retrieve sentences that have been obfuscated or whose word order has been changed, gives more accurate retrieval results, improves retrieval efficiency, effectively protects national information security, and promotes social harmony and stability.
Brief description of the drawings
Fig. 1 is a flow diagram of the fuzzy sentence search method based on word segmentation;
Fig. 2 is an outline of the algorithm of the fuzzy sentence search method based on word segmentation.
Detailed description
The following embodiment illustrates the invention but does not limit its scope.
As shown in Fig. 1, a fuzzy sentence search method based on word segmentation comprises the following steps:
S1: Segment the original text within the range to be searched, and record the starting position of each segment in the original text.
S2: After segmenting the original text, merge the words that occur repeatedly, and record the starting positions in the original text of each repeated word.
S3: Segment the keyword to be retrieved. Denote the number of keyword segments as i, and the number of keyword segments that occur at least once in the original text within the range to be searched as w.
S4: Calculate the occurrence rate p of the keyword segments in the original text within the range to be searched, p = w/i. If p is greater than the preset value, search the segmentation result of the original text using the segmentation result of the keyword to obtain the positions at which each keyword segment occurs in the original text; if p is less than the preset value, exit the search.
S5: Calculate the distance d between the positions at which different keyword segments occur in the original text, and check whether the difference k between d and the corresponding keyword segment's own length lies within the allowed numerical range; if k lies within the allowed range, a fuzzy search result is matched.
The method is explained in more detail below, taking as the original text "stop doing business somewhere taxi, I love somewhere scenic outlook, on the scenic outlook the sun rises, somewhere's taxi stops doing business" and as the search keyword "somewhere taxi stops doing business". The offsets below are byte positions in the UTF-8 encoded Chinese original, in which each Chinese character and each comma occupies 3 bytes.
Step 1: Segment the original text. The segmentation result is:
[
{"word": "stop doing business", "offset": 0},
{"word": "somewhere", "offset": 6},
{"word": "hire out", "offset": 12},
{"word": "hire a car", "offset": 15},
{"word": "taxi", "offset": 12},
{"word": ",", "offset": 21},
{"word": "I", "offset": 24},
{"word": "love", "offset": 27},
{"word": "somewhere", "offset": 30},
{"word": "view", "offset": 36},
{"word": "scenic outlook", "offset": 36},
{"word": ",", "offset": 45},
{"word": "view", "offset": 48},
{"word": "scenic outlook", "offset": 48},
{"word": "on", "offset": 57},
{"word": "sun", "offset": 60},
{"word": "sun rises", "offset": 60},
{"word": ",", "offset": 69},
{"word": "somewhere", "offset": 72},
{"word": "'s", "offset": 78},
{"word": "hire out", "offset": 81},
{"word": "hire a car", "offset": 84},
{"word": "taxi", "offset": 81},
{"word": "stop doing business", "offset": 90}
]
Here word is a segment and offset is the position of that segment in the original text.
Step 2: After the original text is segmented, some words occur repeatedly; for example, "somewhere" occurs at several positions such as offset=6 and offset=72. The segmentation result is therefore processed to merge the repeated words. The merged result is as follows:
{ hire out: [12, 81],
stop doing business: [0, 90],
love: [27],
on: [57],
somewhere: [6, 30, 72],
taxi: [12, 81],
hire a car: [15, 84],
I: [24],
",": [21, 45, 69],
view: [36, 48],
scenic outlook: [36, 48],
sun: [60],
sun rises: [60],
's: [78]
}
To the left of each ":" is the word; to the right is its array of offsets.
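The merge of repeated words amounts to building a map from each word to the list of its offsets. The sketch below shows one way this could be done; the function name `merge_repeats` is hypothetical, and the input is only an excerpt of the segmentation result, using the English glosses as stand-ins for the original words.

```python
from collections import defaultdict


def merge_repeats(segmented):
    """Group the offsets of identical words: word -> list of offsets."""
    index = defaultdict(list)
    for entry in segmented:
        index[entry["word"]].append(entry["offset"])
    return dict(index)


# Excerpt of the segmentation result, in order of appearance.
segmented = [
    {"word": "stop doing business", "offset": 0},
    {"word": "somewhere", "offset": 6},
    {"word": "taxi", "offset": 12},
    {"word": "somewhere", "offset": 30},
    {"word": "somewhere", "offset": 72},
    {"word": "taxi", "offset": 81},
    {"word": "stop doing business", "offset": 90},
]

index = merge_repeats(segmented)
print(index)
```

Because the entries are visited in text order, each offset list comes out sorted, which simplifies the later distance calculation.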
Step 3: Segment the keyword in the same way. The segmentation result is as follows:
["stop doing business", "somewhere", "taxi"]
Offsets need not be recorded for the keyword.
Step 4: Search the segmentation result of the original text using the segmentation result of the keyword. This gives the following result:
[
stop doing business: [0, 90],
somewhere: [6, 30, 72],
taxi: [12, 81]
]
This gives the positions at which the keyword segments occur in the original text. An occurrence rate can be set at this point:
occurrence rate = number of keyword segments that occur in the original text / number of keyword segments. The occurrence rate is therefore a number between 0 and 1 inclusive, and the method proceeds to the next step only when the actual occurrence rate is greater than the preset one. If the preset occurrence rate is 0.75, the occurrence rate in this example is 3/3 = 1; clearly 1 > 0.75, so the calculation proceeds to the next step.
If instead the keyword is "sell cigarette", whose segmentation result is ["sell", "cigarette"], then the occurrence rate of ["sell", "cigarette"] in the given original text is 0/2 = 0, and the search exits.
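The lookup in the fourth step can be sketched as a filter over the merged index. This is a minimal illustration: the function name `lookup` is hypothetical, and the index holds only the entries relevant to the example, keyed by English glosses.

```python
def lookup(index, keyword_segments):
    """Return the offsets of every keyword segment found in the index."""
    return {s: index[s] for s in keyword_segments if s in index}


# Merged index of the original text (excerpt).
index = {
    "stop doing business": [0, 90],
    "somewhere": [6, 30, 72],
    "taxi": [12, 81],
    "love": [27],
}

found = lookup(index, ["stop doing business", "somewhere", "taxi"])
missing = lookup(index, ["sell", "cigarette"])
print(found)
print(missing)
```

The number of keys in the returned map is w, so an empty map corresponds to an occurrence rate of 0 and the search exits.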
Step 5: Compare the offsets in the result of step 4 and calculate the shortest distances.
Because UTF-8 encoding is used, the lengths of the words in the result of step 4 are:
[
stop doing business: 6,
somewhere: 6,
taxi: 9
]
Comparing the offsets gives a first group:
[
stop doing business: 0
somewhere: 6
taxi: 12
]
The shortest distance between neighbouring offsets in this group is 6, which equals the word length, so the condition is met and the keyword is matched at position 0.
A second group is:
[
stop doing business: 90
somewhere: 72
taxi: 81
]
In this group the shortest distance between "taxi" and "stop doing business" is 9, which equals the length of "taxi" and meets the condition. The shortest distance between "somewhere" and "taxi", however, is also 9, whereas an exact match would require 6, the length of "somewhere". To handle this, a maximum allowed difference between two words can be set. If that value is set to 5, then 9-6=3<5, so the condition is met and the keyword is likewise matched at position 72.
The matched results are therefore "stop doing business somewhere taxi" and "somewhere's taxi stops doing business", which achieves fuzzy matching. This effectively solves the problem that traditional algorithms cannot retrieve text that has been obfuscated or had its word order changed.
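The grouping in the fifth step can be sketched as a scan over all keyword occurrences sorted by offset. The patent does not spell out how groups are formed, so the chaining rule here is an assumption: each occurrence must be followed by a different keyword segment with 0 <= k < max_difference, until every segment found in step 4 is covered. The function name `fuzzy_matches` is hypothetical.

```python
def fuzzy_matches(positions, lengths, max_difference):
    """Return the start offsets of passages containing every keyword segment.

    positions: segment -> sorted offsets in the text (result of step 4)
    lengths:   segment -> encoded length of the segment
    """
    occurrences = sorted(
        (offset, segment)
        for segment, offsets in positions.items()
        for offset in offsets
    )
    needed = set(positions)
    matches = []
    for start in range(len(occurrences)):
        prev_offset, prev_segment = occurrences[start]
        seen = {prev_segment}
        for offset, segment in occurrences[start + 1:]:
            if seen == needed:
                break
            # k: distance to the next segment minus the current segment's length
            k = (offset - prev_offset) - lengths[prev_segment]
            if segment in seen or not 0 <= k < max_difference:
                break
            seen.add(segment)
            prev_offset, prev_segment = offset, segment
        if seen == needed:
            matches.append(occurrences[start][0])
    return matches


positions = {
    "stop doing business": [0, 90],
    "somewhere": [6, 30, 72],
    "taxi": [12, 81],
}
lengths = {"stop doing business": 6, "somewhere": 6, "taxi": 9}
print(fuzzy_matches(positions, lengths, 5))
```

With a maximum difference of 5 the sketch reports the two positions found in the embodiment; tightening the tolerance to 1 would keep only the passage in which the segments are directly adjacent.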
With further reference to Fig. 2, the idea of the method is easier to understand. The original text and the keyword are segmented first. The method then checks whether the keyword segments occur in the segmentation result of the original text, that is, it calculates the occurrence rate and compares it with the preset value to decide whether to proceed to the next step of the search. In the next step, it decides whether to display a result by checking whether the difference k, between the distance d separating the positions at which different keyword segments occur in the original text and the corresponding keyword segment's own length, lies within the allowed numerical range.
Although the invention has been described in detail above by way of a general explanation and a specific embodiment, modifications or improvements can be made on the basis of the invention, as will be apparent to those skilled in the art. Such modifications or improvements, made without departing from the spirit of the invention, fall within the scope of protection claimed.

Claims (8)

1. A fuzzy sentence search method based on word segmentation, characterized in that the search method comprises the following steps:
Step 1: Segment the original text within the range to be searched, and record the starting position of each segment in the original text;
Step 2: After segmenting the original text, merge the words that occur repeatedly, and record the starting positions in the original text of each repeated word;
Step 3: Segment the keyword to be retrieved, denoting the number of keyword segments as i and the number of keyword segments that occur at least once in the original text within the range to be searched as w;
Step 4: Calculate the occurrence rate p of the keyword segments in the original text within the range to be searched, p = w/i; if p is greater than the preset value, search the segmentation result of the original text using the segmentation result of the keyword to obtain the positions at which each keyword segment occurs in the original text; if p is less than the preset value, exit the search;
Step 5: Calculate the distance d between the positions at which different keyword segments occur in the original text, and check whether the difference k between d and the corresponding keyword segment's own length lies within the allowed numerical range; if k lies within the allowed range, a fuzzy search result is matched.
2. The fuzzy sentence search method based on word segmentation according to claim 1, characterized in that the starting position is the position of the first character of a segment, and character positions in the original text within the range to be searched are numbered from 0.
3. The fuzzy sentence search method based on word segmentation according to claim 1, characterized in that the original text within the range to be searched and the keyword to be retrieved use the UTF-8 encoding.
4. The fuzzy sentence search method based on word segmentation according to claim 1, characterized in that in step 3 a keyword segment that occurs at least once in the original text is recorded as 1, a keyword segment that does not occur in the original text is recorded as 0, and w is the number of keyword segments that occur at least once.
5. The fuzzy sentence search method based on word segmentation according to claim 1, characterized in that in step 4 the occurrence rate p is between 0 and 1 inclusive; the occurrence rate is the ratio of the number of keyword segments that occur in the original text within the range to be searched to the number of keyword segments; a keyword segment that occurs several times in the original text counts as having occurred and adds 1 to the number of keyword segments that occur; the preset value is between 0 and 1 inclusive, and the closer the preset value is to 1, the higher the occurrence rate of the keyword segments in the original text; and when p is greater than the preset value it is necessary to search the segmentation result of the original text using the segmentation result of the keyword.
6. The fuzzy sentence search method based on word segmentation according to claim 1, characterized in that in step 5, if the difference k between the distance d and the corresponding keyword segment's own length exceeds the allowed numerical range, the result is not displayed; the allowed numerical range is determined from the distance d and the corresponding keyword segment's own length; d being equal to the corresponding keyword segment's own length indicates that the original text shows no obfuscation or change in wording; a numerical difference between d and the corresponding keyword segment's own length indicates that the keyword appears in the original text with obfuscation or changed wording; and the larger the numerical difference, the less likely it is that obfuscation or changed wording is present in the original text.
7. The fuzzy sentence search method based on word segmentation according to claim 1, characterized in that the different keyword segments in step 5 are the distinct segments obtained by segmenting the keyword, and d is the distance between the positions at which the distinct segments occur in the original text.
8. The fuzzy sentence search method based on word segmentation according to claim 1, characterized in that segmentation means splitting a sentence into words and phrases.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710296379.9A CN107145555B (en) 2017-04-28 2017-04-28 Fuzzy sentence search method based on word segmentation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710296379.9A CN107145555B (en) 2017-04-28 2017-04-28 Fuzzy sentence search method based on word segmentation

Publications (2)

Publication Number Publication Date
CN107145555A true CN107145555A (en) 2017-09-08
CN107145555B CN107145555B (en) 2019-08-02

Family

ID=59774339

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710296379.9A Active CN107145555B (en) 2017-04-28 2017-04-28 Fuzzy sentence search method based on word segmentation

Country Status (1)

Country Link
CN (1) CN107145555B (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110377706A (en) * 2019-07-25 2019-10-25 腾讯科技(深圳)有限公司 Search statement method for digging and equipment based on deep learning
CN112069812A (en) * 2020-08-28 2020-12-11 喜大(上海)网络科技有限公司 Word segmentation method, device, equipment and computer storage medium
CN113177061A (en) * 2021-05-25 2021-07-27 马上消费金融股份有限公司 Searching method and device and electronic equipment

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101196898A (en) * 2007-08-21 2008-06-11 新百丽鞋业(深圳)有限公司 Method for applying phrase index technology into internet search engine
CN102314464A (en) * 2010-07-07 2012-01-11 北京亮点时间科技有限公司 Lyrics searching method and lyrics searching engine
US8315998B1 (en) * 2003-04-28 2012-11-20 Verizon Corporate Services Group Inc. Methods and apparatus for focusing search results on the semantic web
CN104376115A (en) * 2014-12-01 2015-02-25 北京奇虎科技有限公司 Fuzzy word determining method and device based on global search
CN105045852A (en) * 2015-07-06 2015-11-11 华东师范大学 Full-text search engine system for teaching resources

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8315998B1 (en) * 2003-04-28 2012-11-20 Verizon Corporate Services Group Inc. Methods and apparatus for focusing search results on the semantic web
CN101196898A (en) * 2007-08-21 2008-06-11 新百丽鞋业(深圳)有限公司 Method for applying phrase index technology into internet search engine
CN102314464A (en) * 2010-07-07 2012-01-11 北京亮点时间科技有限公司 Lyrics searching method and lyrics searching engine
CN104376115A (en) * 2014-12-01 2015-02-25 北京奇虎科技有限公司 Fuzzy word determining method and device based on global search
CN105045852A (en) * 2015-07-06 2015-11-11 华东师范大学 Full-text search engine system for teaching resources

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
周俊: "一种不良文本过滤方法", 《中国优秀硕士学位论文全文数据库 信息科技辑》 *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110377706A (en) * 2019-07-25 2019-10-25 腾讯科技(深圳)有限公司 Search statement method for digging and equipment based on deep learning
CN110377706B (en) * 2019-07-25 2022-10-14 腾讯科技(深圳)有限公司 Search sentence mining method and device based on deep learning
CN112069812A (en) * 2020-08-28 2020-12-11 喜大(上海)网络科技有限公司 Word segmentation method, device, equipment and computer storage medium
CN112069812B (en) * 2020-08-28 2024-05-03 喜大(上海)网络科技有限公司 Word segmentation method, device, equipment and computer storage medium
CN113177061A (en) * 2021-05-25 2021-07-27 马上消费金融股份有限公司 Searching method and device and electronic equipment
CN113177061B (en) * 2021-05-25 2023-05-16 马上消费金融股份有限公司 Searching method and device and electronic equipment

Also Published As

Publication number Publication date
CN107145555B (en) 2019-08-02

Similar Documents

Publication Publication Date Title
CN110874531B (en) Topic analysis method and device and storage medium
JP6721179B2 (en) Causal relationship recognition device and computer program therefor
US5752051A (en) Language-independent method of generating index terms
CN105426360B (en) A kind of keyword abstraction method and device
US7549119B2 (en) Method and system for filtering website content
CN111444330A (en) Method, device and equipment for extracting short text keywords and storage medium
CN107180023A (en) A kind of file classification method and system
CN104978314B (en) Media content recommendations method and device
CN107145555B (en) Fuzzy sentence search method based on word segmentation
DE4232507A1 (en) Identification process for locating and sorting document in different languages - processing information by comparing sequences of characters with those of a reference document
CN107807910A (en) A kind of part-of-speech tagging method based on HMM
US20090157673A1 (en) Conditional string search
CN110929520B (en) Unnamed entity object extraction method and device, electronic equipment and storage medium
CN106951415A (en) A kind of name of firm searching method and device
CN105787121B (en) A kind of microblogging event summary extracting method based on more story lines
CN111259151A (en) Method and device for recognizing mixed text sensitive word variants
CN112100365A (en) Two-stage text summarization method
US20040225497A1 (en) Compressed yet quickly searchable digital textual data format
CN112989414A (en) Mobile service data desensitization rule generation method based on width learning
CN110489559A (en) A kind of file classification method, device and storage medium
CN107341142B (en) Enterprise relation calculation method and system based on keyword extraction and analysis
Fragos et al. Word sense disambiguation using wordnet relations
CN110852059B (en) Document content difference contrast visual analysis method based on grouping
KR20010006632A (en) Information Processing System
EP1412875B1 (en) Method for processing text in a computer and computer

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant