CN101063975A - Method and system for electronic text-processing and searching - Google Patents


Info

Publication number
CN101063975A
Authority
CN
China
Prior art keywords
text
keyword
word segment
adjacent
index
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN 200710087104
Other languages
Chinese (zh)
Inventor
刘二中
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Individual
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Individual
Priority to CN 200710087104
Publication of CN101063975A
Priority to PCT/CN2008/000190 (WO2008098467A1)
Legal status: Pending

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

This invention provides a technique for electronic text processing and retrieval by a computer or search engine. When a user performs a keyword search and faces a very large set of results, the technique lets the user, through the user interface, narrow the results by means of key sentences formed by combining the original keyword with its adjacent word segments, supported by a multi-level subset system and catalog system for distributing the data. The invention helps the user narrow the search range quickly and rigorously, without adding server capacity, and reject irrelevant information or information whose core content is duplicated, so as to obtain the expected query result completely.

Description

Method and system for electronic text processing and retrieval
(1) Technical field
The present invention relates to techniques for processing and retrieving electronic text by a computer or search engine.
(2) Background art
In recent decades, computer database retrieval technology has developed greatly. In particular, progress in network technologies such as the World Wide Web has brought the scale of the databases that people can share to astronomical figures. To help users find the information or files they need, classification or catalog retrieval systems appeared. This approach suits mature classification domains with which people are very familiar, but across a broad field of massive information it is difficult to build and difficult to master and use.
Search engine technology centered on keyword search has brought convenience to users. A search system built around a search engine generally resides on one or more servers or other computer devices. It consists of a text (page) library, a text index library, an index builder that analyzes the text library to produce the text index, a query processor that accepts queries and generates search results, and other parts; it is often accompanied by a data acquisition server that collects texts from the Internet or other information sources and adds them to the text library. Such a system obtains the user's keyword query request through an interactive interface on a client computer and a communication network or line, queries the text index library or text library, analyzes the relevance between the keyword request and the texts, obtains and ranks the relevant results, and returns them to the interactive interface via the communication network or line. Such a search system is very convenient and fast to use, but the total number of index entries in the returned results is still enormous and is difficult to consult one by one.
So that the query results potentially most valuable to the user rank first as far as possible, U.S. Patent No. 6,285,999 proposed ranking search results by analyzing the hyperlink structure of web pages (page-link analysis, known as PageRank). This technique surpassed other ranking techniques, was adopted by Google, and achieved unprecedented success.
However, this and the various other ranking techniques only improve the efficiency of keyword search in a statistical sense; they cannot guarantee that the result each individual user wants will appear near the top of a huge index list. For example, searching for the word "Bu Lin" on the Google Chinese site returns nearly 300,000 index entries. We still cannot be sure of finding the expected content, without omission, among the top positions, so that the search is both rigorous and convenient. Meanwhile, before reaching the desired information, we have no choice but to read through irrelevant entries whose main content repeats again and again.
To solve this problem, people have tried for the past ten years to develop various new search engine techniques, for example: the technique of a "list ranked by priority of importance" in U.S. Patent No. 6421675; the technique of "forming a dynamic object table from the history of the user's query data" in U.S. Patent No. 6256633; the technique of "sharing query information with other users" in Chinese patent CN1151457; and the technique for "measuring the similarity of electronic texts" in U.S. Patent No. 6990628. These techniques have certain advantages, but their effect is very limited.
The technique of U.S. Patent No. 7089236 performs semantic analysis on the keyword submitted by the user and presents the different possible meanings on the interactive interface, helping the user narrow the search range. The similar technique of Chinese Patent Application No. 200510081867.5 disperses a search engine's keyword search results by using web page classification information. The problem with both techniques is that a very complex and huge classification database must first be built, and it can never be fully accurate; having a machine judge which meaning or category of a keyword a given page or text belongs to is very difficult, and the result is not very reliable. The different meanings or categories of a keyword are likely to overlap, and even more likely to leave gaps. If more levels of classification are added, the overlaps cause the required storage space to explode. Moreover, when a user faces an unfamiliar field, it is hard to grasp the many meanings or categories accurately. All of this seriously limits the improvement of search efficiency.
Therefore, there is an urgent need for a keyword search engine technique that is both rigorous and efficient, one that helps the user narrow the range to be consulted effectively, even repeatedly. The boundaries between the different ranges should be clear and easy to judge, with neither overlaps nor gaps, so as to greatly speed up the user's access to the expected result while guaranteeing the rigor of the search. This has remained an unsolved worldwide problem for many years.
(3) Summary of the invention
The purpose of the present invention is to provide a technique for electronic text processing and retrieval by a computer or search engine, so that a user who performs a keyword search and faces a massive set of search results can narrow the search range repeatedly, quickly and rigorously, or reject all kinds of irrelevant or duplicate information, and obtain the desired results accurately with few omissions.
One aspect of the present invention provides a computer-implemented method for processing a plurality of electronic texts containing the same keyword, comprising:
Obtaining a plurality of electronic texts that contain the same keyword; specifying the number of words contained in an adjacent word segment (a word segment adjoining the keyword), or the way an adjacent word segment is cut off; and, for some or all of the texts, according to whether the adjacent word segment or indirectly adjacent word segment of the keyword in each text's content is the same as or different from that of other texts, assigning the text and the other texts to the same or different subsets, or processing them in a correspondingly same or different way;
The correspondingly same or different processing may include: giving the corresponding texts the same or different distribution positions or storage modes; or giving them the same or different subset marks; or giving their indexes the same or different marks or index entries; or giving them the same or different arrangement; or giving them the same or different display modes or positions on the interactive interface; or allowing at least some of the subsets each to contribute one or more adjacent word segments or texts that are combined, ordered, or displayed across subsets on the interactive interface;
The text may be an electronic file, document, or web page, or its summary, index, bibliographic record, or title.
The adjacent word segment or indirectly adjacent word segment may lie before the keyword or after it. It is generally a word segment made up of one or more words or characters, or even word roots, in the text content, and where necessary also includes certain characters such as abbreviation letters or punctuation marks. Where necessary, when judging whether two word segments are the same or different, differences in the prefixes or suffixes of some words, in some function words or non-content words, or in punctuation or spaces may be ignored.
When the keyword used for retrieval consists of several separable words, the adjacent word segment may be the adjacent word segment of one of those words (for example the first word) or of several of them.
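The subdivision described above can be illustrated with a minimal sketch. It assumes whitespace-separated words (Chinese text would first need word segmentation), ignores punctuation handling, and uses only the first occurrence of the keyword; the function names `adjacent_segment` and `partition_by_adjacent_segment` are hypothetical and do not come from the patent.

```python
from collections import defaultdict


def adjacent_segment(text, keyword, n_words=1, side="after"):
    """Return the n_words words adjoining the first occurrence of keyword.

    Returns None when the keyword does not occur. Assumption: words are
    separated by whitespace and matching is case-insensitive.
    """
    words = text.lower().split()
    kw = keyword.lower()
    if kw not in words:
        return None
    i = words.index(kw)
    if side == "after":
        seg = words[i + 1 : i + 1 + n_words]
    else:  # segment in front of the keyword
        seg = words[max(0, i - n_words) : i]
    return " ".join(seg)


def partition_by_adjacent_segment(texts, keyword, n_words=1, side="after"):
    """Group text ids into subsets keyed by their adjacent word segment."""
    subsets = defaultdict(list)
    for doc_id, text in texts.items():
        seg = adjacent_segment(text, keyword, n_words, side)
        if seg is not None:
            subsets[seg].append(doc_id)
    return dict(subsets)
```

With three texts "boolean algebra basics", "intro to boolean algebra", and "boolean search operators", partitioning on the keyword "boolean" yields one subset for "algebra" (two texts) and one for "search" (one text). Because every text has exactly one adjacent segment of the specified length, the subsets neither overlap nor leave gaps.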
The processing method may further comprise: for different texts that belong to one or more of the same first-level or higher-level subsets, or whose content contains the same keyword and the same adjacent word segment, according to whether the further adjacent word segments that adjoin that keyword and adjacent word segment are the same or different, assigning some or all of those texts to the same or different subsets one or more levels below the above subset, or processing them in a correspondingly same or different way.
The correspondingly same or different processing here may likewise include: giving the corresponding texts the same or different distribution positions or storage modes; or giving them the same or different subset marks; or giving their indexes the same or different marks or index entries; or giving them the same or different arrangement; or giving them the same or different display modes or positions on the interactive interface; or allowing at least some of the subsets each to contribute one or more adjacent word segments or texts that are combined, ordered, or displayed across subsets on the interactive interface.
The processing method allows successive adjacent word segments to be merged or split, so as to reduce or increase the number of subset levels.
The processing method may further comprise:
Arranging a parallel catalog, a catalog of one or more levels, a tree catalog, or a sequence with hierarchical relationships that reflects the different adjacent word segments or indirectly adjacent word segments of the same keyword in the texts, or the statements, example sentences, or summary examples that contain those word segments. The catalog may contain the identical adjacent word segment or indirectly adjacent word segment of each of one or more different subsets of the texts, or a statement, example sentence, or summary example containing that word segment; it may also contain the identical adjacent or indirectly adjacent word segments of each of the subsets one or more levels below those subsets, or statements, example sentences, or summary examples containing them, arranged, distributed, stored, or displayed in parallel or hierarchical order. The word segments, statements, example sentences, or summary examples may be listed side by side across subsets.
The processing method may further comprise:
If an adjacent word segment or indirectly adjacent word segment of the keyword in the catalog, tree catalog, or sequence has only one kind of adjacent word segment at the next level or at further levels, that word segment may be distributed, stored, or displayed at its original position together with its next-level or further-level adjacent word segment.
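One way to picture the tree catalog, and the rule that a segment with only one kind of next-level segment is shown together with it, is the sketch below. It assumes one word per level, whitespace-separated words, and a nested-dictionary node layout (`ids`, `children`); the names `build_catalog` and `collapse` are illustrative assumptions, not terms from the patent.

```python
def build_catalog(texts, keyword, depth=3):
    """Build a tree whose level-k nodes are the k-th word after the keyword."""
    root = {"ids": [], "children": {}}
    for doc_id, text in texts.items():
        words = text.lower().split()
        if keyword not in words:
            continue
        i = words.index(keyword)
        node = root
        for w in words[i + 1 : i + 1 + depth]:
            node = node["children"].setdefault(w, {"ids": [], "children": {}})
        node["ids"].append(doc_id)  # the text ends (or is cut off) at this node
    return root


def collapse(node):
    """Merge a segment with its only child segment when no text stops at it."""
    merged = {}
    for seg, child in node["children"].items():
        child = collapse(child)
        while len(child["children"]) == 1 and not child["ids"]:
            ((next_seg, next_child),) = child["children"].items()
            seg = f"{seg} {next_seg}"
            child = next_child
        merged[seg] = child
    return {"ids": node["ids"], "children": merged}
```

For the texts "boolean algebra for beginners", "boolean algebra for engineers", and "boolean search", the collapsed catalog under "boolean" has two first-level entries, "algebra for" and "search"; the entry "algebra for" then splits into "beginners" and "engineers".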
The processing method may further comprise:
In or near the above texts, catalogs, statements, example sentences, or summary examples, or the keywords, adjacent word segments, or indirectly adjacent word segments they contain, there may be a prompt giving the corresponding number of parallel subsets or lower-level subsets, or the number of parallel subsets of the subset in which the relevant word or word segment lies, or the number of lower-level subsets or texts it contains.
The processing method may also comprise:
In the processing method or catalog, the specific sorting position of parallel subsets, parallel adjacent or indirectly adjacent word segments, texts, or parallel statements, example sentences, or summary examples may depend partly or wholly on one or more of the following factors:
The page-link value (PageRank), the click rate, or the keyword occurrence rate of the text, or of the text containing the word segment, statement, example sentence, or summary example,
Or the number of lower-level subsets or lower-level texts of the subset, the click rate of the subset, or the mean page-link value of the texts in the subset,
Or the number of lower-level subsets or lower-level texts, the click rate, or the mean page-link value of the texts, of the subset containing the word segment, text, statement, example sentence, or summary example,
Or the page-link value of the text in the subset with the highest page-link value, or of another example text,
Or the click rate or keyword occurrence rate of the text in the subset with the highest click rate or keyword occurrence rate, or of another example text,
Or the ranking of the relevant text, or of relevant texts in the relevant subset, in the search results of other search websites or retrieval systems,
Or the payment or bid made by a sponsor for the relevant text,
Or the alphabetical order or stroke order of the spelling or pinyin of the words in the relevant adjacent word segment,
Or the rating of the text's source website, organization, or person,
Or the order or recency in which the relevant text was included,
Or whether the items belong to the same subset at a certain level.
Where necessary, the specific sorting position described above may be determined by the value of an objective function that depends on one or more variables, some or all of which represent one or more of the factors listed above.
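As a hedged illustration of ordering by an objective function, the sketch below combines three of the listed factors in a weighted sum. The choice of factors, the linear form, the weights in `WEIGHTS`, and the names `SubsetStats`, `objective`, and `rank_subsets` are all assumptions made for the example; the patent does not specify a particular function.

```python
from dataclasses import dataclass


@dataclass
class SubsetStats:
    segment: str            # adjacent word segment that defines the subset
    text_count: int         # number of texts in the subset
    click_rate: float       # assumed to be normalized to 0..1
    mean_link_score: float  # mean page-link value, assumed normalized to 0..1


# Hypothetical weights; any other combination of factors could be used.
WEIGHTS = {"text_count": 0.2, "click_rate": 0.5, "mean_link_score": 0.3}


def objective(s, max_count):
    """Weighted sum of the normalized factors for one subset."""
    return (
        WEIGHTS["text_count"] * s.text_count / max_count
        + WEIGHTS["click_rate"] * s.click_rate
        + WEIGHTS["mean_link_score"] * s.mean_link_score
    )


def rank_subsets(subsets):
    """Return the subsets ordered by descending objective value."""
    max_count = max((s.text_count for s in subsets), default=0) or 1
    return sorted(subsets, key=lambda s: objective(s, max_count), reverse=True)
```

With these weights, a subset of 50 texts with click rate 0.6 and mean link score 0.5 scores 0.55 and ranks ahead of a subset of 100 texts with click rate 0.1 and mean link score 0.2, which scores 0.31.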
The processing method may further comprise:
Allowing additional keywords that must or must not be present, or restrictions on time, region, language, or other types or ranges, to be added to the result of the above processing, so as to obtain a further refined result.
This makes it easier to narrow the search range.
Another aspect of the present invention is a computer data system comprising a storage device, characterized in that the data of some or all of the keyword indexes, text summaries, or texts contained in the storage device, or in a data portion of it, are distributed as follows:
The data of indexes, text summaries, or texts whose text summary or text contains the same keyword, with adjacent word segments of that keyword that are the same or different, are located in the distribution regions of the same or different subsets of the same keyword set;
Where necessary, index data that lie in the same subset, whose text summary or text contains the same keyword and adjacent word segment forming the same expanded key sentence, and whose adjacent word segments of that sentence are the same or different, are located in the distribution regions of the same or different subsets one or more levels below that subset.
The computer data system may be a search engine system, which can thus more easily query, process, or provide to the user the data of the subset, and of the subsets one or more levels below it, related to the queried keyword and its adjacent word segment.
The present invention may include a computer-readable medium on which the data of keyword indexes, text summaries, or texts are distributed in the above manner.
Another aspect of the present invention is another computer data system comprising a storage device, characterized in that the data structure of some or all of the keyword indexes contained in the storage device, or in a data portion of it, comprises at least:
A keyword field;
One or more adjacent word segment fields, formed by mapping, in their original order, a predetermined number of successively adjacent word segments at each level of the keyword in the corresponding text content or text summary, namely: adjacent word segment 1, adjacent word segment 2, ... adjacent word segment N;
An ID field of the corresponding text;
Where necessary, a summary field or title field of the corresponding text that contains the keyword;
The system may allow searching, or searching in a mapped manner, for the corresponding index or content according to a specified combination of the keyword field with one or more of the adjacent word segment fields, with the number of combined segments increased or decreased as required.
Obviously, the minimum value of N is 1, in which case the index has only one adjacent word segment.
The above ID may be a database text address, a web page address, or another address.
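The index structure described above might be sketched as a small record type. This is an assumption-laden illustration: one word per adjacent word segment, whitespace-separated words, an in-memory list instead of a real index store, and hypothetical names `IndexEntry`, `make_entry`, and `lookup`.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class IndexEntry:
    keyword: str
    segments: Tuple[str, ...]      # adjacent word segment 1..N, original order
    text_id: str                   # database address, URL, or other address
    summary: Optional[str] = None  # optional summary or title field


def make_entry(text_id, text, keyword, n_segments=2, summary=None):
    """Build one index entry; raises ValueError if the keyword is absent."""
    words = text.lower().split()
    i = words.index(keyword)
    segments = tuple(words[i + 1 : i + 1 + n_segments])
    return IndexEntry(keyword, segments, text_id, summary)


def lookup(entries, keyword, prefix=()):
    """Find entries matching the keyword plus the first len(prefix) segments."""
    prefix = tuple(prefix)
    return [
        e for e in entries
        if e.keyword == keyword and e.segments[: len(prefix)] == prefix
    ]
```

Lengthening or shortening `prefix` corresponds to increasing or decreasing the number of adjacent word segments combined with the keyword: an empty prefix returns every entry for the keyword, and each added segment narrows the result.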
The computer data system may also be a search engine system.
The present invention may include a computer-readable medium having the data structure of the above keyword index.
Another aspect of the present invention provides a search method by which a search engine supplies the results a user desires. The search engine system responds to a keyword query request submitted by the user via an interactive interface, searches the relevant information sources or the system's database, and provides the texts, text summaries, indexes, or related information that satisfy the keyword request. The method is characterized in that it comprises:
The system receives the user's keyword query request via the interactive interface;
After confirmation, the system queries a database containing keyword indexes according to the keyword request;
In advance or at query time, the system takes the keyword as it appears in the text content or text summary containing it, together with its adjacent word segment, as a key sentence;
The number of words or characters contained in the adjacent word segment, or the way the segment is cut off, or its specific content, is predetermined by the system, or agreed to, accepted by default, or selected by the user, or is determined by the user in a selection bar presented on the interactive interface, or by the position and manner of a cursor indication made on a page containing the text summary or text of a specific index entry or related content;
In advance or at query time, the system collates the distinct adjacent word segments, or the distinct key sentences, from the above adjacent word segments or key sentences;
The system generates search results according to the key sentences obtained in advance or at query time; that is, the different indexes, text summaries, or texts containing the same or different key sentences are retrieved, processed, arranged, or ordered for the user to select via the interactive interface.
The search method may further comprise:
In advance or at query time, taking the key sentence as it appears in the text content or text summary containing it, together with its adjacent word segment, or the original key sentence together with its adjacent word segment, as an expanded key sentence;
The number of words or characters contained in the adjacent word segment, or the way the segment is cut off, or its specific content, is predetermined by the system, or agreed to, accepted by default, or selected by the user, or is determined by the user in a selection bar presented on the interactive interface, or by the position and manner of a cursor indication made on a page containing the text summary or text of a specific index entry or related content;
In advance or at query time, collating the distinct adjacent word segments, or the distinct expanded key sentences, from the above adjacent word segments or key sentences;
Generating search results according to the key sentences obtained, in advance or at query time, by one or more expansions; that is, the different indexes, text summaries, or texts containing the same or different expanded key sentences are retrieved, processed, arranged, or ordered for the user to select via the interactive interface.
That is, with the technique of the present invention, the many files that contain the same keyword can be automatically divided into several subsets according to differences in their key sentences or adjacent word segments, for the user to choose from.
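The expansion step of the search method can be sketched as follows, assuming each expansion appends exactly one following word, words are whitespace-separated, and only the first occurrence of the key sentence in each text is used. The helper names `find_phrase` and `expand` are hypothetical.

```python
def find_phrase(words, phrase):
    """Return the index of the first occurrence of phrase in words, or -1."""
    n = len(phrase)
    for i in range(len(words) - n + 1):
        if words[i : i + n] == phrase:
            return i
    return -1


def expand(texts, key_sentence):
    """Map each distinct expanded key sentence to the ids of texts holding it.

    Texts in which the key sentence is absent, or is the last thing in the
    text (so there is no following word), are skipped in this sketch.
    """
    phrase = key_sentence.lower().split()
    result = {}
    for doc_id, text in texts.items():
        words = text.lower().split()
        i = find_phrase(words, phrase)
        if i < 0 or i + len(phrase) >= len(words):
            continue
        expanded = " ".join(words[i : i + len(phrase) + 1])
        result.setdefault(expanded, []).append(doc_id)
    return result
```

Calling `expand` on the keyword gives the first-level key sentences; calling it again on a key sentence the user selects gives the next level, so the result set is subdivided once per selection.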
The processing method, data system, or search method can be used in Internet search engine systems, and also in local or standalone computer information retrieval systems, such as digital library systems and digital search systems for document archives.
As the keyword or key sentence is expanded again and again, the huge preliminary result of the original keyword search is subdivided again and again into a system of subsets, and through the user's selections the required result can be reached quickly.
Where necessary, the cut-off position of the adjacent word segment may also be determined by a symbol, character, word, font, color, or space at or near the end of the segment.
In special cases, the number of words in each adjacent word segment may simply be fixed, for example at one.
Where necessary, the search method may further comprise a grouping operation:
That is, the various key sentences, adjacent word segments, indexes, text summaries, or texts containing the same keyword, or the various expanded key sentences, adjacent word segments, indexes, text summaries, or texts containing the same original key sentence, may be grouped and arranged or displayed in the form of a catalog or sequence, in which only one or a few key sentences, indexes, text summaries, or texts are taken for each adjacent word segment.
Such a grouping is unique or representative to a certain degree, so the user can make a selection after reading only a small amount of information on the interactive interface.
When the selected key sentence reaches a certain length, the core content of the resulting grouped sequence of indexes or summaries will essentially contain neither repetitions nor omissions.
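A minimal sketch of the grouping operation is shown below. It assumes the subsets have already been formed (one list of text ids per key sentence) and that some per-text score is available for choosing representatives; the function name `group_representatives` and the `score` mapping are assumptions for illustration.

```python
def group_representatives(subsets, score, per_group=1):
    """Keep only the top-scoring text ids for each key sentence.

    subsets: {key_sentence: [doc_id, ...]}
    score:   {doc_id: float}; texts without a score are treated as 0.0
    """
    grouped = {}
    for key_sentence, ids in subsets.items():
        ranked = sorted(ids, key=lambda d: score.get(d, 0.0), reverse=True)
        grouped[key_sentence] = ranked[:per_group]
    return grouped
```

The interactive interface would then list one line per key sentence rather than one line per text, which is what lets the user choose after reading only a small amount of information.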
The search method may further comprise:
Storing the data of some or all of the keyword indexes, text summaries, or texts in the same or different subset regions, or in the same or different lower-level subset regions, according to whether the keywords, key sentences, or expanded key sentences they contain are the same or different;
At keyword query time, directly extracting or providing the data of the corresponding key sentences, keyword indexes, text summaries, or texts.
The search method may also comprise:
Analyzing the texts or summaries in the database, or the texts obtained from the Internet or other information sources by the data acquisition server attached to the search engine, to produce and store for each text an index comprising at least a keyword field, adjacent word segment fields, and a text ID field, and where necessary a text summary or title;
So that at search time the corresponding indexes, summaries, or texts can be retrieved from storage and provided according to the keyword field and adjacent word segment fields they contain.
The text ID field refers to the text address, which may be a database address, a URL, a form representing a hash of the URL domain, or another form, and which can provide a link for accessing or opening the text.
The search method may further comprise:
Arranging a tree catalog (Fig. 5) that reflects the hierarchical or parallel relationships among the adjacent word segments at different levels of the keyword in the texts or text summaries that share the same keyword, or a tree catalog that reflects the hierarchical or parallel relationships among the key sentences expanded from the keyword at different levels, for use at query time.
The search method may also comprise a selection operation:
That is, the system determines the corresponding key sentence from the user's cursor indication of a certain word, or of the end of an adjacent word segment, in a key sentence of the text or text summary on the page of the interactive interface or in the adjacent word segment catalog, or from the user's input in a selection bar or box. The system then performs the grouping operation or a catalog display on the various expanded key sentences, or expanded adjacent word segments, indexes, text summaries, or texts corresponding to that key sentence; or displays the corresponding indexes, text summaries, or texts in ranked order; or performs a removal operation, rejecting or repositioning the entries, indexes, text summaries, or texts on that page or on other pages that contain the determined key sentence.
The search method may further comprise an ignore operation:
That is, from the pages the user has viewed, or the user's operations on those pages, while browsing on the interactive interface a sequence of indexes, text summaries, or texts that contain the original keyword or original key sentence, the system judges the user's current position in the sequence. If it can be determined that the indexes, text summaries, or texts containing a certain key sentence, or the key sentence itself, within a certain range before that position have never been opened or followed, or have gone unopened a certain number of times in a row, and have not otherwise been clicked, attended to, or marked for retention, a removal operation is performed: the entries, indexes, text summaries, or texts on that page or on other pages that contain that key sentence are rejected or repositioned.
With this method, information on files similar to those the user has long ignored while reading is moved back or removed from the rest of the sequence, reducing the nuisance of excessive useless information.
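The ignore operation could be approximated as in the sketch below. It makes several assumptions that the patent leaves open: the browsing position is a simple list index, "ignored" means a key sentence was skipped at least `threshold` times with no click on any of its entries, and ignored entries are moved to the end rather than removed. The name `apply_ignore` is hypothetical.

```python
def apply_ignore(results, position, clicked, threshold=3):
    """Move not-yet-seen entries of ignored key sentences to the end.

    results:  list of (doc_id, key_sentence) in display order
    position: number of entries the user has already browsed past
    clicked:  set of doc_ids the user opened
    """
    skipped = {}
    clicked_sentences = set()
    for doc_id, key_sentence in results[:position]:
        if doc_id in clicked:
            clicked_sentences.add(key_sentence)
        else:
            skipped[key_sentence] = skipped.get(key_sentence, 0) + 1
    ignored = {
        ks for ks, count in skipped.items()
        if count >= threshold and ks not in clicked_sentences
    }
    head, tail = results[:position], results[position:]
    kept = [r for r in tail if r[1] not in ignored]
    moved = [r for r in tail if r[1] in ignored]
    return head + kept + moved
```

Entries the user has already passed are left where they are; only the remainder of the sequence is reordered, so the page the user is reading does not shift underneath them.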
Another aspect of the present invention provides a search engine system that responds to a request made by a user 1 via an interactive interface 2 and provides the desired search results, comprising:
A server 5, coupled via a communication network 4 or line to the client computer 3 on which the interactive interface 2 resides;
A search engine 8 located on the server 5, comprising a database 9 that contains keyword indexes, and a query processor 11 that can query the database 9 according to the keyword request submitted by the user and supply the resulting list of relevant data to the interactive interface 2;
The system is characterized in that:
The database 9 also contains text summaries, bibliographic records, or texts that contain the keywords; the text summaries or bibliographic records may be included in the keyword index;
The query processor 11 or search engine 8 comprises a keyword expansion component 10, which can perform an expansion operation on the query keyword: taking the keyword as it appears in the text content or text summary containing it, together with its adjacent word segment, as the different expanded key sentences, and offering a list of these, or a list of the keyword's different adjacent word segments, for the user to select via the interactive interface; or retrieving, processing, arranging, or ordering the different indexes, text summaries, or texts that contain the same or different expanded key sentences, for the user to select via the interactive interface;
The keyword expansion component 10 can likewise perform one or more expansion operations on a key sentence, and carry out the corresponding queries or processing.
The search engine system described above may be an Internet search system serving Internet users, or an independent computerized information-retrieval system. The server 5 is a computer storage and processing device; it may be a single machine, or several machines grouped together or distributed. The client computer 3 may be a PC, a workstation, or another computer device, and may be equipped with a suitable browser when needed.
The search engine system may further allow the search engine to comprise an index constructor 13, which analyzes the texts in the database, or the texts that a data acquisition unit 12 attached to the search engine obtains from the Internet 4 or other information sources, and produces and stores for each text an index comprising at least a keyword segment, adjacent word segments, and a text ID segment.
When needed, the number of words in each adjacent word segment can simply be fixed here, for example at one word.
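The index record described above can be illustrated with a short sketch. It assumes whitespace tokenization and a single-word keyword, and the names `IndexEntry` and `build_entry` are hypothetical; the patent does not prescribe a concrete layout or tokenizer.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class IndexEntry:
    keyword: str          # keyword segment
    adjacent: List[str]   # adjacent word segments, level 1..N, in text order
    text_id: int          # text ID segment


def build_entry(text_id: int, text: str, keyword: str,
                levels: int = 3, seg_len: int = 1) -> Optional[IndexEntry]:
    """Build one index entry from the first occurrence of `keyword` in `text`.

    Assumption: words are separated by whitespace and each adjacent word
    segment holds `seg_len` words taken from the text after the keyword.
    """
    words = text.split()
    if keyword not in words:
        return None
    pos = words.index(keyword) + 1
    segments = []
    for _ in range(levels):
        chunk = words[pos:pos + seg_len]
        if not chunk:          # ran out of text before reaching `levels`
            break
        segments.append(" ".join(chunk))
        pos += seg_len
    return IndexEntry(keyword, segments, text_id)
```

With `seg_len=1` this corresponds to the simple case mentioned above, in which every adjacent word segment is exactly one word long.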
The search engine system may further comprise a tree directory of adjacent word segments (Fig. 5) that reflects the precedence relationship among the adjacent word segments at different levels of the same keyword in the texts or text summaries containing it, or a tree directory that reflects the precedence relationship among the key sentences obtained by expanding the keyword to different levels.
The present invention may also comprise a computer-readable medium holding such a tree directory of adjacent word segments, reflecting the precedence relationship among the adjacent word segments at different levels of the same keyword in the texts or text summaries containing it, or holding a tree directory that reflects the precedence relationship among the key sentences obtained by expanding the keyword to different levels.
At the position of each word segment in the tree directory of adjacent word segments, or of each key sentence in the tree directory of key sentences, the number of subsets or the number of files contained beneath it may also be shown.
The search engine system may also comprise a graphical user interface (Fig. 7) that allows the inquirer to add supplementary query information. The interface may include a dialog box or selection box 51, or text, symbols, or graphics that the cursor can click to designate a key sentence, adjacent word segment, statement, paragraph, operating instruction, or option, in order to receive the inquirer's choice of operating mode or pattern.
The search technique of the present invention, built around the keyword and its adjacent words, partitions and progressively narrows the result range of a given keyword with dictionary-like rigor and with a convenience clearly exceeding the prior art, and answers a long-standing need of information-search users.
(4) Description of the drawings
Figure 1 is a structural block diagram of an embodiment of the search system of the present invention.
Figure 2 is a schematic diagram of key sentence generation in one embodiment of the present invention.
Figure 3 is a schematic diagram of another key sentence generation mode in an embodiment of the present invention.
Figure 4 is an example flow chart of the user's operations at the interactive interface in one embodiment of the present invention.
Figure 5 is a schematic diagram of a tree directory of adjacent word segments, as presented by one embodiment of the present invention, reflecting the precedence relationship among the adjacent word segments at different levels of a keyword.
Figure 6 is a workflow diagram of the search engine of one embodiment of the present invention.
Figure 7 is a view of part of the screen during a search in one embodiment of the present invention, in which a cursor click (selection operation) generates the displayed result.
(5) Embodiments
The following description, given with reference to the drawings, builds on the "Summary of the invention" above.
Embodiment A, shown in Fig. 1, is an example of a computer data system that can carry out the computer electronic-text processing method of the present invention: an Internet search engine system that can provide expanded key sentence search. It comprises a search engine 8 on a server 5 having a memory 6 and a processor 7; the search engine 8 is connected through the communication network 4 of the Internet to a client computer 3 having an interactive interface 2; the search engine 8 has a database 9, a requestor 11, and a keyword expansion component 10 or module, and is connected to a data acquisition unit 12 and an index constructor 13.
The data acquisition unit 12 collects texts from the Internet or other information sources and adds them to the text library of the database 9; the index constructor 13 analyzes the texts of the text library to obtain text indexes and supplies them to the keyword index library of the database 9.
Each index that the index constructor 13 derives from its analysis of a text comprises a keyword segment, 6 adjacent word segments, the ID segment of the corresponding text, a text title segment, and a text summary segment. The indexes of the keyword index library are distributed in multi-level subsets according to whether their adjacent word segments at each level are the same or different, to facilitate retrieval or extraction. The corresponding adjacent word segment directory, tree directory of adjacent word segments (Fig. 5), and key sentence directory are also stored in advance.
The client application on the client computer 3 of embodiment A, a browser (Microsoft Internet Explorer), allows user 1 to retrieve HTML documents (including Web forms) from the server 5 over the communication network 4. The interactive interface (UI) 2 on the client computer 3 allows user 1 to interact with the retrieved Web form using a monitor, keyboard, or mouse: to submit search requests, make selections, and receive search results.
An important question in the search method of the present invention is how the adjacent word segments are selected (or how the keyword is combined with them), that is, how key sentences are generated. In the example of embodiment A shown in Fig. 2, the key sentence grows backward from keyword 21 in the text summary, adding one adjacent word segment (a single word in this example) at a time. Here 22 is the level-1 key sentence, 23 the level-2 key sentence, 24 the level-3 key sentence, and 25 the level-4 key sentence.
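The backward growth of Fig. 2 can be sketched as follows. This is an illustration only: it assumes the summary is already split into a list of words, and `key_sentences_backward` is a hypothetical name.

```python
from typing import List


def key_sentences_backward(words: List[str], keyword: str,
                           max_level: int = 4, seg_len: int = 1) -> List[str]:
    """Return the level-1..max_level key sentences for the first occurrence
    of `keyword`, each level appending one more adjacent word segment of
    `seg_len` words taken from the text after the keyword."""
    i = words.index(keyword)
    result = []
    for level in range(1, max_level + 1):
        end = i + 1 + level * seg_len
        if end > len(words):   # not enough text left for a full segment
            break
        result.append(" ".join(words[i:end]))
    return result
```

For the word list `["a", "b", "c", "d", "e", "f"]` and keyword `"b"`, the four levels are `"b c"`, `"b c d"`, `"b c d e"`, and `"b c d e f"`, mirroring items 22 to 25 of the figure.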
Figure 3 shows another key sentence generation mode, that of Embodiment B. Its level-1 adjacent word segment lies before keyword 21, while the level-2 and subsequent adjacent word segments lie after the keyword. Here 22 is the level-1 key sentence, 23 the level-2 key sentence, 24 the level-3 key sentence, and 25 the level-4 key sentence. This mode, which takes both sides of the keyword into account, seems better suited to searching Western-language files. The length (number of words) of the adjacent word segment at each level may also be predetermined, selected by the inquirer at search time, or set by system default.
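A minimal sketch of the Fig. 3 mode, under the assumptions that every adjacent word segment is one word long and that the text is already tokenized; the function name is hypothetical.

```python
from typing import List


def key_sentences_front_then_back(words: List[str], keyword: str,
                                  max_level: int = 4) -> List[str]:
    """Level 1 takes the word before the keyword; levels 2 and up take
    successive words after it. Assumes one word per adjacent segment."""
    i = words.index(keyword)
    start, end = i, i + 1          # current key sentence is words[start:end]
    result = []
    for level in range(1, max_level + 1):
        if level == 1:
            if start == 0:         # nothing precedes the keyword
                break
            start -= 1
        else:
            if end >= len(words):  # nothing follows the current key sentence
                break
            end += 1
        result.append(" ".join(words[start:end]))
    return result
```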
In other, more extreme embodiments, the key sentences at each level may also be formed by repeatedly expanding from the keyword toward the adjacent word segments in front of it.
For searches whose keyword consists of several separable words, one word should be chosen as the core keyword, and the key sentences at each level are formed by combining it with its adjacent word segments; all of these key sentences also carry the remaining, separable keywords. Alternatively, adjacent word segments may be added one by one, in a desired order, next to each word or word segment of the multi-word keyword to form the key sentences at each level.
When counting the words of a word segment, the system of embodiment A may choose not to count function words, measure words, punctuation, spaces, and the like, merging them into the adjacent content word. Corresponding specific rules may likewise be laid down for Western languages.
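One way to count only content words when cutting out a word segment is sketched below. The function-word list is purely illustrative, and attaching uncounted tokens to the content word that follows them is an assumption; the text above leaves the merge direction open.

```python
from typing import List, Set, Tuple

# Illustrative stop list only; a real system would use a language-specific one.
FUNCTION_WORDS: Set[str] = {"of", "the", "a", "an", "and", ",", "."}


def take_segment(tokens: List[str], start: int, n_content: int = 1,
                 function_words: Set[str] = FUNCTION_WORDS
                 ) -> Tuple[List[str], int]:
    """Consume tokens from `start` until `n_content` content words have been
    taken. Function words and punctuation are carried along without being
    counted. Returns (segment_tokens, index_of_next_unconsumed_token)."""
    taken: List[str] = []
    count = 0
    pos = start
    while pos < len(tokens) and count < n_content:
        tok = tokens[pos]
        taken.append(tok)
        if tok.lower() not in function_words:
            count += 1
        pos += 1
    return taken, pos
```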
In embodiment A, the requestor 11 validates the query request of user 1, queries the database 9 according to the keyword proposed, and prepares the resulting list of related data for the interactive interface. The keyword expansion component 10, as a supplement to the requestor 11, temporarily stores or processes, when needed, the key sentences at each level corresponding to the keyword, the corresponding example sentences, the tree directory of adjacent word segments (see Fig. 5), and so on, to meet the needs of subsequent search or display. If these contents do not yet exist in the database 9 or in the keyword expansion component 10, the keyword expansion component 10 builds them from the keyword query data of the requestor 11.
In fact, this is easy to achieve, and various methods can be used. For example, whether in advance or afterwards, and for a possible keyword or one actually proposed, the keyword expansion component 10 of embodiment A, or a computer or other search system, can take any one (for example the first) index or file from the sequence of indexes or files containing the keyword, inspect the word or phrase adjacent to the keyword, that is, the adjacent word segment (of predetermined length), and store keyword and segment as the first key sentence. It then takes a second index or file and checks whether the word or phrase adjacent to its keyword is the same as in the first: if different, it is stored in turn; if the same, it is discarded. It then checks the third index or file against the first two, and so on, until a group of mutually different key sentences is obtained. If, during this comparison, the indexes or files containing the same key sentence are also grouped together, the subsets are formed along the way; otherwise, the requestor 11 retrieves the index or file sequence once per key sentence to obtain each corresponding subset.
If the same method is applied within the sequence of each subset of indexes or files to find the various level-2 adjacent word segments, the various level-2 key sentences and the corresponding lower-level subsets are obtained, and so on.
If one (for example the first) or several indexes or summaries are selected from each resulting subset as example sentences, the required directory and example sentence sequence are obtained, which completes the grouping operation.
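The compare-and-group procedure can be sketched as a single pass that groups entries by their adjacent word segment at a given level and then recurses into each subset. The sketch assumes each entry is a `(text_id, segments)` pair whose segments were already extracted; entries that run out of segments are grouped under the key `None`, which is a choice made here and not something the patent specifies.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple

Entry = Tuple[int, List[str]]   # (text ID, adjacent word segments level 1..N)


def group_by_level(entries: List[Entry], level: int
                   ) -> Dict[Optional[str], List[Entry]]:
    """Group entries by their adjacent word segment at `level` (1-based).
    Entries sharing a segment fall into the same subset."""
    subsets: Dict[Optional[str], List[Entry]] = defaultdict(list)
    for text_id, segs in entries:
        key = segs[level - 1] if len(segs) >= level else None
        subsets[key].append((text_id, segs))
    return dict(subsets)


def build_tree(entries: List[Entry], level: int = 1, max_level: int = 2):
    """Recursively split each subset by the next-level segment. Leaves are
    lists of text IDs; inner nodes are dicts keyed by adjacent word segment."""
    if level > max_level:
        return [text_id for text_id, _ in entries]
    return {seg: build_tree(members, level + 1, max_level)
            for seg, members in group_by_level(entries, level).items()}
```

The nested dictionary returned by `build_tree` has the same shape as a tree directory of adjacent word segments: each key is one segment, and the number of text IDs under a node gives the file count that could be shown next to it.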
In this embodiment, the directory and the example sentence sequence are ordered by the value of an objective function. For each entry, this value is that of the text with the largest objective function value in the entry's subset, and it equals the sum of that text's page link value and its recent click rate. The example sentence may be quoted from the text with the largest objective function value in the subset.
For texts that carry advertising content, the objective function value may equal the corresponding bid.
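The ordering rule can be illustrated as follows. The `Text` record and its field names are hypothetical, and treating a text as an advertisement whenever `bid` is set is an assumption made for the sketch.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Text:
    text_id: int
    link_value: float        # page link value of the text
    recent_clicks: float     # recent click rate of the text
    summary: str
    bid: Optional[float] = None   # set only for advertising texts (assumption)


def score(t: Text) -> float:
    """Objective function: the bid for advertising texts, otherwise
    page link value plus recent click rate."""
    return t.bid if t.bid is not None else t.link_value + t.recent_clicks


def rank_subsets(subsets: Dict[str, List[Text]]) -> List[Tuple[str, float, str]]:
    """For each key sentence, take the best-scoring text of its subset as the
    representative; return (key sentence, score, example summary) rows
    sorted by descending score."""
    rows = []
    for key_sentence, texts in subsets.items():
        best = max(texts, key=score)
        rows.append((key_sentence, score(best), best.summary))
    rows.sort(key=lambda r: r[1], reverse=True)
    return rows
```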
The subset-distributed keyword index of embodiment A takes up no more storage than other existing keyword index libraries, which is one of its notable advantages.
In Embodiment B, the keyword index library is not distributed in subsets. Because its index data structure contains a keyword item and several adjacent word segment items, the requestor 11 can, from the key sentence formed by the keyword segment and one or more adjacent word segments, directly retrieve and display the indexes that belong to each respective subset. Embodiment B only needs to compile the tree directory of adjacent word segments or key sentences, and may even leave the original conventional keyword index database unchanged.
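A sketch of the Embodiment B lookup, assuming a flat list of index records that each carry a keyword field and an ordered list of adjacent word segments. The dictionary keys and the linear scan are illustrative; a real index would use a proper lookup structure.

```python
from typing import Dict, List, Sequence


def search_flat(index: List[Dict], keyword: str,
                segments: Sequence[str]) -> List[int]:
    """Return the IDs of texts whose index record has the given keyword and
    whose leading adjacent word segments equal `segments`, i.e. the members
    of the subset named by that key sentence."""
    n = len(segments)
    return [e["text_id"] for e in index
            if e["keyword"] == keyword and e["adjacent"][:n] == list(segments)]
```

Passing an empty `segments` sequence returns every record for the keyword, which corresponds to the ordinary, unexpanded keyword search.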
A tree directory of adjacent word segments like the one shown in Fig. 5, which reflects the precedence relationship among a keyword's adjacent word segments at different levels, helps the user, when shown on screen, to grasp the overall state of each subset or of the subsets at each level, and so to adopt a better search strategy. The figure omits the indication of the number of texts in each subset.
Embodiment A can perform a selection operation: the system determines the corresponding key sentence from the cursor indication the inquirer makes on a text or summary, on the directory, or on a selection bar of the interactive interface page, and then performs the grouping operation, a sorted display, or the remove operation.
Figure 7 is a view of part of the screen during a search in one embodiment of the present invention, in which a cursor click generates the displayed result (that is, a selection operation followed by grouping).
Here the search box 51 is for entering the keyword ("Bu Lin" in this example), and 52 offers two options for the click operation, "click to expand" or "click to reject"; "click to expand" is selected in this example. The object clicked is the summary 55 shown in the description column 53. When the inquirer, while reading, becomes interested in the content about the "Boll index", the cursor 54 is placed on the word "mark" and clicked. The "Boll index", running from "Bu Lin" to "mark", then becomes the new key sentence, and the grouping operation lists several further-expanded adjacent word segments or their respective example sentences 56.
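Deriving the new key sentence from a click position can be sketched in a few lines. The sketch assumes the summary is held as a token list and that the interface reports the index of the clicked token; both are assumptions, and `key_sentence_from_click` is a hypothetical name.

```python
from typing import List


def key_sentence_from_click(tokens: List[str], keyword_pos: int,
                            click_pos: int) -> List[str]:
    """Return the tokens running from the keyword to the clicked token,
    inclusive, whichever of the two comes first in the summary."""
    lo, hi = sorted((keyword_pos, click_pos))
    return tokens[lo:hi + 1]
```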
The search method of embodiment A also includes an ignore operation: while the inquirer browses the sequence of indexes, text summaries, or texts containing the original key sentence, the system records or analyzes the operations on the page of the interactive interface 2 (such as paging) and the "attention clicks" or "ignore clicks" made on entries or content of the page, and sends this information to the search engine 8, which performs the remove operation on the key sentences, and on the related indexes and summaries further on in the sequence, that have been consistently ignored or have received no attention within a certain reading time or span.
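The ignore and remove operations can be sketched with a small counter. The consecutive-miss threshold, the class name `IgnoreTracker`, and the choice to move ignored entries to the end (one of the two options the text allows, the other being to drop them) are all assumptions made for illustration.

```python
from collections import defaultdict
from typing import Iterable, List, Set, Tuple


class IgnoreTracker:
    """Counts, per key sentence, how many consecutive times its entries were
    passed over without being opened, clicked, or marked for retention."""

    def __init__(self, threshold: int = 3):
        self.threshold = threshold
        self.misses = defaultdict(int)

    def record_page(self, shown: Iterable[str], attended: Set[str]) -> None:
        """`shown`: key sentences of entries the inquirer has browsed past;
        `attended`: those that were opened, clicked, or marked to keep."""
        for ks in shown:
            if ks in attended:
                self.misses[ks] = 0      # attention resets the count
            else:
                self.misses[ks] += 1

    def to_remove(self) -> Set[str]:
        return {ks for ks, n in self.misses.items() if n >= self.threshold}


def apply_removal(entries: List[Tuple[str, int]],
                  removed: Set[str]) -> List[Tuple[str, int]]:
    """Move entries (key sentence, text ID) whose key sentence was ignored to
    the end of the remaining sequence, keeping relative order otherwise."""
    kept = [e for e in entries if e[0] not in removed]
    moved = [e for e in entries if e[0] in removed]
    return kept + moved
```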
In the system of embodiment A, after user 1 proposes a keyword request through the interactive interface 2, the requestor 11 queries the database as requested and provides the resulting list of related data to the interactive interface 2. If user 1 wishes to expand the keyword, the keyword expansion component 10 generates the corresponding key sentences, and the desired data is searched for, extracted, or provided through the requestor 11.
The workflow of the search engine 8 (including the requestor 11 and the keyword expansion component 10) is illustrated by Fig. 6:
The system starts at module 41 and checks whether there is a keyword search request (42); if not, it returns (48). If there is, module 43 checks whether a keyword expansion operation is requested. If not, step 44 provides the ordinary search result sequence; if so, step 45 asks user 1 for his or her requirements through a prompt box on the screen of the interactive interface 2. The corresponding operation is then carried out and the corresponding information provided; module 46 continues to ask for the selections and requirements of user 1, and after several repetitions module 47 provides the corresponding search information. According to the wishes of user 1, the flow returns via module 48 or ends at 49.
The corresponding operating flow of user 1 at the interactive interface, working with the search engine 8, is shown in Fig. 4:
After opening the interactive interface 2 and starting (31), the user chooses a keyword (32) and may then browse conventionally (34) or choose expanded search (33). Choosing (33), that is, using the expanded key sentence search technique, requires clicking to select a suitable operating mode, for example choosing with the cursor the length (number of words) of the keyword's first adjacent word segment. If the length is short, there are fewer kinds of key sentence (fewer subsets), but the content of each subset is more mixed; if it is long, there are more kinds of key sentence (more subsets), and the core content of each subset is more uniform or focused.
Clearly, when the chosen key sentence length reaches 4 to 6 words, the grouped sequence of unique indexes or summaries obtained as described above becomes a "refined sequence" whose core content does not repeat and omits almost nothing, while the total number of files may fall by several orders of magnitude.
When a longer key sentence is chosen, the number of first-level unique indexes can be large. The system allows a click operation to reduce the number of adjacent words or adjacent word segments in the key sentence, which can markedly reduce the number of unique indexes, key sentences, summaries, or example sentences at the first level or the current level.
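The trade-off between key sentence length and number of subsets can be checked with a small counting sketch. It assumes whitespace tokenization and uses only the first occurrence of the keyword in each text; `count_subsets` is a hypothetical name.

```python
from typing import List


def count_subsets(texts: List[str], keyword: str, length: int) -> int:
    """Count the distinct key sentences formed by the keyword plus the
    `length` words that follow it; each distinct key sentence names one
    subset of the texts."""
    keys = set()
    for t in texts:
        words = t.split()
        if keyword in words:
            i = words.index(keyword)
            keys.add(tuple(words[i:i + 1 + length]))
    return len(keys)
```

On the four toy texts `"k a b"`, `"k a c"`, `"k d e"`, and `"k a b"`, a length of 1 yields 2 subsets and a length of 2 yields 3, so lengthening the key sentence splits the results more finely, while shortening it merges subsets again.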
If the user declines to choose the length of the keyword's first adjacent word segment and the other options, the system automatically performs the grouping operation with its default setting, for example one or two words per level of adjacent word segment, and presents the result (35). User 1 can then either open a linked text in the result directly (37), or select a suitable expanded key sentence from the result shown on screen (36; see Fig. 7) and obtain the further search result shown by module 38 (the next-level subset directory and similar content).
At this point user 1 can again either open a linked text directly (40) or go on to select a further expanded key sentence (39), and so on, until returning (301).
Expanding the key sentence step by step in this way narrows the search range step by step and locks onto the search target quickly and effectively.
In embodiment A, and of course in other embodiments of the method of the present invention, the system may record or accumulate, for one, some, or all inquirers over a given period, the number of clicks on content related to the various key sentences formed from the various adjacent word segments of the various keywords, or provide a corresponding statistics module when needed.
In Embodiment C, the key sentence search technique above is combined with existing keyword search technology: when ordering the indexes within a subset, or when choosing each example sentence in the grouping operation, the ranking or position of the relevant files in the search results of a prior-art search system is respected or preserved. In other words, the technique of the present invention includes, on top of the basic method and basic structure above, the use of prior-art ranking principles or methods. Apart from the aspects specifically noted, Embodiments B and C are essentially identical to embodiment A.
The technical features given in the above embodiments are all illustrative and are not to be used to limit the scope of the present invention.

Claims (20)

1. A computer-implemented method for processing a plurality of electronic texts that contain the same keyword, comprising:
obtaining a plurality of electronic texts that contain the same keyword; specifying the number of words contained in an adjacent word segment, or the way adjacent word segments are cut out; and, according to whether the adjacent word segment or indirectly adjacent word segment of the keyword in the content of each text, for some or all of the texts, is the same as or different from that of other texts, subdividing the text and the other texts into the same or different subsets, or applying correspondingly the same or different processing;
the correspondingly same or different processing may include: giving the texts the same or different distribution positions or storage modes; or assigning them the same or different subset marks; or giving their indexes the same or different marks or index entries; or giving them the same or different arrangement; or giving them the same or different display mode or position on the interactive interface; or allowing at least some of the subsets each to have one or more adjacent word segments or texts combined, sorted, or displayed across subsets on the interactive interface;
the text may be an electronic file or a web page, or the summary, index, title record, or title thereof.
2. The processing method according to claim 1, comprising: for different texts that belong to one or more of the same first-level or higher-level subsets, or whose content contains the same keyword and adjacent word segment, subdividing some or all of the texts into the same or different next-level or multi-level subsets of the above subset, or applying correspondingly the same or different processing, according to whether the further adjacent word segments of the same keyword and adjacent word segment they contain are the same or different;
the processing method allows successive adjacent word segments to be merged or split, so as to reduce or increase the number of subset levels.
3. The processing method according to claim 1, comprising:
arranging a directory of one or more levels, a tree directory, or a sequence that reflects the side-by-side or precedence relationship among the different adjacent word segments or indirectly adjacent word segments of the same keyword of the texts, or among the statements, example sentences, or summary examples that contain those word segments; wherein the identical adjacent word segment or identical indirectly adjacent word segment of each of one or more different subsets of the texts, or the statement, example sentence, or summary example containing that word segment, or the respective identical adjacent word segments, indirectly adjacent word segments, or statements, example sentences, or summary examples containing them, of the several next-level or further-level subsets of that subset or those subsets, are laid out, distributed, stored, or displayed according to their side-by-side or subordinate precedence relationship; and wherein the word segments, statements, example sentences, or summary examples may be placed side by side across subsets.
4. The processing method according to claim 3, comprising:
in the directory, tree directory, or sequence, if a keyword's adjacent word segment or indirectly adjacent word segment has only one kind of adjacent word segment at its next level or further levels, that word segment may be distributed, stored, or displayed at its original position together with its next-level or further-level adjacent word segment.
5. The processing method according to claim 1, 2, 3, or 4, comprising:
in the above texts, directories, statements, example sentences, or summary examples, or near the keywords, adjacent word segments, or indirectly adjacent word segments they contain, there may be an indication of the corresponding number of side-by-side subsets or subordinate subsets, or of the number of side-by-side subsets, subordinate subsets, or texts of the subset in which the relevant word or word segment lies.
6. The processing method according to claim 1, 2, 3, or 4, comprising:
in the processing method or directory, the specific sort position of a side-by-side subset, a side-by-side adjacent word segment or indirectly adjacent word segment, a text, or a side-by-side statement, example sentence, or summary example depends partly or wholly on one or more of the following factors:
the page link value, the click rate, or the keyword occurrence rate of the text, or of the text containing the word segment, statement, example sentence, or summary example,
or the number of subordinate subsets or subordinate texts of the subset, or the click rate of the subset, or the mean page link value of the texts of the subset,
or the number of subordinate subsets or subordinate texts of the subset containing the word segment, text, statement, example sentence, or summary example, or the click rate of that subset, or the mean page link value of its texts,
or the page link value of the text with the highest page link value in the subset, or of another example text,
or the click rate or keyword occurrence rate of the text in the subset with the highest click rate or keyword occurrence rate, or of another example text,
or the ranking of the relevant text, or of the relevant texts in the relevant subset, in the search results of other search websites or retrieval systems,
or the level of the related payment or bid of the investor of the relevant text,
or the alphabetical order of the spelling or pinyin, or the stroke order, of the words or characters of the relevant adjacent word segment,
or the rating of the text's source website, organization, or person,
or the time order or recency with which the relevant text was indexed,
or whether it belongs to the same subset at a given level.
7. The processing method according to claim 6, comprising:
in the processing method or directory, the specific sort position of a side-by-side subset, a side-by-side adjacent word segment or indirectly adjacent word segment, a text, or a side-by-side statement, example sentence, or summary example may be determined by the value of an objective function; the objective function value depends on one or more variables, some or all of which represent one or more of the factors listed above.
8. The processing method according to claim 1, 2, 3, 4, or 7, comprising:
allowing, on the result of the above processing, the addition of further keywords that must or must not be present, or of restrictions on time, region, language, or other types or ranges, to obtain a further refined result.
9. A computer data system comprising a storage device, characterized in that some or all of the keyword index, text summary, or text data contained in the storage device or in a data portion thereof is distributed in the following manner:
the index, text summary, or text data whose text summary or text contains the same keyword and the same or a different adjacent word segment of that keyword is located in the distribution area of the same or a different subset, respectively, of the set for that keyword;
the system allows index data that is located in the same subset, and whose text summary or text contains the same keyword and adjacent word segment, that is, the same expanded key sentence, and the same or a different adjacent word segment of that key sentence, to be located in the distribution area of the same or a different subset, one or more levels lower, of that subset.
10. A computer data system comprising a storage device, characterized in that the data structure of some or all of the keyword indexes contained in the storage device or in a data portion thereof comprises at least:
a keyword segment;
one or more adjacent word segments, formed by mapping, in their original order, a predetermined number of the successive levels of adjacent word segments of the keyword in the corresponding text content or text summary, in sequence: adjacent word segment 1, adjacent word segment 2, ..., adjacent word segment N;
an ID segment of the corresponding text;
optionally, a summary segment or title segment of the corresponding text that contains the keyword;
the computer data system allows the corresponding index or content to be searched for, directly or by mapping, according to a combination, specified for the search, of the keyword segment with one or more of the adjacent word segments, or according to an increase or decrease in the number of combined word segments.
11. A search method by which a search engine provides the results an inquirer desires, wherein the search engine system, in response to a keyword query request submitted by the inquirer via an interactive interface, searches the system's relevant information sources or databases and provides texts, text summaries, indexes, or related information that satisfy the keyword request; the method is characterized in that it comprises:
The system receives the inquirer's keyword query request via the interactive interface;
After confirmation, the system queries a database containing a keyword index according to the keyword request;
In advance or at query time, the system takes the keyword as it occurs in the text content or text summary containing it, together with its adjacent word segment, as a key sentence;
The number of words or characters contained in the adjacent word segment, or the manner in which the adjacent word segment is cut off, or its specific content, is predetermined by the system, or agreed to, accepted by default, or selected by the inquirer, or is determined by the inquirer's choice in a selection field presented by the interactive interface, or by the position and manner of a cursor indication made on a page containing the text summary, text, or related content of a specific index;
In advance or at query time, the system collates, from the above adjacent word segments or key sentences, the distinct adjacent word segments or the distinct key sentences;
The system generates search results according to the key sentences obtained in advance or at query time; that is, the different indexes, text summaries, or texts containing the same or different key sentences are retrieved, processed, laid out, or arranged for the inquirer to select via the interactive interface.
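The steps of this claim (take the keyword plus its adjoining words as a key sentence, collate the distinct key sentences, and arrange the matching texts under each) can be sketched as follows. This is a minimal illustration, not the claimed implementation: `group_by_key_sentence` and the `width` parameter are hypothetical, tokens are whitespace-separated, and the adjoining words are assumed to follow the keyword.

```python
from collections import defaultdict


def group_by_key_sentence(docs, keyword, width=1):
    """Map each distinct key sentence to the ids of the texts that contain it.

    A key sentence here is the keyword plus the next `width` tokens.
    """
    groups = defaultdict(list)
    for doc_id, text in docs.items():
        tokens = text.split()
        for i, tok in enumerate(tokens):
            if tok == keyword:
                sentence = " ".join(tokens[i : i + 1 + width])
                if doc_id not in groups[sentence]:
                    groups[sentence].append(doc_id)
    return dict(groups)


docs = {
    "d1": "apple pie recipe with cinnamon",
    "d2": "apple pie history",
    "d3": "apple juice benefits",
}
groups = group_by_key_sentence(docs, "apple")
# groups maps "apple pie" to d1 and d2, and "apple juice" to d3
```

Presenting the keys of `groups` to the inquirer lets a single choice such as "apple pie" narrow a large result set without typing a new query.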
12. The search method according to claim 11, further comprising:
In advance or at query time, taking the key sentence as it occurs in the text content or text summary containing it, together with its adjacent word segment, or the original key sentence together with its adjacent word segment, as an expanded key sentence;
The number of words or characters contained in the adjacent word segment, or the manner in which the adjacent word segment is cut off, or its specific content, is predetermined by the system, or agreed to, accepted by default, or selected by the inquirer, or is determined by the inquirer's choice in a selection field presented by the interactive interface, or by the position and manner of a cursor indication made on a page containing the text summary or text of a specific index;
In advance or at query time, collating, from the above adjacent word segments or key sentences, the distinct adjacent word segments or the distinct expanded key sentences;
Generating search results according to the expanded or repeatedly expanded key sentences obtained in advance or at query time; that is, the different indexes, text summaries, or texts containing the same or different expanded key sentences are retrieved, processed, laid out, or arranged for the inquirer to select via the interactive interface.
13. The search method according to claim 11 or 12, comprising a grouping operation:
Namely, the various key sentences, adjacent word segments, indexes, text summaries, or texts containing the same keyword, or the various expanded key sentences, adjacent word segments, indexes, text summaries, or texts containing the same original key sentence, are each grouped and arranged or displayed in the form of a catalog or sequence, wherein for each adjacent word segment only one or a few of the key sentences, indexes, text summaries, or texts in which it occurs are included.
14. The search method according to claim 11 or 12, characterized in that:
The data of some or all of the keyword indexes, text summaries, or texts are distributed for storage, according to whether the keywords, key sentences, or expanded key sentences they contain are the same or different, across the same or different sub-areas, or the same or different lower-level sub-areas;
So that, at keyword query time, the data of the corresponding key sentence, keyword index, text summary, or text can be extracted or provided directly.
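One way to picture the distribution of data into sub-areas and lower-level sub-areas is a two-level store keyed first by keyword and then by key sentence. The sketch below is an assumption-laden illustration: `PartitionedStore`, `put`, and `get` are hypothetical names, and in-memory dictionaries stand in for whatever physical storage areas a real system would use.

```python
class PartitionedStore:
    """Two-level store: keyword sub-area, then key-sentence sub-area."""

    def __init__(self):
        # keyword -> {key sentence -> [text ids]}
        self.areas = {}

    def put(self, keyword, key_sentence, text_id):
        sub_area = self.areas.setdefault(keyword, {})
        sub_area.setdefault(key_sentence, []).append(text_id)

    def get(self, keyword, key_sentence=None):
        """Fetch one lower-level sub-area, or the whole keyword sub-area."""
        sub_area = self.areas.get(keyword, {})
        if key_sentence is None:
            return sorted(tid for ids in sub_area.values() for tid in ids)
        return list(sub_area.get(key_sentence, []))


store = PartitionedStore()
store.put("apple", "apple pie", "d1")
store.put("apple", "apple pie", "d2")
store.put("apple", "apple juice", "d3")
pie_hits = store.get("apple", "apple pie")  # served from a single sub-area
all_hits = store.get("apple")               # union over the keyword's sub-areas
```

Because each key sentence has its own sub-area, a query for that key sentence reads one bucket directly instead of filtering every text that contains the keyword.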
15. The search method according to claim 11 or 12, comprising:
Analyzing the texts or summaries in the database, or the texts obtained from the Internet or other information sources by a data collection server attached to the search engine, to generate for each text an index comprising at least the keyword segment, the adjacent word segment, and the text ID segment, and including the text summary or title where necessary, and storing the index;
So that, at search time, the corresponding index, summary, or text can be retrieved from storage and provided according to the keyword segment and adjacent word segment it contains.
16. The search method according to claim 11 or 12, comprising:
Compiling a tree-shaped catalog that reflects the successive or parallel relationships among the adjacent word segments of different levels of the same keyword in the texts or text summaries, or a tree-shaped catalog that reflects the successive or parallel relationships among the key sentences expanded to different levels from that keyword, for use at query time.
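The tree-shaped catalog of this claim can be sketched as a nested dictionary in which each level holds the words that adjoin the keyword at that distance. This is an illustrative sketch only: `build_tree` and `render` are hypothetical helpers, tokens are whitespace-separated, and each level is assumed to be one word following the previous level.

```python
def build_tree(docs, keyword, depth=2):
    """Nest the words following each keyword occurrence, one level per word."""
    tree = {}
    for text in docs.values():
        tokens = text.split()
        for i, tok in enumerate(tokens):
            if tok != keyword:
                continue
            node = tree
            for segment in tokens[i + 1 : i + 1 + depth]:
                node = node.setdefault(segment, {})
    return tree


def render(tree, indent=0):
    """Return the catalog as indented lines, siblings in alphabetical order."""
    lines = []
    for segment in sorted(tree):
        lines.append("  " * indent + segment)
        lines.extend(render(tree[segment], indent + 1))
    return lines


docs = {
    "d1": "apple pie recipe with cinnamon",
    "d2": "apple pie history",
    "d3": "apple juice benefits",
}
tree = build_tree(docs, "apple")
catalog = render(tree)
```

Siblings at one level ("pie", "juice") are the parallel entries, while parent and child ("pie", then "recipe") are the successive ones.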
17. The search method according to claim 11, 12, 13, 14, or 16, comprising a selection operation:
Namely, the system determines the corresponding key sentence according to the inquirer's cursor indication, on a page of the interactive interface, of a word or of the end of an adjacent word segment in the key sentence of the text or text summary or in the adjacent word segment catalog, or according to the inquirer's entry in a selection field or box; the system then performs a grouping operation or catalog display on the various expanded key sentences, expanded adjacent word segments, indexes, text summaries, or texts corresponding to that key sentence, or performs a sorted display of the corresponding indexes, text summaries, or texts, or performs a removal operation according to the determined key sentence, whereby entries, indexes, text summaries, or texts containing that key sentence on the page or on multiple other pages are removed or moved.
18. The search method according to claim 11 or 12, comprising an ignore operation:
Namely, the inquirer's current position in the index or text sequence is judged from the pages browsed, or the operations performed on the page, while the inquirer browses, on the interactive interface, a sequence of indexes, text summaries, or texts containing the original keyword or the original key sentence; if it can be determined that, within a certain range preceding that position, the indexes, text summaries, or texts containing a certain key sentence, or that key sentence itself, have never been opened or linked to, or have gone unopened a certain number of consecutive times, and have also not been clicked, attended to, or marked for retention, then a removal operation is performed according to that key sentence, whereby entries, indexes, text summaries, or texts containing that key sentence on the page or on multiple other pages are removed or moved.
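The ignore operation can be sketched as a filter over the remaining result list. The sketch makes several assumptions that the claim leaves open: `ignore_unopened` is a hypothetical function, "opened" stands in for every form of attention (click, link, retention mark), a key sentence is ignored once it has appeared at least `threshold` times before the current position without being opened, and ignored entries are dropped rather than moved.

```python
def ignore_unopened(results, position, opened, threshold=2):
    """Drop later results whose key sentence the inquirer has passed over.

    results:  list of (text_id, key_sentence) in display order
    position: number of results the inquirer has already browsed past
    opened:   set of text ids the inquirer opened
    """
    seen = {}
    touched = set()
    for text_id, sentence in results[:position]:
        seen[sentence] = seen.get(sentence, 0) + 1
        if text_id in opened:
            touched.add(sentence)
    ignored = {s for s, n in seen.items() if n >= threshold and s not in touched}
    remaining = [r for r in results[position:] if r[1] not in ignored]
    return results[:position] + remaining, ignored


results = [
    ("d1", "apple pie"), ("d2", "apple juice"),
    ("d3", "apple pie"), ("d4", "apple juice"),
    ("d5", "apple pie"), ("d6", "apple juice"),
]
kept, ignored = ignore_unopened(results, position=4, opened={"d2"})
```

In this example the inquirer skipped both "apple pie" entries among the first four results, so the later "apple pie" entry is removed while "apple juice" entries stay.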
19. A search engine system that provides the desired search results in response to a request submitted by an inquirer via an interactive interface, comprising:
A server, coupled via a communication network or line to the client computer on which the interactive interface resides;
A search engine located on the server, the search engine comprising: a database containing a keyword index, and a query module capable of querying the database according to the keyword request submitted by the inquirer and providing a result list of the retrieved related data to the interactive interface;
The system is characterized in that:
The database further comprises text summaries, title records, or texts containing keywords, and the text summaries or title records may be included in the keyword index;
The query module or search engine comprises a keyword expansion component capable of performing an expansion operation on the queried or to-be-queried keyword: the keyword as it occurs in the text content or text summary containing it, together with its adjacent word segment, is taken as a distinct expanded key sentence, and a list of these expanded key sentences, or a list of the distinct adjacent word segments of the keyword, is presented for the inquirer to select via the interactive interface; or the different indexes, text summaries, or texts containing the same or different expanded key sentences are retrieved, processed, laid out, or arranged for the inquirer to select via the interactive interface;
The keyword expansion component can likewise perform one or more expansion operations on a key sentence, and query or process accordingly.
20. The search engine system according to claim 19, wherein:
The system comprises a graphical user interface that allows the inquirer to add additional query information; the interface may include a dialog box or selection box, or text, symbols, or graphics that can be clicked with the cursor to select key sentences, adjacent word segments, sentences, paragraphs, operation commands, or options, so as to receive the inquirer's selection of operation mode, pattern, and the like.
CN 200710087104 2007-02-15 2007-03-21 Method and system for electronic text-processing and searching Pending CN101063975A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN 200710087104 CN101063975A (en) 2007-02-15 2007-03-21 Method and system for electronic text-processing and searching
PCT/CN2008/000190 WO2008098467A1 (en) 2007-02-15 2008-01-25 Convenient method and system of electric text processing and retrieve

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
CN200710079309 2007-02-15
CN200710079309.4 2007-02-15
CN 200710087104 CN101063975A (en) 2007-02-15 2007-03-21 Method and system for electronic text-processing and searching

Publications (1)

Publication Number Publication Date
CN101063975A true CN101063975A (en) 2007-10-31

Family

ID=38964999

Family Applications (1)

Application Number Title Priority Date Filing Date
CN 200710087104 Pending CN101063975A (en) 2007-02-15 2007-03-21 Method and system for electronic text-processing and searching

Country Status (1)

Country Link
CN (1) CN101063975A (en)

Cited By (24)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2008098467A1 (en) * 2007-02-15 2008-08-21 Erzhong Liu Convenient method and system of electric text processing and retrieve
CN101566984B (en) * 2008-07-11 2011-02-09 博采林电子科技(深圳)有限公司 Search engine used in personal hand-held equipment and resource search method
CN101630315B (en) * 2008-07-16 2011-09-14 清华大学 Quick retrieval method and system
CN102341800A (en) * 2009-03-17 2012-02-01 富士通株式会社 Search processing method and apparatus
CN102341800B (en) * 2009-03-17 2014-10-29 富士通株式会社 Search processing method and apparatus
CN101763424B (en) * 2009-12-14 2013-03-06 刘二中 Method for determining characteristic words and searching according to file content
CN102117276B (en) * 2009-12-31 2013-04-03 北大方正集团有限公司 Method and device conducting follow-up treatments on search results
CN103314371A (en) * 2010-12-31 2013-09-18 肖岩 Retrieval method and system
WO2013076655A1 (en) * 2011-11-22 2013-05-30 Ho Keung Tse Information search
CN103136274A (en) * 2011-12-02 2013-06-05 北大方正集团有限公司 Date retrieval method and device used for content resource data base
CN103164491A (en) * 2011-12-19 2013-06-19 北大方正集团有限公司 Method and device for processing and retrieving data
CN103164491B (en) * 2011-12-19 2016-03-30 北大方正集团有限公司 The method and apparatus of a kind of data processing and retrieval
CN103841656A (en) * 2012-11-22 2014-06-04 三星电子株式会社 Mobile terminal and data processing method thereof
CN104123074A (en) * 2013-04-26 2014-10-29 株式会社东芝 Target area estimation apparatus, method and program
CN105872170B (en) * 2016-03-28 2019-05-10 北京小米移动软件有限公司 Method and apparatus for searching for contact person
CN105872170A (en) * 2016-03-28 2016-08-17 北京小米移动软件有限公司 Method and device for searching contact person
CN107622046A (en) * 2017-09-01 2018-01-23 广州慧睿思通信息科技有限公司 A kind of algorithm according to keyword abstraction text snippet
CN108415959A (en) * 2018-02-06 2018-08-17 北京捷通华声科技股份有限公司 A kind of file classification method and device
CN109255621A (en) * 2018-09-30 2019-01-22 中国银行股份有限公司 A kind of information processing method and system
CN109255621B (en) * 2018-09-30 2022-03-11 中国银行股份有限公司 Information processing method and system
CN110059243A (en) * 2019-03-21 2019-07-26 广东瑞恩科技有限公司 Data optimization engine method, apparatus, equipment and computer readable storage medium
CN110059243B (en) * 2019-03-21 2024-05-07 广东瑞恩科技有限公司 Data engine optimization method, device, equipment and computer readable storage medium
CN116628201A (en) * 2023-05-18 2023-08-22 浙江数洋科技有限公司 Intelligent grouping and pushing method for text database
CN116628201B (en) * 2023-05-18 2023-10-20 浙江数洋科技有限公司 Intelligent grouping and pushing method for text database

Similar Documents

Publication Publication Date Title
CN101063975A (en) Method and system for electronic text-processing and searching
US11294970B1 (en) Associating an entity with a search query
CN100501745C (en) Convenient method and system for electronic text-processing and searching
US9323827B2 (en) Identifying key terms related to similar passages
US9251206B2 (en) Generalized edit distance for queries
US8122032B2 (en) Identifying and linking similar passages in a digital text corpus
US8949214B1 (en) Mashup platform
US8386478B2 (en) Methods and systems for unobtrusive search relevance feedback
US7698331B2 (en) Matching and ranking of sponsored search listings incorporating web search technology and web content
US20110173216A1 (en) Dynamic aggregation and display of contextually relevant content
US20070250501A1 (en) Search result delivery engine
CN101061478A (en) Providing information relating to a document
US20160224621A1 (en) Associating A Search Query With An Entity
US20100094845A1 (en) Contents search apparatus and method
US20110191328A1 (en) System and method for extracting representative media content from an online document
JPH1125108A (en) Automatic extraction device for relative keyword, document retrieving device and document retrieving system using these devices
CN1609859A (en) Search result clustering method
CN1839386A (en) Internet searching using semantic disambiguation and expansion
CN1918568A (en) Interface for a universal search engine
CN1487452A (en) System for carrying out universal search management in one or more networks
US20090119283A1 (en) System and Method of Improving and Enhancing Electronic File Searching
CN1687925A (en) Method for realizing bilingual web page searching
US20190065502A1 (en) Providing information related to a table of a document in response to a search query
CN102214183A (en) Search engine query method for combining feedback contents of pages with fixed ranking
US20120179709A1 (en) Apparatus, method and program product for searching document

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C02 Deemed withdrawal of patent application after publication (patent law 2001)
WD01 Invention patent application deemed withdrawn after publication