CN104572871A - Method and device for searching based on index table - Google Patents

Publication number
CN104572871A
Authority
CN
China
Prior art keywords
text
segmented word
index table
information
co-occurrence
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201410802676.2A
Other languages
Chinese (zh)
Inventor
刘曙
关涛
于立柱
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
LeTV Information Technology Beijing Co Ltd
Original Assignee
LeTV Information Technology Beijing Co Ltd
Application filed by LeTV Information Technology Beijing Co Ltd
Priority to CN201410802676.2A
Publication of CN104572871A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/951 Indexing; Web crawling techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities

Abstract

The invention discloses a method and a device for searching based on an index table. The method comprises: assigning a text identifier to a collected text and segmenting the text into words; extracting parameter information of the text from the segmented words and updating the index table according to that parameter information; identifying a search term entered by a user and segmenting the search term; and traversing the index table for each segmented word of the search term and outputting the traversal result. By combining a mechanism for building and updating the index table with retrieval over that table, the method can obtain the target results of a user's search request more reasonably, efficiently and quickly.

Description

Method and device for searching based on an index table
Technical Field
The application relates to the technical field of information retrieval, and in particular to a method and a device for searching based on an index table.
Background
The development of Internet technology has brought great convenience to daily life. The network is flooded with content of every kind, and how to find the content one cares about in this ocean of information is a problem that Internet developers have long worked to solve. Each website has its own search engine; in the prior art, after receiving a search request initiated by a user, a website's search engine feeds the relevant results it has found back to the user through an interface.
Therefore, how to use an index structure to find relevant search information accurately and quickly and feed it back to the user has become a technical problem that urgently needs to be solved.
Summary of the Invention
The object of the application is to provide a method and a device for searching based on an index table.
To achieve the above object, the application discloses a method for searching based on an index table, comprising: assigning a text identifier to a collected text and performing word segmentation on the text; extracting parameter information of the text according to each segmented word obtained from the text, and updating the index table according to the parameter information of the text; and identifying a search term entered by a user, segmenting the search term, traversing the index table for each of the segmented words thus obtained, and outputting the traversal result.
Further, extracting the parameter information of the text according to each segmented word obtained from the text and updating the index table according to the parameter information of the text comprises: counting, for each segmented word obtained from the text, the number of times it occurs in the text and the positions at which it occurs, and forming and storing co-occurrence information for each segmented word in the text from that count and those positions; and binding the text identifier to the co-occurrence information of each segmented word in the text, and associating the bound co-occurrence information of each segmented word with the corresponding segmented word in the index table, thereby updating the index table.
Further, forming and storing the co-occurrence information of each segmented word from the number of occurrences and the positions comprises: storing the co-occurrence information of the segmented words in storage blocks, where one storage block stores the co-occurrence information of the segmented words of one or more texts, the co-occurrence information of all segmented words belonging to the same text is assigned to the same storage block, and within each storage block the co-occurrence information to be stored is written at the highest address currently available.
Further, a timestamp is set in each storage block and records the time of the most recent store into that block; the storage blocks are organised as a singly linked circular list in which a head pointer and a tail pointer mark the first and the last storage block respectively, and in the direction from the head pointer to the tail pointer the storage time shown by each block's timestamp lies further and further from the current time.
Further, associating the bound co-occurrence information of each segmented word with the corresponding segmented word in the index table, thereby updating the index table, comprises: building the index table as a doubly linked circular list, where each node in the index table corresponds to one segmented word and stores the co-occurrence information corresponding to that word; and, for the bound co-occurrence information of each segmented word, traversing the nodes of the index table for that word. When a segmented word is hit, its bound co-occurrence information is stored on the node corresponding to the hit word; when a segmented word is not hit, the unhit segmented word is created on a blank node of the index table and its bound co-occurrence information is stored on that blank node.
Further, the validity of all nodes in the index table is checked periodically; when the bound co-occurrence information of all segmented words stored in a node has become invalid, the node is shielded in the index table; and when the node has been shielded for longer than a preset threshold, the node is cleared while its memory space is retained.
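The periodic validity check described above can be sketched as follows. This is only an illustrative sketch: the names `Entry`, `Node`, `sweep` and `shield_limit`, the per-entry `valid` flag and the use of plain floating-point times are assumptions made for the example and do not come from the application.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Entry:
    text_id: str
    valid: bool = True  # assumed per-entry validity flag


@dataclass
class Node:
    token: Optional[str]                      # None marks a blank node
    entries: List[Entry] = field(default_factory=list)
    shielded_at: Optional[float] = None       # None while the node is visible


def sweep(nodes: List[Node], now: float, shield_limit: float) -> None:
    """One periodic validity pass over all nodes of the index table."""
    for node in nodes:
        if node.token is None:
            continue  # blank node, nothing to check
        if node.shielded_at is None:
            # Shield the node once every entry stored in it is invalid.
            if node.entries and not any(e.valid for e in node.entries):
                node.shielded_at = now
        elif now - node.shielded_at > shield_limit:
            # Clear the content but keep the node object itself for reuse.
            node.token = None
            node.entries.clear()
            node.shielded_at = None
```

Keeping the cleared `Node` object in the list stands in for "retaining the memory space of the node": a later update can reuse it as a blank node without a new allocation.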
Further, identifying the search term entered by the user, segmenting it, traversing the index table for each of the segmented words obtained and outputting the traversal result comprises: identifying the search term entered by the user and segmenting it, and traversing the index table for each of the segmented words obtained, to obtain a plurality of hit nodes; for each hit node, taking all co-occurrence information stored in the node as one group, thereby obtaining as many groups of co-occurrence information as there are hit nodes; and extracting from these groups the co-occurrence information that carries the same text identifier, comparing in pairs, for co-occurrence information with the same text identifier, the positions and the numbers of occurrences of the corresponding segmented words in the text, and outputting that text identifier when the distance between the positions at which the corresponding segmented words occur in the text is less than or equal to a first threshold and the number of occurrences is less than or equal to a second threshold.
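The pairwise comparison above leaves some details open, so the following is only one possible reading, given as a hedged sketch. It assumes that each hit node yields a mapping from text identifier to `(count, positions)`, that "distance" means the smallest gap between any position of one segmented word and any position of the other, and that the count condition applies to both words of the pair. The function name `matching_text_ids` and its parameters are hypothetical.

```python
from itertools import combinations


def matching_text_ids(groups, max_distance, max_count):
    """groups: one dict per hit node, mapping text_id -> (count, positions)."""
    matched = set()
    for group_a, group_b in combinations(groups, 2):
        # Only co-occurrence (symbiosis) information with the same text
        # identifier is compared.
        for text_id in group_a.keys() & group_b.keys():
            count_a, positions_a = group_a[text_id]
            count_b, positions_b = group_b[text_id]
            distance = min(abs(p - q) for p in positions_a for q in positions_b)
            if distance <= max_distance and max(count_a, count_b) <= max_count:
                matched.add(text_id)
    return sorted(matched)


# Two hit nodes, e.g. for the query words "actor A" and "film".
groups = [
    {"t1": (2, [0, 5]), "t2": (1, [0])},
    {"t1": (1, [3]), "t2": (1, [9])},
]
result = matching_text_ids(groups, max_distance=2, max_count=3)
```

With these inputs only `t1` qualifies: its two words come within 2 positions of each other, whereas in `t2` they are 9 positions apart.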
Further, extracting the parameter information of the text according to each segmented word obtained from the text and updating the index table according to the parameter information of the text comprises: calculating the number of segmented words, taking that number as the text length, and also recording the time at which the text length was obtained; and merging one or more of the text identifier, the text length and the acquisition time of the text length with the text to form a text information object, and updating the index table according to the text information object.
Further, updating the index table according to the text information object comprises: taking the text identifier as a source code, mapping it to an operand through a bitwise or logical operation with a preset mask, and using the operand obtained by the mapping as a memory address; and storing the text information object corresponding to the text identifier at that memory address.
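The mask-based address mapping can be illustrated with a bitwise AND. The sketch below rests on several assumptions that are not in the application: the text identifier is an integer, the preset mask is `0xFF`, a list index stands in for the memory address, and identifiers that map to the same address share a small bucket (the application does not say how such collisions are handled). `ForwardTable` and its methods are hypothetical names.

```python
class ForwardTable:
    """Stores text information objects at addresses derived from a mask."""

    def __init__(self, mask=0xFF):
        self.mask = mask
        # text_id & mask can never exceed mask, so mask + 1 slots suffice.
        self.slots = [[] for _ in range(mask + 1)]

    def address(self, text_id):
        # Bitwise AND of the source code (text identifier) with the mask.
        return text_id & self.mask

    def put(self, text_id, obj):
        bucket = self.slots[self.address(text_id)]
        for i, (stored_id, _) in enumerate(bucket):
            if stored_id == text_id:
                bucket[i] = (text_id, obj)  # replace an existing object
                return
        bucket.append((text_id, obj))

    def get(self, text_id):
        for stored_id, obj in self.slots[self.address(text_id)]:
            if stored_id == text_id:
                return obj
        return None


table = ForwardTable()
table.put(0x1234, "text info for 0x1234")
table.put(0x0234, "text info for 0x0234")  # same address 0x34, same bucket
```

Because the address is computed directly from the identifier, looking up an object costs one AND operation plus a scan of a short bucket rather than a traversal of the whole table.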
Further, updating the index table according to the text information object comprises: periodically checking the validity flag set for each stored text information object, and setting the validity flag of a text information object to invalid when the acquisition time of the text length in that object lies more than a preset duration before the current time.
Further, identifying the search term entered by the user, segmenting it, traversing the index table for each of the segmented words obtained and outputting the traversal result comprises: identifying the search term entered by the user and segmenting it, traversing, for each of the segmented words obtained, the texts of the valid text information objects stored in the index table, and outputting the text information objects that are hit.
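The text information object, its validity flag and the lookup over valid objects described above can be sketched together. The sketch is illustrative only: `TextInfo`, `make_text_info`, `expire` and `search` are hypothetical names, times are plain numbers, and whitespace splitting stands in for a real word segmenter.

```python
from dataclasses import dataclass


@dataclass
class TextInfo:
    text_id: int
    text: str
    length: int          # number of segmented words in the text
    acquired_at: float   # time at which the length was obtained
    valid: bool = True   # validity flag


def make_text_info(text_id, text, tokens, now):
    return TextInfo(text_id, text, len(tokens), now)


def expire(objects, now, max_age):
    """Periodic check: mark objects older than max_age as invalid."""
    for obj in objects:
        if obj.valid and now - obj.acquired_at > max_age:
            obj.valid = False


def search(objects, query_tokens):
    """Return identifiers of valid objects whose text contains a query word."""
    hits = []
    for obj in objects:
        if not obj.valid:
            continue  # invalid objects are never traversed
        words = obj.text.split()  # stand-in segmenter (assumption)
        if any(token in words for token in query_tokens):
            hits.append(obj.text_id)
    return hits


old = make_text_info(1, "actor film", "actor film".split(), now=0.0)
new = make_text_info(2, "actor looks handsome", "actor looks handsome".split(), now=90.0)
expire([old, new], now=100.0, max_age=50.0)
hits = search([old, new], ["actor"])
```

Here the first object is 100 time units old and is marked invalid, so only the second object is returned for the query word "actor".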
To achieve the above object, the application further discloses a device for searching based on an index table, comprising: a segmentation configuration module, configured to assign a text identifier to a collected text and to perform word segmentation on the text; an index update module, configured to extract parameter information of the text according to each segmented word obtained from the text and to update the index table according to the parameter information of the text; and a retrieval module, configured to identify a search term entered by a user, segment the search term, traverse the index table for each of the segmented words obtained, and output the traversal result.
Compared with the prior art, the application can achieve the following technical effects:
1) The application combines a mechanism for building and updating the index table with searching over that index table, so that the target results of a user-initiated search request can be obtained more reasonably, efficiently and quickly.
2) The application segments the relevant results found through the interface and obtains co-occurrence information from them to form an index structure, with which the index table is built and updated. This greatly reduces the amount of data stored and speeds up traversal of the index table during retrieval, so that the target results of a user-initiated search request can be obtained more reasonably, efficiently and quickly.
Of course, a given implementation need not achieve all of the above technical effects at the same time.
Brief Description of the Drawings
The drawings described here provide a further understanding of the application and form part of it. The illustrative embodiments of the application and their description serve to explain the application and do not unduly limit it. In the drawings:
Fig. 1 is a schematic flow chart of the method of an embodiment of the application.
Fig. 2 is a schematic flow chart of step S102 in Fig. 1.
Fig. 3 is a schematic flow chart of step S104 in Fig. 1.
Fig. 4 is a schematic flow chart of steps S102 to S104 in Fig. 1.
Fig. 5 is a structural diagram of the device of an embodiment of the application.
Fig. 6 is another structural diagram of the device of an embodiment of the application.
Fig. 7 is a further structural diagram of the device of an embodiment of the application.
Detailed Description
The embodiments of the application are described in detail below with reference to the drawings and examples, so that the way in which the application applies technical means to solve the technical problem and achieve the technical effect can be fully understood and implemented.
Certain terms are used throughout the specification and claims to refer to particular components. Those skilled in the art will appreciate that hardware manufacturers may refer to the same component by different names. This specification and the claims do not distinguish components by differences in name, but by differences in function. "Comprising", as used throughout the specification and claims, is an open-ended term and should be interpreted as "including but not limited to". "Roughly" means that, within an acceptable error range, those skilled in the art can solve the technical problem and substantially achieve the technical effect. In addition, the term "coupled" covers any direct or indirect means of electrical coupling. Thus, if the text describes a first device as coupled to a second device, the first device may be electrically coupled to the second device directly, or indirectly through other devices or coupling means. The description that follows sets out preferred embodiments of the application; it is given to illustrate the general principles of the application and is not intended to limit its scope. The scope of protection of the application is defined by the appended claims.
It should also be noted that the terms "include", "comprise" and any variants of them are intended to be non-exclusive, so that a product or system comprising a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a product or system. Without further limitation, an element introduced by the phrase "comprising a ..." does not exclude the presence of further identical elements in the product or system that comprises it.
Therefore, under the teaching of the following embodiments, those of ordinary skill in the art can, without creative effort, apply the core idea of the application to cases in which other external audio input and output devices are connected to a smart television; the details are not repeated here.
Embodiment 1
The method of this embodiment combines two search mechanisms, the forward index and the inverted index; of course, searching with the forward index alone or with the inverted index alone is also possible. Fig. 1 is a schematic flow chart of the method of an embodiment of the application. It describes a search method based on an index table comprising a forward index and an inverted index, and includes the following steps:
Step S100: assign a text identifier to the collected text and perform word segmentation on the text.
A collected document comprises multiple fields, each of which contains one text. The object processed by the method of this embodiment is the text in each field: the text of any field is segmented, and co-occurrence information is generated for each segmented word within the text of that field.
The text identifier may identify the field. In that case different fields of the same document have different text identifiers; in the subsequent retrieval step the search term entered by the user is matched within the co-occurrence information that belongs to one text identifier, so the final result is the text corresponding to that text identifier. Alternatively, the text identifier may identify the whole document. In that case the different fields of the same document share one text identifier; retrieval is likewise performed within the co-occurrence information that belongs to one text identifier, so the final result is the document corresponding to that text identifier.
The text may be the text of any field of a document, or it may be a document comprising multiple fields. The fields include the title, the body, reviews and so on, and the segmented words obtained include single characters and/or words.
Step S102: extract the parameter information of the text according to each segmented word obtained from the text, and update the index table according to the parameter information of the text.
The index table involved in this step comprises two kinds of index, a forward index and an inverted index, which can be used alternatively or in combination. When used in combination, the forward index and the inverted index neither constrain nor depend on each other; they are parallel schemes that do not interfere with one another.
The index table has been built in advance, before this step, from the parameter information of texts. Whether forward or inverted, the index needs the parameter information of the text in order to be updated. For the inverted index, the parameter information used comprises the co-occurrence information of each segmented word in the text; how co-occurrence information is extracted and how it is used to update the inverted index are described in detail in Embodiment 1.1. For the forward index, the parameter information used comprises the text information object; how the text information object is extracted and how it is used to update the forward index are described in detail in Embodiment 1.2.
Step S104: identify the search term entered by the user and segment it, traverse the index table for each of the segmented words obtained, and output the matching search results.
For the inverted index, traversing for each of the segmented words yields co-occurrence information bound to text identifiers. These pieces of co-occurrence information must then be analysed in combination with one another to obtain the text identifiers, which are output as the traversal result. How the traversal and the analysis are carried out is described in detail in Embodiment 1.1.
For the forward index, traversing for each of the segmented words yields text information objects, which can be output directly as the traversal result. How the traversal is carried out is described in detail in Embodiment 1.2.
Embodiment 1.1
Within the framework of Embodiment 1, this example describes in detail how co-occurrence information is extracted and how it is used to update the inverted index in the index table. Of course, once the inverted index has been updated, it can also be traversed with the search term entered by the user to produce output.
Fig. 2 is a schematic flow chart of step S102 in Fig. 1. It describes the method for updating the index table based on segmented words, and includes the following steps:
Step S1020: count, for each segmented word obtained from the text, the number of times it occurs in the text and the positions at which it occurs, and form and store the co-occurrence information of each segmented word in the text from that count and those positions.
This step comprises three core parts: counting, generating co-occurrence information, and storing it.
1) Counting
Counting mainly determines how many times each segmented word occurs in the text and at which positions. Counting occurrences is straightforward. The technical difficulty lies in the positions: how to determine them, how to record them afterwards, and whether they can be presented in a form as simple, intuitive and readable as the count.
The segmented words obtained from the text form an ordered sequence, and the position of each segmented word in the text is identified by its ordinal in that sequence. For example, suppose the title of a collected document A is "in the film in which actor A plays the lead, actor A looks very handsome", and the sequence of segmented words obtained by word segmentation is "actor A | plays the lead | (particle) | film | in | actor A | looks | very | handsome". The ordinals of these segmented words are 0, 1, 2, 3, 4, 5, 6, 7, 8 in turn. The segmented word "actor A" occurs twice, with ordinals 0 and 5, so 0 and 5 represent the positions at which "actor A" occurs in the title (i.e., the text) of document A. The application is not limited to the content of this example; the numbers, format and document content above do not limit the scope of protection of the application.
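The position counting in this example can be reproduced with a few lines of Python. This is only a sketch: the function name `positions_by_token` is hypothetical, the token list is an English rendering of the title example, and the string "(particle)" is a placeholder for the third segmented word.

```python
from collections import defaultdict


def positions_by_token(tokens):
    """Map each segmented word to the ordinals at which it occurs."""
    positions = defaultdict(list)
    for ordinal, token in enumerate(tokens):
        positions[token].append(ordinal)  # ordinals are appended in increasing order
    return dict(positions)


title_tokens = [
    "actor A", "plays the lead", "(particle)", "film", "in",
    "actor A", "looks", "very", "handsome",
]
positions = positions_by_token(title_tokens)
```

For this title, `positions["actor A"]` is `[0, 5]`, and the number of occurrences is simply the length of that list.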
2) Generating co-occurrence information
Each segmented word in a text has one piece of co-occurrence information, which comprises the number of times the word occurs in the text and the positions at which it occurs. Taking document A as the example again, the sequence of segmented words obtained from its title is "actor A | plays the lead | (particle) | film | in | actor A | looks | very | handsome", and "actor A" occurs twice, at positions 0 and 5. From the count and the positions, the co-occurrence information of "actor A" in the title of document A is (2, 0, 5): the first number is the number of occurrences and the following numbers are the positions. Note that the positions at which any segmented word occurs in the text are in increasing order and are also stored in increasing order when the co-occurrence information is generated. The application is not limited to the content of this example; the numbers, format and document content above do not limit the scope of protection of the application.
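The (count, positions...) layout of the co-occurrence (symbiosis) information can be sketched as a pair of helper functions. The names `cooccurrence` and `parse` are hypothetical, and representing the record as a Python tuple is an assumption made for illustration.

```python
def cooccurrence(positions):
    """Build the record: the count first, then the positions in increasing order."""
    ordered = sorted(positions)
    return (len(ordered), *ordered)


def parse(info):
    """Split a record back into (count, positions) and check its invariants."""
    count, *positions = info
    if count != len(positions):
        raise ValueError("count does not match the number of positions")
    if positions != sorted(positions):
        raise ValueError("positions must be stored in increasing order")
    return count, positions


info = cooccurrence([0, 5])   # "actor A" in the title of document A
count, where = parse(info)
```

Storing the count first lets a reader know how many position values follow, which is what makes it possible to pack several records back to back in one storage block.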
3) Storing
To make effective use of limited memory and to manage the stored data efficiently, the application stores the co-occurrence information of the segmented words in storage blocks, the number of which can be configured dynamically according to the amount of co-occurrence information to be stored. If the storage space of a block just matches the amount of co-occurrence information of the segmented words of one text (the difference between the two is below a preset threshold), one block stores the co-occurrence information of one text. If the amount of co-occurrence information per text is small and the block is large, one block can store the co-occurrence information of several texts. Note, however, that the co-occurrence information of all segmented words of one text becomes invalid together, and that a block is cleared and recycled as a whole. The co-occurrence information of all segmented words of the same text therefore needs to be assigned to the same storage block, which makes it easy to clean up, release and recycle blocks as a whole. If instead the co-occurrence information of one text were scattered over different blocks, a large amount of garbage would accumulate in the blocks, and a block consisting mostly of invalid data could not be cleaned up and released as long as a small amount of still-valid data remained in it. If the unused space remaining in a block is too small to hold the co-occurrence information of a text, it is filled with 0.
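A minimal sketch of this allocation rule follows. It assumes a fixed block capacity measured in integer cells, flattens the co-occurrence (symbiosis) records of one text into a list of integers, and ignores the ordering of addresses inside a block. `Block` and `place_text` are hypothetical names.

```python
class Block:
    def __init__(self, capacity):
        self.cells = [0] * capacity  # cells never written stay 0 (zero fill)
        self.used = 0
        self.text_ids = []           # texts whose records live in this block

    def free(self):
        return len(self.cells) - self.used

    def write(self, text_id, values):
        self.cells[self.used:self.used + len(values)] = values
        self.used += len(values)
        self.text_ids.append(text_id)


def place_text(blocks, text_id, values, capacity):
    """Put all records of one text into a single block, never across two."""
    if len(values) > capacity:
        raise ValueError("records of one text do not fit into one block")
    for block in blocks:
        if block.free() >= len(values):
            block.write(text_id, values)
            return block
    block = Block(capacity)          # the number of blocks grows on demand
    block.write(text_id, values)
    blocks.append(block)
    return block


blocks = []
place_text(blocks, "t1", [2, 0, 5, 1, 1], capacity=8)  # needs 5 cells
place_text(blocks, "t2", [1, 0, 1, 1], capacity=8)     # 3 cells left: new block
place_text(blocks, "t3", [1, 0], capacity=8)           # fits into the first block
```

Because a text never straddles two blocks, a block can be released as soon as every text listed in its `text_ids` has become invalid.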
Within each storage block, the co-occurrence information to be stored is written at the highest address currently available, while reading of a block generally starts from the lowest address. This ensures that the lowest address written so far in each block holds the most recent co-occurrence information, so the most recent co-occurrence information is obtained first when data is read from a block.
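The effect of writing from the highest free address downwards can be seen in a small sketch. A Python list index stands in for the address, and `DownwardBlock`, `push` and `read` are hypothetical names used only for this illustration.

```python
class DownwardBlock:
    def __init__(self, capacity):
        self.cells = [None] * capacity
        self.top = capacity  # one past the highest free address

    def push(self, record):
        """Store a record at the highest address that is still free."""
        if self.top == 0:
            raise OverflowError("block is full")
        self.top -= 1
        self.cells[self.top] = record

    def read(self):
        """Read upwards from the lowest written address: newest first."""
        return self.cells[self.top:]


block = DownwardBlock(4)
for record in ["oldest", "middle", "newest"]:
    block.push(record)
order = block.read()
```

After three writes the records occupy addresses 3, 2 and 1, so a reader that starts at the lowest written address meets "newest" first without any sorting.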
The storage blocks are organised as a singly linked circular list, with a head pointer and a tail pointer marking the first and the last storage block respectively. The blocks in the direction from the head pointer to the tail pointer are the blocks in use; because the list is circular, the blocks in the direction from the tail pointer to the head pointer are spare blocks. A timestamp is set in each block and records the time of the most recent store into that block. In the direction from the head pointer to the tail pointer, the storage time shown by each block's timestamp lies further and further from the current time; that is, the co-occurrence information stored in the blocks gets older and older and has never been updated. Of course, co-occurrence information that is old and has not been updated is not necessarily invalid: only when the text itself becomes invalid does its co-occurrence information become invalid with it.
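One way to realise such a ring is sketched below. The application does not say how a spare block is brought into use, so the `write` method here is an assumption: it takes the spare block immediately before the head and makes it the new head, which keeps timestamps growing older from head to tail. `BlockNode`, `BlockRing`, `write` and `in_use` are hypothetical names.

```python
class BlockNode:
    def __init__(self, name):
        self.name = name
        self.timestamp = None
        self.next = self


class BlockRing:
    def __init__(self, count):
        nodes = [BlockNode(i) for i in range(count)]
        for node, following in zip(nodes, nodes[1:] + nodes[:1]):
            node.next = following          # close the ring
        self.first = nodes[0]
        self.head = self.tail = None       # no block in use yet

    def _predecessor(self, node):
        current = node
        while current.next is not node:    # singly linked: walk around the ring
            current = current.next
        return current

    def write(self, now):
        """Bring one spare block into use as the newest block and stamp it."""
        if self.head is None:
            self.head = self.tail = self.first
        else:
            candidate = self._predecessor(self.head)
            if candidate is self.tail:
                raise RuntimeError("no spare block left")
            self.head = candidate
        self.head.timestamp = now
        return self.head

    def in_use(self):
        """Blocks from head to tail, newest first."""
        blocks, node = [], self.head
        while node is not None:
            blocks.append(node)
            if node is self.tail:
                break
            node = node.next
        return blocks


ring = BlockRing(4)
for now in (1.0, 2.0, 3.0):
    ring.write(now)
stamps = [b.timestamp for b in ring.in_use()]
```

The predecessor walk costs time proportional to the ring size; that is the price of keeping the list singly linked in this sketch.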
Step S1022: bind the text identifier to the co-occurrence information of each segmented word in the text, and associate the bound co-occurrence information of each segmented word with the corresponding segmented word in the index table, thereby updating the index table.
The index table (inverted index) is preferably built as a doubly linked circular list; other forms can of course be chosen, and the application is not limited in this respect.
The index table has been built in advance and has multiple nodes, each of which corresponds to one segmented word and stores the co-occurrence information corresponding to that word. For example, node A in the index table corresponds to the segmented word "actor A" and stores the co-occurrence information of "actor A". Suppose the texts collected earlier all contain the segmented word "actor A"; each of these texts then has one piece of co-occurrence information for "actor A", and all of these pieces have already been stored on node A. The application is not limited to the content of this example; the numbers, format and document content above do not limit the scope of protection of the application.
When the correspondence is established, the nodes of the index table are traversed for each segmented word whose co-occurrence information has been bound, and the bound co-occurrence information of the word is added to the node that is hit. Continuing the example with node A: for a newly collected text containing the segmented word "actor A", the co-occurrence information of "actor A" in this text is generated and bound to the text identifier; the nodes of the index table are then traversed for "actor A", node A is hit, and the co-occurrence information of "actor A" in this text is added to node A, which amounts to updating the co-occurrence information of node A. The application is not limited to the content of this example; the numbers, format and document content above do not limit the scope of protection of the application.
When a segmented word is not hit after the nodes of the index table have been traversed, the unhit segmented word is created on a blank node of the index table, and its bound co-occurrence information is stored on that blank node. Continuing the example: if no node in the index table corresponds to "actor A", then after the co-occurrence information of "actor A" in this text has been generated and bound to the text identifier, traversing the nodes of the index table for "actor A" obviously produces no hit. The segmented word "actor A" is then created on a blank node B of the index table, and the co-occurrence information of "actor A" in this text is stored on node B. Note that the blank node chosen in the index table is the one at the highest address currently available. The application is not limited to the content of this example; the numbers, format and document content above do not limit the scope of protection of the application.
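The hit-or-create update over a doubly linked circular list can be sketched as follows. The sketch simplifies one point and says so: it uses the first blank node met during the traversal, whereas the text above selects the blank node at the highest available address. `IndexNode`, `make_ring` and `update` are hypothetical names.

```python
class IndexNode:
    def __init__(self):
        self.token = None     # None marks a blank node
        self.postings = []    # (text_id, co-occurrence info) pairs
        self.prev = self
        self.next = self


def make_ring(size):
    """Build a doubly linked circular list of blank nodes; return one node."""
    nodes = [IndexNode() for _ in range(size)]
    for i, node in enumerate(nodes):
        node.next = nodes[(i + 1) % size]
        node.prev = nodes[i - 1]
    return nodes[0]


def update(start, token, text_id, info):
    """Add bound co-occurrence info to the hit node, or claim a blank node."""
    blank = None
    node = start
    while True:
        if node.token == token:                 # hit
            node.postings.append((text_id, info))
            return node
        if node.token is None and blank is None:
            blank = node                        # remember the first blank node
        node = node.next
        if node is start:
            break
    if blank is None:
        raise MemoryError("index table has no blank node left")
    blank.token = token                         # create the word on the blank node
    blank.postings.append((text_id, info))
    return blank


start = make_ring(3)
node_a = update(start, "actor A", "t1", (2, 0, 5))
node_b = update(start, "film", "t1", (1, 3))
again = update(start, "actor A", "t2", (1, 0))
```

The first and third calls end on the same node, so that node now carries the co-occurrence information of "actor A" for both texts.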
When traveling through in the node of described concordance list according to each in described each participle, preferentially can search the superlatively location in described node, because the superlatively location of described node is provided with participle indicating bit, be used to indicate the participle corresponding to described node.Such as, the participle that concordance list interior joint A is corresponding is " actor A ", that is provided with participle indicating bit in the superlatively location of node A, being used to refer to corresponding participle is specially " actor A ", superlatively location in each node is preferentially retrieved so that quick position is to destination node during such retrieval, if participle corresponding to symbiosis information mates with participle indicating bit, explanation have found node, just can travel through in node, if participle corresponding to symbiosis information does not mate with participle indicating bit, illustrate that this node is not right, node after directly can skipping examination, greatly improve recall precision.The application is not limited to the content of above example, and above-mentioned numeral, form and document content do not limit the protection domain of the application.
In addition, when a node cannot store all symbiosis information of a participle, jointly stored the symbiosis information of described participle by multiple node, the participle indicating bit of described multiple node all indicates described participle.Such as, " actor A " this word has all been there is in a lot of text, so the symbiosis information of " actor A " is a lot, data volume arrives greatly cannot be held by a node, so now have node A, Node B, node C jointly to store the symbiosis information of " actor A " respectively, each node stores the part in the symbiosis information of " actor A " respectively, participle indicating bit simultaneously in node A, Node B, node C all indicates corresponding participle to be " actor A ", just facilitates very much like this for retrieval.The application is not limited to the content of above example, and above-mentioned numeral, form and document content do not limit the protection domain of the application.
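The node update described above (add to a hit node, create the word segment on a blank node otherwise, and spill into further nodes when one node is full) can be sketched as follows. This is only an illustrative sketch: the names `IndexTable`, `Node` and `NODE_CAPACITY` are hypothetical, a plain Python list stands in for the linked nodes, and the list position is assumed to play the role of the node address.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

NODE_CAPACITY = 2  # hypothetical per-node limit, kept tiny so that overflow is visible

Entry = Tuple[int, int, Tuple[int, ...]]  # (text identifier, count, positions)


@dataclass
class Node:
    word: Optional[str] = None  # word-segment indicator; None marks a blank node
    entries: List[Entry] = field(default_factory=list)


class IndexTable:
    def __init__(self, size: int) -> None:
        # Assumption: the list index stands in for the node address.
        self.nodes = [Node() for _ in range(size)]

    def add(self, word: str, text_id: int, count: int, positions) -> Node:
        entry = (text_id, count, tuple(positions))
        for node in self.nodes:
            if node.word != word:  # indicator does not match: skip this node
                continue
            if len(node.entries) < NODE_CAPACITY:
                node.entries.append(entry)
                return node
        # No hit (or every hit node is full): take the blank node
        # at the highest available position.
        for node in reversed(self.nodes):
            if node.word is None:
                node.word = word
                node.entries.append(entry)
                return node
        raise MemoryError("no blank node left in the index table")

    def lookup(self, word: str) -> List[Entry]:
        # Several nodes may carry the same indicator; collect from all of them.
        return [e for n in self.nodes if n.word == word for e in n.entries]
```

With a capacity of two entries per node, a third entry for "actor A" lands in a second node whose indicator also reads "actor A", and `lookup` gathers the entries of both nodes.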
When the index table is updated, co-occurrence information is said to be stored in a node; what is actually stored in the node is the address at which the co-occurrence information is stored in the memory block. When the bound co-occurrence information of each word segment is stored in a node, it is stored from the offset toward the start address. The offset relative to the start address identifies the available storage space in each node: the available storage space is the range from the offset down to the start address. The total storage space of each node is preconfigured and is the part of the node between the start address and the end address. For example, if the addresses 0x08010000--0x08011000 of a node represent the total storage space and the offset is 0x08010501, then 0x08010000--0x08010500 is the available storage space into which new data can be written, and 0x08010501--0x08011000 is the storage space into which data can no longer be written. The application is not limited to the content of the above example, and the above numbers, forms and document contents do not limit the protection scope of the application.
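The offset bookkeeping can be illustrated with a small sketch. The class `NodeStorage`, its method names and the 4-byte slot size are assumptions made for the example; only the address values come from the text above. The offset is treated as the lowest address already in use, so everything below it down to the start address is free.

```python
class NodeStorage:
    """Sketch of a node that fills from high addresses toward the start address."""

    def __init__(self, start: int, end: int) -> None:
        self.start = start
        self.end = end
        self.offset = end + 1  # assumption: nothing is used yet, so the offset sits past the end
        self.slots = {}        # address inside the node -> address of the memory block

    def free_space(self) -> int:
        # Free addresses are start .. offset - 1.
        return self.offset - self.start

    def store(self, block_address: int, size: int = 4) -> int:
        if size > self.free_space():
            raise MemoryError("node is full")
        self.offset -= size  # move the offset toward the start address
        self.slots[self.offset] = block_address
        return self.offset
```

With start address 0x08010000 and offset 0x08010501, `free_space()` reports 0x501 addresses, which corresponds to the range 0x08010000--0x08010500 in the example.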
Step S1024: periodically query the validity of all nodes in the index table and clean up the nodes that have become invalid.
Besides adding co-occurrence information to the index table, this step also evicts data from the index table. When a query finds that the bound co-occurrence information of all word segments stored in a node has become invalid, the node is not deleted immediately, because a single query may be erroneous. To avoid mistaken deletion, the node is first masked in the index table and a waiting period follows, during which the node will probably be queried several more times as to whether it is invalid. If it is still invalid, it remains masked; once the node has been masked for longer than a preset threshold, its invalid state is confirmed to be true, the node is emptied, and the memory space of the node is retained.
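A minimal sketch of this mask-then-empty policy is given below. The names `ExpiringNode` and `sweep`, the numeric clock and the `is_invalid` callback are hypothetical; lifting the mask when a later query finds the node valid again is an assumption, since the text only describes the case in which the node stays invalid.

```python
class ExpiringNode:
    def __init__(self, word, entries):
        self.word = word
        self.entries = list(entries)
        self.masked_since = None  # time at which the node was masked, if any


def sweep(nodes, is_invalid, threshold, now):
    """One periodic pass over the index table nodes."""
    for node in nodes:
        if node.word is None:  # blank node, nothing to check
            continue
        if not is_invalid(node):
            node.masked_since = None  # assumption: a valid result lifts the mask
            continue
        if node.masked_since is None:
            node.masked_since = now   # first invalid result: mask only, do not delete
        elif now - node.masked_since > threshold:
            # Masked for longer than the threshold: empty the node but keep
            # the node object (its memory) for reuse as a blank node.
            node.word = None
            node.entries.clear()
            node.masked_since = None
```

Calling `sweep` repeatedly with an increasing `now` value first masks an invalid node and empties it only after it has stayed masked beyond the threshold.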
It should be noted that the timing of step S1024 is not constrained: it may be executed in parallel with steps S1020 ~ S1022, before step S1020, even before step S100, or after step S1022. The execution of step S1024 has no necessary connection with steps S100, S1020 and S1022.
Fig. 3 is a schematic flowchart of step S104 in Fig. 1, comprising:
Step S1040, following step S1022: identify the search word input by the user and perform word segmentation on the search word, traverse the index table according to each of the word segments obtained by the word segmentation, and obtain the nodes that are hit.
The above step is explained below with an application example.
Suppose the search word input by the user is "actor A actor B". The word segments obtained in order after word segmentation are "actor A | actor B". The index table is traversed according to "actor A" and node A is hit; the index table is traversed according to "actor B" and node B is hit. The application is not limited to the content of the above example, and the above numbers, forms and document contents do not limit the protection scope of the application.
Step S1042: for each node that is hit, obtain all the co-occurrence information stored in that node to form one group of co-occurrence information, thereby obtaining as many groups of co-occurrence information as there are hit nodes.
The above step is explained by continuing the previous application example.
For node A, the first group of co-occurrence information (occur) of "actor A" is obtained, including occur11, occur12, occur13 and so on; the text identifier bound to occur11 is 1391, that bound to occur12 is 1392, and that bound to occur13 is 1393.
For node B, the second group of co-occurrence information of "actor B" is obtained, including occur21, occur22, occur23 and so on; the text identifier bound to occur21 is 1391, that bound to occur22 is 2392, and that bound to occur23 is 2393. The application is not limited to the content of the above example, and the above numbers, forms and document contents do not limit the protection scope of the application.
Step S1044: extract, from the groups of co-occurrence information, the co-occurrence information that shares the same text identifier; for the co-occurrence information sharing a text identifier, compare in pairs the distance between the positions at which the corresponding word segments occur in the text and the numbers of occurrences; when the distance is less than or equal to a first threshold and the number of occurrences is less than or equal to a second threshold, output the text identifier in that co-occurrence information.
The above step is explained by continuing the previous application example.
Comparison shows that occur11 in the first group and occur21 in the second group share the text identifier 1391, which means that occur11 and occur21 come from the same text. occur11 and occur21 are extracted. The co-occurrence information carried by occur11 is (2, 0, 5), meaning that "actor A" occurs twice in the text whose identifier is 1391, at positions 0 and 5; the co-occurrence information carried by occur21 is (2, 2, 6), meaning that "actor B" occurs twice in the same text, at positions 2 and 6. Comparing the positions and numbers of occurrences in occur11 and occur21, the numbers of occurrences are equal and the distances between the positions do not exceed the threshold 2. This shows that in the text whose identifier is 1391 the words "actor A" and "actor B" always occur close to each other, which matches the user's search intent well. The text identifier 1391 is therefore output, and the text whose identifier is 1391 is the target of the user's search. The application is not limited to the content of the above example, and the above numbers, forms and document contents do not limit the protection scope of the application.
It should be noted that if the search word input by the user yields three word segments, three groups of co-occurrence information are found. If one piece of co-occurrence information is found in each of the three groups and the three pieces share the same text identifier, the distances between positions and the numbers of occurrences must be compared pairwise, between every two of them.
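The pairwise comparison of steps S1042 ~ S1044 can be sketched as below. The function names and the dictionary layout are hypothetical. Two interpretations are assumptions rather than statements of the text: positions are paired in order of occurrence (first with first, second with second), and the second threshold is applied to the difference between the two occurrence counts.

```python
from itertools import combinations


def close_enough(a, b, max_distance, max_count_gap):
    """a and b are tuples (count, pos1, pos2, ...) taken from the same text."""
    count_a, *pos_a = a
    count_b, *pos_b = b
    if abs(count_a - count_b) > max_count_gap:  # assumed reading of the second threshold
        return False
    # Assumption: occurrences are paired in order; zip stops at the shorter list.
    return all(abs(p - q) <= max_distance for p, q in zip(pos_a, pos_b))


def search(groups, max_distance, max_count_gap):
    """groups: one dict per hit node, mapping text identifier -> (count, positions...)."""
    if not groups:
        return []
    shared = set.intersection(*(set(g) for g in groups))
    return sorted(
        text_id
        for text_id in shared
        if all(
            close_enough(g1[text_id], g2[text_id], max_distance, max_count_gap)
            for g1, g2 in combinations(groups, 2)  # every pair of groups
        )
    )


# Entries for identifier 1391 follow the example above; the others are made up.
group_a = {1391: (2, 0, 5), 1392: (1, 3), 1393: (1, 7)}
group_b = {1391: (2, 2, 6), 2392: (1, 1), 2393: (1, 4)}
print(search([group_a, group_b], max_distance=2, max_count_gap=0))  # prints [1391]
```

Because `combinations(groups, 2)` enumerates every pair, the same function covers the three-segment case mentioned above without change.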
embodiment 1.2
This example, under the framework of embodiment 1, describes in detail how a text information object is extracted and used to update the forward index in the index table. Of course, after the forward index has been updated, the forward index can also be traversed according to the search word input by the user and the result output.
Fig. 4 is a schematic flowchart of steps S102 ~ S104 in Fig. 1, comprising:
Step S1120: count the word segments, take the number of word segments as the text length, and record the time at which the text length was obtained.
The above step is explained below with an application example.
For a collected document A, the title (text) of document A is "in the film that actor A stars in, actor A looks very handsome", which after word segmentation becomes "actor A | stars | | film | in | actor A | appearance | very | handsome", nine word segments in total, so the text length is 9; the time at which the text length was obtained is 2014-7-19 12:23:32.
Step S1122: combine one or more of the text identifier, the text length and the acquisition time of the text length with the text to form a text information object, update the index table according to the text information object, and perform step S1140.
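Steps S1120 ~ S1122 amount to building a small record per text. The sketch below is illustrative only: the class `TextInfo`, its field names and the identifier 1391 used for document A are assumptions, and the third segment is kept as an empty string because it is blank in the example above.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List


@dataclass
class TextInfo:
    text_id: int
    text: str
    length: int            # number of word segments
    acquired_at: datetime  # time at which the text length was obtained
    valid: bool = True     # validity flag, checked later by step S1124


def build_text_info(text_id: int, text: str, segments: List[str],
                    acquired_at: datetime) -> TextInfo:
    return TextInfo(text_id, text, len(segments), acquired_at)


segments = ["actor A", "stars", "", "film", "in",
            "actor A", "appearance", "very", "handsome"]
info = build_text_info(
    1391,  # hypothetical identifier for document A
    "in the film that actor A stars in, actor A looks very handsome",
    segments,
    datetime(2014, 7, 19, 12, 23, 32),
)
print(info.length)  # prints 9
```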
When the index table is updated according to the text information object, the text identifier is taken as the source code and mapped, with a preset mask and through a bitwise or logical operation, to an operand, and the operand obtained by the mapping is used as the memory address; the text information object corresponding to the text identifier is stored at that memory address. The memory address range requested for text information objects is 0 – 4294967296, while the total number of texts is generally limited to within 10 million, so mapping text identifiers to memory addresses by means of a mask saves considerable storage resources. The memory address range requested for text information objects is the valid range and in theory is never exceeded; if it is exceeded, the request is automatically ignored and an error message is output.
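The mask-based mapping can be sketched as follows. The class `ForwardIndex`, the 24-bit default mask and the use of a dictionary as the address space are assumptions for illustration; the text does not give the actual mask or the operation beyond "bitwise or logical". A bitwise AND is used here, and collisions between identifiers that map to the same address are not handled.

```python
import logging

ADDRESS_LIMIT = 4294967296  # upper bound of the address range given in the text
DEFAULT_MASK = 0x00FFFFFF   # hypothetical mask: 2**24 slots, more than 10 million texts


class ForwardIndex:
    def __init__(self, mask: int = DEFAULT_MASK) -> None:
        self.mask = mask
        self.slots = {}  # memory address -> text information object

    def address(self, text_id: int) -> int:
        # Map the text identifier (source code) to a memory address (operand).
        return text_id & self.mask

    def store(self, text_id: int, text_info) -> bool:
        addr = self.address(text_id)
        if not 0 <= addr <= ADDRESS_LIMIT:
            # Outside the valid range: ignore the request and report an error.
            logging.error("address %#x is outside the valid range; request ignored", addr)
            return False
        self.slots[addr] = text_info
        return True

    def get(self, text_id: int):
        return self.slots.get(self.address(text_id))
```

With the default mask every address stays within the valid range, so the range check only matters when a wider mask is configured.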
Step S1140, following step S1122: identify the search word input by the user and perform word segmentation on the search word, traverse the text of each valid text information object stored in the index table according to each of the word segments obtained by the word segmentation, and output the text information objects that are hit.
Step S1124: periodically check the validity flag set for each stored text information object; when the acquisition time of the text length in a text information object is more than a preset duration before the current time, set the validity flag of that text information object to invalid.
The text information object also stores the time at which the text length was obtained, which is the time at which the text information object was created. When that time is more than a preset duration before the current time, the text information object can be considered too old, and its validity flag is set to invalid. The network resources corresponding to the collected texts are updated constantly, Internet multimedia resources especially quickly, and the same multimedia information may be revised many times, so using time as the criterion for evicting old information is reasonable. An invalid text information object is preferably masked first and deleted later at a suitable time.
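Steps S1140 and S1124 can be sketched together as below. The dictionary layout and function names are hypothetical, and the rule that an object is a hit only when its text contains every word segment of the search word is an assumption, since the text does not say whether one matching segment is enough.

```python
from datetime import datetime, timedelta


def expire(objects, max_age, now):
    """Step S1124 sketch: flag objects whose acquisition time is too old."""
    for obj in objects:
        if obj["valid"] and now - obj["acquired_at"] > max_age:
            obj["valid"] = False


def search(objects, query_segments):
    """Step S1140 sketch: traverse valid objects only and return the hits."""
    return [
        obj["text_id"]
        for obj in objects
        if obj["valid"] and all(q in obj["segments"] for q in query_segments)
    ]


# Hypothetical forward-index contents.
objects = [
    {"text_id": 1, "segments": ["actor A", "film"],
     "acquired_at": datetime(2014, 1, 1), "valid": True},
    {"text_id": 2, "segments": ["actor A", "handsome"],
     "acquired_at": datetime(2014, 7, 19, 12, 23, 32), "valid": True},
]
expire(objects, max_age=timedelta(days=30), now=datetime(2014, 7, 20))
print(search(objects, ["actor A"]))  # prints [2]
```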
It should be noted that the timing of step S1124 is not constrained: it may be executed in parallel with steps S1120 ~ S1140, before step S1120, even before step S100, or after step S1140. The execution of step S1124 has no necessary connection with steps S100 ~ S1140.
embodiment 2
The device of this embodiment combines two search modes, forward index and inverted index; of course, a search using only the forward index or only the inverted index can also be implemented. Fig. 5 is a structural diagram of the device of an embodiment of the application, describing a device that searches based on an index table comprising a forward index and an inverted index, the device comprising:
Word segmentation configuration module 90, configured to allocate a text identifier to a collected text and perform word segmentation on the text. A collected document comprises multiple fields, each field containing one text. The object processed in this embodiment is the text of each field: word segmentation is performed on the text of any field, and co-occurrence information is generated for each word segment in the text of that field. In this case the text identifier denotes the field, so the text identifiers of different fields of the same document are different; in the subsequent retrieval step the search is performed, according to the search word input by the user, within the co-occurrence information belonging to the same text identifier, so the result finally retrieved is the text corresponding to the text identifier. Of course, the text identifier may also identify the whole document, in which case the text identifiers of different fields of the same document are identical; the search is again performed within the co-occurrence information belonging to the same text identifier, so the result finally retrieved is the document corresponding to the text identifier.
The text is the text of any field in a document, or may be a document comprising multiple fields; the fields include the title, the body, comments and the like; the word segments obtained include characters and/or words.
Index update module 92, coupled to the word segmentation configuration module 90, configured to extract parameter information of the text according to each word segment obtained after the text is processed, and to update the index table 96 according to the parameter information of the text. The index table 96 here comprises two kinds, a forward index table 962 and an inverted index table 961, which may be used alternatively or in combination; when used in combination, the forward index table 962 and the inverted index table 961 neither constrain nor depend on each other but operate in parallel without interference. For the inverted index, the parameter information used comprises the co-occurrence information of each word segment in the text; for the forward index, the parameter information used comprises the text information object.
Retrieval module 94, coupled to the index table 96, configured to identify the search word input by the user, perform word segmentation on the search word, traverse the index table 96 according to each of the word segments obtained by the word segmentation, and output the traversal result.
embodiment 2.1
The device of this embodiment, under the framework of embodiment 2, implements the maintenance and retrieval of the inverted index in the index table. Fig. 6 is another structural diagram of the device of an embodiment of the application, describing a device that updates the index table based on word segments and retrieves using the index table, the device comprising:
Word segmentation configuration module 90, configured to allocate a text identifier to a collected text and perform word segmentation on the text;
Index update module 92, further comprising: a statistics storage unit 920 coupled to the word segmentation configuration module 90, and an index update unit 922;
The statistics storage unit 920 is configured to count the number of times each word segment obtained after the text is processed occurs in the text and the positions at which it occurs, and to form and store the co-occurrence information of each word segment in the text from that number of occurrences and those positions;
The index update unit 922, coupled to the statistics storage unit 920, is configured to bind the text identifier to the co-occurrence information of each word segment in the text, and to establish a correspondence between the bound co-occurrence information of each word segment and the corresponding word segment in the index table, thereby updating the index table 961. The index table 961 is an inverted index table.
When generating co-occurrence information, the statistics storage unit 920 stores, in the co-occurrence information of any word segment in the text, the positions at which that word segment occurs in the text in increasing order; for the ordered word segments obtained after the text is processed, the position at which each word segment occurs in the text is identified by the ordinal number of that word segment.
When storing co-occurrence information, the statistics storage unit 920 stores the co-occurrence information of each word segment in memory blocks. One memory block stores the co-occurrence information of the word segments of one or more texts; the co-occurrence information of all word segments belonging to the same text is assigned to the same memory block; and in each memory block the co-occurrence information to be stored is written at the highest currently available address. A timestamp is set in each memory block and records the most recent storage time of that memory block. The memory blocks are organized as a singly linked circular list, with a head pointer and a tail pointer identifying the first memory block and the last memory block respectively; in the direction from the head pointer to the tail pointer, the storage times shown by the timestamps of the memory blocks are increasingly far from the current time.
The index update unit 922 builds the index table 961 as a doubly linked circular list; each node in the index table 961 corresponds to one word segment and stores the co-occurrence information corresponding to that word segment. For the bound co-occurrence information of each word segment, the nodes of the index table 961 are traversed according to each of the word segments, and the bound co-occurrence information of the corresponding word segment is added to the node that is hit. When a word segment is not hit after the nodes of the index table 961 have been traversed, the unhit word segment is created on a blank node in the index table 961, and the bound co-occurrence information of the unhit word segment is stored on that blank node, the blank node at the highest currently available address in the index table 961 being selected for storage.
When traversing the nodes of the index table 961 according to each of the word segments, the index update unit 922 searches the highest address of each node first. A word-segment indicator bit is set at the highest address of the node to indicate the word segment stored by the node. When one node cannot store all the co-occurrence information of a word segment, the co-occurrence information of that word segment is stored jointly by multiple nodes, and the indicator bits of those nodes all indicate that word segment.
When co-occurrence information is stored in a node, what is actually stored in the node is the address at which the co-occurrence information is stored in the memory block. The index update unit 922 stores the bound co-occurrence information of each word segment from the offset toward the start address; the offset relative to the start address identifies the available storage space in each node, the available storage space being the range from the offset down to the start address. The total storage space of each node is preconfigured and is the part between the start address and the end address.
In addition, the index update unit 922 periodically queries the validity of all nodes in the index table 961; when the bound co-occurrence information of all word segments stored in a node has become invalid, the node is masked in the index table; when the node has been masked for longer than a preset threshold, the node is emptied and the memory space of the node is retained.
The retrieval module 94, coupled to the index table 961 updated and maintained by the index update unit 922, is configured to: identify the search word input by the user and perform word segmentation on the search word; traverse the index table 961 according to each of the word segments obtained by the word segmentation and obtain the nodes that are hit; for each node that is hit, obtain all the co-occurrence information stored in that node to form one group of co-occurrence information, thereby obtaining as many groups of co-occurrence information as there are hit nodes; extract, from the groups of co-occurrence information, the co-occurrence information that shares the same text identifier; for the co-occurrence information sharing a text identifier, compare in pairs the distance between the positions at which the corresponding word segments occur in the text and the numbers of occurrences; and, when the distance is less than or equal to a first threshold and the number of occurrences is less than or equal to a second threshold, output the text identifier in that co-occurrence information.
The connection relationships and module functions of the word segmentation configuration module 90, the index update module 92, the retrieval module 94 and the index table 961 of this embodiment correspond to embodiment 1.1 and are therefore not repeated here; for anything not covered, refer to embodiment 1.1.
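The work of the statistics storage unit 920 and the binding done by the index update unit 922 can be illustrated with a short sketch. The function name, the tuple layout and the sample segment list are hypothetical; the sample is constructed so that the result matches the (2, 0, 5) and (2, 2, 6) values used in the retrieval example of embodiment 1.1. Positions are zero-based ordinals of the word segments.

```python
from collections import defaultdict


def cooccurrence_info(text_id, segments):
    """Return word segment -> (text identifier, count, positions in increasing order)."""
    positions = defaultdict(list)
    for ordinal, word in enumerate(segments):
        positions[word].append(ordinal)  # ordinals are visited in increasing order
    return {
        word: (text_id, len(pos), tuple(pos))  # bind the text identifier to each entry
        for word, pos in positions.items()
    }


# Hypothetical segmented text with identifier 1391.
segments = ["actor A", "with", "actor B", "and", "then", "actor A", "actor B"]
info = cooccurrence_info(1391, segments)
print(info["actor A"])  # prints (1391, 2, (0, 5))
print(info["actor B"])  # prints (1391, 2, (2, 6))
```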
embodiment 2.2
The device of this embodiment, under the framework of embodiment 2, implements the maintenance and retrieval of the forward index in the index table. Fig. 7 is another structural diagram of the device of an embodiment of the application, describing a device that updates the index table based on word segments and retrieves using the index table, the device comprising:
Word segmentation configuration module 90, configured to allocate a text identifier to a collected text and perform word segmentation on the text;
Index update module 92, further comprising: a statistics storage unit 924 and an index update unit 926;
Statistics storage unit 924, coupled to the word segmentation configuration module 90, configured to count the word segments, take the number of word segments as the text length, and record the time at which the text length was obtained; and to combine one or more of the text identifier, the text length and the acquisition time of the text length with the text to form a text information object;
Index update unit 926, coupled to the statistics storage unit 924, configured to update the index table 962 according to the text information object; the index table 962 is a forward index table.
When the text information object is stored in the index table 962, the index update unit 926 takes the text identifier as the source code and maps it, with a preset mask and through a bitwise or logical operation, to an operand, and uses the operand obtained by the mapping as the memory address; the text information object corresponding to the text identifier is stored at that memory address.
In addition, the index update unit 926 is further configured to periodically check the validity flag set for each stored text information object; when the acquisition time of the text length in a text information object is more than a preset duration before the current time, the validity flag of that text information object is set to invalid.
The retrieval module 94, coupled to the index table 962, is configured to identify the search word input by the user, perform word segmentation on the search word, traverse the text of each valid text information object stored in the index table 962 according to each of the word segments obtained by the word segmentation, and output the text information objects that are hit.
The connection relationships and module functions of the word segmentation configuration module 90, the index update module 92, the retrieval module 94 and the index table 962 of this embodiment correspond to embodiment 1.2 and are therefore not repeated here; for anything not covered, refer to embodiment 1.2.
It should be noted that, to implement the functions of embodiment 2.1 and embodiment 2.2 simultaneously under the framework of embodiment 2, it is only necessary to integrate the functions of the corresponding modules of embodiment 2.1 and embodiment 2.2; the connection relationships of the modules are as shown in Fig. 6.
The above description shows and describes several preferred embodiments of the application. As stated above, however, it should be understood that the application is not limited to the forms disclosed herein, which should not be regarded as excluding other embodiments; the application can be used in various other combinations, modifications and environments, and can be modified within the scope of the inventive concept described herein through the above teachings or the skill or knowledge of the related art. Modifications and changes made by those skilled in the art that do not depart from the spirit and scope of the application shall all fall within the protection scope of the appended claims of the application.

Claims (12)

1. A method for searching based on an index table, characterized by comprising:
allocating a text identifier to a collected text, and performing word segmentation on the text;
extracting parameter information of the text according to each word segment obtained after the text is processed, and updating the index table according to the parameter information of the text;
identifying a search word input by a user and performing word segmentation on the search word, traversing the index table according to each of the word segments obtained by the word segmentation, and outputting the traversal result.
2. The method according to claim 1, characterized in that extracting parameter information of the text according to each word segment obtained after the text is processed, and updating the index table according to the parameter information of the text, further comprises:
counting the number of times each word segment obtained after the text is processed occurs in the text and the positions at which it occurs, and forming and storing the co-occurrence information of each word segment in the text from that number of occurrences and those positions;
binding the text identifier to the co-occurrence information of each word segment in the text, and establishing a correspondence between the bound co-occurrence information of each word segment and the corresponding word segment in the index table, thereby updating the index table.
3. The method according to claim 2, characterized in that forming and storing the co-occurrence information of each word segment in the text from the number of times each word segment occurs in the text and the positions at which it occurs further comprises:
storing the co-occurrence information of each word segment in memory blocks, wherein one memory block stores the co-occurrence information of the word segments of one or more texts, the co-occurrence information of all word segments belonging to the same text is assigned to the same memory block, and in each memory block the co-occurrence information to be stored is written at the highest currently available address.
4. The method according to claim 3, characterized in that:
a timestamp is set in each memory block, the timestamp recording the most recent storage time of that memory block;
the memory blocks are organized as a singly linked circular list, a head pointer and a tail pointer identifying the first memory block and the last memory block respectively, and in the direction from the head pointer to the tail pointer the storage times shown by the timestamps of the memory blocks are increasingly far from the current time.
5. The method according to claim 2, characterized in that establishing a correspondence between the bound co-occurrence information of each word segment and the corresponding word segment in the index table, thereby updating the index table, further comprises:
building the index table as a doubly linked circular list, each node in the index table corresponding to one word segment and storing the co-occurrence information corresponding to that word segment;
for the bound co-occurrence information of each word segment, traversing the nodes of the index table according to each of the word segments; when a word segment is hit, storing the bound co-occurrence information of that word segment on the node corresponding to the hit word segment; or, when a word segment is not hit, creating the unhit word segment on a blank node in the index table and storing the bound co-occurrence information of the unhit word segment on that blank node.
6. The method according to any one of claims 2 to 5, characterized in that:
the validity of all nodes in the index table is queried periodically, and when the bound co-occurrence information of all word segments stored in a node has become invalid, the node is masked in the index table;
when the node has been masked for longer than a preset threshold, the node is emptied and the memory space of the node is retained.
7. The method according to claim 2, characterized in that identifying the search word input by the user, performing word segmentation on the search word, traversing the index table for each of the multiple word segments obtained by the segmentation, and outputting the traversal result further comprises:
Identifying the search word input by the user and performing word segmentation on the search word, then traversing the index table for each of the multiple word segments obtained by the segmentation, to obtain multiple hit nodes;
For each hit node, obtaining all co-occurrence information stored in the node to form one group of co-occurrence information, thereby obtaining multiple groups of co-occurrence information corresponding to the number of hit nodes;
Extracting, from the multiple groups of co-occurrence information, the co-occurrence information having the same text identifier; for co-occurrence information with the same text identifier, comparing pairwise the positions at which the corresponding word segments occur in the text and their numbers of occurrences; when the distance between the positions at which the corresponding word segments occur in the text is less than or equal to a first threshold and the number of occurrences is less than or equal to a second threshold, outputting that text identifier.
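The pairwise comparison in this claim can be sketched as below. Several points are assumptions: the index is modelled as a plain dictionary from word segment to its records, the query is taken as already segmented, and the occurrence-count condition is read as "each of the two counts is at most the second threshold", which the claim wording does not settle.

```python
from itertools import combinations
from typing import Dict, List, NamedTuple, Set


class Info(NamedTuple):
    text_id: int
    position: int  # position of the word segment in the text
    count: int     # number of occurrences in the text


def search(
    index: Dict[str, List[Info]],
    query_terms: List[str],
    max_distance: int,
    max_count: int,
) -> Set[int]:
    # One group of records per hit node.
    groups = [index[term] for term in query_terms if term in index]
    hits: Set[int] = set()
    for group_a, group_b in combinations(groups, 2):
        for a in group_a:
            for b in group_b:
                if a.text_id != b.text_id:
                    continue  # only compare records of the same text
                close = abs(a.position - b.position) <= max_distance
                bounded = a.count <= max_count and b.count <= max_count
                if close and bounded:
                    hits.add(a.text_id)
    return hits
```

For example, with "index" at position 0 and "table" at position 1 of text 1, a first threshold of 3 reports text 1, while a text in which the two segments are 40 positions apart is not reported.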
8. The method according to claim 1, characterized in that extracting the parameter information of the text according to each word segment obtained after processing the text, and updating the index table according to the parameter information of the text, further comprises:
Counting the number of word segments, using the number of word segments as the text length, and recording the acquisition time of the text length;
Merging one or more of the text identifier, the text length, or the acquisition time of the text length with the text to form a text information object, and updating the index table according to the text information object.
9. The method according to claim 8, characterized in that updating the index table according to the text information object further comprises:
Using the text identifier as a source code, mapping it to an operand with a preset mask through a bitwise operation or a logical operation, and using the operand obtained by the mapping as a memory address;
Storing the text information object corresponding to the text identifier at the memory address.
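One simple way to realize the mask mapping in this claim is a bitwise AND, sketched below. The choice of AND, the 10-bit mask, and the names slot_for and TextStore are assumptions; the claim only requires some bitwise or logical operation with a preset mask. A list index stands in for the memory address.

```python
from typing import Any, List, Optional, Tuple

SLOT_BITS = 10                 # assumed table size of 2**10 slots
MASK = (1 << SLOT_BITS) - 1    # preset mask, here 0x3FF


def slot_for(text_id: int, mask: int = MASK) -> int:
    # Map the text identifier to a slot number with a bitwise AND.
    return text_id & mask


class TextStore:
    """Direct-mapped store; assumes the mask has the form 2**k - 1."""

    def __init__(self, mask: int = MASK) -> None:
        self.mask = mask
        self.slots: List[Optional[Tuple[int, Any]]] = [None] * (mask + 1)

    def put(self, text_id: int, obj: Any) -> None:
        # A colliding identifier overwrites the previous occupant.
        self.slots[slot_for(text_id, self.mask)] = (text_id, obj)

    def get(self, text_id: int) -> Optional[Any]:
        entry = self.slots[slot_for(text_id, self.mask)]
        if entry is not None and entry[0] == text_id:
            return entry[1]
        return None
```

The mapping is constant time, at the price of collisions between identifiers that share their low bits; this sketch resolves them by overwriting, which is only one possible policy.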
10. The method according to claim 8, characterized in that updating the index table according to the text information object further comprises:
Periodically checking the validity flag set for each stored text information object; when the acquisition time of the text length in the text information object is more than a preset duration before the current time, setting the validity flag of the text information object to invalid.
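The periodic expiry check in this claim can be sketched as follows. The TextInfo fields mirror the text information object of claim 8 (identifier, text, length in word segments, acquisition time); the field names and the function expire are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class TextInfo:
    text_id: int
    text: str
    length: int         # number of word segments in the text
    acquired_at: float  # time at which the length was obtained
    valid: bool = True  # validity flag


def expire(objects: Iterable[TextInfo], now: float, max_age: float) -> int:
    """Mark objects older than max_age as invalid; return how many changed."""
    changed = 0
    for obj in objects:
        if obj.valid and now - obj.acquired_at > max_age:
            obj.valid = False
            changed += 1
    return changed
```

Only the flag is flipped; the object stays in place, so a later lookup can cheaply skip it without any memory being released.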
11. The method according to claim 8, characterized in that identifying the search word input by the user, performing word segmentation on the search word, traversing the index table for each of the multiple word segments obtained by the segmentation, and outputting the traversal result further comprises:
Identifying the search word input by the user and performing word segmentation on the search word, then, for each of the multiple word segments obtained by the segmentation, traversing the text of each valid text information object stored in the index table, and outputting the hit text information objects.
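Retrieval over valid text information objects, as in this claim, reduces to a filtered scan. In the sketch below the query is assumed to be already segmented, a hit is taken to mean that at least one word segment occurs in the text, and search_texts is a hypothetical name; the claim does not say whether one or all segments must match.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class TextInfo:
    text_id: int
    text: str
    valid: bool = True  # validity flag; invalid objects are skipped


def search_texts(objects: Iterable[TextInfo], query_terms: List[str]) -> List[TextInfo]:
    hits = []
    for obj in objects:
        if not obj.valid:
            continue
        # Assumption: a single matching word segment counts as a hit.
        if any(term in obj.text for term in query_terms):
            hits.append(obj)
    return hits
```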
12. A device for searching based on an index table, characterized by comprising:
A word segmentation configuration module, configured to assign a text identifier to a collected text and perform word segmentation on the text;
An index update module, configured to extract the parameter information of the text according to each word segment obtained after processing the text, and to update the index table according to the parameter information of the text;
A retrieval module, configured to identify the search word input by the user and perform word segmentation on the search word, traverse the index table for each of the multiple word segments obtained by the segmentation, and output the traversal result.
CN201410802676.2A 2014-12-19 2014-12-19 Method and device for searching based on index table Pending CN104572871A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410802676.2A CN104572871A (en) 2014-12-19 2014-12-19 Method and device for searching based on index table

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410802676.2A CN104572871A (en) 2014-12-19 2014-12-19 Method and device for searching based on index table

Publications (1)

Publication Number Publication Date
CN104572871A true CN104572871A (en) 2015-04-29

Family

ID=53088933

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410802676.2A Pending CN104572871A (en) 2014-12-19 2014-12-19 Method and device for searching based on index table

Country Status (1)

Country Link
CN (1) CN104572871A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105930404A (en) * 2016-04-15 2016-09-07 清华大学 Symbiotic relationship analysis based service combination topic evolution diagram construction method
CN106709042A (en) * 2016-12-30 2017-05-24 北京小度互娱科技有限公司 Index updating method and device
CN106874327A (en) * 2016-07-08 2017-06-20 阿里巴巴集团控股有限公司 Counting method and device for business data
CN109918375A (en) * 2019-02-26 2019-06-21 杭州云象网络技术有限公司 Large-text storage, indexing and retrieval method based on blockchain and distributed storage
CN111625617A (en) * 2020-06-01 2020-09-04 Oppo广东移动通信有限公司 Data indexing method and device and computer readable storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102023989A (en) * 2009-09-23 2011-04-20 阿里巴巴集团控股有限公司 Information retrieval method and system thereof
CN102236719A (en) * 2011-07-25 2011-11-09 西交利物浦大学 Page search engine based on page classification and quick search method
US20130024459A1 (en) * 2011-07-20 2013-01-24 Microsoft Corporation Combining Full-Text Search and Queryable Fields in the Same Data Structure
CN103064847A (en) * 2011-10-20 2013-04-24 北京中搜网络技术股份有限公司 Indexing equipment, indexing method, search device, search method and search system
CN103186622A (en) * 2011-12-30 2013-07-03 北大方正集团有限公司 Updating method of index information in full text retrieval system and device thereof

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105930404A (en) * 2016-04-15 2016-09-07 清华大学 Symbiotic relationship analysis based service combination topic evolution diagram construction method
CN105930404B (en) * 2016-04-15 2019-02-12 清华大学 Symbiotic relationship analysis based service combination topic evolution diagram construction method
CN106874327A (en) * 2016-07-08 2017-06-20 阿里巴巴集团控股有限公司 Counting method and device for business data
CN106874327B (en) * 2016-07-08 2021-02-23 创新先进技术有限公司 Counting method and device for business data
CN106709042A (en) * 2016-12-30 2017-05-24 北京小度互娱科技有限公司 Index updating method and device
CN106709042B (en) * 2016-12-30 2020-09-25 北京小度互娱科技有限公司 Index updating method and equipment
CN109918375A (en) * 2019-02-26 2019-06-21 杭州云象网络技术有限公司 Large-text storage, indexing and retrieval method based on blockchain and distributed storage
CN111625617A (en) * 2020-06-01 2020-09-04 Oppo广东移动通信有限公司 Data indexing method and device and computer readable storage medium

Similar Documents

Publication Publication Date Title
CN104572871A (en) Method and device for searching based on index table
CN102164186B (en) Method and system for realizing cloud search service
CN108255925A (en) Display method for data table structure changes and terminal thereof
CN103678494A (en) Method and device for client side and server side data synchronization
CN104268428A (en) Visual configuration method for index calculation
CN104951512A (en) Public sentiment data collection method and system based on Internet
CN103733195A (en) Managing storage of data for range-based searching
CN102541529A (en) Query page generating device and method
CN103593352A (en) Method and device for cleaning mass data
CN103473076A (en) Issuing method and issuing system for code version
CN104469832A (en) Fault analyzing and positioning auxiliary system for mobile communication network
CN106503274A (en) Data integration and search method and server
CN102110102A (en) Data processing method and device, and file identifying method and tool
CN104077385A (en) Classification and retrieval method of files
CN103077192B (en) Data processing method and system thereof
CN111008020A (en) Method for analyzing logic expression into general query statement
CN103914488A (en) Document collection, identification, association, search and display system
CN111324604A (en) Database table processing method and device, electronic equipment and storage medium
CN101675415A (en) Program pattern analyzer, pattern appearance status information production method, pattern information generating device, and program
CN103914487A (en) Document collection, identification and association system
CN112579454B (en) Task data processing method, device and equipment
CN103914486A (en) Document search and display system
CN110990350A (en) Log analysis method and device
CN104572879A (en) Method and device for updating index table and method and device for searching based on index table
CN107291938A (en) Order Query System and method

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20150429
