CN100458784C - Researching system and method used in digital library - Google Patents

Info

Publication number
CN100458784C
Authority
CN
China
Prior art keywords
retrieval
layer
index
server
result
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CNB2006100720756A
Other languages
Chinese (zh)
Other versions
CN101051309A (en
Inventor
廖祥文
孙健
王斌
杨东波
程学旗
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute of Computing Technology of CAS
Original Assignee
Institute of Computing Technology of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institute of Computing Technology of CAS filed Critical Institute of Computing Technology of CAS
Priority to CNB2006100720756A priority Critical patent/CN100458784C/en
Publication of CN101051309A publication Critical patent/CN101051309A/en
Application granted granted Critical
Publication of CN100458784C publication Critical patent/CN100458784C/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

A retrieval system for a digital library comprises: a retrieval controller layer with one or more retrieval controllers, which forwards queries to the retrieval server layer and returns the retrieval results to the user; a retrieval server layer with one or more retrieval servers, which stores the core index data, executes the retrieval, and provides the retrieval results to the retrieval controller layer; and an index server layer with one or more index servers, which stores all index data and retrieves the relevant index entries to obtain retrieval results.

Description

Retrieval system and retrieval method for use in a digital library
Technical field
The present invention relates to the field of information retrieval and, more specifically, to a retrieval system and a retrieval method for use in a digital library that retrieve efficiently, scale easily, and can be applied to digital libraries with massive data and large-scale concurrent access.
Background art
Many technologies are currently available for building digital libraries. A typical digital library uses a database approach, which makes it quick and easy to build a library system with a relatively small amount of data. When the data grows to the terabyte level, however, the index becomes very large and retrieval becomes slow, so the approach cannot keep up with the current explosive growth of information. Moreover, when the number of concurrent user queries increases, these technologies cannot scale flexibly and struggle to meet ever-growing user demand.
There are also well-known information retrieval techniques used by commercial search engines. These techniques usually use a crawler to fetch web pages from the Internet automatically and apply indexing techniques to index the pages. They typically use an inverted list, rank the results according to page features, and provide users with retrieval service at second-level response times.
Book retrieval, however, has characteristics of its own. Unlike automatically crawled Web data, library data is processed, high-quality structured data with richer content. Web retrieval searches only page-level content, whereas a book retrieval system needs deeper, field-level retrieval. And while Web retrieval emphasizes the accuracy of the first few dozen results, book retrieval demands both high recall and high precision and must remain effective over the long term.
Existing commercial search engines do not fully take these characteristics of digital library services into account, and they place very high demands on hardware resources, which libraries, as public services, currently find difficult to meet.
With the development of libraries and the steady growth of digitized collections in recent years, digital libraries now hold large quantities of digital resources and need to serve them over the Internet. This poses a challenge for digital library construction: how to build a retrieval system that handles massive metadata and serves users worldwide. An efficient, scalable method of building digital libraries is therefore urgently needed. Such a method must meet the recall and precision requirements of book retrieval, scale with growth in data and in concurrent user queries, and handle multilingual data sources.
Summary of the invention
Accordingly, the object of the present invention is to provide a retrieval system and a retrieval method for use in a digital library that retrieve efficiently, scale easily, and can be applied to digital libraries with massive data and large-scale concurrent access.
To achieve this object, the present invention provides a retrieval system for use in a digital library, comprising: a retrieval controller layer comprising one or more retrieval controllers, for forwarding a user query from a user to a retrieval server layer and for processing the corresponding retrieval results from the retrieval server layer so as to return them to the user; the retrieval server layer, comprising one or more retrieval servers, for storing core index data of the digital library, first performing retrieval on the core index data for the user query to obtain retrieval results, accessing an index server layer to obtain retrieval results when none can be obtained from the core index data, and providing the obtained retrieval results to the retrieval controller layer; and the index server layer, comprising one or more index servers, for storing all index data of the digital library so that, when accessed by the retrieval server layer, the corresponding index is retrieved from all the index data to obtain retrieval results.
Preferably, the retrieval system further comprises a distribution server layer for load-balancing the user queries from users.
Preferably, the load balancing is implemented using IP-layer-based load distribution, transport-layer-based load distribution, or application-layer-based load distribution.
Preferably, the IP-layer-based load distribution includes an IP-layer-based Round-Robin scheme.
Preferably, the retrieval controller layer processes the corresponding retrieval results from the retrieval server layer by merging them and generating summaries.
Preferably, the retrieval controller layer organizes the retrieval results in XML format.
Preferably, the retrieval servers in the retrieval server layer form retrieval server groups, with a core index established for each book database, and the index servers in the index server layer form index server groups, with an index stored for each book database.
Preferably, the core index is established from the index entries whose inverted list length lies between two thresholds.
Preferably, the core index is established according to the historical frequency of occurrence of index entries.
Preferably, the core index is established according to a core vocabulary corresponding to the characteristics of user access behavior.
Preferably, when the number of concurrent user queries increases, the number of retrieval controllers and retrieval server groups increases linearly.
Preferably, when the data scale of the digital library increases, the number of retrieval servers and index servers increases linearly.
To achieve the above object, the present invention also provides a retrieval method for use in a digital library, comprising: forwarding, by a retrieval controller layer, a user query from a user to a retrieval server layer, and processing the corresponding retrieval results from the retrieval server layer so as to return them to the user; for the user query, first performing retrieval, by the retrieval server layer, on the stored core index data of the digital library to obtain retrieval results, accessing an index server layer to obtain retrieval results when none can be obtained from the core index data, and providing the obtained retrieval results to the retrieval controller layer; and, in response to access from the retrieval server layer, retrieving, by the index server layer, the corresponding index from all the stored index data of the digital library to obtain retrieval results.
Description of drawings
The above objects, advantages, and features of the present invention will become apparent from the following detailed description of preferred embodiments taken in conjunction with the accompanying drawings, in which:
Fig. 1 is a schematic structural view of the retrieval system for use in a digital library according to an embodiment of the invention; and
Fig. 2 is a schematic diagram of the test platform architecture for the retrieval system shown in Fig. 1.
Embodiment
The preferred embodiments of the present invention are described below with reference to the accompanying drawings.
Fig. 1 is a schematic structural view of the retrieval system for use in a digital library according to an embodiment of the invention. The retrieval system adopts a cluster-based architecture.
As shown in Fig. 1, the retrieval system according to the embodiment comprises: a distribution server layer 1, which performs load balancing in Round-Robin fashion; a retrieval controller layer 2, which includes Web servers acting as retrieval controllers, each Web server running Apache Server, storing the raw data documents in XML format, and using a merge algorithm to merge the retrieval results from the retrieval servers, generate summaries, and return them to the user; a retrieval server layer 3, which performs retrieval for user queries and returns the retrieval results to the retrieval controller layer 2; and an index server layer 4, which stores all data indexes in order to serve the retrieval server layer 3. The configuration of each layer is described in detail below.
1) Distribution server layer 1
First, it is described how the distribution servers in the distribution server layer 1 perform load balancing, that is, how the load distribution function of the distribution server layer 1 is implemented.
At present there are three main types of load distribution: load distribution based on the IP layer, load distribution based on layer 4 (the transport layer), and load distribution based on layer 7 (the application layer). Hardware switches exist for each type. As an example, the present invention can perform load balancing using an IP-layer-based Round-Robin scheme.
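The Round-Robin idea can be illustrated with a minimal sketch. The patent applies it at the IP layer, typically in a switch; the Python below only shows the rotation logic at application level, and the class name `RoundRobinDistributor` and the controller names are hypothetical.

```python
from itertools import cycle


class RoundRobinDistributor:
    """Hands each incoming query to the next retrieval controller in turn."""

    def __init__(self, controllers):
        if not controllers:
            raise ValueError("at least one retrieval controller is required")
        # cycle() repeats the controller list endlessly in a fixed order.
        self._rotation = cycle(list(controllers))

    def dispatch(self, query):
        # Return the controller chosen for this query together with the query.
        return next(self._rotation), query


# Hypothetical controller names, used only for illustration.
distributor = RoundRobinDistributor(["controller-1", "controller-2", "controller-3"])
assignments = [distributor.dispatch(f"query-{i}")[0] for i in range(5)]
```

Each controller receives every third query here, regardless of how expensive an individual query is, which is the usual trade-off of plain Round-Robin.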
2) Retrieval controller layer 2
Next, it is described how the retrieval controllers in the retrieval controller layer 2 dispatch user queries to the retrieval servers, merge and sort the results, generate summaries, and return them to the user, that is, how the retrieval control function of the retrieval controller layer 2 is implemented.
Typically, the Web server acting as a retrieval controller is Apache Server with the FastCGI module loaded.
In this Web server, the book data corresponding to the retrieval results is organized in XML format. As an example, the format is as follows:
<METADATA>
<ID>00001</ID>
<TITLE>Modern Information Retrieval</TITLE>
<AUTHOR>Ricardo Baeza-Yates, etc</AUTHOR>
<PUBLISHER>China Machine Press</PUBLISHER>
</METADATA>
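As a hedged illustration of how a retrieval controller might read such a record, the sketch below parses a well-formed version of the example with Python's standard `xml.etree.ElementTree`. The function name `parse_record` and the lower-cased dictionary keys are assumptions, not part of the patent.

```python
import xml.etree.ElementTree as ET

# Well-formed version of the example record (closing tags repaired).
RECORD = """<METADATA>
  <ID>00001</ID>
  <TITLE>Modern Information Retrieval</TITLE>
  <AUTHOR>Ricardo Baeza-Yates, etc</AUTHOR>
  <PUBLISHER>China Machine Press</PUBLISHER>
</METADATA>"""


def parse_record(xml_text):
    """Turn one <METADATA> element into a dict keyed by lower-cased tag name."""
    root = ET.fromstring(xml_text)
    return {child.tag.lower(): (child.text or "").strip() for child in root}


record = parse_record(RECORD)
```

The ID is kept as a string so that leading zeros such as `00001` survive.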
At the same time, the Web server uses the following algorithm to serve users:
The Web server obtains a user query Query_A from the distribution server and forwards it to the corresponding retrieval servers. After the retrieval servers finish, the Web server collects the retrieval results from the retrieval servers in the corresponding group. For example, suppose results are obtained from 3 retrieval servers in the group: retrieval server A returns R_a, retrieval server B returns R_b, and retrieval server C returns R_c. The retrieval controller then merge-sorts the retrieval results R_a, R_b, and R_c to obtain the merged result R.
Finally, the retrieval controller fetches the retrieval results according to the merged result R, generates summaries, and returns the generated summaries to the user.
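The merge step above can be sketched as follows. This is a minimal illustration under stated assumptions: each retrieval server returns (document ID, weight) pairs already sorted by descending weight, no document appears on more than one server, and the cut-off of 500 mirrors the number of results returned in the test described later. The function name `merge_results` is hypothetical.

```python
import heapq
from itertools import islice


def merge_results(result_lists, top_k=500):
    """Merge per-server result lists that are each sorted by descending weight."""
    # heapq.merge streams the k-way merge lazily; reverse=True expects inputs
    # ordered from largest to smallest key.
    merged = heapq.merge(*result_lists, key=lambda hit: hit[1], reverse=True)
    return list(islice(merged, top_k))


# Illustrative results R_a, R_b, R_c from three retrieval servers.
r_a = [("doc7", 0.91), ("doc2", 0.40)]
r_b = [("doc5", 0.88), ("doc9", 0.15)]
r_c = [("doc1", 0.63)]
merged = merge_results([r_a, r_b, r_c], top_k=3)
```

Because the merge is lazy, only as many hits as are returned to the user need to be pulled from the per-server lists.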
3) Retrieval server layer 3
Next, it is described how the retrieval servers in the retrieval server layer 3 retrieve according to their local core index, that is, how the retrieval function of the retrieval server layer 3 is implemented.
Because the data is massive, it is impractical to place the index of all data on a single retrieval server. The present invention therefore introduces the concept of a "retrieval server group": the index is distributed across different machines, and all of these machines together form a retrieval server group. The retrieval system according to the invention serves users with per-database retrieval, and the data is likewise organized per database. Local distribution can therefore be used: a separate index is built for each book database, and each is served to users by a different retrieval server group.
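The per-database grouping can be pictured with a small routing sketch. The database keys are loosely based on the data sets named in the test section, while the server names, the `SERVER_GROUPS` table, and the rule that a query is sent to every server in its group are assumptions made for illustration; the patent does not spell out how the index is split inside a group.

```python
# Hypothetical mapping from book database to its retrieval server group.
SERVER_GROUPS = {
    "chinese_contents": ["rs-zh-1", "rs-zh-2", "rs-zh-3"],
    "usmarc": ["rs-us-1", "rs-us-2"],
}


def servers_for(database):
    """Look up the retrieval server group that holds a database's index."""
    try:
        return SERVER_GROUPS[database]
    except KeyError:
        raise ValueError(f"no retrieval server group for database {database!r}") from None


def fan_out(database, query):
    """Build one request per server in the group responsible for the database."""
    return [(server, query) for server in servers_for(database)]


requests = fan_out("usmarc", "title:retrieval")
```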
However, if the index of all data were placed on every retrieval server group, the number of machines required would become very large as the number of concurrent user queries grows.
To solve this problem, the present invention introduces the concept of a "core index". Two thresholds, LOW_INDEX_LIST and HIGH_INDEX_LIST, are set for the index of all data. The index made up of all inverted lists whose length INVERTED_LIST_LENGTH satisfies
LOW_INDEX_LIST <= INVERTED_LIST_LENGTH <= HIGH_INDEX_LIST
is called the "core index". The two thresholds LOW_INDEX_LIST and HIGH_INDEX_LIST can be adjusted according to practice. Typically, the core index is about 30% of the size of the index of all data.
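The threshold rule can be sketched in a few lines. The sketch assumes the full index is an in-memory mapping from term to posting list (the inverted list); the function name `build_core_index`, the sample terms, and the threshold values are hypothetical.

```python
def build_core_index(inverted_index, low, high):
    """Keep only the terms whose inverted list length lies in [low, high]."""
    return {
        term: postings
        for term, postings in inverted_index.items()
        if low <= len(postings) <= high
    }


# Toy full index: term -> list of document IDs.
full_index = {
    "the": list(range(1000)),          # too long: exceeds the upper threshold
    "retrieval": [3, 8, 21, 40],
    "library": [1, 3, 5, 8, 13, 21],
    "zymurgy": [77],                   # too short: below the lower threshold
}
core = build_core_index(full_index, low=2, high=10)
```

Both bounds are inclusive, matching the `<=` comparisons in the condition above.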
The core index can be established on several bases. For example, it can be established according to the historical frequency of occurrence of index entries, or according to a core vocabulary corresponding to the characteristics of user access behavior.
On receiving a user query from the retrieval controller layer 2, the retrieval server layer 3 first performs retrieval on the core index data to obtain retrieval results. If no retrieval results can be obtained from the core index data, it accesses the index server layer 4 to obtain them. It then provides the obtained retrieval results to the retrieval controller layer 2.
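The two-step lookup described above can be sketched as follows, assuming single-term queries and treating the index server layer as a plain callable. The class name `RetrievalServer`, the `fallbacks` counter, and the sample data are illustrative assumptions.

```python
class RetrievalServer:
    """Answers from the local core index first, then falls back to the index servers."""

    def __init__(self, core_index, index_server_lookup):
        self.core_index = core_index
        # Callable standing in for a request to the index server layer.
        self.index_server_lookup = index_server_lookup
        self.fallbacks = 0

    def search(self, term):
        postings = self.core_index.get(term)
        if postings is not None:
            return postings            # served from the core index
        self.fallbacks += 1            # core index had no result for this term
        return self.index_server_lookup(term)


# Toy data: the core index holds only part of the full index.
full_index = {"library": [1, 3, 5], "zymurgy": [77]}
server = RetrievalServer({"library": [1, 3, 5]}, lambda term: full_index.get(term, []))
hits_core = server.search("library")
hits_full = server.search("zymurgy")
```

Counting fallbacks gives a rough signal of how well the chosen thresholds fit the query load: the fewer fallbacks, the more traffic the core index absorbs.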
4) Index server layer 4
Finally, the configuration of the index servers in the index server layer 4 is described, that is, how the massive index data storage function of the index server layer 4 is implemented.
The index data of a digital library can be massive. Because the total volume of index data is very large, placing all of it on a single machine is likewise impractical. As mentioned above, a digital library usually divides its book data into several databases by book category when processing the data and serving users. The present invention therefore adopts local distribution here as well: an index is built for each book database and loaded onto a separate index server group, and all the index server groups together form the index server layer.
On this basis, the index server layer 4 responds to access by the retrieval server layer 3 by retrieving the corresponding index from all the index data to obtain retrieval results, and returns the retrieval results to the retrieval server layer 3.
In addition, it should be noted that the retrieval process of the present invention uses Unicode encoding in order to handle multiple languages.
The test platform architecture for the retrieval system shown in Fig. 1 is described below with reference to Fig. 2. It is then explained how, based on the architecture described with reference to Fig. 1, a configuration for small data and few concurrent user queries can be extended to a configuration for large data and many concurrent user queries, that is, how the retrieval system shown in Fig. 1 scales.
As shown in Fig. 2, the test was run on a cluster platform consisting of 4 node machines (2 GHz CPU, 8 GB memory) and 16 blade servers (2 GHz CPU, 4 GB memory). The operating system is Redhat Linux 9.0 (kernel version 2.4.26, SMP). The configuration shown in Fig. 2 consists of several simulated clients, Gigabit Ethernet, a Web server access device, and a retrieval server cluster.
The data used in the test comprises: (1) Chinese contents data; (2) Western-language contents data; (3) EBSCO data; (4) CNMARC data; (5) USMARC data. The metadata involved totals more than 23 million records, with a size of 13.6 GB.
The query load is distributed as follows. Queries are spread evenly over the databases and fields (title, abstract, publisher, author). Single-term queries, 2-term AND queries, and 3-term AND queries each account for about 23%; 2-term OR queries account for 11%; 4-term AND queries for 10%; 5-term AND queries for 5%; and 2-term NOT queries for 5%. Each retrieval returns the first 500 results (document ID and weight).
The test shows the following. When a core index of 2 GB (about 23 million metadata records) is loaded on a single node, the retrieval speed is 283.3 replies/second and the average CPU utilization of the retrieval controller is about 27%. When the retrieval controller serves 2 retrieval servers, the average CPU utilization is about 51% and the retrieval speed is 555.5 replies/second. At an average CPU utilization of about 90%, the retrieval speed is 740.8 replies/second. Note in particular that the distribution servers are not a performance bottleneck: 1 to 2 of them are usually enough to handle a large number of concurrent requests, and 2 are assumed in the deduction below.
When the data scale increases, an index is built for the newly added data and loaded onto new retrieval servers, and the new retrieval servers are added to each group. Specifically, the number of retrieval server nodes in each group is determined by the total index size p and the free memory q of each node: required number of retrieval server nodes = p/q.
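A short sketch of the p/q rule follows. Rounding up to a whole number of nodes is an assumption added here, since a fraction of a node cannot be deployed; the function name `nodes_per_group` and the sample sizes are hypothetical.

```python
import math


def nodes_per_group(index_size_gb, free_memory_gb):
    """Number of retrieval server nodes needed to hold an index of size p with q free per node."""
    if free_memory_gb <= 0:
        raise ValueError("free memory per node must be positive")
    # Assumption: round p/q up, because nodes come in whole units.
    return math.ceil(index_size_gb / free_memory_gb)


# Node count per group as the index grows, with 2 GB free on each node.
growth = [nodes_per_group(p, 2) for p in (2, 3, 6, 7)]
```

With a fixed q, the node count grows in steps that track the index size, which matches the approximately linear scaling the description claims.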
When the number of users increases, more groups of identically configured retrieval servers are deployed, each group loading the same data index, and the load balancer forwards user requests to the corresponding retrieval server group.
Under these data conditions, the hardware configuration needed for 200 million records and 10,000 concurrent requests is deduced from the current test system as follows:
The index size for 200 million records is about 20 GB, so the index server layer needs 2 nodes with 16 GB of memory each;
The size of the core index is about 20 GB × 30% = 6 GB;
Number of retrieval controllers: 10000/740.8 = 13.49, rounded up to 14;
Number of retrieval server groups: 10000/283.3 = 35.30, rounded up to 36;
Number of nodes in each retrieval server group: 6 GB / 2 GB = 3;
Number of load distribution servers: 2;
Therefore, the total number of nodes is 14 + 38 × 3 + 2 + 2 = 132.
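The deduction can be reproduced with a small sizing sketch. The per-controller and per-group throughput figures, the 30% core index share, and the 2 GB and 16 GB node sizes are taken from the text; the function and class names, the use of rounding up throughout, and the formula for the index server count are assumptions. Note that the rounded-up group count from the formula is 36, whereas the total in the text is computed with 38 groups, so the sketch yields 126 where the text states 132.

```python
import math
from dataclasses import dataclass


@dataclass
class Plan:
    controllers: int
    groups: int
    nodes_per_group: int
    index_servers: int
    distributors: int

    @property
    def total(self):
        # Same structure as the sum in the text: controllers + groups x nodes + index + distributors.
        return (self.controllers + self.groups * self.nodes_per_group
                + self.index_servers + self.distributors)


def size_cluster(concurrent_requests, index_gb, core_percent=30,
                 controller_rate=740.8, group_rate=283.3,
                 node_core_gb=2, index_node_gb=16, distributors=2):
    """Sizing under the measured rates; every count is rounded up (assumption)."""
    core_gb = index_gb * core_percent / 100
    return Plan(
        controllers=math.ceil(concurrent_requests / controller_rate),
        groups=math.ceil(concurrent_requests / group_rate),
        nodes_per_group=math.ceil(core_gb / node_core_gb),
        index_servers=math.ceil(index_gb / index_node_gb),
        distributors=distributors,
    )


plan = size_cluster(concurrent_requests=10000, index_gb=20)
```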
As described above, the present invention builds a retrieval system for digital library systems with massive data and large-scale concurrent user queries. The invention adopts a four-layer system architecture. Its implementation takes full account of the characteristics of library applications and builds indexes per database. When the number of concurrent user queries increases, the system scales approximately linearly by adding retrieval controllers and retrieval server groups; when the data scale of the digital library increases, it likewise scales approximately linearly by adding retrieval servers and index servers to each group. In addition, because Unicode encoding is used, the retrieval system and retrieval method are independent of language.
Although the present invention has been shown above in connection with its preferred embodiments, those skilled in the art will appreciate that various modifications, substitutions, and changes can be made without departing from the spirit and scope of the invention. The invention should therefore not be limited by the above embodiments, but defined by the appended claims and their equivalents.

Claims (18)

1. A retrieval system for use in a digital library, comprising:
a retrieval controller layer comprising one or more retrieval controllers, for forwarding a user query from a user to a retrieval server layer, and for processing the corresponding retrieval results from the retrieval server layer so as to return them to the user;
the retrieval server layer, comprising one or more retrieval servers, for storing core index data of the digital library, first performing retrieval on the core index data for the user query to obtain retrieval results, accessing an index server layer to obtain retrieval results when none can be obtained from the core index data, and providing the obtained retrieval results to the retrieval controller layer; and
the index server layer, comprising one or more index servers, for storing all index data of the digital library so that, in response to access by the retrieval server layer, the corresponding index is retrieved from all the index data to obtain retrieval results,
wherein the core index data is established in one of the following ways: from the index entries whose inverted list length lies between two thresholds; according to the historical frequency of occurrence of index entries; or according to a core vocabulary corresponding to the characteristics of user access behavior.
2. The system according to claim 1, characterized by further comprising:
a distribution server layer for load-balancing the user queries from users.
3. The system according to claim 2, characterized in that the load balancing is implemented using IP-layer-based load distribution, transport-layer-based load distribution, or application-layer-based load distribution.
4. The system according to claim 3, characterized in that the IP-layer-based load distribution includes an IP-layer-based Round-Robin scheme.
5. The system according to claim 1, characterized in that the retrieval controller layer processes the corresponding retrieval results from the retrieval server layer by merging them and generating summaries.
6. The system according to claim 5, characterized in that the retrieval controller layer organizes the retrieval results in XML format.
7. The system according to claim 1, characterized in that the retrieval servers in the retrieval server layer form retrieval server groups, with a core index established for each book database, and the index servers in the index server layer form index server groups, with an index stored for each book database.
8. The system according to claim 7, characterized in that, when the number of concurrent user queries increases, the number of retrieval controllers and retrieval server groups increases linearly.
9. The system according to claim 7, characterized in that, when the data scale of the digital library increases, the number of retrieval servers and index servers increases linearly.
10. A retrieval method for use in a digital library, comprising:
forwarding, by a retrieval controller layer, a user query from a user to a retrieval server layer, and processing the corresponding retrieval results from the retrieval server layer so as to return them to the user;
for the user query, first performing retrieval, by the retrieval server layer, on the stored core index data of the digital library to obtain retrieval results, accessing an index server layer to obtain retrieval results when none can be obtained from the core index data, and providing the obtained retrieval results to the retrieval controller layer; and
in response to access from the retrieval server layer, retrieving, by the index server layer, the corresponding index from all the stored index data of the digital library to obtain retrieval results,
wherein the core index data is established in one of the following ways: from the index entries whose inverted list length lies between two thresholds; according to the historical frequency of occurrence of index entries; or according to a core vocabulary corresponding to the characteristics of user access behavior.
11. The method according to claim 10, characterized by further comprising:
load-balancing, by a distribution server layer, the user queries from users.
12. The method according to claim 11, characterized in that the load balancing is implemented using IP-layer-based load distribution, transport-layer-based load distribution, or application-layer-based load distribution.
13. The method according to claim 12, characterized in that the IP-layer-based load distribution includes an IP-layer-based Round-Robin scheme.
14. The method according to claim 10, characterized in that the step of processing, by the retrieval controller layer, the corresponding retrieval results from the retrieval server layer comprises: merging the corresponding retrieval results from the retrieval server layer and generating summaries.
15. The method according to claim 14, characterized in that the retrieval controller layer organizes the retrieval results in XML format.
16. The method according to claim 10, characterized in that the retrieval server layer comprises one or more retrieval servers, which form retrieval server groups, with a core index established for each book database; and the index server layer comprises one or more index servers, which form index server groups, with an index stored for each book database.
17. The method according to claim 16, characterized in that, when the number of concurrent user queries increases, the number of retrieval controllers and retrieval server groups increases linearly.
18. The method according to claim 16, characterized in that, when the data scale of the digital library increases, the number of retrieval servers and index servers increases linearly.
CNB2006100720756A 2006-04-06 2006-04-06 Researching system and method used in digital library Active CN100458784C (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CNB2006100720756A CN100458784C (en) 2006-04-06 2006-04-06 Researching system and method used in digital library

Publications (2)

Publication Number Publication Date
CN101051309A CN101051309A (en) 2007-10-10
CN100458784C true CN100458784C (en) 2009-02-04

Family

ID=38782725

Country Status (1)

Country Link
CN (1) CN100458784C (en)

Families Citing this family (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102200979A (en) * 2010-03-26 2011-09-28 上海市浦东科技信息中心 Distributed parallel information retrieval system and distributed parallel information retrieval method
CN102760137A (en) * 2011-04-27 2012-10-31 上海特易信息科技有限公司 Distributed full-text search method and distributed full-text search system
CN102779134B (en) * 2011-05-12 2015-05-13 同程网络科技股份有限公司 Lucene-based distributed search method
CN102779160B (en) * 2012-06-14 2016-02-03 中金数据***有限公司 Mass data information index system and index structuring method
CN103020300B (en) * 2012-12-28 2017-04-12 杭州华三通信技术有限公司 Method and device for information retrieval
CN103714144B (en) * 2013-12-25 2017-05-10 新华三技术有限公司 Device and method for information retrieval
CN103678697A (en) * 2013-12-26 2014-03-26 乐视网信息技术(北京)股份有限公司 Reverse index storage method and system thereof
CN104917847A (en) * 2015-07-03 2015-09-16 成都怡云科技有限公司 Mobile library based on cloud desktop
CN105677737B (en) * 2015-12-29 2018-08-21 河海大学 Management of Periodical Information system
CN105808656B (en) * 2016-02-26 2019-09-06 广州品唯软件有限公司 A kind of processing framework and its access method for self-service access
CN106776910A (en) * 2016-11-30 2017-05-31 咪咕数字传媒有限公司 The display methods and device of a kind of Search Results

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH0934897A (en) * 1995-07-18 1997-02-07 Toshiba Corp Book management system
CN1374603A (en) * 2001-03-09 2002-10-16 刘莎 Internet information sharing system and method
US20020152190A1 (en) * 2001-02-07 2002-10-17 International Business Machines Corporation Customer self service subsystem for adaptive indexing of resource solutions and resource lookup
JP2003242180A (en) * 2002-02-13 2003-08-29 Ricoh Co Ltd Whole sentence retrieval device
CN1529460A (en) * 2003-10-14 2004-09-15 北京邮电大学 Whole load equalizing method based on global network positioning
CN1707476A (en) * 2005-05-06 2005-12-14 贺方升 Auxiliary translation searching engine system and method thereof

Non-Patent Citations (1)

Title
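海量数据的索引与检索*** [Indexing and retrieval *** for massive data]. Zhang Gang, Sun Jian, Ding Guodong, Mi Jia, Wang Bin. National Symposium on Network and Information Security Technology. 2004 *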

Also Published As

Publication number Publication date
CN101051309A (en) 2007-10-10

Similar Documents

Publication Publication Date Title
CN100458784C (en) Researching system and method used in digital library
CN100462979C (en) Distributed index file searching method, searching system and searching server
CN104252536B (en) Internet log data query method and device based on HBase
CN103020204B (en) Method and system for multi-dimensional interval queries on a distributed sequence list
CN101604324B (en) Method and system for searching video service websites based on meta search
Cambazoglu et al. Scalability challenges in web search engines
CN103020281B (en) Data storage and retrieval method based on spatial data numerical index
CN107038207A (en) Data query method, data processing method and device
US8359318B2 (en) System and method for distributed index searching of electronic content
CN110162528A (en) Massive big data search method and system
CN102332030A (en) Data storing, managing and inquiring method and system for distributed key-value storage system
CN102375853A (en) Distributed database system, method for building index therein and query method
CN102521405A (en) Massive structured data storage and query methods and systems supporting high-speed loading
CN103353901B (en) Method and system for ordered management of table data based on the Hadoop Distributed File System
CN101354726A (en) Method for managing memory metadata of cluster file system
US9262511B2 (en) System and method for indexing streams containing unstructured text data
CN105117502A (en) Search method based on big data
US9195745B2 (en) Dynamic query master agent for query execution
CN104239377A (en) Platform-crossing data retrieval method and device
CN104391908B (en) Multi-keyword indexing method on graphs based on locality-sensitive hashing
CN106095951B (en) Data space multi-dimensional indexing method based on load balancing and inquiry log
CN101963993B (en) Method for fast searching database sheet table record
CN102521383A (en) Method for storing and accessing mass files in distributed system
CN103823805B (en) Community-based correlation note commending system and recommendation method
CN102597969A (en) Database management device using key-value store with attributes, and key-value-store structure caching-device therefor

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
EE01 Entry into force of recordation of patent licensing contract

Application publication date: 20071010

Assignee: Branch DNT data Polytron Technologies Inc

Assignor: Institute of Computing Technology, Chinese Academy of Sciences

Contract record no.: 2018110000033

Denomination of invention: Researching system and method used in digital library

Granted publication date: 20090204

License type: Common License

Record date: 20180807