CN109710612A - Vector index recall method, apparatus, electronic device and storage medium


Publication number
CN109710612A
CN109710612A CN201811595045.2A CN201811595045A CN109710612A CN 109710612 A CN109710612 A CN 109710612A CN 201811595045 A CN201811595045 A CN 201811595045A CN 109710612 A CN109710612 A CN 109710612A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201811595045.2A
Other languages
Chinese (zh)
Other versions
CN109710612B (en)
Inventor
段雪涛
吴永巍
杨冰霜
陈再萍
侍路登
闵保红
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Baidu Online Network Technology Beijing Co Ltd
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN201811595045.2A
Publication of CN109710612A
Application granted
Publication of CN109710612B
Legal status: Active

Landscapes

  • Data Exchanges In Wide-Area Networks (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a vector index recall method, apparatus, electronic device and storage medium. The method includes: upon receiving a request network packet sent by an upstream business application, parsing the request network packet to obtain a corresponding vector representation; determining scene demand parameters of the upstream business application, and determining corresponding target configuration parameters from a pre-established first configuration file according to the scene demand parameters; selecting a corresponding index retrieval algorithm and index library according to the target configuration parameters; and, using the corresponding index retrieval algorithm, matching the vector representation against the vector indexes in the corresponding index library and recalling the matched vector indexes. The method balances accuracy, efficiency and cost while performing vector index recall.

Description

Vector index recall method, apparatus, electronic device and storage medium
Technical field
The present invention relates to the field of data processing, and in particular to a vector index recall method, apparatus, electronic device and computer-readable storage medium.
Background
Vector indexing is a semantics-based indexing technique. Unlike a traditional inverted index, it does not build the index from the direct features of an item. Instead, the item is described by multidimensional features, converted into a vector representation, and recalled from the index by vector similarity.
In the related art, vector index systems generally use one of the following three schemes to implement index recall. Scheme 1) performs vector index recall by brute-force computation: the query is compared with every vector using a distance measure (including but not limited to Euclidean distance, cosine distance and Manhattan distance). Because the computation is exact over the entire library, recall accuracy is the highest and the results are relatively the best. Scheme 2) performs vector index recall with a graph index: a hierarchical structure is used in which edges are layered by characteristic radius, so that the average degree of each vertex across all layers becomes constant, which reduces the computational complexity from polylogarithmic to logarithmic. Scheme 3) performs vector index recall with a binary-tree index: a global binary tree topology is built by recursively and iteratively constructing cluster center points, so that the lookup time complexity for each point is logarithmic.
However, the three schemes have the following problems. Scheme 1), brute-force computation, has the advantage of high accuracy, but because it must compute exactly over the full library, its computational cost is high and its production cost is very large. Scheme 2), the graph-index approach, has low computation time, which benefits online access latency, but it consumes a large amount of memory, so the memory storage cost is very high. Scheme 3), the binary-tree index construction, can run into two problems during a query: the number of data nodes falling on a leaf node may be smaller than the required TopN similar items, and two similar data nodes may be split into different branches of the binary tree. Solving these problems requires building more binary trees, which tends to raise indexing and retrieval costs, and accuracy is also affected to some degree.
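To make the cost of scheme 1) concrete, the following is a minimal sketch of brute-force recall in plain Python. The function names, the `library` dictionary and its three vectors are hypothetical and chosen only for illustration; the source does not prescribe an implementation. Every query touches every stored vector, which is why accuracy is exact and cost grows with the size of the library.

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def cosine_distance(a, b):
    # Assumes neither vector is all zeros; otherwise the norm product is 0.
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norms

def brute_force_top_n(query, library, n, distance=euclidean):
    """Compare the query with every stored vector and return the n nearest ids."""
    ranked = sorted(library, key=lambda vec_id: distance(query, library[vec_id]))
    return ranked[:n]

# Hypothetical three-item library, for illustration only.
library = {"a": [0.0, 0.0], "b": [1.0, 1.0], "c": [5.0, 5.0]}
nearest = brute_force_top_n([0.9, 1.2], library, 2)
```

The distance function is passed as a parameter, which mirrors the statement that the distance measure is not limited to one kind.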
Therefore, how to balance the cost and effectiveness of a vector index system in a large-scale data environment, so that accuracy, efficiency and cost can all be taken into account while performing vector index recall, has become an urgent problem to be solved.
Summary of the invention
The present invention aims to solve at least one of the above technical problems, at least to some extent.
To this end, a first object of the present invention is to propose a vector index recall method. The method can address the balance between cost and effectiveness of a vector index system in a large-scale data environment, so that accuracy, efficiency and cost are all taken into account while vector index recall is performed.
A second object of the present invention is to propose a vector index recall apparatus.
A third object of the present invention is to propose an electronic device.
A fourth object of the present invention is to propose a computer-readable storage medium.
To achieve the above objects, an embodiment of the first aspect of the present invention proposes a vector index recall method, including: upon receiving a request network packet sent by an upstream business application, parsing the request network packet to obtain a corresponding vector representation; determining scene demand parameters of the upstream business application, and determining corresponding target configuration parameters from a pre-established first configuration file according to the scene demand parameters; selecting a corresponding index retrieval algorithm and index library according to the target configuration parameters; and, using the corresponding index retrieval algorithm, matching the vector representation against the vector indexes in the corresponding index library and recalling the matched vector indexes.
According to the vector index recall method of the embodiment of the present invention, when a request network packet sent by an upstream business application is received, the request network packet is parsed to obtain a corresponding vector representation; the scene demand parameters of the upstream business application are determined, and corresponding target configuration parameters are determined from the pre-established first configuration file according to the scene demand parameters; a corresponding index retrieval algorithm and index library are then selected according to the target configuration parameters; and the corresponding index retrieval algorithm is used to match the vector representation against the vector indexes in the corresponding index library, and the matched vector indexes are recalled. That is, a suitable index retrieval algorithm and corresponding index library are selected according to the scene demand parameters of the upstream business application, and vector index recall is performed on that basis. In a large-scale vector index scenario this effectively balances cost, computational effect and response latency, so that an application using the present invention can obtain the best effect with limited resources under conditions such as different application scenarios, access pressure and online access latency, thereby taking accuracy, efficiency and cost into account while performing vector index recall.
To achieve the above objects, an embodiment of the second aspect of the present invention proposes a vector index recall apparatus, including: a data parsing module, configured to parse a request network packet to obtain a corresponding vector representation when the request network packet sent by an upstream business application is received; a scene demand determining module, configured to determine scene demand parameters of the upstream business application; an optimal parameter determining module, configured to determine corresponding target configuration parameters from a pre-established first configuration file according to the scene demand parameters; a selecting module, configured to select a corresponding index retrieval algorithm and index library according to the target configuration parameters; and an index recall module, configured to use the corresponding index retrieval algorithm to match the vector representation against the vector indexes in the corresponding index library and to recall the matched vector indexes.
According to the vector index recall apparatus of the embodiment of the present invention, the data parsing module parses the request network packet to obtain a corresponding vector representation when the request network packet sent by the upstream business application is received; the scene demand determining module determines the scene demand parameters of the upstream business application; the optimal parameter determining module determines corresponding target configuration parameters from the pre-established first configuration file according to the scene demand parameters; the selecting module selects a corresponding index retrieval algorithm and index library according to the target configuration parameters; and the index recall module uses the corresponding index retrieval algorithm to match the vector representation against the vector indexes in the corresponding index library and recalls the matched vector indexes. That is, a suitable index retrieval algorithm and corresponding index library are selected according to the scene demand parameters of the upstream business application, and vector index recall is performed on that basis. In a large-scale vector index scenario this effectively balances cost, computational effect and response latency, so that an application using the present invention can obtain the best effect with limited resources under conditions such as different application scenarios, access pressure and online access latency, thereby taking accuracy, efficiency and cost into account while performing vector index recall.
To achieve the above objects, an embodiment of the third aspect of the present invention proposes an electronic device, including: a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the vector index recall method described in the embodiment of the first aspect of the present invention.
To achieve the above objects, an embodiment of the fourth aspect of the present invention proposes a computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the vector index recall method described in the embodiment of the first aspect of the present invention.
Additional aspects and advantages of the present invention will be set forth in part in the following description, and in part will become apparent from the following description or be learned through practice of the invention.
Brief description of the drawings
The above and/or additional aspects and advantages of the present invention will become apparent and readily understood from the following description of the embodiments taken in conjunction with the accompanying drawings, in which:
Fig. 1 is a flowchart of a vector index recall method according to an embodiment of the present invention;
Fig. 2 is a flowchart of establishing the first configuration file according to an embodiment of the present invention;
Fig. 3 is an architecture diagram of a vector index system according to an embodiment of the present invention;
Fig. 4 is a structural diagram of a vector index recall apparatus according to an embodiment of the present invention;
Fig. 5 is a structural diagram of a vector index recall apparatus according to another embodiment of the present invention;
Fig. 6 is a structural diagram of a vector index recall apparatus according to yet another embodiment of the present invention;
Fig. 7 is a structural diagram of an electronic device according to an embodiment of the present invention.
Detailed description
Embodiments of the present invention are described in detail below, and examples of the embodiments are shown in the accompanying drawings, in which the same or similar reference numerals throughout denote the same or similar elements or elements having the same or similar functions. The embodiments described below with reference to the drawings are exemplary; they are intended to explain the present invention and should not be construed as limiting it.
The vector index recall method, apparatus, electronic device and computer-readable storage medium of the embodiments of the present invention are described below with reference to the accompanying drawings.
Fig. 1 is a flowchart of a vector index recall method according to an embodiment of the present invention. It should be noted that the vector index recall method of the embodiment of the present invention can be applied to a vector index system, and the vector index system can be deployed in an electronic device; for example, the electronic device may be a device with a vector index recall function.
As shown in Fig. 1, the vector index recall method may include:
S110: upon receiving a request network packet sent by an upstream business application, parse the request network packet to obtain a corresponding vector representation.
Specifically, the request network packet sent by the upstream business application is received and then parsed to obtain the corresponding vector representation. Taking a search business as an example, when a search request sent by a client is received, the request can be parsed to obtain the search term it contains, and the search term is converted into a vector to obtain the corresponding vector representation.
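As a rough illustration of step S110, the sketch below parses a request packet and turns the search term into a fixed-length vector. Everything here is assumed for illustration: the JSON packet layout, the field names `app_id` and `query`, and the `toy_embed` function, which is only a hash-based stand-in for whatever vectorization model a real system would use.

```python
import hashlib
import json

def toy_embed(term, dim=8):
    """Hypothetical stand-in for a real embedding model:
    counts character trigrams of the term in `dim` hash buckets."""
    vec = [0.0] * dim
    padded = "  " + term.lower() + " "
    for i in range(len(padded) - 2):
        digest = hashlib.md5(padded[i:i + 3].encode("utf-8")).digest()
        vec[digest[0] % dim] += 1.0
    return vec

def parse_request(packet, dim=8):
    """Decode an assumed JSON packet and return (application id, query vector)."""
    body = json.loads(packet.decode("utf-8"))
    return body["app_id"], toy_embed(body["query"], dim)

# Hypothetical packet from an upstream search application.
packet = json.dumps({"app_id": "search", "query": "vector index"}).encode("utf-8")
app_id, query_vector = parse_request(packet)
```

Returning the application identifier alongside the vector is a convenience for the later lookup of scene demand parameters, not something the source requires.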
S120: determine the scene demand parameters of the upstream business application, and determine corresponding target configuration parameters from the pre-established first configuration file according to the scene demand parameters. The first configuration file contains the configuration parameters for various scene demands; for example, these may be optimal configuration parameters, that is, the configuration parameters best suited to the current scene demand.
In the embodiments of the present invention, the scene demand parameters refer to the access pressure of the upstream business application (for example, peak pressure and mean pressure), its requirement for computational accuracy, its requirement for access latency, and the cost it intends to invest (for example, an upper limit on cost).
Specifically, the upstream business application may send its own scene demand parameters together with the request network packet, so that the vector index system obtains the scene demand parameters of that application. Alternatively, a configuration file may be established in advance that contains the correspondence between each upstream business application and its scene demand parameters. For example, the corresponding scene demand parameters can be looked up in this correspondence by the identification information of the upstream business application, yielding the scene demand parameters of the business application that sent the current request.
Optionally, in an embodiment of the present invention, the configuration file used to store the scene demand parameters of each business application can be established in advance as follows. Before the request network packet sent by the upstream business application is received, a scene demand parameter configuration interface is provided to the upstream business application, where the configuration interface includes default configuration parameters and configuration options for those default configuration parameters. The configuration changes that the upstream business application makes on the configuration interface, starting from the default configuration parameters, are received to obtain the scene demand parameter information configured by the upstream business application, and this information is stored in a second configuration file.
That is, a scene demand parameter configuration interface can be provided to the upstream business application, and the interface may include, but is not limited to, default configuration parameters and their configuration options. The upstream business application can configure its scene demand parameters in the interface according to its own scene demands. If the upstream business application is unclear about its own scene demands, it may use the default configuration parameters provided on the interface as its scene demand parameters. The default configuration parameters usually favor better results (for example, accuracy and online retrieval latency), so the cost tends to be relatively high. The upstream business application can modify the configuration starting from the default configuration parameters according to its own needs, and thereby obtain scene demand parameters that meet its requirements. The scene demand parameters configured by all upstream business applications on the interface are then collected and stored, yielding the second configuration file.
In the embodiments of the present invention, when the request network packet includes the identification information of the upstream business application, the identification information can be obtained from the request network packet, and the scene demand parameters of the upstream business application are determined from the second configuration file according to the identification information.
That is, when the request network packet sent by the upstream business application is received, the identification information of the upstream business application in the request network packet can be obtained, and the scene demand parameters corresponding to the identification information are looked up in the pre-established second configuration file, yielding the scene demand parameters of the upstream business application.
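The default-plus-overrides configuration and the lookup by identifier can be sketched as follows. The field names, the default values and the in-memory `second_config` dictionary are all assumptions made for illustration; the source only names the kinds of demand (access pressure, accuracy, latency, cost). Falling back to the defaults for an unknown identifier is likewise an assumed policy.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class SceneDemand:
    peak_qps: int          # access pressure: peak
    mean_qps: int          # access pressure: mean
    min_accuracy: float    # required recall accuracy
    max_latency_ms: float  # allowed access latency
    max_cost: float        # upper limit on cost (arbitrary units)

# Hypothetical defaults that favor quality over cost.
DEFAULT_DEMAND = SceneDemand(1000, 300, 0.99, 20.0, 100.0)

# Stand-in for the second configuration file: application id -> demand.
second_config = {}

def configure(app_id, **overrides):
    """Store the defaults, modified by whatever the application changed."""
    second_config[app_id] = replace(DEFAULT_DEMAND, **overrides)

def scene_demand_for(app_id):
    """Look up demand by application id; assumed fallback is the defaults."""
    return second_config.get(app_id, DEFAULT_DEMAND)

configure("search", min_accuracy=0.90, max_cost=20.0)
demand = scene_demand_for("search")
```

A frozen dataclass keeps each stored demand immutable, so a later override always produces a new record instead of silently editing the defaults.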
After the scene demand parameters of the upstream business application are determined, the corresponding target configuration parameters can be determined from the pre-established first configuration file according to the scene demand parameters. In the embodiments of the present invention, the target configuration parameters may be optimal configuration parameters, which can be understood as the index algorithm type and index library type that, under the current scene demand parameters, optimize over multiple factors including operating cost and index recall effect. As an example, the configuration parameters may include, but are not limited to, the index build type used by the vector index system, the vector index dimension, the number of index trees built, the online index retrieval algorithm and the index compression ratio.
In the embodiments of the present invention, as shown in Fig. 2, the first configuration file can be established in advance by the following steps:
S210: construct a configuration set for each parameter in the configuration parameters;
For example, a configuration set may be constructed for each parameter in the configuration parameters, such as the index build type used by the vector index system, the vector index dimension, the number of index trees built, the online index retrieval algorithm and the index compression ratio. For instance, the configuration set for the vector index dimension may be 16, 32, 64, 128 dimensions and so on, from which a configuration-item array for the vector index dimension can be built; likewise, the configuration set for the index library type may include graph-index build, binary-tree-index build and so on. The configuration sets for the other configuration items are constructed in a similar way, so that each parameter in the configuration parameters has its own configuration set.
S220: combine the configuration sets of all parameters exhaustively to obtain multiple adaptation combinations;
Optionally, each element in the configuration set of each parameter is combined with the elements of the other sets, yielding multiple adaptation combinations. To illustrate, suppose there are three parameters A, B and C, where the configuration set of A is {a1, a2}, that of B is {b1, b2, b3}, and that of C is {c1, c2}. Exhaustive combination then yields the following adaptation combinations: {a1,b1,c1}, {a1,b1,c2}, {a1,b2,c1}, {a1,b2,c2}, {a1,b3,c1}, {a1,b3,c2}, {a2,b1,c1}, {a2,b1,c2}, {a2,b2,c1}, {a2,b2,c2}, {a2,b3,c1}, {a2,b3,c2}.
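The pairing in step S220 is a Cartesian product of the configuration sets, which can be written directly with `itertools.product`. The sketch below reuses the A, B, C example from the text; the variable names are illustrative.

```python
from itertools import product

# Configuration sets from the A/B/C example above.
config_sets = {
    "A": ["a1", "a2"],
    "B": ["b1", "b2", "b3"],
    "C": ["c1", "c2"],
}

# One dictionary per adaptation combination, in the same order as listed in the text.
combinations = [
    dict(zip(config_sets, values))
    for values in product(*config_sets.values())
]
```

The number of combinations is the product of the set sizes (2 x 3 x 2 = 12 here), so in practice the size of each configuration set bounds how much training work the later steps must do.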
S230: obtain sample scene demand parameters;
For example, a wide variety of scene demand parameters can be obtained as samples, so that the various adaptation combinations can later be trained against these samples and the optimal adaptation combination for each sample can be selected from them.
S240: train on the sample scene demand parameters using the multiple adaptation combinations, so as to determine from the multiple adaptation combinations the target adaptation combination suited to the sample scene;
That is, each adaptation combination can be used to train on the sample scene demand parameters, and according to the training results the target adaptation combination suited to the sample scene can be determined from the multiple adaptation combinations, for example a set of optimal configuration parameters suited to the sample scene.
As an example, based on the sample scene demand parameters, vector index recall can be performed on simulated business requests with each of the adaptation combinations, and the system performance consumption, recall accuracy and recall latency under each adaptation combination are recorded. According to the recorded results, the target adaptation combination suited to the sample scene is determined from the multiple adaptation combinations; the target adaptation combination can be understood as the adaptation combination best suited to the sample scene.
For example, the upstream business application simulates sending a simulated business request (an ordinary business request) to the vector index system. While parsing the simulated business request, the vector index system, based on the sample scene demand parameters of the upstream business application, triggers concurrent requests to multiple downstream index algorithms and records the computing performance consumed by each algorithm, such as the CPU (Central Processing Unit) utilization, memory utilization and IO (Input/Output) consumption of the compute nodes; if running in a virtualized environment, it records the quota utilization of CPU, memory and IO of the compute instances. It also records the recall accuracy and recall latency of the different index recall algorithms. In this way, the target adaptation combination best suited to the sample scene can be determined from the adaptation combinations used, according to the recorded results.
S250: determine the parameters included in the target adaptation combination as the target configuration parameters for the sample scene;
S260: store the target configuration parameters for the sample scene in the first configuration file.
Thus, the first configuration file can be established through steps S210 to S260. The first configuration file may contain the configuration parameters for each set of scene demand parameters, that is, the index algorithm type and index library type that, under the given scene demand parameters, optimize over multiple factors including operating cost and index recall effect.
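Steps S240 to S260 can be sketched as choosing, from the recorded trial results, one combination per sample scene and storing it. The source does not state the selection rule, so the rule used here (among combinations that meet the accuracy and latency demand, take the one with the lowest CPU plus memory utilization) is an assumption, and the records are made-up illustrative values, not measurements.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TrialRecord:
    combination: tuple  # e.g. (index library type, retrieval algorithm)
    cpu_util: float
    mem_util: float
    accuracy: float
    latency_ms: float

def pick_target(records, min_accuracy, max_latency_ms):
    """Assumed rule: among combinations meeting the demand, take the cheapest."""
    feasible = [r for r in records
                if r.accuracy >= min_accuracy and r.latency_ms <= max_latency_ms]
    if not feasible:
        return None
    return min(feasible, key=lambda r: r.cpu_util + r.mem_util)

# Made-up records for illustration; these are not measurements.
records = [
    TrialRecord(("flat", "brute_force"), 0.90, 0.30, 1.00, 40.0),
    TrialRecord(("graph", "graph_search"), 0.30, 0.80, 0.97, 3.0),
    TrialRecord(("tree", "tree_search"), 0.20, 0.40, 0.90, 5.0),
]

# Stand-in for the first configuration file: sample scene -> target combination.
first_config = {}
sample_scenes = {"strict": (0.95, 10.0), "cheap": (0.85, 10.0)}
for scene, (min_acc, max_lat) in sample_scenes.items():
    target = pick_target(records, min_acc, max_lat)
    if target is not None:
        first_config[scene] = target.combination
```

Other rules are equally plausible, for example a weighted score over accuracy, latency and cost; the point of the sketch is only that the choice is driven by the recorded performance, accuracy and latency.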
S130: select a corresponding index retrieval algorithm and index library according to the target configuration parameters.
Optionally, multiple index algorithms are encapsulated in advance into an index algorithm library, and multiple index libraries are managed in a unified way. For example, an index operator library subsystem and a vector index library subsystem can be provided, where the index operator library subsystem is responsible for encapsulating index operators. The present invention can support encapsulating multiple types, including but not limited to brute-force computation algorithms based on Euclidean distance, cosine distance, Manhattan distance and the like, the Faiss (a similarity retrieval tool) operator library, the Annoy (approximate nearest neighbor search) operator library, the NMS (non-maximum suppression) operator library, and so on.
In this step, once the target configuration parameters corresponding to the scene demand parameters are obtained, the corresponding index retrieval algorithm can be chosen from the index operator library subsystem according to the target configuration parameters, and the corresponding index library can be chosen from the vector index library subsystem.
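One simple way to realize step S130 is a pair of registries keyed by the values found in the target configuration parameters. The sketch below is an assumption about structure only: the registry names, the configuration keys `search_algorithm` and `index_type`, and the single registered operator are hypothetical, and a real operator library would wrap third-party libraries behind the same call signature.

```python
def exact_search(query, index, n):
    """Brute-force operator: rank stored vectors by squared Euclidean distance."""
    def sq_dist(vec):
        return sum((q - x) ** 2 for q, x in zip(query, vec))
    return sorted(index, key=lambda vec_id: sq_dist(index[vec_id]))[:n]

# Stand-ins for the index operator library and the vector index library subsystems.
OPERATOR_LIBRARY = {"brute_force_l2": exact_search}
INDEX_LIBRARIES = {"flat": {"doc1": [0.0, 1.0], "doc2": [1.0, 0.0]}}

def select(target_config):
    """Return (retrieval algorithm, index library) named by the configuration."""
    try:
        algorithm = OPERATOR_LIBRARY[target_config["search_algorithm"]]
        index = INDEX_LIBRARIES[target_config["index_type"]]
    except KeyError as missing:
        raise ValueError(f"unsupported configuration value: {missing}") from None
    return algorithm, index

algorithm, index = select({"search_algorithm": "brute_force_l2", "index_type": "flat"})
hits = algorithm([0.9, 0.1], index, 1)
```

Because every operator shares one signature, adding a new algorithm only means registering another entry; the scheduling code does not change.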
S140: using the corresponding index retrieval algorithm, match the vector representation against the vector indexes in the corresponding index library, and recall the matched vector indexes.
That is, the selected index retrieval algorithm is used to compute the similarity between the vector representation and the vector indexes in the index library, and the vector indexes whose similarity reaches a certain threshold are recalled as the recall result and returned to the upstream business application.
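The threshold-based recall in step S140 can be illustrated with cosine similarity. The choice of cosine similarity, the threshold value and the three-entry index below are assumptions for the example; the source leaves the similarity measure to the selected algorithm.

```python
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # Treat a zero vector as dissimilar to everything.
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def recall(query, index, threshold):
    """Return (id, similarity) pairs at or above the threshold, best first."""
    scored = [(vec_id, cosine_similarity(query, vec)) for vec_id, vec in index.items()]
    kept = [pair for pair in scored if pair[1] >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)

# Hypothetical index, for illustration only.
index = {"x": [1.0, 0.0], "y": [1.0, 1.0], "z": [0.0, 1.0]}
results = recall([1.0, 0.0], index, threshold=0.7)
```

Here "x" matches exactly, "y" has similarity of about 0.707 and passes the 0.7 threshold, and "z" is orthogonal to the query and is dropped.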
It should be noted that business scenarios are not fixed. For example, the online request QPS (Queries Per Second) may differ between peak and trough periods every day, and the gap between peak and trough may be several-fold; the requirements for online retrieval latency and accuracy may also fluctuate. In such cases the corresponding configuration parameters can change, so real-time dynamic adjustment of the configuration parameters is supported. Optionally, in an embodiment of the present invention, while the upstream business application accesses the vector index system in real time, the performance indicator data used by the vector index system in serving the upstream business application is collected, and the collected performance indicator data is trained on using the multiple adaptation combinations, so as to determine from the multiple adaptation combinations the target adaptation combination suited to the performance indicator data. The difference between the parameters included in that target adaptation combination and the target configuration parameters is then calculated, and when the difference is greater than or equal to a preset threshold, the configuration parameters of the business application in the first configuration file are replaced with the parameters included in the target adaptation combination for the performance indicator data.
For example, while the upstream business application accesses the vector index system in real time, the logs of the internal modules of the vector index system can be collected synchronously. The collected content may include performance indicator data such as the business request QPS, current latency, recall accuracy, and current CPU utilization, memory utilization and IO utilization. After these indicator data are collected, at a specified time interval, the index retrieval algorithms and index libraries in the multiple adaptation combinations are used to perform vector index recall, and the configuration parameters that fit the currently collected scene demand are calculated. The difference between these configuration parameters and the target configuration parameters currently used by the upstream business application is computed, and when the difference reaches the threshold that triggers a configuration change, a configuration parameter change is triggered, that is, the configuration parameters of the business application in the first configuration file are replaced with the parameters included in the target adaptation combination for the performance indicator data.
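The threshold-triggered replacement can be sketched as below. The source does not define how the difference between two configurations is computed, so this sketch assumes one simple definition, the fraction of parameters whose values differ; the parameter names and values are hypothetical.

```python
def config_difference(current, candidate):
    """Assumed metric: fraction of parameters whose values differ."""
    keys = set(current) | set(candidate)
    if not keys:
        return 0.0
    changed = sum(1 for key in keys if current.get(key) != candidate.get(key))
    return changed / len(keys)

def maybe_update(first_config, app_id, candidate, threshold):
    """Replace the stored configuration only when the difference reaches the threshold."""
    if config_difference(first_config[app_id], candidate) >= threshold:
        first_config[app_id] = dict(candidate)
        return True
    return False

# Hypothetical current configuration of one business application.
first_config = {
    "search": {"index_type": "graph", "dim": 64, "trees": 10, "algorithm": "graph_search"},
}
small_change = dict(first_config["search"], dim=32)
large_change = {"index_type": "tree", "dim": 32, "trees": 10, "algorithm": "tree_search"}

updated_small = maybe_update(first_config, "search", small_change, threshold=0.5)
updated_large = maybe_update(first_config, "search", large_change, threshold=0.5)
```

Using a threshold rather than replacing on any change avoids rebuilding or switching indexes for small fluctuations in the measured indicators.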
In order to make those skilled in the art clearly understand the present invention, will be exemplified below.Firstly, this The dress method of recalling of the vector index of inventive embodiments can be applied to vector index system.For example, as shown in figure 3, for this The configuration diagram of the vector index system of inventive embodiments.Wherein, vector index system can include: intelligent scheduling subsystem System, index Operator Library subsystem and amount index database subsystem.Wherein, which may include resolver, configuration pipe Device and intelligent scheduler and training system are managed, which is responsible for receiving the request network packet that upstream business application side is sent, into The parsing of row data.Resolver is joined after analysis request data packet, through configuration manager and intelligent scheduler according to scene demand Number selects suitable indexed search algorithm and corresponding index database.Index the envelope of Operator Library subsystem responsible indexed search algorithm Dress, the design of the vector index system support that encapsulate a plurality of types of index operators includes into a set of general-purpose subsystem, and not It is limited to based on Euclidean distance, cos COS distance, the violence computational algorithm of manhatton distance scheduling algorithm, Faiss Operator Library, Annoy Operator Library, nms Operator Library etc..The building and storage service of vector index library subsystem responsible vector index, include and are not limited to The building and storage, the building of index database based on binary tree and storage of the index database calculated based on violence, based on the index of figure Library building and storage.The specific system component and interaction relation of the design are as shown in Figure 3.
It can be understood that the design and implementation of the present invention mainly involve two problems: one is how to train the optimal configuration parameters under various scenarios; the other is how to adjust the optimal configuration parameters dynamically and in real time after the business scenario changes. Specific implementation examples are given below:
1) How to train the optimal configuration parameters under various scenarios:
To optimize across multiple factors, such as operating cost and index recall effect, under different scenarios, a scheduler that intelligently schedules the different recall operators and index libraries is needed, together with a training system that trains and corrects the system parameters. The intelligent scheduling subsystem of the present invention performs this part of the design, and the training system is its core component.
Training process: the upstream business application side simulates sending normal business requests to the vector index system. While parsing a request, the parser triggers, through the training system, concurrent requests to multiple downstream index search algorithms. The training system records the computing performance consumed by each index search algorithm (including the CPU utilization, memory utilization, and IO consumption of the compute node; when running in a virtualized environment, the CPU, memory, and IO quota utilization of the compute instance is recorded), and also records the recall accuracy and recall latency of each index recall operator. The training system takes the scene demand of the upstream business (such as index query QPS and online access latency requirements) and the cost input (the CPU cost required for online index query computation and the index storage cost) as input parameters, and uses the various index search algorithms to automatically calculate a set of optimal configuration parameters for the application scenario, including core parameters such as the index build type used by the system, the vector index dimension, the number of index trees built, the online index search algorithm, and the index compression ratio. The core parameters corresponding to these business scenarios are stored in the first configuration file, which is managed and maintained by the configuration manager.
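The text does not state the exact rule by which the training system turns its recorded measurements into the optimal configuration. A minimal sketch, assuming the rule is "keep the trials that satisfy the latency and accuracy demands, then take the cheapest one", might look as follows. `TrialRecord` and `pick_target_combination` are hypothetical names, and the selection rule itself is an assumption.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class TrialRecord:
    combination: str        # label of the tested parameter combination
    cpu_utilization: float  # fraction of CPU used during the trial, 0..1
    accuracy: float         # recall accuracy, 0..1
    latency_ms: float       # recall latency in milliseconds


def pick_target_combination(
    records: List[TrialRecord], max_latency_ms: float, min_accuracy: float
) -> Optional[TrialRecord]:
    """Return the cheapest trial that meets the scene demand, or None."""
    feasible = [
        r for r in records
        if r.latency_ms <= max_latency_ms and r.accuracy >= min_accuracy
    ]
    if not feasible:
        return None
    # Assumed cost ordering: lowest CPU first, latency as the tie-breaker.
    return min(feasible, key=lambda r: (r.cpu_utilization, r.latency_ms))
```

Other cost terms mentioned in the text, such as memory, IO, or index storage cost, could be folded into the sort key in the same way.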
2) How to adjust the optimal configuration parameters dynamically and in real time after the business scenario changes:
Business scenarios are not fixed. For example, the online request QPS may differ between busy and quiet periods each day, and the gap between peak and trough periods may be several-fold; the requirements for online retrieval latency and accuracy also fluctuate. In such cases the optimal parameters change. The present invention therefore supports real-time dynamic adjustment of the optimal configuration parameters in such cases.
Real-time dynamic adjustment of the optimal configuration parameters: while the upstream business application side accesses the vector index system in real time, the training system synchronously collects the logs of the internal modules of the vector index system. The collected content includes basic indicators such as service request QPS, current latency, recall accuracy, current CPU utilization, memory utilization, and IO utilization. After these indicators are collected, they are combined with the parameter configuration currently used in the configuration manager as input parameters, and at every specified time interval the various index search algorithms are used to automatically calculate the optimal parameter solution for the real-time scenario. When the diff between the optimal parameter solution and the currently used optimal configuration parameters (the diff is the parameter difference between the two) reaches the threshold for triggering a configuration change (for example, the diff value may be the weighted difference of the individual parameter diffs), the training system triggers the configuration manager to change the configuration parameters. Thus, when the business scenario fluctuates or changes, the system can adjust the optimal parameters automatically and in real time, ensuring that the best effect is still achieved when the environment changes.
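The weighted diff mentioned above can be sketched as follows. The text only says that the diff value may be a weighted difference of the per-parameter diffs; the per-parameter measure used here (relative change for numeric values, 0 or 1 for labels), the weights, and the function names are all assumptions made for illustration.

```python
from typing import Dict, Union

ParamValue = Union[str, int, float]


def param_diff(current: ParamValue, candidate: ParamValue) -> float:
    """Per-parameter diff in [0, 1]: relative change for numbers, 0/1 for labels."""
    if isinstance(current, (int, float)) and isinstance(candidate, (int, float)):
        scale = max(abs(current), abs(candidate))
        return 0.0 if scale == 0 else abs(current - candidate) / scale
    return 0.0 if current == candidate else 1.0


def weighted_diff(
    current: Dict[str, ParamValue],
    candidate: Dict[str, ParamValue],
    weights: Dict[str, float],
) -> float:
    """Weighted sum of the per-parameter diffs, over the parameters in weights."""
    return sum(
        weights[name] * param_diff(current[name], candidate[name])
        for name in weights
    )


def should_change(
    current: Dict[str, ParamValue],
    candidate: Dict[str, ParamValue],
    weights: Dict[str, float],
    threshold: float,
) -> bool:
    # A change is triggered once the diff reaches the threshold.
    return weighted_diff(current, candidate, weights) >= threshold
```

Giving a categorical parameter such as the search algorithm a larger weight than a numeric one makes a change of algorithm more likely to cross the threshold than a small change in, say, the number of index trees.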
It should be noted that a triggered change must satisfy both the online index query latency requirement and the matching accuracy requirement; otherwise the current configuration parameters continue to be used. A change may involve switching the index library. For example, if building the index library as a binary tree index meets the needs of the business side better than building it as a graph index (for instance, under the current request pressure it better meets the business side's demands on cost or online retrieval latency), an index library switch can be triggered. If the index libraries are deployed in a distributed environment with multiple shards and multiple replicas, all index library nodes must be switched synchronously under the control of the intelligent scheduler. This prevents the same upstream request from obtaining search results from different types of index library and ensures that the index recall effect is not degraded during the switch.
In this embodiment, the synchronous switching of all index library nodes is implemented as follows: the configuration manager sends a new-index-library start signal to all index library nodes in a unified manner; after the new index library has started successfully on every node, the startup status is fed back to the configuration manager. The intelligent scheduler then schedules the requests sent by the upstream business side to the new index library, after which the old index library is discarded.
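The switching sequence described above can be sketched as an in-process simulation. `IndexNode`, `switch_all`, and the `router` dictionary are hypothetical stand-ins for the index library nodes, the configuration manager, and the intelligent scheduler; network communication and the handling of partial failures are not specified in the text, and the rollback shown here is an assumption.

```python
from typing import Dict, List, Optional


class IndexNode:
    """Hypothetical index library node with an active and a standby index."""

    def __init__(self, name: str, active: str) -> None:
        self.name = name
        self.active = active
        self.standby: Optional[str] = None

    def start_new_index(self, index_type: str) -> bool:
        # Load the new index next to the old one and report startup status.
        self.standby = index_type
        return True

    def retire_old_index(self) -> None:
        # Promote the new index and drop the old one.
        if self.standby is not None:
            self.active, self.standby = self.standby, None


def switch_all(nodes: List[IndexNode], new_index: str, router: Dict[str, str]) -> bool:
    # Step 1: send the start signal for the new index library to every node.
    started = [node.start_new_index(new_index) for node in nodes]
    # Step 2: continue only if every node reported a successful start.
    if not all(started):
        for node in nodes:
            node.standby = None  # assumed rollback: keep serving the old index
        return False
    # Step 3: route upstream requests to the new index library.
    router["target"] = new_index
    # Step 4: discard the old index library on every node.
    for node in nodes:
        node.retire_old_index()
    return True
```

The routing target changes only after all nodes have confirmed startup, so a single request is never answered partly by old and partly by new index libraries.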
According to the vector index recall method of the embodiments of the present invention, when a request network packet sent by the upstream business application side is received, data parsing is performed on the request network packet to obtain the corresponding vector expression, and the scene demand parameters of the upstream business application side are determined. The corresponding target configuration parameters are determined from the pre-established first configuration file according to the scene demand parameters, and the corresponding index search algorithm and index library are then selected according to the target configuration parameters. The corresponding index search algorithm is then used to match the vector expression against the vector indexes in the corresponding index library, and the matched vector indexes are recalled. That is, a suitable index search algorithm and the corresponding index library are selected according to the scene demand parameters of the upstream business application side, and vector index recall is then performed based on the selected algorithm and index library. In large-scale vector index scenarios this makes it possible to balance cost input, computing effect, and response latency effectively, so that an application side using the present invention can, with limited resource input, achieve the best effect under conditions such as different application scenarios, access pressure, and online access latency. Accuracy, efficiency, and cost can thus all be taken into account while vector index recall is achieved.
Corresponding to the vector index recall methods provided by the above embodiments, an embodiment of the present invention further provides a vector index recall device. Because the vector index recall device provided by the embodiments of the present invention corresponds to the vector index recall methods provided by the above embodiments, the implementations of the foregoing vector index recall method also apply to the vector index recall device provided in this embodiment and are not described in detail here. Fig. 4 is a schematic structural diagram of a vector index recall device according to an embodiment of the present invention. As shown in Fig. 4, the vector index recall device 400 may include: a data resolution module 410, a scene demand determining module 420, an optimized parameter determining module 430, a selecting module 440, and an index recall module 450.
Specifically, the data resolution module 410 is used to perform data parsing on the request network packet, when a request network packet sent by the upstream business application side is received, to obtain the corresponding vector expression.
The scene demand determining module 420 is used to determine the scene demand parameters of the upstream business application side.
The optimized parameter determining module 430 is used to determine the corresponding target configuration parameters from the pre-established first configuration file according to the scene demand parameters, where the first configuration file includes the configuration parameters under various scene demands.
The selecting module 440 is used to select the corresponding index search algorithm and index library according to the target configuration parameters.
The index recall module 450 is used to match the vector expression against the vector indexes in the corresponding index library using the corresponding index search algorithm, and to recall the matched vector indexes.
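The cooperation of the five modules can be sketched as one short pipeline. Everything concrete in this sketch is an assumption made for illustration: the JSON packet format, the scene labels, the contents of `FIRST_CONFIG`, and the two toy scoring functions. In particular, the packet is assumed to carry the scene label directly, which simplifies the scene demand determining step.

```python
import json
from typing import Callable, Dict, List, Tuple

# Hypothetical first configuration file: scene demand -> target configuration.
FIRST_CONFIG: Dict[str, Dict[str, str]] = {
    "low_latency": {"algorithm": "dot", "index_library": "small"},
    "high_accuracy": {"algorithm": "l1", "index_library": "full"},
}
INDEX_LIBRARIES: Dict[str, Dict[str, List[float]]] = {
    "small": {"doc1": [1.0, 0.0]},
    "full": {"doc1": [1.0, 0.0], "doc2": [0.0, 1.0]},
}
# Lower score means a closer match for both toy algorithms.
ALGORITHMS: Dict[str, Callable[[List[float], List[float]], float]] = {
    "dot": lambda q, v: -sum(a * b for a, b in zip(q, v)),
    "l1": lambda q, v: sum(abs(a - b) for a, b in zip(q, v)),
}


def parse_request(packet: bytes) -> Tuple[str, List[float]]:
    """Data parsing: extract the scene label and the vector expression."""
    body = json.loads(packet.decode("utf-8"))
    return body["scene"], body["vector"]


def recall(packet: bytes, top_k: int = 1) -> List[str]:
    scene, vector = parse_request(packet)        # parsing and scene demand
    target = FIRST_CONFIG[scene]                 # target configuration lookup
    score = ALGORITHMS[target["algorithm"]]      # algorithm selection
    library = INDEX_LIBRARIES[target["index_library"]]  # index library selection
    ranked = sorted(library, key=lambda key: score(vector, library[key]))
    return ranked[:top_k]                        # recall of the matched entries
```

A caller would pass a packet such as `json.dumps({"scene": "high_accuracy", "vector": [0.0, 0.9]}).encode()`; the scene label alone decides which algorithm and which index library serve the request.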
Optionally, in an embodiment of the present invention, as shown in Fig. 5, the vector index recall device 400 may further include a configuration module 460. The configuration module 460 may be used to provide a scene demand parameter configuration interface to the upstream business application side, where the configuration interface includes default configuration parameters and configuration options for the default configuration parameters; to receive the configuration modifications that the upstream business application side makes on the configuration interface on the basis of the default configuration parameters, thereby obtaining the scene demand parameter information configured by the upstream business application side; and to store the scene demand parameter information configured by the upstream business application side in a second configuration file.
The request network packet includes identification information of the upstream business application side. In an embodiment of the present invention, the scene demand determining module 420 is specifically used to: obtain the identification information of the upstream business application side from the request network packet; and determine the scene demand parameters of the upstream business application side from the second configuration file according to the identification information.
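One way to picture the second configuration file is as a JSON document keyed by the identification information of each application side. The sketch below assumes that format; the default values, the field names, and the functions `save_scene_demand` and `load_scene_demand` are hypothetical.

```python
import json
import os
from typing import Any, Dict

# Hypothetical default configuration parameters offered by the interface.
DEFAULTS: Dict[str, Any] = {"qps": 100, "max_latency_ms": 50, "min_accuracy": 0.9}


def save_scene_demand(path: str, app_id: str, modifications: Dict[str, Any]) -> None:
    """Merge the caller's modifications over the defaults and persist them."""
    unknown = set(modifications) - set(DEFAULTS)
    if unknown:
        raise KeyError(f"unsupported options: {sorted(unknown)}")
    store: Dict[str, Dict[str, Any]] = {}
    if os.path.exists(path):
        with open(path, encoding="utf-8") as fh:
            store = json.load(fh)
    store[app_id] = {**DEFAULTS, **modifications}
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(store, fh)


def load_scene_demand(path: str, app_id: str) -> Dict[str, Any]:
    """Look up the scene demand parameters by application identifier."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)[app_id]
```

Storing the merged result rather than only the modifications means a later lookup by identifier returns a complete set of scene demand parameters without consulting the defaults again.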
It should be noted that, in an embodiment of the present invention, the configuration parameters include the index build type used by the vector index system, the vector index dimension, the number of index trees built, the online index search algorithm, and the index compression ratio. In an embodiment of the present invention, as shown in Fig. 6, the vector index recall device 400 may further include a configuration file establishing module 470, which may be used to pre-establish the first configuration file. Specifically, the configuration file establishing module 470 is used to: construct a configuration set for each parameter in the configuration parameters; traverse the configuration sets of the parameters and combine their values to obtain multiple adaptation combinations; obtain sample scene demand parameters; train on the sample scene demand parameters according to the multiple adaptation combinations, so as to determine from them the target adaptation combination suited to the sample scenario; determine the parameters included in the target adaptation combination as the target configuration parameters under the sample scenario; and store the target configuration parameters under the sample scenario in the first configuration file.
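Traversing the configuration sets and combining their values amounts to taking a Cartesian product. A minimal sketch follows, in which the concrete values in `CONFIG_SETS` are invented for illustration and only the five parameter categories come from the text.

```python
from itertools import product
from typing import Any, Dict, List

# Hypothetical configuration sets, one per parameter named in the text.
CONFIG_SETS: Dict[str, List[Any]] = {
    "index_build_type": ["brute_force", "binary_tree", "graph"],
    "vector_dimension": [64, 128],
    "num_index_trees": [10, 50],
    "search_algorithm": ["euclidean", "cosine"],
    "compression_ratio": [1.0, 0.5],
}


def enumerate_combinations(config_sets: Dict[str, List[Any]]) -> List[Dict[str, Any]]:
    """Return every combination that takes one value from each configuration set."""
    names = list(config_sets)
    return [
        dict(zip(names, values))
        for values in product(*(config_sets[name] for name in names))
    ]
```

The number of combinations is the product of the set sizes, so it grows quickly as sets are enlarged; each combination would then be tried against the sample scene demand parameters as the text describes.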
As an example, the configuration file establishing module 470 is specifically used to: based on the sample scene demand parameters, perform vector index recall on simulated business requests using each of the multiple adaptation combinations; record the system performance consumption, recall accuracy, and recall latency under each adaptation combination; and, according to the recorded results, determine from the multiple adaptation combinations the target adaptation combination suited to the sample scenario.
It should be noted that business scenarios are not fixed. For example, the online request QPS (queries per second) may differ between busy and quiet periods each day, and the gap between peak and trough periods may be several-fold; the requirements for online retrieval latency and accuracy also fluctuate. In such cases the corresponding optimal configuration parameters change, so real-time dynamic adjustment of the optimal configuration parameters is supported. Optionally, in an embodiment of the present invention, as shown in Fig. 6, the vector index recall device 400 may further include an acquisition module 480, a training module 490, a difference calculating module 4100, and an optimized parameter update module 4110. The acquisition module 480 is used to collect, while the upstream business application side accesses the vector index system in real time, the performance indicator data of the vector index system when serving the accesses of the upstream business application side. The training module 490 is used to train on the collected performance indicator data according to the multiple adaptation combinations, so as to determine from them the target adaptation combination suited to the performance indicator data. The difference calculating module 4100 is used to calculate the difference between the parameters included in the target adaptation combination under the performance indicator data and the target configuration parameters. The optimized parameter update module 4110 is used to replace, when the difference is greater than or equal to a preset threshold, the configuration parameters of the business application side in the first configuration file with the parameters included in the target adaptation combination under the performance indicator data.
According to the vector index recall device of the embodiments of the present invention, when a request network packet sent by the upstream business application side is received, the data resolution module performs data parsing on the request network packet to obtain the corresponding vector expression; the scene demand determining module determines the scene demand parameters of the upstream business application side; the optimized parameter determining module determines the corresponding target configuration parameters from the pre-established first configuration file according to the scene demand parameters; the selecting module selects the corresponding index search algorithm and index library according to the target configuration parameters; and the index recall module uses the corresponding index search algorithm to match the vector expression against the vector indexes in the corresponding index library and recalls the matched vector indexes. That is, a suitable index search algorithm and the corresponding index library are selected according to the scene demand parameters of the upstream business application side, and vector index recall is then performed based on the selected algorithm and index library. In large-scale vector index scenarios this makes it possible to balance cost input, computing effect, and response latency effectively, so that an application side using the present invention can, with limited resource input, achieve the best effect under conditions such as different application scenarios, access pressure, and online access latency. Accuracy, efficiency, and cost can thus all be taken into account while vector index recall is achieved.
To implement the above embodiments, the present invention further provides an electronic device.
Fig. 7 is a schematic structural diagram of an electronic device according to an embodiment of the present invention. As shown in Fig. 7, the electronic device 700 may include: a memory 710, a processor 720, and a computer program 730 that is stored in the memory 710 and can run on the processor 720. When the processor 720 executes the computer program 730, the vector index recall method described in any of the above embodiments of the present invention is implemented.
To implement the above embodiments, the present invention further provides a computer-readable storage medium on which a computer program is stored. When the computer program is executed by a processor, the vector index recall method described in any of the above embodiments of the present invention is implemented.
In the description of the present invention, it should be understood that the terms "first" and "second" are used for descriptive purposes only and cannot be understood as indicating or implying relative importance or implicitly indicating the number of the technical features referred to. Thus, a feature defined with "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present invention, "multiple" means at least two, for example two or three, unless otherwise specifically defined.
In the description of this specification, a description referring to the terms "one embodiment", "some embodiments", "example", "specific example", or "some examples" means that the specific features, structures, materials, or characteristics described in connection with that embodiment or example are included in at least one embodiment or example of the present invention. In this specification, such terms do not necessarily refer to the same embodiment or example. Moreover, the specific features, structures, materials, or characteristics described may be combined in a suitable manner in any one or more embodiments or examples. In addition, where they do not contradict each other, those skilled in the art may combine the different embodiments or examples described in this specification and the features of those embodiments or examples.
Any process or method description in a flowchart or otherwise described herein may be understood as representing a module, segment, or portion of code that includes one or more executable instructions for implementing the steps of a specific logical function or process. The scope of the preferred embodiments of the present invention includes other implementations in which functions may be performed out of the order shown or discussed, including in a substantially simultaneous manner or in the reverse order depending on the functions involved, as should be understood by those skilled in the art to which the embodiments of the present invention belong.
The logic and/or steps represented in a flowchart or otherwise described herein, for example an ordered list of executable instructions that may be considered to implement logical functions, may be embodied in any computer-readable medium for use by, or in connection with, an instruction execution system, apparatus, or device (such as a computer-based system, a system including a processor, or another system that can fetch instructions from the instruction execution system, apparatus, or device and execute them). For the purposes of this specification, a "computer-readable medium" may be any means that can contain, store, communicate, propagate, or transmit a program for use by, or in connection with, an instruction execution system, apparatus, or device. More specific examples (a non-exhaustive list) of computer-readable media include: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), a fiber optic device, and a portable compact disc read-only memory (CDROM). In addition, the computer-readable medium may even be paper or another suitable medium on which the program can be printed, because the program can be obtained electronically, for example by optically scanning the paper or other medium and then editing, interpreting, or otherwise processing it in a suitable manner if necessary, and then stored in a computer memory.
It should be understood that the parts of the present invention may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, multiple steps or methods may be implemented with software or firmware that is stored in a memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, any one of the following techniques known in the art, or a combination thereof, may be used: a discrete logic circuit having logic gate circuits for implementing logic functions on data signals, an application-specific integrated circuit having suitable combinational logic gate circuits, a programmable gate array (PGA), a field-programmable gate array (FPGA), and so on.
Those of ordinary skill in the art can understand that all or part of the steps carried by the methods of the above embodiments can be completed by a program instructing the relevant hardware. The program can be stored in a computer-readable storage medium and, when executed, performs one of the steps of the method embodiments or a combination thereof.
In addition, the functional units in the embodiments of the present invention may be integrated in one processing module, each unit may exist physically on its own, or two or more units may be integrated in one module. The integrated module may be implemented in the form of hardware or in the form of a software functional module. If the integrated module is implemented in the form of a software functional module and is sold or used as an independent product, it may also be stored in a computer-readable storage medium.
The storage medium mentioned above may be a read-only memory, a magnetic disk, an optical disc, or the like. Although embodiments of the present invention have been shown and described above, it can be understood that the above embodiments are exemplary and should not be construed as limiting the present invention; those of ordinary skill in the art may change, modify, replace, and vary the above embodiments within the scope of the present invention.

Claims (14)

1. A vector index recall method, characterized by comprising:
when a request network packet sent by an upstream business application side is received, performing data parsing on the request network packet to obtain a corresponding vector expression;
determining scene demand parameters of the upstream business application side, and determining corresponding target configuration parameters from a pre-established first configuration file according to the scene demand parameters;
selecting a corresponding index search algorithm and index library according to the target configuration parameters;
using the corresponding index search algorithm to match the vector expression against the vector indexes in the corresponding index library, and recalling the matched vector indexes.
2. The method according to claim 1, characterized in that, before the request network packet sent by the upstream business application side is received, the method further comprises:
providing a scene demand parameter configuration interface to the upstream business application side, wherein the configuration interface includes default configuration parameters and configuration options for the default configuration parameters;
receiving the configuration modifications made by the upstream business application side on the configuration interface on the basis of the default configuration parameters, to obtain the scene demand parameter information configured by the upstream business application side;
storing the scene demand parameter information configured by the upstream business application side in a second configuration file.
3. The method according to claim 2, characterized in that the request network packet includes identification information of the upstream business application side, and determining the scene demand parameters of the upstream business application side comprises:
obtaining the identification information of the upstream business application side from the request network packet;
determining the scene demand parameters of the upstream business application side from the second configuration file according to the identification information.
4. The method according to claim 1, characterized in that the first configuration file includes configuration parameters under various scene demands, the configuration parameters including the index build type used by the vector index system, the vector index dimension, the number of index trees built, the online index search algorithm, and the index compression ratio; and the first configuration file is pre-established by the following steps:
constructing a configuration set for each parameter in the configuration parameters;
traversing the configuration sets of the parameters and combining their values to obtain multiple adaptation combinations;
obtaining sample scene demand parameters;
training on the sample scene demand parameters according to the multiple adaptation combinations, so as to determine from the multiple adaptation combinations a target adaptation combination suited to the sample scenario;
determining the parameters included in the target adaptation combination as the target configuration parameters under the sample scenario;
storing the target configuration parameters under the sample scenario in the first configuration file.
5. The method according to claim 4, characterized in that training on the sample scene demand parameters according to the multiple adaptation combinations, so as to determine from the multiple adaptation combinations the target adaptation combination suited to the sample scenario, comprises:
based on the sample scene demand parameters, performing vector index recall on simulated business requests using each of the multiple adaptation combinations;
recording the system performance consumption, recall accuracy, and recall latency under each adaptation combination;
determining, according to the recorded results, the target adaptation combination suited to the sample scenario from the multiple adaptation combinations.
6. The method according to claim 4 or 5, characterized by further comprising:
while the upstream business application side accesses the vector index system in real time, collecting the performance indicator data of the vector index system when serving the accesses of the upstream business application side;
training on the collected performance indicator data according to the multiple adaptation combinations, so as to determine from the multiple adaptation combinations a target adaptation combination suited to the performance indicator data;
calculating the difference between the parameters included in the target adaptation combination under the performance indicator data and the target configuration parameters;
when the difference is greater than or equal to a preset threshold, replacing the configuration parameters of the business application side in the first configuration file with the parameters included in the target adaptation combination under the performance indicator data.
7. A vector index recall device, characterized by comprising:
a data resolution module, used to perform data parsing on a request network packet, when the request network packet sent by an upstream business application side is received, to obtain a corresponding vector expression;
a scene demand determining module, used to determine scene demand parameters of the upstream business application side;
an optimized parameter determining module, used to determine corresponding target configuration parameters from a pre-established first configuration file according to the scene demand parameters;
a selecting module, used to select a corresponding index search algorithm and index library according to the target configuration parameters;
an index recall module, used to match the vector expression against the vector indexes in the corresponding index library using the corresponding index search algorithm, and to recall the matched vector indexes.
8. The device according to claim 7, characterized by further comprising:
a configuration module, used to provide a scene demand parameter configuration interface to the upstream business application side, wherein the configuration interface includes default configuration parameters and configuration options for the default configuration parameters; to receive the configuration modifications made by the upstream business application side on the configuration interface on the basis of the default configuration parameters, to obtain the scene demand parameter information configured by the upstream business application side; and to store the scene demand parameter information configured by the upstream business application side in a second configuration file.
9. The device according to claim 8, characterized in that the request network packet includes identification information of the upstream business application side, and the scene demand determining module is specifically used to:
obtain the identification information of the upstream business application side from the request network packet;
determine the scene demand parameters of the upstream business application side from the second configuration file according to the identification information.
10. The device according to claim 7, characterized in that the first configuration file includes configuration parameters for various scene demands, the configuration parameters including the index library-building type used by the vector index system, the vector index dimension, the number of index trees built, the online index search algorithm, and the index compression ratio; the device further comprises:
A configuration file establishing module, configured to pre-establish the first configuration file;
Wherein the configuration file establishing module is specifically configured to:
Construct a configuration set for each parameter in the configuration parameters;
Traverse and combine values from the configuration sets of the parameters to obtain multiple adapter combinations;
Obtain sample scene demand parameters;
Train on the sample scene demand parameters according to the multiple adapter combinations, so as to determine, from the multiple adapter combinations, a target adapter combination suited to the sample scene;
Determine the parameters included in the target adapter combination as the target configuration parameters for the sample scene;
Store the target configuration parameters for the sample scene in the first configuration file.
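The traversal step in claim 10 amounts to taking the Cartesian product of the per-parameter configuration sets. The sketch below assumes illustrative values for the five parameters named in the claim and a caller-supplied `cost` function standing in for the training step; none of these values or names come from the patent.

```python
from itertools import product

# Hypothetical configuration sets for the five parameters named in the claim.
CONFIG_SETS = {
    "index_type": ["tree", "graph"],
    "dimension": [64, 128],
    "num_trees": [8, 16, 32],
    "search_algorithm": ["greedy", "beam"],
    "compression_ratio": [1, 4],
}


def adapter_combinations(config_sets):
    """Enumerate every combination of one value per parameter."""
    keys = list(config_sets)
    value_lists = [config_sets[key] for key in keys]
    return [dict(zip(keys, values)) for values in product(*value_lists)]


def build_first_config(sample_scenes, config_sets, cost):
    """Pick, for each sample scene, the combination with the lowest cost.

    `cost(combination, demand)` is a placeholder for the training step; it
    must return a number where lower means a better fit for the scene.
    """
    combos = adapter_combinations(config_sets)
    return {
        scene: min(combos, key=lambda combo: cost(combo, demand))
        for scene, demand in sample_scenes.items()
    }
```

With the sets above, `adapter_combinations(CONFIG_SETS)` yields 2 x 2 x 3 x 2 x 2 = 48 combinations. The dictionary returned by `build_first_config` plays the role of the first configuration file, mapping each sample scene to its target configuration parameters.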
11. The device according to claim 10, characterized in that the configuration file establishing module is specifically configured to:
Based on the sample scene demand parameters, simulate service requests using each of the multiple adapter combinations to perform vector index recall;
Record the system performance consumption, recall accuracy, and recall time under each adapter combination;
Determine, from the multiple adapter combinations and according to the recorded results, the target adapter combination suited to the sample scene.
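One way to read claim 11 is as a measure-then-select loop. The sketch below is an assumption-laden illustration: `TrialRecord`, `run_trial`, `select_target`, the use of memory in megabytes as the "system performance consumption", and the selection rule (lowest memory among combinations that meet the recall and latency demands) are all hypothetical, since the claim does not state how the recorded results are weighed.

```python
import time
from dataclasses import dataclass


@dataclass
class TrialRecord:
    combination: dict   # the adapter combination that was tested
    memory_mb: float    # stand-in for system performance consumption
    recall: float       # recall accuracy over the simulated requests
    latency_ms: float   # average recall time per request


def run_trial(combination, search_fn, queries, ground_truth, memory_mb):
    """Replay simulated requests through `search_fn` and record the metrics.

    `search_fn(combination, query)` is a caller-supplied placeholder for the
    vector index recall; `memory_mb` is passed in because measuring it is
    system specific.
    """
    start = time.perf_counter()
    results = [search_fn(combination, query) for query in queries]
    latency_ms = (time.perf_counter() - start) * 1000 / len(queries)
    hits = sum(len(set(r) & set(t)) for r, t in zip(results, ground_truth))
    total = sum(len(t) for t in ground_truth)
    return TrialRecord(combination, memory_mb, hits / total, latency_ms)


def select_target(records, demand):
    """Choose the cheapest combination that satisfies the scene demand."""
    feasible = [
        r for r in records
        if r.recall >= demand["min_recall"] and r.latency_ms <= demand["max_latency_ms"]
    ]
    if not feasible:
        # Assumed fallback: prefer accuracy when nothing meets every demand.
        return max(records, key=lambda r: r.recall)
    return min(feasible, key=lambda r: r.memory_mb)
```

Separating measurement (`run_trial`) from selection (`select_target`) keeps the selection rule testable on recorded data without rerunning the simulated requests.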
12. The device according to claim 10 or 11, characterized in that it further comprises:
An acquisition module, configured to collect, while the upstream business application side accesses the vector index system in real time, the performance indicator data of the vector index system when it serves the accesses of the upstream business application side;
A training module, configured to train on the collected performance indicator data according to the multiple adapter combinations, so as to determine, from the multiple adapter combinations, a target adapter combination suited to the performance indicator data;
A difference calculating module, configured to calculate the difference between the parameters included in the target adapter combination under the performance indicator data and the target configuration parameters;
An optimized parameter update module, configured to, when the difference is greater than or equal to a preset threshold, replace the configuration parameters of the business application side in the first configuration file with the parameters included in the target adapter combination under the performance indicator data.
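The threshold-gated update in claim 12 can be sketched as below. The claim does not define how the difference between two parameter sets is computed, so this example assumes the simplest measure, the number of parameters whose values differ; `parameter_difference` and `maybe_update` are hypothetical names, and the first configuration file is modeled as an in-memory dictionary.

```python
def parameter_difference(current, candidate):
    """Count the parameters whose values differ (an assumed difference measure)."""
    keys = set(current) | set(candidate)
    return sum(1 for key in keys if current.get(key) != candidate.get(key))


def maybe_update(first_config, app_id, candidate, threshold):
    """Replace the stored parameters only if the difference reaches the threshold.

    Returns True when the first configuration was updated, False otherwise.
    """
    difference = parameter_difference(first_config[app_id], candidate)
    if difference >= threshold:
        first_config[app_id] = dict(candidate)
        return True
    return False
```

Gating the replacement on a threshold avoids rewriting the configuration for small fluctuations in the online measurements; a weighted or numeric distance could be substituted for the simple count used here.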
13. An electronic device, characterized by comprising: a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the vector index recall method according to any one of claims 1 to 6.
14. A computer-readable storage medium having a computer program stored thereon, characterized in that the computer program, when executed by a processor, implements the vector index recall method according to any one of claims 1 to 6.
CN201811595045.2A 2018-12-25 2018-12-25 Vector index recall method and device, electronic equipment and storage medium Active CN109710612B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811595045.2A CN109710612B (en) 2018-12-25 2018-12-25 Vector index recall method and device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811595045.2A CN109710612B (en) 2018-12-25 2018-12-25 Vector index recall method and device, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN109710612A true CN109710612A (en) 2019-05-03
CN109710612B CN109710612B (en) 2021-05-18

Family

ID=66258327

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811595045.2A Active CN109710612B (en) 2018-12-25 2018-12-25 Vector index recall method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN109710612B (en)

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110221817A (en) * 2019-06-17 2019-09-10 北京酷我科技有限公司 A kind of data recall module and recommender system
CN110427453A (en) * 2019-05-31 2019-11-08 平安科技(深圳)有限公司 Similarity calculating method, device, computer equipment and the storage medium of data
CN110598078A (en) * 2019-09-11 2019-12-20 京东数字科技控股有限公司 Data retrieval method and device, computer-readable storage medium and electronic device
CN111414527A (en) * 2020-03-16 2020-07-14 腾讯音乐娱乐科技(深圳)有限公司 Similar item query method and device and storage medium
CN111581032A (en) * 2020-05-21 2020-08-25 北京字节跳动网络技术有限公司 Method, device, equipment and storage medium for operating data and rolling back data
CN112395396A (en) * 2019-08-12 2021-02-23 科沃斯商用机器人有限公司 Question-answer matching and searching method, device, system and storage medium
CN112463952A (en) * 2020-12-22 2021-03-09 安徽商信政通信息技术股份有限公司 News text aggregation method and system based on neighbor search
CN113792184A (en) * 2020-08-04 2021-12-14 北京沃东天骏信息技术有限公司 Advertisement recall method, device and system, computer storage medium and electronic equipment
CN115278374A (en) * 2021-04-29 2022-11-01 中移动金融科技有限公司 Video recall method and device

Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101603836A (en) * 2009-07-21 2009-12-16 中国科学院地理科学与资源研究所 Multi-user concurrent guidance path inquiry balance control method and device
CN102200974A (en) * 2010-03-25 2011-09-28 北京师范大学 Unified information retrieval intelligent agent system and method for search engine
CN102663088A (en) * 2012-03-31 2012-09-12 百度在线网络技术(北京)有限公司 Method and equipment for providing search results
CN103744866A (en) * 2013-12-18 2014-04-23 北京百度网讯科技有限公司 Searching method and device
CN105159971A (en) * 2015-08-26 2015-12-16 成都布林特信息技术有限公司 Cloud platform data retrieval method
CN105183774A (en) * 2015-08-07 2015-12-23 北京思特奇信息技术股份有限公司 Intelligent query method and system
CN106649818A (en) * 2016-12-29 2017-05-10 北京奇虎科技有限公司 Recognition method and device for application search intentions and application search method and server
CN107491518A (en) * 2017-08-15 2017-12-19 北京百度网讯科技有限公司 Method and apparatus, server, storage medium are recalled in one kind search
US20180157737A1 (en) * 2015-01-30 2018-06-07 Splunk Inc. Systems and methods for distributing indexer configurations
CN108170719A (en) * 2017-12-05 2018-06-15 深圳市金立通信设备有限公司 A kind of search method, server and computer readable storage medium
CN108647329A (en) * 2018-05-11 2018-10-12 中国联合网络通信集团有限公司 Processing method, device and the computer readable storage medium of user behavior data
US20180300349A1 (en) * 2015-01-30 2018-10-18 Splunk Inc. Source type definition configuration using a graphical user interface

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
彭黎文: "《用户可配置的搜索引擎的设计与实现》", 《中国优秀硕士学位论文全文数据库 信息科技辑》 *

Cited By (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110427453A (en) * 2019-05-31 2019-11-08 平安科技(深圳)有限公司 Similarity calculating method, device, computer equipment and the storage medium of data
CN110427453B (en) * 2019-05-31 2024-03-19 平安科技(深圳)有限公司 Data similarity calculation method, device, computer equipment and storage medium
CN110221817B (en) * 2019-06-17 2023-01-17 北京酷我科技有限公司 Data recall module and recommendation system
CN110221817A (en) * 2019-06-17 2019-09-10 北京酷我科技有限公司 A kind of data recall module and recommender system
CN112395396A (en) * 2019-08-12 2021-02-23 科沃斯商用机器人有限公司 Question-answer matching and searching method, device, system and storage medium
CN110598078A (en) * 2019-09-11 2019-12-20 京东数字科技控股有限公司 Data retrieval method and device, computer-readable storage medium and electronic device
CN110598078B (en) * 2019-09-11 2022-09-30 京东科技控股股份有限公司 Data retrieval method and device, computer-readable storage medium and electronic device
CN111414527A (en) * 2020-03-16 2020-07-14 腾讯音乐娱乐科技(深圳)有限公司 Similar item query method and device and storage medium
CN111414527B (en) * 2020-03-16 2023-10-10 腾讯音乐娱乐科技(深圳)有限公司 Query method, device and storage medium for similar items
CN111581032A (en) * 2020-05-21 2020-08-25 北京字节跳动网络技术有限公司 Method, device, equipment and storage medium for operating data and rolling back data
CN111581032B (en) * 2020-05-21 2023-06-27 抖音视界有限公司 Method, device, equipment and storage medium for operating data and rolling back data
CN113792184A (en) * 2020-08-04 2021-12-14 北京沃东天骏信息技术有限公司 Advertisement recall method, device and system, computer storage medium and electronic equipment
CN112463952B (en) * 2020-12-22 2023-05-05 安徽商信政通信息技术股份有限公司 News text aggregation method and system based on neighbor search
CN112463952A (en) * 2020-12-22 2021-03-09 安徽商信政通信息技术股份有限公司 News text aggregation method and system based on neighbor search
CN115278374A (en) * 2021-04-29 2022-11-01 中移动金融科技有限公司 Video recall method and device
CN115278374B (en) * 2021-04-29 2024-05-07 中移动金融科技有限公司 Video recall method and device

Also Published As

Publication number Publication date
CN109710612B (en) 2021-05-18

Similar Documents

Publication Publication Date Title
CN109710612A (en) Vector index recalls method, apparatus, electronic equipment and storage medium
CN110046183A (en) A kind of time series data polymerization search method, equipment and medium
CN110019247A (en) Data storage and querying method, device and monitoring system
CN108664516A (en) Enquiring and optimizing method and relevant apparatus
CN107103068A (en) The update method and device of service buffer
CN109144791A (en) Data conversion storage method, apparatus and data management server
CN109299115A (en) A kind of date storage method, device, server and storage medium
CN106649687A (en) Method and device for on-line analysis and processing of large data
CN108536808A (en) A kind of data capture method and device based on Spark Computational frames
CN111488377A (en) Data query method and device, electronic equipment and storage medium
CN110858912A (en) Streaming media caching method and system, caching policy server and streaming service node
CN112905638A (en) Horn-shaped time slice processing method
CN108763323A (en) Meteorological lattice point file application process based on resource set and big data technology
CN109033173A (en) It is a kind of for generating the data processing method and device of multidimensional index data
CN108182204A (en) The processing method and processing device of data query based on house prosperity transaction multi-dimensional data
CN114090631A (en) Data query method and device, electronic equipment and storage medium
CN116089414B (en) Time sequence database writing performance optimization method and device based on mass data scene
CN107277095B (en) Session segmentation method and device
CN113297245A (en) Method and device for acquiring execution information
Wang et al. Block storage optimization and parallel data processing and analysis of product big data based on the hadoop platform
US7987181B2 (en) System and method for directing query traffic
CN110502543A (en) Device performance data storage method, device, equipment and storage medium
CN109150819B (en) A kind of attack recognition method and its identifying system
CN115525603A (en) Storage statistics method and device, computer readable storage medium and AI device
CN104077282B (en) The method and apparatus of processing data

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant