CN113343043B - Index construction method, index retrieval method, and corresponding device, terminal and medium - Google Patents

Info

Publication number
CN113343043B
Authority
CN
China
Prior art keywords
index
target
node
search
layer
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110732632.7A
Other languages
Chinese (zh)
Other versions
CN113343043A (en
Inventor
周继臣
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing QIYI Century Science and Technology Co Ltd
Original Assignee
Beijing QIYI Century Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing QIYI Century Science and Technology Co Ltd
Priority to CN202110732632.7A
Publication of CN113343043A
Application granted
Publication of CN113343043B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/901 Indexing; Data structures therefor; Storage structures
    • G06F16/9027 Trees
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/951 Indexing; Web crawling techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiments of the application provide an index construction method, a retrieval method, and a corresponding device, terminal and storage medium. The index construction method comprises the following steps: acquiring a first object and a preset index multi-way tree, wherein the first object is an object to be indexed; the depth value of the index multi-way tree is smaller than the number of index fields contained in the first object; each leaf node of the index multi-way tree stores an index structure corresponding to a second object, the second object being the plurality of objects associated with that leaf node; determining a target leaf node corresponding to the first object by taking the first layer of the index multi-way tree as the current layer; associating the first object with the target leaf node and determining the remaining index fields of the first object; and determining, in the index structure of the target leaf node, a target primary index corresponding to the number of remaining index fields of the first object, and associating the first object with the target primary index to obtain a target index multi-way tree. The index of an object can thus be configured flexibly according to service requirements, and the index is easy to extend.

Description

Index construction method, index retrieval method, and corresponding device, terminal and medium
Technical Field
The present invention relates to the field of internet technologies, and in particular, to an index construction method, a retrieval method, and corresponding devices, terminals, and storage media.
Background
With the continuous development of internet technology, internet advertisements have become popular with merchants. Such advertisements directly or indirectly promote goods or services through internet media such as websites, webpages and internet applications, in the form of text, pictures, audio, video or other forms.
When a merchant places an advertisement, the merchant defines placement conditions for it. After receiving advertisements placed by many merchants, the internet advertisement platform needs to construct an advertisement index according to those placement conditions, so that when an advertisement recall request sent by a client is received, the advertisements that satisfy the request can be found among the many advertisements and recalled. Specifically, the advertisement recall request may be regarded as a request containing target placement conditions: after receiving it, the advertisement platform retrieves the target advertisements whose placement conditions match those contained in the request and returns them to the sender of the request.
In the prior art, an advertisement index is generally implemented as a data structure of nested maps (a map whose values are themselves maps), and each layer of that structure is fixed. Such an index is therefore difficult to extend, and when there are many placement conditions the maps become deeply nested, which reduces the efficiency of subsequent retrieval through the index.
Disclosure of Invention
The invention provides an index construction method, an index retrieval method, a corresponding device, a terminal and a storage medium, so that the problem of low index expansibility in the prior art is solved to a certain extent.
According to a first aspect of the present invention, there is provided a method of constructing an index, the method comprising:
acquiring a first object and a preset index multi-way tree; the first object is an object to be indexed, and comprises a plurality of index fields, wherein each index field comprises a dimension and a dimension value; nodes of the same layer in the index multi-way tree correspond to the same index dimension, and different nodes of the same layer correspond to different index dimension values; the depth value of the index multi-way tree is smaller than the number of index fields contained in the first object; the leaf nodes of the index multi-way tree store an index structure corresponding to the second object; the second object is a plurality of objects associated with the leaf node, and the index structure comprises a plurality of primary indexes for representing the number of residual index fields of the second object; the remaining index fields refer to index fields which are not matched with index dimensions in a traversing path from the first layer of the index multi-way tree to the leaf nodes in index fields contained in the second object;
determining a target index dimension of the current layer by taking the first layer of the index multi-way tree as the current layer, searching the index fields contained in the first object for a target index field corresponding to the target index dimension, and taking the node in the current layer whose index dimension value matches the target dimension value of the target index field as the current node; when the current node is not a leaf node, taking the next layer corresponding to the current node as the current layer and continuing to execute this step until the current node is a leaf node; determining the current node as the target leaf node;
associating the first object with the target leaf node and determining a remaining index field of the first object;
and determining a target primary index matched with the number of the residual index fields of the first object from the primary indexes of the index structure, and associating the first object with the target primary index to obtain a target index multi-way tree.
According to a second aspect of the present invention, there is provided a retrieval method, the method comprising:
acquiring a retrieval request and a target index multi-way tree; the search request comprises a plurality of search terms, and each search term comprises a search dimension and a corresponding search dimension value; the target index multi-way tree is constructed according to the index construction method; the number of the search terms contained in the search request is larger than the depth value of the target index multi-way tree;
determining a target index dimension of the current layer by taking the first layer of the target index multi-way tree as the current layer, searching the retrieval request for a target search term corresponding to the target index dimension, and taking the node in the current layer that matches the target search dimension value of the target search term as the current node; when the current node is not a leaf node, taking the next layer corresponding to the current node as the current layer and continuing to execute this step until the current node is a leaf node; determining the current node as the target leaf node;
determining unmatched search terms in the search request that do not correspond to index dimensions in a traversal path from the first tier to the target leaf node;
determining a target primary index corresponding to the number of unmatched search terms from an index structure stored in the target leaf node;
and determining the object associated with the target primary index as a candidate object, and searching a target object matched with the retrieval request from the candidate objects corresponding to the target primary index.
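The retrieval steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the nested-dict tree layout, the `primary_indexes` key, the `retrieve` function and the sample data are all hypothetical, and the final filter (exact equality between an object's remaining index fields and the unmatched search terms) is one simple assumed matching rule.

```python
from typing import Dict, List

# Hypothetical target index multi-way tree: ad_zone_id layer, then platform layer.
# Each leaf stores primary indexes keyed by the number of remaining index fields.
target_tree = {
    "dimension": "ad_zone_id",
    "nodes": {
        "ad_zone_id2": {
            "next": {
                "dimension": "platform",
                "nodes": {
                    "platform1": {
                        "next": None,  # no next layer, so this node is a leaf
                        "primary_indexes": {
                            2: [("order-1", {"age": "3", "BillingType": "CPD"}),
                                ("order-2", {"age": "5", "BillingType": "CPD"})],
                            1: [("order-3", {"age": "3"})],
                        },
                    },
                },
            },
        },
    },
}


def retrieve(tree: dict, request: Dict[str, str]) -> List[str]:
    """Walk to the target leaf, pick the primary index by unmatched-term count, filter."""
    layer, matched = tree, set()
    while True:
        dimension = layer["dimension"]
        node = layer["nodes"].get(request.get(dimension))
        if node is None:          # assumption: no matching node means no result
            return []
        matched.add(dimension)
        if node["next"] is None:  # reached the target leaf node
            break
        layer = node["next"]
    unmatched = {d: v for d, v in request.items() if d not in matched}
    candidates = node["primary_indexes"].get(len(unmatched), [])
    # Assumed matching rule: remaining fields must equal the unmatched search terms.
    return [object_id for object_id, remaining in candidates if remaining == unmatched]


request = {"ad_zone_id": "ad_zone_id2", "platform": "platform1",
           "age": "3", "BillingType": "CPD"}
result = retrieve(target_tree, request)
print(result)
```

Here the request carries four search terms against a tree of depth two, so two terms remain unmatched and only the primary index for count 2 is scanned.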
According to a third aspect of the present invention, there is provided an index building apparatus, the apparatus comprising:
The first acquisition module is used for acquiring a first object and a preset index multi-way tree, wherein the first object is an object to be indexed, the first object comprises a plurality of index fields, and each index field comprises a dimension and a dimension value; nodes of the same layer in the index multi-way tree correspond to the same index dimension, and different nodes of the same layer correspond to different index dimension values; the depth value of the index multi-way tree is smaller than the number of index fields contained in the first object; the leaf nodes of the index multi-way tree store an index structure corresponding to the second object; the second object is a plurality of objects associated with the leaf node, and the index structure comprises a plurality of primary indexes for representing the number of residual index fields of the second object; the remaining index fields refer to index fields which are not matched with index dimensions in a traversing path from the first layer of the index multi-way tree to the leaf nodes in index fields contained in the second object;
the target node determining module is configured to determine a target index dimension of the current layer by using a first layer of the index multi-way tree as a current layer, search a target index field corresponding to the target index dimension from index fields included in the first object, and use a node in the current layer, in which an index dimension value matches with a target dimension value of the target index field, as a current node, and when the current node is not a leaf node, use a next layer corresponding to the current node as the current layer, and continue to execute the step until the current node is a leaf node; determining the current node as a target leaf node;
the first association module is used for associating the first object with the target leaf node and determining the remaining index fields of the first object;
and the second association module is used for determining a target primary index matched with the number of the residual index fields of the first object from the primary indexes of the index structure, and associating the first object with the target primary index to obtain a target index multi-way tree.
According to a fourth aspect of the present invention, there is provided a retrieval device, the device comprising:
the second acquisition module is used for acquiring the search request and the target index multi-way tree; the search request comprises a plurality of search terms, and each search term comprises a search dimension and a corresponding search dimension value; the target index multi-way tree is constructed according to the index construction method; the number of the search terms contained in the search request is larger than the depth value of the target index multi-way tree;
the search node determining module is used for determining a target index dimension of the current layer by taking a first layer of the target index multi-way tree as the current layer, searching a target search item corresponding to the target index dimension from the search request, taking a node matched with a target search dimension value of the target search item in the current layer as the current node, and continuously executing the step until the current node is a leaf node by taking a next layer corresponding to the current node as the current layer when the current node is not a leaf node; determining the current node as a target leaf node;
the first determining module is configured to determine the unmatched search terms in the search request that do not correspond to index dimensions in the traversal path from the first layer to the target leaf node;
the second determining module is used for determining target primary indexes corresponding to the number of unmatched retrieval items from the index structure stored in the target leaf node;
and the third determining module is used for determining the object associated with the target primary index as a candidate object and searching a target object matched with the retrieval request from the candidate objects corresponding to the target primary index.
According to a fifth aspect of the present invention, there is provided a terminal comprising: a processor, a memory, and a computer program stored on the memory and executable on the processor, wherein the computer program, when executed by the processor, implements the index construction method of the first aspect or the retrieval method of the second aspect.
According to a sixth aspect of the present invention, there is provided a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the index construction method of the first aspect or the retrieval method of the second aspect.
Compared with the background art, the embodiment of the invention has the following advantages:
the embodiment of the invention provides a method for constructing an index, which comprises the steps of obtaining a first object and a preset index multi-way tree; the depth value of the index multi-way tree is smaller than the number of index fields contained in the first object; the leaf nodes of the index multi-way tree store an index structure corresponding to a second object, wherein the second object is a plurality of objects associated with the leaf nodes; determining a target leaf node corresponding to a first object by taking a first layer of the index multi-way tree as a current layer; associating the first object with the target leaf node and determining the remaining index fields in the first object; and determining a target primary index corresponding to the number of the residual index fields of the first object in the index structure of the target leaf node, and associating the first object with the target primary index to obtain a target index multi-way tree. The index of the object can be flexibly configured according to the service requirement, and the expansibility is good.
The embodiment of the invention provides a retrieval method, which comprises: obtaining a retrieval request and a target index multi-way tree, the target index multi-way tree being constructed by the index construction method described above, and the number of search terms contained in the retrieval request being larger than the depth value of the target index multi-way tree; traversing from the first layer of the target index multi-way tree toward the leaf nodes, with the first layer as the initial current layer, to determine the target leaf node corresponding to the retrieval request; determining, in the target leaf node, the target primary index corresponding to the number of unmatched search terms in the retrieval request; and searching the candidate objects corresponding to the target primary index for a target object matching the retrieval request. A target object satisfying the conditions can thus be retrieved quickly. When the object is an advertisement order item and the retrieval request is an advertisement recall request, the target advertisement corresponding to the advertisement recall request can be recalled quickly by the retrieval method provided by the embodiment of the invention.
The foregoing is only an overview of the technical solution of the present invention. Specific embodiments of the invention are set out below so that its technical means can be understood more clearly and implemented in accordance with the description, and so that the above and other objects, features and advantages of the invention become more readily apparent.
Drawings
Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for purposes of illustrating the preferred embodiments and are not to be construed as limiting the invention. Also, like reference numerals are used to designate like parts throughout the figures. In the drawings:
FIG. 1 is a flowchart illustrating steps of a method for constructing an index according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of an index multi-way tree in a method for constructing an index according to an embodiment of the present invention;
FIG. 3 is an equivalent structural schematic diagram of a portion of the structure of FIG. 2;
FIG. 4 is a schematic diagram of an index multi-way tree structure according to a specific example in a method for constructing an index according to an embodiment of the present invention;
FIG. 5 is a flowchart illustrating steps of a search method according to an embodiment of the present invention;
FIG. 6 is a schematic block diagram of an index building apparatus according to an embodiment of the present invention;
FIG. 7 is a schematic block diagram of a search device according to an embodiment of the present invention.
Detailed Description
Exemplary embodiments of the present invention will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present invention are shown in the drawings, it should be understood that the present invention may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the invention to those skilled in the art.
Referring to FIG. 1, a flowchart of the steps of an index construction method according to an embodiment of the present invention is shown. The method is used to construct an index for objects to be indexed, so that a target object matching a retrieval request can be found quickly through the index. The method may be executed by any system, apparatus or device with computing and processing capabilities. The objects may include merchandise, advertisements and the like. This embodiment is described taking an advertisement order item as the object.
The method provided by the embodiment specifically comprises the following steps:
Step 101, acquiring a first object and a preset index multi-way tree, wherein the first object is an object to be indexed, and comprises a plurality of index fields, and each index field comprises a dimension and a dimension value; nodes of the same layer in the index multi-way tree correspond to the same index dimension, and different nodes of the same layer correspond to different index dimension values; the depth value of the index multi-way tree is smaller than the number of index fields contained in the first object; the leaf nodes of the index multi-way tree store an index structure corresponding to a second object, the second object being a plurality of objects associated with the leaf nodes.
The index structure includes a plurality of primary indexes for characterizing the number of remaining index fields of the second object; the remaining index field refers to an index field which is not matched with an index dimension in a traversing path from a first layer of the index multi-way tree to a leaf node in the index fields contained in the second object; it will be appreciated that the number of remaining index fields of an object associated with the same primary index is the same.
In the advertising field, the advertisement platform receives advertisements placed by merchants, which may take the form of text, pictures, video, audio and the like. When placing an advertisement on the platform, a merchant sets targeting conditions for it, for example the region in which the advertisement is to be delivered and the age of the intended users. The advertisement platform also sets targeting conditions for the advertisement, generally according to how the merchant pays for the advertisement, the available advertisement display positions, and so on. The available display positions include a start screen advertisement display position, a home page scrolling advertisement display position, an advertisement display position shown during page jumps, and the like.
A file associated with an advertisement may be generated in connection with targeting conditions, which may be understood as an advertisement order item, i.e., an object to be indexed, in an embodiment of the present invention. Thus, the advertisement order item and the advertisement are in one-to-one correspondence, i.e., after one advertisement order item is determined, one advertisement can be uniquely determined; and, when an advertisement is determined, an advertisement order item may also be uniquely determined.
The data representing the targeting conditions in an advertisement order item is usually held in special-purpose fields, which are inconvenient to extend and make a configurable index difficult to implement. A field representing a targeting condition can be understood as an index field in the embodiment of the present invention; that is, the advertisement order item contains index fields. Each advertisement order item therefore needs to be processed as follows:
encoding the index fields in the advertisement order item to obtain the dimension and the dimension value corresponding to each index field.
In this embodiment, when creating the index for an advertisement order item, the advertisement order item is acquired first, and its index fields are then encoded to obtain the dimension and dimension value corresponding to each index field. An index field may be expressed in the form <key, value>, where key is the dimension and value is the dimension value. For example, the targeting condition corresponding to <age, 3> is users aged 3, where age is the dimension and 3 (years) is the dimension value; as another example, the targeting condition corresponding to <BillingType, CPD> is that the billing type is CPD (cost per day). Encoding the index fields uniformly facilitates later extension and makes the index configurable.
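The uniform <key, value> encoding can be illustrated with a short sketch. The raw order-item layout, the `targeting` key and the `encode_index_fields` function are hypothetical names chosen for illustration; the source only specifies that each index field becomes a dimension and a dimension value.

```python
from typing import List, Tuple

# Hypothetical raw advertisement order item with ad hoc targeting fields.
raw_order_item = {
    "id": "order-1",
    "targeting": {"age": 3, "BillingType": "CPD"},
}


def encode_index_fields(order_item: dict) -> List[Tuple[str, str]]:
    """Encode every targeting condition as a uniform (dimension, dimension value) pair."""
    return [(str(key), str(value)) for key, value in order_item["targeting"].items()]


index_fields = encode_index_fields(raw_order_item)
print(index_fields)
```

Because every condition ends up in the same shape, a new targeting dimension only adds another pair and requires no change to the index code.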
In this embodiment, a preset configuration file may be established in advance and used to create the preset index multi-way tree. The preset configuration file contains the index dimension corresponding to each layer of the index multi-way tree. Each layer of the index multi-way tree may contain multiple nodes; nodes in the same layer correspond to the same index dimension, and different nodes in the same layer correspond to different index dimension values. The preset configuration file may be established from the dimensions contained in the received advertisement order items or from practical experience; dimensions with a higher degree of discrimination are generally selected to build the index multi-way tree.
In order to improve efficiency of index construction and efficiency of later retrieval, in the embodiment of the present invention, a depth of an index multi-way tree is smaller than the number of index fields contained in a first object. And, a leaf node of the index multi-way tree stores index structures corresponding to a plurality of objects (second objects) associated with the leaf node. That is, the plurality of objects associated with the leaf node of the index multi-way tree in the embodiment of the present invention may further construct an index structure, and store the constructed index structure in the leaf node. Through the combination of a plurality of index modes, the efficiency of index construction and the efficiency of later retrieval can be improved.
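As a rough illustration of the configuration and the depth constraint, the sketch below models the preset configuration as an ordered list of layer dimensions and checks that the tree depth is smaller than the number of index fields of an object. The `preset_config` layout, the `depth_is_valid` function and the sample values are assumptions; the source does not specify a file format.

```python
from typing import Dict

# Hypothetical preset configuration: one index dimension per layer, first layer first.
preset_config = {"layers": ["ad_zone_id", "platform", "location"]}


def depth_is_valid(config: dict, index_fields: Dict[str, str]) -> bool:
    """Return True when the tree depth is smaller than the object's index field count."""
    return len(config["layers"]) < len(index_fields)


# Hypothetical encoded order item with five index fields.
order_item_fields = {
    "ad_zone_id": "ad_zone_id2",
    "platform": "platform1",
    "location": "location1",
    "age": "3",
    "BillingType": "CPD",
}
depth_ok = depth_is_valid(preset_config, order_item_fields)
print(depth_ok)
```

With three layers and five index fields, two fields are left over for the index structure stored in the leaf node.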
Referring to FIG. 2, a schematic structural diagram of an index multi-way tree according to an embodiment of the present invention is shown.
In the index multi-way tree of this example, the index dimension corresponding to the first layer (the layer corresponding to the root node) is ad_zone_id. The first layer contains a plurality of nodes, each corresponding to one index dimension value of ad_zone_id. Specifically, the first layer contains an ad_zone_id1 node, an ad_zone_id2 node and an ad_zone_id3 node, whose index dimension values are ad_zone_id1, ad_zone_id2 and ad_zone_id3 respectively. When the ad_zone_id index dimension has index dimension values other than ad_zone_id1, ad_zone_id2 and ad_zone_id3, corresponding nodes may be added to the first layer, which improves the scalability of the index.
For the ad_zone_id1 node, the index dimension of the next layer is ad_release_type, and that layer contains an ad_release_type1 node and an ad_release_type2 node. The index dimension of the layer below the ad_release_type1 node is location, and the index dimension of the layer below the ad_release_type2 node is also location.
For the ad_zone_id2 node and the ad_zone_id3 node, the index dimension of the next layer is platform. The layer whose index dimension is platform contains a platform1 node, a platform2 node and a platform3 node. For each of the platform1, platform2 and platform3 nodes, the index dimension of the next layer is location.
It should be noted that when the layers below different nodes have the same index dimension, those layers are independent of each other. For example, in FIG. 2 the next layer of both the ad_zone_id2 node and the ad_zone_id3 node has the index dimension platform; this means that the next layer on the branch of the ad_zone_id2 node has the index dimension platform, and the next layer on the branch of the ad_zone_id3 node separately has the index dimension platform. FIG. 3 is another representation of how the layer whose index dimension is platform in FIG. 2 is associated with the ad_zone_id2 node and the ad_zone_id3 node.
When a node does not have a corresponding next layer, the node is determined as a leaf node.
The number of layers of the index multi-way tree and the index dimension and index dimension value corresponding to each layer can be freely adjusted according to actual conditions, and the expansibility is good.
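The structure described for FIG. 2 can be sketched as a small data structure. This is an illustrative sketch only: the `Node` class, the `add_child` helper, the virtual `root` holder for the first layer and the `location1` value are assumptions introduced here, since the source does not name the nodes of the location layer.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Node:
    value: str                             # index dimension value of this node
    child_dimension: Optional[str] = None  # index dimension of the next layer, if any
    children: Dict[str, "Node"] = field(default_factory=dict)

    def is_leaf(self) -> bool:
        # A node without a corresponding next layer is a leaf node.
        return self.child_dimension is None


def add_child(parent: Node, value: str, child_dimension: Optional[str] = None) -> Node:
    """Add (or fetch) the child node for one index dimension value."""
    return parent.children.setdefault(value, Node(value, child_dimension))


# Virtual holder for the first layer, whose index dimension is ad_zone_id.
root = Node("root", child_dimension="ad_zone_id")

zone1 = add_child(root, "ad_zone_id1", "ad_release_type")
for release_type in ("ad_release_type1", "ad_release_type2"):
    branch = add_child(zone1, release_type, "location")
    add_child(branch, "location1")  # hypothetical location value; becomes a leaf

for zone in ("ad_zone_id2", "ad_zone_id3"):
    zone_node = add_child(root, zone, "platform")  # each zone gets its own platform layer
    for platform in ("platform1", "platform2", "platform3"):
        branch = add_child(zone_node, platform, "location")
        add_child(branch, "location1")

print(sorted(root.children))
```

Adding a new ad_zone_id value is a single `add_child(root, ...)` call, which reflects the extensibility the description attributes to the tree.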
Step 102, determining a target index dimension of a current layer by taking a first layer of an index multi-way tree as the current layer, searching a target index field corresponding to the target index dimension from index fields contained in a first object, taking a node in the current layer, of which the index dimension value is matched with the target dimension value of the target index field, as a current node, and when the current node is not a leaf node, taking a next layer corresponding to the current node as the current layer, and continuing to execute the step until the current node is the leaf node; and determining the current node as a target leaf node.
In this embodiment, after the index multi-way tree is created, traversal starts from its first layer. According to the index dimension of the first layer, the matching dimension is looked up in the index fields of the first object, and the dimension value corresponding to that dimension is determined; the node of the first layer that matches this dimension value is then taken as the current node. Next, the layer below the current node (the second layer) is taken as the current layer: the dimension matching the index dimension of the current layer is looked up in the index fields of the first object, and the node of that layer matching the corresponding dimension value becomes the current node. This is repeated until the current node is a leaf node of the index multi-way tree, at which point that leaf node is determined as the target leaf node.
When the first object is an advertisement order item, starting from the first layer of the index multi-way tree, traversing to the direction of the leaf nodes according to the dimension and the dimension value contained in the advertisement order item, and determining the corresponding leaf nodes.
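The layer-by-layer traversal of step 102 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation of the embodiment: the `Node` class, the virtual root placed above the first layer, and the function name `find_target_leaf` are all introduced here for illustration. The small tree built at the end mirrors the example tree of fig. 4.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Node:
    """Hypothetical tree node; a node without child_dimension is a leaf."""
    child_dimension: Optional[str] = None  # index dimension of the layer below
    children: Dict[str, "Node"] = field(default_factory=dict)  # keyed by dimension value


def find_target_leaf(root: Node, fields: Dict[str, str]) -> Tuple[Optional[Node], List[str]]:
    """Descend from the (assumed) virtual root to a leaf node.

    Returns the leaf and the index dimensions on the traversal path, or
    (None, path) if the object has no matching node in some layer.
    """
    current: Node = root
    path: List[str] = []
    while current.child_dimension is not None:
        dimension = current.child_dimension
        child = current.children.get(fields.get(dimension))
        if child is None:
            return None, path
        path.append(dimension)
        current = child
    return current, path


# Tree of fig. 4: KeyO -> KeyP -> KeyQ (first branch) or KeyR (second branch).
target = Node()
second_layer_value1 = Node("KeyQ", {"Value1": Node(), "Value2": target, "Value3": Node()})
second_layer_value2 = Node("KeyR", {"Value1": Node(), "Value2": Node()})
first_layer_value1 = Node("KeyP", {"Value1": second_layer_value1, "Value2": second_layer_value2})
root = Node("KeyO", {"Value1": first_layer_value1})

obj = {"KeyO": "Value1", "KeyP": "Value1", "KeyQ": "Value2", "Key1": "Value1", "Key2": "Value2"}
leaf, path = find_target_leaf(root, obj)
```

Returning the path alongside the leaf is a design choice of this sketch: the dimensions on the path are exactly what later has to be excluded when the remaining index fields are determined.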
Step 103, associating the first object with the target leaf node and determining the remaining index field of the first object.
The remaining index fields refer to index fields which are not matched with index dimensions in a traversing path from the first layer to the target leaf node in index fields contained in the first object.
In this embodiment, when a leaf node (target leaf node) corresponding to a first object is found, the first object is associated with the corresponding target leaf node; at this time, the plurality of index fields of the first object also have index fields that do not match the index dimensions in the traversal path from the first layer to the target leaf node, i.e., the remaining index fields. The remaining index field of the first object is determined for use in subsequent steps.
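Determining the remaining index fields amounts to removing the dimensions consumed on the traversal path. A minimal sketch, assuming (for illustration only) that index fields are held in a dict keyed by dimension and that the traversal path is a list of dimension names:

```python
from typing import Dict, Iterable


def remaining_index_fields(fields: Dict[str, str], path_dimensions: Iterable[str]) -> Dict[str, str]:
    """Return the index fields whose dimension was not used on the traversal path."""
    used = set(path_dimensions)
    return {dim: value for dim, value in fields.items() if dim not in used}


fields = {"KeyO": "Value1", "KeyP": "Value1", "KeyQ": "Value2", "Key1": "Value1", "Key2": "Value2"}
remaining = remaining_index_fields(fields, ["KeyO", "KeyP", "KeyQ"])
```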
Step 104, determining, from the primary indexes of the index structure, a target primary index matching the number of remaining index fields of the first object, and associating the first object with the target primary index to obtain a target index multi-way tree.
In this embodiment, each leaf node stores an index structure corresponding to the second object. Specifically, in step 103 described above, the first object is associated with the corresponding target leaf node, and thus, for each leaf node, its associated object, i.e., the second object, may be determined.
In practical application, the greater the depth of the index multi-way tree, the more complex it is to create the preset configuration file. To reduce this difficulty, relatively general dimensions are usually selected to create the index multi-way tree, and the tree may be created in combination with service requirements, so that the index dimensions of the index multi-way tree discriminate well between objects.
Illustratively, for advertisement order items, the billing type dimension may be assigned to one of the layers of the index multi-way tree.
Generally, each leaf node is associated with a large number of second objects. To improve the efficiency of subsequently retrieving target objects, this embodiment further constructs an index structure for the second objects. The index structure includes a plurality of primary indexes, each of which corresponds to a number of remaining index fields of the second objects of the leaf node; that is, objects corresponding to the same primary index have the same number of remaining index fields.
Illustratively, when a large number of advertisement order items are processed, a leaf node of the index multi-way tree may be associated with many advertisement order items; for this reason, an index structure is built for the advertisement order items associated with each leaf node according to their numbers of remaining dimensions.
For example, one leaf node associates m ad orders, where there are n1 ad orders with a number of remaining dimensions a, n2 ad orders with a number of remaining dimensions b, n3 ad orders with a number of remaining dimensions c, and n1+n2+n3=m.
At this time, the index structure constructed according to m advertisement order items may include a primary index a, a primary index b, and a primary index c; wherein, the primary index a is associated with n1 advertisement order items with the number of the remaining dimensions a; the primary index b is associated with n2 advertisement order items with the number of the remaining dimensions b; the primary index c associates n3 advertisement order items of the number c of remaining dimensions. Thus, the m advertisement order items associated with the leaf nodes are further grouped, and the efficiency of subsequent retrieval can be improved.
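The grouping described above can be sketched as follows. The order names and their remaining dimensions are made up for illustration; only the grouping rule (one primary index per distinct number of remaining dimensions) comes from the text.

```python
from collections import defaultdict
from typing import Dict, List


def build_primary_indexes(remaining_by_order: Dict[str, Dict[str, str]]) -> Dict[int, List[str]]:
    """Group the orders of one leaf node by their number of remaining dimensions."""
    primary_indexes = defaultdict(list)
    for order, remaining in remaining_by_order.items():
        primary_indexes[len(remaining)].append(order)
    return dict(primary_indexes)


# Hypothetical leaf node with m = 4 advertisement order items.
orders = {
    "order1": {"Key1": "Value1"},
    "order2": {"Key1": "Value1", "Key2": "Value2"},
    "order3": {"Key3": "Value3", "Key5": "Value5"},
    "order4": {"Key1": "Value1", "Key2": "Value2", "Key3": "Value3"},
}
primary_indexes = build_primary_indexes(orders)
```

Every order lands in exactly one group, so the group sizes add up to m, matching n1+n2+n3=m in the text.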
Since there may not be a primary index corresponding to the number of remaining index fields of the first object in the index structure during the process of constructing the index, in an alternative embodiment, determining the target primary index corresponding to the number of remaining index fields of the first object in the index structure, and associating the first object with the target primary index includes:
Searching whether a target primary index corresponding to the number of the residual index fields of the first object exists in an index structure of the target leaf node;
if not, generating a target primary index corresponding to the number of the residual index fields of the first object in the index structure, and associating the first object with the target primary index;
if so, the first object is associated with the target primary index.
The set of primary indexes in the index structure of this embodiment is not fixed; primary indexes are generated on demand, which improves the extensibility of the index.
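The look-up-or-create behaviour of this optional embodiment might look like the following sketch. Representing the index structure as a dict from the number of remaining index fields to a list of object names is an assumption made for illustration.

```python
from typing import Dict, List


def associate_with_primary_index(index_structure: Dict[int, List[str]],
                                 object_name: str,
                                 remaining_count: int) -> List[str]:
    """Associate an object with the primary index for its remaining-field count.

    The primary index is generated first if the leaf node does not have it yet.
    """
    if remaining_count not in index_structure:
        index_structure[remaining_count] = []  # generate the missing primary index
    target_primary_index = index_structure[remaining_count]
    target_primary_index.append(object_name)
    return target_primary_index


index_structure: Dict[int, List[str]] = {1: ["Ad1"]}
associate_with_primary_index(index_structure, "Ad2", 2)  # primary index 2 is created
associate_with_primary_index(index_structure, "Ad3", 2)  # primary index 2 is reused
```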
Further, the objects associated with one primary index share the same number of remaining index fields, but the specific contents of their remaining index fields are not necessarily the same. Therefore, in an alternative implementation of this embodiment, a primary index may include multiple index retrieval items corresponding to the remaining index fields of the objects associated with that primary index. Specifically, the method further comprises the following steps:
determining a target index retrieval item matched with the residual index field of the first object;
the first object is associated with a target index retrieval item.
In this embodiment, after determining the target primary index corresponding to the first object, each remaining index field of the first object is obtained, the target index search item corresponding to each remaining index field is sequentially searched, and the first object is associated with each searched target index search item.
In the process of constructing the index, the target primary index may not yet contain an index retrieval item matching a remaining index field of the first object, so in an alternative embodiment, determining the target index retrieval item matching the remaining index field of the first object includes:
searching whether a target index retrieval item matched with the residual index field exists in the target primary index;
if not, generating a target index retrieval item matched with the residual index field in the target primary index.
The efficiency of subsequent retrieval can be further improved by associating the first object with the target index retrieval item matched with the remaining index fields of the first object; the index multi-way tree associated with the first object is determined as a target index multi-way tree.
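Within one primary index, the index retrieval items behave like a small inverted index from a remaining index field to the objects that contain it. A sketch under that assumption (the dict layout and the function name are illustrative, not taken from the source):

```python
from typing import Dict, List, Tuple

RetrievalItem = Tuple[str, str]  # (dimension, dimension value)


def associate_with_retrieval_items(primary_index: Dict[RetrievalItem, List[str]],
                                   object_name: str,
                                   remaining: Dict[str, str]) -> None:
    """Associate the object with the retrieval item of each remaining index field."""
    for item in remaining.items():
        # setdefault generates the retrieval item when it does not exist yet.
        primary_index.setdefault(item, []).append(object_name)


primary_index: Dict[RetrievalItem, List[str]] = {}
associate_with_retrieval_items(primary_index, "Ad2", {"Key1": "Value1", "Key2": "Value2"})
associate_with_retrieval_items(primary_index, "Ad3", {"Key1": "Value1", "Key3": "Value3"})
```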
In order for those skilled in the art to better understand the present solution, embodiments of the present invention will be illustrated below with reference to the following examples, but it should be understood that the embodiments of the present invention are not limited thereto.
Fig. 4 shows the pre-constructed index multi-way tree of this example. The depth of the index multi-way tree is 3, the index dimension corresponding to the first layer is KeyO, and the first layer contains a Value1 node. The index dimension corresponding to the next layer (the second layer) under the Value1 node is KeyP, and the second layer contains a Value1 node and a Value2 node.
The index dimension corresponding to the next layer (the third layer of the first branch) corresponding to the Value1 node of the second layer is KeyQ, and the nodes contained in the layer (the third layer of the first branch) include the Value1 node, the Value2 node and the Value3 node. Since the third layer of the first branch is the last layer, the Value1 node, the Value2 node and the Value3 node of the third layer of the first branch are all leaf nodes.
The index dimension corresponding to the next layer (the third layer of the second branch) corresponding to the Value2 node of the second layer is KeyR, and the nodes contained in the layer (the third layer of the second branch) include the Value1 node and the Value2 node. Since the third layer of the second branch is the last layer, the Value1 node and the Value2 node of the third layer of the second branch are leaf nodes.
Suppose the index fields contained in the object currently to be indexed are {<KeyO, Value1>, <KeyP, Value1>, <KeyQ, Value2>, <Key1, Value1>, <Key2, Value2>}. Traversal starts from the first layer of the index multi-way tree, whose index dimension is KeyO. The index field corresponding to the index dimension KeyO, namely <KeyO, Value1>, is found among the index fields, and the current node is determined to be the Value1 node of the first layer according to the dimension value Value1 of the dimension KeyO. Since the current node is not a leaf node, the next layer (the second layer) corresponding to the current node is taken as the current layer. Its index dimension is KeyP; the index field <KeyP, Value1> is found among the index fields, and the current node is determined to be the Value1 node of the second layer according to the dimension value Value1 of the dimension KeyP. Since the current node is still not a leaf node, the next layer (the third layer of the first branch) is taken as the current layer. Its index dimension is KeyQ; the index field <KeyQ, Value2> is found among the index fields, and the current node is determined to be the Value2 node of the third layer of the first branch according to the dimension value Value2 of the dimension KeyQ. This node is a leaf node, so it is determined to be the target leaf node, and the object to be indexed is associated with it.
At this point, the remaining index fields of the object currently to be indexed are {<Key1, Value1>, <Key2, Value2>}, two in total; the object is therefore associated with the primary index that corresponds to 2 remaining index fields in the index structure of the target leaf node.
Suppose the objects (second objects) already associated with the target leaf node are Ad1, Ad2, Ad3, Ad4, Ad5, Ad6 and Ad7, and that the remaining index fields of each object are as follows:
Ad1:<Key1,Value1>
Ad2:<Key1,Value1>,<Key2,Value2>
Ad3:<Key1,Value1>,<Key3,Value3>
Ad4:<Key1,Value1>,<Key4,Value4>
Ad5:<Key3,Value3>,<Key5,Value5>
Ad6:<Key1,Value1>,<Key2,Value2>,<Key3,Value3>
Ad7:<Key1,Value1>,<Key2,Value2>,<Key4,Value4>
Since Ad1 has 1 remaining index field, Ad2, Ad3, Ad4 and Ad5 each have 2, and Ad6 and Ad7 each have 3, the index structure of the target leaf node contains 3 primary indexes: a first primary index corresponding to 1 remaining index field, a second primary index corresponding to 2, and a third primary index corresponding to 3. The primary indexes contained in the index structure are as follows:
Primary index name | Associated objects
First primary index | Ad1
Second primary index | Ad2, Ad3, Ad4, Ad5
Third primary index | Ad6, Ad7
Since the object to be indexed (named Ad8 here) has 2 remaining index fields, the target primary index corresponding to the number of remaining index fields of Ad8 is the second primary index, and Ad8 is associated with it. The primary indexes contained in the index structure are then as follows:
Primary index name | Associated objects
First primary index | Ad1
Second primary index | Ad2, Ad3, Ad4, Ad5, Ad8
Third primary index | Ad6, Ad7
Before Ad8 is associated, the second primary index is associated with Ad2, Ad3, Ad4 and Ad5. From the remaining index fields of Ad2, Ad3, Ad4 and Ad5, the index retrieval items in the second primary index can be represented as follows:
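[Table image BDA0003139620630000141: index retrieval items of the second primary index and the objects associated with each item, before Ad8 is added]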
Since the remaining index fields of Ad8 are <Key1, Value1> and <Key2, Value2>, the target index retrieval items are determined to be <Key1, Value1> and <Key2, Value2>, and Ad8 is associated with these target index retrieval items, giving the following result:
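[Table image BDA0003139620630000142: index retrieval items of the second primary index and the objects associated with each item, after Ad8 is added]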
The index structure of the target leaf node and its associated objects are then as follows:
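[Table images BDA0003139620630000143 and BDA0003139620630000151: index structure of the target leaf node, listing each primary index, its index retrieval items, and the associated objects]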
If Ad8 is the first object to be indexed, then when its target leaf node is found in the pre-constructed index multi-way tree of fig. 4, the index structure of that leaf node may not yet contain any primary index. In this case, a primary index is generated according to the number of remaining index fields of Ad8, and the index retrieval items under that primary index are generated according to the contents of the remaining index fields of Ad8.
Indexes for the other objects to be indexed can be built in the same way, yielding a target index multi-way tree associated with a plurality of first objects. The target index multi-way tree is shallow, and each leaf node stores a further index structure, which makes the index easy to extend; combining the multi-way tree with this second index structure improves the efficiency of subsequent retrieval using the target index multi-way tree.
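To make the worked example concrete, the sketch below rebuilds the index structure of the target leaf node from the remaining index fields listed above for Ad1 to Ad7 and then adds Ad8. The nested-dict layout (number of remaining index fields, then index retrieval item, then list of objects) and the function name `add_object` are assumptions for illustration.

```python
from typing import Dict, List, Tuple

RetrievalItem = Tuple[str, str]
IndexStructure = Dict[int, Dict[RetrievalItem, List[str]]]


def add_object(index_structure: IndexStructure, name: str, remaining: List[RetrievalItem]) -> None:
    """Associate an object with its primary index and retrieval items, creating them on demand."""
    primary_index = index_structure.setdefault(len(remaining), {})
    for item in remaining:
        primary_index.setdefault(item, []).append(name)


remaining_fields = {
    "Ad1": [("Key1", "Value1")],
    "Ad2": [("Key1", "Value1"), ("Key2", "Value2")],
    "Ad3": [("Key1", "Value1"), ("Key3", "Value3")],
    "Ad4": [("Key1", "Value1"), ("Key4", "Value4")],
    "Ad5": [("Key3", "Value3"), ("Key5", "Value5")],
    "Ad6": [("Key1", "Value1"), ("Key2", "Value2"), ("Key3", "Value3")],
    "Ad7": [("Key1", "Value1"), ("Key2", "Value2"), ("Key4", "Value4")],
    "Ad8": [("Key1", "Value1"), ("Key2", "Value2")],  # the object being indexed
}

index_structure: IndexStructure = {}
for name, remaining in remaining_fields.items():
    add_object(index_structure, name, remaining)

# Print one line per (primary index, retrieval item) pair.
for count, primary_index in sorted(index_structure.items()):
    for item, objects in primary_index.items():
        print(count, item, objects)
```

Because the same function creates missing primary indexes and retrieval items, it also covers the case where Ad8 is the first object to reach an empty leaf node.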
The embodiment of the invention provides a method for constructing an index, which comprises: obtaining a first object and a preset index multi-way tree, where the depth value of the index multi-way tree is smaller than the number of index fields contained in the first object, and each leaf node of the index multi-way tree stores an index structure corresponding to second objects, the second objects being the plurality of objects associated with that leaf node; determining a target leaf node corresponding to the first object, starting with the first layer of the index multi-way tree as the current layer; associating the first object with the target leaf node and determining the remaining index fields of the first object; and determining, in the index structure of the target leaf node, a target primary index corresponding to the number of remaining index fields of the first object, and associating the first object with the target primary index to obtain a target index multi-way tree. The index of an object can thus be configured flexibly according to service requirements, and the index is easy to extend.
Referring to fig. 5, a schematic flow chart of the steps of a retrieval method according to an embodiment of the present invention is shown, where the method is used to determine, from a plurality of objects, a target object that satisfies a search request. The method may be executed by a system, apparatus, or device with computing and processing capabilities. The present embodiment is illustratively described as applied to an advertising platform, for retrieving from a plurality of advertisement order items an advertisement order item that satisfies an advertisement recall request.
The method provided by the embodiment specifically comprises the following steps:
Step 501, obtaining a search request and a target index multi-way tree; the search request comprises a plurality of search terms, each search term comprising a search dimension and a corresponding search dimension value; the target index multi-way tree is obtained through the embodiment of the index construction method; the number of search terms contained in the search request is greater than the depth value of the target index multi-way tree.
According to the embodiment of the method for constructing the index, nodes of the same layer in the target index multi-way tree correspond to the same index dimension, and different nodes of the same layer correspond to different index dimension values.
In this embodiment, the search request includes a plurality of search terms for indicating to find a target object matching the plurality of search terms from among the plurality of objects.
When the search request is an advertisement recall request, the objects are pre-stored advertisement order items. The advertisement recall request may be initiated by a client. For example, a client application of a user may include advertisement slots for displaying advertisements. When the user opens the client application, or browses content in it, the client may send an advertisement recall request to the advertisement platform to obtain matching advertisement content for display in the corresponding advertisement slot.
The advertisement recall request carries traffic attribute information corresponding to the targeting conditions. Because this traffic attribute information is generally represented by specific fields, it is difficult to configure an index over it directly. The advertisement recall request therefore needs to be processed, which includes:
encoding the traffic attribute information according to a preset rule to obtain the search dimension and search dimension value corresponding to the traffic attribute information, that is, a search term.
In this embodiment, when advertisement retrieval is performed, the traffic attribute information obtained from the advertisement recall request is encoded according to a preset rule to obtain the corresponding search terms, each of which may take the form <key, value>. Uniformly encoding the traffic attribute information makes it possible to configure the index.
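The source does not spell out the preset rule, so the following sketch only illustrates the idea with a hypothetical rule: a lookup table that maps raw traffic attribute names to search dimensions, with values normalized to strings. All attribute names, the mapping itself, and the function name are invented for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical preset rule: raw attribute name -> search dimension (key).
PRESET_RULE: Dict[str, str] = {
    "platform": "KeyO",
    "region": "KeyP",
    "billing_type": "KeyQ",
}


def encode_traffic_attributes(attributes: Dict[str, object]) -> List[Tuple[str, str]]:
    """Encode raw traffic attributes into <key, value> search terms.

    Attributes without an entry in the preset rule are skipped.
    """
    return [(PRESET_RULE[name], str(value))
            for name, value in attributes.items()
            if name in PRESET_RULE]


search_terms = encode_traffic_attributes({"platform": "ios", "region": 86, "debug": True})
```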
In this embodiment, the target index multi-way tree corresponding to the plurality of objects is created by the index construction method provided in the foregoing embodiment; reference may be made to the foregoing description, which is not repeated here. It follows that nodes of the same layer in the target index multi-way tree correspond to the same index dimension, and different nodes of the same layer correspond to different index dimension values; the depth value of the target index multi-way tree is smaller than the number of search terms contained in the search request.
Step 502, taking the first layer of the target index multi-way tree as the current layer, determining the target index dimension of the current layer, searching the search request for a target search term corresponding to the target index dimension, and taking the node in the current layer whose index dimension value matches the target search dimension value of the target search term as the current node; when the current node is not a leaf node, taking the next layer corresponding to the current node as the current layer and continuing to execute this step until the current node is a leaf node; and determining the current node as the target leaf node.
In this embodiment, after the target index multi-way tree is obtained, it is traversed starting from the first layer. According to the index dimension of the first layer, the matching search dimension is looked up among the search terms of the search request, and the search dimension value corresponding to that search dimension is determined; the node of the first layer that matches this search dimension value becomes the current node. The next layer (the second layer) under the current node is then taken as the current layer, and the same procedure is repeated: the search dimension matching the index dimension of the current layer is looked up among the search terms of the search request, the corresponding search dimension value is determined, and the matching node of that layer becomes the current node. This continues until the current node is a leaf node of the target index multi-way tree, and that leaf node is determined as the target leaf node.
When the search request is an advertisement recall request, starting from the first layer of the target index multi-way tree, traversing towards the direction of leaf nodes according to the search dimension and the search dimension value contained in the advertisement recall request, and determining the corresponding leaf nodes.
Step 503, determining the unmatched search terms in the search request, that is, the search terms that do not correspond to any index dimension on the traversal path from the first layer to the target leaf node.
In this embodiment, once the leaf node (target leaf node) corresponding to the search request has been found, some of the search terms of the search request do not match any index dimension on the traversal path from the first layer to the target leaf node; these are the unmatched search terms. The unmatched search terms of the search request are determined for use in the subsequent steps.
Step 504, determining, from the index structure stored by the target leaf node, a target primary index corresponding to the number of unmatched search terms.
The index structure is an index structure corresponding to a plurality of objects associated with the target leaf node, and comprises a plurality of primary indexes associated with the number of the residual index fields of the objects; the remaining index fields refer to index fields which are not matched with index dimensions in a traversing path from the first layer to the target leaf node in index fields contained in the object; the number of the remaining index fields of the object corresponding to the same primary index is the same.
In this embodiment, the target leaf node stores an index structure corresponding to the plurality of objects associated with it. The index structure includes a plurality of primary indexes, each of which represents a number of remaining index fields of the objects. The target primary index is determined according to the number of unmatched search terms of the search request; that is, the number of remaining index fields of the objects corresponding to the target primary index equals the number of unmatched search terms of the search request.
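Steps 503 and 504 can be sketched together: drop the search terms consumed on the traversal path, then use the count of what is left to pick the primary index. The dict-based index structure and the function name are assumptions for illustration; the sample data follows the worked example of the index construction method.

```python
from typing import Dict, List, Optional, Tuple


def select_target_primary_index(search_terms: Dict[str, str],
                                path_dimensions: List[str],
                                index_structure: Dict[int, List[str]]
                                ) -> Tuple[Dict[str, str], Optional[List[str]]]:
    """Return the unmatched search terms and the primary index for their count.

    The primary index is None when no indexed object has that many remaining fields.
    """
    unmatched = {dim: val for dim, val in search_terms.items() if dim not in path_dimensions}
    return unmatched, index_structure.get(len(unmatched))


request = {"KeyO": "Value1", "KeyP": "Value1", "KeyQ": "Value2", "Key1": "Value1", "Key2": "Value2"}
index_structure = {1: ["Ad1"], 2: ["Ad2", "Ad3", "Ad4", "Ad5", "Ad8"], 3: ["Ad6", "Ad7"]}
unmatched, candidates = select_target_primary_index(request, ["KeyO", "KeyP", "KeyQ"], index_structure)
```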
Step 505, determining the objects associated with the target primary index as candidate objects, and searching the candidate objects corresponding to the target primary index for a target object matching the search request.
In this embodiment, the object associated with the target primary index is a part of the object associated with the target leaf node, so that the search range can be reduced and the search efficiency can be improved through the target primary index.
In an optional embodiment of the present invention, the primary index includes a plurality of index retrieval items, which together form the set of remaining index fields of the candidate objects corresponding to the primary index. Searching the candidate objects corresponding to the target primary index for a target object matching the search request may include:
determining the target index retrieval item matched by each unmatched search term;
obtaining the quasi target objects corresponding to each target index retrieval item;
and searching the quasi target objects for a target object matching the search request.
In this embodiment, the objects associated with the target primary index are further divided by index retrieval items. By determining the target index retrieval item matched by each unmatched search term and searching only the quasi target objects corresponding to those items, the search range is further narrowed and retrieval efficiency is improved.
In an optional embodiment of the present invention, the searching the target object matching the search request from the quasi target objects includes:
counting the number of times each quasi target object is matched in the process of matching the unmatched search terms;
and determining the quasi target objects whose number of matches equals the number of unmatched search terms as the target objects matching the search request.
In this embodiment, while the quasi target objects corresponding to each target index retrieval item are being obtained, the number of matches of each quasi target object may be recorded, and the quasi target objects whose number of matches equals the number of unmatched search terms are determined as the target objects matching the search request.
Illustratively, suppose the search request has 2 unmatched search terms. When the quasi target objects corresponding to the target index retrieval item matched by the first unmatched search term are obtained, the number of matches of each of them is recorded as 1; when the quasi target objects corresponding to the target index retrieval item matched by the second unmatched search term are obtained, the number of matches of each of them is increased by 1. The number of matches of every quasi target object is then known, and the quasi target objects whose number of matches is 2 are determined as the target objects matching the search request.
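The counting rule can be sketched with a counter keyed by object. The retrieval-item table below is hypothetical sample data, and the function name and data layout are likewise assumptions for illustration.

```python
from collections import Counter
from typing import Dict, List, Tuple

RetrievalItem = Tuple[str, str]


def find_target_objects(primary_index: Dict[RetrievalItem, List[str]],
                        unmatched_terms: List[RetrievalItem]) -> List[str]:
    """Return the objects matched once for every unmatched search term."""
    match_counts: Counter = Counter()
    for term in unmatched_terms:
        # Objects under the retrieval item matched by this term are quasi targets.
        for obj in primary_index.get(term, []):
            match_counts[obj] += 1
    return [obj for obj, count in match_counts.items() if count == len(unmatched_terms)]


# Hypothetical primary index with two retrieval items.
primary_index = {
    ("Key1", "Value1"): ["A", "B"],
    ("Key2", "Value2"): ["B", "C"],
}
targets = find_target_objects(primary_index, [("Key1", "Value1"), ("Key2", "Value2")])
```

The sketch relies on each object appearing at most once under a given retrieval item, which holds when an object has a single value per dimension.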
When the search request is an advertisement recall request, one advertisement recall request is generally used to recall one advertisement. In this advertisement recall scenario, determining the quasi target objects whose number of matches equals the number of unmatched search terms as the target objects matching the search request includes:
determining, among the quasi target advertisement order items whose number of matches equals the number of unmatched search terms, the one with the fewest historical recalls as the target advertisement order item matching the advertisement recall request.
In this embodiment, when the number of quasi-targeted advertisement order items that match the advertisement recall request is more than one, recall the quasi-targeted advertisement order item that has the least number of historical recall times among the quasi-targeted advertisement order items that match the advertisement recall request; the fairness of advertisement recall can be improved.
Of course, in other embodiments, any one of the quasi-targeted ad orders matching the ad recall request may also be recalled randomly; the invention is not limited in this regard.
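Selecting the least-recalled order item is a simple minimum over the matching candidates. In the sketch below the recall history is hypothetical, as are the function name and the choice to break ties by list order.

```python
from typing import Dict, List


def pick_least_recalled(candidates: List[str], recall_counts: Dict[str, int]) -> str:
    """Return the candidate with the fewest historical recalls (ties: first in the list)."""
    return min(candidates, key=lambda name: recall_counts.get(name, 0))


recall_counts = {"Ad2": 5, "Ad8": 3}  # hypothetical recall history
chosen = pick_least_recalled(["Ad2", "Ad8"], recall_counts)
recall_counts[chosen] = recall_counts.get(chosen, 0) + 1  # record this recall
```

Updating the counter after each recall is what spreads recalls across otherwise equivalent order items over time; the random alternative mentioned above could instead use `random.choice(candidates)`.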
In order for those skilled in the art to better understand the present solution, embodiments of the present invention will be illustrated below with reference to the following examples, but it should be understood that the embodiments of the present invention are not limited thereto.
In this example, the target index multi-way tree is as shown in fig. 4, and the search terms contained in the obtained search request are {<KeyO, Value1>, <KeyP, Value1>, <KeyQ, Value2>, <Key1, Value1>, <Key2, Value2>}. Traversal starts from the first layer of the target index multi-way tree, whose index dimension is KeyO. The search term corresponding to the index dimension KeyO, namely <KeyO, Value1>, is found among the search terms, and the current node is determined to be the Value1 node of the first layer according to the search dimension value Value1 of the search dimension KeyO. Since the current node is not a leaf node, the next layer (the second layer) corresponding to the current node is taken as the current layer. Its index dimension is KeyP; the search term <KeyP, Value1> is found among the search terms, and the current node is determined to be the Value1 node of the second layer according to the search dimension value Value1 of the search dimension KeyP. Since the current node is still not a leaf node, the next layer (the third layer of the first branch) is taken as the current layer. Its index dimension is KeyQ; the search term <KeyQ, Value2> is found among the search terms, and the current node is determined to be the Value2 node of the third layer of the first branch according to the search dimension value Value2 of the search dimension KeyQ. This node is a leaf node and is therefore determined as the target leaf node. The index structure of the target leaf node and its associated objects are as follows:
[Table image BDA0003139620630000201: index structure of the target leaf node and its associated objects]
At this point, the unmatched search terms of the current search request are {<Key1, Value1>, <Key2, Value2>}, two in total, so the target primary index is determined to be the second primary index. The objects corresponding to the second primary index are Ad2, Ad3, Ad4, Ad5 and Ad8; the target objects matching the search request, namely Ad2 and Ad8, can then be determined by comparing these objects one by one.
Generally, with massive data, the number of objects corresponding to the target primary index is still large. Therefore, in a preferred manner, the target index retrieval items in the second primary index are determined to be <Key1, Value1> and <Key2, Value2> according to the unmatched search terms <Key1, Value1> and <Key2, Value2> of the search request. The objects corresponding to the target index retrieval item <Key1, Value1> are Ad2, Ad3, Ad4 and Ad8, and the objects corresponding to the target index retrieval item <Key2, Value2> are Ad2 and Ad8. The union of these two sets of objects is Ad2, Ad3, Ad4 and Ad8. Compared with the objects corresponding to the second primary index, Ad5 is absent, that is, Ad5 has been eliminated, and the quasi target objects are Ad2, Ad3, Ad4 and Ad8. With massive data, the number of objects eliminated in this step is large, and narrowing the range of objects to be compared with the search request improves retrieval efficiency.
Further, in determining the matching target objects among the quasi target objects Ad2, Ad3, Ad4 and Ad8, the number of matches of each quasi target object can be counted and the target objects determined from these counts. Specifically, Ad2 is matched once for the unmatched search term <Key1, Value1> and once for the unmatched search term <Key2, Value2>, so its number of matches is 2. Similarly, the number of matches of Ad3 is 1, that of Ad4 is 1, and that of Ad8 is 2. Since the search request has 2 unmatched search terms, the quasi target objects whose number of matches is 2 are the target objects, that is, Ad2 and Ad8. When multiple objects can be returned, Ad2 and Ad8 can be returned together; when only one object can be returned, one of Ad2 and Ad8 can be selected at random, or chosen according to other constraints.
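The two-stage narrowing in this example can be reproduced with the sketch below. The retrieval-item table of the second primary index is derived from the remaining index fields of Ad2, Ad3, Ad4, Ad5 and Ad8 given earlier; the variable names and the set-based representation are assumptions for illustration.

```python
from collections import Counter

second_primary_index = {
    ("Key1", "Value1"): ["Ad2", "Ad3", "Ad4", "Ad8"],
    ("Key2", "Value2"): ["Ad2", "Ad8"],
    ("Key3", "Value3"): ["Ad3", "Ad5"],
    ("Key4", "Value4"): ["Ad4"],
    ("Key5", "Value5"): ["Ad5"],
}
unmatched_terms = [("Key1", "Value1"), ("Key2", "Value2")]

# Stage 1: union of the objects under the target index retrieval items.
quasi_targets = set()
for term in unmatched_terms:
    quasi_targets.update(second_primary_index.get(term, []))

# Stage 2: keep the objects that were matched once per unmatched search term.
match_counts = Counter(
    obj for term in unmatched_terms for obj in second_primary_index.get(term, [])
)
targets = sorted(obj for obj in quasi_targets if match_counts[obj] == len(unmatched_terms))
```

Ad5 never enters `quasi_targets` because it appears under neither target index retrieval item, which is the elimination described in the text.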
According to the retrieval method provided by the embodiment of the invention, a search request and a target index multi-way tree corresponding to a plurality of objects are obtained, where the depth value of the target index multi-way tree is smaller than the number of search terms contained in the search request. Taking the first layer of the target index multi-way tree as the current layer, the tree is traversed toward the leaf nodes to determine the target leaf node corresponding to the search request. According to the unmatched search terms in the search request, the target primary index corresponding to the number of unmatched search terms is determined in the target leaf node, and a target object matching the search request is searched for among the candidate objects corresponding to the target primary index. In this way, a target object meeting the conditions can be found quickly. When the object is an advertisement and the search request is an advertisement recall request, the target advertisement corresponding to the advertisement recall request can be quickly recalled by the retrieval method provided by the embodiment of the invention.
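The retrieval flow summarized above can be sketched end to end as follows. This is a minimal sketch under stated assumptions: the tree is represented as nested dictionaries, the layer dimensions KeyA and KeyB and their values a1 and b1 are invented for illustration, and the function name retrieve is hypothetical. The sketch also assumes, as the method requires, that the request contains more search terms than the tree has layers.

```python
from collections import Counter
from typing import Dict, List, Set, Tuple

Term = Tuple[str, str]                 # (dimension, dimension value)
PrimaryIndex = Dict[Term, Set[str]]    # index retrieval item -> object ids
Leaf = Dict[int, PrimaryIndex]         # number of remaining fields -> primary index

# Hypothetical index dimensions of layer 1 and layer 2 (tree depth 2).
LAYER_DIMS = ["KeyA", "KeyB"]

# Hand-built target index multi-way tree: value of KeyA -> value of KeyB -> leaf.
TREE = {
    "a1": {
        "b1": {
            2: {
                ("Key1", "value1"): {"Ad2", "Ad3", "Ad4", "Ad8"},
                ("Key2", "value2"): {"Ad2", "Ad8"},
            },
        },
    },
}


def retrieve(tree: dict, layer_dims: List[str], request: Dict[str, str]) -> List[str]:
    # Traverse from the first layer toward the leaf nodes.
    node = tree
    for dim in layer_dims:
        value = request.get(dim)
        if value not in node:
            return []  # no node in this layer matches the request
        node = node[value]
    leaf: Leaf = node

    # Search terms whose dimension was not used on the traversal path.
    unmatched = [(d, v) for d, v in request.items() if d not in layer_dims]

    # Target primary index: the one keyed by the number of unmatched terms.
    primary = leaf.get(len(unmatched), {})

    # Count matches per quasi-target object; keep those matching every term.
    counts: Counter = Counter()
    for term in unmatched:
        for obj in primary.get(term, ()):
            counts[obj] += 1
    return sorted(obj for obj, n in counts.items() if n == len(unmatched))


request = {"KeyA": "a1", "KeyB": "b1", "Key1": "value1", "Key2": "value2"}
result = retrieve(TREE, LAYER_DIMS, request)
```

For the request shown, the traversal reaches the leaf under a1 and b1, two search terms remain unmatched, and result contains Ad2 and Ad8.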
It should be noted that, for simplicity of description, the method embodiments are shown as a series of acts, but it should be understood by those skilled in the art that the embodiments are not limited by the order of acts, as some steps may occur in other orders or concurrently in accordance with the embodiments. Further, those skilled in the art will appreciate that the embodiments described in the specification are presently preferred embodiments, and that the acts are not necessarily required by the embodiments of the invention.
Referring to FIG. 6, a schematic block diagram of an index construction device according to an embodiment of the invention is shown, which may specifically include the following modules:
a first obtaining module 601, configured to obtain a first object and a preset index multi-way tree; the first object is an object to be indexed, and comprises a plurality of index fields, wherein each index field comprises a dimension and a dimension value; nodes of the same layer in the index multi-way tree correspond to the same index dimension, and different nodes of the same layer correspond to different index dimension values; the depth value of the index multi-way tree is smaller than the number of index fields contained in the first object; the leaf nodes of the index multi-way tree store an index structure corresponding to the second object; the second object is a plurality of objects associated with the leaf node, and the index structure comprises a plurality of primary indexes for representing the number of residual index fields of the second object; the remaining index fields refer to index fields which are not matched with index dimensions in a traversing path from the first layer of the index multi-way tree to the leaf nodes in index fields contained in the second object;
a target node determining module 602, configured to take the first layer of the index multi-way tree as the current layer, determine the target index dimension of the current layer, search for the target index field corresponding to the target index dimension among the index fields contained in the first object, and take the node in the current layer whose index dimension value matches the target dimension value of the target index field as the current node; when the current node is not a leaf node, take the next layer corresponding to the current node as the current layer and continue to execute this step until the current node is a leaf node; and determine the current node as the target leaf node;
a first association module 603, configured to associate the first object with the target leaf node, and determine a remaining index field of the first object;
and a second association module 604, configured to determine a target primary index matching the number of the remaining index fields of the first object from the primary indexes of the index structure, and associate the first object with the target primary index, to obtain a target index multi-way tree.
Optionally, the primary index includes a plurality of index retrieval items, and the apparatus further includes:
a target index retrieval item determining module, configured to determine a target index retrieval item that matches a remaining index field of the first object;
and a third association module, configured to associate the first object with the target index retrieval item.
Optionally, the target index retrieval item determining module includes:
a target index retrieval item searching module, configured to search whether a target index retrieval item matching the remaining index field exists in the target primary index;
and a target index retrieval item generation module, configured to generate, in the target primary index, a target index retrieval item matching the remaining index field if no such target index retrieval item exists.
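The construction flow carried out by the modules above can be sketched as follows. This is an illustrative sketch, not the patented implementation: the Leaf class, the function name insert_object and the dimension names KeyA and KeyB are assumptions. The sketch also creates missing tree nodes on the fly, whereas the text works with a preset index multi-way tree, and it assumes every object carries an index field for each layer dimension.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple

Term = Tuple[str, str]  # (dimension, dimension value)


@dataclass
class Leaf:
    # Objects associated with this leaf node.
    objects: Set[str] = field(default_factory=set)
    # Number of remaining index fields -> {index retrieval item -> objects}.
    primary: Dict[int, Dict[Term, Set[str]]] = field(default_factory=dict)


def insert_object(tree: dict, layer_dims: List[str], obj_id: str,
                  fields: Dict[str, str]) -> Leaf:
    # Walk layer by layer, following the object's value for each layer dimension.
    node = tree
    for dim in layer_dims[:-1]:
        node = node.setdefault(fields[dim], {})
    leaf = node.setdefault(fields[layer_dims[-1]], Leaf())

    # Associate the object with the target leaf node.
    leaf.objects.add(obj_id)

    # Remaining index fields: those not consumed on the traversal path.
    remaining = [(d, v) for d, v in fields.items() if d not in layer_dims]

    # Target primary index: keyed by the number of remaining index fields.
    primary = leaf.primary.setdefault(len(remaining), {})

    # Reuse an existing index retrieval item, or generate it if absent.
    for term in remaining:
        primary.setdefault(term, set()).add(obj_id)
    return leaf


tree: dict = {}
dims = ["KeyA", "KeyB"]  # hypothetical layer dimensions
insert_object(tree, dims, "Ad2",
              {"KeyA": "a1", "KeyB": "b1", "Key1": "value1", "Key2": "value2"})
insert_object(tree, dims, "Ad3",
              {"KeyA": "a1", "KeyB": "b1", "Key1": "value1", "Key2": "value5"})
leaf = tree["a1"]["b1"]
```

After the two insertions, both objects sit in the same leaf, both are registered under the primary index for two remaining index fields, and the index retrieval item (Key1, value1) is shared by Ad2 and Ad3.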
Referring to FIG. 7, a schematic block diagram of a retrieval device according to an embodiment of the present invention is shown, which may specifically include the following modules:
a second obtaining module 701, configured to obtain a search request and a target index multi-way tree; the search request comprises a plurality of search terms, and each search term comprises a search dimension and a corresponding search dimension value; the target index multi-way tree is constructed according to the index construction method; the number of the search terms contained in the search request is larger than the depth value of the target index multi-way tree;
a search node determining module 702, configured to take the first layer of the target index multi-way tree as the current layer, determine the target index dimension of the current layer, search for the target search term corresponding to the target index dimension in the search request, and take the node in the current layer whose index dimension value matches the target search dimension value of the target search term as the current node; when the current node is not a leaf node, take the next layer corresponding to the current node as the current layer and continue to execute this step until the current node is a leaf node; and determine the current node as the target leaf node;
a first determining module 703, configured to determine an unmatched search term in the search request that does not correspond to an index dimension in a traversal path from the first layer to the target leaf node;
a second determining module 704, configured to determine, from an index structure stored in the target leaf node, a target primary index corresponding to the number of unmatched search terms;
and a third determining module 705, configured to determine an object associated with the target primary index as a candidate object, and search for a target object matching the search request from the candidate objects corresponding to the target primary index.
Optionally, the primary index includes a plurality of index search terms, where the plurality of index search terms are a set of remaining index fields of a plurality of candidate objects corresponding to the primary index, and the third determining module 705 includes:
the index retrieval item determining module is used for determining target index retrieval items matched with each unmatched retrieval item;
the quasi target object determining module is used for acquiring quasi target objects corresponding to each target index retrieval item;
and the target object searching module is used for searching the target object matched with the retrieval request from the quasi target objects.
Optionally, the target object searching module includes:
the matching frequency counting module is used for counting the matching frequency of each quasi-target object in the process of matching the unmatched search term;
and the target object determining module is used for determining the quasi target objects, the matching times of which are the same as the number of the unmatched search terms, as target objects matched with the search request.
Optionally, the object is an advertisement order item, the search request is an advertisement recall request, and the target object determining module is further configured to:
and determining the quasi target advertisement order item with the least historical recall times in the quasi target advertisement order items with the same matching times as the unmatched search items as the target advertisement order item matched with the advertisement recall request.
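The optional selection by historical recall count can be sketched as follows. This is a hedged example: the recall counts are made-up values, the function name pick_least_recalled is hypothetical, and breaking ties by identifier is an assumption that the text does not specify.

```python
from typing import Dict, List, Optional


def pick_least_recalled(targets: List[str],
                        recall_counts: Dict[str, int]) -> Optional[str]:
    """Among the fully matched advertisement order items, return the one with
    the fewest historical recalls. Ties are broken by identifier (assumption)."""
    if not targets:
        return None
    return min(targets, key=lambda ad: (recall_counts.get(ad, 0), ad))


# Hypothetical historical recall counts for the two target objects.
history = {"Ad2": 5, "Ad8": 3}
chosen = pick_least_recalled(["Ad2", "Ad8"], history)
```

With these assumed counts, chosen is Ad8, because it has been recalled fewer times than Ad2.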
Since the device embodiments are substantially similar to the corresponding method embodiments, their description is relatively brief; for relevant details, refer to the corresponding parts of the description of the method embodiments.
The embodiment of the invention also provides a terminal, which may comprise a processor, a memory, and a computer program stored on the memory and executable on the processor. When executed by the processor, the computer program implements each process of the above embodiments of the index construction method or the retrieval method and achieves the same technical effects; to avoid repetition, details are not described here again.
The embodiment of the invention also provides a computer-readable storage medium on which a computer program is stored. When executed by a processor, the computer program implements each process of the above embodiments of the index construction method or the retrieval method and achieves the same technical effects; to avoid repetition, details are not described here again. The computer-readable storage medium may be, for example, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
In this specification, the embodiments are described in a progressive manner. Each embodiment focuses on its differences from the other embodiments, and for identical or similar parts the embodiments may be referred to one another.
As will be readily appreciated by those skilled in the art, any combination of the above embodiments is possible, and any such combination is therefore an embodiment of the present invention; for reasons of space, these combinations are not described one by one in this specification.
The methods of constructing or retrieving the index provided herein are not inherently related to any particular computer, virtual system, or other apparatus. Various general-purpose systems may also be used with the teachings herein. The required structure for a system constructed with aspects of the present invention will be apparent from the description above. In addition, the present invention is not directed to any particular programming language. It will be appreciated that the teachings of the present invention described herein may be implemented in a variety of programming languages, and the above description of specific languages is provided for disclosure of enablement and best mode of the present invention.
In the description provided herein, numerous specific details are set forth. However, it is understood that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures and techniques have not been shown in detail in order not to obscure an understanding of this description.
Similarly, it should be appreciated that in the above description of exemplary embodiments of the invention, various features of the invention are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure and aiding in the understanding of one or more of the various inventive aspects. However, the disclosed method should not be construed as reflecting the intention that: i.e., the claimed invention requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in less than all features of a single foregoing disclosed embodiment. Thus, the claims following the detailed description are hereby expressly incorporated into this detailed description, with each claim standing on its own as a separate embodiment of this invention.
Those skilled in the art will appreciate that the modules in the apparatus of the embodiments may be adaptively changed and disposed in one or more apparatuses different from the embodiments. The modules or units or components of the embodiments may be combined into one module or unit or component and, furthermore, they may be divided into a plurality of sub-modules or sub-units or sub-components. Any combination of all features disclosed in this specification (including any accompanying claims, abstract and drawings), and all of the processes or units of any method or apparatus so disclosed, may be used in combination, except insofar as at least some of such features and/or processes or units are mutually exclusive. Each feature disclosed in this specification (including any accompanying claims, abstract and drawings), may be replaced by alternative features serving the same, equivalent or similar purpose, unless expressly stated otherwise.
Furthermore, those skilled in the art will appreciate that while some embodiments described herein include some features but not others included in other embodiments, combinations of features of different embodiments are meant to be within the scope of the invention and form different embodiments. For example, in the claims, any of the claimed embodiments may be used in any combination.
Various component embodiments of the invention may be implemented in hardware, or in software modules running on one or more processors, or in a combination thereof. Those skilled in the art will appreciate that some or all of the functions of some or all of the components of the index building method or the retrieval method according to an embodiment of the present invention may be implemented in practice using a microprocessor or Digital Signal Processor (DSP). The present invention can also be implemented as an apparatus or device program (e.g., a computer program and a computer program product) for performing a portion or all of the methods described herein. Such a program embodying the present invention may be stored on a computer readable medium, or may have the form of one or more signals. Such signals may be downloaded from an internet website, provided on a carrier signal, or provided in any other form.
It should be noted that the above-mentioned embodiments illustrate rather than limit the invention, and that those skilled in the art will be able to design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention may be implemented by means of hardware comprising several distinct elements, and by means of a suitably programmed computer. In the unit claims enumerating several means, several of these means may be embodied by one and the same item of hardware. The use of the words first, second, third, etc. do not denote any order. These words may be interpreted as names.

Claims (11)

1. A method of constructing an index, the method comprising:
acquiring a first object and a preset index multi-way tree; the first object is an object to be indexed, and comprises a plurality of index fields, wherein each index field comprises a dimension and a dimension value; nodes of the same layer in the index multi-way tree correspond to the same index dimension, and different nodes of the same layer correspond to different index dimension values; the depth value of the index multi-way tree is smaller than the number of index fields contained in the first object; the leaf nodes of the index multi-way tree store an index structure corresponding to the second object; the second object is a plurality of objects associated with the leaf node, and the index structure comprises a plurality of primary indexes for representing the number of residual index fields of the second object; the remaining index fields refer to index fields which are not matched with index dimensions in a traversing path from the first layer of the index multi-way tree to the leaf nodes in index fields contained in the second object;
Determining a target index dimension of the current layer by taking a first layer of the index multi-way tree as the current layer, searching a target index field corresponding to the target index dimension from index fields contained in the first object, taking a node in the current layer, of which the index dimension value is matched with the target dimension value of the target index field, as a current node, and when the current node is not a leaf node, taking the next layer corresponding to the current node as the current layer, and continuing to execute the step until the current node is a leaf node; determining the current node as a target leaf node;
associating the first object with the target leaf node and determining a remaining index field of the first object;
and determining a target primary index matched with the number of the residual index fields of the first object from the primary indexes of the index structure, and associating the first object with the target primary index to obtain a target index multi-way tree.
2. The method of claim 1, wherein the primary index comprises a plurality of index retrieval items, the method further comprising:
determining a target index retrieval item matched with the residual index field of the first object;
And associating the first object with the target index retrieval item.
3. The method of claim 2, wherein the determining a target index retrieval item that matches the remaining index field of the first object comprises:
searching whether a target index retrieval item matched with the residual index field exists in the target primary index;
and if the target index retrieval item does not exist, generating a target index retrieval item matched with the residual index field in the target primary index.
4. A method of searching, the method comprising:
acquiring a retrieval request and a target index multi-way tree; the search request comprises a plurality of search terms, and each search term comprises a search dimension and a corresponding search dimension value; the target index multi-way tree is constructed according to the index construction method of any one of claims 1-3; the number of the search terms contained in the search request is larger than the depth value of the target index multi-way tree;
determining a target index dimension of the current layer by taking a first layer of the target index multi-way tree as the current layer, searching a target retrieval item corresponding to the target index dimension from the retrieval request, taking a node matched with a target retrieval dimension value of the target retrieval item in the current layer as the current node, and continuously executing the step until the current node is a leaf node by taking a next layer corresponding to the current node as the current layer when the current node is not a leaf node; determining the current node as a target leaf node;
Determining unmatched search terms in the search request that do not correspond to index dimensions in a traversal path from the first tier to the target leaf node;
determining a target primary index corresponding to the number of unmatched search terms from an index structure stored in the target leaf node;
and determining the object associated with the target primary index as a candidate object, and searching a target object matched with the retrieval request from the candidate objects corresponding to the target primary index.
5. The method of claim 4, wherein the primary index comprises a plurality of index entries, the plurality of index entries being a set of remaining index fields of a plurality of candidate objects corresponding to the primary index, the searching for a target object matching the search request from among the candidate objects corresponding to the target primary index comprising:
determining a target index search term matched with each unmatched search term;
obtaining a quasi target object corresponding to each target index retrieval item;
and searching a target object matched with the retrieval request from the quasi target objects.
6. The method of claim 5, wherein said finding a target object from said quasi target objects that matches said search request comprises:
Counting the matching times of each quasi target object in the process of matching the unmatched search term;
and determining the quasi target objects with the same matching times as the unmatched search terms as target objects matched with the search request.
7. The method of claim 6, wherein the object is an advertisement order item, the search request is an advertisement recall request, the determining the quasi-target object having the same number of matches as the unmatched search items as the target object matching the search request further comprises:
and determining the quasi target advertisement order item with the least historical recall times in the quasi target advertisement order items with the same matching times as the unmatched search items as the target advertisement order item matched with the advertisement recall request.
8. An apparatus for constructing an index, the apparatus comprising:
the first acquisition module is used for acquiring a first object and a preset index multi-way tree, wherein the first object is an object to be indexed, the first object comprises a plurality of index fields, and each index field comprises a dimension and a dimension value; nodes of the same layer in the index multi-way tree correspond to the same index dimension, and different nodes of the same layer correspond to different index dimension values; the depth value of the index multi-way tree is smaller than the number of index fields contained in the first object; the leaf nodes of the index multi-way tree store an index structure corresponding to the second object; the second object is a plurality of objects associated with the leaf node, and the index structure comprises a plurality of primary indexes for representing the number of residual index fields of the second object; the remaining index fields refer to index fields which are not matched with index dimensions in a traversing path from the first layer of the index multi-way tree to the leaf nodes in index fields contained in the second object;
The target node determining module is configured to determine a target index dimension of the current layer by using a first layer of the index multi-way tree as a current layer, search a target index field corresponding to the target index dimension from index fields included in the first object, and use a node in the current layer, in which an index dimension value matches with a target dimension value of the target index field, as a current node, and when the current node is not a leaf node, use a next layer corresponding to the current node as the current layer, and continue to execute the step until the current node is a leaf node; determining the current node as a target leaf node;
the first association module is used for associating the first object with the target leaf node and determining the residual index field of the first object;
and the second association module is used for determining a target primary index matched with the number of the residual index fields of the first object from the primary indexes of the index structure, and associating the first object with the target primary index to obtain a target index multi-way tree.
9. A retrieval device, the device comprising:
the second acquisition module is used for acquiring the search request and the target index multi-way tree; the search request comprises a plurality of search terms, and each search term comprises a search dimension and a corresponding search dimension value; the target index multi-way tree is constructed according to the index construction method of any one of claims 1-3; the number of the search terms contained in the search request is larger than the depth value of the target index multi-way tree;
The search node determining module is used for determining a target index dimension of the current layer by taking a first layer of the target index multi-way tree as the current layer, searching a target search item corresponding to the target index dimension from the search request, taking a node matched with a target search dimension value of the target search item in the current layer as the current node, and continuously executing the step until the current node is a leaf node by taking a next layer corresponding to the current node as the current layer when the current node is not a leaf node; determining the current node as a target leaf node;
a first determining module, configured to determine an unmatched search term in the search request that does not correspond to an index dimension in a traversal path from the first layer to the target leaf node;
the second determining module is used for determining target primary indexes corresponding to the number of unmatched retrieval items from the index structure stored in the target leaf node;
and the third determining module is used for determining the object associated with the target primary index as a candidate object and searching a target object matched with the retrieval request from the candidate objects corresponding to the target primary index.
10. A terminal, comprising: a processor, a memory and a computer program stored on the memory and executable on the processor, which when executed by the processor implements the method of constructing an index as claimed in any one of claims 1 to 3 or implements the method of retrieving as claimed in any one of claims 4 to 7.
11. A computer-readable storage medium, on which a computer program is stored, which computer program, when being executed by a processor, implements the method of constructing an index according to any one of claims 1 to 3, or implements the method of retrieving according to any one of claims 4 to 7.
CN202110732632.7A 2021-06-29 2021-06-29 Index construction method, index retrieval method, and corresponding device, terminal and medium Active CN113343043B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110732632.7A CN113343043B (en) 2021-06-29 2021-06-29 Index construction method, index retrieval method, and corresponding device, terminal and medium

Publications (2)

Publication Number Publication Date
CN113343043A CN113343043A (en) 2021-09-03
CN113343043B true CN113343043B (en) 2023-06-23

Family

ID=77481861

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110732632.7A Active CN113343043B (en) 2021-06-29 2021-06-29 Index construction method, index retrieval method, and corresponding device, terminal and medium

Country Status (1)

Country Link
CN (1) CN113343043B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107526746A (en) * 2016-06-22 2017-12-29 伊姆西公司 The method and apparatus of management document index
CN107644070A (en) * 2017-09-13 2018-01-30 北京柠檬微趣科技股份有限公司 Data index method, data query method and electronic equipment
CN109344226A (en) * 2018-10-11 2019-02-15 北京奇艺世纪科技有限公司 A kind of index data update method and device
KR102006283B1 (en) * 2019-02-26 2019-10-01 가천대학교 산학협력단 Dataset loading method in m-tree using fastmap
CN112434027A (en) * 2020-10-30 2021-03-02 金蝶软件(中国)有限公司 Indexing method and device for multi-dimensional data, computer equipment and storage medium
CN112883125A (en) * 2021-04-28 2021-06-01 北京奇岱松科技有限公司 Entity data processing method, device, equipment and storage medium

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6654760B2 (en) * 2001-06-04 2003-11-25 Hewlett-Packard Development Company, L.P. System and method of providing a cache-efficient, hybrid, compressed digital tree with wide dynamic ranges and simple interface requiring no configuration or tuning
US8171030B2 (en) * 2007-06-18 2012-05-01 Zeitera, Llc Method and apparatus for multi-dimensional content search and video identification
US8577837B2 (en) * 2007-10-30 2013-11-05 Sap Ag Method and system for generic extraction of business object data
US20110258034A1 (en) * 2010-04-15 2011-10-20 Yahoo! Inc. Hierarchically-structured indexing and retrieval

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
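Distributed index for electric power big data based on B+ tree; Qu Zhaoyang et al.; Journal of Northeast Electric Power University; pp. 80-85 *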

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant