CN112925629A - Bloom filter dynamic adjustment method, bloom filter dynamic adjustment system, electronic equipment and storage medium - Google Patents



Publication number
CN112925629A
Authority
CN
China
Prior art keywords
data table
preset
bloom filter
capacity expansion
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110348319.3A
Other languages
Chinese (zh)
Other versions
CN112925629B (en)
Inventor
王学佳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Enyike Beijing Data Technology Co ltd
Original Assignee
Enyike Beijing Data Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Enyike Beijing Data Technology Co ltd filed Critical Enyike Beijing Data Technology Co ltd
Priority to CN202110348319.3A priority Critical patent/CN112925629B/en
Publication of CN112925629A publication Critical patent/CN112925629A/en
Application granted granted Critical
Publication of CN112925629B publication Critical patent/CN112925629B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/48 Program initiating; Program switching, e.g. by interrupt
    • G06F9/4806 Task transfer initiation or dispatching
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5011 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals
    • G06F9/5016 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals the resource being the memory
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/54 Interprogram communication
    • G06F9/542 Event management; Broadcasting; Multicasting; Notifications
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2209/00 Indexing scheme relating to G06F9/00
    • G06F2209/50 Indexing scheme relating to G06F9/50
    • G06F2209/508 Monitor

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

The application relates to a bloom filter dynamic adjustment method, system, electronic device and storage medium. The method comprises: an existing bit count detection step of acquiring the number of existing bits in the data table of the bloom filter and the ratio of the number of existing bits to the total number of bits in the data table; a data table monitoring step in which a monitoring center monitors in real time, according to a preset capacity expansion threshold and the ratio, whether the data table needs capacity expansion and, if so, sends a capacity expansion request to a scheduling center; a capacity expansion detection step in which the scheduling center receives the capacity expansion request and detects whether the data table would reach a preset table threshold after being expanded by a preset multiple according to the request; and a bloom filter adjusting step of expanding the data table if the data table expanded by the preset multiple does not reach the preset table threshold, and otherwise adjusting the multi-layer hash algorithm or sending a memory alarm notification. The application reduces memory consumption, effectively mitigates data collisions and optimizes the use of the bloom filter.

Description

Bloom filter dynamic adjustment method, bloom filter dynamic adjustment system, electronic equipment and storage medium
Technical Field
The present application relates to the field of computer technologies, and in particular, to a bloom filter dynamic adjustment method, system, electronic device, and computer-readable storage medium.
Background
With the rapid growth of big data, Redis has become an indispensable component of big data stream processing, where it is usually used as an intermediate database for caching. Because Redis is an in-memory database, however, memory falls far short in some real-time data processing scenarios as the number of users and the volume of data of enterprise services increase, and many enterprises use bloom filters to alleviate the problem. A bloom filter consists of a long binary vector and a series of random mapping functions. It can be used to test whether an element is in a set, and because the data for different primary keys is only temporary, each entry occupies only a single bit, which greatly reduces memory overhead.
However, a bloom filter pre-allocates its memory space, and the number of reserved bits is usually more than ten times the number of bits in use, so that, provided the hash algorithm is sound, the probability of a hash collision stays relatively low. If two input strings have the same hash value, the two strings are said to collide (a collision). Because a hash function converts strings of arbitrary length into strings of fixed length, each output string necessarily corresponds to an unlimited number of input strings, so collisions are unavoidable. Because the real-time data volume cannot be estimated in advance, a data surge may cause hash collisions, while overestimating the data volume and then sizing the table at 10 times that estimate may waste a large amount of memory. Developers currently size the bloom filter at 10 times the estimated volume. To allow for data surges, however, the estimated volume is itself about 3 times the usual volume, so the table is sized at about 30 times the usual volume; although each entry stores only one bit, the memory consumed is still considerable.
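The sizing arithmetic in the preceding paragraph can be made concrete with a small calculation. The sketch below assumes a hypothetical usual volume of ten million keys; the function name and the figures are illustrative and do not come from the application.

```python
def preallocated_bits(usual_keys: int, surge_factor: int = 3, safety_multiple: int = 10) -> int:
    """Bits reserved up front under the static rule: estimate = usual * 3, table = estimate * 10."""
    return usual_keys * surge_factor * safety_multiple


usual_keys = 10_000_000  # hypothetical everyday key count, not a figure from the text
bits = preallocated_bits(usual_keys)
megabytes = bits / 8 / 1_000_000  # 8 bits per byte, decimal megabytes
print(f"{bits} bits reserved, about {megabytes:.1f} MB")
```

Under these assumed numbers a single filter reserves 30 bits per usual key, most of which stay unused outside a surge.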
There are also schemes that alleviate this problem with a multi-layer hash algorithm, but such schemes must contend with the memory occupied by the data table that each hash layer requires. Any hash algorithm is bound to face hash collisions, and even with multiple layers of hashing collisions still occur when the table at each layer is too small. Whatever hash algorithm is used, an important prerequisite is therefore that the table size be a safe multiple of the data volume.
In the prior art, the hash algorithm can be adjusted automatically according to memory usage, but this approach also has disadvantages. First, the main problem with hash optimization concerns the original data that must be provided for external use: the original data cannot be stored in full, because storing it defeats the purpose of using a bloom filter to reduce memory use, and deriving the original data back from the hashes too frequently interferes with the use of the bloom filter. In addition, switching between hash algorithms yields only a limited improvement, which cannot absorb the impact when the data volume is too large.
Disclosure of Invention
The embodiment of the application provides a dynamic adjustment method and system for a bloom filter, electronic equipment and a computer readable storage medium.
In a first aspect, an embodiment of the present application provides a dynamic adjustment method for a bloom filter, including:
an existing bit count detection step of acquiring the number of existing bits in the data table of the bloom filter and acquiring the ratio of the number of existing bits to the total number of bits in the data table;
a data table monitoring step in which a monitoring center monitors in real time, according to a preset capacity expansion threshold and the ratio, whether the data table needs capacity expansion and, if so, sends a capacity expansion request to a scheduling center;
a capacity expansion detection step in which the scheduling center receives the capacity expansion request and detects whether the data table would reach a preset table threshold after being expanded by a preset multiple according to the capacity expansion request;
and a bloom filter adjusting step of expanding the data table if the data table expanded by the preset multiple does not reach the preset table threshold, and otherwise adjusting the multi-layer hash algorithm or sending a memory alarm notification.
Based on the above steps, capacity expansion of the bloom filter is realized dynamically by monitoring the number of bits in use in the data table. This replaces the original fixed, hard-coded sizing of the bloom filter, so that an excessively large memory region no longer needs to be preset, which effectively addresses the excessive memory consumption of existing bloom filters.
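The checks behind the first three steps can be sketched as three small predicates. This is a minimal illustration, assuming that "reaching" a threshold means greater than or equal to it and that the table threshold is expressed in bits; the function and parameter names are hypothetical and are not taken from the application.

```python
def fill_ratio(existing_bits: int, total_bits: int) -> float:
    """Detection step: share of the table's bits that already hold data."""
    if total_bits <= 0:
        raise ValueError("total_bits must be positive")
    return existing_bits / total_bits


def needs_expansion(ratio: float, expansion_threshold: float) -> bool:
    """Monitoring step: a request is raised once the ratio reaches the threshold."""
    return ratio >= expansion_threshold


def can_expand(total_bits: int, multiple: int, table_threshold_bits: int) -> bool:
    """Expansion detection step: the enlarged table must stay below the table threshold."""
    return total_bits * multiple < table_threshold_bits


ratio = fill_ratio(existing_bits=20, total_bits=100)
print(ratio, needs_expansion(ratio, 0.1), can_expand(100, 10, 4096))
```

The threshold values 0.1 and 4096 are placeholders chosen only to exercise both predicates.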
In some of these embodiments, the bloom filter adjusting step further comprises:
a data table capacity expansion step in which, if the scheduling center detects that the data table expanded by the preset multiple does not reach the preset table threshold, a control center receives a data table capacity expansion request from the scheduling center, establishes a new table according to the request, and migrates the table data;
and a hash algorithm adjusting step in which, if the scheduling center detects that the data table expanded by the preset multiple reaches the preset table threshold, indicating that the data table cannot be expanded, external access is provided through a multi-layer bloom filter or a memory alarm notification is sent, depending on whether a multi-layer hash algorithm is preset for the bloom filter. Specifically, if a multi-layer hash algorithm is preset for the bloom filter, whether data exists is determined according to the preset multi-layer hash algorithm and the result is provided for external access.
Based on the above steps, the embodiment of the application can automatically realize a multi-layer bloom filter according to the preset multi-layer hash algorithm while restricting the conditions under which the multi-layer hash algorithm is applied. This effectively mitigates data collisions, prevents overly frequent reverse derivation from hashes from interfering with the use of the bloom filter, and reduces memory use.
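The branch between the two sub-steps can be summarized as a single decision function. The sketch below is illustrative only: the `Action` enum, the `choose_action` name and the convention that more than one hash layer means "a multi-layer hash algorithm is preset" are assumptions made for the example.

```python
from enum import Enum


class Action(Enum):
    EXPAND_TABLE = "expand_table"
    USE_MULTILAYER_HASH = "use_multilayer_hash"
    MEMORY_ALARM = "memory_alarm"


def choose_action(total_bits: int, multiple: int,
                  table_threshold_bits: int, hash_layers: int) -> Action:
    """Pick the adjustment once the monitoring center has asked for expansion."""
    # Expansion is allowed only while the enlarged table stays below the threshold.
    if total_bits * multiple < table_threshold_bits:
        return Action.EXPAND_TABLE
    # The table cannot grow: fall back to the preset multi-layer hash, if any.
    if hash_layers > 1:
        return Action.USE_MULTILAYER_HASH
    # No fallback is configured, so only an alarm remains.
    return Action.MEMORY_ALARM


print(choose_action(total_bits=1000, multiple=10, table_threshold_bits=4096, hash_layers=2))
```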
In some embodiments, the data table expansion step further includes:
a capacity expansion request sending step in which the scheduling center sends a data table capacity expansion request to the control center;
a capacity expansion request processing step in which the control center receives the data table capacity expansion request, establishes according to it a new data table enlarged by the preset multiple, and performs data migration; specifically, the number of bits of the new data table is increased by the preset multiple relative to that of the data table, and the data migration copies the data of the data table recorded in the metadata table of the bloom filter and writes it into the new data table;
and a capacity expansion completion notification step in which the control center returns a capacity expansion completion notification to the scheduling center, to notify it that the new data table has been established and the data migration is complete.
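The request, processing and notification exchange between the two centers can be sketched as plain objects. All class and field names below are hypothetical, the exchange is modeled as a direct method call rather than any particular messaging mechanism, and the actual data migration is left as a comment because the text does not detail it.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class ExpansionRequest:
    table_name: str
    multiple: int


@dataclass
class ExpansionDone:
    table_name: str
    new_bits: int


class ControlCenter:
    def __init__(self, table_bits: Dict[str, int]) -> None:
        self.table_bits = table_bits  # table name -> current size in bits

    def handle(self, request: ExpansionRequest) -> ExpansionDone:
        # Processing step: size the new table by the preset multiple.
        new_bits = self.table_bits[request.table_name] * request.multiple
        # A real system would allocate the new table and migrate data here.
        self.table_bits[request.table_name] = new_bits
        # Notification step: report completion back to the caller.
        return ExpansionDone(request.table_name, new_bits)


class SchedulingCenter:
    def __init__(self, control: ControlCenter) -> None:
        self.control = control

    def expand(self, table_name: str, multiple: int) -> ExpansionDone:
        # Sending step: forward the expansion request to the control center.
        return self.control.handle(ExpansionRequest(table_name, multiple))


control = ControlCenter({"uv_filter": 100})
done = SchedulingCenter(control).expand("uv_filter", multiple=10)
print(done)
```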
In some embodiments, in the bloom filter adjusting step, if the scheduling center detects that the data table expanded by the preset multiple does not reach the preset table threshold and a multi-layer hash algorithm is preset for the bloom filter, the bloom filter still performs the multi-layer hash computation but does not enable the multi-layer hash data tables for determining whether data exists or for serving external access.
In a second aspect, an embodiment of the present application provides a bloom filter dynamic adjustment system, including:
an existing bit count detection module, configured to acquire the number of existing bits in the data table of the bloom filter and the ratio of the number of existing bits to the total number of bits in the data table;
a data table monitoring module, by which a monitoring center monitors in real time, according to a preset capacity expansion threshold and the ratio, whether the data table needs capacity expansion and, if so, sends a capacity expansion request to a scheduling center;
a capacity expansion detection module, by which the scheduling center receives the capacity expansion request and detects whether the data table would reach a preset table threshold after being expanded by a preset multiple according to the capacity expansion request;
and a bloom filter adjusting module, configured to expand the data table if the data table expanded by the preset multiple does not reach the preset table threshold, and otherwise to adjust the multi-layer hash algorithm or send a memory alarm notification.
Based on the above modules, capacity expansion of the bloom filter is realized dynamically by monitoring the number of bits in use in the data table. This replaces the original fixed, hard-coded sizing of the bloom filter, so that an excessively large memory region no longer needs to be preset, which effectively addresses the excessive memory consumption of existing bloom filters.
In some of these embodiments, the bloom filter adjusting module further comprises:
a data table capacity expansion module, by which, if the scheduling center detects that the data table expanded by the preset multiple does not reach the preset table threshold, a control center receives a data table capacity expansion request from the scheduling center, establishes a new table according to the request, and migrates the table data;
and a hash algorithm adjusting module, by which, if the scheduling center detects that the data table expanded by the preset multiple reaches the preset table threshold, indicating that the data table cannot be expanded, external access is provided through a multi-layer bloom filter or a memory alarm notification is sent, depending on whether a multi-layer hash algorithm is preset for the bloom filter. Specifically, if a multi-layer hash algorithm is preset for the bloom filter, whether data exists is determined according to the preset multi-layer hash algorithm and the result is provided for external access.
Based on the above modules, the embodiment of the application can automatically realize a multi-layer bloom filter according to the preset multi-layer hash algorithm while restricting the conditions under which the multi-layer hash algorithm is applied. This effectively mitigates data collisions, prevents overly frequent reverse derivation from hashes from interfering with the use of the bloom filter, and reduces memory use.
In some embodiments, the data table expansion module further includes:
a capacity expansion request sending module, by which the scheduling center sends a data table capacity expansion request to the control center;
a capacity expansion request processing module, by which the control center receives the data table capacity expansion request, establishes according to it a new data table enlarged by the preset multiple, and performs data migration; specifically, the number of bits of the new data table is increased by the preset multiple relative to that of the data table, and the data migration copies the data of the data table recorded in the metadata table of the bloom filter and writes it into the new data table;
and a capacity expansion completion notification module, by which the control center returns a capacity expansion completion notification to the scheduling center, to notify it that the new data table has been established and the data migration is complete.
In some embodiments, in the bloom filter adjusting module, if the scheduling center detects that the data table expanded by the preset multiple does not reach the preset table threshold and a multi-layer hash algorithm is preset for the bloom filter, the bloom filter still performs the multi-layer hash computation but does not enable the multi-layer hash data tables for determining whether data exists or for serving external access.
In a third aspect, an embodiment of the present application provides an electronic device, which includes a memory, a processor, and a computer program stored on the memory and executable on the processor, and the processor, when executing the computer program, implements the bloom filter dynamic adjustment method according to the first aspect.
In a fourth aspect, the present application provides a computer-readable storage medium, on which a computer program is stored, where the computer program is executed by a processor to implement the bloom filter dynamic adjustment method according to the first aspect.
Compared with the related art, the bloom filter dynamic adjustment method, system, electronic device and computer-readable storage medium provided by the embodiments of the application realize capacity expansion of the bloom filter dynamically by monitoring the number of bits in use in the data table. This replaces the original fixed, hard-coded sizing of the bloom filter, so that an excessively large memory region no longer needs to be preset, which effectively addresses the excessive memory consumption of existing bloom filters. Data collisions are effectively mitigated, overly frequent reverse derivation from hashes is prevented from interfering with the use of the bloom filter, and memory use is reduced.
The details of one or more embodiments of the application are set forth in the accompanying drawings and the description below to provide a more thorough understanding of the application.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the application and together with the description serve to explain the application and not to limit the application. In the drawings:
FIG. 1 is a schematic flow chart diagram of a dynamic bloom filter tuning method according to an embodiment of the present application;
FIG. 2 is a flow chart illustrating the sub-steps of a dynamic bloom filter tuning method according to an embodiment of the present application;
FIG. 3 is a schematic diagram of a bloom filter dynamic adjustment method according to an embodiment of the present application;
FIG. 4 is a block diagram of a bloom filter dynamic adjustment system according to an embodiment of the present application;
FIG. 5 is a block diagram of sub-modules of a bloom filter dynamic adjustment system according to an embodiment of the present application.
Description of reference numerals:
1. existing bit count detection module; 2. data table monitoring module; 3. capacity expansion detection module;
4. bloom filter adjusting module; 41. data table capacity expansion module; 42. hash algorithm adjusting module;
411. capacity expansion request sending module; 412. capacity expansion request processing module;
413. capacity expansion completion notification module.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application will be described and illustrated below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments provided in the present application without any inventive step are within the scope of protection of the present application.
It is obvious that the drawings in the following description are only examples or embodiments of the present application, and that it is also possible for a person skilled in the art to apply the present application to other similar contexts on the basis of these drawings without inventive effort. Moreover, it should be appreciated that in the development of any such actual implementation, as in any engineering or design project, numerous implementation-specific decisions must be made to achieve the developers' specific goals, such as compliance with system-related and business-related constraints, which may vary from one implementation to another.
Reference in the specification to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the specification. The appearances of the phrase in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. Those of ordinary skill in the art will explicitly and implicitly appreciate that the embodiments described herein may be combined with other embodiments without conflict.
Unless defined otherwise, technical or scientific terms referred to herein shall have the ordinary meaning as understood by those of ordinary skill in the art to which this application belongs. Reference to "a," "an," "the," and similar words throughout this application are not to be construed as limiting in number, and may refer to the singular or the plural. The present application is directed to the use of the terms "including," "comprising," "having," and any variations thereof, which are intended to cover non-exclusive inclusions; for example, a process, method, system, article, or apparatus that comprises a list of steps or modules (elements) is not limited to the listed steps or elements, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus. Reference to "connected," "coupled," and the like in this application is not intended to be limited to physical or mechanical connections, but may include electrical connections, whether direct or indirect. The term "plurality" as referred to herein means two or more. "and/or" describes an association relationship of associated objects, meaning that three relationships may exist, for example, "A and/or B" may mean: a exists alone, A and B exist simultaneously, and B exists alone. The character "/" generally indicates that the former and latter associated objects are in an "or" relationship. Reference herein to the terms "first," "second," "third," and the like, are merely to distinguish similar objects and do not denote a particular ordering for the objects.
This embodiment provides a bloom filter dynamic adjustment method. FIGS. 1 and 2 are flowcharts of the method according to an embodiment of the present application, and FIG. 3 is a schematic diagram of its principle. As shown in FIGS. 1 to 3, the method includes the following steps:
an existing bit count detection step S1 of acquiring the number of existing bits in the data table of the bloom filter and the ratio of the number of existing bits to the total number of bits in the data table, where the number of existing bits is the number of bit positions in the data table that already hold data; by way of example and not limitation, if the data table has 100 bits in total and 20 of them are existing bits, the ratio is one fifth;
a data table monitoring step S2 in which the monitoring center monitors in real time, according to a preset capacity expansion threshold and the ratio, whether the data table needs to be expanded; if the ratio reaches the preset capacity expansion threshold, the monitoring center determines that the data table needs to be expanded and sends a capacity expansion request to the scheduling center;
a capacity expansion detection step S3 in which the scheduling center receives the capacity expansion request and detects whether the data table would reach a preset table threshold after being expanded by a preset multiple according to the request, where the preset multiple may be 10 or any other natural number;
a bloom filter adjusting step S4 in which, if the data table expanded by the preset multiple does not reach the preset table threshold, the data table is expanded; otherwise, the multi-layer hash algorithm is adjusted or a memory alarm is sent.
Through the above steps, the embodiment of the application relies on two pre-built sub-modules, the monitoring center and the scheduling center. The monitoring center module monitors the data table of the bloom filter in real time and sends capacity expansion requests, and the scheduling center module processes those requests. Capacity expansion of the bloom filter is thus realized dynamically by monitoring the number of bits in use in the data table, which replaces the original fixed, hard-coded sizing of the bloom filter, removes the need to preset an excessively large memory region, and effectively addresses the excessive memory consumption of existing bloom filters.
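As a concrete illustration of step S1, the number of existing bits can be obtained by counting the set bits of the table. The sketch below assumes the data table is held as a Python `bytearray`; the helper names are hypothetical, and the 100-bit table with 20 set bits mirrors the example given in step S1.

```python
def set_bit(table: bytearray, index: int) -> None:
    """Mark one bit position as holding data."""
    table[index // 8] |= 1 << (index % 8)


def existing_bits(table: bytearray) -> int:
    """Count the bit positions that are already set."""
    return sum(bin(byte).count("1") for byte in table)


total_bits = 100
table = bytearray((total_bits + 7) // 8)  # 13 bytes cover the 100 logical bits
for index in range(0, total_bits, 5):     # set 20 distinct positions: 0, 5, ..., 95
    set_bit(table, index)

count = existing_bits(table)
ratio = count / total_bits
print(count, ratio)
```

Setting the same position twice does not change the count, which is why the ratio measures occupancy of the table rather than the number of inserted keys.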
In some of these embodiments, the bloom filter adjusting step S4 further includes:
a data table expansion step S41 in which, if the scheduling center detects that the data table expanded by the preset multiple does not reach the preset table threshold, a control center receives a data table capacity expansion request from the scheduling center, establishes a new table according to the request, and migrates the table data;
and a hash algorithm adjusting step S42 in which, if the scheduling center detects that the data table expanded by the preset multiple reaches the preset table threshold, indicating that the data table cannot be expanded, external access is provided through a multi-layer bloom filter or a memory alarm notification is sent, depending on whether a multi-layer hash algorithm is preset for the bloom filter. Specifically, if a multi-layer hash algorithm is preset for the bloom filter, whether data exists is determined according to the preset multi-layer hash algorithm and the result is provided for external access. By way of example and not limitation, if the user has preset a two-layer hash algorithm, a request is sent to establish a new two-layer hash data table of the same size, and the generated two-layer hash data table replaces the data table; the same applies up to an N-layer hash algorithm. Based on the above steps, the embodiment of the application can automatically realize a multi-layer bloom filter according to the preset multi-layer hash algorithm while restricting the conditions under which the multi-layer hash algorithm is applied. This effectively mitigates data collisions, prevents overly frequent reverse derivation from hashes from interfering with the use of the bloom filter, and reduces memory use.
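The text does not spell out how the layers of a multi-layer hash are populated or combined at query time. The sketch below shows one conventional arrangement, not necessarily the one intended here: each layer has its own table of the same size and its own hash (derived by prefixing the layer number before SHA-256), every key is written to all layers, and a key is reported as possibly present only if every layer has its bit set. The class and method names are hypothetical.

```python
import hashlib


class LayeredFilter:
    def __init__(self, bits_per_layer: int, layers: int) -> None:
        self.bits_per_layer = bits_per_layer
        # One same-sized bit table per hash layer.
        self.tables = [bytearray((bits_per_layer + 7) // 8) for _ in range(layers)]

    def _index(self, layer: int, key: str) -> int:
        # Derive a per-layer hash by prefixing the layer number (an assumption).
        digest = hashlib.sha256(f"{layer}:{key}".encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % self.bits_per_layer

    def add(self, key: str) -> None:
        for layer, table in enumerate(self.tables):
            i = self._index(layer, key)
            table[i // 8] |= 1 << (i % 8)

    def might_contain(self, key: str) -> bool:
        # Present only if every layer agrees; a single clear bit proves absence.
        for layer, table in enumerate(self.tables):
            i = self._index(layer, key)
            if not (table[i // 8] >> (i % 8)) & 1:
                return False
        return True


layered = LayeredFilter(bits_per_layer=1024, layers=2)
layered.add("user:42")
print(layered.might_contain("user:42"))
```

With this arrangement two keys collide only if they collide in every layer, at the cost of one additional table per layer, which is the memory trade-off noted in the background section.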
In some embodiments, the data table expansion step S41 further includes:
a capacity expansion request sending step S411 in which the scheduling center sends a data table capacity expansion request to the control center;
a capacity expansion request processing step S412 in which the control center receives the data table capacity expansion request, establishes according to it a new data table enlarged by the preset multiple, and performs data migration; specifically, the number of bits of the new data table is increased by the preset multiple relative to that of the data table, and the data migration copies the data of the data table recorded in the metadata table of the bloom filter and writes it into the new data table. Metadata, also called intermediary data or relay data, is data that describes data (data about data), mainly information describing data properties, and supports functions such as indicating storage locations, recording history data, resource lookup, and file recording. Metadata acts as an electronic catalog: to compile the catalog, the contents or characteristics of the data must be described and collected, which in turn assists data retrieval;
a capacity expansion completion notification step S413 in which the control center returns a capacity expansion completion notification to the scheduling center, to notify it that the new data table has been established and the data migration is complete.
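The text does not say how bit positions in the old table map onto the enlarged table. One known technique, shown below as an assumption rather than as the method of the application, works when a bit position is computed as the hash value modulo the table size and the new size is an integer multiple of the old one: each set bit at position i is replicated to positions i, i + m, i + 2m and so on, where m is the old size, so every previously inserted key still tests as present. The function name is hypothetical.

```python
def migrate_by_replication(old: bytearray, old_bits: int, multiple: int) -> bytearray:
    """Build a table `multiple` times larger that keeps every old key queryable."""
    new_bits = old_bits * multiple
    new = bytearray((new_bits + 7) // 8)
    for i in range(old_bits):
        if (old[i // 8] >> (i % 8)) & 1:
            # hash % new_bits always equals i + j * old_bits for some j,
            # so setting every replica guarantees no false negatives.
            for j in range(multiple):
                pos = i + j * old_bits
                new[pos // 8] |= 1 << (pos % 8)
    return new


old_table = bytearray([0b00000101])  # 8-bit table with bits 0 and 2 set
new_table = migrate_by_replication(old_table, old_bits=8, multiple=2)
print(list(new_table))
```

Replication keeps the fill ratio unchanged at the moment of migration; the larger table pays off only as later insertions each set a single bit among more positions.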
It should be noted that, in the bloom filter adjusting step S4, if the scheduling center detects that the data table expanded by the preset multiple does not reach the preset table threshold and a multi-layer hash algorithm is preset for the bloom filter, the bloom filter still performs the multi-layer hash computation but does not enable the multi-layer hash data tables for determining whether data exists or for serving external access.
It should be noted that the steps illustrated in the above-described flow diagrams or in the flow diagrams of the figures may be performed in a computer system, such as a set of computer-executable instructions, and that, although a logical order is illustrated in the flow diagrams, in some cases, the steps illustrated or described may be performed in an order different than here.
The present embodiment further provides a dynamic adjustment system for a bloom filter, which is used to implement the foregoing embodiments and preferred embodiments, and the description of the system that has been already made is omitted. As used hereinafter, the terms "module," "unit," "subunit," and the like may implement a combination of software and/or hardware for a predetermined function. While the system described in the embodiments below is preferably implemented in software, implementations in hardware, or a combination of software and hardware are also possible and contemplated.
Fig. 4 is a block diagram of a dynamic bloom filter adjusting system according to an embodiment of the present application, and fig. 5 is a block diagram of sub-modules of the dynamic bloom filter adjusting system according to an embodiment of the present application, as shown in fig. 4-5, the system includes:
the existing digit detection module 1, configured to acquire the existing digits (the number of bits already occupied) of the data table of the bloom filter and to obtain the ratio of the existing digits to the total digits of the data table; for example, but not by way of limitation, the total number of bits in the data table is 100 and the number of existing bits is 20;
the data table monitoring module 2, configured so that the monitoring center monitors in real time, according to a preset expansion threshold and the ratio, whether the data table needs to be expanded, and sends an expansion request to the scheduling center, wherein the monitoring center is a pre-constructed logical sub-module that monitors the data table of the bloom filter in real time and sends the expansion request, and the scheduling center is a pre-constructed logical sub-module with which the system processes the expansion request;
the capacity expansion detection module 3, configured to receive the expansion request and to detect, according to the request, whether the data table reaches a preset table threshold after being expanded by a preset multiple;
and the bloom filter adjusting module 4, configured to expand the data table if it does not reach the preset table threshold after being expanded by the preset multiple, and otherwise to adjust the multi-layer hash algorithm or send a memory alarm notification. Specifically, the bloom filter adjusting module 4 further includes: the data table capacity expansion module 41, configured so that, if the scheduling center detects that the data table does not reach the preset table threshold after being expanded by the preset multiple, a control center receives a data table expansion request from the scheduling center, establishes a new table according to that request, and performs table data migration; and the hash algorithm adjusting module 42, configured so that, if the scheduling center detects that the data table reaches the preset table threshold after being expanded by the preset multiple, which indicates that the data table cannot be expanded, either external access is provided through the multi-layer bloom filter or a memory alarm notification is sent, depending on whether the bloom filter is preset with a multi-layer hash algorithm. Specifically, if the bloom filter is preset with a multi-layer hash algorithm, whether data exists is determined according to the preset multi-layer hash algorithm and external access is served accordingly. For example, but not by way of limitation, if the user presets a two-layer hash algorithm, a request is sent to establish a new second-layer hash data table of the same size, the generated second-layer hash data table replaces the data table, and so on up to an N-layer hash algorithm.
Based on these modules, the embodiment of the application can automatically implement a multi-layer bloom filter according to the preset multi-layer hash algorithm, and places conditions on when the multi-layer hash algorithm is applied. This effectively addresses the problem of data collisions, avoids frequent reverse hash derivation that would affect the use of the bloom filter, and reduces memory usage.
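The decision flow across modules 1 to 4 can be illustrated with a short sketch. The names `Action`, `occupancy_ratio` and `decide`, the parameter names, and the exact comparison operators (for instance, treating "reaches the threshold" as greater than or equal) are assumptions made for illustration; the text does not fix them.

```python
from enum import Enum


class Action(Enum):
    NONE = "no_action"
    EXPAND = "expand_data_table"
    MULTILAYER = "use_multilayer_hash"
    ALARM = "memory_alarm"


def occupancy_ratio(existing_bits: int, total_bits: int) -> float:
    """Module 1: ratio of existing (occupied) bits to total bits of the data table."""
    return existing_bits / total_bits


def decide(
    existing_bits: int,
    total_bits: int,
    expansion_threshold: float,   # preset expansion threshold (a ratio)
    preset_multiple: int,         # preset expansion multiple
    table_threshold_bits: int,    # preset table threshold (assumed to be a bit count)
    has_multilayer_hash: bool,    # whether a multi-layer hash algorithm is preset
) -> Action:
    # Module 2: the monitoring center compares the ratio with the expansion threshold.
    if occupancy_ratio(existing_bits, total_bits) < expansion_threshold:
        return Action.NONE
    # Module 3: the scheduling center checks the size after expansion by the preset multiple.
    if total_bits * preset_multiple < table_threshold_bits:
        return Action.EXPAND        # module 41: threshold not reached, expand the table
    # Module 42: the table cannot be expanded any further.
    if has_multilayer_hash:
        return Action.MULTILAYER
    return Action.ALARM
```

With the 100-bit table holding 20 existing bits from the example above, and an assumed expansion threshold of 0.5, `decide(20, 100, 0.5, 2, 1000, False)` returns `Action.NONE`, since the ratio 0.2 is below the threshold.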
The data table capacity expansion module 41 further includes: the capacity expansion request sending module 411, configured so that the scheduling center sends a data table expansion request to the control center; the capacity expansion request processing module 412, configured so that the control center receives the data table expansion request, establishes, according to the request, a new data table increased by a preset multiple, and performs data migration, where, specifically, the number of bits of the new data table is increased by a preset multiple compared with the data table, and the data migration consists of copying, from the metadata table of the bloom filter, the data of the data table and writing it into the new data table; and the capacity expansion completion notification module 413, configured so that the control center returns an expansion completion notification to the scheduling center to inform it that the new data table has been established and the data migration is complete.
It should be noted that, in the bloom filter adjusting module 4, if the scheduling center detects that the data table does not reach the preset table threshold after being expanded by the preset multiple, and the bloom filter is preset with a multi-layer hash algorithm, the bloom filter performs the multi-layer hash computation but does not enable the multi-layer hash data table for determining whether data exists or for serving external access.
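One possible reading of this note is sketched below: the hashes for every configured layer are always computed, but only the first-layer table is used to answer queries until the multi-layer tables are explicitly enabled. The class `GatedLayeredFilter`, its methods, the `pending` record of precomputed positions, and the single salted SHA-256 hash per layer are all hypothetical and are not taken from the patent text.

```python
import hashlib
from typing import List


class GatedLayeredFilter:
    """Computes all layer hashes on every insert, but gates use of the extra tables."""

    def __init__(self, num_bits: int, num_layers: int):
        self.num_bits = num_bits
        self.num_layers = num_layers
        self.tables = [bytearray(num_bits)]    # only the first-layer table exists at first
        self.pending: List[List[int]] = []     # assumption: positions kept for later layers
        self.multilayer_enabled = False

    def _positions(self, key: str) -> List[int]:
        """One bit position per configured layer; always computed."""
        out = []
        for layer in range(self.num_layers):
            digest = hashlib.sha256(f"layer{layer}:{key}".encode("utf-8")).digest()
            out.append(int.from_bytes(digest[:8], "big") % self.num_bits)
        return out

    def add(self, key: str) -> None:
        positions = self._positions(key)
        self.tables[0][positions[0]] = 1
        if self.multilayer_enabled:
            for layer in range(1, self.num_layers):
                self.tables[layer][positions[layer]] = 1
        else:
            self.pending.append(positions[1:])  # computed, but no extra table is touched

    def enable_multilayer(self) -> None:
        """Called once the table threshold is reached: create same-size layer tables."""
        if self.multilayer_enabled:
            return
        for _ in range(1, self.num_layers):
            self.tables.append(bytearray(self.num_bits))
        for rest in self.pending:               # replay positions computed earlier
            for offset, pos in enumerate(rest):
                self.tables[offset + 1][pos] = 1
        self.pending.clear()
        self.multilayer_enabled = True

    def might_contain(self, key: str) -> bool:
        positions = self._positions(key)
        active = self.num_layers if self.multilayer_enabled else 1
        return all(self.tables[layer][positions[layer]] for layer in range(active))
```

Replaying the recorded positions when the layers are enabled keeps earlier insertions visible in the new tables, so enabling the layers does not introduce false negatives in this sketch.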
Based on these modules, capacity expansion of the bloom filter is performed dynamically by monitoring the number of bits of the data table, replacing the original hard-coded, fixed-size mechanism of the bloom filter. An overly large memory region no longer needs to be preset, which effectively addresses the excessive memory consumption of existing bloom filters.
The above modules may be functional modules or program modules, and may be implemented by software or hardware. For a module implemented by hardware, the modules may be located in the same processor; or the modules can be respectively positioned in different processors in any combination.
In addition, the bloom filter dynamic adjustment method described in the embodiment of the present application in conjunction with fig. 1 to 3 may be implemented by an electronic device. The electronic device may include a processor and a memory storing computer program instructions.
In particular, the processor may include a Central Processing Unit (CPU) or an Application-Specific Integrated Circuit (ASIC), or may be configured as one or more integrated circuits implementing the embodiments of the present application.
The memory may include, among other things, mass storage for data or instructions. By way of example, and not limitation, the memory may include a Hard Disk Drive (HDD), a floppy disk drive, a Solid State Drive (SSD), flash memory, an optical disk, a magneto-optical disk, magnetic tape, or a Universal Serial Bus (USB) drive, or a combination of two or more of these. The memory may include removable or non-removable (or fixed) media, where appropriate. The memory may be internal or external to the data processing apparatus, where appropriate. In a particular embodiment, the memory is a non-volatile memory. In particular embodiments, the memory includes Read-Only Memory (ROM) and Random Access Memory (RAM). The ROM may be mask-programmed ROM, Programmable ROM (PROM), Erasable PROM (EPROM), Electrically Erasable PROM (EEPROM), Electrically Alterable ROM (EAROM), or flash memory, or a combination of two or more of these, where appropriate. The RAM may be a Static Random-Access Memory (SRAM) or a Dynamic Random-Access Memory (DRAM), where the DRAM may be a Fast Page Mode DRAM (FPM DRAM), an Extended Data Out DRAM (EDO DRAM), a Synchronous DRAM (SDRAM), and the like.
The memory may be used to store or cache various data files for processing and/or communication use, as well as possibly computer program instructions for execution by the processor.
The processor reads and executes the computer program instructions stored in the memory to implement any one of the bloom filter dynamic adjustment methods in the above embodiments.
In addition, in combination with the bloom filter dynamic adjustment method in the foregoing embodiments, the embodiment of the present application may be implemented by providing a computer-readable storage medium. The computer-readable storage medium has computer program instructions stored thereon; the computer program instructions, when executed by a processor, implement any of the bloom filter dynamic adjustment methods in the above embodiments.
The technical features of the embodiments described above may be arbitrarily combined, and for the sake of brevity, all possible combinations of the technical features in the embodiments described above are not described, but should be considered as being within the scope of the present specification as long as there is no contradiction between the combinations of the technical features.
The above-mentioned embodiments express only several embodiments of the present application, and their description is relatively specific and detailed, but they should not be construed as limiting the scope of the invention. It should be noted that a person skilled in the art can make several variations and modifications without departing from the concept of the present application, all of which fall within the scope of protection of the present application. Therefore, the protection scope of the present patent shall be subject to the appended claims.

Claims (10)

1. A dynamic bloom filter adjusting method, characterized by comprising the following steps:
an existing digit detection step of acquiring the existing digits of a data table of the bloom filter and obtaining the ratio of the existing digits to the total digits of the data table;
a data table monitoring step, in which a monitoring center monitors in real time, according to a preset capacity expansion threshold and the ratio, whether the data table needs capacity expansion, and sends a capacity expansion request to a dispatching center;
a capacity expansion detection step, in which the dispatching center receives the capacity expansion request and detects, according to the capacity expansion request, whether the data table reaches a preset table threshold after being expanded by a preset multiple;
and a bloom filter adjusting step, in which the data table is expanded if it does not reach the preset table threshold after being expanded by the preset multiple, and otherwise a multi-layer hash algorithm is adjusted or a memory alarm notification is sent.
2. The dynamic bloom filter adjusting method as recited in claim 1, wherein the bloom filter adjusting step further comprises:
a data table capacity expansion step, in which, if the dispatching center detects that the data table does not reach the preset table threshold after being expanded by the preset multiple, a control center receives a data table capacity expansion request from the dispatching center, establishes a new table according to the data table capacity expansion request, and performs table data migration;
and a hash algorithm adjusting step, in which, if the dispatching center detects that the data table reaches the preset table threshold after being expanded by the preset multiple, external access is provided through a multi-layer bloom filter or a memory alarm notification is sent, depending on whether the bloom filter is preset with a multi-layer hash algorithm.
3. The dynamic bloom filter tuning method of claim 2, wherein the step of expanding the data table further comprises:
a step of sending a capacity expansion request, in which the dispatching center sends a data table capacity expansion request to the control center;
a capacity expansion request processing step, wherein the control center receives and establishes a new data table increased by a preset multiple according to the data table capacity expansion request and performs data migration;
and a step of informing the completion of capacity expansion, wherein the control center returns a notice of the completion of capacity expansion to the dispatching center.
4. The dynamic bloom filter adjusting method according to claim 2 or 3, wherein in the bloom filter adjusting step, if the dispatching center detects that the data table does not reach the preset table threshold after being expanded by the preset multiple, and the bloom filter is preset with a multi-layer hash algorithm, the bloom filter performs the multi-layer hash computation but does not enable the multi-layer hash data table for determining whether data exists or for serving external access.
5. A bloom filter dynamic adjustment system, comprising:
an existing digit detection module, configured to acquire the existing digits of the data table of the bloom filter and to obtain the ratio of the existing digits to the total digits of the data table;
a data table monitoring module, configured so that a monitoring center monitors in real time, according to a preset expansion threshold and the ratio, whether the data table needs to be expanded, and sends an expansion request to a dispatching center;
a capacity expansion detection module, configured so that the dispatching center receives the capacity expansion request and detects, according to the capacity expansion request, whether the data table reaches a preset table threshold after being expanded by a preset multiple;
and a bloom filter adjusting module, configured to expand the data table if it does not reach the preset table threshold after being expanded by the preset multiple, and otherwise to adjust a multi-layer hash algorithm or send a memory alarm notification.
6. The bloom filter dynamic adjustment system of claim 5, wherein the bloom filter adjustment module further comprises:
a data table capacity expansion module, configured so that, if the dispatching center detects that the data table does not reach the preset table threshold after being expanded by the preset multiple, a control center receives a data table capacity expansion request from the dispatching center, establishes a new table according to the data table capacity expansion request, and performs table data migration;
and a hash algorithm adjusting module, configured so that, if the dispatching center detects that the data table reaches the preset table threshold after being expanded by the preset multiple, external access is provided through a multi-layer bloom filter or a memory alarm notification is sent, depending on whether the bloom filter is preset with a multi-layer hash algorithm.
7. The dynamic bloom filter tuning system of claim 6, wherein the data table expansion module further comprises:
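a capacity expansion request sending module, by which the dispatching center sends a data table capacity expansion request to the control center;
a capacity expansion request processing module, by which the control center receives the data table capacity expansion request, establishes, according to the request, a new data table increased by a preset multiple, and performs data migration;
and a capacity expansion completion notification module, by which the control center returns a capacity expansion completion notification to the dispatching center.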
8. The dynamic bloom filter adjusting system according to claim 6 or 7, wherein in the bloom filter adjusting module, if the dispatching center detects that the data table does not reach the preset table threshold after being expanded by the preset multiple, and the bloom filter is preset with a multi-layer hash algorithm, the bloom filter performs the multi-layer hash computation but does not enable the multi-layer hash data table for determining whether data exists or for serving external access.
9. An electronic device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor implements the bloom filter dynamic adjustment method of any one of claims 1 to 4 when executing the computer program.
10. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, implements the bloom filter dynamic adjustment method according to any one of claims 1 to 4.
CN202110348319.3A 2021-03-31 2021-03-31 Bloom filter dynamic adjustment method, bloom filter dynamic adjustment system, electronic equipment and storage medium Active CN112925629B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110348319.3A CN112925629B (en) 2021-03-31 2021-03-31 Bloom filter dynamic adjustment method, bloom filter dynamic adjustment system, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110348319.3A CN112925629B (en) 2021-03-31 2021-03-31 Bloom filter dynamic adjustment method, bloom filter dynamic adjustment system, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN112925629A true CN112925629A (en) 2021-06-08
CN112925629B CN112925629B (en) 2023-10-20

Family

ID=76176820

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110348319.3A Active CN112925629B (en) 2021-03-31 2021-03-31 Bloom filter dynamic adjustment method, bloom filter dynamic adjustment system, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN112925629B (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105320654A (en) * 2014-05-28 2016-02-10 中国科学院深圳先进技术研究院 Dynamic bloom filter and element operating method based on same
CN106372190A (en) * 2016-08-31 2017-02-01 华北电力大学(保定) Method and device for querying OLAP (on-line analytical processing) in real time
CN106445944A (en) * 2015-08-06 2017-02-22 阿里巴巴集团控股有限公司 Data query request processing method and apparatus, and electronic device
US20170154099A1 (en) * 2015-11-26 2017-06-01 International Business Machines Corporation Efficient lookup in multiple bloom filters
CN107729535A (en) * 2017-11-17 2018-02-23 中国科学技术大学 The collocation method of Bloom filter in a kind of key value database
CN109828721A (en) * 2019-01-23 2019-05-31 平安科技(深圳)有限公司 Data-erasure method, device, computer equipment and storage medium
US10503737B1 (en) * 2015-03-31 2019-12-10 Maginatics Llc Bloom filter partitioning
CN112068958A (en) * 2020-08-31 2020-12-11 常州微亿智造科技有限公司 Bloom filter and data processing method

Patent Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105320654A (en) * 2014-05-28 2016-02-10 中国科学院深圳先进技术研究院 Dynamic bloom filter and element operating method based on same
US10503737B1 (en) * 2015-03-31 2019-12-10 Maginatics Llc Bloom filter partitioning
CN106445944A (en) * 2015-08-06 2017-02-22 阿里巴巴集团控股有限公司 Data query request processing method and apparatus, and electronic device
US20170154099A1 (en) * 2015-11-26 2017-06-01 International Business Machines Corporation Efficient lookup in multiple bloom filters
CN106372190A (en) * 2016-08-31 2017-02-01 华北电力大学(保定) Method and device for querying OLAP (on-line analytical processing) in real time
CN107729535A (en) * 2017-11-17 2018-02-23 中国科学技术大学 The collocation method of Bloom filter in a kind of key value database
CN109828721A (en) * 2019-01-23 2019-05-31 平安科技(深圳)有限公司 Data-erasure method, device, computer equipment and storage medium
WO2020151332A1 (en) * 2019-01-23 2020-07-30 平安科技(深圳)有限公司 Data deletion method and apparatus, computer device, and storage medium
CN112068958A (en) * 2020-08-31 2020-12-11 常州微亿智造科技有限公司 Bloom filter and data processing method

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
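YUHAN WU et al.: "Elastic Bloom Filter: Deletable and Expandable Filter Using Elastic Fingerprints", 《IEEE TRANSACTIONS ON COMPUTERS》, pages 984 - 991 *
樊星 (Fan Xing) et al.: "A novel blockchain bloom filter for blockchain applications" (区块链应用下的新型区块链布隆过滤器), 《Journal of Frontiers of Computer Science and Technology》 (计算机科学与探索), pages 1921 - 1929 *
王浩严 (Wang Haoyan): "Research on a hierarchical data deduplication technique" (一种分层次数据去冗技术研究), 《China Master's Theses Full-text Database, Information Science and Technology》 (中国优秀硕士学位论文全文数据库 信息科技辑), pages 137 - 63 *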

Also Published As

Publication number Publication date
CN112925629B (en) 2023-10-20

Similar Documents

Publication Publication Date Title
US11853549B2 (en) Index storage in shingled magnetic recording (SMR) storage system with non-shingled region
CN102782683B (en) Buffer pool extension for database server
US20200257450A1 (en) Data hierarchical storage and hierarchical query method and apparatus
WO2020093501A1 (en) File storage method and deletion method, server, and storage medium
CN111309258B (en) B + tree access method and device and computer readable storage medium
CN112799595B (en) Data processing method, device and storage medium
CN114265670B (en) Memory block sorting method, medium and computing device
CN109213450B (en) Associated metadata deleting method, device and equipment based on flash memory array
CN108762670B (en) Management method, system and device for data blocks in SSD (solid State disk) firmware
CN111414228A (en) Kubernetes-based method for managing storage space and related device
EP3385846B1 (en) Method and device for processing access request, and computer system
CN111880731A (en) Data processing method and device and related components
CN104965835A (en) Method and apparatus for reading and writing files of a distributed file system
CN113485874B (en) Data processing method and distributed storage system
US9395930B2 (en) Information processing system, control method of information processing system, and recording medium
US10083117B2 (en) Filtering write request sequences
CN111488128B (en) Method, device, equipment and medium for updating metadata
CN106354793B (en) Method and device for monitoring hot spot object
WO2019037587A1 (en) Data restoration method and device
CN112925629A (en) Bloom filter dynamic adjustment method, bloom filter dynamic adjustment system, electronic equipment and storage medium
CN111984650A (en) Storage method, system and related device of tree structure data
CN113626089B (en) Data operation method, system, medium and device based on BIOS (basic input output system)
CN114610243A (en) Thin volume conversion method, system, storage medium and equipment
CN115114239A (en) Distributed system data processing method, device, equipment and medium
CN107918654B (en) File decompression method and device and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant