CN112269670A - Data storage method, device and system and storage medium - Google Patents

Data storage method, device and system and storage medium

Info

Publication number
CN112269670A
Authority
CN
China
Prior art keywords
data
partition
target
pieces
partitions
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011195013.0A
Other languages
Chinese (zh)
Other versions
CN112269670B (en)
Inventor
贺宁
魏程琛
傅浩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chongqing Unisinsight Technology Co Ltd
Original Assignee
Chongqing Unisinsight Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chongqing Unisinsight Technology Co Ltd
Priority to CN202011195013.0A
Publication of CN112269670A
Application granted
Publication of CN112269670B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/54 Interprogram communication
    • G06F9/546 Message passing systems or structures, e.g. queues
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/23 Updating
    • G06F16/2308 Concurrency control
    • G06F16/2315 Optimistic concurrency control
    • G06F16/2322 Optimistic concurrency control using timestamps
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a data warehousing method, device, system and storage medium. The method comprises: acquiring a plurality of pieces of data written into a target data partition corresponding to a data acquisition device, where the target data partition and the data acquisition device have a corresponding relation; the data is obtained when the target microservice corresponding to the target data partition analyzes the data that the data acquisition device feeds into the microservice system; the plurality of pieces of data are not in time-series order; the target data partition is one of a plurality of data partitions, and the target microservice is one of a plurality of microservices; and storing the plurality of pieces of data into a database through the target microservice corresponding to the target data partition according to the timestamp information of each piece of data, so that the stored pieces of data are in time-series order. The method covers the scenario of storing analyzed data, which existing approaches do not cover, and preserves the time-series order of the data when out-of-order analyzed data is stored.

Description

Data storage method, device and system and storage medium
Technical Field
The invention relates to the field of data processing, in particular to a data warehousing method, device and system and a storage medium.
Background
In recent years, the Internet of Things and intelligent terminals have generated massive amounts of data. Traditional databases cannot meet the performance requirements of certain scenarios, such as querying the data of a given device over a given time period within massive data. Because user demand for querying a specified device over a specified time period keeps growing, time-series databases were introduced to solve these problems.
At present, time-series databases are designed for devices that connect to the database directly, that is, device data is stored directly without first being analyzed and processed. For data that must be analyzed and processed to extract its value, however, the time-series order of the data cannot be guaranteed across multiple services and multiple processes, so data is stored slowly and the requirement cannot be met.
Disclosure of Invention
In view of this, the present invention provides a data warehousing method, device, system and storage medium, which aim to improve warehousing efficiency and preserve the time-series order of data when data that must be analyzed and processed to extract its value is warehoused.
The technical scheme of the invention is as follows:
In a first aspect, the present invention provides a data warehousing method applied to a microservice system, where the microservice system includes microservices, a distributed message component and a database, and the distributed message component has a plurality of data partitions. The method includes: acquiring a plurality of pieces of data written into a target data partition corresponding to a data acquisition device; the target data partition and the data acquisition device have a corresponding relation; the data is obtained when the target microservice corresponding to the target data partition analyzes the data that the data acquisition device feeds into the microservice system; the plurality of pieces of data are not in time-series order; the target data partition is one of the plurality of data partitions; the target microservice is one of the plurality of microservices; and storing the plurality of pieces of data into the database through the target microservice corresponding to the target data partition according to the timestamp information of each piece of data, so that the stored pieces of data are in time-series order.
In a second aspect, the present invention provides a data warehousing apparatus disposed in a microservice system, where the microservice system includes microservices, a distributed message component and a database, and the distributed message component has a plurality of data partitions. The apparatus includes: an acquisition module for acquiring a plurality of pieces of data written into a target data partition corresponding to a data acquisition device; the target data partition and the data acquisition device have a corresponding relation; the data is obtained when the target microservice corresponding to the target data partition analyzes the data that the data acquisition device feeds into the microservice system; the plurality of pieces of data are not in time-series order; the target data partition is one of the plurality of data partitions; the target microservice is one of the plurality of microservices; and a storage module for storing the plurality of pieces of data into the database through the target microservice corresponding to the target data partition according to the timestamp information of each piece of data, so that the stored pieces of data are in time-series order.
In a third aspect, the present invention provides a system including microservices, a distributed message component and a database, where the distributed message component has a plurality of data partitions. The system is configured to: acquire a plurality of pieces of data written into a target data partition corresponding to a data acquisition device; the target data partition and the data acquisition device have a corresponding relation; the data is obtained when the target microservice corresponding to the target data partition analyzes the data that the data acquisition device feeds into the system; the plurality of pieces of data are not in time-series order; the target data partition is one of the plurality of data partitions; the target microservice is one of the plurality of microservices; and store the plurality of pieces of data into the database through the target microservice corresponding to the target data partition according to the timestamp information of each piece of data, so that the stored pieces of data are in time-series order.
In a fourth aspect, the present invention provides a storage medium storing a computer program which, when executed by a processor, implements the data warehousing method of the first aspect.
The embodiments of the invention provide a data warehousing method, device, system and storage medium. The method comprises: acquiring a plurality of pieces of data written into a target data partition corresponding to a data acquisition device, where the target data partition and the data acquisition device have a corresponding relation; the data is obtained when the target microservice corresponding to the target data partition analyzes the data that the data acquisition device feeds into the microservice system; the plurality of pieces of data are not in time-series order; the target data partition is one of a plurality of data partitions, and the target microservice is one of a plurality of microservices; and storing the plurality of pieces of data into a database through the target microservice corresponding to the target data partition according to the timestamp information of each piece of data, so that the stored pieces of data are in time-series order. The method covers the scenario of storing analyzed data, which existing approaches do not cover, and preserves the time-series order of the data when out-of-order analyzed data is stored. The difference from the prior art is that, in the prior art, the time-series order of data cannot be guaranteed while analyzed data is processed, and the data becomes mixed together.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed to be used in the embodiments will be briefly described below, it should be understood that the following drawings only illustrate some embodiments of the present invention and therefore should not be considered as limiting the scope, and for those skilled in the art, other related drawings can be obtained according to the drawings without inventive efforts.
Fig. 1 is a schematic flowchart of a data warehousing method according to an embodiment of the present invention;
Fig. 2 is a schematic flowchart of a possible implementation of step S107 according to an embodiment of the present invention;
Fig. 3 is a schematic flowchart of another data warehousing method according to an embodiment of the present invention;
Fig. 4 is a schematic flowchart of a possible implementation of step S102 according to an embodiment of the present invention;
Fig. 5 is a schematic flowchart of another data warehousing method according to an embodiment of the present invention;
Fig. 6 is a schematic flowchart of a possible implementation of step S105 according to an embodiment of the present invention;
Fig. 7 is a schematic flowchart of another data warehousing method according to an embodiment of the present invention;
Fig. 8 is a schematic view of a scenario according to an embodiment of the present invention;
Fig. 9 is a functional block diagram of a data warehousing apparatus according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. The components of embodiments of the present invention generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations.
Thus, the following detailed description of the embodiments of the present invention, presented in the figures, is not intended to limit the scope of the invention, as claimed, but is merely representative of selected embodiments of the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures.
In the description of the present invention, it should be noted that if the terms "upper", "lower", "inside", "outside", etc. indicate an orientation or a positional relationship based on that shown in the drawings or that the product of the present invention is used as it is, this is only for convenience of description and simplification of the description, and it does not indicate or imply that the device or the element referred to must have a specific orientation, be constructed in a specific orientation, and be operated, and thus should not be construed as limiting the present invention.
Furthermore, the appearances of the terms "first," "second," and the like, if any, are used solely to distinguish one from another and are not to be construed as indicating or implying relative importance.
It should be noted that the features of the embodiments of the present invention may be combined with each other without conflict.
At present, the Internet of Things and intelligent terminals generate massive amounts of data, and traditional databases cannot meet the performance requirements of certain scenarios. For example, when the data of a given device over a given time period is queried within massive data, the data block holding that data cannot be identified, so all data must be traversed. The query wastes a great deal of time and effort and falls far short of user requirements. Video surveillance likewise involves front-end devices that generate data continuously, and user demand for querying a specified device over a specified time period keeps growing, so time-series databases were introduced to solve these problems. However, current time-series databases are designed for devices that connect to the database directly, that is, device data is stored directly without first being analyzed and processed.
However, the inventors found during their research that for some media data, such as video data, image data and game data, storing the raw data directly is of little practical use, whereas the feature data obtained after the raw data is parsed and analyzed is far more valuable. In the prior art, when such analyzed data is warehoused, the time-series order of the data cannot be guaranteed, warehousing is slow, and the requirement cannot be met. For example, suppose a data acquisition device acquires data 1, data 2 and data 3 at times t1, t2 and t3, with, for example, t1 < t2 < t3, and all three are fed into the system at the same time. Data 1, 2 and 3 may then be analyzed by 3 different microservices. Because the microservices differ in processing capacity and speed, the times at which data 1, 2 and 3 arrive at the database to wait for warehousing after analysis may satisfy T1 > T2 > T3. If the data is warehoused in that arrival order, the time-series order of data 1, 2 and 3 in the database is clearly wrong.
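The out-of-order effect described above can be illustrated with a minimal Python sketch. The record layout and the numeric values are hypothetical and chosen only for illustration; the sketch assumes acquisition times t1 < t2 < t3 and arrival times T1 > T2 > T3.

```python
# Hypothetical records: (name, acquisition timestamp t, arrival time T at the database).
records = [
    ("data 1", 1, 30),  # acquired first, but its microservice finishes last
    ("data 2", 2, 20),
    ("data 3", 3, 10),  # acquired last, but its microservice finishes first
]

# Warehousing in arrival order ignores the acquisition timestamps.
arrival_order = [name for name, _, _ in sorted(records, key=lambda r: r[2])]

# Warehousing by acquisition timestamp restores the time-series order.
timestamp_order = [name for name, _, _ in sorted(records, key=lambda r: r[1])]

print(arrival_order)
print(timestamp_order)
```

Sorting by arrival time reverses the three records, while sorting by acquisition timestamp recovers the order in which the device produced them.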
Therefore, to solve the above technical problem, the inventors propose a data warehousing method whose processing target is data that has already been analyzed. The method ensures that data left out of order by the analysis is stored in the database in time-series order, with high warehousing efficiency.
Referring to fig. 1, fig. 1 is a schematic flowchart of a data warehousing method according to an embodiment of the present invention, where the method may be applied to a microservice system, where the microservice system includes a microservice, a distributed message component and a database, and the distributed message component has a plurality of data partitions, and the method may include:
S106, acquiring a plurality of pieces of data written into the target data partition corresponding to the data acquisition device; the target data partition and the data acquisition device have a corresponding relation;
In the embodiment of the invention, the data is obtained when the target microservice corresponding to the target data partition analyzes the data that the data acquisition device feeds into the microservice system; the plurality of pieces of data are not in time-series order; the target data partition is one of a plurality of data partitions; the target microservice is one of a plurality of microservices;
in a possible embodiment, the data partition may correspond to at least one data acquisition device; the data accessed to the system by each data acquisition device may be parsed by a plurality of microservices.
S107, storing the plurality of pieces of data into a database through target microservices corresponding to target data partitions according to the timestamp information corresponding to the plurality of pieces of data; and the plurality of pieces of data after being put in storage have a time sequence order.
The data warehousing method provided by the embodiment of the invention comprises: acquiring a plurality of pieces of data written into a target data partition corresponding to at least one data acquisition device, where the target data partition and the data acquisition device have a corresponding relation; the data is obtained when the target microservice corresponding to the target data partition analyzes the data that the data acquisition device feeds into the microservice system; the plurality of pieces of data are not in time-series order; the target data partition is one of a plurality of data partitions, and the target microservice is one of a plurality of microservices; and storing the plurality of pieces of data into a database through the target microservice corresponding to the target data partition according to the timestamp information of each piece of data, so that the stored pieces of data are in time-series order. The difference from the prior art is that prior-art warehousing implementations do not cover the scenario of storing analyzed data: the time-series order of the data cannot be guaranteed while analyzed data is processed, and the data becomes mixed together.
Optionally, in order to improve the efficiency of data storage and ensure the timing of data storage, a possible implementation is given below, referring to fig. 2, where fig. 2 is a schematic flowchart of an implementation of step S107 provided in an embodiment of the present invention, that is, step S107 may include the following sub-steps:
S107-1, obtaining a time-series data set according to the timestamps of the plurality of pieces of data;
It is to be understood that the time-series data set is the data set obtained after the plurality of pieces of data are sorted by time. In some possible implementations, the pieces of data may be sorted chronologically.
S107-2, when the database holds no data of the data acquisition device, inserting the data in the data set into the database in time-series order.
S107-3, when the database already holds warehoused data of the data acquisition device, warehousing, according to a preset import strategy, the data in the data set whose timestamps are smaller than the timestamp of the warehoused data, and inserting the data whose timestamps are larger than the timestamp of the warehoused data into the database in time-series order.
It can be understood that whether the data to be warehoused is out of order is determined by checking the time of the data already warehoused for the data acquisition device. When the database holds no data of the data acquisition device, the data in the data set is inserted in time-series order. When it does, the data whose timestamps are smaller than the timestamp of the warehoused data is imported according to the import strategy, which speeds up warehousing.
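Sub-steps S107-1 to S107-3 can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the patented implementation: the `Record` shape, the in-memory list standing in for one device's rows in the database, and the `merge_late` import strategy are all hypothetical, since the source does not specify the preset import strategy. Records whose timestamp equals the latest stored timestamp are treated as in-order here, which is also an assumption.

```python
import bisect
from typing import Callable, List, Tuple

Record = Tuple[int, str]  # (timestamp, payload); hypothetical record shape


def merge_late(stored: List[Record], late: List[Record]) -> None:
    """Hypothetical import strategy: insert each late record at its sorted position."""
    for record in late:
        bisect.insort(stored, record)


def store_in_time_order(
    batch: List[Record],
    stored: List[Record],
    import_late: Callable[[List[Record], List[Record]], None] = merge_late,
) -> None:
    """Store `batch` into `stored`, a sorted list standing in for one device's rows."""
    ordered = sorted(batch, key=lambda r: r[0])  # S107-1: time-series data set
    if not stored:
        stored.extend(ordered)  # S107-2: no data for this device yet
        return
    latest = stored[-1][0]  # timestamp of the newest warehoused record
    late = [r for r in ordered if r[0] < latest]
    on_time = [r for r in ordered if r[0] >= latest]
    if late:
        import_late(stored, late)  # S107-3: older data goes through the import strategy
    stored.extend(on_time)  # S107-3: newer data is appended in time order


rows: List[Record] = []
store_in_time_order([(3, "c"), (1, "a"), (2, "b")], rows)
store_in_time_order([(5, "e"), (0, "z"), (4, "d")], rows)
print([ts for ts, _ in rows])
```

Splitting the batch at the newest stored timestamp means the common case (newer data) is a plain append, and only late records pay for a positional insert.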
Optionally, in order to quickly and accurately obtain the data that the data acquisition device writes into the data partition, a possible implementation is given on the basis of Fig. 1. Referring to Fig. 3, Fig. 3 is a schematic flowchart of another data warehousing method provided by the embodiment of the present invention; that is, before step S106, the method further includes:
S101, acquiring the device identifier set corresponding to all data acquisition devices and the partition identifier set corresponding to all data partitions.
In some possible implementations, the device identifier set may be a set of device identifiers corresponding to all data acquisition devices connected to the system, where the device identifier may be a device ID, and the partition identifier may be a partition number.
In some possible embodiments, after the user adds the device identification information in the system, the producer service obtains the number of data partitions in the distributed message component (e.g., Kafka middleware) and the partition identifier corresponding to each data partition, which together form the partition identifier set.
S102, establishing a first corresponding relation between each partition identification and at least one equipment identification.
It can be understood that each partition identifier may correspond to at least one device identifier, which may save the memory used by the data partition, and the system may store the data of the device in the corresponding data partition all the time by maintaining the corresponding relationship between the device and the data partition, thereby facilitating the subsequent data storage and data query of the same data acquisition device.
S103, writing the plurality of pieces of data into the target data partitions corresponding to the data acquisition equipment according to the first corresponding relation.
It can be understood that all data of the same data acquisition device is written into the same data partition, so that it can subsequently be processed by the same microservice. Time comparison and selection then only need to be performed within that microservice, which speeds up data warehousing.
It can be understood that all data of the data acquisition equipment can be stored in the same corresponding data partition by maintaining the corresponding relation between the data acquisition equipment and the data partition, so that all data of the equipment can be quickly obtained from the same data partition according to the equipment identification in the data storage process, and the time for obtaining the equipment data is saved.
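The routing in step S103 can be sketched as follows. This is a minimal sketch that assumes the first corresponding relation is already available as a dictionary and uses in-memory lists in place of the partitions of the distributed message component; the `Record` shape and the device names are hypothetical.

```python
from typing import Dict, List, Tuple

Record = Tuple[str, int, str]  # (device_id, timestamp, payload); hypothetical shape


def write_to_partitions(
    records: List[Record],
    device_to_partition: Dict[str, int],
    partitions: Dict[int, List[Record]],
) -> None:
    """Write each record to the partition mapped to its device (step S103)."""
    for record in records:
        partition_id = device_to_partition[record[0]]
        partitions[partition_id].append(record)


# Hypothetical first corresponding relation: device identifier -> partition identifier.
device_to_partition = {"cam-1": 0, "cam-2": 1}
partitions: Dict[int, List[Record]] = {0: [], 1: []}

write_to_partitions(
    [("cam-1", 2, "b"), ("cam-2", 1, "x"), ("cam-1", 1, "a")],
    device_to_partition,
    partitions,
)
print(partitions)
```

Because the lookup depends only on the device identifier, every record of one device lands in a single partition regardless of the order in which records arrive.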
Optionally, in the process of establishing the corresponding relationship between the data partitions and the data acquisition devices, in order to balance the number of devices corresponding to each data partition, a possible implementation manner is given below, referring to fig. 4, where fig. 4 is a schematic flow chart of a possible implementation manner of step S102 provided in an embodiment of the present invention, that is, step S102 may include the following sub-steps:
S102-1, when the total number of device identifiers is smaller than or equal to the total number of partition identifiers, allocating one partition identifier to each device identifier, and establishing a first corresponding relation between each partition identifier and one device identifier;
S102-2, when the total number of device identifiers is larger than the total number of partition identifiers and the number of device identifiers corresponding to at least one partition identifier is smaller than the number corresponding to any other partition identifier in the partition identifier set, allocating device identifiers to the at least one partition identifier until the numbers of device identifiers corresponding to all partition identifiers are consistent, and establishing a first corresponding relation between each partition identifier and at least one device identifier.
It can be understood that, in the embodiment of the present invention, a polling allocation policy may be used to allocate device identifiers to the partition identifiers, which keeps the data partitions balanced. When the number of device identifiers is greater than the total number of partitions, the method first checks whether the number of devices of a data partition is smaller than the average value. If so, a device originally assigned to that partition is considered to have been deleted, and the newly added device is added directly to that data partition, so that the number of devices corresponding to each data partition stays balanced.
In a possible implementation, the average value is the quotient of the total number of devices divided by the number of partitions, rounded to an integer. If the number of devices corresponding to a partition is smaller than this integer, a device of that partition is considered to have been deleted. For example, assume there are 6 data partitions with partition identifiers 0 to 5 and 2 data acquisition devices with device identifiers 1 and 2. In this case, partition identifier 0 may be assigned to device identifier 1 and partition identifier 2 to device identifier 2. As a further example, assume there are 5 data acquisition devices and 4 data partitions, so that the rounded quotient is 1. After the first round of polling allocation, data partition 0 has 2 device identifiers and data partitions 1, 2 and 3 have 1 each. If the number of devices corresponding to any partition later falls below 1, a device allocated in the first round has been deleted. Supposing the affected partition is data partition 3, a new device identifier is first allocated to data partition 3, and subsequent device identifiers are then allocated to data partitions 0, 1, 2 and 3 in turn by polling.
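The balancing behaviour in the 5-device, 4-partition example can be approximated with a least-loaded rule, sketched below. Note that this swaps in a simplification: instead of comparing each partition against the rounded average, the sketch assigns every new device to the partition that currently has the fewest devices, breaking ties by the lowest partition identifier. The function name and device names are hypothetical.

```python
from typing import Dict, List


def assign_device(assignment: Dict[int, List[str]], device_id: str) -> int:
    """Assign a device to the partition with the fewest devices (lowest id on ties)."""
    target = min(assignment, key=lambda p: (len(assignment[p]), p))
    assignment[target].append(device_id)
    return target


# 4 data partitions, 5 devices, as in the example above.
assignment: Dict[int, List[str]] = {p: [] for p in range(4)}
for n in range(1, 6):
    assign_device(assignment, f"dev{n}")
counts_after_first_round = [len(assignment[p]) for p in range(4)]

# A device in partition 3 is deleted; the next new device fills that gap first.
assignment[3].remove("dev4")
refilled_partition = assign_device(assignment, "dev6")

print(counts_after_first_round, refilled_partition)
```

With no deletions the rule degenerates to polling over partitions 0, 1, 2, 3 in turn, and after a deletion it routes the next device to the depleted partition, which matches the outcome described in the example.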
Optionally, before warehousing the multiple pieces of data, in order to ensure that the data is consumed and warehoused by the same microservice, a possible implementation manner is further provided below, referring to fig. 5 on the basis of fig. 1, where fig. 5 is a schematic flow chart of another data warehousing method provided by an embodiment of the present invention, that is, before step S106, the method further includes:
S104, establishing a second corresponding relation between each microservice and at least one data partition;
S105, consuming, through the target microservice, the data in the target data partition corresponding to the target microservice according to the second corresponding relation.
In some possible embodiments, the data partitions in the distributed message component may be allocated to the consuming microservices in a round-robin manner. For example, with 3 microservices and 60 data partitions, each partition identifier is assigned in turn to microservice 1, microservice 2 and microservice 3, and a counter is incremented until it equals the number of data partitions. As a result, the partition identifiers of the data partitions corresponding to microservice 1 are 0, 3, 6, 9, …, and similarly those corresponding to microservice 2 are 1, 4, 7, 10, ….
In some possible embodiments, the second corresponding relation may be written to the micro-cloud shared storage. After the microservices and the data partitions have been matched, the relation is stored in the home path, which ensures that every microservice obtains the same information for use in disaster recovery.
It can be understood that by maintaining the corresponding relationship between the micro service and the data partition and corresponding the micro service and the data partition, it can be ensured that data of the same data acquisition device is consumed by the same micro service.
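The round-robin allocation of step S104 can be sketched as follows for the example of 3 microservices and 60 data partitions. The function name and the service labels are hypothetical; the sketch only builds the mapping in memory and does not touch a real message component.

```python
from typing import Dict, List


def build_second_correspondence(
    service_ids: List[str], partition_count: int
) -> Dict[str, List[int]]:
    """Assign partition identifiers 0..partition_count-1 to services in round-robin order."""
    mapping: Dict[str, List[int]] = {service: [] for service in service_ids}
    for partition_id in range(partition_count):
        service = service_ids[partition_id % len(service_ids)]
        mapping[service].append(partition_id)
    return mapping


services = ["microservice 1", "microservice 2", "microservice 3"]
second_correspondence = build_second_correspondence(services, 60)
print(second_correspondence["microservice 1"][:4])
print(second_correspondence["microservice 2"][:4])
```

Each partition appears under exactly one microservice, so the data of one device, which lives in one partition, is always consumed by the same microservice.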
Optionally, to keep the consumption capacity of a microservice under control while it consumes data, a possible implementation is given below. Referring to Fig. 6, Fig. 6 is a schematic flowchart of a possible implementation of step S105 provided by the embodiment of the present invention.
S105-1, when the total pre-consumption data amount of the micro-service is smaller than the data amount in the target data partition, consuming, through the micro-service, an amount of data in the target data partition equal to the total pre-consumption data amount.
S105-2, when the total pre-consumption data amount is greater than or equal to the total data amount in all the target data partitions corresponding to the micro-service, consuming the data in all the target data partitions through the micro-service.
It can be understood that the microservice may consume the data in its corresponding data partitions according to different consumption information. Assume that the microservice consumes 1000 pieces of data each time: if the total data amount in all of its data partitions is less than 1000, the microservice consumes the data in all of the corresponding data partitions. If the data amount of a single data partition is greater than 1000, then after 1000 pieces of data in that partition have been consumed, that partition is skipped in the next consumption and the other partitions are consumed in a round-robin manner until the data of all partitions has been consumed.
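One possible reading of steps S105-1 and S105-2 is sketched below. It is an interpretation under stated assumptions, not the patented implementation: the function `consume_round`, the in-memory `deque` partitions, and the rule that each round reads from a single partition before rotating to the next are all hypothetical choices made for illustration.

```python
from collections import deque


def consume_round(partitions, order, batch_size=1000):
    """Consume one batch from the partitions owned by a microservice.

    partitions: dict mapping partition id -> deque of pending records
    order:      deque of partition ids, rotated so that the partition read
                in this round is visited last in the following rounds
    """
    total = sum(len(queue) for queue in partitions.values())
    if total <= batch_size:
        # S105-2: the batch covers everything, so drain all partitions.
        batch = []
        for partition_id in order:
            batch.extend(partitions[partition_id])
            partitions[partition_id].clear()
        return batch
    # S105-1: take at most batch_size records from the next non-empty
    # partition, then move on so other partitions get their turn.
    for _ in range(len(order)):
        partition_id = order[0]
        order.rotate(-1)
        queue = partitions[partition_id]
        if queue:
            take = min(batch_size, len(queue))
            return [queue.popleft() for _ in range(take)]
    return []


# Two partitions holding 2500 and 300 records respectively.
partitions = {0: deque(range(2500)), 1: deque(range(300))}
order = deque(partitions)
sizes = []
while any(partitions.values()):
    sizes.append(len(consume_round(partitions, order)))
print(sizes)  # [1000, 300, 1000, 500]
```

In this run the large partition is read first, skipped in the second round in favour of the small partition, and revisited afterwards; the last round drains the remainder because it fits within one batch.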
Optionally, once the corresponding relation between devices and data partitions has been established, the data of a given device can be queried quickly and accurately according to that relation. A possible implementation is provided below on the basis of fig. 3. Referring to fig. 7, fig. 7 is a schematic flow chart of another data warehousing method provided by an embodiment of the present invention, and the method further includes:
S108, receiving a data query request, wherein the data query request comprises at least one device identifier and a time identifier.
S109, determining a partition identifier corresponding to the equipment identifier according to the equipment identifier, and obtaining target data in a data partition corresponding to the partition identifier according to the time identifier.
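Steps S108 and S109 can be illustrated with the sketch below. The names `query_device_data`, `device_to_partition`, and the record fields are hypothetical, and the time identifier is assumed here to be an inclusive `(start, end)` interval; the patent does not specify its form.

```python
def query_device_data(device_to_partition, partitions, device_ids, start, end):
    """Look up each device's partition, then filter its records by time.

    device_to_partition: dict mapping device id -> partition id
                         (the first corresponding relation)
    partitions:          dict mapping partition id -> list of record dicts
    start, end:          assumed inclusive time interval of the request
    """
    results = []
    for device_id in device_ids:
        partition_id = device_to_partition[device_id]
        for record in partitions[partition_id]:
            if record["device_id"] == device_id and start <= record["timestamp"] <= end:
                results.append(record)
    return results


# Hypothetical sample data: two devices sharing partition 0.
device_to_partition = {"cam-1": 0, "cam-2": 0, "cam-3": 1}
partitions = {
    0: [
        {"device_id": "cam-1", "timestamp": 100, "value": "a"},
        {"device_id": "cam-2", "timestamp": 105, "value": "b"},
        {"device_id": "cam-1", "timestamp": 250, "value": "c"},
    ],
    1: [{"device_id": "cam-3", "timestamp": 120, "value": "d"}],
}
hits = query_device_data(device_to_partition, partitions, ["cam-1"], 90, 200)
print([record["value"] for record in hits])  # ['a']
```

Because the device identifier resolves directly to one partition, only that partition is scanned, which is the source of the fast lookup described above.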
Optionally, after the first corresponding relation between devices and data partitions and the second corresponding relation between consumption services and data partitions have been established, disaster recovery may be performed according to both relations. The production service obtains the first corresponding relation between device identifiers and partition identifiers from the database, so that data production is recovered first; the consumption service obtains the second corresponding relation between consumption services and data partitions from the shared storage component and resumes data consumption, which reduces the pressure on the middleware.
To facilitate understanding of the overall data warehousing flow described above, a scene schematic diagram is given below. Referring to fig. 8, fig. 8 is a scene schematic diagram provided by an embodiment of the present invention.
As shown in FIG. 8, after the system obtains the device information of the data collection devices, the public service first creates database tables by device ID (which information is in the database table). The producer service obtains the number of data partitions of the current distributed message component and calculates the number of devices whose data must be stored in each data partition. The producer service then distributes the devices according to that number and stores the device data in the corresponding data partitions. After data production, other services consume the data. Before consumption, according to the number of services (according to …) and the number of data partitions, the services are assigned to the data partitions in a round-robin manner until every data partition corresponds to a consuming service, and the corresponding relation between the microservices and the data partitions is stored in the database to improve reliability. During consumption, because tables are created per device, the consumption service first sorts each batch of data by device and caches the latest warehousing time of each device. If the time of a piece of data is not greater than the latest warehousing time of its device, the data is handled by the out-of-order process; if the time of the new data is greater than the latest warehousing time of the device, the data is stored directly.
To implement the steps of the foregoing embodiments and achieve the corresponding technical effects, an implementation of a data warehousing device is given below. Referring to fig. 9, fig. 9 is a functional block diagram of a data warehousing device provided by an embodiment of the present invention. The device 20 may be applied to a micro-service system that includes a plurality of microservices, a distributed message component and a database, the distributed message component having a plurality of data partitions, and the device 20 includes an obtaining module 201 and a warehousing module 202;
an obtaining module 201, configured to obtain multiple pieces of data written into a target data partition corresponding to a data acquisition device; the target data partition and the data acquisition equipment have a corresponding relation; the data is obtained by analyzing the data of the data acquisition equipment accessing the micro-service system by the target micro-service corresponding to the target data partition; the plurality of pieces of data do not have a time sequence order; the target data partition is one of a plurality of data partitions; the target microservice is one of a plurality of microservices;
the warehousing module 202 is configured to store the plurality of pieces of data into a database through the target microservice corresponding to the target data partition according to the timestamp information corresponding to each of the plurality of pieces of data; and the plurality of pieces of data after being put in storage have a time sequence order.
It is understood that the obtaining module 201 and the warehousing module 202 may be used to perform steps S106 and S107 to achieve the corresponding technical effects.
Optionally, the apparatus 20 further includes an establishing module and a writing module. The obtaining module 201 is further configured to obtain the device identifier set corresponding to all data acquisition devices and the partition identifier set corresponding to all data partitions; the establishing module is configured to establish a first corresponding relation between each partition identifier and at least one device identifier; and the writing module is configured to write the plurality of pieces of data into the target data partitions corresponding to the data acquisition devices according to the first corresponding relation.
It is understood that the obtaining module 201, the establishing module and the writing module may be used to execute steps S101-S103 to achieve the corresponding technical effects. The establishing module may also execute steps S102-1 to S102-2 to achieve the corresponding technical effects.
Optionally, the data warehousing device 20 further includes a consumption module, and the establishing module is further configured to establish a second corresponding relation between each micro service and at least one data partition; the consumption module is configured to consume the data in the target data partition corresponding to the target micro service through the target micro service according to the second corresponding relation.
It is understood that the establishing module and the consumption module may be used to execute steps S104-S105 to achieve the corresponding technical effects.
Optionally, the establishing module is specifically configured to: when the total number of device identifiers is less than or equal to the total number of partition identifiers, allocate one partition identifier to each device identifier and establish a first corresponding relation between each partition identifier and one device identifier; and when the total number of device identifiers is greater than the total number of partition identifiers, if the number of device identifiers corresponding to at least one partition identifier is smaller than the number of device identifiers corresponding to any other partition identifier in the partition identifier set, allocate device identifiers to that at least one partition identifier until the numbers of device identifiers corresponding to all partition identifiers are consistent, and establish a first corresponding relation between each partition identifier and at least one device identifier.
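The two allocation cases handled by the establishing module can be sketched as follows. This is an illustrative reading only: the function `assign_devices` and the sample identifiers are hypothetical, and the sketch balances the partitions by always giving the next device to the partition that currently holds the fewest devices, so the counts differ by at most one when the number of devices is not a multiple of the number of partitions.

```python
def assign_devices(device_ids, partition_ids):
    """Build the first corresponding relation: partition id -> device ids."""
    if not partition_ids:
        raise ValueError("at least one partition is required")
    mapping = {partition_id: [] for partition_id in partition_ids}
    if len(device_ids) <= len(partition_ids):
        # Case 1: enough partitions, so each device gets its own partition.
        for device_id, partition_id in zip(device_ids, partition_ids):
            mapping[partition_id].append(device_id)
        return mapping
    # Case 2: more devices than partitions, so fill the least-loaded
    # partition first to keep the per-partition counts even.
    for device_id in device_ids:
        target = min(partition_ids, key=lambda pid: len(mapping[pid]))
        mapping[target].append(device_id)
    return mapping


# Hypothetical identifiers: 7 devices spread over 3 partitions.
devices = [f"dev-{i}" for i in range(7)]
balanced = assign_devices(devices, [0, 1, 2])
print({pid: len(ids) for pid, ids in balanced.items()})  # {0: 3, 1: 2, 2: 2}
```

Keeping the counts even means that no single partition, and therefore no single consuming microservice, carries a disproportionate share of devices.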
Optionally, the consumption module is specifically configured to: when the total pre-consumption data amount of the micro service is smaller than the data amount in the target data partition, consume, through the micro service, an amount of data in the target data partition equal to the total pre-consumption data amount; and when the total pre-consumption data amount is greater than or equal to the total data amount in all the target data partitions corresponding to the micro service, consume the data in all the target data partitions through the micro service.
Optionally, the warehousing module is specifically configured to: obtain a time-series data set according to the timestamps of the multiple pieces of data, the time-series data set being a set obtained by sorting the data according to the timestamps; when no data of the data acquisition device exists in the database, insert the data in the data set into the database in time order; and when warehoused data of the data acquisition device exists in the database, warehouse, according to a preset import strategy, the at least one piece of data in the data set whose timestamp is smaller than the timestamp of the warehoused data, and insert into the database, in time order, the at least one piece of data in the data set whose timestamp is larger than the timestamp of the warehoused data.
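A minimal sketch of this warehousing logic is given below, under several assumptions that are not in the patent: the database is modelled as an in-memory dict of per-device row lists, the function name `warehouse` is hypothetical, and the "preset import strategy" for late data is taken to be a sorted insertion at the correct position. Data whose timestamp equals the latest warehoused timestamp is treated as late here, following the scene description that routes data "not greater than" the latest time to the out-of-order process.

```python
import bisect
from collections import defaultdict


def warehouse(records, database):
    """Store (device_id, timestamp, payload) records in per-device time order.

    database: dict mapping device id -> list of (timestamp, payload) rows,
              kept sorted by timestamp (stand-in for one table per device)
    """
    by_device = defaultdict(list)
    for device_id, timestamp, payload in records:
        by_device[device_id].append((timestamp, payload))
    for device_id, rows in by_device.items():
        rows.sort(key=lambda row: row[0])  # the time-series data set
        stored = database.setdefault(device_id, [])
        for row in rows:
            if stored and row[0] <= stored[-1][0]:
                # Late data: assumed import strategy is a sorted insertion.
                keys = [existing[0] for existing in stored]
                stored.insert(bisect.bisect_right(keys, row[0]), row)
            else:
                # No warehoused data yet, or newer than the latest row.
                stored.append(row)


# cam-1 already has warehoused rows; cam-2 has none.
database = {"cam-1": [(10, "a"), (20, "b")]}
batch = [("cam-1", 30, "d"), ("cam-1", 15, "c"), ("cam-2", 5, "x"), ("cam-2", 2, "w")]
warehouse(batch, database)
print(database["cam-1"])  # [(10, 'a'), (15, 'c'), (20, 'b'), (30, 'd')]
print(database["cam-2"])  # [(2, 'w'), (5, 'x')]
```

The last stored row always carries the latest timestamp for a device, so it plays the role of the cached latest warehousing time mentioned in the scene description.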
Optionally, the data warehousing device 20 further includes a receiving module and a query module. The receiving module is configured to receive a data query request, the data query request comprising at least one device identifier and a time identifier; and the query module is configured to determine a partition identifier corresponding to the device identifier according to the device identifier, and to acquire target data in the data partition corresponding to the partition identifier according to the time identifier.
An embodiment of the present invention further provides a system for carrying out the data warehousing method provided by the embodiment of the present invention. The system includes a distributed message component, a plurality of microservices, and a database; the distributed message component includes a plurality of data partitions. The system is used for acquiring a plurality of pieces of data written into the target data partition corresponding to the data acquisition equipment; the target data partition and the data acquisition equipment have a corresponding relation; the data is obtained by analyzing the data of the data acquisition equipment accessing the micro-service system by the target micro-service corresponding to the target data partition; the plurality of pieces of data do not have a time sequence order; the target data partition is one of the plurality of data partitions; the target microservice is one of the plurality of microservices; and for storing the plurality of pieces of data into the database through the target microservice corresponding to the target data partition according to the timestamp information corresponding to the plurality of pieces of data respectively; the plurality of pieces of data after being put in storage have the time sequence order.
It will be appreciated that the system may operate in an electronic device that includes a communication interface, a processor, and a memory. The processor, memory and communication interface are electrically connected to each other, directly or indirectly, to enable transfer or interaction of data. For example, the components may be electrically connected to each other via one or more communication buses or signal lines. The memory may be configured to store software programs and modules, such as program instructions/modules corresponding to the data warehousing method provided in the embodiment of the present invention, and the processor executes various functional applications and data processing by executing the software programs and modules stored in the memory. The communication interface may be used for communicating signaling or data with other node devices. In the present invention, the electronic device may have a plurality of communication interfaces.
The memory may be, but is not limited to, a Random Access Memory (RAM), a Read-Only Memory (ROM), a Programmable Read-Only Memory (PROM), an Erasable Programmable Read-Only Memory (EPROM), an Electrically Erasable Programmable Read-Only Memory (EEPROM), and the like.
The processor may be an integrated circuit chip having signal processing capabilities. The processor may be a general-purpose processor including a Central Processing Unit (CPU), a Network Processor (NP), and the like; but may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, discrete gate or transistor logic, discrete hardware components, etc.
An embodiment of the present invention provides a storage medium on which a computer program is stored; when executed by a processor, the computer program implements the data warehousing method according to any one of the foregoing embodiments. The computer-readable storage medium may be, but is not limited to, any medium that can store program code, such as a USB flash drive, a removable hard disk, a ROM, a RAM, a PROM, an EPROM, an EEPROM, a magnetic disk or an optical disk.

Claims (10)

1. A method for warehousing data, the method being applied to a microservice system, the microservice system comprising a plurality of microservices, a distributed message component, and a database, the distributed message component having a plurality of data partitions, the method comprising:
acquiring a plurality of pieces of data written into a target data partition corresponding to the data acquisition equipment; the target data partition and the data acquisition equipment have a corresponding relation; the data is obtained by analyzing the data of the data acquisition equipment accessing the micro-service system by the target micro-service corresponding to the target data partition; the plurality of pieces of data do not have a time sequence order; the target data partition is one of the plurality of data partitions; the target microservice is one of the plurality of microservices;
storing the plurality of pieces of data into the database through target microservices corresponding to the target data partitions according to the timestamp information corresponding to the plurality of pieces of data respectively; and the plurality of pieces of data after being put in storage have the time sequence order.
2. The data warehousing method according to claim 1, before acquiring a plurality of pieces of data written in the target data partition corresponding to at least one data acquisition device, further comprising:
acquiring equipment identifier sets corresponding to all data acquisition equipment and partition identifier sets corresponding to all the data partitions;
establishing a first corresponding relation between each partition identification and at least one equipment identification;
and writing the plurality of pieces of data into the target data partitions corresponding to the data acquisition equipment according to the first corresponding relation.
3. The method of claim 2, wherein before storing the plurality of pieces of data into the database through the target microservices corresponding to the target data partitions according to the timestamp information corresponding to the plurality of pieces of data, the method further comprises:
establishing a second corresponding relation between each micro service and at least one data partition;
and consuming the data in the target data partition corresponding to the target micro service through the target micro service according to the second corresponding relation.
4. The method according to claim 2, wherein the establishing a first correspondence between each partition id and at least one device id comprises:
when the total number of the equipment identifications is smaller than or equal to the total number of the partition identifications, allocating one partition identification to each equipment identification, and establishing a first corresponding relation between each partition identification and one equipment identification;
when the total number of the device identifications is greater than the total number of the partition identifications, if the number of the device identifications corresponding to at least one partition identification is less than the number of the device identifications corresponding to any other partition identification in the partition identification set, the device identifications are distributed for the at least one partition identification until the number of the device identifications corresponding to all the partition identifications is consistent, and a first corresponding relation between each partition identification and the at least one device identification is established.
5. The method according to claim 3, wherein consuming data in the target data partition corresponding to the target microservice by the target microservice according to the second correspondence comprises:
when the total pre-consumption data amount of the micro-service is smaller than the data amount in the target data partition, consuming data consistent with the total pre-consumption data amount in the target data partition through the micro-service;
and when the total pre-consumption data amount is larger than or equal to the total data amount in all the target data partitions corresponding to the micro service, consuming the data in all the target data partitions through the micro service.
6. The method of claim 1, wherein storing the plurality of pieces of data into the database through the target microservices corresponding to the target data partitions according to timestamp information corresponding to the plurality of pieces of data, comprises:
obtaining a time sequence data set according to the time stamps of the plurality of pieces of data; the time sequence data set is a set obtained by sequencing the plurality of pieces of data according to the time stamps;
when the data of the data acquisition equipment does not exist in the database, inserting the data in the data set into the database according to the time sequence;
when the database has the warehoused data of the data acquisition equipment, warehousing, according to a preset import strategy, at least one piece of data in the data set whose timestamp is smaller than the timestamp of the warehoused data, and inserting into the database, according to the time sequence, at least one piece of data in the data set whose timestamp is larger than the timestamp of the warehoused data.
7. The data warehousing method of claim 2, further comprising:
receiving a data query request; the data query request comprises at least one equipment identifier and a time identifier;
and determining a partition identifier corresponding to the equipment identifier according to the equipment identifier, and acquiring target data in a data partition corresponding to the partition identifier according to the time identifier.
8. A data warehousing apparatus provided in a microservice system including a plurality of microservices, a distributed message component having a plurality of data partitions, and a database, comprising:
the acquisition module is used for acquiring a plurality of pieces of data which are written into the target data partition and correspond to the data acquisition equipment; the target data partition and the data acquisition equipment have a corresponding relation; the data is obtained by analyzing the data of the data acquisition equipment accessing the micro-service system by the target micro-service corresponding to the target data partition; the plurality of pieces of data do not have a time sequence order; the target data partition is one of the plurality of data partitions; the target microservice is one of the plurality of microservices;
the storage module is used for storing the plurality of pieces of data into the database through the target micro service corresponding to the target data partition according to the timestamp information corresponding to the plurality of pieces of data; and the plurality of pieces of data after being put in storage have the time sequence order.
9. A system comprising a microservice, a distributed message component, and a database; the distributed message component has a plurality of data partitions;
the system is used for acquiring a plurality of pieces of data written into the target data partition corresponding to the data acquisition equipment; the target data partition and the data acquisition equipment have a corresponding relation; the data is obtained by analyzing the data of the system accessed by the data acquisition equipment by the target micro service corresponding to the target data partition; the plurality of pieces of data do not have a time sequence order; the target data partition is one of the plurality of data partitions; the target microservice is one of the plurality of microservices; storing the plurality of pieces of data into the database through target microservices corresponding to the target data partitions according to the timestamp information corresponding to the plurality of pieces of data respectively; and the plurality of pieces of data after being put in storage have the time sequence order.
10. A storage medium having stored thereon a computer program, wherein the computer program, when executed by a processor, implements the data warehousing method of any of claims 1-7.
CN202011195013.0A 2020-10-30 2020-10-30 Data warehouse-in method, device, system and storage medium Active CN112269670B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011195013.0A CN112269670B (en) 2020-10-30 2020-10-30 Data warehouse-in method, device, system and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011195013.0A CN112269670B (en) 2020-10-30 2020-10-30 Data warehouse-in method, device, system and storage medium

Publications (2)

Publication Number Publication Date
CN112269670A true CN112269670A (en) 2021-01-26
CN112269670B CN112269670B (en) 2023-08-25

Family

ID=74345274

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011195013.0A Active CN112269670B (en) 2020-10-30 2020-10-30 Data warehouse-in method, device, system and storage medium

Country Status (1)

Country Link
CN (1) CN112269670B (en)

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140181041A1 (en) * 2012-12-21 2014-06-26 Zetta, Inc. Distributed data store
CN104407879A (en) * 2014-10-22 2015-03-11 江苏瑞中数据股份有限公司 A power grid timing sequence large data parallel loading method
CN108256088A (en) * 2018-01-23 2018-07-06 清华大学 A kind of storage method and system of the time series data based on key value database
CN109033289A (en) * 2018-07-13 2018-12-18 天津瑞能电气有限公司 A kind of banking procedure of the high frequency real time data for micro-capacitance sensor
CN109829125A (en) * 2019-03-01 2019-05-31 国网吉林省电力有限公司白城供电公司 Show the platform of user management of dispatching of power netwoks operation data
CN110046183A (en) * 2019-04-16 2019-07-23 北京易沃特科技有限公司 A kind of time series data polymerization search method, equipment and medium
CN110795428A (en) * 2019-10-10 2020-02-14 中盈优创资讯科技有限公司 Time sequence data storage method and time sequence database applied to industrial Internet of things
CN111104535A (en) * 2019-11-20 2020-05-05 中国第一汽车股份有限公司 Data management system and data management method
CN111125089A (en) * 2019-11-05 2020-05-08 远景智能国际私人投资有限公司 Time sequence data storage method, device, server and storage medium
CN111552441A (en) * 2020-04-29 2020-08-18 重庆紫光华山智安科技有限公司 Data storage method and device, main node and distributed system

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
MICHAEL PHINNEY等: "Mining repetitive sequences using a big data ecosystem", 《2013 IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE》, pages 60 - 62 *
刘博伟等: "基于HBase的金融时序数据存储***", 《中国科技论文》, no. 20, pages 2387 - 2392 *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113377989A (en) * 2021-06-04 2021-09-10 上海云从汇临人工智能科技有限公司 Data retrieval method, system, medium and device based on GPU
CN116069519A (en) * 2022-04-21 2023-05-05 中国石油天然气集团有限公司 Logging method, device, equipment and storage medium
CN116069519B (en) * 2022-04-21 2024-01-30 中国石油天然气集团有限公司 Logging method, device, equipment and storage medium

Also Published As

Publication number Publication date
CN112269670B (en) 2023-08-25

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant