CN112269670B - Data warehouse-in method, device, system and storage medium - Google Patents

Info

Publication number
CN112269670B
Authority
CN
China
Prior art keywords
data
partition
target
micro
service
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011195013.0A
Other languages
Chinese (zh)
Other versions
CN112269670A (en)
Inventor
贺宁
魏程琛
傅浩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chongqing Unisinsight Technology Co Ltd
Original Assignee
Chongqing Unisinsight Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chongqing Unisinsight Technology Co Ltd
Priority to CN202011195013.0A
Publication of CN112269670A
Application granted
Publication of CN112269670B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/54 Interprogram communication
    • G06F9/546 Message passing systems or structures, e.g. queues
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/23 Updating
    • G06F16/2308 Concurrency control
    • G06F16/2315 Optimistic concurrency control
    • G06F16/2322 Optimistic concurrency control using timestamps
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a data warehousing method, device, system and storage medium. The method comprises: acquiring a plurality of pieces of data that correspond to a data acquisition device and have been written into a target data partition, the target data partition having a correspondence with the data acquisition device; the data being data obtained after the target micro-service corresponding to the target data partition analyzes the data of the data acquisition device accessed to the micro-service system; the plurality of pieces of data not being in time order; the target data partition being one of a plurality of data partitions; the target micro-service being one of a plurality of micro-services; and storing the pieces of data into a database through the target micro-service corresponding to the target data partition according to the timestamp information corresponding to each piece of data, so that the data is in time order after warehousing. The method covers the scenario of warehousing analyzed data, which existing approaches do not cover, and ensures the time order of the data when out-of-order analyzed data is warehoused.

Description

Data warehouse-in method, device, system and storage medium
Technical Field
The present invention relates to the field of data processing, and in particular, to a data warehousing method, device, system, and storage medium.
Background
In recent years, the Internet of Things and intelligent terminals have generated massive amounts of data, and traditional databases cannot meet the performance requirements of certain scenarios. With massive data, when a traditional database is used to query the data of a given device over a given time period, the data blocks holding that data cannot be identified, so all of the data must be traversed; the query wastes a great deal of time and computing power and falls far short of user requirements. As user demand for querying a given device over a given time period grows stronger, time-series databases have been introduced to address this problem.
At present, time-series databases are designed for devices that connect to the database directly, that is, the device data is stored directly without needing to be analyzed and processed. For data whose value must be mined through analysis and processing, however, the time order of the data cannot be guaranteed when multiple services and multiple processes are involved, so warehousing is slow and the requirements cannot be met.
Disclosure of Invention
In view of the above, the present invention provides a data warehousing method, device, system and storage medium, which are intended to improve warehousing efficiency and preserve the time order of the data when warehousing data whose value must be mined through analysis and processing.
The technical scheme of the invention is as follows:
in a first aspect, the present invention provides a method for database entry, applied to a micro-service system, the micro-service system including a micro-service, a distributed message component, and a database, the distributed message component having a plurality of data partitions, the method comprising: acquiring a plurality of pieces of data which correspond to the data acquisition equipment and are written into a target data partition; the target data partition has a corresponding relation with the data acquisition equipment; the data is data after the target micro service corresponding to the target data partition analyzes the data of the data acquisition equipment accessed to the micro service system; the plurality of data have no time sequence; the target data partition is one of the plurality of data partitions; the target micro-service is one of the plurality of micro-services; storing the pieces of data into the database through a target micro-service corresponding to the target data partition according to the timestamp information corresponding to each piece of data; the plurality of pieces of data after warehousing have the time sequence.
In a second aspect, the present invention provides a data warehousing apparatus disposed in a micro service system, the micro service system including a micro service, a distributed message component, and a database, the distributed message component having a plurality of data partitions, including: the acquisition module is used for acquiring a plurality of pieces of data which correspond to the data acquisition equipment and are written into the target data partition; the target data partition has a corresponding relation with the data acquisition equipment; the data is data after the target micro service corresponding to the target data partition analyzes the data of the data acquisition equipment accessed to the micro service system; the plurality of data have no time sequence; the target data partition is one of the plurality of data partitions; the target micro-service is one of the plurality of micro-services; the warehousing module is used for storing the plurality of pieces of data into the database through the target micro-service corresponding to the target data partition according to the timestamp information corresponding to each piece of data; the plurality of pieces of data after warehousing have the time sequence.
In a third aspect, the present invention provides a system, including a micro service, a distributed message component, and a database, where the distributed message component has a plurality of data partitions, and the system is configured to obtain a plurality of pieces of data written into a target data partition corresponding to a data acquisition device; the target data partition has a corresponding relation with the data acquisition equipment; the data is data after the target micro-service corresponding to the target data partition analyzes the data of the data acquisition equipment accessed to the system; the plurality of data have no time sequence; the target data partition is one of the plurality of data partitions; the target micro-service is one of the plurality of micro-services; storing the pieces of data into the database through a target micro-service corresponding to the target data partition according to the time stamp information corresponding to each piece of data; the plurality of pieces of data after warehousing have the time sequence.
In a fourth aspect, the present invention provides a storage medium storing a computer program which, when executed by a processor, implements the data warehousing method according to the first aspect.
The embodiment of the invention provides a data warehousing method, device, system and storage medium. The method comprises: acquiring a plurality of pieces of data that correspond to a data acquisition device and have been written into a target data partition, the target data partition having a correspondence with the data acquisition device; the data being data obtained after the target micro-service corresponding to the target data partition analyzes the data of the data acquisition device accessed to the micro-service system; the plurality of pieces of data not being in time order; the target data partition being one of a plurality of data partitions; the target micro-service being one of a plurality of micro-services; and storing the pieces of data into a database through the target micro-service corresponding to the target data partition according to the timestamp information corresponding to each piece of data, so that the data is in time order after warehousing. The method covers the scenario of warehousing analyzed data, which existing approaches do not cover, and ensures the time order of the data when out-of-order analyzed data is warehoused. The difference from the prior art is that, in the prior art, the time order of the data cannot be guaranteed when the analyzed data is processed, and the data are mixed together.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings that are needed in the embodiments will be briefly described below, it being understood that the following drawings only illustrate some embodiments of the present invention and therefore should not be considered as limiting the scope, and other related drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic flow chart of a method for database entry according to an embodiment of the present invention;
FIG. 2 is a schematic flow chart of a possible implementation of step S107 provided by an embodiment of the present invention;
FIG. 3 is a schematic flow chart of another method for database entry according to an embodiment of the present invention;
FIG. 4 is a schematic flow chart of a possible implementation of step S102 provided by an embodiment of the present invention;
FIG. 5 is a schematic flow chart of another method for database entry according to an embodiment of the present invention;
FIG. 6 is a schematic flow chart of a possible implementation of step S105 provided by an embodiment of the present invention;
FIG. 7 is a schematic flow chart of another method for database entry according to an embodiment of the present invention;
FIG. 8 is a schematic view of a scenario provided by an embodiment of the present invention;
FIG. 9 is a functional block diagram of a data warehousing device according to an embodiment of the present invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are some embodiments of the present invention, but not all embodiments of the present invention. The components of the embodiments of the present invention generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations.
Thus, the following detailed description of the embodiments of the invention, as presented in the figures, is not intended to limit the scope of the invention, as claimed, but is merely representative of selected embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
It should be noted that: like reference numerals and letters denote like items in the following figures, and thus once an item is defined in one figure, no further definition or explanation thereof is necessary in the following figures.
In the description of the present invention, it should be noted that, if the terms "upper", "lower", "inner", "outer", and the like indicate an azimuth or a positional relationship based on the azimuth or the positional relationship shown in the drawings, or the azimuth or the positional relationship in which the inventive product is conventionally put in use, it is merely for convenience of describing the present invention and simplifying the description, and it is not indicated or implied that the apparatus or element referred to must have a specific azimuth, be configured and operated in a specific azimuth, and thus it should not be construed as limiting the present invention.
Furthermore, the terms "first," "second," and the like, if any, are used merely for distinguishing between descriptions and not for indicating or implying a relative importance.
It should be noted that the features of the embodiments of the present invention may be combined with each other without conflict.
At present, the Internet of Things and intelligent terminals generate massive amounts of data, and traditional databases cannot meet the performance requirements of certain scenarios. For example, with massive data, when the data of a given device over a given time period is queried, the data blocks holding that data cannot be identified, so all of the data must be traversed; the query wastes a great deal of time and computing power and falls far short of user requirements. Video surveillance is likewise a case of front-end cameras continuously generating data. As user demand for querying a given device over a given time period grows stronger, time-series databases have been introduced to address this problem. However, current time-series databases are designed for devices that connect to the database directly, that is, the device data is stored directly without needing to be analyzed and processed.
However, the inventors found during research that directly warehousing some media data, such as video data, image data and game data, is of little practical significance, whereas the feature data obtained after such data is analyzed and processed is far more useful. In the prior art, the time order of the data cannot be guaranteed when the analyzed data is warehoused, so warehousing is slow and the requirements cannot be met. For example, data 1, data 2 and data 3 corresponding to a data acquisition device are acquired at times t1, t2 and t3 respectively and then accessed to the system, where times t1, t2 and t3 may follow a time order. Data 1, data 2 and data 3 may then be analyzed by 3 different micro-services, and because the processing capacity and speed of the micro-services differ, the times T1, T2 and T3 at which the analyzed data 1, 2 and 3 reach the database to wait for warehousing may satisfy T1 > T2 > T3. If data 1, 2 and 3 are then warehoused according to the order T1 > T2 > T3, the order of data 1, 2 and 3 after warehousing is clearly wrong.
Therefore, to solve the above technical problem, the inventors propose a data warehousing method. The method operates on data that has already been analyzed and processed, and it ensures that data which is out of order after analysis is stored into the database in time order, with high warehousing efficiency.
Referring to fig. 1, fig. 1 is a schematic flowchart of a method for database entry, where the method may be applied to a micro service system, the micro service system includes a micro service, a distributed message component, and a database, and the distributed message component has a plurality of data partitions, and the method may include:
S106, acquiring a plurality of pieces of data which correspond to the data acquisition equipment and are written into the target data partition; the target data partition has a corresponding relation with the data acquisition equipment;
In the embodiment of the invention, the data is the data after the target micro-service corresponding to the target data partition analyzes the data of the data acquisition equipment accessed to the micro-service system; the plurality of data have no time sequence; the target data partition is one of a plurality of data partitions; the target micro-service is one of a plurality of micro-services;
In one possible implementation manner, the data partition may correspond to at least one data acquisition device; the data accessed by each data acquisition device to the system may be parsed by a plurality of microservices.
S107, storing the pieces of data into a database through a target micro service corresponding to the target data partition according to the time stamp information corresponding to each piece of data; the data after warehouse entry has a time sequence.
The data warehousing method provided by the embodiment of the invention comprises: acquiring a plurality of pieces of data that correspond to at least one data acquisition device and have been written into a target data partition, the target data partition having a correspondence with the data acquisition device; the data being data obtained after the target micro-service corresponding to the target data partition analyzes the data of the data acquisition device accessed to the micro-service system; the plurality of pieces of data not being in time order; the target data partition being one of a plurality of data partitions; the target micro-service being one of a plurality of micro-services; and storing the pieces of data into a database through the target micro-service corresponding to the target data partition according to the timestamp information corresponding to each piece of data, so that the data is in time order after warehousing. The difference from the prior art is that, in the prior art, the implementation of data warehousing cannot cover the scenario of warehousing analyzed data, the time order of the data cannot be guaranteed when the analyzed data is processed, and the data are mixed together.
Optionally, in order to improve the efficiency of data warehousing and ensure the time sequence of data warehousing, one possible implementation manner is given below, referring to fig. 2, fig. 2 is a schematic flowchart of one implementation manner of step S107 provided in an embodiment of the present invention, that is, step S107 may include the following sub-steps:
S107-1, obtaining a time sequence data set according to the timestamps of the plurality of pieces of data;
It can be understood that the time sequence data set is the data set obtained after sorting the plurality of pieces of data by time. In some possible implementations, the pieces of data may be sorted by time in ascending order or in descending order.
S107-2, when no data of the data acquisition device exists in the database, inserting the data in the data set into the database in time order.
S107-3, when warehoused data of the data acquisition device exists in the database, warehousing, according to a preset import strategy, at least one piece of data in the data set whose timestamp is smaller than the timestamp of the warehoused data, and inserting into the database in time order at least one piece of data whose timestamp is larger than the timestamp of the warehoused data.
It can be understood that whether the data to be warehoused is out of order or in order is determined from the time of the data already warehoused for the data acquisition device. If the data is in order, the data in the data set is inserted into the database in time order; if it is out of order, the data whose timestamp is smaller than the timestamp of the warehoused data is imported according to the import strategy, which speeds up warehousing.
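As a minimal sketch of sub-steps S107-1 to S107-3, the following Python uses an in-memory list as a stand-in for the database of one data acquisition device. The names `Record` and `warehouse` are hypothetical, the "timestamp of the warehoused data" is assumed to be the latest timestamp already stored, and a sorted insert is assumed as a placeholder for the preset import strategy, which the text does not specify.

```python
from bisect import insort
from typing import NamedTuple


class Record(NamedTuple):
    timestamp: int   # capture time of the source data (assumed integer)
    device_id: str
    payload: str


def warehouse(stored: list[Record], batch: list[Record]) -> tuple[list[Record], list[Record]]:
    """Store `batch` into `stored`, which is kept sorted by timestamp.

    Returns (late, in_order): records older than the latest stored record,
    and records that could simply be appended.
    """
    # S107-1: build the time sequence data set.
    ordered = sorted(batch, key=lambda r: r.timestamp)

    # S107-2: nothing stored yet for this device, insert in time order.
    if not stored:
        stored.extend(ordered)
        return [], ordered

    # S107-3: split the data set against the latest stored timestamp.
    latest = stored[-1].timestamp
    late = [r for r in ordered if r.timestamp < latest]
    in_order = [r for r in ordered if r.timestamp >= latest]  # equal timestamps: assumption

    # Placeholder import strategy: sorted insert (tuples compare by timestamp first).
    for record in late:
        insort(stored, record)
    stored.extend(in_order)
    return late, in_order
```

For example, warehousing records with timestamps 3 and 1 into an empty store and then records with timestamps 5 and 2 leaves the store ordered as 1, 2, 3, 5, with only the record stamped 2 taking the late path.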
Optionally, in order to quickly and accurately obtain the data of the data acquisition device written into the data partition, a possible implementation is given on the basis of FIG. 1. Referring to FIG. 3, FIG. 3 is a schematic flowchart of another database entry method provided in an embodiment of the present invention; that is, before step S106, the method further includes:
S101, acquiring a device identification set corresponding to all data acquisition devices and a partition identification set corresponding to all data partitions.
In some possible implementations, the device identifier set may be a set of device identifiers corresponding to all data collecting devices connected to the system, where the device identifier may be a device ID, and the partition identifier may be a partition number.
In some possible embodiments, after the user adds the device identification information to the system, the producer service obtains the number of data partitions in the distributed message component (e.g., Kafka middleware) and the partition identifier corresponding to each data partition, thereby forming the partition identifier set.
S102, establishing a first corresponding relation between each partition identifier and at least one device identifier.
It can be understood that each partition identifier can correspond to at least one device identifier, so that the memory used by the data partition can be saved, and the system can store the data of the device into the corresponding data partition all the time by maintaining the corresponding relation between the device and the data partition, thereby facilitating the subsequent data storage and data query of the same data acquisition device.
S103, writing the data into the target data partition corresponding to the data acquisition equipment according to the first corresponding relation.
It can be understood that the data of the same data acquisition device is written into the same data partition, so that it can subsequently be processed by the same micro-service; the timestamps can then be compared and the data written to the database within that micro-service, which shortens the data warehousing time.
It can be understood that by maintaining the correspondence between the data acquisition device and the data partition, all data of the data acquisition device can be stored in the corresponding data partition, so that all data of the device can be obtained from the same data partition quickly according to the device identifier in the process of data storage, and the time for obtaining the device data is saved.
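Step S103 can be illustrated with a small sketch in which an in-memory class stands in for the distributed message component. The class name `PartitionedLog`, the record fields `device_id` and `ts`, and the device names are all hypothetical; a real deployment would write to the message middleware rather than to Python lists.

```python
from collections import defaultdict


class PartitionedLog:
    """In-memory stand-in for a message component with numbered partitions."""

    def __init__(self, device_to_partition: dict[str, int]) -> None:
        # First correspondence: device identifier -> partition identifier.
        self.device_to_partition = device_to_partition
        self.partitions: dict[int, list[dict]] = defaultdict(list)

    def write(self, record: dict) -> int:
        """Append the record to the partition mapped to its device."""
        partition = self.device_to_partition[record["device_id"]]
        self.partitions[partition].append(record)
        return partition
```

Because the lookup depends only on the device identifier, every record of one device lands in one partition, in arrival order rather than time order.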
Optionally, in the process of establishing the correspondence between the data partitions and the data acquisition devices, in order to equalize the number of devices corresponding to each data partition, one possible implementation is given below, referring to fig. 4, and fig. 4 is a schematic flowchart of a possible implementation of step S102 provided by an embodiment of the present invention, that is, step S102 may include the following substeps:
S102-1, when the total number of device identifiers is smaller than or equal to the total number of partition identifiers, allocating a partition identifier to each device identifier, and establishing a first correspondence between each partition identifier and one device identifier;
S102-2, when the total number of device identifiers is greater than the total number of partition identifiers and the number of device identifiers corresponding to at least one partition identifier is smaller than the number of device identifiers corresponding to any other partition identifier in the partition identifier set, allocating device identifiers to the at least one partition identifier until the numbers of device identifiers corresponding to all partition identifiers are consistent, and establishing a first correspondence between each partition identifier and at least one device identifier.
It can be appreciated that, in the embodiment of the present invention, a polling allocation policy may be used to allocate device identifiers to the partition identifiers, so as to keep the data partitions balanced. When the number of device identifiers is greater than the total number of partitions, in order to balance the number of devices corresponding to each data partition, it is first determined whether the number of devices of a data partition is smaller than the average value; if so, an original device of that partition is considered to have been deleted, and the newly added device is added directly to that data partition, which balances the number of devices corresponding to each data partition.
In one possible implementation, the average value may be the quotient of the total number of devices divided by the number of partitions, rounded down to an integer; if the number of devices corresponding to a partition is smaller than this integer, a device of that partition has been deleted. For ease of understanding, assume there are 6 data partitions, whose partition identifiers may be represented as partition 0 to partition 5, and 2 data acquisition devices, whose device identifiers may be device identifier 1 and device identifier 2. In this case partition identifier 0 may be assigned to device identifier 1 and partition identifier 2 may be assigned to device identifier 2. Now assume there are 5 data acquisition devices and 4 data partitions, so that the quotient of the total number of devices divided by the number of partitions is 1. After the first round of polling allocation is complete, data partition 0 has been allocated 2 device identifiers, and data partitions 1, 2 and 3 have each been allocated 1 device identifier. If the number of devices corresponding to any partition is smaller than 1, a device allocated in the first round has been deleted; supposing the partition with the deleted device is data partition 3, a device identifier can be allocated to data partition 3 first, after which device identifiers are again allocated by polling to data partitions 0, 1, 2 and 3 in turn.
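The balancing behaviour of step S102 can be approximated with a least-loaded rule: each new device goes to the partition that currently holds the fewest devices, with ties broken by the lowest partition number. This is a simplification swapped in for illustration, not the exact average-value comparison described above; it reproduces the 5-device, 4-partition example, including refilling a partition whose device was deleted. The function name `assign_device` and the device names are hypothetical.

```python
def assign_device(mapping: dict[str, int], device_id: str, partition_count: int) -> int:
    """Assign `device_id` to the least-loaded partition and record it in `mapping`."""
    # Count how many devices each partition currently holds.
    counts = [0] * partition_count
    for partition in mapping.values():
        counts[partition] += 1

    # Fewest devices first; lowest partition number breaks ties,
    # which yields round-robin (polling) order on an empty mapping.
    target = min(range(partition_count), key=lambda p: (counts[p], p))
    mapping[device_id] = target
    return target
```

With 5 devices and 4 partitions, partition 0 ends up with 2 devices and partitions 1 to 3 with 1 each; if the device on partition 3 is then removed, the next device added is placed on partition 3.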
Optionally, in order to ensure that the data of a device is consumed and warehoused by the same micro-service, a possible implementation is further provided below on the basis of FIG. 1. Referring to FIG. 5, FIG. 5 is a schematic flowchart of another data warehousing method provided in an embodiment of the present invention; that is, before step S106, the method further includes:
S104, establishing a second corresponding relation between each micro service and at least one data partition;
S105, consuming the data in the target data partition corresponding to the target micro-service through the target micro-service according to the second corresponding relation.
In some possible embodiments, the data partitions in the distributed message component may be allocated to the consuming micro-services by polling. For example, with 3 micro-services and 60 data partitions, the partition identifiers are assigned to micro-service 1, micro-service 2 and micro-service 3 in turn while counting, until the count equals the number of data partitions. The partition identifiers of the data partitions corresponding to micro-service 1 are then 0, 3, 6, 9, …, and similarly those corresponding to micro-service 2 are 1, 4, 7, 10, ….
In some possible embodiments, the second correspondence may be written to the micro-cloud shared storage. After the correspondence between the micro-services and the data partitions is complete, it is stored under the /home path, so that each micro-service can obtain the same information for disaster recovery.
It can be appreciated that by maintaining the correspondence between the micro-services and the data partitions, the micro-services are associated with the data partitions, which can ensure that data of the same data acquisition device is consumed by the same micro-service.
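The polling allocation of step S104 can be sketched as a round-robin over partition numbers. The function name `assign_partitions` and the service names are hypothetical, and partitions are assumed to be numbered from 0.

```python
def assign_partitions(partition_count: int, service_ids: list[str]) -> dict[str, list[int]]:
    """Second correspondence: allocate partitions to micro-services by polling."""
    assignment: dict[str, list[int]] = {service: [] for service in service_ids}
    for partition in range(partition_count):
        # Partition k goes to service k mod (number of services).
        service = service_ids[partition % len(service_ids)]
        assignment[service].append(partition)
    return assignment
```

With 60 partitions and three services, the first service receives partitions 0, 3, 6, 9 and so on, the second receives 1, 4, 7, 10 and so on, and each service ends up with 20 partitions.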
Optionally, in order to ensure the consuming capability of the consuming data during the data consumption of the micro service, an implementation manner is given below, referring to fig. 6, and fig. 6 is a schematic flowchart of a possible implementation manner of step S105 provided by an embodiment of the present invention.
S105-1, when the total amount of pre-consumption data of the micro-service is smaller than the data amount in the target data partition, consuming, through the micro-service, data in the target data partition equal in amount to the total amount of pre-consumption data.
S105-2, when the total amount of pre-consumption data is greater than or equal to the total data amount in all the target data partitions corresponding to the micro-service, consuming the data in all the target data partitions through the micro-service.
It can be understood that the micro-service may consume the data in its corresponding data partitions according to different consumption information. For example, assume the micro-service consumes 1000 pieces of data each time. If the total amount of data in all of its data partitions is less than 1000, the micro-service consumes the data in all of its corresponding data partitions. If the data amount of a single data partition is greater than 1000, then after 1000 pieces of data have been consumed from that partition, the next consumption no longer consumes that partition; the other partitions are consumed in turn until the data of all partitions has been consumed.
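The batch-limited consumption described above might look like the following sketch. It rests on several assumptions: each partition is modelled as an in-memory `deque`, the class name `PartitionConsumer` is hypothetical, and a batch that is not yet full after visiting the other partitions is topped up from the partition consumed in the previous round, which is one possible reading of the text.

```python
from collections import deque


class PartitionConsumer:
    """Consumes at most batch_size records per call, rotating over partitions."""

    def __init__(self, partitions, batch_size=1000):
        self.partitions = partitions    # dict: partition id -> deque of records
        self.order = deque(partitions)  # rotation order of partition ids
        self.batch_size = batch_size

    def consume_once(self):
        batch = []
        for _ in range(len(self.order)):
            if len(batch) >= self.batch_size:
                break
            pid = self.order[0]
            self.order.rotate(-1)  # a visited partition moves to the back
            queue = self.partitions[pid]
            take = min(self.batch_size - len(batch), len(queue))
            batch.extend(queue.popleft() for _ in range(take))
        return batch


partitions = {0: deque(range(2500)), 1: deque(range(300))}
consumer = PartitionConsumer(partitions, batch_size=1000)
first = consumer.consume_once()   # 1000 records, all from partition 0
second = consumer.consume_once()  # starts with partition 1 this time
```

When the total amount of data across all partitions is below the batch size, a single call drains every partition, which corresponds to case S105-2.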
Optionally, once the correspondence between devices and data partitions has been established, the data of a given device can be queried quickly and accurately according to that correspondence. A possible implementation is given below on the basis of fig. 3. Referring to fig. 7, fig. 7 is a schematic flowchart of another data warehousing method provided by an embodiment of the present invention, where the method further includes:
S108, receiving a data query request, wherein the data query request comprises at least one equipment identifier and a time identifier.
S109, determining a partition identifier corresponding to the equipment identifier according to the equipment identifier, and obtaining target data in a data partition corresponding to the partition identifier according to the time identifier.
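Steps S108 and S109 can be illustrated with a small in-memory sketch. The names `query_device_data`, `device_to_partition` and `partition_store` are hypothetical, and the time identifier is assumed here to be an inclusive `(start, end)` range; the source does not specify its form.

```python
def query_device_data(device_ids, time_range, device_to_partition, partition_store):
    """Look up each device's partition, then filter that partition by time."""
    start, end = time_range
    results = []
    for device_id in device_ids:
        # S109: device identifier -> partition identifier
        partition_id = device_to_partition[device_id]
        # Only the matching partition is scanned, not every partition.
        for record in partition_store.get(partition_id, []):
            if record["device_id"] == device_id and start <= record["ts"] <= end:
                results.append(record)
    return results


device_to_partition = {"cam-1": 0, "cam-2": 1}
partition_store = {
    0: [{"device_id": "cam-1", "ts": 10}, {"device_id": "cam-1", "ts": 25}],
    1: [{"device_id": "cam-2", "ts": 12}],
}
hits = query_device_data(["cam-1"], (20, 30), device_to_partition, partition_store)
print(hits)  # [{'device_id': 'cam-1', 'ts': 25}]
```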
Optionally, after the first correspondence between devices and data partitions and the second correspondence between consuming services and data partitions have been established, disaster recovery can be performed according to the two correspondences. The production service obtains the first correspondence between device identifiers and partition identifiers from the database, so that data production is recovered with priority; the consuming service obtains the second correspondence between consuming services and data partitions from the shared storage component, so that data consumption is recovered and the pressure on the middleware is reduced.
To facilitate understanding of the overall data warehousing flow, a scenario is described below with reference to fig. 8, which is a schematic diagram of a scenario provided by an embodiment of the present invention.
As shown in fig. 8, after the system obtains the device information of the data acquisition devices, the public service first creates a database table (which information is in the database table) with the device ID. The producer service then obtains the number of data partitions of the current distributed message component and calculates how many devices' data needs to be stored in each data partition. The producer service allocates the devices according to that number and stores the device data in the corresponding data partitions. After the data has been produced, other services consume it. Before consumption, the consuming services poll the data partitions according to the number of services (according to …) and the number of data partitions until every data partition corresponds to a consuming service, and the correspondence between the micro-services and the data partitions is stored in a database to improve reliability. When a consuming service consumes data and stores it in the tables built per device, it first sorts each batch of data by device and caches the latest warehousing time of each device. If the time of the data to be warehoused is not greater than the device's latest warehousing time, the data is handled by the out-of-order process; if the time of the new data is greater than the device's latest warehousing time, the data is warehoused directly.
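The final warehousing decision in this scenario can be sketched as below. This is a simplified illustration under stated assumptions: the cache is a plain dictionary, and `insert` and `handle_out_of_order` are hypothetical callbacks standing in for the direct database insert and the out-of-order process, neither of which is detailed here.

```python
from collections import defaultdict


def store_batch(batch, latest_time, insert, handle_out_of_order):
    """Group a batch by device, sort by timestamp, then route each record."""
    by_device = defaultdict(list)
    for record in batch:
        by_device[record["device_id"]].append(record)

    for device_id, records in by_device.items():
        records.sort(key=lambda r: r["ts"])
        for record in records:
            last = latest_time.get(device_id)  # cached latest warehousing time
            if last is None or record["ts"] > last:
                insert(record)                 # newer data: warehouse directly
                latest_time[device_id] = record["ts"]
            else:
                handle_out_of_order(record)    # not newer: out-of-order process


inserted, late = [], []
cache = {"cam-1": 100}
batch = [
    {"device_id": "cam-1", "ts": 120},
    {"device_id": "cam-2", "ts": 5},
    {"device_id": "cam-1", "ts": 90},
    {"device_id": "cam-1", "ts": 110},
    {"device_id": "cam-2", "ts": 3},
]
store_batch(batch, cache, inserted.append, late.append)
```

In this run the record with timestamp 90 for `cam-1` goes to the out-of-order handler, while the remaining records are inserted in ascending time order per device.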
To implement the steps of the foregoing embodiments and achieve the corresponding technical effects, an implementation of a data warehousing device is given below. Referring to fig. 9, fig. 9 is a functional block diagram of a data warehousing device provided by an embodiment of the present invention. The device 20 may be applied to a micro-service system that includes a plurality of micro-services, a distributed message component and a database, where the distributed message component has a plurality of data partitions. The device 20 includes an acquisition module 201 and a warehousing module 202.
The acquisition module 201 is configured to acquire a plurality of pieces of data that correspond to a data acquisition device and have been written into a target data partition; the target data partition has a corresponding relation with the data acquisition device; the data is the data obtained after the target micro-service corresponding to the target data partition parses the data of the data acquisition device accessing the micro-service system; the plurality of pieces of data are not in time order; the target data partition is one of the plurality of data partitions; the target micro-service is one of the plurality of micro-services.
The warehousing module 202 is configured to store the plurality of pieces of data into the database through the target micro-service corresponding to the target data partition according to the timestamp information corresponding to each piece of data; the data after warehousing is in time order.
It will be appreciated that the acquisition module 201 and the warehousing module 202 may be used to perform steps S106 and S107 to achieve the corresponding technical effects.
Optionally, the device 20 further comprises an establishing module and a writing module. The acquisition module 201 is further configured to obtain a device identifier set corresponding to all data acquisition devices and a partition identifier set corresponding to all data partitions; the establishing module is used for establishing a first corresponding relation between each partition identifier and at least one device identifier; and the writing module is used for writing the plurality of pieces of data into the target data partition corresponding to the data acquisition equipment according to the first corresponding relation.
It will be appreciated that the acquisition module 201, the establishing module and the writing module may be configured to perform steps S101-S103 to achieve the corresponding technical effects. The establishing module may also implement S102-1 to S102-2 to achieve the corresponding technical effects.
Optionally, the data warehousing device 20 further includes a consumption module, and the establishment module is further configured to establish a second correspondence between each micro service and at least one data partition; and the consumption module is used for consuming the data in the target data partition corresponding to the target micro-service through the target micro-service according to the second corresponding relation.
It will be appreciated that the establishing module and the consumption module may be configured to perform steps S104-S105 to achieve the corresponding technical effects.
Optionally, the establishing module is specifically configured to allocate a partition identifier to each device identifier when the total number of device identifiers is less than or equal to the total number of partition identifiers, and establish a first correspondence between each partition identifier and one device identifier; when the total number of the device identifications is greater than the total number of the partition identifications, if the number of the device identifications corresponding to at least one partition identification is less than the number of the device identifications corresponding to any one of the other partition identifications in the partition identification set, the device identifications are allocated to the at least one partition identification until the number of the device identifications corresponding to all the partition identifications is consistent, and a first corresponding relation between each partition identification and the at least one device identification is established.
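The two allocation cases handled by the establishing module can be sketched as follows. The function name `assign_devices` is hypothetical, and picking the least-loaded partition for each device is one simple way to keep the per-partition counts as even as the totals allow; the source does not prescribe a specific procedure.

```python
def assign_devices(device_ids, partition_ids):
    """Build the first correspondence: partition id -> list of device ids."""
    if not partition_ids:
        raise ValueError("at least one data partition is required")
    mapping = {pid: [] for pid in partition_ids}

    # Case 1: no more devices than partitions, one partition per device.
    if len(device_ids) <= len(partition_ids):
        for device_id, pid in zip(device_ids, partition_ids):
            mapping[pid].append(device_id)
        return mapping

    # Case 2: more devices than partitions, always fill the partition
    # that currently holds the fewest devices.
    for device_id in device_ids:
        target = min(partition_ids, key=lambda pid: len(mapping[pid]))
        mapping[target].append(device_id)
    return mapping


few = assign_devices(["d0", "d1"], ["p0", "p1", "p2"])
many = assign_devices([f"d{i}" for i in range(7)], ["p0", "p1", "p2"])
print(many["p0"])  # ['d0', 'd3', 'd6']
```

When the number of devices is not a multiple of the number of partitions, the counts cannot be exactly equal; in this sketch they differ by at most one.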
Optionally, the consumption module is specifically configured to consume, through the micro-service, data in the target data partition equal in amount to the total amount of pre-consumption data when the total amount of pre-consumption data of the micro-service is smaller than the data amount in the target data partition; and when the total amount of pre-consumption data is greater than or equal to the total data amount in all the target data partitions corresponding to the micro-service, to consume the data in all the target data partitions through the micro-service.
Optionally, the warehousing module is specifically configured to obtain a time sequence data set according to the time stamps of the plurality of pieces of data; the time sequence data set is the set obtained after the pieces of data are sorted by time stamp; when no data of the data acquisition equipment exists in the database, the data in the data set is inserted into the database according to the time sequence; when data of the data acquisition equipment already exists in the database, at least one piece of data in the data set whose time stamp is smaller than the time stamp of the data in the database is stored according to a preset import strategy, and at least one piece of data whose time stamp is larger than the time stamp of the data in the database is inserted into the database according to the time sequence.
Optionally, the data warehousing device 20 further includes a receiving module and a query module, where the receiving module is configured to receive a data query request; the data query request comprises at least one equipment identifier and a time identifier; the query module is used for determining a partition identifier corresponding to the equipment identifier according to the equipment identifier, and obtaining target data in a data partition corresponding to the partition identifier according to the time identifier.
To facilitate understanding of the data warehousing method provided by the embodiment of the invention, the embodiment of the invention also provides a micro-service system which comprises a distributed message component, a plurality of micro-services and a database; the distributed message component has a plurality of data partitions. The system can be used for acquiring a plurality of pieces of data which correspond to the data acquisition equipment and are written into the target data partition; the target data partition has a corresponding relation with the data acquisition equipment; the data is the data obtained after the target micro-service corresponding to the target data partition parses the data of the data acquisition equipment accessing the micro-service system; the plurality of pieces of data are not in time order; the target data partition is one of the plurality of data partitions; the target micro-service is one of the plurality of micro-services. The system stores the pieces of data into the database through the target micro-service corresponding to the target data partition according to the time stamp information corresponding to each piece of data; the pieces of data after warehousing are in time order.
It will be appreciated that the system may operate in an electronic device that includes a communication interface, a processor, and a memory. The processor, the memory and the communication interface are electrically connected with each other directly or indirectly to realize data transmission or interaction. For example, the components may be electrically connected to each other via one or more communication buses or signal lines. The memory may be used to store software programs and modules, such as program instructions/modules corresponding to the data warehousing method provided by the embodiments of the present invention, and the processor executes the software programs and modules stored in the memory, thereby executing various functional applications and data processing. The communication interface may be used for communication of signaling or data with other node devices. The electronic device may have a plurality of communication interfaces in the present invention.
The memory may be, but is not limited to, random access memory (Random Access Memory, RAM), read-only memory (Read-Only Memory, ROM), programmable read-only memory (Programmable Read-Only Memory, PROM), erasable programmable read-only memory (Erasable Programmable Read-Only Memory, EPROM), electrically erasable programmable read-only memory (Electrically Erasable Programmable Read-Only Memory, EEPROM), etc.
The processor may be an integrated circuit chip having signal processing capabilities. The processor may be a general-purpose processor, including a central processing unit (Central Processing Unit, CPU), a network processor (Network Processor, NP), etc.; it may also be a digital signal processor (Digital Signal Processor, DSP), an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field-programmable gate array (Field-Programmable Gate Array, FPGA) or other programmable logic device, a discrete gate or transistor logic device, discrete hardware components, or the like.
An embodiment of the present invention provides a storage medium having stored thereon a computer program which, when executed by a processor, implements a data warehousing method according to any one of the foregoing embodiments. The computer-readable storage medium may be, but is not limited to, a USB flash drive, a removable hard disk, ROM, RAM, PROM, EPROM, EEPROM, a magnetic disk, an optical disk, or any other medium capable of storing program code.

Claims (9)

1. A data warehousing method for use in a micro-service system, the micro-service system comprising a plurality of micro-services, a distributed message component and a database, the distributed message component having a plurality of data partitions, the method comprising:
acquiring a plurality of pieces of data which correspond to the data acquisition equipment and are written into a target data partition; the target data partition has a corresponding relation with the data acquisition equipment; the data is data after the target micro service corresponding to the target data partition analyzes the data of the data acquisition equipment accessed to the micro service system; the plurality of data have no time sequence; the target data partition is one of the plurality of data partitions; the target micro-service is one of the plurality of micro-services;
storing the pieces of data into the database through a target micro-service corresponding to the target data partition according to the timestamp information corresponding to each piece of data; wherein the plurality of pieces of data after warehousing have the time sequence;
storing the plurality of data into the database through a target micro-service corresponding to the target data partition according to the timestamp information corresponding to each of the plurality of data, including:
obtaining a time sequence data set according to the time stamps of the plurality of data; the time sequence data set is a set obtained after the plurality of data are ordered according to time stamps;
when the data of the data acquisition equipment does not exist in the database, inserting the data in the data set into the database according to the time sequence;
when data of the data acquisition equipment already exists in the database, at least one piece of data in the data set whose time stamp is smaller than the time stamp of the data in the database is put in the database according to a preset import strategy, and at least one piece of data whose time stamp is larger than the time stamp of the data in the database is inserted into the database according to the time sequence.
2. The data warehousing method of claim 1, further comprising, prior to acquiring the plurality of pieces of data written into the target data partition corresponding to the at least one data acquisition device:
acquiring a device identification set corresponding to all data acquisition devices and a partition identification set corresponding to all the data partitions;
establishing a first corresponding relation between each partition identifier and at least one device identifier;
and writing the plurality of data into the target data partition corresponding to the data acquisition equipment according to the first corresponding relation.
3. The data warehousing method according to claim 2, wherein before storing the pieces of data into the database by the target micro service corresponding to the target data partition according to the time stamp information corresponding to each of the pieces of data, further comprising:
establishing a second corresponding relation between each micro service and at least one data partition;
consuming data in the target data partition corresponding to the target micro-service through the target micro-service according to the second corresponding relation.
4. The method of claim 2, wherein said establishing a first correspondence between each of said partition identifiers and at least one of said device identifiers comprises:
when the total number of the device identifiers is smaller than or equal to the total number of the partition identifiers, distributing one partition identifier for each device identifier, and establishing a first corresponding relation between each partition identifier and one device identifier;
when the total number of the device identifiers is greater than the total number of the partition identifiers, if the number of the device identifiers corresponding to at least one partition identifier is less than the number of the device identifiers corresponding to any one of the other partition identifiers in the partition identifier set, the device identifiers are distributed to the at least one partition identifier until the number of the device identifiers corresponding to all the partition identifiers is consistent, and a first corresponding relation between each partition identifier and at least one device identifier is established.
5. The data warehousing method according to claim 3, wherein said consuming data within said target data partition corresponding to said target micro-service by a target micro-service according to said second correspondence comprises:
when the total amount of pre-consumption data of the micro service is smaller than the data amount in the target data partition, consuming data consistent with the total amount of pre-consumption data in the target data partition through the micro service;
and when the total amount of the pre-consumption data is greater than or equal to the total amount of data in all the target data partitions corresponding to the micro-service, consuming the data in all the target data partitions through the micro-service.
6. The data warehousing method of claim 2, further comprising:
receiving a data query request; the data query request comprises at least one equipment identifier and a time identifier;
and determining a partition identifier corresponding to the equipment identifier according to the equipment identifier, and obtaining target data in a data partition corresponding to the partition identifier according to the time identifier.
7. A data warehousing apparatus, disposed in a micro-service system, the micro-service system comprising a plurality of micro-services, a distributed message component, and a database, the distributed message component having a plurality of data partitions, comprising:
the acquisition module is used for acquiring a plurality of pieces of data which correspond to the data acquisition equipment and are written into the target data partition; the target data partition has a corresponding relation with the data acquisition equipment; the data is data after the target micro service corresponding to the target data partition analyzes the data of the data acquisition equipment accessed to the micro service system; the plurality of data have no time sequence; the target data partition is one of the plurality of data partitions; the target micro-service is one of the plurality of micro-services;
the warehousing module is used for storing the plurality of pieces of data into the database through the target micro-service corresponding to the target data partition according to the timestamp information corresponding to each piece of data; wherein the plurality of pieces of data after warehousing have the time sequence;
the warehousing module is specifically used for obtaining a time sequence data set according to the time stamps of the plurality of pieces of data; the time sequence data set is a set obtained after the plurality of data are ordered according to time stamps; when the data of the data acquisition equipment does not exist in the database, inserting the data in the data set into the database according to the time sequence; when data of the data acquisition equipment already exists in the database, at least one piece of data in the data set whose time stamp is smaller than the time stamp of the data in the database is put in the database according to a preset import strategy, and at least one piece of data whose time stamp is larger than the time stamp of the data in the database is inserted into the database according to the time sequence.
8. A micro service system is characterized by comprising a micro service, a distributed message component and a database; the distributed message component has a plurality of data partitions;
the micro-service system is used for acquiring a plurality of pieces of data which correspond to the data acquisition equipment and are written into the target data partition; the target data partition has a corresponding relation with the data acquisition equipment; the data is data after the target micro-service corresponding to the target data partition analyzes the data of the data acquisition equipment accessed to the system; the plurality of data have no time sequence; the target data partition is one of the plurality of data partitions; the target micro-service is one of the plurality of micro-services; storing the pieces of data into the database through a target micro-service corresponding to the target data partition according to the timestamp information corresponding to each piece of data; wherein the plurality of pieces of data after warehousing have the time sequence;
the micro service system is specifically configured to obtain a time sequence data set according to the time stamps of the plurality of pieces of data; the time sequence data set is a set obtained after the plurality of data are ordered according to time stamps; when the data of the data acquisition equipment does not exist in the database, inserting the data in the data set into the database according to the time sequence; when data of the data acquisition equipment already exists in the database, at least one piece of data in the data set whose time stamp is smaller than the time stamp of the data in the database is put in the database according to a preset import strategy, and at least one piece of data whose time stamp is larger than the time stamp of the data in the database is inserted into the database according to the time sequence.
9. A storage medium having stored thereon a computer program, which when executed by a processor implements the data warehousing method of any one of claims 1-6.
CN202011195013.0A 2020-10-30 2020-10-30 Data warehouse-in method, device, system and storage medium Active CN112269670B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011195013.0A CN112269670B (en) 2020-10-30 2020-10-30 Data warehouse-in method, device, system and storage medium

Publications (2)

Publication Number Publication Date
CN112269670A CN112269670A (en) 2021-01-26
CN112269670B true CN112269670B (en) 2023-08-25

Family

ID=74345274

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011195013.0A Active CN112269670B (en) 2020-10-30 2020-10-30 Data warehouse-in method, device, system and storage medium

Country Status (1)

Country Link
CN (1) CN112269670B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113377989A (en) * 2021-06-04 2021-09-10 上海云从汇临人工智能科技有限公司 Data retrieval method, system, medium and device based on GPU
CN116069519B (en) * 2022-04-21 2024-01-30 中国石油天然气集团有限公司 Logging method, device, equipment and storage medium

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104407879A (en) * 2014-10-22 2015-03-11 江苏瑞中数据股份有限公司 A power grid timing sequence large data parallel loading method
CN108256088A (en) * 2018-01-23 2018-07-06 清华大学 A kind of storage method and system of the time series data based on key value database
CN109033289A (en) * 2018-07-13 2018-12-18 天津瑞能电气有限公司 A kind of banking procedure of the high frequency real time data for micro-capacitance sensor
CN109829125A (en) * 2019-03-01 2019-05-31 国网吉林省电力有限公司白城供电公司 Show the platform of user management of dispatching of power netwoks operation data
CN110046183A (en) * 2019-04-16 2019-07-23 北京易沃特科技有限公司 A kind of time series data polymerization search method, equipment and medium
CN110795428A (en) * 2019-10-10 2020-02-14 中盈优创资讯科技有限公司 Time sequence data storage method and time sequence database applied to industrial Internet of things
CN111104535A (en) * 2019-11-20 2020-05-05 中国第一汽车股份有限公司 Data management system and data management method
CN111125089A (en) * 2019-11-05 2020-05-08 远景智能国际私人投资有限公司 Time sequence data storage method, device, server and storage medium
CN111552441A (en) * 2020-04-29 2020-08-18 重庆紫光华山智安科技有限公司 Data storage method and device, main node and distributed system

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9152643B2 (en) * 2012-12-21 2015-10-06 Zetta Inc. Distributed data store

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
HBase-based financial time series data storage ***; Liu Bowei et al.; China Sciencepaper (No. 20); pp. 2387-2392 *

Also Published As

Publication number Publication date
CN112269670A (en) 2021-01-26

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant