CN113515504B

CN113515504B - Data management method, device, electronic equipment and storage medium

Info

Publication number: CN113515504B
Application number: CN202110867547.1A
Authority: CN
Inventors: 杨波
Original assignee: Beijing Baidu Netcom Science and Technology Co Ltd
Current assignee: Beijing Baidu Netcom Science and Technology Co Ltd
Priority date: 2021-07-29
Filing date: 2021-07-29
Publication date: 2024-01-09
Anticipated expiration: 2041-07-29
Also published as: CN113515504A

Abstract

The disclosure provides a data management method, a data management device, electronic equipment and a storage medium, relates to the technical field of data processing, and particularly relates to the field of intelligent searching. The specific implementation scheme is as follows: determining a first operation record of the database according to the first version information of the database and the generator information of the database corresponding to the first version information; determining a second operation record of the data unit in the database according to the second version information of the data unit in the database and the storage identification of the data unit corresponding to the second version information; and determining a historical operation record of the database according to the first operation record and the second operation record.

Description

Data management method, device, electronic equipment and storage medium

Technical Field

The present disclosure relates to the field of data processing technologies, and in particular, to the field of intelligent searching.

Background

Data management is an important application field of computer technology, which includes processes of collecting, sorting, organizing, storing, processing, transmitting, retrieving, etc. different types of data. One of the purposes of data management is to extract and derive information valuable to people from a large amount of raw data, and then use the information as a basis for actions and decisions. Another object is to scientifically save and manage complex, large amounts of data by means of a computer so that people can make convenient and full use of these information resources. Data management is the core of data processing and may include operations on data, such as organization, classification, encoding, storage, retrieval, maintenance, and the like.

Disclosure of Invention

The disclosure provides a data management method, a data management device, electronic equipment and a storage medium.

According to an aspect of the present disclosure, there is provided a data management method including: determining a first operation record of a database according to first version information of the database and generator information of the database corresponding to the first version information; determining a second operation record of the data unit in the database according to the second version information of the data unit in the database and the storage identification of the data unit corresponding to the second version information; and determining a historical operation record of the database according to the first operation record and the second operation record.

According to another aspect of the present disclosure, there is provided a data management apparatus including: the first determining module is used for determining a first operation record of the database according to the first version information of the database and the generator information of the database corresponding to the first version information; the second determining module is used for determining a second operation record of the data unit in the database according to the second version information of the data unit in the database and the storage identification of the data unit corresponding to the second version information; and a third determining module, configured to determine a historical operation record of the database according to the first operation record and the second operation record.

According to another aspect of the present disclosure, there is provided an electronic device including: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the data management method as described above.

According to another aspect of the present disclosure, there is provided a non-transitory computer-readable storage medium storing computer instructions for causing the computer to perform the data management method as described above.

According to another aspect of the present disclosure, there is provided a computer program product comprising a computer program which, when executed by a processor, implements a data management method as described above.

It should be understood that the description in this section is not intended to identify key or critical features of the embodiments of the disclosure, nor is it intended to be used to limit the scope of the disclosure. Other features of the present disclosure will become apparent from the following specification.

Drawings

The drawings are for a better understanding of the present solution and are not to be construed as limiting the present disclosure. Wherein:

FIG. 1 schematically illustrates an exemplary system architecture to which data management methods and apparatus may be applied, according to embodiments of the present disclosure;

FIG. 2 schematically illustrates a flow chart of a data management method according to an embodiment of the disclosure;

FIG. 3 schematically illustrates an overall flow diagram of a data management method according to an embodiment of the disclosure;

FIG. 4 schematically illustrates a schematic diagram of determining a change pattern in a neighboring version in accordance with an embodiment of the disclosure;

FIG. 5 schematically illustrates a block diagram of a data management apparatus according to an embodiment of the disclosure; and

FIG. 6 illustrates a schematic block diagram of an example electronic device 600 that may be used to implement embodiments of the present disclosure.

Detailed Description

Exemplary embodiments of the present disclosure are described below in conjunction with the accompanying drawings, which include various details of the embodiments of the present disclosure to facilitate understanding, and should be considered as merely exemplary. Accordingly, one of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.

In the technical solution of the present disclosure, the acquisition, storage, and use of users' personal information comply with relevant laws and regulations, necessary security measures are taken, and public order and good customs are not violated.

Data accuracy and update timeliness are core indicators of a product's competitiveness. The process by which data is produced and released strongly influences both indicators, and the ability to organize and manage data during production greatly affects the efficiency of production and release. There is therefore a need for a data organization and management mechanism that can achieve at least one of the following: (1) efficient and convenient management of the data in the production process, including the master library and the data in the job process; (2) recording of all job histories, with support for rapid retrieval of history information; (3) good support for data release, including incremental release.

Histories, that is, records of historical data, are important assets of the production process because they capture how the data has evolved.

At present, data production commonly uses the concepts of a master library (also called the mother library or parent library below) and a job library: all master library data is stored separately, and when the master library needs to be updated, part of the data is pulled from it to a local or independent job area, modified there, and then submitted as an update. Depending on the storage mode, the data is organized in one of two ways: (1) File-based storage. The entire parent library organizes data in mif (a common data exchange format) or tab (a file format) files arranged by unit directory, and stores them on ftp (a file transfer protocol) or a similar service. An operator obtains the job data from the ftp service, modifies it, and uploads the result back to the ftp service. History can be recorded by storing each job result in a different directory. (2) Database-based storage. The entire parent library is stored in one database. For a job, a sub-library is exported from the database and the job is performed on the sub-library data; when the job is complete, the sub-library result is written back to the parent library database. Recording history requires an additional storage mechanism; otherwise only the latest result is retained in the library.

In the course of implementing the concept of the present disclosure, the inventor found that the file-based organization lacks systematic management of meta information, so that the meta information about job results and the job process (who did the work and what was done) is loosely organized and histories are difficult to search quickly. In addition, submitting a job requires maintaining a complicated directory structure, which is costly and error-prone. With database-based storage, exporting a sub-library or writing a sub-library back incurs a higher computational cost during the job than working with files. Moreover, because only the final result is retained for a given library, recording and querying history requires an additional data storage mechanism, which couples results with history and complicates system design. Finally, improving the reliability of database-based storage requires additional primary/standby deployment.

Fig. 1 schematically illustrates an exemplary system architecture to which data management methods and apparatuses may be applied according to embodiments of the present disclosure.

It should be noted that fig. 1 is only an example of a system architecture to which embodiments of the present disclosure may be applied, intended to help those skilled in the art understand the technical content of the present disclosure; it does not mean that embodiments of the present disclosure cannot be used in other devices, systems, environments, or scenarios. For example, in another embodiment, an exemplary system architecture to which the data management method and apparatus may be applied may include a terminal device, and the terminal device may implement the data management method and apparatus provided by the embodiments of the present disclosure without interacting with a server.

As shown in fig. 1, a system architecture 100 according to this embodiment may include terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 is used as a medium to provide communication links between the terminal devices 101, 102, 103 and the server 105. The network 104 may include various connection types, such as wired and/or wireless communication links, and the like.

The user may interact with the server 105 via the network 104 using the terminal devices 101, 102, 103 to receive or send messages or the like. Various communication client applications may be installed on the terminal devices 101, 102, 103, such as a knowledge reading class application, a web browser application, a search class application, an instant messaging tool, a mailbox client and/or social platform software, etc. (as examples only).

The terminal devices 101, 102, 103 may be a variety of electronic devices having a display screen and supporting web browsing, including but not limited to smartphones, tablets, laptop and desktop computers, and the like.

The server 105 may be a server providing various services, such as a background management server (by way of example only) providing support for content browsed by the user using the terminal devices 101, 102, 103. The background management server may analyze and process the received data such as the user request, and feed back the processing result (e.g., the web page, information, or data obtained or generated according to the user request) to the terminal device. The server can be a cloud server, also called a cloud computing server or a cloud host, and is a host product in a cloud computing service system, so that the defects of high management difficulty and weak service expansibility in the traditional physical hosts and VPS service ("Virtual Private Server" or simply "VPS") are overcome. The server may also be a server of a distributed system or a server that incorporates a blockchain.

It should be noted that, the data management method provided by the embodiments of the present disclosure may be generally performed by the terminal device 101, 102, or 103. Accordingly, the data management apparatus provided by the embodiments of the present disclosure may also be provided in the terminal device 101, 102, or 103.

Alternatively, the data management methods provided by embodiments of the present disclosure may also be generally performed by the server 105. Accordingly, the data management apparatus provided by the embodiments of the present disclosure may be generally provided in the server 105. The data management method provided by the embodiments of the present disclosure may also be performed by a server or a server cluster that is different from the server 105 and is capable of communicating with the terminal devices 101, 102, 103 and/or the server 105. Accordingly, the data management apparatus provided by the embodiments of the present disclosure may also be provided in a server or a server cluster that is different from the server 105 and is capable of communicating with the terminal devices 101, 102, 103 and/or the server 105.

For example, when an operation history needs to be recorded, the terminal device 101, 102, or 103 may determine the first operation record of the database according to the first version information of the database and the generator information of the database corresponding to the first version information; then determine the second operation record of the data unit in the database according to the second version information of the data unit and the storage identification of the data unit corresponding to the second version information; and then determine the historical operation record of the database according to the first operation record and the second operation record. Alternatively, a server or server cluster capable of communicating with the terminal devices 101, 102, 103 and/or the server 105 may analyze the database and the data units and determine the historical operation record of the database.

It should be understood that the number of terminal devices, networks and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.

Fig. 2 schematically illustrates a flow chart of a data management method according to an embodiment of the present disclosure.

As shown in fig. 2, the method includes operations S210 to S230.

In operation S210, a first operation record of a database is determined according to first version information of the database and generator information of the database corresponding to the first version information.

In operation S220, a second operation record of the data unit in the database is determined according to the second version information of the data unit in the database and the storage identifier of the data unit corresponding to the second version information.

In operation S230, a historical operation record of the database is determined according to the first operation record and the second operation record.

According to an embodiment of the present disclosure, the first version information is used, for example, to characterize the version of the database. One database may correspond to multiple versions, and different versions of the database differ in content. The generator information is used, for example, to characterize the source of the database. For a database that is generated and submitted for the first time, the generator information may be "none"; for a database generated and submitted from another database, the generator information may be the library identification of that other database. The first version information, the generator information, and the library identification of the database may constitute one operation record of the database, i.e., the first operation record.

According to an embodiment of the present disclosure, the first operation record may further include at least one of a submitter and a submission time of the database, for example, without limitation.

Table 1 schematically shows a representation of a first operation record.

Library identification	Commit version	Submitter	Submitting job library	Commit time
Parent library	100	Initialization	None	2020-6-2
Parent library	101	User a	Job library 1	2020-6-2

TABLE 1

Table 1 records that a parent library was first generated and submitted on June 2, 2020, with an initial version number of 100. On the same day, user a modified the parent library in job library 1 and submitted it again; the version of this second submission is 101. Note that the library identification may be any identifier of the parent library, which is not limited herein.

According to an embodiment of the present disclosure, the second version information is used, for example, to characterize the version of a data unit in the database, where a data unit may be the basic unit of which the database is composed. For example, where the database is a set of documents, the data unit may be a file; where the database holds electronic map files, the data unit may be the file for one map sheet. A map sheet is a drawing format that refers to the size of the sheet on which graphics are drawn; it can serve as the basic unit for collecting data in an electronic map, and its size can be customized. One data unit may correspond to multiple versions, and different versions of the data unit differ in content. The storage identification is used, for example, to characterize the actual storage location of the data unit, such as an actual storage directory or a URL link, and the corresponding data unit can be obtained by accessing that directory or link. The second version information, the storage identification, and the data unit together determine the second operation record, from which the commit details of the database can be further determined.

In accordance with this embodiment of the present disclosure, to further describe the details of the submission of the database, the second operation record may further include, for example, at least one of a submitter and a time of submission of each data unit of the database, without limitation.

Table 2 schematically shows a representation of a second operation record.

Library identification	File name	File version	Storage identification	Submitter	Commit time
Parent library	Sheet A	100	http://file/a_100	Initialization	2020-6-2
Parent library	Sheet B	100	http://file/b_100	Initialization	2020-6-2
Parent library	Sheet A	101	http://file/a_2	User a	2020-6-2

TABLE 2

Table 2 records that the parent library first generated and submitted on June 2, 2020 includes two files, sheet A and sheet B. The version numbers of sheet A and sheet B may follow the version number of the parent library, so both are determined to be 100. On the same day, user a's modification of the parent library in job library 1 was a modification of sheet A, and the version number of the modified sheet A is consistent with the version number of the parent library as submitted the second time, namely 101. The files of sheet A, sheet B, and the modified sheet A can be accessed through the address links listed as storage identifications.

According to the embodiments of the present disclosure, history management of the database can be performed by managing meta information of the database and its data units, such as the library identification, file name, and storage identification, in combination with Tables 1 and 2. In this history management, each database can have its own identification and version, and the meta information of each file can also have a corresponding version; each time a file update is submitted, the version number of the submitted file can be kept consistent with the version number of the database. In this way, with the structures of Tables 1 and 2, every commit operation on every database can be recorded in detail.
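The relationship between the two record types can be sketched as follows. This is only a minimal illustration under stated assumptions, not the disclosed implementation: the names `LibraryRecord`, `UnitRecord`, and `determine_history`, and the use of in-memory lists, are introduced here for illustration. The field values follow Tables 1 and 2, with the two map sheets named "sheet A" and "sheet B".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LibraryRecord:
    """First operation record: one commit of a database (hypothetical layout)."""
    library_id: str
    version: int               # first version information
    generator: Optional[str]   # generator information; None for the initial commit


@dataclass(frozen=True)
class UnitRecord:
    """Second operation record: one commit of a data unit (hypothetical layout)."""
    library_id: str
    unit_name: str
    version: int               # second version information
    storage_id: str            # storage identification, e.g. a directory or URL


def determine_history(library_records, unit_records):
    """Group the data units committed in each database version (operation S230)."""
    history = {}
    for rec in library_records:
        units = [u for u in unit_records
                 if u.library_id == rec.library_id and u.version == rec.version]
        history[(rec.library_id, rec.version)] = {"commit": rec, "units": units}
    return history


# Values taken from Tables 1 and 2.
library_records = [
    LibraryRecord("parent library", 100, None),
    LibraryRecord("parent library", 101, "job library 1"),
]
unit_records = [
    UnitRecord("parent library", "sheet A", 100, "http://file/a_100"),
    UnitRecord("parent library", "sheet B", 100, "http://file/b_100"),
    UnitRecord("parent library", "sheet A", 101, "http://file/a_2"),
]
history = determine_history(library_records, unit_records)
print([u.unit_name for u in history[("parent library", 101)]["units"]])
```

Keying the history by library identification and version keeps each commit's meta information separate from the stored files themselves, which are reached only through the storage identification.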

By constructing historical operation records, the embodiments of the present disclosure address the high cost of having personnel organize and maintain data by hand. All historical operation records are retained, and problems can be quickly traced and located through the retrieval function of the system.

The method shown in fig. 2 is further described below in connection with the specific examples.

According to an embodiment of the present disclosure, the data management method further includes: in response to an initial commit operation on the database, determining first initial version information of the database; in response to an update operation on the database, determining first updated version information of the updated database; and determining the first version information according to the first initial version information and the first updated version information.

According to an embodiment of the present disclosure, the first initial version information characterizes, for example, the version of the database as first submitted, such as 100. The first updated version information characterizes, for example, the version of the database as submitted again after an update operation, such as a modification, has been performed on the first submitted database, such as 101.

Through the embodiments of the present disclosure, every submission of the database can be recorded, which facilitates subsequent retrieval and problem location.
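As a minimal sketch of how the first version information could be accumulated from the initial and updated versions: the class `DatabaseVersions` and its method names are hypothetical, and incrementing the version by one on each update is an assumption that merely matches the 100 and 101 example above.

```python
class DatabaseVersions:
    """Tracks the first version information of one database (hypothetical helper)."""

    def __init__(self, initial_version: int):
        # First initial version information, fixed at the initial commit.
        self.versions = [initial_version]

    def commit_update(self) -> int:
        # First updated version information; incrementing by one is an
        # assumption that matches the 100 -> 101 example in the text.
        updated = self.versions[-1] + 1
        self.versions.append(updated)
        return updated


parent = DatabaseVersions(initial_version=100)
parent.commit_update()
print(parent.versions)  # [100, 101]
```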

According to an embodiment of the present disclosure, the data management method further includes: in response to an update operation on the database, generating a first job library for completing the update operation; and after the update of the database is completed using the first job library, determining the first updated version information of the database in response to a commit operation on the updated database in the first job library.

According to an embodiment of the present disclosure, the first job library is, for example, an additional storage space that exists independently of the original database. The database to be modified can be transferred to this storage space, modified there, and resubmitted, so that version information, i.e., the first updated version information, can be determined for the resubmitted database once the resubmission succeeds.

Table 3 schematically shows a record format based on the first job library.

Library identification	Commit version	Submitter	Submitting job library	Commit time
Job library 1	1	Initialization	None	2020-6-2
Job library 1	2	User a	None	2020-6-2

TABLE 3

The following information can be recorded by table 3: in the case of preparing to modify the database, one job library 1 may be created and the initial version information of the job library 1 is defined as 1. After the modifications to the database are completed in the job base 1 and submitted, the job base version may be increased to 2.

Through the embodiments of the present disclosure, the job library is introduced to provide additional space for modifying the database, so that both the updated and the non-updated versions of the database are preserved, which facilitates subsequent retrieval.
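The job library life cycle reflected in Table 3 can be illustrated with a short sketch. The class `JobLibrary`, its attributes, and the dictionary used to stand in for database content are hypothetical; the source does not specify how the independent storage space is implemented.

```python
import copy


class JobLibrary:
    """Independent working area for modifying a database (hypothetical sketch)."""

    def __init__(self, library_id: str, source_data: dict):
        self.library_id = library_id
        # Work on a copy so the original database is untouched until submission.
        self.data = copy.deepcopy(source_data)
        self.version = 1  # initial version of the job library, as in Table 3
        self.records = [(library_id, self.version, "initialization")]

    def commit(self, submitter: str, changes: dict) -> int:
        """Apply changes inside the job library and record a new version."""
        self.data.update(changes)
        self.version += 1
        self.records.append((self.library_id, self.version, submitter))
        return self.version


parent_data = {"sheet A": "original content"}
job = JobLibrary("job library 1", parent_data)
job.commit("user a", {"sheet A": "modified content"})
```

Because the job library holds its own copy, `parent_data` still contains the original content after the commit, while the job library records versions 1 and 2.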

According to an embodiment of the present disclosure, the data management method further includes: after the update of the database is completed using the first job library, determining first library identification information of the first job library in response to a commit operation on the updated database in the first job library; and determining, according to the first library identification information, the generator information of the database corresponding to the first updated version information.

According to an embodiment of the present disclosure, it may be determined from Table 3, for example, that the first library identification information of the first job library used to complete the update operation on the database is job library 1. From this, the generator information of the updated database can be determined to be job library 1.

Through the above embodiments of the present disclosure, the update records of an updated database can be retrieved using the generator information, and the details of each update to the database can be further determined.
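A minimal sketch of how the generator information could be filled in when a job library submits its result: the function `submit_to_parent` and the dictionary keys are hypothetical, and the version increment by one is an assumption that matches the 100 and 101 example.

```python
def submit_to_parent(parent_records: list, parent_id: str,
                     job_library_id: str, submitter: str) -> dict:
    """Append a parent-library record whose generator is the submitting job library."""
    latest_version = parent_records[-1]["version"]
    record = {
        "library_id": parent_id,
        "version": latest_version + 1,   # assumed increment, as in 100 -> 101
        "submitter": submitter,
        "generator": job_library_id,     # generator information
    }
    parent_records.append(record)
    return record


parent_records = [
    {"library_id": "parent library", "version": 100,
     "submitter": "initialization", "generator": None},
]
new_record = submit_to_parent(parent_records, "parent library", "job library 1", "user a")
```

The new record corresponds to the second row of Table 1: version 101, submitted by user a, generated by job library 1.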

According to an embodiment of the present disclosure, the data management method further includes: in response to a modification operation to the data unit in the database, a second job library is generated for implementing the modification operation. After the modification of the data unit is completed with the second job base, determining first updated version information of the database in response to a commit operation for the modified data unit in the second job base.

According to an embodiment of the present disclosure, the second job library is, for example, an additional storage space that exists independently of the original database. The data units to be modified can be transferred to this storage space, modified there, and resubmitted, so that version information can be determined for the resubmitted data units once the resubmission succeeds.

Table 4 schematically shows a record format based on the second job library.

Library identification	File name	File version	Storage identification	Submitter	Commit time
Job library 2	Sheet A	1	http://file/a_100	User a	2020-6-2
Job library 2	Sheet A	2	http://file/a_2	User a	2020-6-2

TABLE 4

Table 4 records that, when a data unit (e.g., sheet A) is to be modified, a job library 2 can be created and its initial version information defined as 1. After the modification of sheet A is completed and submitted in job library 2, the version of job library 2 may be increased to 2. The version number of the modified sheet A may be consistent with the version number of job library 2.

Note that job library 1 and job library 2 may be the same job library and may perform the same operation. For example, where the modification of the database consists of modifying sheet A, job library 1 and job library 2 perform the same operation.

Through the embodiment of the disclosure, the job library is introduced, and extra space is provided for modification of the data units, so that each updated or non-updated data unit is saved, and convenience is provided for subsequent retrieval.

According to an embodiment of the present disclosure, the data management method further includes: after the modification of the data unit is completed using the second job library, determining second library identification information of the second job library in response to a commit operation on the modified data unit in the second job library; and determining, according to the second library identification information, the generator information of the database corresponding to the first updated version information.

According to an embodiment of the present disclosure, it may be determined from Table 4, for example, that the second library identification information of the second job library used to complete the modification of sheet A is job library 2. From this, it can be determined that the generator information of the database to which the modified data unit is submitted is job library 2, and hence that the generator information of the modified data unit is also job library 2.

Through the above embodiments of the present disclosure, the modification records of a modified data unit can be retrieved using the generator information, and the update details of each data unit during a database update can be further determined.

According to an embodiment of the present disclosure, the data management method further includes: determining, according to the first initial version information, second initial version information of the data units in the database corresponding to the first initial version information; determining, according to the first updated version information, second updated version information of the updated data units in the database corresponding to the first updated version information; and determining the second version information according to the second initial version information and the second updated version information.

According to an embodiment of the present disclosure, the second initial version information characterizes, for example, the version of the data units in the database as first submitted, such as 100. The second updated version information characterizes, for example, the version of the data units submitted again after an update operation, such as a modification, has been performed on the first submitted database, such as 101.

Through the embodiments of the present disclosure, every submission of every data unit in the database can be recorded, which facilitates subsequent retrieval and problem location.
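The derivation of data unit versions from database versions can be sketched as follows. The functions `record_unit_commits` and `unit_versions` and the dictionary keys are hypothetical; the sketch assumes that only the data units changed in a commit receive that commit's version number, while unchanged units keep their earlier version.

```python
def record_unit_commits(unit_records, library_id, library_version, changed_units):
    """Record each changed unit under the version number of the enclosing commit."""
    for name, storage_id in changed_units.items():
        unit_records.append({
            "library_id": library_id, "unit": name,
            "version": library_version, "storage_id": storage_id,
        })


def unit_versions(unit_records, library_id, unit):
    """Return every version in which `unit` was committed to `library_id`."""
    return sorted(r["version"] for r in unit_records
                  if r["library_id"] == library_id and r["unit"] == unit)


unit_records = []
# Initial commit (library version 100): both sheets receive version 100.
record_unit_commits(unit_records, "parent library", 100,
                    {"sheet A": "http://file/a_100", "sheet B": "http://file/b_100"})
# Update commit (library version 101): only sheet A changed.
record_unit_commits(unit_records, "parent library", 101,
                    {"sheet A": "http://file/a_2"})
```

With these records, sheet A has versions 100 and 101 in the parent library and sheet B has only version 100, which matches Table 2.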

According to embodiments of the present disclosure, submitting job library 1 or job library 2 back to the parent library represents, for example, a modification of the corresponding parent library. In this case, the version of the parent library may be increased to 101. Meanwhile, based on the generator information, that is, the column recording which job library made the submission, it is possible to record which job library's modification produced each update of the parent library, and more detailed update information can then be queried from that job library.

Through the above process, a corresponding record is made each time a map sheet file is modified. To find out how the data of a file has changed, one can first use the sheet number together with the parent library to identify the parent library versions in which the sheet was modified. The corresponding job library can then be located quickly from the record of which job library submitted that version of the parent library. A similar lookup is then performed in the job library to find the change information for each modification.

For example, to find how picture A was modified, one may first look up, by picture A + parent library, that picture A has two versions, 100 and 101. Since the modification in version 101 was made through job library 1, all modification information of picture A in job library 1 can then be queried by picture A + job library 1.
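The two-step lookup described above can be sketched as follows. The table layouts, the names `library_generators`, `unit_history`, and `trace_unit`, and the job-library versions 1 and 2 recorded for data unit A are all assumptions made for illustration; the disclosure does not prescribe a storage schema.

```python
from typing import Optional

# (library name, library version) -> generator, i.e. the job library whose
# submission produced that version (None for the initial submission)
library_generators: dict[tuple[str, int], Optional[str]] = {
    ("parent", 100): None,
    ("parent", 101): "job_library_1",
}

# (library name, data unit identifier) -> versions in which the unit changed
unit_history: dict[tuple[str, str], list[int]] = {
    ("parent", "A"): [100, 101],
    ("job_library_1", "A"): [1, 2],
}


def trace_unit(unit_id: str, library: str = "parent") -> list[tuple[str, int]]:
    """List the (library, version) pairs in which a unit changed, following
    each version's generator into the job library that produced it."""
    trail: list[tuple[str, int]] = []
    for version in unit_history.get((library, unit_id), []):
        trail.append((library, version))
        generator = library_generators.get((library, version))
        if generator is not None:
            # Repeat the same lookup inside the job library.
            trail.extend(trace_unit(unit_id, generator))
    return trail


trail = trace_unit("A")
```

Here `trail` first lists the parent-library versions of unit A and then, for version 101, descends into `job_library_1` to list the finer-grained changes made there.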

Fig. 3 schematically illustrates an overall flow diagram of a data management method according to an embodiment of the disclosure.

As shown in fig. 3, there is a parent library 310 with an initial version of 100. When data in the parent library 310 needs to be modified, a job library 320 with an initial version of 1 may first be created, where the job library 320 is used to complete the modification of the relevant data in the parent library 310. After the modification is completed, a job library 330 of version 2 containing the modified relevant data is obtained. The modified relevant data in the job library 330 may then be submitted to the parent library 310, yielding a parent library 340 of version 101 that contains the modified relevant data. Likewise, when data in the parent library 340 needs to be modified, a job library 350 with an initial version of 1 may first be created to complete the modification of the relevant data in the parent library 340. After the modification is completed, a job library 360 of version 2 containing the modified relevant data is obtained. The modified relevant data in the job library 360 may then be submitted to the parent library 340, yielding a parent library 370 of version 102 that contains the modified relevant data. Further modifications proceed in the same manner.
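The flow of fig. 3 can be sketched as follows. The class `Library` and the functions `create_job_library`, `modify`, and `commit_to_parent` are hypothetical names introduced for illustration, and the sketch tracks only version numbers and generator information, not the data itself.

```python
from dataclasses import dataclass, field


@dataclass
class Library:
    """Hypothetical versioned library (a parent library or a job library)."""
    name: str
    version: int
    # library version -> name of the job library whose submission produced it
    generators: dict[int, str] = field(default_factory=dict)


def create_job_library(name: str) -> Library:
    # A job library starts at initial version 1.
    return Library(name=name, version=1)


def modify(job: Library) -> None:
    # Completing a modification advances the job library's own version.
    job.version += 1


def commit_to_parent(job: Library, parent: Library) -> int:
    # Submitting increases the parent version and records the generator.
    parent.version += 1
    parent.generators[parent.version] = job.name
    return parent.version


parent = Library(name="parent", version=100)
for job_name in ("job_library_1", "job_library_2"):
    job = create_job_library(job_name)
    modify(job)                    # job library: version 1 -> 2
    commit_to_parent(job, parent)  # parent library: 100 -> 101 -> 102
```

Each parent version above 100 is thereby linked to the job library that produced it, which is the information a later history query relies on.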

According to the embodiment of the present disclosure, based on the data management method, for any database or data unit that has undergone a modification or update operation, there are corresponding records of which version of the database or data unit the corresponding job library was derived from and of the version the database had when the job library was submitted to it, so that the ability to quickly search history information can be effectively improved. Meanwhile, building the history system addresses the high cost of having personnel organize and maintain data: all operation history is retained, and problems can be quickly tracked and located through the retrieval function of the system.

According to an embodiment of the present disclosure, the data management method further includes: determining first target version information of the data units in the database released in the last data release process; determining second target version information of the data units in the database at the current moment; determining, according to the first target version information and the second target version information, a target data unit whose version information has changed; and performing data release on the target data unit.

According to the embodiment of the present disclosure, the data produced needs to be delivered to each downstream application through a data release pipeline. The nationwide data volume is huge, while each update changes the data only locally; therefore, by relying on the complete history management of the parent library, only the changed data needs to be sent downstream, in an incremental manner.

According to the embodiment of the present disclosure, for example, the files changed between the two adjacent versions released in the last data release process and the current data release process may first be obtained. With the version management mechanism, each time data is updated, the version of the updated file is kept consistent with the version of the library, so that the files changed in each version relative to the previous version can be conveniently queried.

Fig. 4 schematically illustrates a diagram of determining the changes between adjacent versions according to an embodiment of the disclosure.

As shown in fig. 4, the parent library 410 of version 100 includes picture A, picture B, and picture C, each of version 100. The parent library 420 of version 101 includes picture A of version 101, and picture B and picture C of version 100. The parent library 430 of version 102 includes picture A of version 101, picture B of version 100, and picture C of version 102. Accordingly, by comparing the parent libraries 410 and 420, or by directly comparing the parent libraries 410 and 430, the files changed between the two versions can be conveniently found, and only the changed files are then processed and released.

According to the embodiment of the present disclosure, based on the differences between versions, only the changed files are processed and released, thereby realizing incremental data release and greatly improving the efficiency of data release.
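The comparison of fig. 4 can be sketched as a diff over per-version snapshots. The `snapshots` dictionary reproduces the version numbers given for fig. 4; its layout and the function name `changed_units` are assumptions made for illustration, and the sketch does not handle data units that were deleted between versions.

```python
# parent library version -> {data unit identifier: data unit version}
snapshots: dict[int, dict[str, int]] = {
    100: {"A": 100, "B": 100, "C": 100},
    101: {"A": 101, "B": 100, "C": 100},
    102: {"A": 101, "B": 100, "C": 102},
}


def changed_units(previous: dict[str, int], current: dict[str, int]) -> list[str]:
    """Return the identifiers of units whose version differs between two
    snapshots; units that are new in `current` also count as changed."""
    return sorted(uid for uid, ver in current.items() if previous.get(uid) != ver)


adjacent = changed_units(snapshots[100], snapshots[101])  # versions 100 vs 101
skipping = changed_units(snapshots[100], snapshots[102])  # versions 100 vs 102
```

Only the units returned by `changed_units` would then be processed and released, which is what makes the release incremental.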

According to embodiments of the present disclosure, the data units described above may be stored in a distributed storage system.

Through the embodiment of the present disclosure, files are stored in a distributed storage system, so that the reliability of data storage can be effectively improved.

Fig. 5 schematically shows a block diagram of a data management apparatus according to an embodiment of the present disclosure.

As shown in fig. 5, the data management apparatus 500 includes a first determination module 510, a second determination module 520, and a third determination module 530.

The first determining module 510 is configured to determine a first operation record of the database according to the first version information of the database and the generator information of the database corresponding to the first version information.

The second determining module 520 is configured to determine a second operation record of the data unit in the database according to the second version information of the data unit in the database and the storage identifier of the data unit corresponding to the second version information.

The third determining module 530 is configured to determine a historical operation record of the database according to the first operation record and the second operation record.
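How the three modules might cooperate can be sketched as follows. The record types `LibraryRecord` and `UnitRecord` and the function `build_history` are hypothetical, and joining the two record sets on the version number assumes that an updated data unit carries the same version as the database submission that updated it.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LibraryRecord:
    """First operation record: a database version and its generator."""
    version: int
    generator: Optional[str]


@dataclass(frozen=True)
class UnitRecord:
    """Second operation record: a data-unit version and its storage identifier."""
    version: int
    storage_id: str


def build_history(first: list[LibraryRecord], second: list[UnitRecord]) -> dict[int, dict]:
    """Combine both record sets into one history keyed by database version."""
    history = {r.version: {"generator": r.generator, "units": []} for r in first}
    for unit in second:
        # Attach each unit submission to the database version it belongs to.
        if unit.version in history:
            history[unit.version]["units"].append(unit.storage_id)
    return history


first_records = [LibraryRecord(100, None), LibraryRecord(101, "job_library_1")]
second_records = [UnitRecord(100, "A"), UnitRecord(100, "B"), UnitRecord(101, "A")]
history = build_history(first_records, second_records)
```

The resulting `history` answers, for each database version, which job library generated it and which data units were submitted in it.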

According to an embodiment of the present disclosure, the data management apparatus further includes a fourth determination module, a fifth determination module, and a sixth determination module.

The fourth determining module is configured to determine the first initial version information of the database in response to an initial commit operation on the database.

The fifth determining module is configured to determine, in response to an update operation on the database, first updated version information of the updated database.

The sixth determining module is configured to determine the first version information according to the first initial version information and the first updated version information.

According to an embodiment of the present disclosure, the data management apparatus further includes a first generation module and a seventh determination module.

The first generation module is configured to generate, in response to an update operation on the database, a first job library for completing the update operation.

The seventh determining module is configured to determine, after the update of the database is completed using the first job library, the first updated version information of the database in response to a commit operation for the updated database in the first job library.

According to an embodiment of the present disclosure, the data management apparatus further includes an eighth determination module and a ninth determination module.

The eighth determining module is configured to determine, after the update of the database is completed using the first job library, first library identification information of the first job library in response to a commit operation for the updated database in the first job library.

The ninth determining module is configured to determine, according to the first library identification information, the generator information of the database corresponding to the first updated version information.

According to an embodiment of the present disclosure, the data management apparatus further includes a second generation module and a tenth determination module.

The second generation module is configured to generate, in response to a modification operation on a data unit in the database, a second job library for implementing the modification operation.

The tenth determining module is configured to determine, after the modification of the data unit is completed using the second job library, the first updated version information of the database in response to a commit operation for the modified data unit in the second job library.

According to an embodiment of the present disclosure, the data management apparatus further includes an eleventh determination module and a twelfth determination module.

The eleventh determining module is configured to determine, after the modification of the data unit is completed using the second job library, second library identification information of the second job library in response to a commit operation for the modified data unit in the second job library.

The twelfth determining module is configured to determine, according to the second library identification information, the generator information of the database corresponding to the first updated version information.

According to an embodiment of the present disclosure, the data management apparatus further includes a thirteenth determination module, a fourteenth determination module, and a fifteenth determination module.

The thirteenth determining module is configured to determine, according to the first initial version information, second initial version information of the data units in the database corresponding to the first initial version information.

The fourteenth determining module is configured to determine, according to the first updated version information, second updated version information of the updated data units in the database corresponding to the first updated version information.

The fifteenth determining module is configured to determine the second version information according to the second initial version information and the second updated version information.

According to an embodiment of the present disclosure, the data management apparatus further includes a sixteenth determination module, a seventeenth determination module, an eighteenth determination module, and a publication module.

The sixteenth determining module is configured to determine first target version information of the data units in the database released in the last data release process.

The seventeenth determining module is configured to determine second target version information of the data units in the database at the current moment.

The eighteenth determining module is configured to determine, according to the first target version information and the second target version information, a target data unit whose version information has changed.

The release module is configured to perform data release on the target data unit.

According to an embodiment of the present disclosure, data units are stored in a distributed storage system.

According to embodiments of the present disclosure, the present disclosure also provides an electronic device, a readable storage medium and a computer program product.

According to an embodiment of the present disclosure, an electronic device includes: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor, the instructions being executable by the at least one processor to enable the at least one processor to perform the method as described above.

According to an embodiment of the present disclosure, a non-transitory computer-readable storage medium stores computer instructions for causing a computer to perform the method as described above.

According to an embodiment of the present disclosure, a computer program product includes a computer program which, when executed by a processor, implements the method as described above.

Fig. 6 illustrates a schematic block diagram of an example electronic device 600 that may be used to implement embodiments of the present disclosure. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital processing devices, cellular telephones, smartphones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be exemplary only, and are not meant to limit implementations of the disclosure described and/or claimed herein.

As shown in fig. 6, the device 600 includes a computing unit 601 that can perform various appropriate actions and processes according to a computer program stored in a Read Only Memory (ROM) 602 or a computer program loaded from a storage unit 608 into a Random Access Memory (RAM) 603. In the RAM 603, various programs and data required for the operation of the device 600 may also be stored. The computing unit 601, ROM 602, and RAM 603 are connected to each other by a bus 604. An input/output (I/O) interface 605 is also connected to bus 604.

Various components in the device 600 are connected to the I/O interface 605, including: an input unit 606 such as a keyboard, mouse, etc.; an output unit 607 such as various types of displays, speakers, and the like; a storage unit 608, such as a magnetic disk, optical disk, or the like; and a communication unit 609 such as a network card, modem, wireless communication transceiver, etc. The communication unit 609 allows the device 600 to exchange information/data with other devices via a computer network, such as the internet, and/or various telecommunication networks.

The computing unit 601 may be a variety of general and/or special purpose processing components having processing and computing capabilities. Some examples of computing unit 601 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various specialized Artificial Intelligence (AI) computing chips, various computing units running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, microcontroller, etc. The computing unit 601 performs the respective methods and processes described above, such as a data management method. For example, in some embodiments, the data management method may be implemented as a computer software program tangibly embodied on a machine-readable medium, such as storage unit 608. In some embodiments, part or all of the computer program may be loaded and/or installed onto the device 600 via the ROM 602 and/or the communication unit 609. When a computer program is loaded into RAM 603 and executed by computing unit 601, one or more steps of the data management method described above may be performed. Alternatively, in other embodiments, the computing unit 601 may be configured to perform the data management method by any other suitable means (e.g., by means of firmware).

Various implementations of the systems and techniques described above may be realized in digital electronic circuitry, integrated circuit systems, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), Systems on Chip (SOCs), Complex Programmable Logic Devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be a special-purpose or general-purpose programmable processor that can receive data and instructions from, and transmit data and instructions to, a storage system, at least one input device, and at least one output device.

Program code for carrying out methods of the present disclosure may be written in any combination of one or more programming languages. The program code may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus such that the program code, when executed by the processor or controller, causes the functions/operations specified in the flowchart and/or block diagram to be implemented. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package partly on the machine and partly on a remote machine, or entirely on the remote machine or server.

In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. The machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.

To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and pointing device (e.g., a mouse or trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic input, speech input, or tactile input.

The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: Local Area Networks (LANs), Wide Area Networks (WANs), and the Internet.

The computer system may include a client and a server. The client and server are typically remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server may be a cloud server, a server of a distributed system, or a server incorporating a blockchain.

It should be appreciated that steps may be reordered, added, or deleted using the various forms of flows shown above. For example, the steps recited in the present disclosure may be performed in parallel, sequentially, or in a different order, provided that the desired results of the technical solutions of the present disclosure can be achieved; no limitation is imposed herein.

The above detailed description should not be taken as limiting the scope of the present disclosure. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives are possible, depending on design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present disclosure are intended to be included within the scope of the present disclosure.

Claims

1. A data management method, comprising:

determining a first operation record of a database according to first version information of the database and generator information of the database corresponding to the first version information, wherein the first version information represents the version of content in the database;

determining a second operation record of the data unit in the database according to second version information of the data unit in the database and a storage identifier of the data unit corresponding to the second version information, wherein the second version information represents the version of the content of the data unit;

determining a historical operation record of the database according to the first operation record and the second operation record;

determining first target version information of a data unit in a database released in the last data release process;

determining second target version information of the data units in the database at the current moment;

determining a target data unit with version information changed according to the first target version information and the second target version information; and

performing data release on the target data unit.

2. The method of claim 1, further comprising:

determining first initial version information of the database in response to an initial commit operation to the database;

determining first updated version information of the updated database in response to an update operation on the database; and

determining the first version information according to the first initial version information and the first updated version information.

3. The method of claim 2, further comprising:

generating a first job library for completing the update operation in response to the update operation on the database; and

after the update of the database is completed using the first job library, determining the first updated version information of the database in response to a commit operation for the updated database in the first job library.

4. The method of claim 3, further comprising:

after the update of the database is completed using the first job library, determining first library identification information of the first job library in response to the commit operation for the updated database in the first job library; and

determining generator information of the database corresponding to the first updated version information according to the first library identification information.

5. The method of claim 2, further comprising:

generating a second job library for implementing a modification operation in response to the modification operation on a data unit in the database; and

after the modification of the data unit is completed using the second job library, determining the first updated version information of the database in response to a commit operation for the modified data unit in the second job library.

6. The method of claim 5, further comprising:

after the modification of the data unit is completed using the second job library, determining second library identification information of the second job library in response to the commit operation for the modified data unit in the second job library; and

determining generator information of the database corresponding to the first updated version information according to the second library identification information.

7. The method of claim 2, further comprising:

determining second initial version information of data units in a database corresponding to the first initial version information according to the first initial version information;

determining second updated version information of the data units with the update in the database corresponding to the first updated version information according to the first updated version information; and

determining the second version information according to the second initial version information and the second updated version information.

8. The method of any of claims 1 to 7, wherein the data units are stored in a distributed storage system.

9. A data management apparatus comprising:

a first determining module, configured to determine a first operation record of a database according to first version information of the database and generator information of the database corresponding to the first version information, wherein the first version information represents the version of content in the database;

a second determining module, configured to determine a second operation record of a data unit in the database according to second version information of the data unit in the database and a storage identifier of the data unit corresponding to the second version information, where the second version information represents a version of content of the data unit;

a third determining module, configured to determine a historical operation record of the database according to the first operation record and the second operation record;

a sixteenth determining module, configured to determine first target version information of a data unit in a database published in a last data publishing process;

a seventeenth determining module, configured to determine second target version information of the data unit in the database at the current time;

an eighteenth determining module, configured to determine, according to the first target version information and the second target version information, a target data unit in which version information changes; and

a release module, configured to perform data release on the target data unit.

10. The apparatus of claim 9, further comprising:

a fourth determining module, configured to determine first initial version information of the database in response to an initial commit operation to the database;

a fifth determining module, configured to determine first updated version information of the updated database in response to an update operation on the database; and

a sixth determining module, configured to determine the first version information according to the first initial version information and the first updated version information.

11. The apparatus of claim 10, further comprising:

a first generation module, configured to generate, in response to the update operation on the database, a first job library for completing the update operation; and

a seventh determining module, configured to determine, after the update of the database is completed using the first job library, the first updated version information of the database in response to a commit operation for the updated database in the first job library.

12. The apparatus of claim 11, further comprising:

an eighth determining module, configured to determine, after completing updating of the database with the first job library, first library identification information of the first job library in response to a commit operation to an updated database in the first job library; and

a ninth determining module, configured to determine, according to the first library identification information, generator information of the database corresponding to the first updated version information.

13. The apparatus of claim 10, further comprising:

a second generation module, configured to generate, in response to a modification operation on a data unit in the database, a second job library for implementing the modification operation; and

a tenth determining module, configured to determine, after the modification of the data unit is completed using the second job library, the first updated version information of the database in response to a commit operation for the modified data unit in the second job library.

14. The apparatus of claim 13, further comprising:

an eleventh determining module, configured to determine, after completing modification of the data unit with the second job library, second library identification information of the second job library in response to a commit operation for the modified data unit in the second job library; and

a twelfth determining module, configured to determine, according to the second library identification information, generator information of the database corresponding to the first updated version information.

15. The apparatus of claim 10, further comprising:

a thirteenth determining module, configured to determine, according to the first initial version information, second initial version information of a data unit in a database corresponding to the first initial version information;

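a fourteenth determining module, configured to determine, according to the first updated version information, second updated version information of the updated data units in the database corresponding to the first updated version information; and

a fifteenth determining module, configured to determine the second version information according to the second initial version information and the second updated version information.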

16. An electronic device, comprising:

at least one processor; and

a memory communicatively coupled to the at least one processor; wherein,

the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-8.

17. A non-transitory computer-readable storage medium storing computer instructions for causing a computer to perform the method of any one of claims 1-8.