CN102693312A - Flexible transaction management method in key-value store data storage - Google Patents

Publication number
CN102693312A
Authority
CN
China
Prior art keywords
data
key-value
log
coordination module
store
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN2012101693018A
Other languages
Chinese (zh)
Other versions
CN102693312B (en)
Inventor
***
丁贵广
朱妤晴
衣国垒
杨义繁
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tsinghua University
Original Assignee
Tsinghua University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tsinghua University
Priority to CN201210169301.8A
Publication of CN102693312A
Application granted
Publication of CN102693312B
Legal status: Active
Anticipated expiration

Abstract

The invention belongs to the technical field of computer database management and relates in particular to a flexible transaction management method for key-value store data storage. When data is written, a coordination module encapsulates the request into a log entry and obtains the current log position; the Paxos algorithm is used to write the log entry at the new log position; the position is recorded; a success message is returned; and the data, together with its log position, is written to the data storage. When data is read, the coordination module obtains the newest log position and checks whether the stored data is up to date. If it is, the data is read and returned to the user; if it is not, the log is read, the data is corrected accordingly, and the corrected data is read and returned to the user. The method improves the concurrency, fault tolerance and scalability of key-value store data storage. It allows the scope of a transaction to be kept as narrow as possible while system consistency is still guaranteed, which improves the concurrency of the system. The flexible transaction design also helps improve the flexibility and adaptability of database transactions.

Description

A flexible transaction management method for key-value store data storage
Technical field
The invention relates to a flexible transaction management method for key-value store data storage and belongs to the technical field of computer database management.
Background
In recent years, online interactive service platforms such as social networks and e-mail have developed rapidly. Very large user populations and massive volumes of user-generated content have created a demand for system platforms with high scalability and high concurrency. In addition, Internet applications require online services to run continuously, so system platforms must provide highly available and fault-tolerant services. The concept of cloud computing emerged in response to this trend.
In data management, these application demands require data management systems to be highly scalable and highly available. Mature traditional relational databases are widely used in network services, but they have difficulty guaranteeing high scalability and high availability. With the introduction of cloud computing, a representative class of cloud storage system appeared: the key-value store (also called a NoSQL database). Key-value stores abandon the relational model of traditional databases and adopt a simple data model based on key-value pairs. They sacrifice some database features, such as transactional access, in order to improve the scalability and fault tolerance of the data storage system. Key-value store systems of this kind are widely used on the Internet, for example Bigtable for Google services, SimpleDB for Amazon services, PNUTS for Yahoo services, and Cassandra for Facebook and Twitter services.
Cassandra is a typical representative of key-value store data storage. Compared with a relational database, its advantages are a simple data model, high scalability, availability and fault tolerance, and an application programming interface that is simple and easy to use. Cassandra is a storage system based on a peer-to-peer (P2P) architecture: every computer in the storage system (a "node", likewise below) has equal status and is responsible for storing and replicating part of the data, and no single node controls the resource allocation of the whole system. A peer-to-peer architecture helps improve the concurrency, fault tolerance and scalability of the system. Because all nodes have the same function and any of them can respond to data requests from outside the system, all nodes can respond to external requests in parallel, which raises the data throughput of the system and provides good concurrency. Fault tolerance is improved as follows: when some nodes fail and cannot work normally, the data still has replicas on other nodes, and because nodes have the same function they can readily substitute for one another. Working nodes can therefore respond to external requests in place of the failed nodes, and the system continues to respond normally. Finally, because all nodes have the same function, the flat structure of the system does not change when an existing node leaves or a new node joins; only the data in the system needs to be redistributed, so the system scales well.
While the P2P architecture gives Cassandra high concurrency, fault tolerance and scalability, it also brings some shortcomings. The most important is that Cassandra can provide only eventual consistency of data and does not support database transactions. Data consistency requires that different users accessing the same data in a database system at the same time obtain identical (that is, consistent) content. In a distributed environment, data usually needs several replicas so that the failure of a single node does not cause data loss, but this raises the problem of keeping the replicas consistent with one another. Ideal, full consistency requires all replicas of a piece of data to hold identical content at every moment. The eventual consistency provided by Cassandra means that the system cannot guarantee that the replicas agree at every moment; it can only guarantee that the data is consistent once the system has reached a steady state after a sufficiently long time. Under this form of consistency, different users accessing the same data at the same time may obtain different results while the system is running. A database transaction is a sequence of operations executed in the database as a single logical unit of work; these operations include inserting, updating and deleting data. The transaction mechanism guarantees that either all operations in a transaction unit complete successfully or none of them takes effect. Traditional databases require transactions to have atomicity (all operations succeed or all fail), consistency (the result of a transaction moves the database from one consistent state to another), isolation (operations of different transactions do not affect one another) and durability (changes made by a successfully executed transaction are not lost). These four properties are referred to as the ACID properties. Cassandra can provide only eventual consistency and therefore cannot satisfy the requirements that databases place on transactions. This is inconvenient for upper-layer application developers, who must handle at the application layer the inconsistency caused by concurrent access from multiple users.
Summary of the invention
The object of the invention is to propose a flexible transaction management method for key-value store data storage in which the range of data covered by transaction management can be changed as the user specifies, making it easier for the user to keep data consistent when using the database.
The flexible transaction management method for key-value store data storage proposed by the invention comprises the following steps:
Data write process:
(1) The user submits the data to be written to the key-value store to the coordination module; the data includes a key-value store row key. The coordination module encapsulates the user's data and the write operation into a log entry;
(2) The coordination module obtains the current newest log position from the version control module and adds 1 to it, which gives the position at which the log entry of step (1) is to be written; the log position is described by a data row in the key-value store data storage;
(3) Based on the row key of step (1) and the replication rule of the key-value store, the coordination module computes the N key-value store log nodes in which the log entry of step (1) is to be stored, where N is greater than or equal to 3;
(4) Using the Paxos consensus algorithm, the coordination module stores the log entry of step (1) in the N log nodes computed in step (3);
(5) The coordination module writes the log position of step (2) to the version control module and returns a write-success message to the user;
(6) Based on the row key of step (1) and the replication rule of the key-value store, the coordination module computes the M key-value store data nodes in which the data of step (1) is to be stored, where M is greater than or equal to 3;
(7) The coordination module writes the data submitted by the user in step (1), together with the log position of step (2), to the M data nodes;
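Write steps (1) to (7) can be sketched in Python as follows. This is a minimal in-memory illustration, not the patented implementation: the class names, the `replicas_for` placement rule and the plain loop that stands in for Paxos in step (4) are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class VersionController:
    """Tracks the newest log position of each row key (hypothetical helper)."""
    latest: dict = field(default_factory=dict)

    def newest(self, row_key):
        return self.latest.get(row_key, 0)

    def record(self, row_key, position):
        self.latest[row_key] = position


@dataclass
class Node:
    """One machine acting as both log node and data node."""
    logs: dict = field(default_factory=dict)  # (row_key, position) -> log entry
    rows: dict = field(default_factory=dict)  # row_key -> (value, position)


def replicas_for(row_key, nodes, count=3):
    """Assumed placement rule: `count` consecutive nodes chosen from the key."""
    start = sum(row_key.encode()) % len(nodes)
    return [nodes[(start + i) % len(nodes)] for i in range(count)]


def write(row_key, value, vc, nodes):
    entry = {"op": "put", "key": row_key, "value": value}  # step (1)
    position = vc.newest(row_key) + 1                      # step (2)
    log_nodes = replicas_for(row_key, nodes)               # step (3)
    for node in log_nodes:                                 # step (4): NOT Paxos,
        node.logs[(row_key, position)] = entry             # just a direct copy
    vc.record(row_key, position)                           # step (5)
    data_nodes = replicas_for(row_key, nodes)              # step (6)
    for node in data_nodes:                                # step (7)
        node.rows[row_key] = (value, position)
    return position
```

In this sketch the success message of step (5) is represented by the returned position. A real deployment would acknowledge the user after step (5) and could perform steps (6) and (7) afterwards, since the log already makes the write durable.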
Data read process:
(8) The user submits a read request to the coordination module; the request contains the row key of the data to be read from the key-value store;
(9) The coordination module obtains from the version control module the newest log position P1 corresponding to the row key of step (8);
(10) According to the replication rule of the key-value store, the coordination module computes the U data nodes that hold the row key of step (8), and obtains from one data node S among them the log position P2 corresponding to that row key;
(11) The coordination module compares the two log positions P1 and P2:
If P1 = P2, the coordination module reads the data for the row key of step (8) from data node S of step (10) and returns it to the requesting user;
If P1 > P2, the coordination module computes, according to the replication rule of the key-value store, the V log nodes that hold the row key of step (8), and obtains from one log node T among them the log entries corresponding to that row key. The coordination module then corrects the data in data node S according to the current log content in log node T, reads the data corresponding to the row key of step (8) from data node S, and returns it to the requesting user.
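The comparison in step (11) can be illustrated with a short Python sketch. It assumes, purely for illustration, that the version control module, data node S and log node T are plain dictionaries and that each log entry carries the full new value of the row; none of these names come from the patent.

```python
def read(row_key, version_controller, data_node, log_node):
    """Return the current value of row_key, repairing data_node if it is stale.

    version_controller: {row_key: P1}
    data_node:          {row_key: (value, P2)}      # data node S
    log_node:           {(row_key, position): value}  # log node T
    """
    p1 = version_controller[row_key]   # step (9)
    value, p2 = data_node[row_key]     # step (10)
    if p1 > p2:                        # step (11), stale replica
        # Replay the missing log entries in position order.
        for position in range(p2 + 1, p1 + 1):
            value = log_node[(row_key, position)]
        data_node[row_key] = (value, p1)  # correct the data in S
    return value
```

When P1 equals P2 the function returns the stored value unchanged, so an up-to-date replica costs one lookup in the version control module and one in the data node.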
The advantages of the flexible transaction management method for key-value store data storage proposed by the invention are:
1. The method adopts the peer-to-peer (P2P) architecture of the prior art, which improves the concurrency, fault tolerance and scalability of key-value store data storage.
2. The method manages transactions flexibly: the unit of data that carries transactional properties can be adjusted dynamically. A transaction may be confined to a single row, in which case reads and writes of multiple columns of that row are atomic and the data remains consistent under concurrency. A transaction may also be an entity-group transaction spanning multiple rows, in which case the ACID properties of data updates within the entity group are guaranteed. This flexibility in the range of data a transaction covers is a great convenience to database users: the user can set the size of an entity group according to the needs of the application and can keep the scope of a transaction as narrow as possible while system consistency is still guaranteed, which improves the concurrency of the system. The flexible transaction design has a very positive effect on the flexibility and adaptability of database transactions and is a major advantage of the invention.
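The effect of choosing a transaction scope can be shown with a small Python sketch. The key format `group:row` and the function names are hypothetical; the point is only that each transaction unit has its own sequence of log positions, so a narrower unit means fewer writes are ordered against one another.

```python
from collections import defaultdict


def row_unit(row_key):
    """Single-row transactions: every row is its own transaction unit."""
    return row_key


def group_unit(row_key, separator=":"):
    """Entity-group transactions: rows sharing a key prefix form one unit."""
    return row_key.split(separator, 1)[0]


def assign_positions(row_keys, unit_of):
    """Give each write the next log position within its transaction unit."""
    counters = defaultdict(int)
    assigned = []
    for key in row_keys:
        unit = unit_of(key)
        counters[unit] += 1
        assigned.append((key, unit, counters[unit]))
    return assigned
```

With `row_unit`, writes to `alice:profile` and `alice:inbox` each start their own log at position 1 and can proceed independently. With `group_unit`, both writes share the log of unit `alice` and receive positions 1 and 2, so they are ordered with respect to each other.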
Description of drawings
Fig. 1 is a schematic diagram of the module calls in the method of the invention.
Embodiment
The module calls of the flexible transaction management method for key-value store data storage proposed by the invention are shown in Fig. 1. The method comprises the following steps:
Data write process:
(1) The user submits the data to be written to the key-value store to the coordination module; the data includes a key-value store row key. The coordination module encapsulates the user's data and the write operation into a log entry;
(2) The coordination module obtains the current newest log position from the version control module and adds 1 to it, which gives the position at which the log entry of step (1) is to be written; the log position is described by a data row in the key-value store data storage;
(3) Based on the row key of step (1) and the replication rule of the key-value store, the coordination module computes the N key-value store log nodes in which the log entry of step (1) is to be stored, where N is greater than or equal to 3. There are two main replication rules, and the user may specify which one to use: the first assigns data randomly; the second sorts data by row key, with each replica of the key-value store responsible for storing the data of a certain key range. For details of the replication rules, see the Cassandra configuration documentation;
(4) Using the Paxos consensus algorithm, the coordination module stores the log entry of step (1) in the N log nodes computed in step (3). Paxos is an algorithm by which multiple processors reach agreement in an unreliable network environment; for a detailed description see Lamport L, Malkhi D, Zhou L. Vertical Paxos and primary-backup replication. MSR-TR-2009-63 [R], Microsoft Research, 2009;
(5) The coordination module writes the log position of step (2) to the version control module and returns a write-success message to the user;
(6) Based on the row key of step (1) and the replication rule of the key-value store, the coordination module computes the M key-value store data nodes in which the data of step (1) is to be stored, where M is greater than or equal to 3;
(7) The coordination module writes the data submitted by the user in step (1), together with the log position of step (2), to the M data nodes;
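The two replication rules mentioned in step (3) can be sketched in Python as follows. The functions are hypothetical illustrations rather than Cassandra's actual partitioner code: the first approximates random assignment with a hash of the row key, and the second assigns each node a contiguous range of sorted row keys. In both cases the remaining replicas are assumed to be the next nodes in ring order.

```python
import bisect
import hashlib


def hashed_replicas(row_key, node_count, copies=3):
    """'Random' rule: a hash of the row key picks the first replica."""
    digest = hashlib.sha256(row_key.encode()).digest()
    first = int.from_bytes(digest, "big") % node_count
    return [(first + i) % node_count for i in range(copies)]


def ordered_replicas(row_key, boundaries, copies=3):
    """Ordered rule: node i holds keys up to and including boundaries[i].

    `boundaries` is a sorted list of upper bounds, one per node; keys larger
    than the last bound wrap around to node 0.
    """
    first = bisect.bisect_left(boundaries, row_key) % len(boundaries)
    return [(first + i) % len(boundaries) for i in range(copies)]
```

With boundaries `["g", "n", "t", "z"]`, the key `"kiwi"` sorts between `"g"` and `"n"` and is therefore placed on nodes 1, 2 and 3. The ordered rule keeps neighbouring keys on the same nodes, whereas the hashed rule spreads them across the cluster.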
Data read process:
(8) The user submits a read request to the coordination module; the request contains the row key of the data to be read from the key-value store;
(9) The coordination module obtains from the version control module the newest log position P1 corresponding to the row key of step (8);
(10) According to the replication rule of the key-value store, the coordination module computes the U data nodes that hold the row key of step (8), and obtains from one data node S among them the log position P2 corresponding to that row key;
(11) The coordination module compares the two log positions P1 and P2:
If P1 = P2, the coordination module reads the data for the row key of step (8) from data node S of step (10) and returns it to the requesting user;
If P1 > P2, the coordination module computes, according to the replication rule of the key-value store, the V log nodes that hold the row key of step (8), and obtains from one log node T among them the log entries corresponding to that row key, namely all entries whose log position is greater than P2 and less than or equal to P1. The coordination module then corrects the data in data node S according to the current log content in log node T, reads the data corresponding to the row key of step (8) from data node S, and returns it to the requesting user.
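The selection of log entries in the range (P2, P1] can be sketched in Python as below. The entry layout, a dictionary with a `position` and a set of updated `columns`, is an assumption made for the example; the patent does not specify the log format.

```python
def entries_to_replay(log_entries, p2, p1):
    """Select the entries with p2 < position <= p1, ordered by position."""
    wanted = [e for e in log_entries if p2 < e["position"] <= p1]
    return sorted(wanted, key=lambda e: e["position"])


def repair_row(row, log_entries, p2, p1):
    """Apply the missing column updates to a stale row; return a repaired copy."""
    repaired = dict(row)
    for entry in entries_to_replay(log_entries, p2, p1):
        repaired.update(entry["columns"])
    return repaired
```

Applying the entries in ascending position order matters: if two entries update the same column, the one with the higher log position must win, which is what the sort guarantees even when the log node returns entries out of order.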
To implement the method of the invention, the system is divided into the following main modules:
Coordination module (Coordinator): the coordinator is the entry point for transaction requests and the core of the system logic. It receives transaction requests from the user side, encapsulates a transaction execution log entry for each request, coordinates the distributed storage system to replicate the transaction log to multiple machines, and notifies the data nodes to execute the transaction's operations. It also obtains and maintains the newest log information. This module was independently developed and coded, and is the core module of the invention.
Version control module (Version Controller): every transaction unit has its own independent transaction log, and the log entries of one transaction unit are ordered by execution sequence. The version controller is the module that manages the current newest log position (the sequence number of the log entry) of each transaction unit. A correct newest log position is essential for ensuring that transactions execute in log-number order, so this module must guarantee global consistency. In the implementation, this module is built on the Counter feature of Cassandra, because that feature guarantees the atomicity and consistency of data changes.
Log node (Log Node): the function of a log node is to store transaction operation logs. To guarantee the durability of transaction operations, so that committed data is not lost when a node crashes, the system writes the transaction log first when performing a data update and reports a successful commit only after the log has been written successfully. The actual data is modified afterwards according to the log content; if that modification fails, it can be repaired from the log during later reads and writes, which guarantees the durability and consistency of transaction operations. To strengthen the fault tolerance of the system in a distributed environment, every log entry has multiple replicas, usually no fewer than 3. In the implementation, this module was independently developed in combination with the data storage of Cassandra.
Data node (Data Node): the function of a data node is to store the actual data, all of which is produced by executing the contents of the log. The data node is likewise developed on the basis of the data storage of Cassandra. In an actual deployment, a log node and a data node are usually located on the same physical machine, so that the log can be executed locally.
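The role of the version control module can be illustrated with an in-process Python stand-in. The text states that the real module is built on Cassandra counters; the class below is only a hypothetical substitute that uses a lock to show why handing out log positions must be atomic for each transaction unit.

```python
import threading
from collections import defaultdict


class VersionController:
    """In-memory stand-in for the version control module (not Cassandra)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._latest = defaultdict(int)

    def next_position(self, unit):
        """Atomically reserve and return the next log position of a unit."""
        with self._lock:
            self._latest[unit] += 1
            return self._latest[unit]

    def newest(self, unit):
        """Return the newest log position of a unit (0 if none written)."""
        with self._lock:
            return self._latest[unit]
```

If two coordinators read the same newest position and each added 1 without such atomicity, both would try to write a log entry at the same position. The lock here plays the part that the text attributes to the atomic counter update; note that this sketch merges the "read newest, add 1" of step (2) and the recording of step (5) into one call, which is a simplification.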

Claims (1)

1. A flexible transaction management method for key-value store data storage, characterized in that the method comprises the following steps:
Data write process:
(1) The user submits the data to be written to the key-value store to the coordination module; the data includes a key-value store row key. The coordination module encapsulates the user's data and the write operation into a log entry;
(2) The coordination module obtains the current newest log position from the version control module and adds 1 to it, which gives the position at which the log entry of step (1) is to be written; the log position is described by a data row in the key-value store data storage;
(3) Based on the row key of step (1) and the replication rule of the key-value store, the coordination module computes the N key-value store log nodes in which the log entry of step (1) is to be stored, where N is greater than or equal to 3;
(4) Using the Paxos consensus algorithm, the coordination module stores the log entry of step (1) in the N log nodes computed in step (3);
(5) The coordination module writes the log position of step (2) to the version control module and returns a write-success message to the user;
(6) Based on the row key of step (1) and the replication rule of the key-value store, the coordination module computes the M key-value store data nodes in which the data of step (1) is to be stored, where M is greater than or equal to 3;
(7) The coordination module writes the data submitted by the user in step (1), together with the log position of step (2), to the M data nodes;
Data read process:
(8) The user submits a read request to the coordination module; the request contains the row key of the data to be read from the key-value store;
(9) The coordination module obtains from the version control module the newest log position P1 corresponding to the row key of step (8);
(10) According to the replication rule of the key-value store, the coordination module computes the U data nodes that hold the row key of step (8), and obtains from one data node S among them the log position P2 corresponding to that row key;
(11) The coordination module compares the two log positions P1 and P2:
If P1 = P2, the coordination module reads the data for the row key of step (8) from data node S of step (10) and returns it to the requesting user;
If P1 > P2, the coordination module computes, according to the replication rule of the key-value store, the V log nodes that hold the row key of step (8), and obtains from one log node T among them the log entries corresponding to that row key. The coordination module then corrects the data in data node S according to the current log content in log node T, reads the data corresponding to the row key of step (8) from data node S, and returns it to the requesting user.
CN201210169301.8A 2012-05-28 2012-05-28 Flexible transaction management method in key-value store data storage Active CN102693312B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201210169301.8A CN102693312B (en) 2012-05-28 2012-05-28 Flexible transaction management method in key-value store data storage

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201210169301.8A CN102693312B (en) 2012-05-28 2012-05-28 Flexible transaction management method in key-value store data storage

Publications (2)

Publication Number Publication Date
CN102693312A true CN102693312A (en) 2012-09-26
CN102693312B CN102693312B (en) 2014-05-28

Family

ID=46858745

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201210169301.8A Active CN102693312B (en) 2012-05-28 2012-05-28 Flexible transaction management method in key-value store data storage

Country Status (1)

Country Link
CN (1) CN102693312B (en)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101706811A (en) * 2009-11-24 2010-05-12 中国科学院软件研究所 Transaction commit method of distributed database system
WO2011067214A2 (en) * 2009-12-04 2011-06-09 International Business Machines Corporation High throughput, reliable replication of transformed data in information systems
CN102298641A (en) * 2011-09-14 2011-12-28 清华大学 Method for uniformly storing files and structured data based on key value bank

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
杨义繁 等: "强快照与强提交读隔离的多键云事务实现方法", 《计算机科学与探索》 *

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107844388A (en) * 2012-11-26 2018-03-27 亚马逊科技公司 Recover database from standby system streaming
CN107844388B (en) * 2012-11-26 2021-12-07 亚马逊科技公司 Streaming restore of database from backup system
US11475038B2 (en) 2012-11-26 2022-10-18 Amazon Technologies, Inc. Automatic repair of corrupted blocks in a database
CN104238963A (en) * 2014-09-30 2014-12-24 华为技术有限公司 Data storage method, device and system
CN104238963B (en) * 2014-09-30 2017-08-11 华为技术有限公司 A kind of date storage method, storage device and storage system
CN105704004A (en) * 2014-11-28 2016-06-22 华为技术有限公司 Service data processing method and device
CN105704004B (en) * 2014-11-28 2019-10-22 华为技术有限公司 Business data processing method and device
CN106708840A (en) * 2015-11-12 2017-05-24 中国科学院深圳先进技术研究院 Customer information management method and system
CN109522273A (en) * 2018-11-15 2019-03-26 郑州云海信息技术有限公司 A kind of method and device for realizing data write-in
CN109739684A (en) * 2018-11-20 2019-05-10 清华大学 The copy restorative procedure and device of distributed key value database based on vector clock
CN113778632A (en) * 2021-09-14 2021-12-10 杭州沃趣科技股份有限公司 Distributed transaction management method based on cassandra database

Also Published As

Publication number Publication date
CN102693312B (en) 2014-05-28

Similar Documents

Publication Publication Date Title
AU2017218964B2 (en) Cloud-based distributed persistence and cache data model
CN102663117B (en) OLAP (On Line Analytical Processing) inquiry processing method facing database and Hadoop mixing platform
CN102693312B (en) Flexible transaction management method in key-value store data storage
Tsai et al. Scalable architectures for SaaS
Makris et al. A classification of NoSQL data stores based on key design characteristics
CN105359099B (en) Index update pipeline
US11314717B1 (en) Scalable architecture for propagating updates to replicated data
CN100452046C (en) Storage method and system for mass file
CN102622427A (en) Method and system for read-write splitting database
JP2016524750A5 (en)
CN103150304A (en) Cloud database system
US20160306550A1 (en) Constructing a scalable storage device, and scaled storage device
US11250022B1 (en) Offline index builds for database tables
CN105069151A (en) HBase secondary index construction apparatus and method
US11003550B2 (en) Methods and systems of operating a database management system DBMS in a strong consistency mode
CN103593420A (en) Method for constructing heterogeneous database clusters on same platform by sharing online logs
Srinivasan et al. Citrusleaf: A real-time nosql db which preserves acid
CN105468296A (en) No-sharing storage management method based on virtualization platform
Jiang et al. A novel clustered MongoDB-based storage system for unstructured data with high availability
CN114925075B (en) Real-time dynamic fusion method for multi-source time-space monitoring information
US20200110632A1 (en) Method and system for routing and executing transactions
US11556589B1 (en) Adaptive tiering for database data of a replica group
US11449398B2 (en) Embedded container-based control plane for clustered environment
US11789971B1 (en) Adding replicas to a multi-leader replica group for a data set
Jiang et al. MyStore: A high available distributed storage system for unstructured data

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant