CN107943981A - HBase row paging method, server, and computer-readable storage medium

Publication number
CN107943981A
Authority
CN
China
Prior art keywords
paging
row
data
hbase
offset
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201711234263.9A
Other languages
Chinese (zh)
Inventor
陈金添
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nubia Technology Co Ltd
Original Assignee
Nubia Technology Co Ltd
Application filed by Nubia Technology Co Ltd
Priority to CN201711234263.9A
Publication of CN107943981A

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22: Indexing; Data structures therefor; Storage structures
    • G06F16/2282: Tablespace storage structures; Management thereof

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mobile Radio Communication Systems (AREA)

Abstract

The invention discloses an HBase row paging method comprising: setting the row key RowKey of the HBase data row paging information; obtaining the total count Total of data records in the data row paging information; inserting the data row paging information and the corresponding original data into the same table at the same time; receiving the data offset OffSet and page size PageSize carried in a paging request; calculating the first row StartRowKey and last row StopRowKey of each page from the data offset OffSet and page size PageSize; and paging the original data according to the first row StartRowKey and last row StopRowKey of each page and the total count Total. Embodiments of the invention also disclose a server and a computer-readable storage medium. Row-paged scanning of HBase data can thereby be achieved quickly and effectively.

Description

HBase row paging method, server, and computer-readable storage medium
Technical field
The present invention relates to the field of database technology, and in particular to an HBase row paging method, a server, and a computer-readable storage medium.
Background
HBase is the database of Apache Hadoop. It provides random, real-time read and write access to big data and is open source, distributed, scalable, and column-oriented. Its goal is to store and process large-scale data; more specifically, it can handle data made up of many thousands of rows and columns using only commodity hardware.
The paging filter (PageFilter) that HBase provides natively only supports row paging given a start row and a page size PageSize. When paging for display, the client records the last row of the current scan, sets that row as the start row of the next scan while keeping the same filter attributes, and iterates in this way page by page.
However, this paging method has the following problems: (1) every scan for a page of data must know its start row, so a query for an arbitrary page cannot be supported; (2) obtaining the total number of records in a paging query requires a count function, and for HBase, which typically stores big data, count functions are very inefficient.
Summary of the invention
The main object of the present invention is to propose an HBase row paging method and a corresponding server, intended to solve the problem of how to implement row paging of HBase data quickly and effectively.
To achieve the above object, the present invention provides an HBase row paging method comprising:
Step a: setting the row key RowKey of the HBase data row paging information;
Step b: obtaining the total count Total of data records in the data row paging information;
Step c: inserting the data row paging information and the corresponding original data into the same table at the same time;
Step d: receiving the data offset OffSet and page size PageSize in a paging request;
Step e: calculating the first row StartRowKey and last row StopRowKey of each page from the data offset OffSet and page size PageSize; and
Step f: paging the original data according to the first row StartRowKey and last row StopRowKey of each page and the total count Total.
Optionally, step d and step e of the method may be replaced by:
Step g: receiving the first row StartRowKey and the last row StopRowKey of each page as provided by the client.
Optionally, the row key RowKey of the data row paging information has the structure ~_serialNum_primaryRowKey, where "~" is a prefix symbol that distinguishes it from the row keys of the original data; "serialNum" is the number of data records at the time the row key RowKey is inserted; "primaryRowKey" is the row key value in the original data; and "_" is the connector.
Optionally, serialNum is a fixed-length decimal or hexadecimal string.
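As a minimal sketch of such a fixed-length encoding, the following Python pads a decimal or hexadecimal serialNum to a fixed width. The function names `encode_serial` and `decode_serial`, the widths used, and the choice of lowercase hexadecimal digits are assumptions made for illustration and are not taken from the claims.

```python
def encode_serial(n: int, width: int = 11, base: int = 10) -> str:
    """Return n as a zero-padded, fixed-length string in base 10 or 16."""
    if base not in (10, 16):
        raise ValueError("base must be 10 or 16")
    if n < 0:
        raise ValueError("serialNum must be non-negative")
    # Lowercase hex digits are an assumption; any single consistent case works.
    digits = format(n, "d" if base == 10 else "x")
    if len(digits) > width:
        raise ValueError("serialNum does not fit the fixed width")
    return digits.zfill(width)


def decode_serial(text: str, base: int = 10) -> int:
    """Parse a fixed-length serialNum; int() ignores the leading zeros."""
    return int(text, base)


decimal_serials = [encode_serial(n) for n in (2, 10, 3510)]
hex_serials = [encode_serial(n, width=8, base=16) for n in (9, 10, 255)]
```

A single letter case is used throughout because mixing cases would break the byte ordering of hexadecimal strings: in ASCII, "A" to "F" sort before "a" to "f".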
Optionally, step b specifically includes:
Reversing the order of the data row paging information;
Obtaining the first row key RowKey after the reversal;
Splitting the value of that row key RowKey on the connector "_" to obtain the value of "serialNum";
Stripping the leading zeros from the value of "serialNum" to obtain the total count Total.
Optionally, step e specifically includes:
Following the rule used to set the row key RowKey of the data row paging information, setting the data offset OffSet as the "serialNum" value of the StartRowKey, thereby obtaining the StartRowKey;
Setting the "serialNum" value of the StopRowKey according to the value of OffSet+PageSize+1, thereby obtaining the StopRowKey.
In addition, to achieve the above object, the present invention also proposes a server comprising a memory, a processor, and an HBase row paging program that is stored in the memory and can run on the processor. When executed by the processor, the HBase row paging program implements the following steps:
Setting the row key RowKey of the HBase data row paging information;
Obtaining the total count Total of data records in the data row paging information;
Inserting the data row paging information and the corresponding original data into the same table at the same time;
Receiving the data offset OffSet and page size PageSize in a paging request;
Calculating the first row StartRowKey and last row StopRowKey of each page from the data offset OffSet and page size PageSize; and
Paging the original data according to the first row StartRowKey and last row StopRowKey of each page and the total count Total.
Optionally, when executed by the processor, the HBase row paging program also implements the step of:
Receiving the first row StartRowKey and the last row StopRowKey of each page as provided by the client.
Optionally, the row key RowKey of the data row paging information has the structure ~_serialNum_primaryRowKey, where "~" is a prefix symbol that distinguishes it from the row keys of the original data; "serialNum" is the number of data records at the time the row key RowKey is inserted; "primaryRowKey" is the row key value in the original data; and "_" is the connector.
Further, to achieve the above object, the present invention also provides a computer-readable storage medium storing an HBase row paging program which, when executed by a processor, implements the steps of the HBase row paging method described above.
The HBase row paging method, server, and computer-readable storage medium proposed by the present invention use redundant HBase data row paging information, trading additional storage space (the original data and the data row paging information are stored in the same table) for row-paged scanning of HBase data. The method exploits the lexicographic ordering of row keys (RowKey) in HBase: a special prefix character ensures that the original data and the paging information are never interleaved, and each row of data row paging information contains both the paging information and the original data. Obtaining the original data and the paging information therefore touches only a single table and needs no secondary index, so the original data can be fetched quickly. Users can page through data as in a traditional relational database, using the offset OffSet, the page size PageSize, and the total count Total, which means that any page can be queried. In addition, obtaining the total count Total is a lightweight operation in this method, which reduces system load.
Brief description of the drawings
Fig. 1 is an architecture diagram of an application environment for implementing the embodiments of the present invention;
Fig. 2 is a flow chart of the HBase row paging method proposed by the first embodiment of the present invention;
Fig. 3 is a flow chart of the HBase row paging method proposed by the second embodiment of the present invention;
Fig. 4 is a module diagram of the server proposed by the third embodiment of the present invention;
Fig. 5 is a module diagram of the HBase row paging system proposed by the fourth and fifth embodiments of the present invention.
The realization of the objects, functional features, and advantages of the present invention will be further described with reference to the accompanying drawings in conjunction with the embodiments.
Detailed description
It should be understood that the specific embodiments described herein merely illustrate the present invention and are not intended to limit it.
In the following description, suffixes such as "module", "component", or "unit" used to denote elements serve only to aid the explanation of the present invention and have no specific meaning in themselves. "Module", "component", and "unit" may therefore be used interchangeably.
Referring to Fig. 1, Fig. 1 is an architecture diagram of an application environment for implementing the embodiments of the present invention. The present invention can be applied in an application environment that includes, but is not limited to, a server 2, a client 4, and a network 6.
The server 2 may be a computing device such as a rack server, a blade server, a tower server, or a cabinet server. The server 2 may be a standalone server or a server cluster composed of multiple servers.
The client 4 may be a mobile device such as a mobile phone, smartphone, laptop, digital broadcast receiver, PDA (personal digital assistant), PAD (tablet computer), PMP (portable media player), navigation device, or in-vehicle device, or a fixed terminal such as a digital TV, desktop computer, or server.
The network 6 may be a wireless or wired network such as an intranet, the Internet, the Global System for Mobile Communications (GSM), Wideband Code Division Multiple Access (WCDMA), a 4G network, a 5G network, Bluetooth, or Wi-Fi. The server 2 is communicatively connected to one or more clients 4 through the network 6 for data transmission and interaction.
The HBase row paging method proposed by the present invention is applied in the server 2. The original data and the data row paging information are inserted into the same table at the same time, and the first row StartRowKey and last row StopRowKey of each page, together with the total count of data records, are obtained from the row keys set in the data row paging information, thereby implementing paged scanning of the original data.
Embodiment one
As shown in Fig. 2, the first embodiment of the present invention proposes an HBase row paging method comprising the following steps:
S200: set the row key RowKey of the HBase data row paging information.
In this embodiment, the original data in HBase and the data row paging information are placed in the same table in order to implement HBase data row paging. First, the row key RowKey of the data row paging information must be set.
Specifically, the row key RowKey of the data row paging information and the row key primaryRowKey of the original data are in the same table. If the two kinds of row key were interleaved, range queries on the original data would no longer work correctly. Because row keys in HBase are sorted lexicographically (in ascending order of ASCII code), this can be avoided by choosing a suitable prefix for the row key RowKey of the data row paging information, for example the symbol "~", so that it is kept apart from the row key primaryRowKey of the original data.
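The separation produced by the prefix can be illustrated with a short Python sketch. The row keys below are hypothetical, and the sketch assumes that the original row keys are printable ASCII and do not themselves begin with "~"; keys containing bytes above 0x7E would sort after the prefix and are not covered.

```python
# Hypothetical row keys of the original data: printable ASCII, not starting with "~".
original_keys = [b"user_0001", b"Zebra", b"order#77", b"zz_top"]
# Row keys of the data row paging information, all prefixed with "~".
paging_keys = [b"~_00000000001_user_0001", b"~_00000000002_Zebra"]

# Python compares bytes objects lexicographically by byte value,
# which mirrors the ascending ASCII ordering described in the text.
table_order = sorted(original_keys + paging_keys)

# "~" is code 126; within ASCII only DEL (127) is larger.
tilde = ord("~")
```

In `table_order` every original key precedes every paging key, so a range query over the original keys never crosses into the paging rows.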
For example, with the symbol "~" as the prefix, the row key RowKey of the data row paging information can have the structure ~_serialNum_primaryRowKey.
Here, "serialNum" is the number of data records at the time the row key RowKey is inserted. In this embodiment serialNum is decimal; other embodiments may use hexadecimal. For the row keys RowKey to be ordered by insertion order, serialNum must be a fixed-length string. Its length can be chosen to suit the actual situation, with leading zeros as padding; for example, an 11-digit fixed-length serialNum can number tens of billions of records. "primaryRowKey" is the row key value in the original data. "_" is a connector that joins the prefix symbol "~", the record number "serialNum", and the row key value "primaryRowKey" of the original data.
The following is an example of row keys RowKey of the data row paging information:
~_00000000001_primaryRowKey1
~_00000000002_primaryRowKey2
~_00000000003_primaryRowKey3
...
~_00000000010_primaryRowKey10
...
And so on.
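The example keys above can be reproduced with a small helper. This is a sketch only: the name `build_paging_row_key` and its `width` parameter are hypothetical, and the 11-digit decimal width is taken from the example in the text.

```python
def build_paging_row_key(serial_num: int, primary_row_key: str, width: int = 11) -> str:
    """Compose ~_serialNum_primaryRowKey with a zero-padded decimal serialNum."""
    if serial_num < 0:
        raise ValueError("serialNum must be non-negative")
    serial = str(serial_num).zfill(width)
    if len(serial) > width:
        raise ValueError("serialNum does not fit the fixed width")
    return "~_" + serial + "_" + primary_row_key


example_keys = [build_paging_row_key(n, f"primaryRowKey{n}") for n in (1, 2, 3, 10)]

# Without padding, lexicographic order no longer matches insertion order:
# "~_10_..." sorts before "~_2_..." because "1" < "2".
unpadded_order = sorted(["~_2_primaryRowKey2", "~_10_primaryRowKey10"])
```

The padded keys sort in insertion order, whereas the unpadded pair shows why the text requires a fixed-length serialNum.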
S202: obtain the total count Total of data records in the data row paging information.
Specifically, the ASCII table shows that the prefix symbol "~" has the largest ASCII value apart from the DEL character, so the rows of data row paging information sort after the original data, and the total count Total can be learned simply by reading the last row key of the data row paging information. In HBase this takes a single reversed scan: the order of the data row paging information is reversed and the record number "serialNum" is read from the first row key RowKey. The row key value is split on the connector "_", and the second field (the value of "serialNum") with its leading zeros stripped is the current total count Total. For example, if the first row key RowKey after the reversal is ~_00000003510_primaryRowKey3510, splitting it on "_" gives the second field "00000003510"; stripping the leading zeros yields 3510, so the current total count Total of data records in the data row paging information is 3510.
Note that the total count Total of data records in the data row paging information also equals the total number of data records in the original data.
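The split-and-strip step can be sketched as follows. The function name `total_from_last_row_key` is hypothetical, and `max()` over an in-memory list merely stands in for reading the first row of a reversed scan; no HBase client call is shown.

```python
def total_from_last_row_key(last_row_key: str, base: int = 10) -> int:
    """Extract Total from a row key of the form ~_serialNum_primaryRowKey."""
    # Split at most twice so that underscores inside primaryRowKey are preserved.
    fields = last_row_key.split("_", 2)
    if len(fields) != 3 or fields[0] != "~":
        raise ValueError("not a data row paging information row key")
    stripped = fields[1].lstrip("0")  # remove the zero padding
    return int(stripped, base) if stripped else 0


paging_keys = [
    "~_00000000001_primaryRowKey1",
    "~_00000003510_primaryRowKey3510",
    "~_00000000002_primaryRowKey2",
]
last_key = max(paging_keys)  # stand-in for the first row of a reversed scan
total = total_from_last_row_key(last_key)
```

Limiting the split to two separators is a design choice of this sketch: it keeps the parse correct even when the original row key itself contains "_".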
S204: insert the data row paging information and the corresponding original data into the same table at the same time.
Specifically, to avoid a second lookup during a paging query, the original data and the data row paging information must be inserted together whenever data is inserted into the table. The row key RowKey of the data row paging information gives the correspondence between each piece of original data and its data row paging information. Apart from having a different row key, each row of data row paging information also stores a copy of the original data.
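A minimal in-memory sketch of this paired insertion is shown below. The class `PagedTable` is a hypothetical stand-in for a single HBase table (a mapping from row key to row value); it does not use the HBase client, and it ignores concurrency, which a real implementation would have to handle when assigning serialNum.

```python
class PagedTable:
    """In-memory stand-in for one HBase table (row key -> row value)."""

    def __init__(self, width: int = 11) -> None:
        self.rows = {}      # holds original rows and paging rows side by side
        self.total = 0      # number of data records inserted so far
        self.width = width  # fixed length of serialNum

    def put(self, primary_row_key: str, value: dict) -> str:
        """Insert the original row and its paging row together; return the paging row key."""
        self.total += 1
        paging_key = f"~_{self.total:0{self.width}d}_{primary_row_key}"
        self.rows[primary_row_key] = value
        self.rows[paging_key] = dict(value)  # the paging row stores a copy of the data
        return paging_key


table = PagedTable()
first_key = table.put("user_0001", {"name": "alice"})
second_key = table.put("user_0002", {"name": "bob"})
```

Because the paging row carries its own copy of the data, a page scan over the "~" rows returns the records directly, without a second lookup by primaryRowKey.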
S206: receive the data offset OffSet and page size PageSize in a paging request.
Specifically, when the client 4 sends a paging request to the server 2, it provides the offset OffSet and the page size PageSize, for example an OffSet of 38 and a PageSize of 10. Each page has its own offset OffSet (the offset of the next page is the offset of the previous page plus the page size), while the page size PageSize is fixed.
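The relationship between successive offsets can be written out as a short sketch. The helper names are hypothetical, and the convention that the first page has offset 0 is an assumption borrowed from relational-style paging rather than something the text states.

```python
def next_offset(offset: int, page_size: int) -> int:
    """Offset of the next page: the previous offset plus the page size."""
    return offset + page_size


def offset_for_page(page_number: int, page_size: int) -> int:
    """Offset of a 1-based page number, assuming the first page has offset 0."""
    if page_number < 1 or page_size < 1:
        raise ValueError("page_number and page_size must be positive")
    return (page_number - 1) * page_size


following = next_offset(38, 10)          # the example values from the text
fourth_page = offset_for_page(4, 10)
```

Since the offset of any page can be computed directly, a request does not need to walk through the preceding pages first.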
S208: calculate the first row StartRowKey and last row StopRowKey of each page from the data offset OffSet and page size PageSize.
Specifically, when the server 2 pages according to the paging request, it first follows the rule used to set the row key RowKey of the data row paging information (a fixed-length string) and sets the data offset OffSet as the "serialNum" value of the StartRowKey, thereby obtaining the StartRowKey. For example, if "serialNum" is an 11-digit fixed-length string and the data offset OffSet is 38, the fixed-length string (the "serialNum" value) is 00000000038, so the StartRowKey is ~_00000000038_primaryRowKey38 (the StartRowKey row itself is not included in the page scan). The "serialNum" value of the StopRowKey is then set according to the value of OffSet+PageSize+1 (1 is added because the StopRowKey row itself is not included in the page scan), thereby obtaining the StopRowKey. For example, with a data offset OffSet of 38 and a page size PageSize of 10, OffSet+PageSize+1=38+10+1=49, the fixed-length string (the "serialNum" value) is 00000000049, and the StopRowKey is ~_00000000049_primaryRowKey49.
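The arithmetic of this step can be sketched in Python. The function `page_serials` is a hypothetical name; it returns only the serialNum parts, and the key prefixes built from them omit the primaryRowKey suffix, which is an assumption of this sketch since that suffix is not derivable from OffSet and PageSize alone.

```python
from typing import Tuple


def page_serials(offset: int, page_size: int, width: int = 11) -> Tuple[str, str]:
    """Return the fixed-length serialNum values for StartRowKey and StopRowKey."""
    if offset < 0 or page_size < 1:
        raise ValueError("offset must be >= 0 and page_size >= 1")
    start_serial = str(offset).zfill(width)
    # Add 1 because the StopRowKey row itself is excluded from the page scan.
    stop_serial = str(offset + page_size + 1).zfill(width)
    return start_serial, stop_serial


start_serial, stop_serial = page_serials(38, 10)
start_prefix = "~_" + start_serial  # assumption: prefix only, no primaryRowKey
stop_prefix = "~_" + stop_serial
```

The sketch follows the convention stated in the text, in which both boundary rows are excluded so that serialNum values 39 through 48 form the page; how an actual HBase scan treats its start and stop rows should be checked against the client API in use.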
S210: page the original data according to the first row StartRowKey and last row StopRowKey of each page and the total count Total.
Specifically, once the first row StartRowKey and last row StopRowKey of each page and the total count Total of data records in the data row paging information (equal to the total number of data records in the original data) have been obtained, paging of the original data can be implemented in the traditional way. In HBase, each page of the original data is scanned according to the first row StartRowKey and last row StopRowKey. The total count Total indicates whether there is a next page, that is, whether the paged scan is complete.
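The page scan and the next-page check can be simulated over an in-memory list of paging row keys. The names `scan_page` and `has_next_page` are hypothetical, the list stands in for the "~" rows of the table, and both boundary rows are excluded, following the convention stated in the text.

```python
def scan_page(paging_keys, offset: int, page_size: int, width: int = 11):
    """Return the paging row keys whose serialNum lies strictly between the two bounds."""
    start_serial = str(offset).zfill(width)
    stop_serial = str(offset + page_size + 1).zfill(width)
    # Fixed-length strings compare in the same order as the numbers they encode.
    return [k for k in sorted(paging_keys)
            if start_serial < k.split("_", 2)[1] < stop_serial]


def has_next_page(offset: int, page_size: int, total: int) -> bool:
    """True if records remain after the page that starts at this offset."""
    return offset + page_size < total


# 60 hypothetical records, numbered from 1 as in the example keys.
keys = [f"~_{n:011d}_primaryRowKey{n}" for n in range(1, 61)]
page = scan_page(keys, offset=38, page_size=10)
more_after_38 = has_next_page(38, 10, total=60)
more_after_50 = has_next_page(50, 10, total=60)
```

With an offset of 38 and a page size of 10 the page holds the records numbered 39 through 48, and comparing OffSet+PageSize with Total shows whether another page follows.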
The HBase row paging method proposed by this embodiment inserts the original data and the data row paging information into the same table at the same time, with a new row key RowKey set in the data row paging information. From the data offset OffSet, the page size PageSize, and the rule used to set the row key RowKey, the first row StartRowKey and last row StopRowKey of each page can be calculated; from the serialNum value in the row key RowKey, the total count Total of data records can be obtained. Paged scanning of the original data is thereby implemented.
It should be noted that this method requires that the original data is never modified or deleted. In general, when HBase is used as a data storage system in practical applications, data is persisted by append-only operations and is not deleted or edited afterwards. The method is therefore widely applicable in HBase.
Embodiment two
As shown in Fig. 3, the second embodiment of the present invention proposes an HBase row paging method. In the second embodiment, steps S300-S304 and step S308 of the HBase row paging method are similar to steps S200-S204 and step S210 of the first embodiment; the difference is that the method further includes step S306.
This method comprises the following steps:
S300: set the row key RowKey of the HBase data row paging information.
In this embodiment, the original data in HBase and the data row paging information are placed in the same table in order to implement HBase data row paging. First, the row key RowKey of the data row paging information must be set.
Specifically, the row key RowKey of the data row paging information and the row key primaryRowKey of the original data are in the same table. If the two kinds of row key were interleaved, range queries on the original data would no longer work correctly. Because row keys in HBase are sorted lexicographically (in ascending order of ASCII code), this can be avoided by choosing a suitable prefix for the row key RowKey of the data row paging information, for example the symbol "~", so that it is kept apart from the row key primaryRowKey of the original data.
For example, with the symbol "~" as the prefix, the row key RowKey of the data row paging information can have the structure ~_serialNum_primaryRowKey.
Here, "serialNum" is the number of data records at the time the row key RowKey is inserted. In this embodiment serialNum is decimal; other embodiments may use hexadecimal. For the row keys RowKey to be ordered by insertion order, serialNum must be a fixed-length string. Its length can be chosen to suit the actual situation, with leading zeros as padding; for example, an 11-digit fixed-length serialNum can number tens of billions of records. "primaryRowKey" is the row key value in the original data. "_" is a connector that joins the prefix symbol "~", the record number "serialNum", and the row key value "primaryRowKey" of the original data.
The following is an example of row keys RowKey of the data row paging information:
~_00000000001_primaryRowKey1
~_00000000002_primaryRowKey2
~_00000000003_primaryRowKey3
...
~_00000000010_primaryRowKey10
...
And so on.
S302: obtain the total count Total of data records in the data row paging information.
Specifically, the ASCII table shows that the prefix symbol "~" has the largest ASCII value apart from the DEL character, so the rows of data row paging information sort after the original data, and the total count Total can be learned simply by reading the last row key of the data row paging information. In HBase this takes a single reversed scan: the order of the data row paging information is reversed and the record number "serialNum" is read from the first row key RowKey. The row key value is split on the connector "_", and the second field (the value of "serialNum") with its leading zeros stripped is the current total count Total. For example, if the first row key RowKey after the reversal is ~_00000003510_primaryRowKey3510, splitting it on "_" gives the second field "00000003510"; stripping the leading zeros yields 3510, so the current total count Total of data records in the data row paging information is 3510.
Note that the total count Total of data records in the data row paging information also equals the total number of data records in the original data.
S304: insert the data row paging information and the corresponding original data into the same table at the same time.
Specifically, to avoid a second lookup during a paging query, the original data and the data row paging information must be inserted together whenever data is inserted into the table. The row key RowKey of the data row paging information gives the correspondence between each piece of original data and its data row paging information. Apart from having a different row key, each row of data row paging information also stores a copy of the original data.
S306: receive the first row StartRowKey and last row StopRowKey of each page as provided by the client 4.
Specifically, when the client 4 sends a paging request to the server 2, it can calculate the first row StartRowKey and last row StopRowKey from the data offset OffSet and page size PageSize of each page and provide them to the server 2 for the page scan. In this kind of paging request, each page has its own offset OffSet (the offset of the next page is the offset of the previous page plus the page size), while the page size PageSize is fixed.
The client 4 first follows the rule used to set the row key RowKey of the data row paging information (a fixed-length string) and sets the data offset OffSet as the "serialNum" value of the StartRowKey, thereby obtaining the StartRowKey. For example, if "serialNum" is an 11-digit fixed-length string and the data offset OffSet is 38, the fixed-length string (the "serialNum" value) is 00000000038, so the StartRowKey is ~_00000000038_primaryRowKey38 (the StartRowKey row itself is not included in the page scan). The "serialNum" value of the StopRowKey is then set according to the value of OffSet+PageSize+1 (1 is added because the StopRowKey row itself is not included in the page scan), thereby obtaining the StopRowKey. For example, with a data offset OffSet of 38 and a page size PageSize of 10, OffSet+PageSize+1=38+10+1=49, the fixed-length string (the "serialNum" value) is 00000000049, and the StopRowKey is ~_00000000049_primaryRowKey49.
In this kind of paging request, the client 4 provides the calculated StartRowKey and StopRowKey to the server 2, and the server 2 performs the page scan directly with the StartRowKey and StopRowKey provided by the client 4.
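A client-side sketch of such a request is given below. The `PagingRequest` class, the JSON payload, and its field names are all hypothetical, since the text does not specify a request format; the keys are sent as prefixes without the primaryRowKey suffix, which is likewise an assumption of this sketch.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class PagingRequest:
    """Hypothetical request body carrying the row keys computed by the client."""
    start_row_key: str
    stop_row_key: str


def build_request(offset: int, page_size: int, width: int = 11) -> PagingRequest:
    """Compute StartRowKey and StopRowKey on the client from OffSet and PageSize."""
    start = "~_" + str(offset).zfill(width)
    stop = "~_" + str(offset + page_size + 1).zfill(width)  # stop row is excluded
    return PagingRequest(start_row_key=start, stop_row_key=stop)


request = build_request(38, 10)
payload = json.dumps(asdict(request), sort_keys=True)  # what the client would send
```

Moving the calculation to the client means that the server only has to run the scan with the two keys it receives.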
S308: page the original data according to the first row StartRowKey and last row StopRowKey of each page and the total count Total.
Specifically, once the first row StartRowKey and last row StopRowKey of each page and the total count Total of data records in the data row paging information (equal to the total number of data records in the original data) have been obtained, paging of the original data can be implemented in the traditional way. In HBase, each page of the original data is scanned according to the first row StartRowKey and last row StopRowKey. The total count Total indicates whether there is a next page, that is, whether the paged scan is complete.
The HBase row paging method proposed by this embodiment inserts the original data and the data row paging information into the same table at the same time, with a new row key RowKey set in the data row paging information. The client can calculate the first row StartRowKey and last row StopRowKey of each page from the data offset OffSet, the page size PageSize, and the rule used to set the row key RowKey, and provide them to the server. The server implements paged scanning of the original data directly from the StartRowKey and StopRowKey provided by the client and the total count Total of data records obtained from the serialNum value in the row key RowKey.
The present invention further provides a server that includes a memory, a processor, and an HBase row paging system. The HBase row paging system inserts the original data and the data row paging information into the same table at the same time, and obtains the first row StartRowKey and last row StopRowKey of each page, together with the total count of data records, from the row keys set in the data row paging information, thereby implementing paged scanning of the original data.
Embodiment three
As shown in Fig. 4, the third embodiment of the present invention proposes a server 2. The server 2 includes a memory 20, a processor 22, and an HBase row paging system 28.
The memory 20 includes at least one type of readable storage medium and stores the operating system and the various application software installed on the server 2, such as the program code of the HBase row paging system 28. The memory 20 can also be used to temporarily store data that has been output or is about to be output.
In some embodiments the processor 22 may be a central processing unit (CPU), a controller, a microcontroller, a microprocessor, or another data processing chip. The processor 22 generally controls the overall operation of the server 2. In this embodiment, the processor 22 runs the program code stored in the memory 20 or processes data, for example by running the HBase row paging system 28.
Embodiment four
As shown in Fig. 5, the fourth embodiment of the present invention proposes an HBase row paging system 28. In this embodiment, the HBase row paging system 28 includes:
A setup module 800, configured to set the row key RowKey of the HBase data row paging information.
In this embodiment, the original data in HBase and the data row paging information are placed in the same table in order to implement HBase data row paging. First, the setup module 800 must set the row key RowKey of the data row paging information.
Specifically, the row key RowKey of the data row paging information and the row key primaryRowKey of the original data are in the same table. If the two kinds of row key were interleaved, range queries on the original data would no longer work correctly. Because row keys in HBase are sorted lexicographically (in ascending order of ASCII code), this can be avoided by choosing a suitable prefix for the row key RowKey of the data row paging information, for example the symbol "~", so that it is kept apart from the row key primaryRowKey of the original data.
For example, with the symbol "~" as the prefix, the row key RowKey of the data row paging information can have the structure ~_serialNum_primaryRowKey.
Here, "serialNum" is the number of data records at the time the row key RowKey is inserted. In this embodiment serialNum is decimal; other embodiments may use hexadecimal. For the row keys RowKey to be ordered by insertion order, serialNum must be a fixed-length string. Its length can be chosen to suit the actual situation, with leading zeros as padding; for example, an 11-digit fixed-length serialNum can number tens of billions of records. "primaryRowKey" is the row key value in the original data. "_" is a connector that joins the prefix symbol "~", the record number "serialNum", and the row key value "primaryRowKey" of the original data.
The row for the data row paging information is good for an example of RowKey below:
~_00000000001_primaryRowKey1
~_00000000002_primaryRowKey2
~_00000000003_primaryRowKey3
...
~_00000000010_primaryRowKey10
...
And so on.
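The key layout described above can be sketched as follows. This is a minimal illustration in Python rather than the implementation of the invention; the helper name `make_paging_row_key` and the constant names are assumptions introduced here, and the 11-digit width follows the example in the text.

```python
PREFIX = "~"        # sorts after letters, digits and "_" in ASCII order
CONNECTOR = "_"
SERIAL_WIDTH = 11   # fixed length used in the example above


def make_paging_row_key(serial_num: int, primary_row_key: str) -> str:
    """Build a key of the form ~_serialNum_primaryRowKey (hypothetical helper)."""
    if serial_num < 0 or serial_num >= 10 ** SERIAL_WIDTH:
        raise ValueError("serial_num does not fit the fixed width")
    serial = str(serial_num).zfill(SERIAL_WIDTH)  # left-pad with zeros
    return CONNECTOR.join([PREFIX, serial, primary_row_key])


# Keys created out of order still sort by serial number, because the
# zero padding makes string order agree with numeric order.
keys = [make_paging_row_key(n, f"primaryRowKey{n}") for n in (10, 2, 1, 3)]
ordered = sorted(keys)
```

Without the padding, the string "10" would sort before "2", which is why the text requires a fixed-length serialNum.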
Acquisition module 802, for obtaining the total number Total of data records of the data row paging information.
Specifically, the ASCII table shows that, apart from the delete character (DEL), the prefix symbol "~" has the largest ASCII value. The total number Total of data records can therefore be learned simply by reading the last row key of the data row paging information. In HBase this takes a single reverse operation: the order of the data row paging information is reversed, and the serial number "serialNum" in the first row key RowKey is then read. The row key RowKey is split on the connector "_", and the second field (the value of "serialNum") with its leading zeros removed is the current total number Total. For example, if the first row key RowKey after reversal is ~_00000003510_primaryRowKey3510, splitting it on the connector "_" gives the second field "00000003510"; removing the leading zeros yields 3510, so the total number Total of data records of the data row paging information is currently 3510.
It is worth noting that the total number Total of data records of the data row paging information is also equal to the total number of data records of the original data.
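The derivation of Total from the last row key can be sketched as below. This is an illustrative sketch under stated assumptions: the function name `total_from_last_key` is hypothetical, `max()` over an in-memory list stands in for reading the first row of a reversed scan, and serialNum is assumed to be decimal as in this embodiment.

```python
def total_from_last_key(row_keys):
    """Return Total for a table whose row keys are given as strings."""
    if not row_keys:
        return 0
    # The lexicographically largest key plays the role of the first row
    # returned by a reversed scan.
    last_key = max(row_keys)
    if not last_key.startswith("~_"):
        return 0  # the table holds no paging rows yet
    # Split into prefix, serialNum and primaryRowKey; maxsplit=2 keeps any
    # "_" inside primaryRowKey intact.
    _prefix, serial, _primary = last_key.split("_", 2)
    return int(serial)  # int() discards the leading zeros


example_keys = [
    "primaryRowKey1",
    "zebra",
    "~_00000000001_primaryRowKey1",
    "~_00000003510_primaryRowKey3510",
]
total = total_from_last_key(example_keys)
```

For a hexadecimal serialNum, as allowed in other embodiments, the last line of the function would use `int(serial, 16)` instead.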
Insertion module 804, for inserting the data row paging information and the corresponding original data into the same table at the same time.
Specifically, to avoid a secondary query during a paging query, the original data and the data row paging information must be inserted together whenever data is inserted into the table. The correspondence between each piece of original data and its data row paging information can be obtained from the row key RowKey of the data row paging information. Apart from having a different row key, the data row paging information also stores a copy of the information of the original data.
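The paired insertion can be illustrated with a toy in-memory table. This is only a sketch: the class `InMemoryTable` and its attributes are assumptions made for illustration, a plain dictionary stands in for the HBase table, and a real system would need to make the two writes and the counter update consistent under concurrency.

```python
class InMemoryTable:
    """Toy stand-in for one table: row key -> column values."""

    def __init__(self):
        self.rows = {}
        self.total = 0  # number of data records inserted so far

    def insert(self, primary_row_key, values):
        """Write the original row and its paging row in one call."""
        self.total += 1
        # 11-digit, zero-padded serial number, as in the example keys.
        paging_key = "~_{:011d}_{}".format(self.total, primary_row_key)
        self.rows[primary_row_key] = dict(values)
        # The paging row carries its own copy of the data, so a page can be
        # served from the paging rows alone, without a second lookup.
        self.rows[paging_key] = dict(values)
        return paging_key


table = InMemoryTable()
first_key = table.insert("user42", {"name": "alice"})
second_key = table.insert("user7", {"name": "bob"})
```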
Receiving module 806, for receiving the data offset OffSet and the paging size PageSize in a paging request.
Specifically, when the client 4 sends a paging request to the server 2, it can provide the offset OffSet and the paging size PageSize. For example, the offset OffSet is 38 and the paging size PageSize is 10. In such paging requests, each page has its own offset OffSet (the offset of the next page is the offset of the previous page plus the paging size), while the paging size PageSize is fixed. The receiving module 806 receives the data offset OffSet and the paging size PageSize from the paging request of the client 4.
Computing module 808, for calculating the first row StartRowKey and the last row StopRowKey of each page according to the data offset OffSet and the paging size PageSize.
Specifically, when the server 2 performs paging according to the paging request, the computing module 808 first follows the rule used to set the row key RowKey of the data row paging information (a fixed-length string) and sets the data offset OffSet as the "serialNum" value of StartRowKey, thereby obtaining the StartRowKey. For example, assuming the "serialNum" value is an 11-digit fixed-length string and the data offset OffSet is 38, the fixed-length string (the "serialNum" value) is 00000000038, and the resulting StartRowKey is ~_00000000038_primaryRowKey38 (the StartRowKey row itself is not included in the paging scan). In addition, the "serialNum" value of the StopRowKey is set to the value of OffSet+PageSize+1 (1 is added because the StopRowKey row itself is not included in the paging scan), thereby obtaining the StopRowKey. For example, if the data offset OffSet is 38 and the paging size PageSize is 10, then OffSet+PageSize+1=38+10+1=49, the fixed-length string (the "serialNum" value) is 00000000049, and the resulting StopRowKey is ~_00000000049_primaryRowKey49.
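The arithmetic for the two boundary serial numbers can be written out as a short sketch. The function name `page_serials` and its `width` parameter are assumptions introduced for illustration; only the serialNum part of each key is computed here, since the primaryRowKey part comes from the stored data.

```python
def page_serials(offset, page_size, width=11):
    """Return the fixed-length serialNum strings for StartRowKey and StopRowKey."""
    if offset < 0 or page_size <= 0:
        raise ValueError("offset must be >= 0 and page_size must be > 0")
    start_serial = str(offset).zfill(width)
    # +1 because, as described above, neither boundary row is returned.
    stop_serial = str(offset + page_size + 1).zfill(width)
    return start_serial, stop_serial


first_page = page_serials(38, 10)
# The next page starts at the previous offset plus the paging size.
next_page = page_serials(38 + 10, 10)
```

With both boundary rows excluded, the first call covers serial numbers 39 to 48 and the second covers 49 to 58, ten records each.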
Scan module 810, for implementing the paging of the original data according to the first row StartRowKey and the last row StopRowKey of each page, and the total number Total.
Specifically, once the first row StartRowKey and the last row StopRowKey of each page have been obtained, together with the total number Total of data records of the data row paging information (which equals the total number of data records of the original data), the scan module 810 can page the original data according to a conventional paging method. In HBase, each page of the original data is scanned according to the first row StartRowKey and the last row StopRowKey. The total number Total indicates whether a next page exists, that is, whether the paging scan has been fully completed.
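A page scan with the boundary handling described above can be sketched over an in-memory list of paging row keys. This is an illustration under stated assumptions, not the scan performed by the invention: `scan_page` is a hypothetical helper, the sorted list stands in for the table, and both boundary rows are excluded because that is how the text describes the scan.

```python
def scan_page(paging_keys, offset, page_size, total, width=11):
    """Return (keys on the page, whether a next page exists)."""
    start = "~_" + str(offset).zfill(width)
    stop = "~_" + str(offset + page_size + 1).zfill(width)
    n = len(start)  # length of "~_" plus the fixed-length serialNum
    # Keep keys whose "~_serialNum" part lies strictly between the bounds.
    page = [k for k in sorted(paging_keys) if start < k[:n] < stop]
    has_next = offset + page_size < total
    return page, has_next


all_keys = ["~_{:011d}_primaryRowKey{}".format(i, i) for i in range(1, 61)]
page, has_next = scan_page(all_keys, offset=38, page_size=10, total=60)
last_page, more = scan_page(all_keys, offset=50, page_size=10, total=60)
```

Comparing only the fixed-length "~_serialNum" part keeps the result independent of what each primaryRowKey looks like.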
Embodiment five
As shown in Figure 5, the fifth embodiment of the present invention proposes an HBase row paging system 28. In this embodiment, the HBase row paging system 28 is similar to that of the fourth embodiment; the difference is as follows:
The receiving module 806 is further used to receive the first row StartRowKey and the last row StopRowKey of each page provided by the client 4.
Specifically, when the client 4 sends a paging request to the server 2, it can calculate the first row StartRowKey and the last row StopRowKey from the data offset OffSet and the paging size PageSize of each page, and provide them to the server 2 for the paging scan. In such paging requests, each page has its own offset OffSet (the offset of the next page is the offset of the previous page plus the paging size), while the paging size PageSize is fixed.
The client 4 first follows the rule used to set the row key RowKey of the data row paging information (a fixed-length string) and sets the data offset OffSet as the "serialNum" value of StartRowKey, thereby obtaining the StartRowKey. For example, assuming the "serialNum" value is an 11-digit fixed-length string and the data offset OffSet is 38, the fixed-length string (the "serialNum" value) is 00000000038, and the resulting StartRowKey is ~_00000000038_primaryRowKey38 (the StartRowKey row itself is not included in the paging scan). In addition, the "serialNum" value of the StopRowKey is set to the value of OffSet+PageSize+1 (1 is added because the StopRowKey row itself is not included in the paging scan), thereby obtaining the StopRowKey. For example, if the data offset OffSet is 38 and the paging size PageSize is 10, then OffSet+PageSize+1=38+10+1=49, the fixed-length string (the "serialNum" value) is 00000000049, and the resulting StopRowKey is ~_00000000049_primaryRowKey49.
In such a paging request, the client 4 provides the calculated StartRowKey and StopRowKey to the server 2. The receiving module 806 receives the first row StartRowKey and the last row StopRowKey of each page provided by the client 4. The scan module 810 then performs the paging scan directly according to the StartRowKey and the StopRowKey provided by the client 4.
Embodiment six
The present invention also provides another embodiment, namely a computer-readable storage medium. The computer-readable storage medium stores an HBase row paging program, and the HBase row paging program can be executed by at least one processor so that the at least one processor performs the steps of the HBase row paging method described above.
It should be noted that, herein, the terms "include", "comprise" and any other variants are intended to cover non-exclusive inclusion, so that a process, method, article or device that includes a series of elements includes not only those elements but also other elements that are not explicitly listed, or elements that are inherent to the process, method, article or device. Unless further restricted, an element introduced by the phrase "including a ..." does not exclude the presence of other identical elements in the process, method, article or device that includes that element.
The serial numbers of the above embodiments of the present invention are for description only and do not indicate the relative merits of the embodiments.
From the above description of the embodiments, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by software plus a necessary general-purpose hardware platform, and can of course also be implemented by hardware, although in many cases the former is the better implementation. Based on this understanding, the technical solution of the present invention, in essence or in the part that contributes over the prior art, can be embodied in the form of a software product. The computer software product is stored in a storage medium (such as ROM/RAM, a magnetic disk or an optical disc) and includes several instructions that cause a terminal (which can be a mobile phone, a computer, a server, an air conditioner, a network device or the like) to perform the methods described in the embodiments of the present invention.
The embodiments of the present invention have been described above with reference to the accompanying drawings, but the invention is not limited to the specific embodiments described, which are merely illustrative rather than restrictive. Under the teaching of the present invention, those of ordinary skill in the art can devise many other forms without departing from the concept of the invention and the scope of protection of the claims, and all of these fall within the protection of the present invention.

Claims (10)

  1. An HBase row paging method, applied to a server, characterized in that the method includes:
    Step a: setting the row key RowKey of HBase data row paging information;
    Step b: obtaining the total number Total of data records of the data row paging information;
    Step c: inserting the data row paging information and the corresponding original data into the same table at the same time;
    Step d: receiving the data offset OffSet and the paging size PageSize in a paging request;
    Step e: calculating the first row StartRowKey and the last row StopRowKey of each page according to the data offset OffSet and the paging size PageSize; and
    Step f: implementing the paging of the original data according to the first row StartRowKey and the last row StopRowKey of each page, and the total number Total.
  2. The HBase row paging method according to claim 1, characterized in that step d and step e of the method are replaced with:
    Step g: receiving the first row StartRowKey and the last row StopRowKey of each page provided by a client.
  3. The HBase row paging method according to claim 1 or 2, characterized in that the structure of the row key RowKey of the data row paging information is: ~_serialNum_primaryRowKey, wherein "~" is a prefix symbol used to distinguish the row key from the row keys of the original data; "serialNum" represents the serial number of the data record at the time the row key RowKey is inserted; "primaryRowKey" represents the row key value in the original data; and "_" is a connector.
  4. The HBase row paging method according to claim 3, characterized in that the serialNum is a decimal or hexadecimal fixed-length string.
  5. The HBase row paging method according to claim 3, characterized in that step b specifically includes:
    reversing the order of the data row paging information;
    obtaining the first row key RowKey after the reversal;
    splitting the value of the row key RowKey on the connector "_" to obtain the value of "serialNum";
    removing the leading zeros from the value of "serialNum" to obtain the total number Total.
  6. The HBase row paging method according to claim 3, characterized in that step e specifically includes:
    according to the rule used to set the row key RowKey of the data row paging information, setting the data offset OffSet as the "serialNum" value of the StartRowKey, thereby obtaining the StartRowKey;
    setting the "serialNum" value of the StopRowKey according to the value of OffSet+PageSize+1, thereby obtaining the StopRowKey.
  7. A server, characterized in that the server includes: a memory, a processor, and an HBase row paging program stored on the memory and executable on the processor, wherein the HBase row paging program implements the following steps when executed by the processor:
    setting the row key RowKey of HBase data row paging information;
    obtaining the total number Total of data records of the data row paging information;
    inserting the data row paging information and the corresponding original data into the same table at the same time;
    receiving the data offset OffSet and the paging size PageSize in a paging request;
    calculating the first row StartRowKey and the last row StopRowKey of each page according to the data offset OffSet and the paging size PageSize; and
    implementing the paging of the original data according to the first row StartRowKey and the last row StopRowKey of each page, and the total number Total.
  8. The server according to claim 7, characterized in that the HBase row paging program further implements the following step when executed by the processor:
    receiving the first row StartRowKey and the last row StopRowKey of each page provided by a client.
  9. The server according to claim 7 or 8, characterized in that the structure of the row key RowKey of the data row paging information is: ~_serialNum_primaryRowKey, wherein "~" is a prefix symbol used to distinguish the row key from the row keys of the original data; "serialNum" represents the serial number of the data record at the time the row key RowKey is inserted; "primaryRowKey" represents the row key value in the original data; and "_" is a connector.
  10. A computer-readable storage medium, characterized in that an HBase row paging program is stored on the computer-readable storage medium, and the HBase row paging program, when executed by a processor, implements the steps of the HBase row paging method according to any one of claims 1 to 6.
CN201711234263.9A 2017-11-30 2017-11-30 HBase rows paging method, server and computer-readable recording medium Pending CN107943981A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711234263.9A CN107943981A (en) 2017-11-30 2017-11-30 HBase rows paging method, server and computer-readable recording medium

Publications (1)

Publication Number Publication Date
CN107943981A true CN107943981A (en) 2018-04-20

Family

ID=61947878

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711234263.9A Pending CN107943981A (en) 2017-11-30 2017-11-30 HBase rows paging method, server and computer-readable recording medium

Country Status (1)

Country Link
CN (1) CN107943981A (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130103658A1 (en) * 2011-10-19 2013-04-25 Vmware, Inc. Time series data mapping into a key-value database
CN103617232A (en) * 2013-11-26 2014-03-05 北京京东尚科信息技术有限公司 Paging inquiring method for HBase table
CN106874400A (en) * 2017-01-16 2017-06-20 努比亚技术有限公司 A kind of data processing method and server

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112182040A (en) * 2020-09-30 2021-01-05 深圳前海微众银行股份有限公司 Data query method, device, equipment and storage medium
CN112417276A (en) * 2020-11-18 2021-02-26 北京字节跳动网络技术有限公司 Paging data acquisition method and device, electronic equipment and computer readable storage medium
WO2022105682A1 (en) * 2020-11-18 2022-05-27 北京字节跳动网络技术有限公司 Paging data acquisition method and apparatus, electronic device, and computer readable storage medium
CN115098215A (en) * 2022-07-19 2022-09-23 重庆紫光华山智安科技有限公司 Data paging method, system, electronic device and storage medium based on multiple services
CN115098215B (en) * 2022-07-19 2024-06-04 重庆紫光华山智安科技有限公司 Multi-service-based data paging method, system, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20180420
