Detailed Description of the Embodiments
To help those skilled in the art better understand the technical solutions of the present application, the technical solutions are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, rather than all, of the embodiments of the present application. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present application without creative effort shall fall within the scope of protection of the present application.
To help those skilled in the art better understand the solution of the present application, the application is described in further detail below with reference to the accompanying drawings:
Embodiment one:
Fig. 1 is a flowchart of the data processing method provided by Embodiment one of the present application.
Referring to Fig. 1, the data processing method provided by this embodiment of the present application comprises:
Step S11: obtain the data to be processed, and determine whether the data volume of the data to be processed is greater than a preset threshold.
In this embodiment, after the data to be processed is obtained, data information about it can also be obtained, including the data name, storage path, data type, data format, and data volume. All of these can be obtained directly through functions provided by the operating system.
Because the data to be processed may be massive and difficult for software to read once saved as a single data file, the data information can be used, once obtained, to determine whether the data volume of the data to be processed is greater than the preset threshold. The preset threshold here may be the maximum data volume that commonly used software can read.
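Step S11 can be sketched as follows. This is only an illustration under stated assumptions: the data to be processed is a single file on disk, the data volume is measured in bytes, and the names describe_data, exceeds_threshold, and THRESHOLD_BYTES (with its 100 MB value) are hypothetical and do not come from the application.

```python
import os

# Hypothetical threshold: the largest file size (in bytes) that the
# viewing software is assumed to be able to read.
THRESHOLD_BYTES = 100 * 1024 * 1024


def describe_data(path):
    """Collect basic data information through operating-system functions."""
    return {
        "name": os.path.basename(path),
        "path": os.path.abspath(path),
        "format": os.path.splitext(path)[1].lstrip(".").upper(),
        "size_bytes": os.path.getsize(path),
    }


def exceeds_threshold(info, threshold=THRESHOLD_BYTES):
    """Return True when the data volume is greater than the preset threshold."""
    return info["size_bytes"] > threshold
```

When exceeds_threshold returns False, the file can simply be opened in the viewer as it is.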
Step S12: if the data volume of the data to be processed is greater than the preset threshold, divide the data to be processed into multiple data segments, where the data volume of each data segment is not greater than the preset threshold.
In this embodiment, when the data volume of the data to be processed is greater than the preset threshold, that is, greater than the maximum data volume that the software can read, the data to be processed can be divided into multiple data segments such that the data volume of each data segment is not greater than the preset threshold, so that each data segment can be read by the software.
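One way to picture the division in step S12 is to express the threshold as a maximum number of rows per segment. The sketch below assumes row-oriented data and inclusive row ranges; split_into_segments and its parameters are hypothetical names used only for illustration.

```python
def split_into_segments(total_rows, max_rows_per_segment, first_row=1):
    """Return inclusive (start, end) row ranges covering total_rows rows.

    Every range holds at most max_rows_per_segment rows, so each segment
    stays within the (row-based) threshold assumed here.
    """
    if max_rows_per_segment <= 0:
        raise ValueError("max_rows_per_segment must be positive")
    segments = []
    start = first_row
    last_row = first_row + total_rows - 1
    while start <= last_row:
        end = min(start + max_rows_per_segment - 1, last_row)
        segments.append((start, end))
        start = end + 1
    return segments
```

For instance, 50000 data rows starting at row 2 with at most 20000 rows per segment give the ranges (2, 20001), (20002, 40001), and (40002, 50001).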
Step S13: select at least one piece of sample data from the multiple data segments, and use the selected sample data to build a data subset for global browsing, where the data volume of the data subset is not greater than the preset threshold.
In this embodiment, at least one piece of sample data is selected from the data segments obtained in step S12. One piece of sample data may be selected from each data segment, multiple pieces may be selected from any one data segment, or one piece may be selected from several data segments. In other words, the number of pieces of sample data need not equal the number of data segments. Preferably, the more pieces of sample data there are and the more data segments they come from, the better. The selected sample data is then used to build the data subset for global browsing, whose data volume is likewise not greater than the preset threshold, so that the software can open the data subset and give an overview of the data to be processed as a whole.
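A minimal sketch of step S13 follows. It assumes the rows are held in a Python list, segments are inclusive 0-based index ranges, and the sample taken from a segment is simply its first row; the application leaves the selection rule open, so build_overview is a hypothetical helper and only one possible choice.

```python
import math


def build_overview(rows, segments, max_rows):
    """Pick the first row of selected segments, keeping at most max_rows samples.

    Returns (segment_id, row) pairs so that every sample remembers the
    segment it was taken from.
    """
    if max_rows <= 0:
        raise ValueError("max_rows must be positive")
    # Skip segments evenly when there are more segments than allowed samples.
    step = max(1, math.ceil(len(segments) / max_rows))
    overview = []
    for seg_id in range(0, len(segments), step):
        start, _end = segments[seg_id]
        overview.append((seg_id, rows[start]))
    return overview
```

Keeping the segment identifier next to each sample is what later allows a selected sample to be traced back to its full data segment.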
With the data processing method provided by this embodiment, the data to be processed is obtained, and it is determined whether its data volume is greater than the preset threshold. If so, the data to be processed is divided into multiple data segments, each with a data volume not greater than the preset threshold. At least one piece of sample data is then selected from the data segments and used to build a data subset for global browsing, whose data volume is not greater than the preset threshold. In this way, massive data to be processed can be divided into multiple small data segments, and sample data selected from those segments forms a data subset. Because the data volume of each data segment and of the data subset is not greater than the preset threshold, existing data viewing software can browse and inspect them, and browsing the data subset provides a global view of the massive data.
Embodiment two:
Fig. 2 is a flowchart of the data processing method provided by Embodiment two of the present application.
Referring to Fig. 2, the data processing method provided by this embodiment of the present application comprises:
Step S21: obtain the data to be processed, and determine whether the data volume of the data to be processed is greater than a preset threshold.
Step S22: if the data volume of the data to be processed is greater than the preset threshold, obtain first index information of the data to be processed.
In this embodiment, if the data volume of the data to be processed is greater than the preset threshold, the data is indexed to obtain its first index information. For example, if the data to be processed contains time information, that time information can be extracted and used as the first index information, which serves as the basis for subsequent data processing. If the data to be processed contains only data information and no time information, additional time information can be assigned automatically and saved together with the data information in an index file.
Step S23: divide the data to be processed into multiple data segments according to the first index information, where the first index information is used to locate the data in each data segment, and the data volume of each data segment is not greater than the preset threshold.
In this embodiment, saying that the first index information is used to locate the data in each data segment means that the first index information is information that can uniquely locate a particular datum or group of data in the data to be processed. The first index information may be a data name, a data value, or a combination of the two. Consider, for example, "the data when the test time is 0.1 second": here "test time" is the first index information, because the time value "0.1 second" is unique and cannot occur more than once. By contrast, "the data when the speed is 0.1 m/s" cannot uniquely identify a datum or group of data, because the speed varies and may reach 0.1 m/s many times. "Speed" in this example is therefore not first index information.
Based on the selected first index information, a correspondence between the first index information and the data segments of the data to be processed is established, and the data to be processed can then be divided into multiple data segments according to this correspondence. The correspondence can be saved in an index file together with the data information obtained earlier.
Take as an example a file of data to be processed, test.txt, whose size is 3.29 GB and whose data format is shown in Table 1 below. When each row of the data to be processed stores one variable value, the data exceeds 20 million rows, and none of the commonly used data browsing tools can browse it.
The Time variable is unique in test.txt, so it can be used as the first index information. To make lookup convenient, the values of the first index information, Time, need to be mapped to the line numbers of the data to be processed, so that the data can be divided into data segments by line number according to the line numbers corresponding to each Time value. For example, the index file for this data needs to store the following information:
Data name of the data to be processed: test.txt
Data volume of the data to be processed: 3.29GB
Location of the data to be processed: C: (example)
Data format of the data to be processed: TXT
First index information of the data to be processed:
Time | Line (line number)
0 | 2~20001
100 | 20002~40001
200 | 40002~60001
…… | ……
Table 1
Here, the data segment corresponding to first index value Time = 0 is rows 2 to 20001, the data segment corresponding to Time = 100 is rows 20002 to 40001, the data segment corresponding to Time = 200 is rows 40002 to 60001, and so on.
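The mapping from Time values to line ranges in Table 1 could be built along the following lines. The sketch assumes a layout the application does not spell out: row 1 is a header and every later row begins with a numeric Time value followed by whitespace. The function build_first_index and the bucket parameter are hypothetical.

```python
def build_first_index(lines, bucket=100):
    """Map the start of each Time bucket to an inclusive [first_line, last_line].

    lines is any iterable of text rows, for example an open file, so the
    whole file never has to be held in memory at once.
    """
    index = {}
    for line_no, line in enumerate(lines, start=1):
        if line_no == 1 or not line.strip():
            continue  # skip the assumed header row and blank rows
        time_value = float(line.split()[0])
        key = int(time_value // bucket) * bucket
        if key not in index:
            index[key] = [line_no, line_no]
        else:
            index[key][1] = line_no  # extend the range to the current row
    return index
```

The resulting dictionary has the same shape as Table 1 and could be written to the index file alongside the data name, size, location, and format.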
Step S24: select at least one piece of sample data from the multiple data segments, and use the selected sample data to build a data subset for global browsing, where the data volume of the data subset is not greater than the preset threshold.
With the data processing method provided by this embodiment, the data to be processed is obtained, and it is determined whether its data volume is greater than the preset threshold. If so, the data to be processed is divided into multiple data segments according to the obtained first index information, each with a data volume not greater than the preset threshold. At least one piece of sample data is then selected from the data segments and used to build a data subset for global browsing, whose data volume is not greater than the preset threshold. In this way, massive data to be processed can be divided into multiple small data segments according to the first index information, and sample data selected from those segments forms a data subset. Because the data volume of each data segment and of the data subset is not greater than the preset threshold, existing data viewing software can browse and inspect them, and browsing the data subset provides a global view of the massive data.
Embodiment three:
Fig. 3 is a flowchart of the data processing method provided by Embodiment three of the present application.
Referring to Fig. 3, the data processing method provided by this embodiment of the present application comprises:
Step S31: obtain the data to be processed, and determine whether the data volume of the data to be processed is greater than a preset threshold.
Step S32: if the data volume of the data to be processed is greater than the preset threshold, obtain first index information of the data to be processed.
In this embodiment, if the data volume of the data to be processed is greater than the preset threshold, the data is indexed to obtain its first index information. For example, if the data to be processed contains time information, that time information can be extracted and used as the first index information, which serves as the basis for subsequent data processing. If the data to be processed contains only data information and no time information, additional time information can be assigned automatically and saved together with the data information in an index file.
Step S33: divide the data to be processed into multiple data segments according to the first index information, where the first index information is used to locate the data in each data segment, and the data volume of each data segment is not greater than the preset threshold.
In this embodiment, saying that the first index information is used to locate the data in each data segment means that the first index information is information that can uniquely locate a particular datum or group of data in the data to be processed. The first index information may be a data name, a data value, or a combination of the two. Consider, for example, "the data when the test time is 0.1 second": here "test time" is the first index information, because the time value "0.1 second" is unique and cannot occur more than once. By contrast, "the data when the speed is 0.1 m/s" cannot uniquely identify a datum or group of data, because the speed varies and may reach 0.1 m/s many times. "Speed" in this example is therefore not first index information.
Based on the selected first index information, a correspondence between the first index information and the data segments of the data to be processed is established, and the data to be processed can then be divided into multiple data segments according to this correspondence. The correspondence can be saved in an index file together with the data information obtained earlier.
Step S34: determine at least one piece of sample data in the data to be processed according to a preset sampling interval, and determine the data segment containing each piece of sample data according to the first index information.
In this embodiment, the preset sampling interval may be determined from the first index information in the index file or set by the user as required. The sampling interval may or may not match the number of data segments corresponding to the first index information. In this embodiment it preferably matches.
In this embodiment, once the sampling interval for the data to be processed has been determined, the sample data to be extracted can be determined from the data to be processed according to that interval. Moreover, because the first index information corresponds to the data segments, once each piece of sample data has been determined, the data segment containing it can be determined from the first index information.
Step S35: select the sample data from the data segment containing each piece of sample data, record the correspondence between each piece of sample data and the data segment containing it, and use the selected sample data to build a data subset for global browsing, where the data volume of the data subset is not greater than the preset threshold.
In this embodiment, once the first index information has been determined, the data to be processed can be sampled as required to obtain a data subset whose data volume is suitable for direct processing by existing data viewing software. Reading and displaying this data subset gives a preview of the global trend of the data.
For example, for the test.txt data above, one group of sample data can be extracted every 20000 rows according to the correspondence between the first index information Time, the data segments, and the line numbers Line. The data subset then contains only about 1000 rows, which existing data browsing tools can browse in full with ease.
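Steps S34 and S35 can be sketched as below, using the row ranges of Table 1. The helpers sample_rows and segment_of are hypothetical, and the index is assumed to be a dictionary from a Time value to an inclusive (first_row, last_row) pair.

```python
def sample_rows(first_row, last_row, interval):
    """Return the row numbers picked at a fixed sampling interval."""
    return list(range(first_row, last_row + 1, interval))


def segment_of(row, index):
    """Return the Time key of the segment containing row, or None."""
    for time_value, (start, end) in index.items():
        if start <= row <= end:
            return time_value
    return None


# Row ranges taken from Table 1.
first_index = {0: (2, 20001), 100: (20002, 40001), 200: (40002, 60001)}

# Record which data segment every sampled row belongs to.
correspondence = {
    row: segment_of(row, first_index)
    for row in sample_rows(2, 60001, 20000)
}
```

With an interval of 20000 rows, a file of 20 million data rows starting at row 2 yields 1000 sampled rows, which is consistent with the figure of about 1000 rows given above.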
With the data processing method provided by this embodiment, the data to be processed is obtained, and it is determined whether its data volume is greater than the preset threshold. If so, the data to be processed is divided into multiple data segments according to the obtained first index information, each with a data volume not greater than the preset threshold. At least one piece of sample data is then selected from the data segments and used to build a data subset for global browsing, whose data volume is not greater than the preset threshold. In this way, massive data to be processed can be divided into multiple small data segments according to the first index information, and sample data selected from those segments forms a data subset. Because the data volume of each data segment and of the data subset is not greater than the preset threshold, existing data viewing software can browse and inspect them, and browsing the data subset provides a global view of the massive data.
Embodiment four:
Fig. 4 is a flowchart of the data processing method provided by Embodiment four of the present application.
Referring to Fig. 4, the data processing method provided by this embodiment of the present application comprises:
Step S41: obtain the data to be processed, and determine whether the data volume of the data to be processed is greater than a preset threshold.
Step S42: if the data volume of the data to be processed is greater than the preset threshold, divide the data to be processed into multiple data segments, where the data volume of each data segment is not greater than the preset threshold.
Step S43: select at least one piece of sample data from the multiple data segments, and use the selected sample data to build a data subset for global browsing, where the data volume of the data subset is not greater than the preset threshold.
Step S44: obtain a first selection operation on the sample data in the data subset, and determine the sample data selected by the first selection operation.
In this embodiment, after browsing the overall trend of the data to be processed through the data subset, the user can also select sample data of interest in the data subset as required.
Step S45: according to the correspondence between each piece of sample data and the data segment containing it, extract and display the data in the data segment corresponding to the sample data selected by the first selection operation.
In this embodiment, once the sample data selected by the user's first selection operation has been determined, the data segment containing that sample data can be determined, and all the data of that data segment can be extracted from the data to be processed for detailed viewing.
When the data to be processed includes first index information, the data segment containing the sample data selected by the user can be determined according to the first index information. When it does not, the data segment can be determined directly from the divided data segments. The present application places no limitation on this.
Continuing the example above, if the user chooses to view the sample data at row 20000 in detail, it can be determined directly from the divided data segments that row 20000 lies in the data segment of rows 2 to 20001. Alternatively, it can be determined from the first index value Time = 0 corresponding to row 20000 that the corresponding data segment is rows 2 to 20001. All the data in the data segment of rows 2 to 20001 can then be extracted and displayed for the user to view in detail.
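Steps S44 and S45 amount to finding the segment that contains the selected row and reading only that row range from the file. The sketch below assumes a UTF-8 text file and 1-based inclusive row ranges; read_rows and show_segment_for are hypothetical helpers.

```python
from itertools import islice


def read_rows(path, start, end):
    """Return rows start..end (1-based, inclusive) without loading the whole file."""
    with open(path, "r", encoding="utf-8") as handle:
        return [line.rstrip("\n") for line in islice(handle, start - 1, end)]


def show_segment_for(row, segments, path):
    """Return all rows of the segment that contains the selected row."""
    for start, end in segments:
        if start <= row <= end:
            return read_rows(path, start, end)
    raise LookupError(f"row {row} is not covered by any segment")
```

Reading through islice still scans the file from the beginning up to the requested range. Storing byte offsets in the index file would avoid that scan, but the application does not describe such offsets.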
With the data processing method provided by this embodiment, the data to be processed is obtained, and it is determined whether its data volume is greater than the preset threshold. If so, the data to be processed is divided into multiple data segments, each with a data volume not greater than the preset threshold. At least one piece of sample data is selected from the data segments and used to build a data subset for global browsing, whose data volume is not greater than the preset threshold. A first selection operation on the sample data in the data subset is obtained, and the sample data selected by the first selection operation is determined. According to the correspondence between each piece of sample data and the data segment containing it, the data in the data segment corresponding to the selected sample data is extracted and displayed. In this way, massive data to be processed can be divided into multiple small data segments, and sample data selected from those segments forms a data subset. Because the data volume of each data segment and of the data subset is not greater than the preset threshold, existing data viewing software can browse and inspect them, and browsing the data subset provides a global view of the massive data. Sample data of interest can then be selected from the data subset, and all the data of the data segment containing that sample data can be viewed in detail. After global browsing, the user can thus quickly narrow the browsing range and browse the data of interest in detail.
Embodiment five:
Fig. 5 is a flowchart of the data processing method provided by Embodiment five of the present application.
Referring to Fig. 5, the data processing method provided by this embodiment of the present application comprises:
Step S51: obtain the data to be processed, and determine whether the data volume of the data to be processed is greater than a preset threshold.
Step S52: if the data volume of the data to be processed is greater than the preset threshold, obtain first index information of the data to be processed.
Step S53: divide the data to be processed into multiple data segments according to the first index information, where the first index information is used to locate the data in each data segment, and the data volume of each data segment is not greater than the preset threshold.
In this embodiment, saying that the first index information is used to locate the data in each data segment means that the first index information is information that can uniquely locate a particular datum or group of data in the data to be processed. The first index information may be a data name, a data value, or a combination of the two. Consider, for example, "the data when the test time is 0.1 second": here "test time" is the first index information, because the time value "0.1 second" is unique and cannot occur more than once. By contrast, "the data when the speed is 0.1 m/s" cannot uniquely identify a datum or group of data, because the speed varies and may reach 0.1 m/s many times. "Speed" in this example is therefore not first index information.
Based on the selected first index information, a correspondence between the first index information and the data segments of the data to be processed is established, and the data to be processed can then be divided into multiple data segments according to this correspondence. The correspondence can be saved in an index file together with the data information obtained earlier.
Step S54: select at least one piece of sample data from the multiple data segments, and use the selected sample data to build a data subset for global browsing, where the data volume of the data subset is not greater than the preset threshold.
Step S55: obtain second index information for each data segment.
In this embodiment, to prevent a single index from holding so much information that processing speed suffers, a two-level index can be established. That is, in addition to the first index information of the data to be processed, second index information is established for each data segment.
Step S56: divide each data segment into multiple sub-data segments according to the second index information, where the second index information is used to locate the data in each sub-data segment.
In this embodiment, because second index information is established for each data segment, it can be used to further divide each data segment into multiple sub-data segments.
Take as an example a file of data to be processed, test.txt, whose size is 3.29 GB and whose data format is shown in Table 2 below. When each row of the data to be processed stores one variable value, the data exceeds 20 million rows, and none of the commonly used data browsing tools can browse it.
The Time variable is unique in test.txt, so it can be used as the first index information. To make lookup convenient, the values of the first index information, Time, need to be mapped to the line numbers of the data to be processed, so that the data can be divided into data segments by line number according to the line numbers corresponding to each Time value. For example, the index file for this data needs to store the following information:
Data name of the data to be processed: test.txt
Data volume of the data to be processed: 3.29GB
Location of the data to be processed: C: (example)
Data format of the data to be processed: TXT
First index information of the data to be processed:
Time | Line (line number) | Secondary index
0 | 2~20001 | ?
100 | 20002~40001 | 1.index
200 | 40002~60001 | 2.index
…… | …… | ……
Table 2
Here, the data segment corresponding to first index value Time = 0 is rows 2 to 20001, the data segment corresponding to Time = 100 is rows 20002 to 40001, the data segment corresponding to Time = 200 is rows 40002 to 60001, and so on.
Because the index span in Table 2 above is large, a "secondary index", that is, the second index information, is added so that after the data has been located at the first level, the load position can be narrowed further. For example, 1.index stores the second index information for the data segment of rows 2 to 20001, which corresponds to first index values Time from 0 to 100. Its content is shown in Table 3 below:
Table 3
Here, the sub-data segment corresponding to second index value Time = 0 is rows 2 to 2001, the sub-data segment corresponding to Time = 10 is rows 2002 to 4001, the sub-data segment corresponding to Time = 20 is rows 4002 to 6001, and so on.
Comparing Table 2 and Table 3 shows that the data with Time between 5 and 15 lies in the data segment corresponding to first index values Time from 0 to 100, that is, between rows 2 and 20002 of the data to be processed. Because second index information exists, the data can be located further according to the second index information in 1.index: the data with Time between 5 and 15 lies in the sub-data segments with second index values Time from 0 to 20, that is, between rows 2 and 4002 of the data to be processed. Locating the data in this way narrows the data range by a factor of 1000, and commonly used data browsing tools can display data of this volume.
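The two-level lookup in this example can be sketched as follows. The index levels are assumed to be sorted lists of (time_start, first_row, last_row) entries holding the values quoted for Table 2 and Table 3; covering_range is a hypothetical helper and returns an inclusive row range.

```python
import bisect


def covering_range(index, t_lo, t_hi):
    """Return the inclusive row range of the entries covering Time t_lo..t_hi.

    index is a list of (time_start, first_row, last_row) tuples sorted by
    time_start. An entry covers all Time values from its time_start up to,
    but not including, the next entry's time_start.
    """
    starts = [entry[0] for entry in index]
    lo = max(bisect.bisect_right(starts, t_lo) - 1, 0)
    hi = max(bisect.bisect_right(starts, t_hi) - 1, 0)
    return index[lo][1], index[hi][2]


# Values quoted from Table 2 (first level) and Table 3 (contents of 1.index).
first_level = [(0, 2, 20001), (100, 20002, 40001), (200, 40002, 60001)]
second_level = [(0, 2, 2001), (10, 2002, 4001), (20, 4002, 6001)]

coarse = covering_range(first_level, 5, 15)   # rows of the whole data segment
fine = covering_range(second_level, 5, 15)    # rows of the two sub-data segments
```

Here coarse is the full data segment of rows 2 to 20001, while fine is limited to the sub-data segments for Time = 0 and Time = 10, so only about 4000 rows need to be loaded.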
Step S57: obtain a second selection operation on the data in the sub-data segment, and, according to the second index information, extract and display the data in the sub-data segment corresponding to the data selected by the second selection operation.
In this embodiment, after viewing all the data in the data segment of interest, the user can also select data of interest within that data segment as required.
In this embodiment, once the data selected by the user's second selection operation has been determined, the sub-data segment containing that data within its data segment can be determined, and all the data of that sub-data segment can be extracted for detailed viewing.
Continuing the example above, if the user chooses to view the sample data at row 20000 in detail, the first index value Time = 0 corresponding to row 20000 determines that the corresponding data segment is rows 2 to 20001, and all the data in that data segment is extracted and displayed for the user to view in detail. If the user then chooses to view the data at row 2000, the second index value Time = 0 corresponding to row 2000 determines that the corresponding sub-data segment is rows 2 to 2001, and all the data in that sub-data segment is extracted and displayed for the user to view in detail.
Moreover, after the user has used the software to read all the data in the sub-data segment of rows 2 to 2001, for which Time is 0, the data range is small enough that a still smaller range of data can be searched for within the software itself for further display and viewing.
This embodiment does not limit the number of index levels. In addition to the first and second index information in the embodiments above, third index information, fourth index information, and so on can also be set in order to narrow the data range further.
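Because the number of index levels is not limited, the subdivision can be pictured as a recursion that keeps splitting a row range until each leaf is small enough. The sketch below is one possible arrangement, not the application's prescribed structure; build_levels, depth, fanout, and leaf_rows are hypothetical names.

```python
def build_levels(start, end, fanout, leaf_rows):
    """Recursively split the inclusive row range [start, end].

    Each level splits a range into at most fanout parts, and splitting
    stops once a range holds no more than leaf_rows rows.
    """
    if fanout < 2 or leaf_rows < 1:
        raise ValueError("fanout must be >= 2 and leaf_rows must be >= 1")
    size = end - start + 1
    node = {"rows": (start, end), "children": []}
    if size <= leaf_rows:
        return node
    part = -(-size // fanout)  # ceiling division
    child_start = start
    while child_start <= end:
        child_end = min(child_start + part - 1, end)
        node["children"].append(
            build_levels(child_start, child_end, fanout, leaf_rows)
        )
        child_start = child_end + 1
    return node


def depth(node):
    """Return the number of levels in the tree rooted at node."""
    return 1 + max((depth(child) for child in node["children"]), default=0)
```

Splitting rows 2 to 20001 with a fanout of 10 and leaves of at most 2000 rows gives ten children, the first of which covers rows 2 to 2001, as in the sub-data segments of the example.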
With the data processing method provided by this embodiment, the data to be processed is obtained, and it is determined whether its data volume is greater than the preset threshold. If so, the data to be processed is divided into multiple data segments, each with a data volume not greater than the preset threshold. At least one piece of sample data is selected from the data segments and used to build a data subset for global browsing, whose data volume is not greater than the preset threshold. A first selection operation on the sample data in the data subset is obtained, and the sample data selected by the first selection operation is determined. According to the correspondence between each piece of sample data and the data segment containing it, the data in the data segment corresponding to the selected sample data is extracted and displayed. In this way, massive data to be processed can be divided into multiple small data segments, and sample data selected from those segments forms a data subset. Because the data volume of each data segment and of the data subset is not greater than the preset threshold, existing data viewing software can browse and inspect them, and browsing the data subset provides a global view of the massive data. Sample data of interest can then be selected from the data subset, all the data of the data segment containing that sample data can be viewed in detail according to the first index information, and all the data of the sub-data segment containing selected data within that data segment can be viewed in detail according to the second index information. After global browsing, the user can thus quickly narrow the browsing range and browse the data of interest in detail.
Be understandable that, for aforesaid each embodiment, if judge that the data volume of described pending data is not more than predetermined threshold value, can directly opened by data scan tool and check pending data, without the need to dividing data section again and carry out subsequent treatment.
For simplicity of description, each of the foregoing method embodiments is expressed as a series of combined actions. However, those skilled in the art should know that the present invention is not limited by the described order of actions, because according to the present invention some steps may be performed in other orders or simultaneously.
The data processing method of the present invention is disclosed above. Correspondingly, the present invention also discloses a device that applies the above data processing method; the device is used to realize global browsing of massive data.
Fig. 6 is a schematic structural diagram of a data processing device provided by the present application.
Referring to Fig. 6, the data processing device provided by the embodiment of the present application comprises:
A first acquisition module 1, configured to obtain pending data and judge whether the data volume of the pending data is greater than a predetermined threshold.
A first division module 2, configured to divide the pending data into multiple data segments if the data volume of the pending data is greater than the predetermined threshold, the data volume of each data segment being not greater than the predetermined threshold.
A building module 3, configured to select at least one sample data item from the multiple data segments and use the selected sample data to build a data subset for global browsing, the data volume of the data subset being not greater than the predetermined threshold.
The data processing device provided by the embodiment of the present application can adopt the data processing method of the above method embodiments, which is not repeated here.
Fig. 7 is a schematic structural diagram of another data processing device provided by the present application.
Referring to Fig. 7, the data processing device provided by the embodiment of the present application comprises:
A first acquisition module 1, configured to obtain pending data and judge whether the data volume of the pending data is greater than a predetermined threshold.
A first division module 2, configured to divide the pending data into multiple data segments if the data volume of the pending data is greater than the predetermined threshold, the data volume of each data segment being not greater than the predetermined threshold.
The first division module 2 specifically comprises: an acquisition unit 21, configured to obtain first index information of the pending data if the data volume of the pending data is greater than the predetermined threshold; and a division unit 22, configured to divide the pending data into multiple data segments according to the first index information, the first index information being used to locate the data in each data segment, and the data volume of each data segment being not greater than the predetermined threshold.
A building module 3, configured to select at least one sample data item from the multiple data segments and use the selected sample data to build a data subset for global browsing, the data volume of the data subset being not greater than the predetermined threshold.
The data processing device provided by the embodiment of the present application can adopt the data processing method of the above method embodiments, which is not repeated here.
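The work of the acquisition unit 21 and the division unit 22 can be sketched as follows. This is a minimal illustration under the assumption that the first index information is simply the zero-based position of each record; the function name `divide_by_index` and the sample sizes are hypothetical.

```python
from typing import List, Tuple


def divide_by_index(total: int, threshold: int) -> List[Tuple[int, int]]:
    """Return (start, end) index ranges, end exclusive.

    Each range covers at most `threshold` records, so every data segment
    stays within the predetermined threshold.
    """
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    return [(start, min(start + threshold, total))
            for start in range(0, total, threshold)]


# Ten records with a threshold of four yield three data segments.
segments = divide_by_index(total=10, threshold=4)
```

Storing only index ranges, rather than copies of the records, is one possible design choice; it lets a record be located in its data segment by position alone.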
Fig. 8 is a schematic structural diagram of yet another data processing device provided by the present application.
Referring to Fig. 8, the data processing device provided by the embodiment of the present application comprises:
A first acquisition module 1, configured to obtain pending data and judge whether the data volume of the pending data is greater than a predetermined threshold.
A first division module 2, configured to divide the pending data into multiple data segments if the data volume of the pending data is greater than the predetermined threshold, the data volume of each data segment being not greater than the predetermined threshold.
The first division module 2 specifically comprises: an acquisition unit 21, configured to obtain first index information of the pending data if the data volume of the pending data is greater than the predetermined threshold; and a division unit 22, configured to divide the pending data into multiple data segments according to the first index information, the first index information being used to locate the data in each data segment, and the data volume of each data segment being not greater than the predetermined threshold.
A building module 3, configured to select at least one sample data item from the multiple data segments and use the selected sample data to build a data subset for global browsing, the data volume of the data subset being not greater than the predetermined threshold.
The building module 3 specifically comprises: a sampling unit 31, configured to determine at least one sample data item in the pending data according to a predetermined sampling interval, and to determine the data segment in which each sample data item is located according to the first index information; and a selection unit 32, configured to select the sample data from the data segment in which each sample data item is located, record the correspondence between each sample data item and the data segment in which it is located, and use the selected sample data to build a data subset for global browsing, the data volume of the data subset being not greater than the predetermined threshold.
The data processing device provided by the embodiment of the present application can adopt the data processing method of the above method embodiments, which is not repeated here.
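The sampling unit 31 and the selection unit 32 can be sketched as follows. This is a minimal illustration, assuming that the first index information is a zero-based record position and that the data segments are given as (start, end) ranges with the end exclusive; the function name `build_subset` and the example values are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple


def build_subset(data: Sequence, segments: List[Tuple[int, int]],
                 interval: int, threshold: int):
    """Sample every `interval`-th record and record its source segment."""
    subset = []
    sample_to_segment: Dict[int, int] = {}  # position in subset -> segment number
    for pos in range(0, len(data), interval):
        # Locate the data segment whose index range contains this position.
        seg_no = next(i for i, (s, e) in enumerate(segments) if s <= pos < e)
        sample_to_segment[len(subset)] = seg_no
        subset.append(data[pos])
    # The data subset itself must also stay within the predetermined threshold.
    if len(subset) > threshold:
        raise ValueError("sampling interval too small: subset exceeds threshold")
    return subset, sample_to_segment


data = list(range(100, 110))
segments = [(0, 4), (4, 8), (8, 10)]
subset, mapping = build_subset(data, segments, interval=3, threshold=4)
```

In this sketch a sampling interval that would make the data subset exceed the threshold is rejected rather than adjusted automatically; how the interval is chosen is left open here.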
Fig. 9 is a schematic structural diagram of yet another data processing device provided by the present application.
Referring to Fig. 9, the data processing device provided by the embodiment of the present application comprises:
A first acquisition module 1, configured to obtain pending data and judge whether the data volume of the pending data is greater than a predetermined threshold.
A first division module 2, configured to divide the pending data into multiple data segments if the data volume of the pending data is greater than the predetermined threshold, the data volume of each data segment being not greater than the predetermined threshold.
A building module 3, configured to select at least one sample data item from the multiple data segments and use the selected sample data to build a data subset for global browsing, the data volume of the data subset being not greater than the predetermined threshold.
A determination module 4, configured to obtain a selection operation on the sample data in the data subset and determine the sample data selected by the selection operation.
A first extraction module 5, configured to extract and display, according to the correspondence between each sample data item and the data segment in which it is located, the data in the data segment corresponding to the sample data selected by the selection operation.
The data processing device provided by the embodiment of the present application can adopt the data processing method of the above method embodiments, which is not repeated here.
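The cooperation of the determination module 4 and the first extraction module 5 can be sketched as follows. This is a minimal illustration, assuming that the recorded correspondence is a dictionary from a sample's position in the data subset to a segment number, and that the data segments are (start, end) index ranges with the end exclusive; all names and values are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple


def extract_segment(data: Sequence, segments: List[Tuple[int, int]],
                    sample_to_segment: Dict[int, int],
                    selected_sample: int) -> list:
    """Return all records of the data segment that contains the selected sample."""
    seg_no = sample_to_segment[selected_sample]  # recorded correspondence
    start, end = segments[seg_no]
    return list(data[start:end])


data = list(range(100, 110))
segments = [(0, 4), (4, 8), (8, 10)]
sample_to_segment = {0: 0, 1: 0, 2: 1, 3: 2}

# The user selects the third sample in the data subset (position 2).
shown = extract_segment(data, segments, sample_to_segment, selected_sample=2)
```

Only the selected data segment is read, so the amount of data displayed never exceeds the predetermined threshold.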
Fig. 10 is a schematic structural diagram of a data processing device provided by the present application.
Referring to Fig. 10, the data processing device provided by the embodiment of the present application comprises:
A first acquisition module 1, configured to obtain pending data and judge whether the data volume of the pending data is greater than a predetermined threshold.
A first division module 2, configured to divide the pending data into multiple data segments if the data volume of the pending data is greater than the predetermined threshold, the data volume of each data segment being not greater than the predetermined threshold.
The first division module 2 specifically comprises: an acquisition unit 21, configured to obtain first index information of the pending data if the data volume of the pending data is greater than the predetermined threshold; and a division unit 22, configured to divide the pending data into multiple data segments according to the first index information, the first index information being used to locate the data in each data segment, and the data volume of each data segment being not greater than the predetermined threshold.
A building module 3, configured to select at least one sample data item from the multiple data segments and use the selected sample data to build a data subset for global browsing, the data volume of the data subset being not greater than the predetermined threshold.
The building module 3 specifically comprises: a sampling unit 31, configured to determine at least one sample data item in the pending data according to a predetermined sampling interval, and to determine the data segment in which each sample data item is located according to the first index information; and a selection unit 32, configured to select the sample data from the data segment in which each sample data item is located, record the correspondence between each sample data item and the data segment in which it is located, and use the selected sample data to build a data subset for global browsing, the data volume of the data subset being not greater than the predetermined threshold.
A determination module 4, configured to obtain a selection operation on the sample data in the data subset and determine the sample data selected by the selection operation.
A first extraction module 5, configured to extract and display, according to the correspondence between each sample data item and the data segment in which it is located, the data in the data segment corresponding to the sample data selected by the selection operation.
A second acquisition module 6, configured to obtain second index information of each data segment.
A second division module 7, configured to divide each data segment into multiple sub-data segments according to the second index information, the second index information being used to locate the data in each sub-data segment.
A second extraction module 8, configured to obtain a second selection operation on the data in the sub-data segments and, according to the second index information, extract and display the data in the sub-data segment corresponding to the data selected by the second selection operation.
The data processing device provided by the embodiment of the present application can adopt the data processing method of the above method embodiments, which is not repeated here.
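The second division module 7 and the second extraction module 8 can be sketched as follows. This is a minimal illustration, assuming that the second index information is the absolute record position within the pending data and that the sub-data segments have a fixed size; the embodiments do not prescribe how sub-data segments are sized, so `sub_size` and both function names are hypothetical.

```python
from typing import List, Sequence, Tuple


def divide_segment(segment: Tuple[int, int], sub_size: int) -> List[Tuple[int, int]]:
    """Divide one data segment (start, end exclusive) into sub-data segments."""
    if sub_size <= 0:
        raise ValueError("sub_size must be positive")
    start, end = segment
    return [(s, min(s + sub_size, end)) for s in range(start, end, sub_size)]


def extract_sub_segment(data: Sequence, sub_segments: List[Tuple[int, int]],
                        selected_pos: int) -> list:
    """Return all records of the sub-data segment containing the selected position."""
    for s, e in sub_segments:
        if s <= selected_pos < e:
            return list(data[s:e])
    raise KeyError(f"position {selected_pos} is not in any sub-data segment")


data = list(range(100, 110))
sub_segments = divide_segment((4, 8), sub_size=2)

# The second selection operation picks the record at position 7.
detail = extract_sub_segment(data, sub_segments, selected_pos=7)
```

The same range-based lookup used for data segments is reused one level down, which is how the browsing range is narrowed step by step in this sketch.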
With the data processing method and device provided by the present application, pending data is obtained and it is judged whether the data volume of the pending data is greater than a predetermined threshold; if the data volume of the pending data is greater than the predetermined threshold, the pending data is divided into multiple data segments, the data volume of each data segment being not greater than the predetermined threshold; and at least one sample data item is selected from the multiple data segments, and the selected sample data is used to build a data subset for global browsing, the data volume of the data subset being not greater than the predetermined threshold. In this way, massive pending data can be divided into multiple small data segments, and multiple sample data items are then selected from the data segments to form a data subset. Because the data volume of each data segment and of the data subset is not greater than the predetermined threshold, both can be browsed and viewed with existing data viewing software, so that browsing the constructed data subset realizes global browsing of the massive data.
For convenience of description, the above device is described with its functions divided into various units. Of course, when implementing the present application, the functions of the units may be realized in one or more pieces of software and/or hardware.
The embodiments in this specification are described in a progressive manner; identical or similar parts of the embodiments may be referred to one another, and each embodiment focuses on its differences from the other embodiments. In particular, the device or system embodiments are described relatively briefly because they are substantially similar to the method embodiments; for relevant details, refer to the description of the method embodiments. The device and system embodiments described above are merely illustrative: the units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units, that is, they may be located in one place or distributed over multiple network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement it without creative effort.
Those skilled in the art will further appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of the two. To clearly illustrate the interchangeability of hardware and software, the composition and steps of each example have been described above generally in terms of function. Whether these functions are performed in hardware or software depends on the particular application and the design constraints of the technical solution. Skilled persons may use different methods to implement the described functions for each particular application, but such implementation should not be considered beyond the scope of the present invention.
The steps of the method or algorithm described in connection with the embodiments disclosed herein may be implemented directly in hardware, in a software module executed by a processor, or in a combination of the two. The software module may reside in random access memory (RAM), memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, a register, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the technical field.
The above description of the disclosed embodiments enables those skilled in the art to make or use the present invention. Various modifications to these embodiments will be apparent to those skilled in the art, and the general principles defined herein may be implemented in other embodiments without departing from the spirit or scope of the present invention. Therefore, the present invention is not to be limited to the embodiments shown herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.