CN103106253B - A data balancing method based on a genetic algorithm in the MapReduce computation model - Google Patents


Publication number
CN103106253B
Authority
CN
China
Prior art keywords
gene
map
task
data
reduce
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201310015988.4A
Other languages
Chinese (zh)
Other versions
CN103106253A (en)
Inventor
伍卫国
樊源泉
魏伟
朱霍
高颜
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xian Jiaotong University
Original Assignee
Xian Jiaotong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date

Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

A data balancing method based on a genetic algorithm in the MapReduce computation model. The method first obtains the global Map output information and then uses a genetic algorithm for combinatorial optimization. The metadata is collected and encoded, and the population is randomly divided multiple times, each division forming one gene and the divisions together forming a genome. The fitness function value over all subsets of each gene is calculated, together with the probability derived from it. Based on this fitness evaluation, a selection operator is applied to the genome: a roulette wheel algorithm randomly selects several good genes, which then undergo crossover followed by mutation. After multiple rounds of evolution, the genes to retain are chosen according to an elitist retention strategy and decoded, yielding an optimized combination of the metadata in which the data volume handled by each reducer is approximately equal. The invention addresses the imbalance of input data in the Reduce stage, saving computing resources and reducing computation cost.

Description

A data balancing method based on a genetic algorithm in the MapReduce computation model
Technical field
The invention belongs to the technical field of the MapReduce computation model, and specifically relates to a data balancing method based on a genetic algorithm in the MapReduce computation model.
Background technology
Hadoop is an open-source distributed storage and parallel computing platform with high reliability and good scalability, developed by the Apache open-source organization. It began as the underlying platform of the open-source search engine project Nutch, later became independent of Nutch, and is now one of the typical open-source cloud computing platforms. The Hadoop core implements a block-based distributed file system (Hadoop Distributed File System, HDFS) and the MapReduce computation model for distributed computing.
The MapReduce computation model is divided into two major processing stages, Map and Reduce. During MapReduce processing, the Map stage converts the input data into <Key, Value> key-value pairs and provides them to the Reduce stage for further processing. Before a reduce task receives and processes the key-value pairs output by the Map stage, the data must pass through a Shuffle stage. The Shuffle stage shuffles the output of each Map task and collects, from those outputs, the data that must be processed by the same reduce task. Because the collected data may be large, the Shuffle stage may merge it and store it in the local file system of the node where the reduce task runs, thereby reducing memory usage.
Each Map task divides its output data into as many partitions as there are reduce tasks. A single reduce task collects its corresponding partition from all Map tasks, and all Map output key-value pairs with the same key are assigned to the same reduce task. This ensures that the final result of each reduce task is computed over the global data.
This characteristic of the Shuffle stage means that the data volumes received by the reduce tasks may be extremely unbalanced, which causes computation skew in the Reduce stage.
1) Reduce computation skew caused by user-defined partitioning strategies
When a MapReduce job is submitted, the Map stage divides its output into partitions according to the specified partitioning strategy, establishing the correspondence between Map output and reduce input. A user-defined partitioning strategy places related data in the same partition according to the needs of the application so that the same reduce task processes it. This ensures the correctness of the final result, but it may also leave the reduce tasks with unbalanced data volumes.
When a MapReduce job does not care about the specific partition of the data, hash partitioning is usually adopted so that the Map output can be partitioned quickly. The hash value of the Key determines the partition number of the whole <Key, Value> pair, that is, partitionNum = hashCode(Key) % REDUCER_NUM. Limited by hash collisions and by the small number of reduce tasks, this method may well send a large number of keys to the same partition, unbalancing the data volume across reduce tasks.
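The hash partitioning rule above can be illustrated with a minimal Python sketch. It is not taken from the patent: the helper names `java_string_hash`, `partition_num` and `partition_loads` are hypothetical, the hash imitates Java's `String.hashCode()` for simple strings, and the sign bit is masked so the partition number is never negative, which is an assumption about how the modulo is kept in range.

```python
REDUCER_NUM = 4  # hypothetical number of reduce tasks


def java_string_hash(key: str) -> int:
    """Imitate Java's String.hashCode() as a signed 32-bit integer."""
    h = 0
    for ch in key:
        h = (31 * h + ord(ch)) & 0xFFFFFFFF
    return h - 0x100000000 if h >= 0x80000000 else h


def partition_num(key: str, reducers: int = REDUCER_NUM) -> int:
    """partitionNum = hashCode(Key) % REDUCER_NUM, with the sign bit masked off."""
    return (java_string_hash(key) & 0x7FFFFFFF) % reducers


def partition_loads(records, reducers: int = REDUCER_NUM):
    """Total the value sizes that land in each partition.

    `records` is an iterable of (key, value_size) pairs.
    """
    loads = [0] * reducers
    for key, size in records:
        loads[partition_num(key, reducers)] += size
    return loads
```

Calling `partition_loads` on records in which one key dominates shows the skew directly: every record of that key falls into a single partition, however many reducers are configured.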
2) Reduce computation skew caused by the characteristics of the input data
Because partitioning is performed as each <Key, Value> pair is output by a Map task, the partition is often determined from some feature of the Key, without global statistics on the size of the Value data associated with each Key. Therefore, even if the partitioning strategy keeps the number of keys in each partition roughly balanced, the characteristics of the Map input data may make the Value data for certain keys much larger than for others, so that some reduce tasks must process an excessive amount of data. This typically occurs when the input data contains hot-spot data. In general, skewed input in the Reduce stage makes some reduce tasks run longer than others, which prolongs the whole Reduce stage and ultimately delays completion of the whole MapReduce job.
Summary of the invention
To overcome the shortcomings of the prior art described above, the object of the invention is to provide a data balancing method based on a genetic algorithm in the MapReduce computation model. The method reduces the processing time of the reducer tasks and hence of the whole MapReduce job, saving computing resources and reducing computation cost.
To achieve the above object, the technical solution adopted by the invention is as follows:
A data balancing method based on a genetic algorithm in the MapReduce computation model, comprising the following steps:
1) Obtain the global Map output information, that is, the metadata of the partitions to be processed by the reduce tasks. The Reduce metadata is acquired as follows:
1.1 After each Map task finishes processing and writes its output to local disk, it sends a task-completion message to the JobTracker through the TaskTracker, using the heartbeat message;
1.2 The JobTracker maintains a Map-task-completion message queue for each MapReduce job. When a TaskTracker running a reduce task requests Map task completion messages, the JobTracker takes messages from the queue corresponding to the job to which the reduce task belongs and passes them to the TaskTracker;
1.3 A reduce task obtains the completion messages of Map tasks in the same job from its TaskTracker and extracts the Map tasks' runtime information, including the Map task number and the execution node. Using this information, the reduce task establishes an HTTP connection to the execution node and requests the metadata of the Map task's output;
1.4 The TaskTracker reads the index file of the corresponding Map task's output from the local file system according to the requested Map task number and sends it to the requesting reduce task;
1.5 The reduce task merges the virtual partitions with the same number across the different index files and totals the data volume of all <Key, Value> pairs of the same kind in each virtual partition, since each reduce task obtains the metadata of all Map task outputs;
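The aggregation in step 1.5 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: each index file is modeled as a plain dictionary from virtual partition number to byte count, which is a hypothetical simplification and not the actual Hadoop index file format, and `merge_partition_metadata` is a name introduced here.

```python
from collections import defaultdict


def merge_partition_metadata(index_files):
    """Sum the data volume of equally numbered virtual partitions.

    `index_files` is an iterable with one dict per Map task, each mapping
    a virtual partition number to the size of that partition's output.
    """
    totals = defaultdict(int)
    for index in index_files:
        for partition_id, size in index.items():
            totals[partition_id] += size
    # Return a plain dict ordered by partition number for readability.
    return dict(sorted(totals.items()))
```

The resulting per-partition totals are the metadata that the repartitioning step would then try to balance across reducers.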
2) Process the Map output data. The reduce tasks obtain the original partition data output by each map task, and the aggregated metadata is submitted to a repartitioner, which uses a genetic algorithm to balance the metadata. The genetic algorithm operates on binary bit strings. The specific steps are as follows:
2.1 The metadata of the Map output data is collected into one set, which serves as the population, and each element of the population is encoded. Encoding means representing each element by a code made up of 0s and 1s; the encoding adopted uses the number of 1s to represent the index of the subset in which the element is placed. The population is randomly divided into N subsets, where N corresponds to the number of reduce tasks. Each such division forms one gene, and repeated divisions form a genome;
2.2 In a genetic algorithm, the fitness function measures how well an individual is adapted to its environment: fitter individuals obtain more opportunities to reproduce, and less fit individuals fewer. The following fitness function is therefore defined:
$$\min \left\{ \sum_{j=1}^{n} \left| S_j - \bar{S} \right| \right\} / n \qquad \text{Formula (1)}$$
where $\bar{S}$ is the mean of the element sums of all subsets. The objective function in formula (1) describes the average distance of the subset sums from their mean. Using formula (1), the fitness function value of each gene is calculated, forming a new set, and then the probability of each gene's fitness is obtained, that is, the fitness function value of the gene divided by the sum of the fitness function values over the whole genome;
2.3 A selection operator is applied to the genome. The selection operator adopted is roulette wheel selection: a random function generates a random number in [0, 1], and its position in the sequence of fitness probabilities of the genome is determined. If the number is greater than at most m values in the sequence, the m-th gene is selected. The number of genes to select can be specified freely;
2.4 Crossover is performed on the selected genes, replacing and recombining parts of the structure of good genes to form new genes. A single-point crossover operator is used: a crossover point is set at random, and for the genes chosen by the roulette wheel selection algorithm, the partial structures of two genes before and after the crossover point are exchanged, producing two new individuals. To guarantee that the genome after the exchange contains no empty subset, a nullGen flag is set and the genome after crossover is traversed; if an empty subset is found, the nullGen flag is set to false, marking the gene for deletion;
2.5 Mutation is performed on the genes after crossover. Mutation replaces certain genes in the genome with other genes according to the mutation probability, forming new individuals. A fixed-bit mutation operator is used, with the mutation probability set to 0.1, in order to obtain the optimal solution. Fixed-bit mutation means mutating one or several fixed, designated bits of an individual gene: a bit that was 0 becomes 1, and a bit that was 1 becomes 0. After mutation, a non-empty check is performed on the mutated genes to ensure that each gene still has N subsets;
2.6 The steps above describe one round of evolution. After multiple rounds, the genes to retain are chosen according to an elitist retention strategy: after the above steps, the objective function value of each gene is calculated and compared with the objective function values of all genes in the genome, and the genes whose value is smaller are retained;
2.7 The retained gene is decoded to obtain an optimized combination of the metadata, that is, the metadata is divided into N subsets of roughly equal size. The data corresponding to each subset is then assigned to a reducer, which ensures that the data volume handled by each reducer is approximately equal.
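The representation and fitness evaluation in steps 2.1, 2.2 and 2.7 can be sketched as follows. This is a simplified illustration, not the patented implementation: instead of the bit-string encoding described above, a gene is modeled as a list of integer subset labels (gene[i] is the subset that partition i belongs to), and all function names are hypothetical.

```python
import random


def random_gene(num_partitions, num_reducers, rng):
    """Randomly divide the partitions into num_reducers non-empty subsets."""
    if num_partitions < num_reducers:
        raise ValueError("need at least one partition per reducer")
    # Seed every subset with one partition, then assign the rest at random.
    gene = list(range(num_reducers))
    gene += [rng.randrange(num_reducers) for _ in range(num_partitions - num_reducers)]
    rng.shuffle(gene)
    return gene


def subset_sums(gene, sizes, num_reducers):
    """Total data volume assigned to each subset."""
    sums = [0] * num_reducers
    for partition, subset in enumerate(gene):
        sums[subset] += sizes[partition]
    return sums


def fitness(gene, sizes, num_reducers):
    """Average absolute distance of the subset sums from their mean (lower is better)."""
    sums = subset_sums(gene, sizes, num_reducers)
    mean = sum(sums) / num_reducers
    return sum(abs(s - mean) for s in sums) / num_reducers


def decode(gene, num_reducers):
    """Turn a gene back into the list of partitions assigned to each reducer."""
    subsets = [[] for _ in range(num_reducers)]
    for partition, subset in enumerate(gene):
        subsets[subset].append(partition)
    return subsets
```

For partition sizes [4, 1, 3, 2] and two reducers, the gene [0, 1, 1, 0] gives subset sums 6 and 4 and a fitness of 1.0, while [0, 0, 1, 1] gives sums 5 and 5 and a fitness of 0, the balanced division the method is looking for.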
The beneficial effects of the invention are:
The invention proposes a solution to the computation skew that exists in the Reduce stage on the MapReduce platform. The method repartitions the Map-stage output data using a genetic algorithm so that the data volume of each partition is roughly the same. Reduce tasks thus use system resources more efficiently, and the inconsistent processing times caused by unbalanced reducer input volumes are avoided, which reduces the processing time of the reducer tasks and hence of the whole MapReduce job. From a business perspective, the method saves computing resources and reduces computation cost.
Brief description of the drawings
Fig. 1 is a flow chart of Reduce metadata acquisition.
Fig. 2 is a class diagram of the Map output metadata acquisition module.
Fig. 3 is a flow chart of the data balancing method based on a genetic algorithm.
Detailed description of the invention
The invention is described in detail below with reference to the accompanying drawings.
A data balancing method based on a genetic algorithm in the MapReduce computation model, comprising the following steps:
1) Obtain the global Map output information, that is, the metadata of the partitions to be processed by the reduce tasks. The Reduce metadata acquisition process is shown in Fig. 1:
1.1 After each Map task finishes processing and writes its output to local disk, it sends a task-completion message to the JobTracker through the TaskTracker, using the heartbeat message;
1.2 The JobTracker maintains a Map-task-completion message queue for each MapReduce job. When a TaskTracker running a reduce task requests Map task completion messages, the JobTracker takes messages from the queue corresponding to the job to which the reduce task belongs and passes them to the TaskTracker;
1.3 A reduce task obtains the completion messages of Map tasks in the same job from its TaskTracker and extracts the Map tasks' runtime information, including the Map task number and the execution node. Using this information, the reduce task establishes an HTTP connection to the execution node and requests the metadata of the Map task's output;
1.4 The TaskTracker reads the index file of the corresponding Map task's output from the local file system according to the requested Map task number and sends it to the requesting reduce task;
1.5 The reduce task merges the virtual partitions with the same number across the different index files and totals the data volume of all <Key, Value> pairs of the same kind in each virtual partition, since each reduce task obtains the metadata of all map task outputs. In practice the number of map tasks is usually large and the tasks are distributed over multiple computing nodes, so to improve efficiency and speed up metadata acquisition, an implementation may use multithreading for this process. The main class structure of the Map output metadata acquisition module is shown in Fig. 2;
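The multithreaded acquisition mentioned in step 1.5 could look roughly like the following Python sketch. It is an assumption-laden illustration and not the module of Fig. 2: `fetch_all_metadata` is a hypothetical name, and `fetch_index` stands in for whatever call retrieves one Map task's index metadata over HTTP, so that the sketch stays independent of any real Hadoop API.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_all_metadata(map_task_ids, fetch_index, max_workers=8):
    """Fetch the index metadata of many Map tasks concurrently.

    `fetch_index` is a caller-supplied function (hypothetical here) that
    takes a Map task id and returns that task's partition metadata.
    """
    task_ids = list(map_task_ids)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so results line up with task_ids.
        results = list(pool.map(fetch_index, task_ids))
    return dict(zip(task_ids, results))
```

A thread pool suits this step because each request mostly waits on network I/O; the number of workers is a tuning parameter chosen here arbitrarily.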
2) Process the Map output data. The reduce tasks obtain the original partition data output by each map task, and the aggregated metadata is submitted to a repartitioner. So that the input data volumes obtained by the reducers are basically the same, the invention uses a genetic algorithm to balance the metadata. The genetic algorithm operates on binary bit strings rather than on the data itself. The specific steps are as follows:
2.1 The metadata of the Map output data is collected into one set, which serves as the population, and each element of the population is encoded. Encoding means representing each element by a code made up of 0s and 1s; the encoding adopted by the invention uses the number of 1s to represent the index of the subset in which the element is placed. The population is randomly divided into N subsets, where N corresponds to the number of reduce tasks. Each such division forms one gene, and repeated divisions form a genome;
2.2 In a genetic algorithm, the fitness function measures how well an individual is adapted to its environment: fitter individuals obtain more opportunities to reproduce, and less fit individuals fewer. The invention therefore defines the following fitness function:
$$\min \left\{ \sum_{j=1}^{n} \left| S_j - \bar{S} \right| \right\} / n \qquad \text{Formula (1)}$$
where $\bar{S}$ is the mean of the element sums of all subsets. The objective function in formula (1) describes the average distance of the subset sums from their mean. Using formula (1), the fitness function value of each gene is calculated, forming a new set, and then the probability of each gene's fitness is obtained, that is, the fitness function value of the gene divided by the sum of the fitness function values over the whole genome;
2.3 A selection operator is applied to the genome. The selection operator adopted by the invention is roulette wheel selection, a commonly used random selection method that resembles the roulette wheel in gambling. Its main idea is to convert each individual's fitness proportionally into a selection probability and to divide a disk into sectors according to each individual's share. The disk is spun, and the individual whose sector the pointer rests on when the disk stops is the one selected. The benefit of this selection algorithm is that the larger an individual's probability, the larger the share of the disk it occupies and the more likely it is to be selected. Based on this idea, the invention is implemented as follows: a random function generates a random number in [0, 1], and its position in the sequence of fitness probabilities of the genome is determined. If the number is greater than at most m values in the sequence, the m-th gene is selected. In general, the number of genes to select can be specified freely;
2.4 Crossover is performed on the selected genes, replacing and recombining parts of the structure of good genes to form new genes. Crossover is an important feature that distinguishes genetic algorithms from other evolutionary algorithms. The invention uses a single-point crossover operator: a crossover point is set at random, and for the genes chosen by the roulette wheel selection algorithm, the partial structures of two genes before and after the crossover point are exchanged, producing two new individuals. To guarantee that the genome after the exchange contains no empty subset, a nullGen flag is set and the genome after crossover is traversed; if an empty subset is found, the nullGen flag is set to false, marking the gene for deletion;
2.5 Mutation is performed on the genes after crossover. Mutation replaces certain genes in the genome with other genes according to the mutation probability, forming new individuals. A genetic algorithm introduces mutation for two purposes. The first is to give the algorithm a local random search capability: when crossover has brought the algorithm close to the neighborhood of the optimal solution, the local random search of the mutation operator can accelerate convergence to the optimum. In this case the mutation probability should clearly take a small value, otherwise the building blocks close to the optimal solution would be destroyed by mutation. The second is to let the algorithm maintain population diversity and so prevent premature convergence; in this case the convergence probability should take a larger value. Based on these considerations, the invention uses a fixed-bit mutation operator with the mutation probability set to 0.1, in order to obtain the optimal solution. Fixed-bit mutation means mutating one or several fixed, designated bits of an individual gene: a bit that was 0 becomes 1, and a bit that was 1 becomes 0. After mutation, a non-empty check is performed on the mutated genes to ensure that each gene still has N subsets;
2.6 The steps above describe one round of evolution. After multiple rounds, the genes to retain are chosen according to an elitist retention strategy. The retention strategy adopted by the invention is: after the above steps, the objective function value of each gene is calculated and compared with the objective function values of all genes in the genome, and the genes whose value is smaller are retained;
2.7 The retained gene is decoded to obtain an optimized combination of the metadata, that is, the metadata is divided into N subsets of roughly equal size. The data corresponding to each subset is then assigned to a reducer, which ensures that the data volume handled by each reducer is comparable and effectively solves the problem of skewed input data in the Reduce stage. The flow chart of the data balancing method based on a genetic algorithm in the MapReduce computation model is shown in Fig. 3.
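One round of evolution as described in steps 2.3 to 2.6 can be sketched in Python as below. This is a hedged illustration under several assumptions that differ from the text above: genes are lists of integer subset labels rather than bit strings, mutation reassigns a position to a random subset instead of flipping a bit, roulette weights are inverted fitness values so that better-balanced genes occupy larger sectors, and a child with an empty subset is replaced by its parent rather than deleted. All function names are hypothetical.

```python
import random
from bisect import bisect_left
from itertools import accumulate


def fitness(gene, sizes, n):
    """Mean absolute deviation of the subset sums from their mean (lower is better)."""
    sums = [0] * n
    for i, subset in enumerate(gene):
        sums[subset] += sizes[i]
    mean = sum(sums) / n
    return sum(abs(s - mean) for s in sums) / n


def has_empty_subset(gene, n):
    """Non-empty check: every subset label 0..n-1 must appear in the gene."""
    return len(set(gene)) < n


def roulette_select(population, scores, k, rng):
    """Pick k genes by locating random numbers in a cumulative weight sequence."""
    # Assumption: weights are inverted scores, so balanced genes get larger sectors.
    cumulative = list(accumulate(1.0 / (s + 1e-9) for s in scores))
    picks = []
    for _ in range(k):
        r = rng.random() * cumulative[-1]
        idx = min(bisect_left(cumulative, r), len(population) - 1)
        picks.append(population[idx])
    return picks


def single_point_crossover(a, b, n, rng):
    """Swap the tails of two genes at a random point; reject children with empty subsets."""
    point = rng.randrange(1, len(a))
    children = (a[:point] + b[point:], b[:point] + a[point:])
    # Assumption: a rejected child is replaced by its parent to keep the population size.
    return [p if has_empty_subset(c, n) else c for c, p in zip(children, (a, b))]


def mutate(gene, n, rng, rate=0.1):
    """Reassign each position with probability `rate`; keep the original if a subset empties."""
    child = [rng.randrange(n) if rng.random() < rate else g for g in gene]
    return gene if has_empty_subset(child, n) else child


def evolve(sizes, n, pop_size=20, rounds=50, seed=0):
    """Run several rounds of selection, crossover and mutation, retaining the best gene."""
    if pop_size % 2 or len(sizes) < max(n, 2):
        raise ValueError("need an even pop_size and at least max(n, 2) partitions")
    rng = random.Random(seed)
    population = []
    for _ in range(pop_size):
        gene = list(range(n)) + [rng.randrange(n) for _ in range(len(sizes) - n)]
        rng.shuffle(gene)
        population.append(gene)
    best = min(population, key=lambda g: fitness(g, sizes, n))
    for _ in range(rounds):
        scores = [fitness(g, sizes, n) for g in population]
        parents = roulette_select(population, scores, pop_size, rng)
        population = []
        for a, b in zip(parents[::2], parents[1::2]):
            for child in single_point_crossover(a, b, n, rng):
                population.append(mutate(child, n, rng))
        challenger = min(population, key=lambda g: fitness(g, sizes, n))
        if fitness(challenger, sizes, n) < fitness(best, sizes, n):
            best = challenger
        else:
            population[0] = best  # elitism: the best gene so far survives the round
    return best
```

Calling `evolve(sizes, n)` with the aggregated partition sizes and the number of reducers returns one assignment of partitions to reducers; the population size, number of rounds and seed are arbitrary illustrative defaults.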

Claims (1)

1. A data balancing method based on a genetic algorithm in the MapReduce computation model, characterized by comprising the following steps:
1) obtaining the global Map output information, that is, the metadata of the partitions to be processed by the Reduce tasks, the Reduce metadata being acquired as follows:
1.1 after each Map task finishes processing and writes its output to local disk, it sends a task-completion message to the JobTracker through the TaskTracker, using the heartbeat message;
1.2 the JobTracker maintains a Map-task-completion message queue for each MapReduce job; when a TaskTracker running a Reduce task requests Map task completion messages, the JobTracker takes messages from the queue corresponding to the job to which the Reduce task belongs and passes them to the TaskTracker;
1.3 a Reduce task obtains the completion messages of Map tasks in the same job from its TaskTracker and extracts the Map tasks' runtime information, including the Map task number and the execution node; using this information, the Reduce task establishes an HTTP connection to the execution node and requests the metadata of the Map task's output;
1.4 the TaskTracker reads the index file of the corresponding Map task's output from the local file system according to the requested Map task number and sends it to the requesting Reduce task;
1.5 the Reduce task merges the virtual partitions with the same number across the different index files and totals the data volume of all <Key, Value> pairs of the same kind in each virtual partition, each Reduce task obtaining the metadata of all Map task outputs;
2) processing the Map output data, wherein the Reduce tasks obtain the original partition data output by each Map task, the aggregated metadata is submitted to a repartitioner, and a genetic algorithm, which operates on binary bit strings, is used to balance the metadata, the specific steps being as follows:
2.1 the metadata of the Map output data is collected into one set, which serves as the population, and each element of the population is encoded; encoding means representing each element by a code made up of 0s and 1s, and the encoding adopted by the invention uses the number of 1s to represent the index of the subset in which the element is placed; the population is randomly divided into N subsets, where N corresponds to the number of Reduce tasks; each such division forms one gene, and repeated divisions form a genome;
2.2 in a genetic algorithm, the fitness function measures how well an individual is adapted to its environment: fitter individuals obtain more opportunities to reproduce, and less fit individuals fewer; the following fitness function is therefore defined:
$$\min \left\{ \sum_{j=1}^{n} \left| S_j - \bar{S} \right| \right\} / n \qquad \text{Formula (1)}$$
where $\bar{S}$ is the mean of the element sums of all subsets and min denotes taking the minimum value; the objective function in formula (1) describes the average distance of the subset sums from their mean; using formula (1), the fitness function value of each gene is calculated, forming a new set, and then the probability of each gene's fitness is obtained, that is, the fitness function value of the gene divided by the sum of the fitness function values over the whole genome;
2.3 a selection operator is applied to the genome, the selection operator adopted being roulette wheel selection: a random function generates a random number in [0, 1], and its position in the sequence of fitness probabilities of the genome is determined; if the number is greater than at most m values in the sequence, the m-th gene is selected; the number of genes to select is specified freely;
2.4 crossover is performed on the selected genes, replacing and recombining parts of the structure of good genes to form new genes; a single-point crossover operator is used: a crossover point is set at random, and for the genes chosen by the roulette wheel selection algorithm, the partial structures of two genes before and after the crossover point are exchanged, producing two new individuals; to guarantee that the genome after the exchange contains no empty subset, a nullGen flag is set and the genome after crossover is traversed; if an empty subset is found, the nullGen flag is set to false, marking the gene for deletion;
2.5 mutation is performed on the genes after crossover; mutation replaces certain genes in the genome with other genes according to the mutation probability, forming new individuals; a fixed-bit mutation operator is used, with the mutation probability set to 0.1, in order to obtain the optimal solution; fixed-bit mutation means mutating one or several fixed, designated bits of an individual gene: a bit that was 0 becomes 1, and a bit that was 1 becomes 0; after mutation, a non-empty check is performed on the mutated genes to ensure that each gene still has N subsets;
2.6 the steps above describe one round of evolution; after multiple rounds, the genes to retain are chosen according to an elitist retention strategy: after the above steps, the objective function value of each gene is calculated and compared with the objective function values of all genes in the genome, and the genes whose value is smaller are retained;
2.7 the retained gene is decoded to obtain an optimized combination of the metadata, that is, the metadata is divided into N subsets of roughly equal size; the data corresponding to each subset is then assigned to a reducer, which ensures that the data volume handled by each reducer is comparable.
CN201310015988.4A 2013-01-16 2013-01-16 A kind of data balancing method based on genetic algorithm in MapReduce computation model Expired - Fee Related CN103106253B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310015988.4A CN103106253B (en) 2013-01-16 2013-01-16 A kind of data balancing method based on genetic algorithm in MapReduce computation model

Publications (2)

Publication Number Publication Date
CN103106253A CN103106253A (en) 2013-05-15
CN103106253B true CN103106253B (en) 2016-05-04

Family

ID=48314108

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310015988.4A Expired - Fee Related CN103106253B (en) 2013-01-16 2013-01-16 A kind of data balancing method based on genetic algorithm in MapReduce computation model

Country Status (1)

Country Link
CN (1) CN103106253B (en)

Families Citing this family (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103401626B (en) * 2013-08-23 2016-03-16 西安电子科技大学 Based on the collaborative spectrum sensing optimization method of genetic algorithm
CN104102707B (en) * 2014-07-10 2016-03-30 西安交通大学 A kind of geographical attaching information querying method towards MapReduce framework
CN104239529A (en) * 2014-09-19 2014-12-24 浪潮(北京)电子信息产业有限公司 Method and device for preventing Hive data from being inclined
CN105260324B (en) * 2015-10-14 2018-12-07 北京百度网讯科技有限公司 Key-value pair data operating method and device for distributed cache system
CN113098773B (en) * 2018-03-05 2022-12-30 华为技术有限公司 Data processing method, device and system
CN110032559A (en) * 2019-04-19 2019-07-19 成都四方伟业软件股份有限公司 A kind of data pick-up method and device
CN110109753A (en) * 2019-04-25 2019-08-09 成都信息工程大学 Resource regulating method and system based on various dimensions constraint genetic algorithm
CN111104225A (en) * 2019-12-23 2020-05-05 杭州安恒信息技术股份有限公司 Data processing method, device, equipment and medium based on MapReduce
CN112307008B (en) * 2020-12-14 2023-12-08 湖南蚁坊软件股份有限公司 Druid compacting method
CN112769522B (en) * 2021-01-20 2022-04-19 广西师范大学 Partition structure-based encoding distributed computing method
CN113434299B (en) * 2021-07-05 2024-02-06 广西师范大学 Coding distributed computing method based on MapReduce framework

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101183368A (en) * 2007-12-06 2008-05-21 华南理工大学 Method and system for distributed calculating and enquiring magnanimity data in on-line analysis processing
CN101764835A (en) * 2008-12-25 2010-06-30 华为技术有限公司 Task allocation method and device based on MapReduce programming framework

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102141995B (en) * 2010-01-29 2013-06-12 国际商业机器公司 System and method for simplifying transmission in parallel computing system

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
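"Research on a parallel genetic algorithm suitable for large-scale variables" (《一种适用于大规模变量的并行遗传算法研究》); Li Dong (李东) et al.; Computer Science (《计算机科学》); 2012-07-31; Vol. 39, No. 7; pp. 182-184, 204 *
"An improved Map-Reduce model in the cloud computing environment" (《云计算环境下的改进型Map-Reduce模型》); Li Zhen (李震) et al.; Computer Engineering (《计算机工程》); 2012-06-30; Vol. 38, No. 11; pp. 27-29 *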

Also Published As

Publication number Publication date
CN103106253A (en) 2013-05-15

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20160504

Termination date: 20220116