CN103488775A - Computing system and computing method for big data processing - Google Patents

Computing system and computing method for big data processing Download PDF

Info

Publication number
CN103488775A
CN103488775A CN201310455174.2A CN103488775B
Authority
CN
China
Prior art keywords
computation model
module
computation
computing
message
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201310455174.2A
Other languages
Chinese (zh)
Other versions
CN103488775B (en)
Inventor
王鹏
韩冀中
王伟平
孟丹
张云
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute of Information Engineering of CAS
Original Assignee
Institute of Information Engineering of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institute of Information Engineering of CAS filed Critical Institute of Information Engineering of CAS
Priority to CN201310455174.2A priority Critical patent/CN103488775B/en
Publication of CN103488775A publication Critical patent/CN103488775A/en
Application granted granted Critical
Publication of CN103488775B publication Critical patent/CN103488775B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10File systems; File servers
    • G06F16/18File system types
    • G06F16/182Distributed file systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to a computing system and computing method for big data processing. The computing system comprises, from bottom to top, a bottom-layer module, a middle-layer module and a top-layer module; the middle-layer module in turn comprises a message-transmission module and computation-model modules. The bottom-layer module adopts the Hadoop Distributed File System (HDFS) and is used for storing data. The message-transmission module transmits messages between computation-model modules running on different computing nodes. Guided by these messages, the computation-model modules on different nodes work cooperatively, each building a computation model of a particular type to process the data. The top-layer module provides a programming interface for each computation model, combines the computations expressed by the different models in serial order, and at the same time configures the models to share data through an in-memory pipeline. With this computing system and method, application programs can be written against multiple computation models within a single system, allowing more complex problems to be solved.

Description

A computing system and computing method for big data processing
Technical field
The present invention relates to the field of big data processing, and in particular to a computing system and computing method based on hybrid programming.
Background technology
In recent years, with the rapid development of social informatization, data have grown explosively in scientific research, industrial production, commerce and the Internet alike. In many applications, data volumes are rapidly developing from the TB (terabyte) level to the PB (petabyte) level or even higher orders of magnitude, and computation frameworks oriented to big data processing have become a hot topic. The open-source Hadoop system is currently widely applied in industry. Although the MapReduce model provided by Hadoop is simple and easy to use, the model has limitations and its expressive power is limited: when solving hard problems such as iterative computation and graph analysis, it is difficult to map the algorithm onto the MapReduce model, the development workload is large and the resulting jobs run inefficiently. A variety of computation frameworks has therefore appeared — for example Dryad, Piccolo, Pregel and Spark — which have greatly enriched the means of big data processing. These frameworks, however, are oriented to particular problem domains, within which they solve problems quickly and efficiently; Pregel, for instance, targets large-scale graph computation and has clear advantages in applications such as web-link analysis, disease-spread paths and traffic-route optimization. With today's big data processing tasks growing ever more diverse, no single "universal" framework suits all application scenarios, and data-processing platforms that fuse multiple computation frameworks have become the trend of the times. On such a "unified" platform, the means of big data processing grow richer, problems become easier to solve, and processing efficiency keeps improving.
The current solution is to host multiple computation frameworks on one cluster through a resource management system; typical systems include Mesos and YARN. Such systems allow multiple frameworks to share the same cluster resources, but they also have obvious shortcomings, mainly in three respects: (1) the programming threshold is high, since the programmer must master multiple programming languages (e.g. C for MPI, Java for Hadoop), so usability is poor; (2) code cannot be reused across frameworks, even though code reuse is extremely important in modern software engineering, so development efficiency is low; (3) different jobs share data through a distributed file system (e.g. HDFS, the Hadoop Distributed File System), incurring a large amount of disk I/O overhead, so resources are wasted.
In summary, existing methods and systems that fuse multiple computation frameworks still suffer from poor usability, low development efficiency and wasted resources.
Summary of the invention
The technical problem to be solved by the present invention is to provide a computing system and computing method for big data processing, so as to overcome the limitations of the existing methods for fusing multiple computation frameworks.
The technical scheme by which the present invention solves the above technical problem is as follows: a computing system for big data processing, which runs on a plurality of computing nodes and comprises, from bottom to top, three layers of modules, namely a bottom-layer module, a middle-layer module and a top-layer module, the middle-layer module in turn comprising a message-transmission module and computation-model modules;
the bottom-layer module adopts HDFS and is used for storing data;
the message-transmission module transmits messages between computation-model modules running on different computing nodes;
the computation-model modules running on different computing nodes work cooperatively according to the messages delivered by the message-transmission module, each building a computation model of a particular type to process the data read from HDFS;
the top-layer module provides a corresponding programming interface for each particular type of computation model, combines the computations expressed by different computation models in serial order, and at the same time configures the different computation models to share data through an in-memory pipeline.
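The role described for the top-layer module — a common interface per computation model plus serial combination with in-memory hand-off — can be illustrated with a minimal Go sketch (Go being the language the embodiment below adopts). All names here (`Stage`, `Driver`, `upperStage`) are illustrative assumptions, not interfaces from the patent.

```go
package main

import (
	"fmt"
	"strings"
)

// Stage is one segmented calculation flow expressed in some computation model.
type Stage interface {
	Name() string
	// Run consumes the previous stage's output and returns this stage's
	// output; the real system would stream records through an in-memory
	// pipeline rather than pass a slice.
	Run(input []string) []string
}

// Driver combines the computations expressed by different models in serial.
type Driver struct{ stages []Stage }

func (d *Driver) Add(s Stage) { d.stages = append(d.stages, s) }

// Execute runs the stages back to back, handing each stage the previous
// stage's in-memory output.
func (d *Driver) Execute(input []string) []string {
	data := input
	for _, s := range d.stages {
		data = s.Run(data)
	}
	return data
}

// upperStage is a toy stage used only to exercise the interface.
type upperStage struct{}

func (upperStage) Name() string { return "upper" }
func (upperStage) Run(in []string) []string {
	out := make([]string, len(in))
	for i, s := range in {
		out[i] = strings.ToUpper(s)
	}
	return out
}

func main() {
	d := &Driver{}
	d.Add(upperStage{})
	fmt.Println(d.Execute([]string{"pagerank", "sort"}))
}
```

In this sketch a BSP stage and a DAG stage would simply be two more `Stage` implementations added to the same driver, in the serial order the application requires.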
On the basis of the above technical scheme, the present invention can be further improved as follows.
Further, the data stored by the bottom-layer module comprise an input data set, an intermediate-result data set and an output-result data set.
Further, the message-transmission module comprises a transmitter and a receiver; the transmitter receives messages from the computation-model module on the same computing node and sends them to the receiver of the designated computing node, while the receiver receives, from the network, the messages sent by the transmitters of other computing nodes and forwards them to the computation-model module on its own node.
Further, the messages exchanged among the message-transmission module, the computation-model module, the transmitter and the receiver comprise request messages and response messages.
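The request/response scheme above can be sketched in Go as follows; the type and function names (`Message`, `send`, `receive`) are hypothetical, and a buffered channel stands in for the network between a transmitter and a receiver.

```go
package main

import "fmt"

// MsgKind distinguishes the two message classes.
type MsgKind int

const (
	Request  MsgKind = iota // asks a peer to perform an operation
	Response                // reports whether the operation was executed
)

// Message is what the transmitter/receiver pair carries between nodes.
type Message struct {
	Kind MsgKind
	From string // source computing node
	To   string // destination computing node
	Body string // the operation to perform, or its execution status
}

// send models the transmitter: it forwards a message from the local
// computation-model module to the destination node's receiver.
func send(network chan<- Message, m Message) { network <- m }

// receive models the receiver: it takes a message off the network and
// hands it to the computation-model module on its own node.
func receive(network <-chan Message) Message { return <-network }

func main() {
	network := make(chan Message, 1)
	send(network, Message{Kind: Request, From: "node-1", To: "node-2", Body: "compute"})
	fmt.Println(receive(network).Body)
}
```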
Further, the computation-model module comprises a number of controllers, each adopting a computation model of a particular type, and processors that adopt the same computation model as their controller; the controller coordinates the execution flow of the computation model it adopts, while the processor receives the request messages sent by the controller, performs the data-processing operations the controller specifies, and reports response messages back to the controller; the processor also reads input data from HDFS and outputs computation results to HDFS.
Further, the top-layer module also provides a corresponding configuration for each particular type of computation model, including setting the input directory, the output directory, the number of computing nodes and the output-log directory; the input directory specifies the path of the data set to be processed, and the output directory specifies the storage path of the final computation result.
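The per-model configuration listed above might look like the following Go struct; all field and method names are assumptions for illustration, not the patent's API.

```go
package main

import (
	"errors"
	"fmt"
)

// JobConfig mirrors the configuration items named in the text.
type JobConfig struct {
	InputDir  string // path of the data set to be processed (on HDFS)
	OutputDir string // storage path of the final computation result
	NumNodes  int    // number of computing nodes
	LogDir    string // output-log directory
}

// Validate checks the fields a driver would minimally require.
func (c JobConfig) Validate() error {
	if c.InputDir == "" || c.OutputDir == "" {
		return errors.New("input and output directories must be set")
	}
	if c.NumNodes < 1 {
		return errors.New("at least one computing node is required")
	}
	return nil
}

func main() {
	cfg := JobConfig{InputDir: "/data/in", OutputDir: "/data/out", NumNodes: 4, LogDir: "/logs"}
	fmt.Println(cfg.Validate() == nil)
}
```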
Further, when the computations expressed by different computation models are combined in serial order, the top-layer module also provides a programming interface for serially combining multiple computation models.
Further, the different types of computation models comprise the BSP (Bulk Synchronous Parallel) computation model, the DAG (Directed Acyclic Graph) computation model, the GraphLab computation model and the Spark computation model. The DAG model is a comparatively general computation model, adopted for example by Microsoft's Dryad framework; the Pregel graph-computation framework proposed by Google derives from the BSP model; GraphLab is a computation model based on asynchronous execution; and Spark introduced the concept of resilient data sets, using in-memory computation to improve data-processing efficiency.
The technical scheme of the present invention also comprises a computing method for big data processing, which adopts the above computing system; its concrete steps comprise:
Step 1, upload the data set to be processed to HDFS;
Step 2, according to the application requirements of the data set, split its overall calculation flow into a plurality of serial segmented calculation flows;
Step 3, choose a computation model of a particular type for each segmented calculation flow, and use the provided programming interface to write the corresponding code for each model;
Step 4, combine the computations expressed by the different computation models in serial order, and at the same time configure the models to share data through an in-memory pipeline;
Step 5, supply the relevant configuration of the application program through the provided programming interface;
Step 6, execute the written code.
Further, the computation models comprise the BSP, DAG, GraphLab and Spark computation models.
The beneficial effects of the invention are as follows: its distinguishing feature is support for multiple computation models within one computing system, overcoming the limitation of current computation frameworks that support only a single model. Developers can write application programs with multiple computation models in one system and combine those computations together, thereby solving more complex problems. The proposed computing system supports the two mainstream computation models, DAG and BSP, and is also compatible with other models such as GraphLab and Spark.
Brief description of the drawings
Fig. 1 is a structural schematic diagram of the computing system for big data processing of the present invention;
Fig. 2 is a flow diagram of the computing method for big data processing of the present invention;
Fig. 3 is a comparison chart of the different data-sharing modes in an application example of the present invention.
In the drawings, the components denoted by the reference numerals are as follows:
1, bottom-layer module; 2, middle-layer module; 3, top-layer module; 21, message-transmission module; 22, computation-model module; 211, transmitter; 212, receiver; 221, controller; 222, processor.
Embodiment
The principles and features of the present invention are described below with reference to the drawings; the examples given serve only to explain the invention and are not intended to limit its scope.
As shown in Fig. 1, the present embodiment provides a computing system for big data processing. It runs on a plurality of computing nodes and comprises, from bottom to top, three layers of modules, namely a bottom-layer module 1, a middle-layer module 2 and a top-layer module 3, the middle-layer module 2 in turn comprising a message-transmission module 21 and a computation-model module 22.
The bottom-layer module 1 adopts HDFS and is used for storing data; the data it stores comprise an input data set, an intermediate-result data set and an output-result data set.
The message-transmission module 21 transmits messages between the computation-model modules running on different computing nodes and comprises a transmitter 211 and a receiver 212. The transmitter 211 receives messages from the computation-model module 22 on the same computing node and sends them to the receiver 212 of the designated computing node; the receiver 212 receives, from the network, the messages sent by the transmitters of other computing nodes and forwards them to the computation-model module 22 on its own node. The messages comprise request messages ("request"-class messages asking that some operation be performed) and response messages ("response"-class messages reporting whether the operation has been executed).
The computation-model modules 22 running on different computing nodes work cooperatively according to the messages delivered by the message-transmission module 21, each building a computation model of a particular type to process data. The computation-model module 22 comprises a number of controllers 221 adopting computation models of different types, and processors 222 that adopt the same computation model as their controller. The controller 221 coordinates the execution flow of the computation model it adopts; the processor 222 receives the request messages sent by the controller 221, performs the data-processing operations the controller 221 specifies, and reports response messages back to the controller 221. The processor 222 also reads input data from HDFS and outputs computation results to HDFS. The computation models comprise the BSP, DAG, GraphLab and Spark computation models.
The top-layer module 3 provides a corresponding programming interface for each particular type of computation model, combines the computations expressed by different computation models in serial order, and at the same time configures the models to share data through an in-memory pipeline. The top-layer module also provides a corresponding configuration for each computation model, including setting the input directory, the output directory, the number of computing nodes and the output-log directory; the input directory specifies the path of the data set to be processed, and the output directory specifies the storage path of the final computation result. When the split segmented calculation flows are processed, the processors obtain data from HDFS; and because the different computation models are configured to share data through the in-memory pipeline, each segment also reads the output of the previous segment from memory through the pipeline, processes it, and hands its own output on to the next segment. Throughout the whole process, input and intermediate data can be read from HDFS, and intermediate and output data can be written to HDFS.
In addition, when the computations expressed by different computation models are combined in serial order, the individual models must be serially composed, and the top-layer module provides the programming interface for serially combining multiple computation models. For example, given computation models A, B and C to be combined in the order A->B->C->A, the top-layer module correspondingly provides the programming interface for executing this A->B->C->A serial combination.
Based on the above computing system, and as shown in Fig. 2, the technical scheme of the present invention also comprises a computing method for big data processing, whose concrete steps comprise:
Step 1, upload the data set to be processed to HDFS;
Step 2, according to the application requirements of the data set, split its overall calculation flow into a plurality of serial segmented calculation flows;
Step 3, choose a computation model of a particular type for each segmented calculation flow, and use the provided programming interface to write the corresponding code for each model;
Step 4, combine the computations expressed by the different computation models in serial order, and at the same time configure the models to share data through an in-memory pipeline;
Step 5, supply the relevant configuration of the application program through the provided programming interface;
Step 6, execute the written code.
Based on the above scheme, the basic procedure for using the computing system and method is given below.
(1) Connect all the nodes that will participate in the computation through a switch, establishing a cluster whose nodes can communicate with one another.
(2) Deploy the open-source Hadoop system on each node of the cluster, and start the HDFS of the Hadoop system.
(3) Upload the data set to be processed to HDFS and, using the computing system proposed by this embodiment, write the data-processing application program; for the concrete implementation method, see the relevant sections above.
(4) Use a compiler to compile the written application program into an executable file.
(5) Choose a batch of nodes from the cluster to run the job. The nodes are divided into two classes: one master node, with the remainder serving as computing nodes.
(6) Start the executable program on the master node with its parameter set to master; this node is hereinafter referred to simply as the master.
(7) Start the executable program on each computing node with its parameter set to worker; these nodes are hereinafter referred to simply as workers.
(8) After the master starts, it carries out the following steps:
1) First, a queue of controller objects (Controller) is created.
2) The master pops a controller object from the queue and executes it. The controller executes as follows:
[1] The master's controller object is responsible for distributing computation tasks to the workers; it sends control commands to the workers and waits for their feedback.
[2] According to the workers' feedback, the controller schedules the next step of the computation process.
Steps [1] and [2] above repeat continuously until one segmented calculation flow has finished executing, at which point the controller exits.
3) When a segmented calculation flow has finished executing, the master takes the next controller from the queue and continues; the whole process runs until the controller queue is empty.
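The master's controller-queue behaviour can be sketched as a simple drain loop; `Controller` and `runMaster` are illustrative names, and the real controller would dispatch tasks to workers over the network rather than call a local function.

```go
package main

import "fmt"

// Controller stands in for one controller object: a model name plus a run
// function. In the real system, Run would distribute tasks to workers and
// wait for their feedback before scheduling the next step.
type Controller struct {
	Name string
	Run  func()
}

// runMaster drains the controller queue in order, executing each controller
// until the queue is empty, and returns the execution order.
func runMaster(queue []Controller) []string {
	var order []string
	for len(queue) > 0 {
		c := queue[0]
		queue = queue[1:]
		c.Run()
		order = append(order, c.Name)
	}
	return order
}

func main() {
	q := []Controller{
		{Name: "BSP", Run: func() {}},
		{Name: "DAG", Run: func() {}},
	}
	fmt.Println(runMaster(q))
}
```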
(9) After a worker starts, it carries out the following steps:
1) First, it sends registration information to the master.
2) It creates a queue of processors (handlers), then waits for the master's control commands.
3) On receiving a control command from the master, it calls the corresponding handler to execute the compute function.
4) It reports its execution state to the master.
5) If it receives an exit command, it quits the program.
Steps 3) and 4) above repeat continuously.
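The worker's command loop can be sketched as follows; the command strings and handler names are assumptions, and a channel stands in for the master's control connection.

```go
package main

import "fmt"

// Handler stands in for one processor's compute function.
type Handler func() string

// runWorker models the loop above: wait for commands, dispatch each one to
// the matching handler, collect the status reports that would be sent back
// to the master, and stop on an exit command.
func runWorker(commands <-chan string, handlers map[string]Handler) []string {
	var reports []string
	for cmd := range commands {
		if cmd == "exit" {
			break
		}
		if h, ok := handlers[cmd]; ok {
			reports = append(reports, h())
		} else {
			reports = append(reports, "unknown command: "+cmd)
		}
	}
	return reports
}

func main() {
	commands := make(chan string, 3)
	commands <- "bsp"
	commands <- "dag"
	commands <- "exit"
	close(commands)
	handlers := map[string]Handler{
		"bsp": func() string { return "bsp done" },
		"dag": func() string { return "dag done" },
	}
	fmt.Println(runWorker(commands, handlers))
}
```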
(10) When the master's controller queue is empty, all segmented calculation flows have finished and the whole job has executed. The master notifies all workers to exit the computation and then exits itself; the computation of the whole job is complete.
(11) After the computation finishes, the computation results are saved in HDFS; inspect them in the output directory.
Step 4 above embodies the method, based on the computing system described earlier, that this embodiment proposes; it is summarized as follows:
First, according to the requirements of the application, the computation is split into a plurality of serial segmented calculation flows.
Second, a suitable computation model is chosen for each segmented calculation flow, and the model-specific processing functions are written. For graph-computation phases the BSP computation model is adopted; for other types of computation the DAG computation model is adopted. The computing system of this embodiment provides DAG and BSP base-class libraries, and the developer needs to write the compute functions of the BSP base class and of the DAG base class.
Finally, using the driver interface provided by the framework, the serial execution order of the code of the several segmented calculation flows is specified and the corresponding parameters are configured, including the input directory, the output directory and the sharing mode of the intermediate data set. Usually, to improve processing efficiency, the online intermediate-data-set sharing mode is adopted rather than the offline one: the former shares data through the in-memory pipeline, while the latter shares data through HDFS.
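The difference between the two sharing modes can be illustrated with a toy in-memory pipeline: the producing segment writes records into memory and the consuming segment reads them concurrently, with no disk round-trip. This is only a sketch of the idea, not the system's implementation; offline sharing would instead write the intermediate set to HDFS and read it back.

```go
package main

import "fmt"

// pipeline connects two segmented calculation flows through memory: the
// producer writes records as they are computed and the consumer reads them
// concurrently off a channel, so the intermediate data set never touches disk.
func pipeline(produce func(chan<- int), consume func(<-chan int) int) int {
	ch := make(chan int, 16) // the in-memory pipeline
	go func() {
		produce(ch)
		close(ch)
	}()
	return consume(ch)
}

func main() {
	sum := pipeline(
		func(out chan<- int) { // stage 1: emit some values
			for i := 1; i <= 4; i++ {
				out <- i
			}
		},
		func(in <-chan int) int { // stage 2: aggregate them
			s := 0
			for v := range in {
				s += v
			}
			return s
		})
	fmt.Println(sum)
}
```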
The implementation steps of this basic procedure on a shared cluster are now described concretely through an application example. Suppose that PageRank is first computed over a web data set to obtain the PageRank value of each page, and the pages are then sorted by PageRank value in descending order. This computation can be decomposed into two segmented calculation flows: PageRank computation using the BSP model, followed by sorting using the DAG model.
The application example adopts the Go language and uses the open-source distributed file system HDFS to build the experimental environment; the concrete implementation steps are as follows.
(1) Connect all the nodes that will participate in the computation through a switch, establishing a cluster whose nodes can communicate with one another.
(2) Deploy the open-source Hadoop system on each node of the cluster, and start the HDFS of the Hadoop system.
(3) Upload the data set to be processed to HDFS.
(4) Write the data-processing application program:
First, according to the requirements of the application, split the computation into two serial segmented calculation flows: a PageRank-computation phase and a sorting phase;
Second, for the PageRank phase, adopt the BSP computation model, inherit from the BSP base class and write the corresponding compute function; for the sorting phase, adopt the DAG computation model, inherit from the DAG base class and write the corresponding compute function;
Finally, using the driver API provided by the top-layer module, add the written BSP subclass and DAG subclass to the execution phases, configure the input directory and output directory, and set the sharing mode of the intermediate data set.
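As a rough illustration of the two segments — not the patent's actual base classes — the following Go sketch shows one BSP-style PageRank superstep followed by a sort-by-rank stage on a toy single-node graph. Function names, the damping treatment, and the handling of dangling vertices are simplifying assumptions.

```go
package main

import (
	"fmt"
	"sort"
)

// pagerankStep performs one BSP-style superstep: each vertex sends
// rank/outdegree to its out-neighbours, then ranks are recombined with
// damping factor d. Vertices with no out-edges simply contribute nothing.
func pagerankStep(adj map[int][]int, rank map[int]float64, d float64) map[int]float64 {
	contrib := make(map[int]float64)
	for v, outs := range adj {
		if len(outs) == 0 {
			continue
		}
		share := rank[v] / float64(len(outs))
		for _, u := range outs {
			contrib[u] += share
		}
	}
	next := make(map[int]float64, len(rank))
	for v := range rank {
		next[v] = (1-d)/float64(len(rank)) + d*contrib[v]
	}
	return next
}

// sortByRank is the second segment, expressible as a DAG stage: it returns
// the vertices in descending PageRank order.
func sortByRank(rank map[int]float64) []int {
	vs := make([]int, 0, len(rank))
	for v := range rank {
		vs = append(vs, v)
	}
	sort.Slice(vs, func(i, j int) bool { return rank[vs[i]] > rank[vs[j]] })
	return vs
}

func main() {
	adj := map[int][]int{1: {2}, 2: {1}, 3: {1}}
	rank := map[int]float64{1: 1.0 / 3, 2: 1.0 / 3, 3: 1.0 / 3}
	fmt.Println(sortByRank(pagerankStep(adj, rank, 0.85)))
}
```

In the real system the superstep would run across workers under the BSP controller, with the intermediate rank table handed to the sorting stage through the pipeline.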
(5) Use the go build command to compile the written application program into an executable file.
(6) Choose a batch of nodes from the cluster to run the job. The nodes are divided into two classes: one master node, with the remainder serving as computing nodes.
(7) Start the executable program on the master node with its parameter set to master.
(8) Start the executable program on each computing node with its parameter set to worker.
(9) After the master starts, it carries out the following steps:
1) First, a queue of controller objects (Controller) is created, holding two objects in total: a BSP controller and a DAG controller.
2) The master pops the BSP controller object from the queue and executes it.
3) After the BSP controller object finishes executing, the master takes the next controller from the controller-object queue, namely the DAG controller, and invokes its execution.
4) After the DAG controller finishes executing, the queue is empty, and the master notifies the workers to exit the computation.
5) The master itself exits the computation.
(10) After a worker starts, it carries out the following steps:
1) First, it sends registration information to the master.
2) It creates a queue of processors (handlers), holding two objects in total: a BSP handler and a DAG handler. It then waits for the master's control commands.
3) According to the control command received from the master, it calls the BSP handler to execute the corresponding compute function.
4) It reports its execution state to the master.
5) According to the control command received from the master, it calls the DAG handler to execute the corresponding compute function.
6) It reports its execution state to the master.
7) On receiving the exit command, the worker quits the program.
(11) After the computation finishes, the computation results are saved in HDFS; inspect them in the output directory.
In this application example, the test data sets were generated by a benchmark program, producing graph data sets with 4 million, 8 million and 16 million vertices. PageRank was first computed over each data set, and the computed PageRank values were then sorted in descending order. The PageRank computation was done with the BSP model and the sorting computation with the DAG model; the intermediate data set produced by the PageRank computation was supplied to the subsequent sorting computation. The application example also compared the running times under the two intermediate-data sharing modes of step (4): sharing data through HDFS versus sharing data through the in-memory pipeline. Fig. 3 gives the comparison of the two intermediate-data-set sharing modes. As the comparison in Fig. 3 shows, because the in-memory pipeline mode avoids a large amount of HDFS disk I/O overhead, its running time is markedly shorter than that of the HDFS mode.
The foregoing describes only preferred embodiments of the present invention and is not intended to limit it; any modification, equivalent replacement, improvement and the like made within the spirit and principles of the present invention shall be included within the protection scope of the present invention.

Claims (10)

1. a computing system of processing for large data, it is characterized in that, it runs on a plurality of computing nodes, and comprises successively from bottom to up three layers of module, be respectively bottom module, middle layer module and top module, middle layer module comprises again transmission of messages module and computation model module:
Described bottom module, it adopts the Hadoop distributed file system, for storing data;
Described transmission of messages module, it is for realizing pass-along message between the computation model module of different computing node operations;
The described computation model module in the operation of different computing nodes, its message according to the transmission of described transmission of messages module realizes collaborative work, and the computation model that builds separately particular type is processed the data that read from the Hadoop distributed file system;
Described top module, it is used to the computation model of each particular type that corresponding DLL (dynamic link library) is provided, and combines in the mode of serial the calculating that different computation models are expressed, and arranges between different computation models simultaneously and shares data based on the internal memory pipeline system.
2. computing system according to claim 1, is characterized in that, the data of described bottom module stores comprise input data set, intermediate result data set and Output rusults data set.
3. computing system according to claim 1, it is characterized in that, described transmission of messages module comprises transmitter and receiver, described transmitter is used for from the computation model module receipt message in same computing node, and the message of reception is sent to the receiver of the computing node of appointment, the message that described receiver sends for the transmitter that receives different computing nodes from network, and the message of reception is transmitted to the computation model module in same computing node.
4. according to the described computing system of claim 1 or 3, it is characterized in that, described message comprises request message and response message.
5. computing system according to claim 1, it is characterized in that, described computation model module comprises the controller of the computation model of some employing particular types, also comprise the processor that adopts the identical calculations model with each controller, described controller is for coordinating the execution flow process of its computation model adopted, the request message that described processor sends for receiving controller, the data processing operation of implementation controller appointment, and report response message to controller; Described processor is also for reading the input data of Hadoop distributed file system, and to Hadoop distributed file system output result of calculation.
6. The computing system according to claim 1, wherein the top-level module is further configured to provide a corresponding configuration for each particular type of computation model, including setting an input directory, an output directory, the number of computing nodes, and an output-log directory; the input directory specifies the path of the data set to be processed, and the output directory specifies the storage path of the final computation results.
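The configuration items enumerated in claim 6 might be expressed, purely as an illustration, as a small validated dictionary. The key names and paths below are hypothetical, not taken from the patent.

```python
# Hypothetical per-model configuration; key names are illustrative only.
config = {
    "input_dir": "hdfs:///user/app/input",    # path of the data set to be processed
    "output_dir": "hdfs:///user/app/output",  # storage path of the final results
    "num_nodes": 8,                           # number of computing nodes
    "log_dir": "hdfs:///user/app/logs",       # output-log directory
}

def validate(cfg):
    """Check that every configuration item required by claim 6 is present."""
    required = {"input_dir", "output_dir", "num_nodes", "log_dir"}
    missing = required - cfg.keys()
    if missing:
        raise ValueError(f"missing configuration items: {sorted(missing)}")
    return True
```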
7. The computing system according to claim 1, wherein, when the computations expressed by different computation models are combined in a serial manner, the top-level module is further configured to provide a programming interface for serially combining multiple computation models.
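The serial combination of claim 7, with stages sharing data through memory as in claim 1, can be sketched with in-memory buffers standing in for the pipeline. The `Stage` abstraction and stage functions are hypothetical; each stage stands in for one computation model's segment of the overall flow.

```python
import io

class Stage:
    """One segmented computation flow, executed under some computation model."""
    def __init__(self, name, fn):
        self.name, self.fn = name, fn

    def run(self, pipe_in, pipe_out):
        # Consume the upstream pipe line by line and write transformed output.
        for line in pipe_in:
            pipe_out.write(self.fn(line))

def run_serial(stages, source):
    """Chain stages serially; each stage's output feeds the next through
    an in-memory buffer rather than a file on disk."""
    pipe_in = io.StringIO(source)
    for stage in stages:
        pipe_out = io.StringIO()
        stage.run(pipe_in, pipe_out)
        pipe_out.seek(0)
        pipe_in = pipe_out  # downstream stage reads what upstream wrote
    return pipe_in.read()

stages = [
    Stage("normalize", lambda s: s.lower()),
    Stage("tag", lambda s: "ok:" + s),
]
out = run_serial(stages, "A\nB\n")  # each line lowered, then tagged
```

The design point the claim captures is that intermediate results never touch the distributed file system between stages; only the final stage's output would be written back.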
8. The computing system according to claim 1, 5, 6 or 7, wherein the computation models comprise the BSP computation model, the DAG computation model, the GraphLab computation model, and the Spark computation model.
9. A computing method for big data processing, applicable to a plurality of computing nodes, wherein the method comprises the following steps:
Step 1: upload the data set to be processed to the Hadoop distributed file system;
Step 2: according to the application requirements of the data set to be processed, split the overall computation flow into a plurality of serial segmented computation flows;
Step 3: choose a computation model of a particular type for each segmented computation flow, and use the provided programming interfaces to write the code for each particular type of computation model;
Step 4: combine the computations expressed by the different computation models in a serial manner, and arrange for data to be shared between the different computation models through an in-memory pipeline mechanism;
Step 5: provide the relevant configuration of the application program according to the provided programming interface;
Step 6: execute the written code.
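The six steps of claim 9 can be sketched as a single driver. This is an illustrative sketch under stated assumptions: a dictionary stands in for HDFS, each "computation model" is a callable, and all names (`run_application`, `split_plan`, `models`, `writers`) are hypothetical.

```python
def run_application(dataset, split_plan, models, writers, config):
    """Illustrative driver following steps 1-6 of claim 9 (all names hypothetical)."""
    hdfs = {}  # stands in for the Hadoop distributed file system
    # Step 1: upload the data set to be processed.
    hdfs[config["input_dir"]] = dataset
    # Steps 2-3: the overall flow has been split into serial segments, each
    # paired with a chosen computation model and user-written code.
    segments = [(models[name], writers[name]) for name in split_plan]
    # Steps 4-6: run the segments serially, sharing data through memory;
    # only the final result is written back.
    data = hdfs[config["input_dir"]]
    for model, code in segments:
        data = model(code, data)  # the model executes the written code on the data
    hdfs[config["output_dir"]] = data
    return hdfs[config["output_dir"]]

# Two toy "computation models": an element-wise map and a filter.
models = {"map": lambda fn, d: [fn(x) for x in d],
          "filter": lambda fn, d: [x for x in d if fn(x)]}
writers = {"map": lambda x: x * 2, "filter": lambda x: x > 2}
cfg = {"input_dir": "/in", "output_dir": "/out"}
result = run_application([1, 2, 3], ["map", "filter"], models, writers, cfg)  # -> [4, 6]
```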
10. The computing method according to claim 9, wherein the computation models comprise the BSP computation model, the DAG computation model, the GraphLab computation model, and the Spark computation model.
CN201310455174.2A 2013-09-29 2013-09-29 Computing system and computing method for big data processing Active CN103488775B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310455174.2A CN103488775B (en) 2013-09-29 2013-09-29 Computing system and computing method for big data processing

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201310455174.2A CN103488775B (en) 2013-09-29 2013-09-29 Computing system and computing method for big data processing

Publications (2)

Publication Number Publication Date
CN103488775A true CN103488775A (en) 2014-01-01
CN103488775B CN103488775B (en) 2016-08-10

Family

ID=49829001

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310455174.2A Active CN103488775B (en) 2013-09-29 2013-09-29 A kind of calculating system processed for big data and computational methods

Country Status (1)

Country Link
CN (1) CN103488775B (en)

Cited By (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104063228A (en) * 2014-07-02 2014-09-24 中央民族大学 Pipeline data processing system
CN104881491A (en) * 2015-06-11 2015-09-02 广州市云润大数据服务有限公司 Software development system based on big data platform
CN105335215A (en) * 2015-12-05 2016-02-17 中国科学院苏州生物医学工程技术研究所 Monte-Carlo simulation accelerating method and system based on cloud computing
CN105404611A (en) * 2015-11-09 2016-03-16 南京大学 Matrix model based multi-calculation-engine automatic selection method
CN105468770A (en) * 2015-12-09 2016-04-06 合一网络技术(北京)有限公司 Data processing method and system
CN105700998A (en) * 2016-01-13 2016-06-22 浪潮(北京)电子信息产业有限公司 Method and device for monitoring and analyzing performance of parallel programs
CN105956049A (en) * 2016-04-26 2016-09-21 乐视控股(北京)有限公司 Data output control method and device
CN106021484A (en) * 2016-05-18 2016-10-12 中国电子科技集团公司第三十二研究所 Customizable multi-mode big data processing system based on memory calculation
CN106155635A (en) * 2015-04-03 2016-11-23 北京奇虎科技有限公司 A kind of data processing method and device
CN107025099A (en) * 2016-02-01 2017-08-08 北京大学 A kind of asynchronous figure based on deque's model calculates realization method and system
CN107807983A (en) * 2017-10-30 2018-03-16 辽宁大学 A kind of parallel processing framework and design method for supporting extensive Dynamic Graph data query
CN108255619A (en) * 2017-12-28 2018-07-06 新华三大数据技术有限公司 A kind of data processing method and device
CN110440858A (en) * 2019-09-12 2019-11-12 武汉轻工大学 Grain condition monitoring system and method
WO2020052241A1 (en) * 2018-09-11 2020-03-19 Huawei Technologies Co., Ltd. Heterogeneous scheduling for sequential compute dag
US10606654B2 (en) 2015-04-29 2020-03-31 Huawei Technologies Co., Ltd. Data processing method and apparatus
CN111625243A (en) * 2020-05-13 2020-09-04 北京字节跳动网络技术有限公司 Cross-language task processing method and device and electronic equipment
CN111679860A (en) * 2020-08-12 2020-09-18 上海冰鉴信息科技有限公司 Distributed information processing method and device

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102169505A (en) * 2011-05-16 2011-08-31 苏州两江科技有限公司 Recommendation system building method based on cloud computing
US20120182891A1 (en) * 2011-01-19 2012-07-19 Youngseok Lee Packet analysis system and method using hadoop based parallel computation
CN102880510A (en) * 2012-09-24 2013-01-16 中国科学院对地观测与数字地球科学中心 Parallel programming method oriented to data intensive application based on multiple data architecture centers
CN103279390A (en) * 2012-08-21 2013-09-04 中国科学院信息工程研究所 Parallel processing system for small operation optimizing

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120182891A1 (en) * 2011-01-19 2012-07-19 Youngseok Lee Packet analysis system and method using hadoop based parallel computation
CN102169505A (en) * 2011-05-16 2011-08-31 苏州两江科技有限公司 Recommendation system building method based on cloud computing
CN103279390A (en) * 2012-08-21 2013-09-04 中国科学院信息工程研究所 Parallel processing system for small operation optimizing
CN102880510A (en) * 2012-09-24 2013-01-16 中国科学院对地观测与数字地球科学中心 Parallel programming method oriented to data intensive application based on multiple data architecture centers

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
CHRISTOPHER OLSTON et al.: "Automatic Optimization of Parallel Dataflow Programs", USENIX '08: 2008 USENIX Annual Technical Conference *
MICHAEL ISARD et al.: "Dryad: Distributed Data-Parallel Programs from Sequential Building Blocks", Proceedings of the 2nd European Conference on Computer Systems (EuroSys '07) *
WANG Peng et al.: "Advances in Programming Models for Data-Intensive Computing", Journal of Computer Research and Development *

Cited By (23)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104063228A (en) * 2014-07-02 2014-09-24 中央民族大学 Pipeline data processing system
CN106155635B (en) * 2015-04-03 2020-09-18 北京奇虎科技有限公司 Data processing method and device
CN106155635A (en) * 2015-04-03 2016-11-23 北京奇虎科技有限公司 A kind of data processing method and device
US10606654B2 (en) 2015-04-29 2020-03-31 Huawei Technologies Co., Ltd. Data processing method and apparatus
CN104881491A (en) * 2015-06-11 2015-09-02 广州市云润大数据服务有限公司 Software development system based on big data platform
CN105404611A (en) * 2015-11-09 2016-03-16 南京大学 Matrix model based multi-calculation-engine automatic selection method
CN105404611B (en) * 2015-11-09 2018-07-20 南京大学 A kind of automatic selecting method of more computing engines based on matrix model
CN105335215A (en) * 2015-12-05 2016-02-17 中国科学院苏州生物医学工程技术研究所 Monte-Carlo simulation accelerating method and system based on cloud computing
CN105335215B (en) * 2015-12-05 2019-02-05 中国科学院苏州生物医学工程技术研究所 A kind of Monte Carlo simulation accelerated method and system based on cloud computing
CN105468770A (en) * 2015-12-09 2016-04-06 合一网络技术(北京)有限公司 Data processing method and system
CN105700998A (en) * 2016-01-13 2016-06-22 浪潮(北京)电子信息产业有限公司 Method and device for monitoring and analyzing performance of parallel programs
CN107025099A (en) * 2016-02-01 2017-08-08 北京大学 A kind of asynchronous figure based on deque's model calculates realization method and system
CN107025099B (en) * 2016-02-01 2019-12-27 北京大学 Asynchronous graph calculation implementation method and system based on double-queue model
CN105956049A (en) * 2016-04-26 2016-09-21 乐视控股(北京)有限公司 Data output control method and device
CN106021484A (en) * 2016-05-18 2016-10-12 中国电子科技集团公司第三十二研究所 Customizable multi-mode big data processing system based on memory calculation
CN107807983A (en) * 2017-10-30 2018-03-16 辽宁大学 A kind of parallel processing framework and design method for supporting extensive Dynamic Graph data query
CN107807983B (en) * 2017-10-30 2021-08-24 辽宁大学 Design method of parallel processing framework supporting large-scale dynamic graph data query
CN108255619B (en) * 2017-12-28 2019-09-17 新华三大数据技术有限公司 A kind of data processing method and device
CN108255619A (en) * 2017-12-28 2018-07-06 新华三大数据技术有限公司 A kind of data processing method and device
WO2020052241A1 (en) * 2018-09-11 2020-03-19 Huawei Technologies Co., Ltd. Heterogeneous scheduling for sequential compute dag
CN110440858A (en) * 2019-09-12 2019-11-12 武汉轻工大学 Grain condition monitoring system and method
CN111625243A (en) * 2020-05-13 2020-09-04 北京字节跳动网络技术有限公司 Cross-language task processing method and device and electronic equipment
CN111679860A (en) * 2020-08-12 2020-09-18 上海冰鉴信息科技有限公司 Distributed information processing method and device

Also Published As

Publication number Publication date
CN103488775B (en) 2016-08-10

Similar Documents

Publication Publication Date Title
CN103488775A (en) Computing system and computing method for big data processing
Gu et al. Liquid: Intelligent resource estimation and network-efficient scheduling for deep learning jobs on distributed GPU clusters
CN102541640B (en) Cluster GPU (graphic processing unit) resource scheduling system and method
US10469588B2 (en) Methods and apparatus for iterative nonspecific distributed runtime architecture and its application to cloud intelligence
CN110851237B (en) Container cross-isomerism cluster reconstruction method for domestic platform
CN105117286A (en) Task scheduling and pipelining executing method in MapReduce
Gu et al. Partitioning and offloading in smart mobile devices for mobile cloud computing: State of the art and future directions
CN104965689A (en) Hybrid parallel computing method and device for CPUs/GPUs
Desell et al. Malleable applications for scalable high performance computing
CN106293757A (en) Robotic system software's framework and its implementation and device
CN102193831B (en) Method for establishing hierarchical mapping/reduction parallel programming model
Yang et al. Reliable dynamic service chain scheduling in 5G networks
CN116011562A (en) Operator processing method, operator processing device, electronic device and readable storage medium
Płóciennik et al. Approaches to distributed execution of scientific workflows in kepler
CN103049305A (en) Multithreading method of dynamic code conversion of loongson multi-core central processing unit (CPU) simulation
CN102141917A (en) Method for realizing multi-service linkage based on IronPython script language
Li et al. Smart simulation cloud (simulation cloud 2.0)—the newly development of simulation cloud
Cui et al. A scheduling algorithm for multi-tenants instance-intensive workflows
Agarwal et al. Towards an MPI-like framework for the Azure cloud platform
Olaniyan et al. Multipoint synchronization for fog-controlled Internet of Things
Jiang et al. An optimized resource scheduling strategy for Hadoop speculative execution based on non-cooperative game schemes
Liu A Programming Model for the Cloud Platform
De Munck et al. Design and performance evaluation of a conservative parallel discrete event core for GES
CN112506496A (en) Method and system for building system-on-chip development environment
CN105468451A (en) Job scheduling system of computer cluster on the basis of high-throughput sequencing data

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant