CN107172149A - Big data instant scheduling method - Google Patents

Big data instant scheduling method

Info

Publication number
CN107172149A
CN107172149A
Authority
CN
China
Prior art keywords
task
load
node
host node
burst
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201710344085.9A
Other languages
Chinese (zh)
Inventor
赖真霖
文君
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chengdu Sixiang Lianchuang Technology Co Ltd
Original Assignee
Chengdu Sixiang Lianchuang Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chengdu Sixiang Lianchuang Technology Co Ltd
Priority to CN201710344085.9A
Publication of CN107172149A
Legal status: Pending (current)

Links

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L 41/06 Management of faults, events, alarms or notifications
    • H04L 41/0654 Management of faults, events, alarms or notifications using network fault recovery
    • H04L 41/0663 Performing the actions predefined by failover planning, e.g. switching to standby network elements
    • H04L 41/0668 Management of faults, events, alarms or notifications using network fault recovery by dynamic selection of recovery network elements, e.g. replacement by the most appropriate element after failure
    • H04L 67/00 Network arrangements or protocols for supporting network services or applications
    • H04L 67/01 Protocols
    • H04L 67/10 Protocols in which an application is distributed across nodes in the network
    • H04L 67/1001 Protocols in which an application is distributed across nodes in the network for accessing one among a plurality of replicated servers
    • H04L 67/1004 Server selection for load balancing
    • H04L 67/1008 Server selection for load balancing based on parameters of servers, e.g. available memory or workload
    • H04L 67/50 Network services
    • H04L 67/60 Scheduling or organising the servicing of application requests, e.g. requests for application data transmissions using the analysis and optimisation of the required network resources
    • H04L 67/61 Scheduling or organising the servicing of application requests, e.g. requests for application data transmissions using the analysis and optimisation of the required network resources taking into account QoS or priority requirements
    • H04L 67/63 Routing a service request depending on the request content or context

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Computer Hardware Design (AREA)
  • General Engineering & Computer Science (AREA)
  • Multi Processors (AREA)

Abstract

The invention provides a big data instant scheduling method. The method comprises: a host node divides a job into independent tasks, and load nodes apply to the host node for tasks and execute them; the host node performs first-level sharding on the job and forwards to a load node the execution information of all the second-level shards contained in a first-level shard; the load node memory-maps the shard, performs second-level sharding on it, and places the resulting shards into a thread pool; each thread of a load node performs a reduce after completing a second-level task shard and sends the result to the host node, which performs a reduce operation over the results returned by the load nodes; load processes meeting the requirements of the host list file are allocated, tasks are distributed to load nodes through map processes, and the corresponding task results are collected from the load nodes through reduce processes; the intermediate results are merged to obtain the final result, and the host node feeds the final job result back directly to the corresponding task submitter over a socket connection. The invention proposes a big data instant scheduling method that improves the efficiency of cloud computing and meets the needs of high-performance cloud computing.

Description

Big data instant scheduling method
Technical field
The present invention relates to big data, and more particularly to a big data instant scheduling method.
Background technology
Cloud computing stores huge volumes of data and distributes the programs that process them across the computers of a cluster, delivering the corresponding application services over the Internet. When a user submits a request for a resource, the system automatically routes the request to the computers and storage systems that actually hold that resource. Cloud computing platforms built on virtualization technology have achieved good results in mass data processing. However, cloud computing processes mass data in parallel across large clusters, and because the bottom layer of current mainstream cloud platforms uses virtualization, all software and applications run on virtual hardware, a strategy that inevitably degrades performance to some degree. Moreover, the internal mechanism of MapReduce is to store intermediate data first and then read it back for further processing; as the volume and count of intermediate records grow, this pattern inevitably produces a large number of useless disk I/O operations. If the data are remote, network load increases; if the data are local, execution is limited by the I/O bottleneck, reducing the efficiency of task execution.
Summary of the invention
To solve the above problems of the prior art, the present invention proposes a big data instant scheduling method, comprising:
The host node divides the job into independent tasks and maintains a task status array. After a load node starts, it applies to the host node for a task, and once a task is obtained it executes it according to a predetermined scheme; when execution completes, it feeds the task result back to the host node and applies for the next task. If the task information a load node obtains is invalid, this parallel computing task has completed successfully and the load node terminates. The job with the highest priority is taken from the task queue. The host node shards the job into first-level shards and places them in the load queue. The host node queries the state table for the information of a first-level shard, then forwards the execution information of all the second-level shards in this shard to a load node. On receiving the task, the load node memory-maps the shard and performs second-level sharding, detects the total number of cores, builds a thread pool, and places the shards into the pool; threads obtain second-level task shards through the load-internal state table until every entry in the table has been executed. Each thread of the load node performs a reduce after completing a second-level task shard and sends the reduce result together with the internal state table to the host node; the host node decides whether to send this load another shard, and if a new task must be sent, continues taking tasks and sending them to the load. Once every entry in the state table has finished executing, the host node sends an exit instruction to each load; the load triggers an event signal to notify the monitor that it is waiting to exit, and all loads and the host node then exit together. The host node performs a reduce operation over the results sent by the loads. The host node analyzes the incoming command-line parameters to obtain the information about this parallel-interface job. According to the host list file it dynamically allocates multiple load nodes and forms them into a communication domain, then waits for load nodes to apply for tasks or to submit task results. Based on the dynamic process creation and management model of the parallel computing interface, load processes meeting the requirements of the host list file are allocated; the host node distributes tasks to load nodes through map processes and obtains the corresponding task results from load nodes through reduce processes. These intermediate results are merged to obtain the final result of this parallel computing job, which is finally fed back directly to the task submitter. Once the host node detects that some load node is abnormal, it produces a new host list file and allocates a new load node to take over the abnormal one and execute its tasks. If the host node finds that every entry in the task status array is marked complete, it feeds the final job result back directly to the corresponding task submitter over a socket connection.
Compared with the prior art, the present invention has the following advantages:
The present invention proposes a big data instant scheduling method that improves the efficiency of cloud computing to meet the needs of high-performance cloud computing.
Brief description of the drawings
Fig. 1 is a flow chart of the big data instant scheduling method according to an embodiment of the present invention.
Detailed description of the embodiments
A detailed description of one or more embodiments of the invention is provided below, together with the accompanying drawings that illustrate the principles of the invention. The invention is described in connection with such embodiments, but it is not restricted to any particular embodiment. The scope of the invention is limited only by the claims, and the invention covers many alternatives, modifications and equivalents. Many specific details are set forth in the following description to provide a thorough understanding of the invention; these details are provided for exemplary purposes, and the invention may also be practiced according to the claims without some or all of them.
One aspect of the present invention provides a big data instant scheduling method. Fig. 1 is a flow chart of the big data instant scheduling method according to an embodiment of the present invention.
The high-performance cloud computing platform designed by the present invention builds the platform bottom layer directly from heterogeneous computing nodes, without virtualization. MapReduce is rewritten using a parallel computing interface with added multi-level fault tolerance and multithreading, so that a large number of useless I/O operations are avoided during computation, thereby improving the efficiency of cloud computing to meet the needs of high-performance cloud computing. In a heterogeneous-node environment, based on HADOOP, the cloud infrastructure service layer is built on heterogeneous hardware, implementing job re-scheduling, job/task rollback and dynamic migration, and the MapReduce framework is built on the multi-level fault-tolerant parallel computing interface. Intermediate results are processed directly, reducing unnecessary I/O operations.
The HADOOP-based cloud computing platform architecture of the present invention comprises heterogeneous computing nodes, a fault-tolerance unit based on the parallel computing interface, a monitoring module, a job management module, a task management module, a distributed database, and the MapReduce computing framework.
The job management module maintains the job queue, manages job scheduling, provides the corresponding fault tolerance by monitoring job execution, and supports remote job submission and result return. The monitoring module manages the list of available hosts, discovers abnormal nodes, and sorts nodes by load so that the least-loaded node is selected first to execute a task. The task management module handles task division and dispatch, task execution, and the collection and return of results.
The job management module contains a job communication submodule that, driven by user interaction, implements job submission and result feedback, and a job management submodule that handles job management and scheduling, maintains the job queue, and schedules jobs by priority. The monitoring module watches the running status and load level of all nodes and accordingly provides the host list file required by the task execution module.
The task management module is the module that actually executes a job: it shards the job, schedules the shards rationally, and finally collects and feeds back the computation results. After the task management module receives a job distributed by the job management and scheduling submodule, it divides the job into multiple tasks according to the configured strategy and interacts with the monitoring module to generate the host list file required to schedule and execute the parallel computation. Then, based on the dynamic process creation and management model of the parallel computing interface, load processes meeting the requirements of the host list file are allocated; the host node distributes tasks to load nodes through map processes and obtains the corresponding task results from the load nodes through reduce processes. Merging these intermediate results yields the final result of this parallel computing job, which is finally fed back directly to the task submitter. Once the host node detects that a load node has become abnormal, it interacts with the monitoring module to produce a new host list file and allocates a new load node to take over the abnormal one and execute its tasks.
After the job communication submodule is initialized, according to the user settings it must bind the local socket server address and the remote address of the job management submodule's socket server, and it creates two worker threads. A wait thread circularly waits to receive job results fed back from the task management module. A send thread, as soon as the user has submitted a new job, transmits the job entered by the user over a socket to the job management submodule, using the submodule's server address.
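For concreteness, a minimal Python sketch of the two worker threads follows, under the assumption of a simple blocking socket protocol; the addresses and message format are hypothetical, since the patent specifies neither.

```python
import socket
import threading

# Hypothetical addresses; the patent names neither ports nor a wire format.
JOB_MANAGER_ADDR = ("job-manager.local", 9000)   # remote socket server of the job management submodule
LOCAL_RESULT_ADDR = ("0.0.0.0", 9001)            # local socket server bound at initialization

def wait_thread():
    """Circularly wait to receive job results fed back from the task management module."""
    with socket.create_server(LOCAL_RESULT_ADDR) as server:
        while True:
            conn, _ = server.accept()
            with conn:
                print("job result:", conn.recv(65536).decode("utf-8"))

def send_thread(submitted_jobs):
    """As soon as the user has submitted a new job, forward it over a socket
    to the job management submodule."""
    while True:
        job = submitted_jobs.get()               # blocks until the user submits a job
        with socket.create_connection(JOB_MANAGER_ADDR) as sock:
            sock.sendall(job.encode("utf-8"))

# Both threads run for the lifetime of the submodule, e.g.:
# threading.Thread(target=wait_thread, daemon=True).start()
```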
The job management submodule waits for the job communication submodule to submit jobs to it. In its worker thread it creates two further worker threads. A parse thread maintains the local socket server, obtains jobs from the remote job communication submodule, parses the information and confirms whether it conforms to the rules; a qualified job is bound to the address of its submitter and placed into the multi-priority job queue to await scheduling and execution. A scan thread repeatedly scans the multi-priority job queue to determine whether a higher-priority job is present; if so, it takes the job out, constructs a command line from it, and starts multiple task execution module processes to complete this parallel computing job, then circularly waits for the job to finish. If the job did not complete normally, an exception is judged to have occurred and the job is rescheduled and executed again.
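A sketch of this parse/scan thread pair around a multi-priority queue; `is_valid` and `build_command` are hypothetical stand-ins for the rule check and command-line construction, which the patent does not detail.

```python
import itertools
import queue
import subprocess

job_queue = queue.PriorityQueue()   # entries: (priority, tie, job, submitter); lower value = higher priority
_tie = itertools.count()            # tie-breaker so equal-priority entries never compare job objects

def parse_jobs(incoming):
    """Parse-thread body: validate each received job and enqueue the qualified
    ones, bound to their submitter's address."""
    for priority, job, submitter in incoming:
        if is_valid(job):                               # hypothetical rule check
            job_queue.put((priority, next(_tie), job, submitter))

def scan_jobs():
    """Scan-thread body: take the highest-priority job, build a command line and
    start the task-execution processes; on abnormal completion, reschedule."""
    while True:
        priority, _, job, submitter = job_queue.get()
        completed = subprocess.run(build_command(job))  # hypothetical command builder
        if completed.returncode != 0:                   # abnormal end: reschedule this job
            job_queue.put((priority, next(_tie), job, submitter))
```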
The host node analyzes the incoming command-line parameters to obtain the information about this parallel-interface job. According to the host list file generated by the monitoring module, it dynamically allocates multiple load nodes and forms them into a communication domain, then waits for load nodes to apply for tasks or to submit task results.
The host node divides the job into independent tasks and maintains a task status array whose entries are initialized to 0, recording the execution state of each task. When a load node applies for a task, the host node first searches the status array for an entry equal to 0, obtains the corresponding task, assigns it to the applicant, and sets that entry to 1. When a task completes, the corresponding entry is set to 2. If the process executing some task fails, the corresponding entry is reset to 0 so that the task can be reassigned to another node. Likewise, if a task whose entry is 1 produces no feedback result within the specified time, the process executing it is judged abnormal and the task's entry is reset to 0, to be reassigned to another node.
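The status-array bookkeeping just described might look like the following sketch; the state values 0/1/2 and the timeout rule are from the text, while the class layout and the 300-second deadline are assumptions.

```python
import time

PENDING, RUNNING, DONE = 0, 1, 2
TIMEOUT = 300.0   # assumed feedback deadline in seconds; the patent leaves this unspecified

class TaskTable:
    def __init__(self, n_tasks):
        self.state = [PENDING] * n_tasks
        self.started = [0.0] * n_tasks

    def assign(self):
        """Give the first pending task to an applying load node and mark it running."""
        for i, s in enumerate(self.state):
            if s == PENDING:
                self.state[i] = RUNNING
                self.started[i] = time.monotonic()
                return i
        return None                  # no task left: the applicant may exit

    def complete(self, i):
        self.state[i] = DONE

    def reap_stragglers(self):
        """Reset running tasks with no feedback within the deadline, so they can be reassigned."""
        now = time.monotonic()
        for i, s in enumerate(self.state):
            if s == RUNNING and now - self.started[i] > TIMEOUT:
                self.state[i] = PENDING

    def all_done(self):
        return all(s == DONE for s in self.state)
```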
If the host node finds that every entry in a job's status array is 2, this parallel computing job has completed, and the host node feeds the final job result back directly to the corresponding task submitter over a socket connection.
After a load node starts, it checks whether its parent process exists; if not, it refuses to execute, and if so, it judges that it was started by the host node. The load node applies to the host node for a task and, once a task is obtained, executes it according to the predetermined scheme; when execution completes, it feeds the task result back to the host node and applies for the next task. If the task information the load node obtains is invalid, this parallel computing task has completed successfully and the load node terminates.
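A minimal sketch of the parent-process check, assuming a POSIX system where an orphaned process is re-parented to PID 1; the patent does not say how the check is performed.

```python
import os
import sys

def verify_started_by_host_node():
    """Load-node startup check: refuse to execute unless a live parent process
    exists, taken as evidence that this node was spawned by the host node.
    (On POSIX, a process whose parent has exited is re-parented to PID 1.)"""
    if os.getppid() <= 1:
        sys.exit("refusing to execute: no parent process, not started by a host node")
```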
The high-performance cloud computing platform proposed by the present invention contains a three-level fault-tolerance mechanism. The first level is job re-scheduling: when the system detects that a node in the cluster fails while executing a task, it immediately reschedules and executes that task. The second level is job/task rollback: the system rolls the job/task back to the most recent checkpoint and resumes execution. If the host node fails, this parallel computing job fails; the system re-schedules the job, rolls its execution state back to the most recent checkpoint, and continues executing it. If a load node fails, the system attempts to restart the abnormal node, roll its task execution state back to the most recent checkpoint, and continue executing the task. The third level is dynamic migration: when an abnormal load node cannot be recovered by rollback in a short time, that is, when the second level of fault tolerance fails, the system actively migrates the tasks of the abnormal node to other working nodes for execution. To guarantee the stability of the parallel-interface cluster's computing capacity, the computing platform also dynamically allocates a new load process to substitute for the abnormal one.
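The escalation across the three levels can be sketched as follows; `system` is a hypothetical facade over the scheduler, checkpoint store and node pool, and every method on it is an assumption introduced for illustration.

```python
def handle_node_failure(node, task, system):
    """Escalate through the three fault-tolerance levels described above."""
    # Level 1 - job re-scheduling: simply run the failed task again.
    if system.reschedule(task):
        return
    # Level 2 - rollback: restart the node and resume from the latest checkpoint.
    if system.restart(node) and system.rollback_to_latest_checkpoint(task):
        return
    # Level 3 - dynamic migration: the node cannot be recovered in time, so move
    # its tasks to another working node via a freshly allocated load process.
    replacement = system.allocate_load_process()
    system.migrate(task, replacement)
```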
In the present system the monitoring module generates the list of available hosts by comparing each host's CPU core count with the number of processes currently executing tasks: if the number of processes is below the core count, the host name is added to the host list; otherwise the host name is removed from the list. The node hosting process 0 of the monitoring program is the monitoring server; nodes hosting non-zero processes are task nodes. The process-model monitoring proceeds in the following steps:
1) In each non-zero process a kernel object, an event signal, is created. The process holds this signal open while executing a task and triggers it when the computation completes; the signal is used to learn whether a process is waiting to exit.
2) Obtain the host name, count the number of processes M currently executing tasks and the number K of processes that have finished executing and are waiting to exit, and build the available host list file.
3) If the number of processes executing tasks M equals the number K that have finished and are waiting to exit, write this host name into the host list file.
The host list is renewed under two strategies: periodic renewal and renewal before task scheduling. Under periodic renewal, the monitoring program keeps running and the scheduler and monitor do not interact. Under renewal before a task, the monitoring program runs before tasks are divided: it starts automatically before task execution, updates the host list, and then exits; the host node process dynamically starts a group of processes according to the available host list to execute tasks. If a process's task fails midway, the monitoring program is automatically restarted to update the list, and the host node process starts the corresponding number of processes according to the available host list to complete the tasks of the failed processes. A sketch of the availability rule follows.
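This sketch combines the core-count test with step 3's idle test; the field names on `info` are assumptions.

```python
def build_host_list(hosts, path="hostlist.txt"):
    """Write the available-host file per the monitoring rule above: a host is
    listed while its count of task processes (M) is below its CPU core count,
    and certainly when every task process has finished and is waiting to exit
    (M == K)."""
    with open(path, "w") as f:
        for name, info in hosts.items():
            idle = info.running == info.waiting_exit      # step 3: M == K
            if idle or info.running < info.cpu_cores:     # general rule: M < core count
                f.write(name + "\n")
```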
The task scheduling of the present invention is based on job sharding. After a job is taken from the job queue, first-level sharding is performed on it and the first-level shards are assigned to nodes; second-level sharding is then performed within a node and the second-level shards are assigned to threads. Task distribution uses a load-queue-plus-thread-pool method. For map operations, first-level shards are distributed on the principle that idle load nodes apply for them: nodes with strong computing capacity apply for more shards, and nodes with weak capacity complete fewer. After a load node has executed its map tasks, it performs a first reduce directly, with the map results as the reduce input, then sends the result to the host node for a second reduce, which produces the final result.
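As an illustration of the two-level split with a local first reduce, here is a sketch of the load-node side; `map_fn` and `reduce_fn` are placeholders, and the round-robin split stands in for byte-range sharding of a memory-mapped shard.

```python
import os
from concurrent.futures import ThreadPoolExecutor
from functools import reduce as fold

def run_first_level_shard(shard, map_fn, reduce_fn):
    """Split a first-level shard into second-level shards, map them on a pool
    sized to the machine's core count, and fold the map outputs with a first,
    local reduce before anything is returned to the host node."""
    n = os.cpu_count() or 1
    second_level = [shard[i::n] for i in range(n)]   # round-robin stand-in for byte-range splitting
    with ThreadPoolExecutor(max_workers=n) as pool:
        mapped = list(pool.map(map_fn, second_level))
    return fold(reduce_fn, mapped)                   # first reduce, done locally on the load node

# The host node then applies reduce_fn once more over the per-node results
# (the "second reduce") to obtain the final answer.
```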
A load applies to the load queue for a task; the host node queries the first-level shard table, and if it finds a shard whose state is "not executed", it sends this shard's information to the load process. When the host node executes the fault-tolerant strategy of dynamic task migration, it sends the start and end positions of the shard's task and forwards the execution information table of all second-level shards within the first-level shard to be migrated to the load together.
If this job is being scheduled for the first time, the host node puts all first-level shards into the load queue for scheduling. If the job is not being scheduled for the first time, the host node selects the shards whose state is "unfinished" for scheduling. During scheduling, the host node continuously updates the content of the execution state table.
Based on the content of the load-internal state table sent by a load, if the host node detects that the load's current shard has reached a certain progress though it is not yet complete, it sends the load another shard task. If the host node detects that every entry in the execution state table is complete, the whole task is finished; it sends the waiting-to-exit signal to the load nodes, and this job is done. On the load-node side, after a first-level shard is sent to a load, the load first memory-maps the first-level shard, then detects the total number of cores across all processors in the node, opens the corresponding number of threads, and starts the thread pool to execute tasks.
Inside the load, a first-level shard is represented by the load-internal state table. Following this table, the thread pool takes out the shards not yet executed and runs their map tasks, then runs reduce tasks with the map results as the reduce input. During execution, every time a second-level shard completes, the content of the load-internal state table is updated, and the execution result is sent to the host node together with the modification to the load-internal state table; the host node then updates the execution state table of all shards according to the load-internal state table. When every execution state reads "complete", all shards have finished executing.
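A sketch of the load-internal state table that the pool threads draw from; the per-shard states and the host-side mirroring are from the text, while the state strings and locking are assumptions about mechanism.

```python
import threading

class LoadStateTable:
    """Load-internal state table: one entry per second-level shard."""
    def __init__(self, n_shards):
        self.state = ["pending"] * n_shards
        self.lock = threading.Lock()

    def take(self):
        """Hand the next unexecuted shard to a pool thread; None once the table is drained."""
        with self.lock:
            for i, s in enumerate(self.state):
                if s == "pending":
                    self.state[i] = "running"
                    return i
        return None

    def finish(self, i):
        """Mark a shard complete; in the real system this delta is sent to the host
        node with the result and mirrored into the global execution state table."""
        with self.lock:
            self.state[i] = "done"

    def all_done(self):
        with self.lock:
            return all(s == "done" for s in self.state)
```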
The flow of a job from submission to result is as follows. The job communication submodule submits a job with its priority. The job management submodule takes the highest-priority job from the task queue. The host node shards the job into first-level shards and puts them in the load queue. The host node obtains the information of a first-level shard by querying the state table, then forwards the execution information of all second-level shards in this shard to a load. On receiving the task, the load memory-maps the shard and performs second-level sharding, detects the total core count, builds a thread pool, and puts the shards into it; threads obtain second-level task shards through the load-internal state table until every entry in the table has been executed, after which the threads exit and the thread pool is destroyed. Each thread of the load node performs a reduce after completing a second-level task shard and sends the reduce result and the internal state table to the host node; the host node decides whether to send this load another shard, and if a new task must be sent, continues taking tasks and sending them to the load. When every entry in the state table has finished executing, the host node sends an exit instruction to each load; the load triggers the event signal to notify the monitor that it is waiting to exit, and all loads and the host node exit together. The host node performs a reduce operation over the results sent by the loads and sends the result to the job's submitter via the job communication submodule.
Under the node data caching policy, load nodes deployed in the cloud platform cache the jobs of the node subnets within the platform. A local index server is deployed in each subnet; it stores the jobs shared within the cloud platform it belongs to and the node list corresponding to each job. A load node registers the jobs it caches with the local index server in the identity of a resource provider. The local index server organizes online nodes into the cloud platform's subnet, helping subnet users find load nodes conveniently. A management server periodically collects job-request information from the local index server in each cloud platform. After each run of the cache algorithm, the management server generates two cache-object inventories for each cloud platform and sends the two job lists respectively to two load nodes deployed in the platform. Each node also periodically reports the current state of the jobs it stores to the management server. A load node updates its own process space according to the job list received from the management server; once some new cache object has been synchronized, the load node computes the hash value of its content and registers the job with the local index server of the cloud platform it belongs to.
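The register-after-sync step could look like this sketch; `index_server.register` is a hypothetical RPC stub, and SHA-256 is an assumption, since the patent does not name the hash algorithm.

```python
import hashlib

def register_cached_job(job_id, data, index_server):
    """After a new cache object has been synchronized, compute its content hash
    and register the job with the subnet's local index server."""
    digest = hashlib.sha256(data).hexdigest()
    index_server.register(job_id=job_id, content_hash=digest)
```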
When a user of some cloud platform starts a job, the node joins the global node subnet and the cloud platform subnet simultaneously, and reports the running status of each job to the local index server. The local index server monitors the cloud platform subnet and sends the information about each active job in the platform to the management server, including the job's unique hash identifier, the job size, the number of users that have accessed it, the number of currently available copies in the cloud platform, and the number of subnet users that most recently requested the job. From the information collected from the local index servers, the management server periodically runs a cache algorithm to derive the jobs that the load nodes in each cloud platform need to update or delete, and immediately sends the resulting job schedule to each load node. As soon as a job has been synchronized, the load node registers it with the local index server in its cloud platform; the load node then uploads it locally to serve data to the users requesting that job.
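For concreteness, the per-job fields just listed can be gathered into one report structure; all field names here are assumptions.

```python
from dataclasses import dataclass

@dataclass
class ActiveJobReport:
    """Per-job information the local index server sends to the management
    server, per the description above."""
    job_hash: str            # unique hash identifier of the job
    size_bytes: int          # job size
    accessing_users: int     # number of users that have accessed the job
    available_copies: int    # currently available copies in the cloud platform
    recent_requesters: int   # subnet users that most recently requested the job
```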
Further, job shards awaiting distribution are actively sent to nodes with free bandwidth by the following procedure. The management server sets up a decision unit which, from the information collected, decides which jobs need to be sent, shards them, sends large job shards to auxiliary nodes, and stores the jobs in the memory space allocated to users. A job is added to each chosen target node; it runs in the background alongside the small jobs the user executes independently, keeping the resource in a cached state and sharing the upload bandwidth. When the user leaves the system, the background job ends. The management server selects the target nodes and notifies each selected node to store one shard of the target job; the nodes receiving the job store the specified job shards for the cloud platform and node subnet. After the sent job shard has been stored, the target node provides uploads for other storage nodes, serving as an auxiliary node. When the distribution process finishes, the management server sends a message to all nodes that received the job, notifying them to remove the previously sent job shards from the user's process space.
In summary, the present invention proposes a big data instant scheduling method that improves the efficiency of cloud computing to meet the needs of high-performance cloud computing.
Obviously, those skilled in the art should appreciate that the modules and steps of the invention described above can be realized with a general-purpose computing system: they may be concentrated on a single computing system or distributed over a network formed by multiple computing systems, and optionally they may be realized with program code executable by a computing system, so that they can be stored in a storage system and executed by a computing system. Thus, the present invention is not restricted to any specific combination of hardware and software.
It should be appreciated that the above embodiments of the present invention are used only to exemplify or explain the principles of the invention and are not to be construed as limiting it. Therefore, any modification, equivalent substitution, improvement and the like made without departing from the spirit and scope of the invention shall be included within the scope of its protection. Furthermore, the appended claims are intended to cover all changes and modifications that fall within the scope and boundary of the claims, or the equivalents of such scope and boundary.

Claims (1)

1. A big data instant scheduling method, characterized in that it comprises:
the host node divides the job into independent tasks and maintains a task status array; after a load node starts, it applies to the host node for a task, and once a task is obtained it executes it according to a predetermined scheme; when execution completes, it feeds the task result back to the host node and applies for the next task; if the task information a load node obtains is invalid, this parallel computing task has completed successfully and the load node terminates; the job with the highest priority is taken from the task queue; the host node shards the job into first-level shards and places them in the load queue; the host node queries the state table for the information of a first-level shard, then forwards the execution information of all the second-level shards in this shard to a load node; on receiving the task, the load node memory-maps the shard and performs second-level sharding, detects the total number of cores, builds a thread pool, and places the shards into the pool; threads obtain second-level task shards through the load-internal state table until every entry in the table has been executed; each thread of the load node performs a reduce after completing a second-level task shard and sends the reduce result together with the internal state table to the host node; the host node decides whether to send this load another shard, and if a new task must be sent, continues taking tasks and sending them to the load; once every entry in the state table has finished executing, the host node sends an exit instruction to each load; the load triggers an event signal to notify the monitor that it is waiting to exit, and all loads and the host node then exit together; the host node performs a reduce operation over the results sent by the loads; the host node analyzes the incoming command-line parameters to obtain the information about this parallel-interface job; according to the host list file it dynamically allocates multiple load nodes and forms them into a communication domain, then waits for load nodes to apply for tasks or to submit task results; based on the dynamic process creation and management model of the parallel computing interface, load processes meeting the requirements of the host list file are allocated; the host node distributes tasks to load nodes through map processes and obtains the corresponding task results from load nodes through reduce processes; these intermediate results are merged to obtain the final result of this parallel computing job, which is finally fed back directly to the task submitter; once the host node detects that some load node is abnormal, it produces a new host list file and allocates a new load node to take over the abnormal one and execute its tasks; if the host node finds that every entry in the task status array is marked complete, it feeds the final job result back directly to the corresponding task submitter over a socket connection.
CN201710344085.9A 2017-05-16 2017-05-16 Big data instant scheduling method Pending CN107172149A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710344085.9A CN107172149A (en) 2017-05-16 2017-05-16 Big data instant scheduling method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710344085.9A CN107172149A (en) 2017-05-16 2017-05-16 Big data instant scheduling method

Publications (1)

Publication Number Publication Date
CN107172149A (en) 2017-09-15

Family

ID=59816776

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710344085.9A Pending CN107172149A (en) 2017-05-16 2017-05-16 Big data instant scheduling method

Country Status (1)

Country Link
CN (1) CN107172149A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109144693A (en) * 2018-08-06 2019-01-04 上海海洋大学 A kind of power adaptive method for scheduling task and system
CN111708812A (en) * 2020-05-29 2020-09-25 北京赛博云睿智能科技有限公司 Distributed data processing method
CN112637067A * 2020-12-28 2021-04-09 北京明略软件系统有限公司 Graph parallel computing system and method based on analog network broadcast

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080086442A1 (en) * 2006-10-05 2008-04-10 Yahoo! Inc. Mapreduce for distributed database processing
CN101706757A (en) * 2009-09-21 2010-05-12 中国科学院计算技术研究所 I/O system and working method facing multi-core platform and distributed virtualization environment
CN101996079A (en) * 2010-11-24 2011-03-30 南京财经大学 MapReduce programming framework operation method based on pipeline communication
CN105279241A (en) * 2015-09-29 2016-01-27 成都四象联创科技有限公司 Cloud computing based big data processing method

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080086442A1 (en) * 2006-10-05 2008-04-10 Yahoo! Inc. Mapreduce for distributed database processing
CN101706757A (en) * 2009-09-21 2010-05-12 中国科学院计算技术研究所 I/O system and working method facing multi-core platform and distributed virtualization environment
CN101996079A (en) * 2010-11-24 2011-03-30 南京财经大学 MapReduce programming framework operation method based on pipeline communication
CN105279241A (en) * 2015-09-29 2016-01-27 成都四象联创科技有限公司 Cloud computing based big data processing method

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
郭羽成 (Guo Yucheng): "Research on Key Technologies of MPI High-Performance Cloud Computing Platform" (MPI高性能云计算平台关键技术研究), China Doctoral Dissertations Full-text Database (Electronic Journal), Information Science and Technology Series *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109144693A (en) * 2018-08-06 2019-01-04 上海海洋大学 A kind of power adaptive method for scheduling task and system
CN109144693B (en) * 2018-08-06 2020-06-23 上海海洋大学 Power self-adaptive task scheduling method and system
CN111708812A (en) * 2020-05-29 2020-09-25 北京赛博云睿智能科技有限公司 Distributed data processing method
CN112637067A * 2020-12-28 2021-04-09 北京明略软件系统有限公司 Graph parallel computing system and method based on analog network broadcast

Similar Documents

Publication Publication Date Title
US10635664B2 (en) Map-reduce job virtualization
EP2810164B1 (en) Managing partitions in a scalable environment
US20160275123A1 (en) Pipeline execution of multiple map-reduce jobs
CN107168799A (en) Data-optimized processing method based on cloud computing framework
US7185096B2 (en) System and method for cluster-sensitive sticky load balancing
US8671134B2 (en) Method and system for data distribution in high performance computing cluster
JP5902716B2 (en) Large-scale storage system
US8332862B2 (en) Scheduling ready tasks by generating network flow graph using information receive from root task having affinities between ready task and computers for execution
US20050132379A1 (en) Method, system and software for allocating information handling system resources in response to high availability cluster fail-over events
US7680848B2 (en) Reliable and scalable multi-tenant asynchronous processing
US8959222B2 (en) Load balancing system for workload groups
US20080229320A1 (en) Method, an apparatus and a system for controlling of parallel execution of services
US10505791B2 (en) System and method to handle events using historical data in serverless systems
US20110314465A1 (en) Method and system for workload distributing and processing across a network of replicated virtual machines
CN108536532A (en) A kind of batch tasks processing method and system
US8082344B2 (en) Transaction manager virtualization
JP2010044552A (en) Request processing method and computer system
JP2015506523A (en) Dynamic load balancing in a scalable environment
CN1595360A (en) System and method for performing task in distributing calculating system structure
CN113014611B (en) Load balancing method and related equipment
CN107172149A (en) Big data instant scheduling method
Pattanaik et al. Dynamic fault tolerance management algorithm for VM migration in cloud data centers
Khan et al. Scheduling in Desktop Grid Systems: Theoretical Evaluation of Policies & Frameworks
US20230333880A1 (en) Method and system for dynamic selection of policy priorities for provisioning an application in a distributed multi-tiered computing environment
US20230333884A1 (en) Method and system for performing domain level scheduling of an application in a distributed multi-tiered computing environment using reinforcement learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20170915