Summary of the invention
In view of the above problems, it is necessary to provide a big data incremental iteration method, apparatus, computer device, and storage medium capable of improving big data processing efficiency.
A big data incremental iteration method, comprising:
receiving a directed acyclic graph task to be executed by a graphics processor, obtaining a data set corresponding to the directed acyclic graph task, and storing the data set into a cache in graphics processor memory;
in response to the directed acyclic graph task, performing iterative computation on the data set to obtain an iteratively computed data set, and updating the data set stored in the cache with the iteratively computed data set; and
when an incremental change occurs in the data set, performing incremental iterative computation based on the iteratively computed data set stored in the cache to obtain an incrementally iterated data set, and updating the data set in the cache with the incrementally iterated data set.
In one embodiment, the cache is a shared memory, and the method further comprises:
when it is detected that the amount of data set storage in the shared memory exceeds a preset threshold, migrating the data set in the shared memory to a global memory through a block-based sliding window mechanism.
In one embodiment, the method further comprises:
when it is detected that the incremental iterative computation corresponding to the directed acyclic graph task has stopped, migrating the data set in the global memory to a central processing unit memory.
In one embodiment, obtaining the data set corresponding to the directed acyclic graph task and storing the data set into the cache in graphics processor memory comprises:
obtaining a data set in RDD format corresponding to the directed acyclic graph task; and
performing data format conversion on the data set in RDD format to obtain a data set in G-RDD format, and storing the data set in G-RDD format into the cache in graphics processor memory.
In one embodiment, performing data format conversion on the data set in RDD format to obtain the data set in G-RDD format, and storing the data set in G-RDD format into the cache in graphics processor memory, comprises:
storing the data set in RDD format into a data buffer, performing data format conversion on the data set in RDD format retrieved from the data buffer to obtain the data set in G-RDD format, and storing the data set in G-RDD format into the cache in graphics processor memory.
In one embodiment, obtaining the data set corresponding to the directed acyclic graph task and storing the data set into the cache in graphics processor memory comprises:
reading the data set corresponding to the directed acyclic graph task from a distributed file system, and storing the read data set into graphics processor memory; and
obtaining the data set corresponding to the directed acyclic graph task from the graphics processor memory, and storing the data set into the cache in the graphics processor memory.
A big data incremental iteration apparatus, the apparatus comprising:
a task receiving module, configured to receive a directed acyclic graph task to be executed by a graphics processor, obtain a data set corresponding to the directed acyclic graph task, and store the data set into a cache in graphics processor memory;
a data update module, configured to, in response to the directed acyclic graph task, perform iterative computation on the data set to obtain an iteratively computed data set, and update the data set stored in the cache with the iteratively computed data set; and
an incremental iteration module, configured to, when an incremental change occurs in the data set, perform incremental iterative computation based on the iteratively computed data set stored in the cache to obtain an incrementally iterated data set, and update the data set in the cache with the incrementally iterated data set.
A computer device, comprising a memory and a processor, the memory storing a computer program, wherein the processor, when executing the computer program, implements the following steps:
when receiving a directed acyclic graph task to be executed by a graphics processor, obtaining a data set corresponding to the directed acyclic graph task, and storing the data set into a cache in graphics processor memory;
in response to the directed acyclic graph task, performing iterative computation on the data set to obtain an iteratively computed data set, and updating the data set stored in the cache with the iteratively computed data set; and
when an incremental change occurs in the data set, performing incremental iterative computation based on the iteratively computed data set stored in the cache to obtain an incrementally iterated data set, and updating the data set in the cache with the incrementally iterated data set.
A computer-readable storage medium, on which a computer program is stored, wherein the computer program, when executed by a processor, implements the following steps:
receiving a directed acyclic graph task to be executed by a graphics processor, obtaining a data set corresponding to the directed acyclic graph task, and storing the data set into a cache in graphics processor memory;
in response to the directed acyclic graph task, performing iterative computation on the data set to obtain an iteratively computed data set, and updating the data set stored in the cache with the iteratively computed data set; and
when an incremental change occurs in the data set, performing incremental iterative computation based on the iteratively computed data set stored in the cache to obtain an incrementally iterated data set, and updating the data set in the cache with the incrementally iterated data set.
With the above big data incremental iteration method, apparatus, computer device, and storage medium, when a directed acyclic graph task is one to be executed by a graphics processor, a data set corresponding to the directed acyclic graph task is obtained and stored into a cache in graphics processor memory. The cache resources in the graphics processor are thus fully utilized: through the caching of data-intensive data, intensive iterative computation, and incremental iterative computation, the input/output latency of low-bandwidth links is hidden and repeated computation is effectively reduced, thereby shortening the total computation time and improving big data processing efficiency.
Specific embodiment
To make the objects, technical solutions, and advantages of the present application clearer, the present application is further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are intended only to explain the present application and are not intended to limit it.
The method provided by the present application can be applied in an application environment as shown in Fig. 1. The server cluster is a GSpark computing framework server cluster, which includes a GSpark computing framework model and a distributed file system. Fig. 1a is a schematic diagram of the GSpark computing framework model. The GSpark computing framework model includes a master node (Master) and multiple working nodes (Worker Nodes); a single working node includes an executor (Executor) and a GPU manager (GPU Manager). The GSpark computing framework model is an extended Spark incremental iteration computing framework model that integrates a graphics processing unit (GPU). A driver program submits a job (Job) to the Master; the Master instructs a Worker to start the Driver (on the Master) and initializes the cluster parameters, where the SparkContext parameters are used to construct the cluster resources, including the number of computing cores and memory sizes of the CPU and GPU, the initialization parameter configuration, and the like. Fig. 1b is a data flow diagram of the GSpark computing framework server cluster. The Driver in the GSpark computing framework model divides the Job into DAG (Directed Acyclic Graph) tasks. The TaskScheduler schedules the DAG tasks based on Spark's native data locality and distributes them to the Executors on the Worker Nodes for execution; the DAG tasks are divided into DAG tasks executed by the CPU and DAG tasks executed by the GPU. When a DAG task is one executed by the GPU, the Executor on the Worker Node reads the data set corresponding to the DAG task from HDFS (Hadoop Distributed File System) and stores it into GPU memory. The GPU Manager obtains the data set corresponding to the DAG task from the GPU memory and performs data format conversion to obtain a data set in G-RDD format. The GPU Manager stores the format-converted data set into the cache for iterative computation. In response to the DAG task, iterative computation is performed on the data set to obtain an iterative computation result, and the data set stored in the cache is updated with the iterative computation result. When an incremental change occurs in the data set, incremental iterative computation is performed based on the iterative computation result stored in the cache to obtain an incrementally iterated data set, and the corresponding data set in the cache is updated with the incrementally iterated data set.
In one embodiment, as shown in Fig. 2, a big data incremental iteration method is provided. Taking the method as applied to the working node in Fig. 1 as an example, the method includes the following steps:
Step 202: receive a directed acyclic graph task to be executed by the graphics processor, obtain a data set corresponding to the directed acyclic graph task, and store the data set into the cache in graphics processor memory.
A client submits a job to the server cluster, that is, submits a Job to the Master node. RDD (Resilient Distributed Dataset) is a data abstraction of Spark that resides in memory: for example, reading a file yields an RDD, computing on the file yields an RDD, and the result set is also an RDD. Dependencies between different fragments and data can be regarded as dependencies between RDDs, and computation is computation based on RDDs. The Driver in the GSpark computing framework model divides the Job into multiple directed acyclic graphs (DAGs) according to the dependencies between RDDs. There is a series of dependency relationships between RDDs, divided into narrow dependencies and wide dependencies. The DAG is submitted to the DAGScheduler, which divides the DAG into multiple interdependent Stages according to the wide dependencies between RDDs. A Stage is a group of parallel tasks; each Stage contains one or more Tasks, which are submitted to the TaskScheduler for execution in the form of a TaskSet, and the specific Tasks are then distributed to the thread pools of the Executors on the Worker nodes for processing.
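The stage division described above can be sketched as follows. This is a toy illustration, not the DAGScheduler implementation: a lineage of dependency edges is cut at each wide (shuffle) dependency, and everything between two cuts forms one Stage. The function and variable names are hypothetical.

```python
# Illustrative sketch: splitting a linear RDD lineage into Stages at
# wide (shuffle) dependencies, in the spirit of the DAGScheduler.
def split_into_stages(lineage):
    """lineage: list of ("narrow"|"wide", op_name) dependency edges.
    A wide dependency closes the current stage and opens a new one."""
    stages, current = [], []
    for dep_kind, op in lineage:
        current.append(op)
        if dep_kind == "wide":        # shuffle boundary -> stage boundary
            stages.append(current)
            current = []
    if current:
        stages.append(current)
    return stages

lineage = [("narrow", "map"), ("narrow", "filter"),
           ("wide", "reduceByKey"), ("narrow", "map")]
print(split_into_stages(lineage))  # [['map', 'filter', 'reduceByKey'], ['map']]
```

Each inner list would then be submitted as one TaskSet of parallel tasks.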
In the Driver, the RDDs corresponding to the job are first divided into Stages by the DAGScheduler, and the lower-level scheduler TaskScheduler then interacts with the Executors. The Driver and the Executors on the Worker nodes run the job in their respective thread pools. At runtime, the Driver obtains the actual running resources of the Executors, so that the Driver and the Executors can communicate; the Driver sends the divided Tasks to the Executors over the network, where a Task refers to the business logic code.
The Executor receives the directed acyclic graph task and deserializes it to obtain the inputs and outputs of the data. For identical data fragments across the cluster, the business logic of the data is the same and only the data differ; execution is handled by the Executor's thread pool. The TaskScheduler sends the directed acyclic graph task to the Executor; after the Executor deserializes the data, it obtains the inputs and outputs of the data, that is, the business logic of the task, and what the Executor runs is the business logic code.
Step 204: respond to the directed acyclic graph task, perform iterative computation on the data set to obtain an iteratively computed data set, and update the data set stored in the cache with the iteratively computed data set.
Iteration refers to solving over the same group of data repeatedly in order to optimize some parameter by executing the same computation process, with the goal of reaching a convergent state. Each execution of the computation process is called an iteration, and the result obtained in each iteration is used as the initial value of the next iteration. Iterative computation approaches the result step by step by repeatedly executing the same group of operations; the iterative data set is the set composed of the data and the relationships between the data. Iterative computation is performed on the data set to obtain an iteratively computed data set, the data set stored in the cache is updated with the iteratively computed data set, and the updated data set continues into the next round of iterative computation.
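The iterate-and-update-cache pattern just described can be sketched in a few lines. The cache name and the particular computation (a Newton iteration for a square root) are illustrative stand-ins, not the framework's kernels: each round reads the cached data, executes the same computation process, and writes the result back as the initial value of the next round until a convergent state is reached.

```python
# Minimal sketch of iterate -> update cache -> iterate again.
cache = {"dataset": 1.0}   # initial data set in the cache (illustrative)

def iterate_once(x, a=2.0):
    return 0.5 * (x + a / x)   # one execution of the same computation process

for _ in range(20):                           # fixed iteration budget
    new = iterate_once(cache["dataset"])
    if abs(new - cache["dataset"]) < 1e-12:   # convergent state reached
        break
    cache["dataset"] = new                    # update cache for the next round

print(round(cache["dataset"], 6))  # → 1.414214 (sqrt(2))
```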
Step 206: when an incremental change occurs in the data set, perform incremental iterative computation based on the iteratively computed data set stored in the cache to obtain an incrementally iterated data set, and update the data set in the cache with the incrementally iterated data set.
After the iterative computation based on the iterative data is completed, the new iterative data generated by business growth constitute incremental data. Incremental iteration is an iterative method that obtains a new iteration result from the incremental data and the original iteration result.
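A minimal sketch of this idea, under the assumption of a simple counting workload (the names and the use of `Counter` are illustrative, not the patent's implementation): the new result is derived from the cached previous result plus the delta, so only the incremental data are processed.

```python
# Incremental iteration: combine the cached previous result with the
# incremental data instead of recomputing over the full data set.
from collections import Counter

cached_result = Counter(["a", "b", "a"])   # previous iteration result in cache

def incremental_update(cached, delta_records):
    # Only the delta is processed; the cached result is reused as-is.
    return cached + Counter(delta_records)

delta = ["b", "c"]                         # incremental data from the business
cached_result = incremental_update(cached_result, delta)
print(dict(cached_result))  # {'a': 2, 'b': 2, 'c': 1}
```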
With the above big data incremental iteration method, when the directed acyclic graph task is one executed by the graphics processor, the data set corresponding to the directed acyclic graph task is obtained and stored into the cache in graphics processor memory. The cache resources in the graphics processor are fully utilized: through the caching of data-intensive data, intensive iterative computation, and incremental iterative computation, the input/output latency of low-bandwidth links is hidden and repeated computation is effectively reduced, thereby shortening the total computation time and improving big data processing efficiency.
In one embodiment, the cache is a shared memory, and the big data incremental iteration method further includes: when it is detected that the amount of data set storage in the shared memory exceeds a preset threshold, migrating the data set in the shared memory to the global memory through a block-based sliding window mechanism. Shared memory is the user-controlled first-level cache in a graphics processing unit (GPU); physically, each SM (Streaming Multiprocessor) contains a low-latency memory pool shared by all threads in the thread block (Block) currently being executed. Global memory (unified memory) refers to the entire physical memory of all processors. When it is detected that the amount of data set storage in the shared memory exceeds the preset threshold, as shown in Fig. 3, the data set in the shared memory is migrated to the global memory through the block-based sliding window mechanism: a read window (Read Window) fetches a variable number of data blocks (chunks) each time, and the coalesced memory access mechanism of the global memory is leveraged to make full use of the communication bandwidth. This solves the problem of limited GPU memory capacity, so that programmers do not need to perform memory management manually when the underlying memory is insufficient, ensuring fast data access during computation and preventing memory overflow, effectively reducing task execution time and improving the overall throughput of the system. Specifically, when it is detected that the incremental iterative computation corresponding to the directed acyclic graph task has stopped, the data set in the global memory is migrated to the central processing unit memory, so as to further manage the memory and ensure fast data access during computation.
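The migration behavior can be sketched abstractly as follows. This is a host-side toy model, not GPU code: the window size, threshold, and names are assumptions, and Python lists stand in for the shared-memory pool and global memory. Chunks are moved in window-sized batches until the pool is back under the threshold.

```python
# Illustrative sketch: sliding-window migration from a shared-memory
# pool to global memory when a storage threshold is exceeded.
def migrate_with_sliding_window(shared, global_mem, threshold, window=3):
    """shared/global_mem: lists of data chunks. Migrate chunks in
    window-sized batches until shared memory is under the threshold."""
    while len(shared) > threshold:
        batch, shared[:] = shared[:window], shared[window:]  # slide window
        global_mem.extend(batch)          # batched write to global memory
    return shared, global_mem

shared = [f"chunk{i}" for i in range(8)]
shared, glob = migrate_with_sliding_window(shared, [], threshold=4)
print(len(shared), len(glob))  # → 2 6
```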
In one embodiment, as shown in Fig. 4, obtaining the data set corresponding to the directed acyclic graph task and storing the data set into the cache in graphics processor memory includes: Step 402, obtaining a data set in RDD format corresponding to the directed acyclic graph task; and Step 404, performing data format conversion on the data set in RDD format to obtain a data set in G-RDD format, and storing the data set in G-RDD format into the cache in graphics processor memory. An RDD (Resilient Distributed Dataset) is a special kind of collection: it supports multiple sources, has a fault-tolerance mechanism, can be cached, and supports parallel operations; one RDD represents the data set in one partition. RDDs have two kinds of operators: Transformation and Action. A Transformation is lazily evaluated: when one RDD is transformed into another, no conversion takes place immediately, and only the logical operation on the data set is recorded; an Action triggers the job and actually triggers the computation of the operators. The GPU Manager converts the RDD data into the G-RDD data type that the GPU is capable of processing.
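The Transformation/Action distinction above can be illustrated with a toy lazy pipeline (this is not Spark's API; the class and method names are hypothetical): transformations only record the logical operation, and the action is what triggers the actual computation.

```python
# Toy model of lazy transformations vs. an eager action.
class LazyDataset:
    def __init__(self, data, ops=()):
        self.data, self.ops = data, tuple(ops)

    def map(self, f):                       # Transformation: just recorded
        return LazyDataset(self.data, self.ops + (f,))

    def collect(self):                      # Action: triggers computation
        out = list(self.data)
        for f in self.ops:
            out = [f(x) for x in out]
        return out

ds = LazyDataset([1, 2, 3]).map(lambda x: x * 2).map(lambda x: x + 1)
print(ds.collect())  # → [3, 5, 7]
```

Until `collect()` is called, no element is ever touched; the two `map` calls merely extend the recorded operation list.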
In one embodiment, performing data format conversion on the data set in RDD format to obtain the data set in G-RDD format, and storing the data set in G-RDD format into the cache in graphics processor memory, includes: storing the data set in RDD format into a data buffer, performing data format conversion on the data set in RDD format retrieved from the data buffer to obtain the data set in G-RDD format, and storing the data set in G-RDD format into the cache in graphics processor memory. Since the Java Virtual Machine (JVM) heap memory (Heap) cannot communicate with the GPU directly, a data buffer is added throughout task execution to serve as a bridge between the JVM and the GPU. As shown in Fig. 5, the buffer resides in off-heap memory (Off-Heap) and provides both data buffering and data format conversion functions.
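A sketch of the buffering-plus-format-conversion step, under loose assumptions (the function name is hypothetical, and Python's `array` module stands in for an off-heap byte buffer): row-shaped records are staged in a buffer and flattened into a contiguous numeric layout that a GPU kernel could consume, which is the spirit of the RDD-to-G-RDD conversion.

```python
# Staging records in a buffer and flattening them into a contiguous,
# GPU-friendly layout (stand-in for the off-heap bridge in Fig. 5).
from array import array

def to_g_rdd(rdd_partition):
    buffer = list(rdd_partition)     # stage records in the data buffer
    flat = array("d")                # contiguous doubles, kernel-consumable
    for record in buffer:
        flat.extend(record)          # per-record format conversion
    return flat

partition = [(1.0, 2.0), (3.0, 4.0)]  # an RDD partition of rows
g_rdd = to_g_rdd(partition)
print(list(g_rdd))  # → [1.0, 2.0, 3.0, 4.0]
```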
In one embodiment, obtaining the data set corresponding to the directed acyclic graph task and storing the data set into the cache in graphics processor memory includes: reading the data set corresponding to the directed acyclic graph task from the distributed file system, and storing the read data set into graphics processor memory; and obtaining the data set corresponding to the directed acyclic graph task from the graphics processor memory, and storing the data set into the cache in the graphics processor memory. When a DAG task is one executed by the GPU, the Executor on the Worker Node reads the data set corresponding to the DAG task from HDFS (Hadoop Distributed File System) and stores it into GPU memory. The GPU Manager obtains the data set corresponding to the DAG task from the GPU memory and stores the data set into the cache for iterative computation.
In one embodiment, a big data incremental iteration method is provided. First, the method is based on the GSpark computing framework model that fuses the graphics processing unit (GPU) with Spark. Exploiting the similarity between the high concurrency of the GPU and the distributed parallel computing of Spark, the fusion of the two makes full use of the intensive computing capability of GPU general-purpose computing, thereby significantly improving the speedup of the entire iterative computation. Second, the incremental iteration process on the GPU makes full use of the cache resources in the GPU, hiding the low-bandwidth input/output latency through the caching of the computation data and intensive iterative computation. Finally, memory is swapped through the block-based sliding window mechanism, which solves the problem of limited GPU memory capacity, so that programmers do not need to perform memory management manually because of insufficient underlying memory, ensuring fast data access during computation and preventing memory overflow. The GPU parallel programming framework is the mainstream solution for co-processed computation, and the high concurrency and high throughput of the GPU have a remarkable effect on improving computational efficiency. By extending the current mainstream distributed computing framework and making full use of the accelerating role of the GPU, improved efficiency is achieved. Incremental iteration is a fast iterative algorithm: using the sparse computational dependencies in the data, it selectively recomputes parts of the model at each step instead of computing a completely new version.
A physical environment for task execution is built: the GSpark computing framework model and the distributed file system are constructed on the server cluster, and the GSpark computing framework server cluster is debugged and started. After the task starts, the client submits the job to the Master node; the Master instructs a Worker to start the Driver and initializes the cluster parameters, where the SparkContext parameters are used to construct the cluster resources, including the number of computing cores and memory sizes of the CPU and GPU, the initialization parameter configuration, and the like. The computation data are uploaded to the HDFS distributed file system, and the client submits the job to the GSpark master node Master. The Driver divides the job into DAG tasks according to the wide dependencies and submits the tasks to the TaskScheduler for task scheduling and execution; the tasks are divided into TasksetScheduler_cpu tasks executed by the CPU and TasksetScheduler_gpu tasks executed by the GPU. When a task is a TasksetScheduler_gpu task, the GPU Manager of each computing node manages and allocates the GPU computing resources of that node and places the task and the computation data into GPU memory. The GPU Manager converts the data into the G-RDD data type that the GPU can process, and the data placed into GPU memory for the first time are cached (including an Input Cache and a Result Cache) and reused in later iterations, so as to reduce the communication overhead between the GPU and the CPU. The number of iterations is determined by the required precision, and the data results in the cache are continuously updated during the iteration process. When incremental data arrive, the previous task does not need to be recomputed, and iteration proceeds directly using the data computation results of the previous round in the cache. During the iteration process, the characteristics of the GPU are fully exploited and iteration is accelerated through shared memory (Shared Memory). When the data cache in the shared memory exceeds the limit of the actual GPU memory, or a data read from the memory misses, the block-based variable sliding window mechanism is used to swap the shared-memory data to the global memory. The read window (Read Window) fetches a variable number of data blocks (chunks) each time, leveraging the coalesced memory access mechanism of the global memory to make full use of the communication bandwidth. Since the Java Virtual Machine (JVM) heap memory (Heap) cannot communicate with the GPU directly, a data buffer is added throughout task execution as a bridge between the JVM and the GPU; the buffer resides in off-heap memory (Off-Heap) and provides data buffering and data format conversion functions. After the iterative computation is completed, the data in the GPU global memory are transferred to CPU memory.
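The caching-and-reuse workflow described in this embodiment can be condensed into a small sketch. Everything here is illustrative (the cache names, the per-round "kernel", and the iteration count are assumptions, not the GSpark implementation): the first pass fills the input and result caches, and a later incremental pass reuses the cached result and processes only the delta.

```python
# Condensed sketch of: first-pass compute + cache, then incremental reuse.
input_cache, result_cache = {}, {}

def run_iterations(key, data, rounds=3):
    input_cache[key] = data                   # cached on first load
    total = result_cache.get(key, 0)          # reuse prior result if present
    for _ in range(rounds):
        total = total + sum(data)             # stand-in iterative kernel
    result_cache[key] = total                 # update Result Cache
    return total

first = run_iterations("job1", [1, 2, 3])     # full computation
incremental = run_iterations("job1", [4])     # only the delta is processed
print(first, incremental)  # → 18 30
```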
It should be understood that, although the steps in the flowcharts of Figs. 2 and 4 are shown in sequence as indicated by the arrows, these steps are not necessarily executed in the order indicated by the arrows. Unless explicitly stated herein, there is no strict order restriction on the execution of these steps, and they may be executed in other orders. Moreover, at least a part of the steps in Figs. 2 and 4 may include multiple sub-steps or multiple stages; these sub-steps or stages are not necessarily completed at the same moment but may be executed at different times, and their execution order is not necessarily sequential, but may be in turn or in alternation with other steps or with at least a part of the sub-steps or stages of other steps.
In one embodiment, a big data incremental iteration apparatus is provided. As shown in Fig. 6, the big data incremental iteration apparatus includes a task receiving module 602, a data update module 604, and an incremental iteration module 606. The task receiving module is configured to receive a directed acyclic graph task to be executed by the graphics processor, obtain a data set corresponding to the directed acyclic graph task, and store the data set into the cache in graphics processor memory. The data update module is configured to, in response to the directed acyclic graph task, perform iterative computation on the data set to obtain an iteratively computed data set, and update the data set stored in the cache with the iteratively computed data set. The incremental iteration module is configured to, when an incremental change occurs in the data set, perform incremental iterative computation based on the iteratively computed data set stored in the cache to obtain an incrementally iterated data set, and update the data set in the cache with the incrementally iterated data set.
In one embodiment, the cache is a shared memory, and the big data incremental iteration apparatus further includes a data migration module, configured to, when it is detected that the amount of data set storage in the shared memory exceeds a preset threshold, migrate the data set in the shared memory to the global memory through a block-based sliding window mechanism.
In one embodiment, the data migration module is further configured to, when it is detected that the incremental iterative computation corresponding to the directed acyclic graph task has stopped, migrate the data set in the global memory to central processing unit memory.
In one embodiment, the task receiving module is further configured to obtain a data set in RDD format corresponding to the directed acyclic graph task, perform data format conversion on the data set in RDD format to obtain a data set in G-RDD format, and store the data set in G-RDD format into the cache in graphics processor memory.
In one embodiment, the task receiving module is further configured to store the data set in RDD format into a data buffer, perform data format conversion on the data set in RDD format retrieved from the data buffer to obtain the data set in G-RDD format, and store the data set in G-RDD format into the cache in graphics processor memory.
In one embodiment, the task receiving module is further configured to read the data set corresponding to the directed acyclic graph task from the distributed file system, store the read data set into graphics processor memory, obtain the data set corresponding to the directed acyclic graph task from the graphics processor memory, and store the data set into the cache in graphics processor memory.
For specific limitations of the big data incremental iteration apparatus, reference may be made to the limitations of the big data incremental iteration method above, which will not be repeated here. Each module in the above big data incremental iteration apparatus may be implemented in whole or in part by software, hardware, or a combination thereof. The above modules may be embedded in, or independent of, a processor in a computer device in the form of hardware, or may be stored in a memory in the computer device in the form of software, so that the processor can invoke and execute the operations corresponding to the above modules.
In one embodiment, a computer device is provided. The computer device may be a server, and its internal structure may be as shown in Fig. 7. The computer device includes a processor, a memory, and a network interface connected through a system bus. The processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for the operation of the operating system and the computer program in the non-volatile storage medium. The network interface of the computer device is used to communicate with an external terminal through a network connection. The computer program, when executed by the processor, implements a big data incremental iteration method.
Those skilled in the art will understand that the structure shown in Fig. 7 is only a block diagram of the part of the structure related to the solution of the present application and does not constitute a limitation on the computer device to which the solution of the present application is applied. A specific computer device may include more or fewer components than shown in the figure, combine certain components, or have a different component arrangement.
In one embodiment, a computer device is provided, including a memory and a processor. The memory stores a computer program, and the processor, when executing the computer program, implements the steps in any embodiment of the big data incremental iteration method.
In one embodiment, a computer-readable storage medium is provided, on which a computer program is stored. The computer program, when executed by a processor, implements the steps in any embodiment of the big data incremental iteration method.
Those of ordinary skill in the art will understand that all or part of the processes in the methods of the above embodiments may be implemented by instructing relevant hardware through a computer program. The computer program may be stored in a non-volatile computer-readable storage medium, and when executed, may include the processes of the embodiments of the above methods. Any reference to memory, storage, a database, or other media used in the embodiments provided in the present application may include non-volatile and/or volatile memory. Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
The technical features of the above embodiments may be combined arbitrarily. For brevity of description, not all possible combinations of the technical features in the above embodiments are described; however, as long as there is no contradiction in a combination of these technical features, it should be considered to be within the scope of this specification.
The above embodiments express only several implementations of the present application, and their descriptions are relatively specific and detailed, but they should not therefore be construed as limiting the scope of the patent. It should be pointed out that, for those of ordinary skill in the art, various modifications and improvements can be made without departing from the concept of the present application, and these all fall within the protection scope of the present application. Therefore, the protection scope of the present application patent shall be subject to the appended claims.