CN112912849A - Graph data-based calculation operation scheduling method, system, computer-readable medium and equipment - Google Patents
Graph data-based calculation operation scheduling method, system, computer-readable medium and equipment
- Publication number
- CN112912849A (application CN201880094017.4A)
- Authority
- CN
- China
- Prior art keywords
- processor
- partitions
- partition
- calculated
- idle
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5083—Techniques for rebalancing the load in a distributed system
- G06F9/5088—Techniques for rebalancing the load in a distributed system involving task migration
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/901—Indexing; Data structures therefor; Storage structures
- G06F16/9024—Graphs; Linked lists
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/48—Program initiating; Program switching, e.g. by interrupt
- G06F9/4806—Task transfer initiation or dispatching
- G06F9/4843—Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
- G06F9/485—Task life-cycle, e.g. stopping, restarting, resuming execution
- G06F9/4856—Task life-cycle, e.g. stopping, restarting, resuming execution resumption being on a different machine, e.g. task migration, virtual machine migration
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/48—Program initiating; Program switching, e.g. by interrupt
- G06F9/4806—Task transfer initiation or dispatching
- G06F9/4843—Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
- G06F9/4881—Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5005—Allocation of resources, e.g. of the central processing unit [CPU] to service a request
- G06F9/5027—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5061—Partitioning or combining of resources
Abstract
A graph data-based calculation operation scheduling method, system, computer-readable medium, and device are provided. The scheduling method comprises: dividing the graph data into a plurality of partitions (S1); scheduling the partitions for allocation to a plurality of processors (S2); submitting the partitions in turn to the threads corresponding to the processors for computation according to the criticality of each partition (S3); determining whether an idle processor exists, a processor being an idle processor when it contains one or more idle threads (S4); having the idle processor communicate with the other processors to find their to-be-computed partitions (S5); selecting and determining the partition to be computed (S6); and migrating and binding the determined partition to a thread of the idle processor for processing (S7).
Description
The invention belongs to the technical field of data resource allocation, and in particular relates to a graph data-based calculation operation scheduling method, system, computer-readable medium, and device.
With the development of information technology and the spread of the internet, data volumes have grown explosively; in recent years in particular, the rapid growth of social networks has caused graph data to increase sharply. Graph data is data stored with a graph as its data structure, abstracted into nodes and the edges connecting them. In the real world, graph data is ubiquitous and enormous: interpersonal-relationship graphs in microblogs, web-page link graphs in search engines, geographic-information graphs in traffic systems, and so on. Analyzing and mining the information in graph data is of great significance to work in fields such as business management, production control, market analysis, engineering design, and scientific exploration.
While a graph computation runs, multiple user threads may be waiting to execute on the processors (CPUs) of the computer system at any given time. In the prior art, after a thread finishes a task it simply picks the next task at random, without evaluating the tasks' criticality; and the processors do not communicate with one another. For example, if all tasks on the threads of CPU1 have finished while the threads of CPU2 are all busy and still have tasks queued, CPU2 runs for a long time while CPU1 sits waiting, and computing resources are wasted.
In view of the above, there is an urgent need for a graph data-based calculation operation scheduling method that overcomes the shortcomings of existing graph data processing.
Disclosure of Invention
Embodiments of the invention provide a graph data-based calculation operation scheduling method, system, computer-readable medium, and device, aimed at solving the technical problem that computing resources are wasted when existing graph data is processed.
The embodiment of the invention provides a calculation operation scheduling method based on graph data, which comprises the following steps:
dividing the graph data into a plurality of partitions;
scheduling the partitions for allocation to a plurality of processors;
submitting the partitions in turn to the threads corresponding to the processors for computation according to the criticality of each partition;
judging whether an idle processor exists, a processor being an idle processor when it contains one or more idle threads;
having the idle processor communicate with the other processors to find the to-be-computed partitions held by those processors;
selecting and determining the partition to be computed; and
migrating and binding the determined to-be-computed partition to a thread of the idle processor for processing.
Further, the number of partitions allocated to each processor is greater than the number of threads corresponding to each processor.
Further, when there are multiple to-be-computed partitions, the method for selecting and determining them comprises:
evaluating loss and benefit, taking into account the execution time of each to-be-computed partition and the migration overhead between them, to obtain an evaluation result;
determining, according to the evaluation result, which specific to-be-computed partitions are allocated to the idle processor; and
determining the manner in which the to-be-computed partition data is migrated to the idle processor.
Meanwhile, an embodiment of the present invention further provides a graph data-based computation operation scheduling system, including:
a partitioning module for partitioning the graph data into a plurality of partitions;
a scheduling module to schedule the partitions for allocation to a plurality of processors;
a submitting module, configured to submit the partitions in turn to the threads corresponding to the processors for computation according to the criticality of each partition;
a judging module, configured to judge whether an idle processor exists, a processor being an idle processor when it contains one or more idle threads;
a communication module, configured to have the idle processor communicate with the other processors to find the to-be-computed partitions held by those processors;
a selection module, configured to select and determine the partition to be computed; and
a migration module, configured to migrate and bind the determined to-be-computed partition to a thread of the idle processor for processing.
Further, the number of partitions allocated to each processor by the scheduling module is greater than the number of threads corresponding to each processor.
Further, when there are multiple to-be-computed partitions, the selection module selects and determines them by:
evaluating loss and benefit, taking into account the execution time of each to-be-computed partition and the migration overhead between them, to obtain an evaluation result;
determining, according to the evaluation result, which specific to-be-computed partitions are allocated to the idle processor; and
determining the manner in which the to-be-computed partition data is migrated to the idle processor.
In addition, an embodiment of the present invention further provides a computer-readable medium storing a computer program for graph data operation, where the computer program includes instructions for causing a computer system to:
dividing the graph data into a plurality of partitions;
scheduling the partitions for allocation to a plurality of processors;
submitting the partitions in turn to the threads corresponding to the processors for computation according to the criticality of each partition;
judging whether an idle processor exists, a processor being an idle processor when it contains one or more idle threads;
having the idle processor communicate with the other processors to find the to-be-computed partitions held by those processors;
selecting and determining the partition to be computed; and
migrating and binding the determined to-be-computed partition to a thread of the idle processor for processing.
Further, the number of partitions allocated to each processor is greater than the number of threads corresponding to each processor.
Further, when there are multiple to-be-computed partitions, the method for selecting and determining them comprises:
evaluating loss and benefit, taking into account the execution time of each to-be-computed partition and the migration overhead between them, to obtain an evaluation result;
determining, according to the evaluation result, which specific to-be-computed partitions are allocated to the idle processor; and
determining the manner in which the to-be-computed partition data is migrated to the idle processor.
In addition, an embodiment of the invention provides an electronic device comprising a processor and a memory. The memory stores a computer program, and the processor executes the stored program so that the electronic device performs any of the graph data-based calculation operation scheduling methods described above.
Compared with the prior art, the scheme of the embodiment of the invention at least has the following beneficial effects:
firstly, the processors communicate with each other, which reduces each processor's waiting time;
secondly, the criticality of a partition is evaluated before a thread computes it, which reduces individual processors' execution time and avoids excessively long runs;
thirdly, evaluating and selecting among multiple to-be-computed partitions accelerates convergence and guarantees the correctness and termination of the data calculation operation.
In order to illustrate the technical solutions in the embodiments of the invention more clearly, the drawings needed for describing the embodiments are briefly introduced below. The drawings described are only some embodiments of the invention; those skilled in the art can derive other drawings from them without inventive effort.
FIG. 1 is a flow chart of a method for computing operation scheduling based on graph data according to an embodiment of the present invention;
fig. 2 is a flowchart of a method for selecting a partition to be computed in a graph data-based computation operation scheduling method according to an embodiment of the present invention.
FIG. 3 is a schematic structural diagram of a system for computing, operating and scheduling based on graph data according to an embodiment of the present invention;
FIG. 4 is a schematic structural diagram of an electronic device according to an embodiment of the invention;
icon: 21-a partitioning module; 22-a scheduling module; 23-a submitting module; 24-a judging module; 25-a communication module; 26-a selection module; 27-a migration module.
In order to make the objects, technical solutions and advantages of the present invention clearer, the present invention will be described in further detail with reference to the accompanying drawings, and it is apparent that the described embodiments are only a part of the embodiments of the present invention, not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Preferred embodiments of the present invention will be described in detail below with reference to the accompanying drawings.
Example 1
The embodiment of the invention provides a calculation operation scheduling method based on graph data, which comprises the following steps as shown in figure 1:
S1, dividing the graph data into a plurality of partitions.
The method of partitioning the graph data is not limited, provided the number of partitions is far larger than the number of threads, so that threads always have partitions left to schedule. A large number of partitions also keeps each partition small, avoiding partitions whose computational load far exceeds the others'. Preferably, the partitioning is carried out with a partitioning algorithm such as METIS, following principles such as minimizing cross-partition edges and cross-partition vertices.
In this embodiment, the graph data is partitioned into three levels in combination with a non-uniform memory access (NUMA) processing system. Specifically, the processing system comprises at least one computing device, each computing device corresponds to multiple memories, and each memory corresponds to multiple processors. The three-level partitioning proceeds as follows:
partitioning the graph data according to the number of the computing devices and the communication overhead among the computing devices to obtain a plurality of first-level partitions;
dividing each first-level partition into a plurality of second-level partitions according to the number of NUMA nodes in each computing device and the communication overhead among the NUMA nodes;
and dividing each second-level partition into a plurality of third-level partitions according to the number of the working nodes in each NUMA node and the communication overhead among the working nodes, wherein the working nodes comprise processors or processor threads.
The data within each third-level partition is thus highly correlated. Adjacent third-level partitions share edge data; once one partition updates the shared edge data, it must notify the adjacent partition by message so that it updates as well.
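The three-level split above can be illustrated with a minimal sketch. This is not the patent's implementation: a real system would use a graph partitioner such as METIS to minimize cross-partition edges, whereas here vertex IDs are simply chunked into contiguous ranges to show the device / NUMA-node / worker hierarchy; all names are illustrative.

```python
def chunk(items, n):
    """Split `items` into n roughly equal contiguous chunks."""
    k, m = divmod(len(items), n)
    out, start = [], 0
    for i in range(n):
        size = k + (1 if i < m else 0)
        out.append(items[start:start + size])
        start += size
    return out

def three_level_partition(vertices, n_devices, nodes_per_device, workers_per_node):
    # Level 1: one partition per computing device.
    level1 = chunk(vertices, n_devices)
    # Level 2: split each device partition across its NUMA nodes.
    level2 = [chunk(p, nodes_per_device) for p in level1]
    # Level 3: split each NUMA-node partition across its worker threads.
    return [[chunk(p, workers_per_node) for p in node_parts] for node_parts in level2]

# 2 devices x 2 NUMA nodes x 3 workers = 12 third-level partitions in total.
parts = three_level_partition(list(range(24)), n_devices=2,
                              nodes_per_device=2, workers_per_node=3)
```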
Graph data stores data in graph form, one of the data-structure representations best suited to high-performance storage. A graph is composed of many nodes and relationships; the relationships between nodes are an essential part of a graph database, and much associated data can be found through them, such as node sets, relationship sets, and the attribute sets of both.
S2, the partitions are scheduled for allocation to a plurality of processors.
After partitioning completes in step S1, each processor may be assigned a different number of partitions because partitions differ in data size. Each processor may correspond to multiple threads, and the number of partitions allocated to a processor may exceed its thread count. Scheduling may use an existing scheduling algorithm; in this embodiment, a scheduler controls which partitions each processor invokes.
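As one example of an "existing scheduling algorithm" that gives each processor more partitions than it has threads, the following sketch uses a greedy longest-processing-time heuristic keyed on partition data size. The patent does not fix a particular algorithm, so this choice is purely illustrative.

```python
import heapq

def schedule_partitions(partition_sizes, n_processors):
    """Greedy LPT-style heuristic: always hand the next-largest partition to
    the currently least-loaded processor, so total load stays balanced even
    though partitions differ in data size."""
    heap = [(0, p) for p in range(n_processors)]  # (total load, processor id)
    heapq.heapify(heap)
    assignment = {p: [] for p in range(n_processors)}
    for pid, size in sorted(enumerate(partition_sizes), key=lambda x: -x[1]):
        load, proc = heapq.heappop(heap)
        assignment[proc].append(pid)
        heapq.heappush(heap, (load + size, proc))
    return assignment
```

With six partitions of sizes 5, 4, 3, 3, 2, 1 and two processors, each processor ends up with three partitions totalling 9 units of work.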
And S3, the partitions are submitted in turn to the threads corresponding to the processors for computation according to the criticality of each partition.
Each thread fetches partition data from its processor and computes it; when it finishes, it fetches the next partition. At that moment the criticality of each remaining partition is evaluated, and the critical partition is chosen and executed first, so that threads compute critical partitions early and partition execution time is reduced.
Critical partitions are evaluated from runtime parameters of the data and their statistics. For example, if a first partition and a second partition are both associated with data of a third partition, the third partition may be judged critical, and threads compute the third partition preferentially.
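A hypothetical criticality heuristic matching the example above (a partition whose data more other partitions depend on is computed first) might look like this. The patent only requires that criticality be evaluated from runtime parameters and statistics, so the scoring below is an assumption.

```python
from collections import Counter

def pick_critical_partition(pending, dependencies):
    """Illustrative criticality heuristic: the pending partition that more
    other partitions depend on (share edge data with) is computed first.
    `dependencies` maps each partition id to the partitions it depends on."""
    refs = Counter()
    for targets in dependencies.values():
        for t in targets:
            refs[t] += 1
    return max(pending, key=lambda p: refs[p])

# Partitions 1 and 2 both reference data of partition 3, so 3 is critical
# and is handed to a thread first.
critical = pick_critical_partition([1, 2, 3], {1: [3], 2: [3], 3: []})  # -> 3
```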
S4, judging whether an idle processor exists; a processor containing one or more idle threads is an idle processor.
S5, the idle processor communicates with the other processors to find their to-be-computed partitions.
Here "other processors" means the processors other than the idle one; their threads may all be busy, or some may themselves be idle. When an idle processor is found, it may send a message to any other processor announcing that it has an idle thread. On receiving the message, a processor that still holds to-be-computed partitions may allocate one to the idle processor for computation. A to-be-computed partition is a partition that no thread has executed yet.
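Steps S4 and S5 can be sketched as a simple work-stealing exchange. Inter-processor messaging is modelled as a direct method call here, and the `Processor` class is an illustrative stand-in, not part of the patent.

```python
from collections import deque

class Processor:
    """Illustrative stand-in for a processor and its queue of pending
    (to-be-computed) partitions."""
    def __init__(self, pid, partitions):
        self.pid = pid
        self.pending = deque(partitions)  # partitions no thread has run yet

    def is_idle(self):
        # Stands in for "the processor contains one or more idle threads".
        return not self.pending

def steal_work(idle, others):
    """Sketch of S4/S5: the idle processor asks each other processor in turn
    for a pending partition; messaging is modelled as a direct call."""
    for other in others:
        if other.pending:
            # The busy processor hands one pending partition to the idle one.
            return other.pending.popleft()
    return None  # nothing to steal anywhere
```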
S6, selecting and determining the partition to be computed.
When there are multiple to-be-computed partitions, it must be decided how many partitions, and which ones, to migrate to the idle processor, so the candidates need to be evaluated. As shown in Fig. 2, the method for selecting and determining the to-be-computed partition specifically comprises:
S61, evaluating loss and benefit, taking into account the execution time of each to-be-computed partition and the migration overhead between them, to obtain an evaluation result.
To-be-computed partitions differ in data size and therefore in execution time, so the partition execution times under a processor must be considered as a whole, avoiding long-running partitions all landing on the same processor. Besides execution time, the migration overhead between to-be-computed partitions must be considered: it consists of transmitting the partition's data from one machine to another (communication overhead) and rebuilding the partition on the destination machine; the time cost of these two parts constitutes the migration overhead.
The evaluation result contains the loss and benefit data of each to-be-computed partition. An evaluation algorithm, such as a greedy algorithm, may be used for the computation.
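A minimal sketch of the loss/benefit evaluation of S61 and S62, under the assumption that the benefit of migrating a partition is the execution time it frees on the busy processor and the loss is the communication-plus-rebuild migration overhead. The patent only requires that loss and benefit be compared, so this scoring is illustrative.

```python
def evaluate_candidates(candidates):
    """Each candidate is (partition_id, exec_time, comm_cost, rebuild_cost).
    Loss = migration overhead (communication + rebuild); benefit = execution
    time the busy processor no longer spends. Only partitions whose benefit
    exceeds their loss are migrated, most beneficial first (greedy)."""
    chosen = []
    for pid, exec_time, comm, rebuild in candidates:
        loss = comm + rebuild
        benefit = exec_time
        if benefit > loss:
            chosen.append((pid, benefit - loss))
    return [pid for pid, _ in sorted(chosen, key=lambda x: -x[1])]
```

For example, with candidates (1, 10, 2, 3), (2, 4, 3, 3), and (3, 8, 1, 1), partition 2's loss exceeds its benefit, so only partitions 3 and 1 are selected, in that order.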
S62, determining, according to the evaluation result, which specific to-be-computed partitions are allocated to the idle processor.
Which partitions to allocate is decided by comparing each partition's loss and benefit data; for example, a partition whose benefit exceeds its loss may be allocated to the idle processor.
S63, determining the manner in which the to-be-computed partition data is migrated to the idle processor.
The idle processor may copy the data of the to-be-computed partition from disk or from another processor and bind it to an idle thread for computation.
S7, the determined to-be-computed partition is migrated and bound to a thread of the idle processor for processing.
Specifically, migration transfers the partition's data from one machine to another (communication) and rebuilds the to-be-computed partition on the destination machine, where it is processed by the thread of the idle processor to which it is bound.
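The migrate-and-bind step S7 can be sketched as follows, with serialization standing in for the network transfer between machines and a Python thread standing in for the idle processor's bound thread. This is a simplification, not the patent's mechanism.

```python
import pickle
import threading

def migrate_and_bind(partition_data, compute):
    """Sketch of S7: serialize the partition (standing in for sending it over
    the network), rebuild it on the destination machine, then bind a thread
    of the idle processor to compute the rebuilt partition."""
    wire = pickle.dumps(partition_data)   # communication: machine A -> B
    rebuilt = pickle.loads(wire)          # rebuild the partition on machine B
    result = {}

    def run():
        # The idle processor's bound thread processes the partition.
        result["value"] = compute(rebuilt)

    t = threading.Thread(target=run)
    t.start()
    t.join()
    return result["value"]
```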
In the graph data-based calculation operation scheduling method provided by this embodiment of the invention, the processors communicate with each other and the to-be-computed partitions are redistributed, which reduces each processor's waiting time; the criticality of a partition is evaluated before a thread computes it, which reduces individual processors' execution time and avoids excessively long runs; and evaluating and selecting among multiple to-be-computed partitions accelerates convergence and guarantees the correctness and termination of the data calculation operation.
Example 2
As shown in fig. 3, an embodiment of the present invention provides a graph data-based computation operation scheduling system, including:
A partitioning module 21, configured to partition the graph data into a plurality of partitions.
The method of partitioning the graph data is not limited, provided the number of partitions is far larger than the number of threads, so that threads always have partitions left to schedule. A large number of partitions also keeps each partition small, avoiding partitions whose computational load far exceeds the others'. Preferably, the partitioning is carried out with a partitioning algorithm such as METIS, following principles such as minimizing cross-partition edges and cross-partition vertices.
In this embodiment, the graph data is partitioned into three levels by combining a non-uniform memory access architecture (NUMA) processing system. Specifically, the processing system includes at least one computing device, each computing device corresponds to a plurality of memories, each memory corresponds to a plurality of processors, and the three-level partitioning specific method includes:
partitioning the graph data according to the number of the computing devices and the communication overhead among the computing devices to obtain a plurality of first-level partitions;
dividing each first-level partition into a plurality of second-level partitions according to the number of NUMA nodes in each computing device and the communication overhead among the NUMA nodes;
and dividing each second-level partition into a plurality of third-level partitions according to the number of the working nodes in each NUMA node and the communication overhead among the working nodes, wherein the working nodes comprise processors or processor threads.
The data within each third-level partition is thus highly correlated. Adjacent third-level partitions share edge data; once one partition updates the shared edge data, it must notify the adjacent partition by message so that it updates as well.
A scheduling module 22, configured to schedule the partitions for allocation to the plurality of processors. After the partitioning module 21 completes partitioning, each processor may be assigned a different number of partitions because partitions differ in data size. Each processor may correspond to multiple threads, and the number of partitions allocated to a processor may exceed its thread count. The scheduling module 22 may use an existing scheduling algorithm; a scheduler controls which partitions each processor invokes.
A submitting module 23, configured to submit the partitions in turn to the threads corresponding to the processors for computation according to the criticality of each partition. Each thread fetches partition data from its processor and computes it; when it finishes, it fetches the next partition. At that moment the criticality of each remaining partition is evaluated, and the critical partition is chosen and executed first, so that threads compute critical partitions early and partition execution time is reduced.
Critical partitions are evaluated from runtime parameters of the data and their statistics. For example, if a first partition and a second partition are both associated with data of a third partition, the third partition may be judged critical, and threads compute the third partition preferentially.
A judging module 24, configured to judge whether an idle processor exists; a processor containing one or more idle threads is an idle processor.
A communication module 25, configured to have the idle processor communicate with the other processors and find their to-be-computed partitions. When an idle processor exists, it may send a message to any other processor announcing that it has an idle thread; on receiving the message, a processor that still holds to-be-computed partitions may allocate one to the idle processor for computation.
And the selection module 26 is used for selecting and determining the partition to be calculated. When the number of the partitions to be calculated is multiple, it is specifically necessary to allocate several partitions and which partitions are allocated to the idle processor, and the selection module 26 needs to specifically evaluate the multiple partitions to be calculated. The selection module specifically comprises:
and evaluating the loss and the benefit by considering the execution time of each partition to be calculated and the migration overhead among the partitions to be calculated to obtain an evaluation result. The data size of the partition to be calculated is different, and the time required for executing the calculation is also different, at this time, the partition execution time under the processor needs to be considered as a whole, and the partition with long execution time is prevented from being distributed to the same processor. In addition to the execution time of the partitions, the migration overhead between the partitions to be computed needs to be considered, which includes two parts, namely, the data of the partitions to be computed is transmitted from one machine to another machine (communication overhead) and the partitions to be computed are reconstructed on the destination machine, and the time overhead of the two parts constitutes the migration overhead.
The evaluation result comprises the loss and benefit data of each partition to be computed. An evaluation algorithm, such as a greedy algorithm, may be used to perform this evaluation.
Determining, according to the evaluation result, which specific partitions to be computed are allocated to the idle processor. Specifically, the loss and benefit data of each partition are compared; for example, a partition whose benefit is greater than its loss may be allocated to the idle processor.
Determining the manner in which the data of the partition to be computed is migrated to the idle processor. Specifically, the idle processor may copy the data of the partition from disk or from another processor.
And a migration module 27, configured to migrate the determined partition to be computed to the idle processor and bind it to a thread there for processing. Specifically, migration transfers the data of the partition from one machine to another (communication) and rebuilds the partition on the destination machine, where it is processed by the thread of the idle processor to which it is bound.
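The two-part migration (transmit, then rebuild) followed by binding to a thread of the idle processor might look like the following sketch, using `pickle` serialization as a stand-in for network transfer; the function name and its arguments are hypothetical:

```python
import pickle
import threading

def migrate_partition(partition, idle_processor_run):
    """Sketch of the migration step: the partition's data is serialized as if
    sent over the network (communication overhead), rebuilt on the destination
    machine, and the rebuilt partition is bound to a thread of the idle
    processor for processing."""
    payload = pickle.dumps(partition)      # 1. transmit (communication overhead)
    rebuilt = pickle.loads(payload)        # 2. rebuild on the destination machine
    worker = threading.Thread(target=idle_processor_run, args=(rebuilt,))
    worker.start()                         # 3. bind to a thread of the idle processor
    worker.join()
    return rebuilt
```

In a real system the serialize/deserialize pair would be replaced by actual inter-machine communication, and the thread would come from the idle processor's pool rather than being created per migration.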
In the graph-data-based computation scheduling system provided by this embodiment of the invention, the processors communicate with one another and the partitions to be computed are reallocated, which reduces the waiting time of each processor; the criticality of each partition is evaluated before a thread computes it, which shortens the execution time of individual processors and avoids overly long runs; and evaluating and selecting among multiple partitions to be computed accelerates convergence while preserving the correctness and termination of the computation.
Example 3
An embodiment of the present invention provides a computer-readable medium storing a computer program for graph data computation, the computer program comprising instructions that cause a computer system to perform the following:
dividing graph data into a plurality of partitions;
scheduling the partitions for allocation to a plurality of processors;
sequentially submitting the partitions to the corresponding threads of the processors for computation according to the criticality of each partition;
judging whether an idle processor exists, wherein when a processor comprises one or more idle threads, the processor is an idle processor;
communicating between the idle processor and other processors, and searching for the partitions to be computed corresponding to the other processors;
selecting and determining the partition to be computed;
and migrating and binding the determined partition to be computed to a thread of the idle processor for processing.
And the number of partitions allocated to each processor is greater than the number of threads corresponding to each processor.
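This over-partitioning constraint, where each processor holds more partitions than threads so that a freed thread always has spare work to pick up, can be sketched with a simple round-robin placement. The `allocate` helper and its validity check are illustrative assumptions, not part of the patent:

```python
def allocate(partitions, num_processors, threads_per_processor):
    """Round-robin sketch: distribute partitions so that every processor
    receives more partitions than it has threads, leaving spare partitions
    that an idle thread can pick up without waiting."""
    # Requiring at least (threads + 1) partitions per processor guarantees the
    # "more partitions than threads" property even for the smallest share.
    if len(partitions) < num_processors * (threads_per_processor + 1):
        raise ValueError("need more partitions than total threads "
                         "for over-partitioned scheduling")
    assignment = {p: [] for p in range(num_processors)}
    for i, part in enumerate(partitions):
        assignment[i % num_processors].append(part)
    return assignment
```

With 8 partitions, 2 processors, and 3 threads each, every processor receives 4 partitions, so each keeps at least one spare partition beyond its thread count.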
If there are multiple partitions to be computed, the method for selecting and determining the partitions to be computed comprises the following steps:
evaluating loss and benefit by considering the execution time of each partition to be computed and the migration overhead between the partitions;
determining, according to the evaluation result, which specific partitions to be computed are allocated to the idle processor;
and determining the manner in which the data of the partitions to be computed is migrated to the idle processor.
The computer-readable storage medium may be a ROM/RAM, a magnetic disk, an optical disc, or the like, and includes instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform the methods described in the various embodiments or in portions thereof.
The computer-readable medium storing the computer program for graph data computation provided by this embodiment of the invention enables the processors to communicate with one another and reallocates the partitions to be computed, which reduces the waiting time of each processor; the criticality of each partition is evaluated before a thread computes it, which shortens the execution time of individual processors and avoids overly long runs; and evaluating and selecting among multiple partitions to be computed accelerates convergence while preserving the correctness and termination of the computation.
Example 4
An embodiment of the present invention further provides an electronic device comprising a processor 41 and a memory 42; the memory 42 stores a computer program, and the processor 41 executes the computer program stored in the memory so that the electronic device performs any one of the graph-data-based computation scheduling methods described above.
The specific principle of the graph-data-based computation scheduling method is as described in the above embodiments and is not repeated here.
The electronic device of embodiments of the present invention exists in a variety of forms, including but not limited to:
(1) Mobile communication devices, which are characterized by mobile communication capability and are primarily aimed at providing voice and data communication. Such terminals include smartphones (e.g., iPhones), multimedia phones, feature phones, and low-end phones.
(2) Ultra-mobile personal computer devices, which belong to the category of personal computers, have computation and processing functions, and generally also support mobile Internet access. Such terminals include PDA, MID, and UMPC devices, such as iPads.
(3) Portable entertainment devices, which can display and play multimedia content. Such devices include audio and video players (e.g., iPods), handheld game consoles, e-book readers, smart toys, and portable car navigation devices.
(4) Servers, which comprise a processor, a hard disk, memory, a system bus, and the like. A server is similar in architecture to a general-purpose computer, but because it must provide highly reliable services, it has higher requirements for processing capacity, stability, reliability, security, scalability, and manageability.
The electronic device provided by this embodiment of the invention enables the processors to communicate with one another and reallocates the partitions to be computed, which reduces the waiting time of each processor; the criticality of each partition is evaluated before a thread computes it, which shortens the execution time of individual processors and avoids overly long runs; and evaluating and selecting among multiple partitions to be computed accelerates convergence while preserving the correctness and termination of the computation.
The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.
Claims (10)
- A method for computation scheduling based on graph data, characterized by comprising the following steps: dividing graph data into a plurality of partitions; scheduling the partitions for allocation to a plurality of processors; sequentially submitting the partitions to the corresponding threads of the processors for computation according to the criticality of each partition; judging whether an idle processor exists, wherein when a processor comprises one or more idle threads, the processor is an idle processor; communicating between the idle processor and other processors, and searching for the partitions to be computed corresponding to the other processors; selecting and determining the partition to be computed; and migrating and binding the determined partition to be computed to a thread of the idle processor for processing.
- The method of claim 1, wherein the number of partitions allocated to each processor is greater than the number of threads corresponding to each processor.
- The graph-data-based computation scheduling method according to claim 1, wherein, if there are multiple partitions to be computed, selecting and determining the partitions to be computed comprises: evaluating loss and benefit by considering the execution time of each partition to be computed and the migration overhead between the partitions, to obtain an evaluation result; determining, according to the evaluation result, which specific partitions to be computed are allocated to the idle processor; and determining the manner in which the data of the partitions to be computed is migrated to the idle processor.
- A graph-data-based computation scheduling system, comprising: a partitioning module for dividing graph data into a plurality of partitions; a scheduling module for scheduling the partitions and allocating them to a plurality of processors; a submitting module for sequentially submitting the partitions to the threads corresponding to the processors for computation according to the criticality of each partition; a judging module for judging whether an idle processor exists, wherein when a processor comprises one or more idle threads, the processor is an idle processor; a communication module for carrying out communication between the idle processor and other processors to find the partitions to be computed corresponding to the other processors; a selection module for selecting and determining the partition to be computed; and a migration module for migrating and binding the determined partition to be computed to a thread of the idle processor for processing.
- The graph-data-based computation scheduling system of claim 4, wherein the scheduling module allocates to each processor a number of partitions greater than the number of threads corresponding to that processor.
- The graph-data-based computation scheduling system according to claim 4, wherein, if there are multiple partitions to be computed, the selecting and determining performed by the selection module comprises: evaluating loss and benefit by considering the execution time of each partition to be computed and the migration overhead between the partitions, to obtain an evaluation result; determining, according to the evaluation result, which specific partitions to be computed are allocated to the idle processor; and determining the manner in which the data of the partitions to be computed is migrated to the idle processor.
- A computer-readable medium storing a computer program for graph data computation, the computer program comprising instructions for causing a computer system to perform the following: dividing graph data into a plurality of partitions; scheduling the partitions for allocation to a plurality of processors; sequentially submitting the partitions to the corresponding threads of the processors for computation according to the criticality of each partition; judging whether an idle processor exists, wherein when a processor comprises one or more idle threads, the processor is an idle processor; communicating between the idle processor and other processors, and searching for the partitions to be computed corresponding to the other processors; selecting and determining the partition to be computed; and migrating and binding the determined partition to be computed to a thread of the idle processor for processing.
- The computer-readable medium of claim 7, wherein the number of partitions allocated to each processor is greater than the number of threads corresponding to each processor.
- The computer-readable medium of claim 7, wherein, if there are multiple partitions to be computed, selecting and determining the partitions to be computed comprises: evaluating loss and benefit by considering the execution time of each partition to be computed and the migration overhead between the partitions, to obtain an evaluation result; determining, according to the evaluation result, which specific partitions to be computed are allocated to the idle processor; and determining the manner in which the data of the partitions to be computed is migrated to the idle processor.
- An electronic device comprising a processor and a memory, wherein the memory stores a computer program and the processor executes the computer program stored in the memory, so that the electronic device performs any one of the graph-data-based computation scheduling methods described above.
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2018/097517 WO2020019315A1 (en) | 2018-07-27 | 2018-07-27 | Computational operation scheduling method employing graphic data, system, computer readable medium, and apparatus |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112912849A true CN112912849A (en) | 2021-06-04 |
Family
ID=69181176
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201880094017.4A Pending CN112912849A (en) | 2018-07-27 | 2018-07-27 | Graph data-based calculation operation scheduling method, system, computer-readable medium and equipment |
Country Status (3)
Country | Link |
---|---|
US (1) | US20210149746A1 (en) |
CN (1) | CN112912849A (en) |
WO (1) | WO2020019315A1 (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2021029871A1 (en) * | 2019-08-12 | 2021-02-18 | Hewlett-Packard Development Company, L.P. | Thread mapping |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20040019891A1 (en) * | 2002-07-25 | 2004-01-29 | Koenen David J. | Method and apparatus for optimizing performance in a multi-processing system |
US20060020701A1 (en) * | 2004-07-21 | 2006-01-26 | Parekh Harshadrai G | Thread transfer between processors |
US20060029056A1 (en) * | 2003-10-14 | 2006-02-09 | Raptor Networks Technology, Inc. | Virtual machine task management system |
US20060206881A1 (en) * | 2005-03-14 | 2006-09-14 | Dan Dodge | Process scheduler employing adaptive partitioning of critical process threads |
US20110050713A1 (en) * | 2009-09-03 | 2011-03-03 | Advanced Micro Devices, Inc. | Hardware-Based Scheduling of GPU Work |
CN104850461A (en) * | 2015-05-12 | 2015-08-19 | 华中科技大学 | NUMA-oriented virtual cpu (central processing unit) scheduling and optimizing method |
CN106445686A (en) * | 2016-09-21 | 2017-02-22 | 东软集团股份有限公司 | Resource distribution method and device |
CN107122244A (en) * | 2017-04-25 | 2017-09-01 | 华中科技大学 | A kind of diagram data processing system and method based on many GPU |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6988268B2 (en) * | 2002-04-30 | 2006-01-17 | Microsoft Corporation | IO completion architecture for user-mode networking |
KR101834195B1 (en) * | 2012-03-15 | 2018-04-13 | 삼성전자주식회사 | System and Method for Balancing Load on Multi-core Architecture |
US10019391B2 (en) * | 2015-03-20 | 2018-07-10 | International Business Machines Corporation | Preventing software thread blocking due to interrupts |
US10089155B2 (en) * | 2015-09-22 | 2018-10-02 | Advanced Micro Devices, Inc. | Power aware work stealing |
US11138516B2 (en) * | 2017-06-30 | 2021-10-05 | Visa International Service Association | GPU enhanced graph model build and scoring engine |
- 2018-07-27: CN application CN201880094017.4A filed (status: pending)
- 2018-07-27: PCT application PCT/CN2018/097517 (WO2020019315A1) filed
- 2021-01-27: US application US17/160,212 (US20210149746A1) filed (status: pending)
Also Published As
Publication number | Publication date |
---|---|
WO2020019315A1 (en) | 2020-01-30 |
US20210149746A1 (en) | 2021-05-20 |
Legal Events

Date | Code | Title | Description
---|---|---|---
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |