CN116594755B - Online scheduling method and system for multi-platform machine learning tasks - Google Patents

Online scheduling method and system for multi-platform machine learning tasks

Info

Publication number
CN116594755B
Authority
CN
China
Prior art keywords
task
platform
online
execution
tasks
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310854333.XA
Other languages
Chinese (zh)
Other versions
CN116594755A (en)
Inventor
祁纲
迟雪
陈晨
王秋菊
傅豪
储熠
王聪聪
王孟宇
彭渊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Taiji Computer Corp Ltd
Original Assignee
Taiji Computer Corp Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Taiji Computer Corp Ltd
Priority to CN202310854333.XA
Publication of CN116594755A
Application granted
Publication of CN116594755B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/48 Program initiating; Program switching, e.g. by interrupt
    • G06F9/4806 Task transfer initiation or dispatching
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention provides an online scheduling method and system for multi-platform machine learning tasks, belonging to the technical field of online scheduling. The method comprises: acquiring, in real time, the number of online platforms and the number and types of tasks corresponding to each online platform, and obtaining the task execution flow of each online platform by combining the task execution time required by tasks of the same type; analyzing the execution flows of the tasks of a plurality of online platforms to obtain a flow conflict set; acquiring a machine learning model and the number and types of tasks that the machine learning model can execute simultaneously; and parsing the flow conflict set based on the acquired result, constructing a scheduling relation network, and realizing online scheduling. Because the specific conditions of the tasks to be scheduled and the flow conflict set are obtained in real time, and the conflict set is parsed based on the machine learning model to construct the scheduling relation network, task execution conditions can be tracked in real time, conflicts during task execution are avoided, and online scheduling of multi-platform machine learning tasks is realized.

Description

Online scheduling method and system for multi-platform machine learning tasks
Technical Field
The invention relates to the technical field of machine learning, in particular to an online scheduling method and system for multi-platform machine learning tasks.
Background
Machine learning refers to a data mining technology for finding meaningful knowledge in a large amount of data. When a plurality of online platforms execute machine learning tasks simultaneously, the number and types of tasks that different platforms need to execute at the same time point differ. When the tasks to be executed reach a certain volume, a conflict event can occur; such an event can prevent some tasks from completing and, in serious cases, can even crash the system.
Therefore, the invention provides an online scheduling method and system for multi-platform machine learning tasks.
Disclosure of Invention
The invention provides an online scheduling method and system for multi-platform machine learning tasks. The specific task conditions of the online platforms are acquired in real time, and each task of the online platforms is scheduled in real time based on the flow conflict set and on the number and types of tasks that the machine learning model can execute simultaneously, so that conflict events are effectively avoided and task execution efficiency is improved.
The invention provides an online scheduling method for multi-platform machine learning tasks, comprising:
step 1: acquiring, in real time, the number of online platforms and the number and types of tasks corresponding to each online platform, and obtaining the task execution flow of each online platform by combining the task execution time required by tasks of the same type;
step 2: analyzing the execution flows of the tasks of a plurality of online platforms to obtain a flow conflict set;
step 3: acquiring a machine learning model and the number and types of tasks that the machine learning model can execute simultaneously;
step 4: parsing the flow conflict set based on the acquired result, constructing a scheduling relation network, and realizing online scheduling.
In one possible implementation manner, acquiring, in real time, the number of online platforms and the number of tasks and task types corresponding to each online platform includes:
acquiring the current state of each target platform, and counting the platforms whose current state is online to obtain the number of online platforms;
and determining the effective information of each online platform according to its operation information and task rules, and obtaining the number and types of tasks of that online platform based on the effective information.
In one possible implementation manner, in combination with task execution time required by the same type of task, an execution flow of the same platform task is obtained, including:
acquiring all historical execution time of the tasks of the same type on the same online platform, and performing data analysis to obtain standard task execution time required by the tasks of the same type;
and acquiring standard execution sequences of different types of tasks specified by the same online platform, and combining the standard task execution time of the different types of tasks to obtain the task execution flow of the same online platform.
In one possible implementation, analyzing and acquiring a set of flow conflicts between execution flows of a plurality of online platform tasks includes:
acquiring the execution content and the execution space of each execution flow at the same moment, and counting the quantity of the execution content and the total quantity of the execution space at the same moment;
when the number of the execution contents is larger than the number of the preset contents or the total execution space is larger than the total preset space, reserving all the execution contents at the corresponding time;
and obtaining a flow conflict set based on all the reserved results.
In one possible implementation, acquiring the machine learning model and the number of tasks and task types that the machine learning model can simultaneously perform includes:
acquiring a historical execution flow of each first platform, and constructing a historical flow framework, according to all the first platforms that were online when the historical online quantity was largest, the historical task number and historical task types of each first platform, and the first historical operation log recorded when the historical online quantity was largest;
based on the history flow framework, determining a same history execution set of a time period corresponding to each individual task type in each first platform;
acquiring initial non-conflict constraint conditions of each individual task type according to the same history execution set;
regarding each remaining platform excluding the first platform from all the target platforms as a second platform;
respectively acquiring a second historical operation log containing the process of executing the machine learning task when each second platform is online, acquiring a second execution flow set of each second platform based on the second historical operation log, and constructing a second flow sub-frame corresponding to the second platform;
constructing a second flow framework containing all second platforms based on all second flow sub-frames;
acquiring a second execution set of the time period corresponding to each individual task type based on the second flow framework;
performing first optimization on initial non-conflict conditions of corresponding independent task types based on the second execution set to obtain optimized non-conflict conditions;
determining an execution duration variable interval of the tasks of the same type on each second platform based on the second flow framework;
constructing and obtaining a variable matrix under the same variable of the corresponding task type based on the maximum variable value and the minimum variable value of the corresponding variable interval of the tasks of the same type of the second platform, wherein a first column of the variable matrix is a maximum variable vector, and a second column of the variable matrix is a minimum variable vector;
randomly selecting j values from the variable matrix to verify the optimized non-conflict condition;
if the verification result is valid, judging that no conflict exists, and keeping the optimized non-conflict condition unchanged;
otherwise, judging that the conflict situation exists, acquiring a minimum value in a maximum variable vector and a maximum value in a minimum variable vector in the corresponding task type, carrying out reduction limitation on the optimized non-conflict condition, and reserving the condition after the reduction limitation;
based on all the reserved conditions, the task number and task types which can be executed by the machine learning model at the same time are obtained.
In one possible implementation, resolving the set of flow conflicts based on the acquisition result includes:
analyzing the flow conflict set based on the number of tasks and task types which can be simultaneously executed by the machine learning model to obtain a task priority set, wherein the task priority set comprises: execution priorities of different task types in each platform and total task priorities of the same platform;
and arranging the execution sequence of the real-time online platform and the tasks required to be executed by the online platform based on the task priority set to obtain the task execution sequence.
In one possible implementation manner, the process conflict set is parsed based on the obtained result, and a scheduling relationship network is constructed to realize online scheduling, which includes:
determining an allowable scheduling condition of each conflict flow in the flow conflict set based on the acquired result;
establishing a scheduling relation network based on all allowable scheduling conditions;
and carrying out online scheduling on the online platform and the tasks to be executed based on the online scheduling relation network, so as to realize online scheduling of the multi-platform machine learning tasks.
The invention provides an online scheduling system for multi-platform machine learning tasks, comprising:
an information acquisition module, configured to acquire, in real time, the number of online platforms and the number and types of tasks corresponding to each online platform, and to obtain the task execution flow of each online platform by combining the task execution time required by tasks of the same type;
a conflict analysis module, configured to analyze the execution flows of the tasks of a plurality of online platforms to obtain a flow conflict set;
a data arrangement module, configured to acquire a machine learning model and the number and types of tasks that the machine learning model can execute simultaneously;
and an online scheduling module, configured to parse the flow conflict set based on the acquired result, construct a scheduling relation network, and realize online scheduling.
Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention. The objectives and other advantages of the invention will be realized and attained by the structure particularly pointed out in the written description and claims thereof as well as the appended drawings.
The technical scheme of the invention is further described in detail through the drawings and the embodiments.
Drawings
The accompanying drawings are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate the invention and together with the embodiments of the invention, serve to explain the invention. In the drawings:
FIG. 1 is a flowchart of an online scheduling method for multi-platform machine learning tasks according to an embodiment of the present invention;
FIG. 2 is a block diagram of an online scheduling system for multi-platform machine learning tasks in an embodiment of the present invention;
FIG. 3 is a diagram of a historical flow architecture in an embodiment of the invention;
FIG. 4 is a structure diagram of a second flow sub-frame in an embodiment of the present invention.
Detailed Description
The preferred embodiments of the present invention will be described below with reference to the accompanying drawings, it being understood that the preferred embodiments described herein are for illustration and explanation of the present invention only, and are not intended to limit the present invention.
An embodiment of the present invention provides an online scheduling method for multi-platform machine learning tasks, as shown in fig. 1, including:
step 1: acquiring, in real time, the number of online platforms and the number and types of tasks corresponding to each online platform, and obtaining the task execution flow of each online platform by combining the task execution time required by tasks of the same type;
step 2: analyzing the execution flows of the tasks of a plurality of online platforms to obtain a flow conflict set;
step 3: acquiring a machine learning model and the number and types of tasks that the machine learning model can execute simultaneously;
step 4: parsing the flow conflict set based on the acquired result, constructing a scheduling relation network, and realizing online scheduling.
In this embodiment, the number of online platforms may be obtained by acquiring the current states of all target platforms and counting them; the current states include online and offline states, and only the platforms in the online state are counted.
In this embodiment, the task types depend on the functions executed by the online platform, for example a picture pushing task, a video pushing task, or a weather pushing task. The task types that each platform involves are set in advance, so the task execution time required by tasks of the same type can be obtained directly. The tasks currently held by an online platform are matched against that platform, the number of tasks held is less than or equal to the maximum number the platform allows, and the task types are determined from the types of the tasks currently present on the platform.
In this embodiment, suppose for example that a platform holds tasks 1 and 2 of type 1, task 3 of type 2, and tasks 4 and 5 of type 3. The task execution times of tasks 1 and 2, of task 3, and of tasks 4 and 5 can then be obtained by matching against the task rules set by the platform, which gives the execution time of each task type among the platform's current online tasks. Because the execution order of the different task types is preset by the platform, the execution flow is: task 1 and task 2 -> task 4 and task 5 -> task 3.
In this embodiment, the task execution time is obtained by analyzing the historical execution data, and typically, standard deviation correlation calculation is performed on the time of the same type of historical task to obtain the standard task execution time.
In this embodiment, the task execution flow includes standard execution times of different tasks of different types and standard execution orders of different types of tasks.
In this embodiment, the flow conflict set characterizes the conflict situations that occurred during historical execution. It is built by collecting, over the execution history of the online platforms, the moments at which the number of execution contents is greater than the preset content number or the total execution space is greater than the preset space total, and taking the contents being executed when each conflict occurred as the flow conflict set. For example, if the preset content number is 5 and online platforms 1, 2, and 3 are executing 2, 1, and 3 contents respectively, the total of 6 execution contents exceeds the preset content number, and the task contents being executed by the online platforms at that moment are saved to form the flow conflict set.
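The conflict check described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `find_flow_conflicts`, the `timeline` layout, and the execution-space figures are assumptions introduced here; only the content counts (2, 1, 3) and the preset content number 5 come from the example in the text.

```python
def find_flow_conflicts(timeline, preset_contents, preset_space):
    """Return the moments whose total load exceeds either preset limit.

    timeline maps a moment to {platform: (contents, space)}.
    The layout is hypothetical; the source does not prescribe one.
    """
    conflict_set = {}
    for moment, loads in timeline.items():
        total_contents = sum(contents for contents, _ in loads.values())
        total_space = sum(space for _, space in loads.values())
        # A conflict occurs if EITHER limit is exceeded at this moment.
        if total_contents > preset_contents or total_space > preset_space:
            conflict_set[moment] = loads  # retain all contents at that moment
    return conflict_set


# Content counts at "t0" follow the text; space values are made up.
timeline = {
    "t0": {"platform_1": (2, 1.0), "platform_2": (1, 1.0), "platform_3": (3, 2.0)},
    "t1": {"platform_1": (1, 1.0), "platform_2": (1, 1.0), "platform_3": (1, 1.0)},
}
conflicts = find_flow_conflicts(timeline, preset_contents=5, preset_space=7.0)
print(sorted(conflicts))
```

Only moment `t0` is retained, because 2 + 1 + 3 = 6 contents exceed the preset number of 5.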
In this embodiment, machine learning refers to a technique of data mining by finding meaningful knowledge points from a large amount of data.
In this embodiment, the number and types of tasks that can be executed simultaneously are obtained as follows: a historical flow framework is constructed from the task execution situation when the historical online quantity was largest; an initial non-conflict constraint condition is obtained for each individual task type based on the historical flow framework; a second flow framework is constructed from the task execution situation under special conditions; the initial non-conflict condition is optimized; and the optimized non-conflict condition is verified.
In this embodiment, the analysis of the flow conflict set refers to obtaining a task priority set through the flow conflict set, and then sequencing the execution sequence of the tasks of the online platform through the task priority set to obtain a final task execution sequence.
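One way to picture the ordering step is a sort over two priority levels. The sketch below is only an assumption about how a task priority set might be applied: the source does not say how priorities are encoded, so the dictionaries `platform_priority` and `type_priority`, and the convention that a smaller number runs earlier, are hypothetical.

```python
# Hypothetical task priority set: smaller number = executed earlier.
platform_priority = {"platform_1": 2, "platform_2": 1}
type_priority = {
    "platform_1": {"type_1": 1, "type_2": 2},
    "platform_2": {"type_1": 2, "type_3": 1},
}

# Tasks waiting on the online platforms, as (platform, task_type) pairs.
pending = [
    ("platform_1", "type_2"),
    ("platform_2", "type_1"),
    ("platform_1", "type_1"),
    ("platform_2", "type_3"),
]


def task_execution_order(tasks):
    """Sort by total platform priority first, then by type priority."""
    return sorted(
        tasks,
        key=lambda task: (platform_priority[task[0]], type_priority[task[0]][task[1]]),
    )


order = task_execution_order(pending)
print(order)
```

Sorting on a tuple key keeps the two levels separate: the platform-level priority decides between platforms, and the type-level priority only breaks ties inside one platform.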
In this embodiment, the scheduling relationship network is established by determining allowable scheduling conditions for each conflicting flow in the flow conflict set based on the acquisition result.
In this embodiment, the tasks to be executed by the online platforms are scheduled online through the scheduling relation network.
The beneficial effects of the technical scheme are as follows: the method has the advantages that the specific number of the online platforms and the specific conditions of the tasks required to be executed by all the online platforms are obtained in real time, the real-time performance of the executed machine learning tasks can be ensured, the occurrence of conflict events in the task execution process is effectively avoided by obtaining the flow conflict set between execution flows, the maximization of the simultaneously executed task quantity can be ensured by collecting the specific conditions of the machine learning model, the resource waste is effectively avoided, the online scheduling is realized by analyzing the flow conflict set and constructing the scheduling relation network, and the task execution efficiency is improved.
The embodiment of the invention provides an online scheduling method for multi-platform machine learning tasks, which is used for acquiring the number of online platforms and the number and the type of the tasks corresponding to each online platform in real time and comprises the following steps:
acquiring the current state of each target platform, and counting the platforms whose current state is online to obtain the number of online platforms;
and determining the effective information of each online platform according to its operation information and task rules, and obtaining the number and types of tasks of that online platform based on the effective information.
In this embodiment, the current state refers to the current running state of each platform, such as online, offline, etc.
In this embodiment, the running information refers to the running state of the platform and the task condition contained in the platform.
In this embodiment, the task rules are determined by an expert for each different platform in advance, and include task types, task running times of different task types, space sizes required by task running, task running rules of different tasks, execution sequences of different task types, and the like, so that effective information can be directly determined, and the purpose of the task rules is to ensure that the platform performs reasonable execution on related tasks.
In this embodiment, the effective information includes the number of tasks to be executed by the corresponding online platform, the task type, the task execution status, and the task rules of each task.
The beneficial effects of the technical scheme are as follows: the number of online platforms is counted through the acquisition of the states of all target platforms, effective information is acquired through the analysis of the running conditions of the online platforms, the task conditions to be executed are further determined, the effectiveness of the information is effectively ensured, and convenience is provided for the acquisition of the follow-up task execution flow.
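The counting described in this embodiment can be sketched as below. The data structures are assumptions made for illustration: `platform_states` and `platform_tasks` stand in for whatever state and effective information a real platform would expose.

```python
from collections import Counter

# Hypothetical current states of the target platforms.
platform_states = {
    "platform_1": "online",
    "platform_2": "offline",
    "platform_3": "online",
}

# Hypothetical effective information: (task_id, task_type) pairs per platform.
platform_tasks = {
    "platform_1": [(1, "type_1"), (2, "type_1"), (3, "type_2")],
    "platform_2": [(4, "type_1")],
    "platform_3": [(5, "type_3")],
}


def online_platforms(states):
    """Keep only the platforms whose current state is online."""
    return sorted(name for name, state in states.items() if state == "online")


def task_summary(tasks):
    """Return the task count and a per-type tally for one platform."""
    return len(tasks), Counter(task_type for _, task_type in tasks)


online = online_platforms(platform_states)
summary = {name: task_summary(platform_tasks[name]) for name in online}
print(len(online), summary["platform_1"][0])
```

Offline platforms are dropped before any task information is read, so their tasks never enter the later flow analysis.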
The embodiment of the invention provides an online scheduling method of multi-platform machine learning tasks, which combines task execution time required by the same type of tasks to obtain execution flows of the same-platform tasks, and comprises the following steps:
acquiring all historical execution time of the tasks of the same type on the same online platform, and performing data analysis to obtain standard task execution time required by the tasks of the same type;
and acquiring standard execution sequences of different types of tasks specified by the same online platform, and combining the standard task execution time of the different types of tasks to obtain the task execution flow of the same online platform.
In this embodiment, the standard task execution time is obtained by performing data analysis through the historical execution time, and the specific calculation process is as follows:
[formula not reproduced in the source text] where T is the standard task execution time, computed from the mean of the historical execution times and the standard deviation of the historical execution times.
In this embodiment, the historical execution sequence of the tasks of the same platform and the historical task execution sequence of the tasks of the same task type and different platforms are obtained through the historical execution conditions, and the standard execution sequence is obtained in a combined mode.
In this embodiment, the task execution flow refers to the total execution sequence of different types of tasks and the execution sequence of different tasks in the same task type, which are obtained by combining the standard execution sequence and the standard task execution time, for example: task corresponding to type 1 (execution time t 1) -task corresponding to type 2 (execution time t 2).
The beneficial effects of the technical scheme are as follows: the execution condition of the tasks can be effectively estimated by acquiring the standard execution time of the tasks of the same type, the task execution flow of the same online platform is acquired by acquiring the standard execution sequence, the order of task execution is effectively ensured, and effective information is provided for the subsequent construction of the online scheduling relation network.
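A rough sketch of how a standard execution time and a per-platform execution flow might be assembled is given below. The exact formula is not available in this text, so `standard_time` uses mean plus `k` times the population standard deviation purely as a placeholder assumption; the sequential layout of the flow and all names are likewise hypothetical.

```python
from statistics import mean, pstdev


def standard_time(history, k=1.0):
    """Placeholder estimate of the standard task execution time.

    ASSUMPTION: T = mean + k * std. The source only says that T is derived
    from the mean and standard deviation of the historical execution times.
    """
    return mean(history) + k * pstdev(history)


def execution_flow(order, history_by_type):
    """Lay task types out back to back in the platform's standard order.

    ASSUMPTION: types run sequentially, one after another.
    Returns (task_type, start, end) tuples.
    """
    flow, start = [], 0.0
    for task_type in order:
        duration = standard_time(history_by_type[task_type])
        flow.append((task_type, start, start + duration))
        start += duration
    return flow


# Hypothetical historical execution times per task type.
history_by_type = {"type_1": [10.0, 12.0, 14.0], "type_2": [5.0, 5.0, 5.0]}
flow = execution_flow(["type_1", "type_2"], history_by_type)
print(flow)
```

Each entry pairs a task type with its estimated time window, which matches the "type 1 (t1), then type 2 (t2)" shape of the example in the text.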
The embodiment of the invention provides an online scheduling method of multi-platform machine learning tasks, which analyzes and acquires a flow conflict set among execution flows of a plurality of online platform tasks, and comprises the following steps:
acquiring the execution content and the execution space of each execution flow at the same moment, and counting the quantity of the execution content and the total quantity of the execution space at the same moment;
when the number of the execution contents is larger than the number of the preset contents or the total execution space is larger than the total preset space, reserving all the execution contents at the corresponding time;
and obtaining a flow conflict set based on all the reserved results.
In this embodiment, the execution content at the same time includes execution processes required by tasks of the same task type of different online platforms and execution processes required by tasks of different task types of different online platforms.
In this embodiment, the execution space refers to the operation space size required for each execution flow to run the corresponding task at the current time.
In this embodiment, the preset content number and the preset space total amount may be obtained through historical execution conditions, for example, the number of execution contents of tasks normally executed when a conflict situation occurs and the running space total amount of corresponding execution contents are counted based on the historical execution conditions, multiple times of statistics are performed, the maximum value of the number of execution contents is taken as the preset content number, and the average value of the total amount of execution spaces is taken as the preset space total amount.
In this embodiment, the reservation result refers to all execution contents when a conflict situation occurs.
The beneficial effects of the technical scheme are as follows: the historical flow conflict situation is obtained, and is subjected to arrangement and analysis to obtain a flow conflict set, so that the comprehensiveness of the flow conflict situation is guaranteed, and effective data support is provided for the follow-up construction of the online relation scheduling network.
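The derivation of the two preset limits from historical statistics can be sketched as follows. The observation values and the names `observations` and `preset_limits` are made up for illustration; the rule itself (maximum of the content counts, average of the space totals) follows the description in this embodiment.

```python
from statistics import mean

# Hypothetical repeated statistics from historical execution:
# (number of execution contents, total execution space) per observation.
observations = [(4, 6.0), (5, 8.0), (3, 7.0)]


def preset_limits(samples):
    """Maximum content count and mean total space over all observations."""
    preset_contents = max(count for count, _ in samples)
    preset_space = mean(space for _, space in samples)
    return preset_contents, preset_space


preset_contents, preset_space = preset_limits(observations)
print(preset_contents, preset_space)
```

Using the maximum for the content count and the average for the space total makes the two limits differ in strictness, so a moment can exceed one limit without exceeding the other.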
The embodiment of the invention provides an online scheduling method for multi-platform machine learning tasks, which is used for acquiring the number of tasks and the types of the tasks which can be executed by a machine learning model and the machine learning model simultaneously, and comprises the following steps:
acquiring a historical execution flow of each first platform, and constructing a historical flow framework, according to all the first platforms that were online when the historical online quantity was largest, the historical task number and historical task types of each first platform, and the first historical operation log recorded when the historical online quantity was largest;
based on the history flow framework, determining a same history execution set of a time period corresponding to each individual task type in each first platform;
acquiring initial non-conflict constraint conditions of each individual task type according to the same history execution set;
regarding each remaining platform excluding the first platform from all the target platforms as a second platform;
respectively acquiring a second historical operation log containing the process of executing the machine learning task when each second platform is online, acquiring a second execution flow set of each second platform based on the second historical operation log, and constructing a second flow sub-frame corresponding to the second platform;
constructing a second flow framework containing all second platforms based on all second flow sub-frames;
acquiring a second execution set of the time period corresponding to each individual task type based on the second flow framework;
performing first optimization on initial non-conflict conditions of corresponding independent task types based on the second execution set to obtain optimized non-conflict conditions;
determining an execution duration variable interval of the tasks of the same type on each second platform based on the second flow framework;
constructing and obtaining a variable matrix under the same variable of the corresponding task type based on the maximum variable value and the minimum variable value of the corresponding variable interval of the tasks of the same type of the second platform, wherein a first column of the variable matrix is a maximum variable vector, and a second column of the variable matrix is a minimum variable vector;
randomly selecting j values from the variable matrix to verify the optimized non-conflict condition;
if the verification result is valid, judging that no conflict exists, and keeping the optimized non-conflict condition unchanged;
otherwise, judging that the conflict situation exists, acquiring a minimum value in a maximum variable vector and a maximum value in a minimum variable vector in the corresponding task type, carrying out reduction limitation on the optimized non-conflict condition, and reserving the condition after the reduction limitation;
based on all the reserved conditions, the task number and task types which can be executed by the machine learning model at the same time are obtained.
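The variable-matrix verification in the steps above can be sketched as below. The source does not state what form the optimized non-conflict condition takes, so this sketch assumes it is an allowed interval `(low, high)` for the execution duration; the function names and the sample intervals are also hypothetical.

```python
import random


def build_variable_matrix(intervals):
    """Rows are second platforms; column 0 is the max value, column 1 the min."""
    return [[high, low] for low, high in intervals]


def verify_and_limit(matrix, condition, j, rng):
    """Check j randomly chosen matrix values against condition = (low, high).

    ASSUMPTION: the optimized non-conflict condition is modeled as an
    allowed duration interval; the source does not give its form.
    """
    values = [value for row in matrix for value in row]
    sample = rng.sample(values, j)
    low, high = condition
    if all(low <= value <= high for value in sample):
        return condition  # verification valid: keep the condition unchanged
    # Conflict: narrow the condition using the smallest value of the maximum
    # vector and the largest value of the minimum vector.
    smallest_max = min(row[0] for row in matrix)
    largest_min = max(row[1] for row in matrix)
    return (max(low, largest_min), min(high, smallest_max))


# Hypothetical duration intervals (min, max) of one task type on three platforms.
intervals = [(2.0, 6.0), (3.0, 8.0), (4.0, 7.0)]
matrix = build_variable_matrix(intervals)
rng = random.Random(0)
kept = verify_and_limit(matrix, (0.0, 10.0), j=6, rng=rng)
limited = verify_and_limit(matrix, (1.0, 7.5), j=6, rng=rng)
print(kept, limited)
```

With `j` equal to the number of matrix entries every value is checked, which makes the example deterministic; a smaller `j` gives the random spot check the text describes.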
In this embodiment, the largest historical online quantity refers to the largest total number of tasks held by the online platforms during historical operation.
In this embodiment, the first historical operation log refers to an operation log of executing each task when the number of historical online is maximum, and generally, the background automatically records related operation information.
In this embodiment, the historical execution flow refers to the execution flow of the task that needs to be executed by each online platform when the number of historical online is maximum, and includes the execution sequence and execution time of each task.
In this embodiment, the historical flow framework covers the task types, task rules, and task execution flows of each platform during historical execution. Its structure is shown in fig. 3, which contains the historical flows of platform 1, platform 2, and platform 3. The historical flow of each platform contains task type 1 and task type 2, but the execution times of task type 1 and task type 2 may differ from platform to platform. The historical flow framework contains the histories of all historically online platforms, and each task type of each platform is regarded as an individual task type. For example, a vertical dotted line is drawn at the final execution time point of task type 1 of platform 1; the contents of all task types of platforms 1, 2, and 3 that lie to the left of this line form the same history execution set for task type 1 of platform 1.
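The "vertical dotted line" selection can be illustrated with a small sketch. The tuple layout `(platform, task_type, start, end)`, the time values, and the choice to clip entries at the cutoff are assumptions made here; fig. 3 itself is not available in this text.

```python
# Hypothetical historical flows: (platform, task_type, start, end).
flows = [
    (1, "type_1", 0.0, 4.0),
    (1, "type_2", 4.0, 9.0),
    (2, "type_1", 0.0, 3.0),
    (2, "type_2", 3.0, 8.0),
    (3, "type_1", 0.0, 6.0),
    (3, "type_2", 6.0, 10.0),
]


def same_history_execution_set(entries, platform, task_type):
    """Collect everything that lies left of the reference task's end time."""
    cutoff = next(
        end for p, t, _, end in entries if p == platform and t == task_type
    )
    # Keep entries that started before the cutoff, clipped at the cutoff.
    return [
        (p, t, start, min(end, cutoff))
        for p, t, start, end in entries
        if start < cutoff
    ]


result = same_history_execution_set(flows, platform=1, task_type="type_1")
print(result)
```

Here the cutoff is 4.0, so the set contains task type 1 of all three platforms plus the first part of task type 2 on platform 2, which had already started.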
In this embodiment, the initial non-conflict constraint condition refers to the constraint, obtained for the case where the historical online number is at its maximum, under which tasks of different platforms do not conflict during execution. Specifically, the number of execution contents at the same moment must not exceed the preset content number, and the total execution space must not exceed the total preset space. For example, let the preset content number be 5 and the total preset space be 7, with 4 platforms online at the current moment: platform 1 is executing 1 content item and occupies 1 execution space; platform 2 is executing 1 item and occupies 2 spaces; platform 3 is executing 2 items and occupies 1 space; platform 4 is executing 2 items and occupies 1 space. The total content number is 6, which is greater than the preset content number, so a conflict occurs in the execution process at this moment.
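The conflict test in the example above can be written as a short sketch. The function name and the representation of each platform as a (content count, occupied space) pair are assumptions made for illustration only.

```python
def has_conflict(loads, preset_contents, preset_space):
    """Check the two limits at one moment.

    loads: one (content_count, occupied_space) pair per online platform.
    A conflict exists when either total exceeds its preset limit.
    """
    total_contents = sum(contents for contents, _ in loads)
    total_space = sum(space for _, space in loads)
    return total_contents > preset_contents or total_space > preset_space


# Platforms 1 to 4 from the example: (contents being executed, spaces occupied).
loads = [(1, 1), (1, 2), (2, 1), (2, 1)]
conflict = has_conflict(loads, preset_contents=5, preset_space=7)
```

Here the total content number is 6 against a limit of 5 while the total space is 5 against a limit of 7, so the conflict comes from the content count alone.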
In this embodiment, after the second historical operation log of a second platform is obtained, a flow set of that platform can be derived from the log, the flow set containing several pieces of flow information. Because the tasks received by each platform differ from period to period, the set of tasks executed by the platform in each period can be obtained from the second historical operation log, and this set is the corresponding second execution flow set. For example, in period 1 platform 1 needs to execute task type 1 and task type 2, and in period 2 it needs to execute task type 3 and task type 2; task type 1 and task type 2, together with task type 3 and task type 2, then form the second execution flow set.
The execution order of each platform for different task types is predetermined: for example, task type 1 is executed first and then task type 2, or task type 2 is executed first and then task type 3.
As shown in fig. 4, a01 is a second flow sub-frame of second platform 1 and a02 is another second flow sub-frame of second platform 1; a01 and a02 are placed in time-period order, and the resulting arrangement in fig. 4 is the second flow framework.
In this embodiment, the second execution set is acquired on the same principle as the first execution set; for example, for task type 1 in a01, all content of the framework to the left of the vertical line a03 forms the second execution set.
In the embodiment, the first optimization refers to obtaining the execution condition of each individual task type in a corresponding time period through a second flow framework, obtaining a second non-conflict constraint condition of a second platform through the execution condition, and optimizing and supplementing the initial non-conflict constraint condition based on the second non-conflict constraint condition to obtain an optimized non-conflict condition;
For example, the initial non-conflict constraint condition is $z_{01} \le z_1 \le z_{02}$, and the non-conflict constraint condition after the first optimization is $z_{11} \le z_1 \le z_{12}$, where $z_1$ represents the scheduled resource amount for task type 1; $z_{01}$ and $z_{02}$ represent the minimum and maximum allowed scheduled resource amounts for task type 1; and $z_{11}$ and $z_{12}$ represent the corresponding minimum and maximum allowed scheduled resource amounts after optimization.
From the second execution set, the execution information of task type 1 can be obtained, such as execution time, resource scheduling type and resource scheduling amount. If the resource scheduling amount range of task type 1 is the same as in the initial non-conflict constraint condition, the bounds are kept, i.e. $z_{11} = z_{01}$ and $z_{12} = z_{02}$; if it is not the same, the larger of $z_{01}$ and $z_{21}$ is selected as $z_{11}$ and the smaller of $z_{02}$ and $z_{22}$ is selected as $z_{12}$, where $(z_{21}, z_{22})$ is the range of the resource scheduling amount of task type 1 determined from the second execution set.
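The bound selection of the first optimization can be sketched as follows. The function name is hypothetical, and raising an error when the two ranges do not overlap is an assumption of this sketch, since the description does not address that case.

```python
def first_optimization(initial, observed):
    """Tighten the initial bounds (z01, z02) using the observed range (z21, z22).

    Returns (z11, z12): the larger of the two lower bounds and the
    smaller of the two upper bounds. Identical ranges are returned unchanged.
    """
    z01, z02 = initial
    z21, z22 = observed
    if (z21, z22) == (z01, z02):
        return initial
    z11 = max(z01, z21)
    z12 = min(z02, z22)
    if z11 > z12:
        # Assumption: disjoint ranges are treated as an error here.
        raise ValueError("initial and observed ranges do not overlap")
    return z11, z12


# Made-up numbers: the observed range is narrower on both sides.
optimized = first_optimization(initial=(2, 10), observed=(3, 8))
```

Taking the larger lower bound and the smaller upper bound is the intersection of the two ranges, so the optimized condition never allows a resource amount that either range excludes.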
In this embodiment, for example, for task type 2, the task execution duration corresponding to a01 may be screened as the right boundary and the task execution duration corresponding to a02 as the left boundary; the task execution duration matches the frame length of the corresponding task type, so the range of the task execution duration of this task type may be [t01, t02]. Many variables exist in the execution of a task, and tasks of the same type have the same variables. Therefore the execution duration of the task, the allowable adjustment amount of a given variable (scheduling variable) in the task, and so on, can be determined from the corresponding flow framework; the scheduling range is determined, all ranges are arranged, and the variable matrix is constructed:
$$B = \begin{bmatrix} b_1^{\max} & b_1^{\min} \\ \vdots & \vdots \\ b_n^{\max} & b_n^{\min} \end{bmatrix}$$
where $b_1^{\min}$ represents the minimum value of the same variable at platform 1 and $b_1^{\max}$ represents its maximum value at platform 1; $b_n^{\min}$ represents the minimum value of the same variable at platform n and $b_n^{\max}$ represents its maximum value at platform n.
The minimum value in the maximum variable vector is the minimum of $b_1^{\max}, \dots, b_n^{\max}$, and the maximum value in the minimum variable vector is the maximum of $b_1^{\min}, \dots, b_n^{\min}$; with these two values the boundary of the same variable can be effectively reduced.
For example, the optimized non-conflict condition corresponds to a range (u01, u02); in this case a value screened from $b_1^{\min}, \dots, b_n^{\min}$ is taken as the left boundary and a value screened from $b_1^{\max}, \dots, b_n^{\max}$ is taken as the right boundary, with the left boundary smaller than the right boundary.
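The verification and reduction steps can be sketched as follows, under stated assumptions: each row of the variable matrix is a (maximum, minimum) pair for one second platform; "verification" is taken to mean that every randomly selected value falls inside the optimized range; and the reduced range is additionally clipped to the original range. None of these details are fixed by the description, and the function name is hypothetical.

```python
import random


def verify_and_narrow(condition, matrix, j, rng=None):
    """Verify a range against j random matrix values and narrow it on failure.

    condition: (left, right) bounds of the optimized non-conflict condition.
    matrix: rows of (max_value, min_value), one row per second platform.
    """
    rng = rng or random.Random()
    values = [v for row in matrix for v in row]
    sample = rng.sample(values, min(j, len(values)))
    left, right = condition
    if all(left <= v <= right for v in sample):
        return condition  # verification valid: keep the condition unchanged
    # Conflict: use the minimum of the maximum vector as the right boundary
    # and the maximum of the minimum vector as the left boundary.
    new_right = min(right, min(row[0] for row in matrix))
    new_left = max(left, max(row[1] for row in matrix))
    if new_left >= new_right:
        raise ValueError("narrowed range is empty")  # assumption of this sketch
    return new_left, new_right


# Made-up matrix for three second platforms.
matrix = [(9.0, 2.0), (8.0, 3.0), (10.0, 1.0)]
kept = verify_and_narrow((0.0, 20.0), matrix, j=3)
narrowed = verify_and_narrow((0.0, 9.5), matrix, j=6)
```

In the second call all six values are sampled, the value 10.0 falls outside (0.0, 9.5), and the range is reduced to (3.0, 8.0), the interval that every platform's own range contains.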
The beneficial effects of the technical scheme are as follows: the maximum execution range under historical conditions is obtained by analyzing the case where the historical online number is at its maximum, and this range is set as the initial non-conflict constraint condition, which ensures that no conflict occurs during task execution. Optimizing the initial non-conflict condition makes the constraint more complete and avoids loopholes, and, with all possible conflict situations taken into account, ensures that the execution space is utilized to the maximum, which facilitates the subsequent construction of the online scheduling relation network.
The embodiment of the invention provides an online scheduling method of a multi-platform machine learning task, in which analyzing the flow conflict set based on the acquisition result comprises the following steps:
analyzing the flow conflict set based on the number of tasks and task types which can be simultaneously executed by the machine learning model to obtain a task priority set, wherein the task priority set comprises: execution priorities of different task types in each platform and total task priorities of the same platform;
and arranging the execution sequence of the real-time online platform and the tasks required to be executed by the online platform based on the task priority set to obtain the task execution sequence.
In this embodiment, the task priority set refers to which tasks are executed first when different types of tasks of different platforms conflict during execution; it is obtained by ordering the task execution sequences of the conflict situations in the flow conflict set and determining, through repeated verification, the execution priority order of the tasks to be executed when a conflict occurs.
In this embodiment, the task execution sequence refers to the execution order of the different types of tasks of different platforms that exist at the same time.
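How a task priority set might be turned into a task execution sequence can be sketched as a two-level sort. The dictionaries, the convention that a smaller number means a higher priority, and the function name are all assumptions; the description does not state how the priorities are encoded.

```python
def task_execution_sequence(tasks, platform_priority, type_priority):
    """Order (platform, task_type) pairs by priority.

    platform_priority: total task priority per platform.
    type_priority: execution priority per (platform, task_type).
    Smaller numbers are executed first (assumed convention).
    """
    return sorted(
        tasks,
        key=lambda t: (platform_priority[t[0]], type_priority[(t[0], t[1])]),
    )


# Made-up priorities: platform 1 outranks platform 2; within platform 1,
# task type 2 outranks task type 1.
tasks = [(2, 1), (1, 2), (1, 1), (2, 2)]
platform_priority = {1: 0, 2: 1}
type_priority = {(1, 1): 1, (1, 2): 0, (2, 1): 0, (2, 2): 1}
sequence = task_execution_sequence(tasks, platform_priority, type_priority)
```

Sorting on the platform priority first and the task-type priority second mirrors the two components of the task priority set named above.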
The beneficial effects of the technical scheme are as follows: by analyzing the flow conflict set, the execution priority order of different tasks under the condition of conflict occurrence is determined, so that the tasks with higher priority level can be executed preferentially when the conflict occurs, and effective guarantee is provided for constructing an online relation scheduling network.
The embodiment of the invention provides an online scheduling method of a multi-platform machine learning task, in which analyzing the flow conflict set based on the acquisition result, constructing a scheduling relation network and realizing online scheduling comprises the following steps:
determining an allowable scheduling condition of each conflict flow in the flow conflict set based on the acquired result;
establishing a scheduling relation network based on all allowable scheduling conditions;
and carrying out online scheduling on the online platform and the tasks to be executed based on the online scheduling relation network, so as to realize online scheduling of the multi-platform machine learning tasks.
In this embodiment, the allowable scheduling conditions include preset conditions, scheduling variables, constraint conditions, and target task amounts, where the preset conditions include: each platform is mutually independent, each platform can only acquire one online condition, each platform corresponds to a task condition and a task period one by one, and after the task starts, the platform can only execute until the task ends and can not stop;
wherein the scheduling variables include: platform variables, task variables in each platform, execution space variables and execution sequence variables; the platform variables include the number of platforms; the task variables in the platforms comprise the number and the types of the tasks contained in each platform; the execution space variables are used for distinguishing the execution space parameters required by different types of tasks; the execution sequence variable is used for representing the real-time task execution sequence;
wherein the constraint conditions include: each task can only be executed once, and each platform can only execute one task and the like in the same time period;
the target task amount is the maximum completion amount of tasks within a time period with a specified length, and specifically comprises the following steps: taking the task completion condition of a certain online platform in the same time period as a task value, and obtaining the maximum value of the total value of the number of tasks completed by the online platform in a time period with a specified length;
In this embodiment, the process of establishing the scheduling relation network is as follows:
establishing a scheduling relation network based on the preset conditions, scheduling variables, constraint conditions and target task amount;
the conflict property of each task is determined and resolved based on the allowable scheduling conditions, i.e. the conflict is eliminated so that online scheduling is satisfied; corresponding tasks require different parameters in the process of constructing the scheduling relation network.
After the conflict property is determined, it is combined with the corresponding allowable scheduling conditions and input into a network construction model, which automatically generates the scheduling relation network. The network construction model is built on different combinations of allowable scheduling conditions and conflict properties together with matched expert-set networks, so the scheduling relation network can be output directly and online scheduling carried out according to it. For example, platform 1 originally needs to execute task 1, but task 1 and task 2 in platform 1 show a time conflict; task 2 is then scheduled to platform 2 and executed by platform 2, so the time conflict is avoided and the tasks can be completed in time.
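The rescheduling example above can be illustrated with a simple greedy placement. This is not the network construction model of the embodiment, whose internals are not described; it is a stand-in that only applies the constraint that a platform executes one task at a time, and all names and time values are hypothetical.

```python
def overlaps(a, b):
    """True when half-open time intervals a and b share any time."""
    return a[0] < b[1] and b[0] < a[1]


def schedule(tasks, platforms):
    """Place each task on its home platform, or move it on a time conflict.

    tasks: list of (name, home_platform, start, end).
    Returns {name: platform}; a task that fits nowhere maps to None.
    """
    busy = {p: [] for p in platforms}
    placement = {}
    for name, home, start, end in tasks:
        # Try the home platform first, then the others in the given order.
        candidates = [home] + [p for p in platforms if p != home]
        placement[name] = None
        for p in candidates:
            if not any(overlaps((start, end), interval) for interval in busy[p]):
                busy[p].append((start, end))
                placement[name] = p
                break
    return placement


# Task 1 and task 2 both belong to platform 1 and overlap in time.
tasks = [("task1", 1, 0, 4), ("task2", 1, 2, 5)]
placement = schedule(tasks, platforms=[1, 2])
```

Task 1 stays on platform 1 and task 2 is moved to platform 2, which matches the outcome described in the example.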
The beneficial effects of the technical scheme are as follows: the allowable dispatching conditions are obtained by analyzing the obtained results, the validity of the conditions is ensured, a dispatching relation network is established based on the allowable dispatching conditions, the establishment accuracy of the dispatching relation network is ensured, and the accuracy of the online dispatching of the multi-platform machine learning tasks is further ensured.
An embodiment of the present invention provides an online scheduling system for multi-platform machine learning tasks, as shown in fig. 2, including:
an information acquisition module, configured to: acquire in real time the number of online platforms and the number and types of tasks corresponding to each online platform, and obtain the execution flow of the tasks of the same online platform in combination with the task execution time required by tasks of the same type;
a conflict analysis module, configured to: analyze and acquire a flow conflict set among the execution flows of the tasks of a plurality of online platforms;
a data arrangement module, configured to: acquire a machine learning model and the number and types of tasks which can be executed by the machine learning model at the same time;
an online scheduling module, configured to: analyze the flow conflict set based on the acquired result, construct a scheduling relation network, and realize online scheduling.
The beneficial effects of the technical scheme are as follows: acquiring in real time the specific number of online platforms and the specific tasks that each online platform needs to execute ensures that the machine learning tasks are executed in real time; acquiring the flow conflict set between execution flows effectively avoids conflict events during task execution; collecting the specific capabilities of the machine learning model ensures that the number of simultaneously executed tasks is maximized and effectively avoids wasted resources; and analyzing the flow conflict set and constructing the scheduling relation network realizes online scheduling and improves task execution efficiency.
It will be apparent to those skilled in the art that various modifications and variations can be made to the present invention without departing from the spirit or scope of the invention. Thus, it is intended that the present invention also include such modifications and alterations insofar as they come within the scope of the appended claims or the equivalents thereof.

Claims (7)

1. An online scheduling method for multi-platform machine learning tasks is characterized by comprising the following steps:
step 1: acquiring the number of online platforms and the number and the type of tasks corresponding to each online platform in real time, and acquiring the execution flow of the same online platform task by combining the task execution time required by the same type of task;
step 2: analyzing and acquiring a flow conflict set among execution flows of a plurality of online platform tasks;
step 3: acquiring a machine learning model and the number and types of tasks which can be executed by the machine learning model at the same time;
step 4: analyzing the flow conflict set based on the acquired result, constructing a scheduling relation network, and realizing online scheduling;
wherein, step 3 includes:
according to all the first platforms with the largest historical online quantity and the corresponding historical task quantity and historical task types of each first platform, and combining the first historical operation logs with the largest historical online quantity, acquiring a historical execution flow of each first platform, and constructing a historical flow framework;
based on the history flow framework, determining a same history execution set of a time period corresponding to each individual task type in each first platform;
acquiring initial non-conflict constraint conditions of each individual task type according to the same history execution set;
regarding each remaining platform excluding the first platform from all the target platforms as a second platform;
respectively acquiring a second historical operation log containing the process of executing the machine learning task when each second platform is online, acquiring a second execution flow set of each second platform based on the second historical operation log, and constructing a second flow sub-frame corresponding to the second platform;
constructing a second flow framework containing all second platforms based on all second flow sub-frames;
acquiring a second execution set of the time period corresponding to each individual task type based on the second flow framework;
performing first optimization on initial non-conflict conditions of corresponding independent task types based on the second execution set to obtain optimized non-conflict conditions;
determining an execution duration variable interval of the tasks of the same type on each second platform based on the second flow framework;
constructing and obtaining a variable matrix under the same variable of the corresponding task type based on the maximum variable value and the minimum variable value of the corresponding variable interval of the tasks of the same type of the second platform, wherein a first column of the variable matrix is a maximum variable vector, and a second column of the variable matrix is a minimum variable vector;
randomly selecting j values from the variable matrix to verify the optimized non-conflict condition;
if the verification result is valid, judging that no conflict exists, and keeping the optimized non-conflict condition unchanged;
otherwise, judging that the conflict situation exists, acquiring a minimum value in a maximum variable vector and a maximum value in a minimum variable vector in the corresponding task type, carrying out reduction limitation on the optimized non-conflict condition, and reserving the condition after the reduction limitation;
based on all the reserved conditions, the task number and task types which can be executed by the machine learning model at the same time are obtained.
2. The online scheduling method of multi-platform machine learning tasks according to claim 1, wherein in step 1, the number of online platforms, the number of tasks and the task types corresponding to each online platform are obtained in real time, and the method comprises the following steps:
acquiring the current state of each target platform, and counting the number of online states of the current state as the number of online platforms;
and determining effective information of the corresponding online platform according to the operation information and task rules of each online platform, and obtaining the task number and task type of the corresponding online platform based on the effective information.
3. The online scheduling method of multi-platform machine learning tasks according to claim 1, wherein in step 1, the execution process of the same-platform task is obtained in combination with the task execution time required by the same-type task, and the method comprises the following steps:
acquiring all historical execution time of the tasks of the same type on the same online platform, and performing data analysis to obtain standard task execution time required by the tasks of the same type;
and acquiring standard execution sequences of different types of tasks specified by the same online platform, and combining the standard task execution time of the different types of tasks to obtain the task execution flow of the same online platform.
4. The online scheduling method of multi-platform machine learning tasks according to claim 1, wherein in step 2, analyzing and acquiring a set of flow conflicts between execution flows of a plurality of online platform tasks comprises:
acquiring the execution content and the execution space of each execution flow at the same moment, and counting the quantity of the execution content and the total quantity of the execution space at the same moment;
when the number of the execution contents is larger than the number of the preset contents or the total execution space is larger than the total preset space, reserving all the execution contents at the corresponding time;
and obtaining a flow conflict set based on all the reserved results.
5. The online scheduling method of a multi-platform machine learning task according to claim 1, wherein in step 4, the analyzing the set of flow conflicts based on the obtained result includes:
analyzing the flow conflict set based on the number of tasks and task types which can be simultaneously executed by the machine learning model to obtain a task priority set, wherein the task priority set comprises: execution priorities of different task types in each platform and total task priorities of the same platform;
and arranging the execution sequence of the real-time online platform and the tasks required to be executed by the online platform based on the task priority set to obtain the task execution sequence.
6. The online scheduling method of a multi-platform machine learning task according to claim 1, wherein in step 4, the flow conflict set is parsed based on the obtained result, a scheduling relationship network is constructed, and online scheduling is implemented, which includes:
determining an allowable scheduling condition of each conflict flow in the flow conflict set based on the acquired result;
establishing a scheduling relation network based on all allowable scheduling conditions;
and carrying out online scheduling on the online platform and the tasks to be executed based on the online scheduling relation network, so as to realize online scheduling of the multi-platform machine learning tasks.
7. An online scheduling system for multi-platform machine learning tasks, comprising:
an information acquisition module, configured to: acquire in real time the number of online platforms and the number and types of tasks corresponding to each online platform, and obtain the execution flow of the tasks of the same online platform in combination with the task execution time required by tasks of the same type;
a conflict analysis module, configured to: analyze and acquire a flow conflict set among the execution flows of the tasks of a plurality of online platforms;
a data arrangement module, configured to: acquire a machine learning model and the number and types of tasks which can be executed by the machine learning model at the same time;
an online scheduling module, configured to: analyze the flow conflict set based on the acquired result, construct a scheduling relation network, and realize online scheduling;
the data arrangement module is used for:
according to all the first platforms with the largest historical online quantity and the corresponding historical task quantity and historical task types of each first platform, and combining the first historical operation logs with the largest historical online quantity, acquiring a historical execution flow of each first platform, and constructing a historical flow framework;
based on the history flow framework, determining a same history execution set of a time period corresponding to each individual task type in each first platform;
acquiring initial non-conflict constraint conditions of each individual task type according to the same history execution set;
regarding each remaining platform excluding the first platform from all the target platforms as a second platform;
respectively acquiring a second historical operation log containing the process of executing the machine learning task when each second platform is online, acquiring a second execution flow set of each second platform based on the second historical operation log, and constructing a second flow sub-frame corresponding to the second platform;
constructing a second flow framework containing all second platforms based on all second flow sub-frames;
acquiring a second execution set of the time period corresponding to each individual task type based on the second flow framework;
performing first optimization on initial non-conflict conditions of corresponding independent task types based on the second execution set to obtain optimized non-conflict conditions;
determining an execution duration variable interval of the tasks of the same type on each second platform based on the second flow framework;
constructing and obtaining a variable matrix under the same variable of the corresponding task type based on the maximum variable value and the minimum variable value of the corresponding variable interval of the tasks of the same type of the second platform, wherein a first column of the variable matrix is a maximum variable vector, and a second column of the variable matrix is a minimum variable vector;
randomly selecting j values from the variable matrix to verify the optimized non-conflict condition;
if the verification result is valid, judging that no conflict exists, and keeping the optimized non-conflict condition unchanged;
otherwise, judging that the conflict situation exists, acquiring a minimum value in a maximum variable vector and a maximum value in a minimum variable vector in the corresponding task type, carrying out reduction limitation on the optimized non-conflict condition, and reserving the condition after the reduction limitation;
based on all the reserved conditions, the task number and task types which can be executed by the machine learning model at the same time are obtained.
CN202310854333.XA 2023-07-13 2023-07-13 Online scheduling method and system for multi-platform machine learning tasks Active CN116594755B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310854333.XA CN116594755B (en) 2023-07-13 2023-07-13 Online scheduling method and system for multi-platform machine learning tasks

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310854333.XA CN116594755B (en) 2023-07-13 2023-07-13 Online scheduling method and system for multi-platform machine learning tasks

Publications (2)

Publication Number Publication Date
CN116594755A CN116594755A (en) 2023-08-15
CN116594755B true CN116594755B (en) 2023-09-22

Family

ID=87612021

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310854333.XA Active CN116594755B (en) 2023-07-13 2023-07-13 Online scheduling method and system for multi-platform machine learning tasks

Country Status (1)

Country Link
CN (1) CN116594755B (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant