CN115952054A - Simulation task resource management method, device, equipment and medium - Google Patents

Simulation task resource management method, device, equipment and medium

Info

Publication number
CN115952054A
Authority
CN
China
Prior art keywords
resource
simulation
tenant
data
quota
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211658949.1A
Other languages
Chinese (zh)
Inventor
王红宾
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Weride Technology Co Ltd
Original Assignee
Guangzhou Weride Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou Weride Technology Co Ltd filed Critical Guangzhou Weride Technology Co Ltd
Priority to CN202211658949.1A priority Critical patent/CN115952054A/en
Publication of CN115952054A publication Critical patent/CN115952054A/en
Pending legal-status Critical Current

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention discloses a method, a device, equipment and a medium for managing simulation task resources. The method comprises the following steps: summarizing the resource data of the simulation service devices based on a target monitoring platform, and determining the global used resource data and the global unused resource data of the simulation service cluster; determining a resource quota predicted value corresponding to each tenant based on the historical resource usage data of each tenant, the global resource data, and a pre-trained resource quota prediction model; and, for each tenant, when a simulation task of the current tenant is received, determining, based on the to-be-used resource data corresponding to the simulation task and the resource quota predicted value of the current tenant, whether to send the simulation task to the simulation service cluster for processing. This addresses the technical problem that, when simulation tasks are run, a small number of users occupying a large amount of resources causes a resource shortage across the whole platform. It improves resource allocation efficiency and reduces the resource waste caused by unreasonable allocation.

Description

Simulation task resource management method, device, equipment and medium
Technical Field
The present invention relates to the field of data processing technologies, and in particular, to a method, an apparatus, a device, and a medium for managing simulation task resources.
Background
In the autonomous driving industry, large-scale simulation is required to improve the performance of autonomous driving algorithm models. Engineers in different teams therefore need to run a large number of simulation tasks to optimize these models, and this process consumes a great deal of computing power and storage resources.
At present, resources can be allocated in the order in which users apply for them, or a priority can be set for each simulation task so that high-priority simulation tasks use the resources of the machine cluster first.
However, under these approaches, users who apply later must wait for resources to become idle before they can run their simulation tasks. A team or an individual can therefore seize a large amount of resources and leave the whole platform short of resources, which delays the simulation tasks of other users and reduces overall working efficiency.
Disclosure of Invention
The invention provides a method, a device, equipment and a medium for managing simulation task resources, which improve the efficiency of resource allocation and reduce resource waste caused by unreasonable allocation.
In a first aspect, the present invention provides a method for managing simulation task resources, including:
summarizing the resource data corresponding to the at least one simulation service device based on a target monitoring platform, and determining global resource data corresponding to the simulation service cluster; the global resource data comprise global used resource data and global unused resource data;
determining a resource quota predicted value corresponding to each tenant based on the historical resource usage data of each tenant, the global resource data, and a pre-trained resource quota prediction model;
for each tenant, if a simulation task corresponding to the current tenant is received, determining, based on the to-be-used resource data corresponding to the simulation task and the resource quota predicted value of the current tenant, whether to send the simulation task to the simulation service cluster, so as to process the simulation task based on the simulation service cluster.
In a second aspect, the present invention provides an apparatus for managing simulation task resources, the apparatus comprising:
the global data determining module is used for summarizing the resource data corresponding to the at least one simulation service device based on the target monitoring platform and determining the global resource data corresponding to the simulation service cluster; the global resource data comprise global used resource data and global unused resource data;
the resource quota determining module is used for determining a resource quota predicted value corresponding to each tenant based on the historical resource usage data of each tenant, the global resource data, and a pre-trained resource quota prediction model;
and the simulation task determining module is used for, for each tenant, when a simulation task corresponding to the current tenant is received, determining, based on the to-be-used resource data corresponding to the simulation task and the resource quota predicted value of the current tenant, whether to send the simulation task to the simulation service cluster, so as to process the simulation task based on the simulation service cluster.
In a third aspect, the present invention provides an electronic device for data processing, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores a computer program executable by the at least one processor, the computer program being executed by the at least one processor to enable the at least one processor to perform the simulation task resource management method of any of the embodiments of the invention.
In a fourth aspect, the present invention provides a computer-readable storage medium storing computer instructions for causing a processor to implement the method for managing simulation task resources according to any one of the embodiments of the present invention when the computer instructions are executed.
In a fifth aspect, the invention provides a computer program product comprising a computer program which, when executed by a processor, implements the simulation task resource management method of any of the embodiments of the invention.
The technical scheme provided by the embodiment of the invention is applied to a simulation service cluster that comprises at least one simulation service device. The method comprises the following steps. First, the resource data corresponding to the at least one simulation service device is summarized based on a target monitoring platform, and the global used resource data and the global unused resource data corresponding to the simulation service cluster are determined. Next, a resource quota predicted value corresponding to each tenant is determined based on the historical resource usage data of each tenant, the global resource data, and a pre-trained resource quota prediction model. Finally, for each tenant, if a simulation task corresponding to the current tenant is received, whether to send the simulation task to the simulation service cluster is determined based on the to-be-used resource data corresponding to the simulation task and the resource quota predicted value of the current tenant, so that the simulation task is processed based on the simulation service cluster.
It should be understood that the statements in this section are not intended to identify key or critical features of the embodiments of the present invention, nor are they intended to limit the scope of the invention. Other features of the present invention will become apparent from the following description.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings required to be used in the description of the embodiments are briefly introduced below, and it is obvious that the drawings in the description below are only some embodiments of the present invention, and it is obvious for those skilled in the art that other drawings can be obtained according to the drawings without creative efforts.
Fig. 1 is a flowchart of a simulation task resource management method according to an embodiment of the present invention;
Fig. 2 is a schematic processing procedure diagram of a simulation task resource management method according to an embodiment of the present invention;
Fig. 3 is a schematic diagram illustrating a relationship between a system administrator, a tenant, and a user according to an embodiment of the present invention;
Fig. 4 is a flowchart of a simulation task resource management method according to a second embodiment of the present invention;
Fig. 5 is a schematic structural diagram of a simulation task resource management apparatus according to a third embodiment of the present invention;
Fig. 6 is a schematic structural diagram of an electronic device according to a fourth embodiment of the present invention.
Detailed Description
In order to make the technical solutions of the present invention better understood, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that the terms "first preset condition", "second preset condition", and the like in the description and the claims of the present invention and the drawings are used for distinguishing similar objects and are not necessarily used for describing a specific order or sequence. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the invention described herein are capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
Example one
Before the technical solution is introduced, an application scenario is described by way of example. Automatic driving is based on an automatic driving algorithm, and before the algorithm is formally applied to an actual driving scene, large-scale simulation is needed to improve the performance of the automatic driving algorithm model. Simulation tasks are run on a designated machine cluster and consume a great deal of computing power and storage resources, while the total amount of resources available to simulation tasks is relatively fixed. A team or an individual occupying a large amount of resources can therefore leave the whole platform short of resources, which delays the simulation tasks of other users and reduces overall working efficiency. The invention provides a resource quota management scheme that combines predicted resource quotas with manual allocation, which helps the system resources of the whole simulation platform to be used reasonably.
Fig. 1 is a flowchart of a simulation task resource management method according to an embodiment of the present invention. This embodiment is applicable to situations in which the system resources of a whole simulation platform need to be allocated reasonably. The method can be executed by a simulation task resource management device, which can be implemented in hardware and/or software and can be configured on a computer device such as a notebook computer, a desktop computer, or a tablet. As shown in fig. 1, the method includes:
S110, summarizing the resource data corresponding to the at least one simulation service device based on the target monitoring platform, and determining the global resource data corresponding to the simulation service cluster.
The simulation service device is a hardware device for performing simulation tasks, and may be a high-performance server, for example. There may be one or more simulation service devices, and a plurality of simulation service devices can form a simulation service cluster. The resource data represents the capacity of the CPU, memory, GPU, disk, and similar resources of a simulation service device.
The target monitoring platform is a server that monitors the resource data of the simulation service devices. The target monitoring platform comprises monitoring devices that communicate with the simulation service devices. For example, each simulation service device is provided with an agent plugin that is in communication connection with the target monitoring platform and collects the resource data of that simulation service device.
The global resource data is the resource data of the whole simulation service cluster, and comprises global used resource data and global unused resource data. The global used resource data describes resources that are already being used for simulation tasks and can no longer be allocated; the global unused resource data describes resources that are not yet used for simulation tasks and can still be allocated.
On the basis of the foregoing embodiment, determining global resource data corresponding to the simulation service cluster may include: for each simulation service device, receiving local resource data corresponding to the current simulation service device based on the monitoring device corresponding to the current simulation service device; and determining global resource data of the simulation service cluster based on the local resource data of each simulation service device.
The local resource data is specific to each simulation service device and includes consumed resource data and available resource data. For a given simulation service device, the consumed resource data describes the resources that the device is already using for simulation tasks; the available resource data describes the resources that the device is not yet using for simulation tasks and that can still be allocated.
Specifically, a schematic diagram of the processing procedure of the simulation task resource management method provided by the embodiment of the present invention is shown in fig. 2. The simulation service cluster comprises a plurality of simulation service devices. The monitoring device corresponding to each simulation service device collects the local resource data of that device and transmits it to the target monitoring platform. The target monitoring platform then summarizes the local resource data of all simulation service devices to determine the global resource data of the simulation service cluster.
Illustratively, the simulation service cluster includes 3 simulation service devices, namely simulation service device 1, simulation service device 2, and simulation service device 3, and each simulation service device is configured with a corresponding monitoring device. A cycle duration may be set in advance, for example 1 second. Every 1 second, the monitoring devices transmit the local resource data of each simulation service device to the target monitoring platform. The target monitoring platform summarizes the local resource data of simulation service device 1, simulation service device 2, and simulation service device 3 to determine the global resource data of the simulation service cluster.
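The summarizing step performed by the target monitoring platform can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation described in the patent: the `LocalResourceData` and `GlobalResourceData` types, their field names, and the use of a single number per device (rather than separate CPU, memory, GPU, and disk figures) are all hypothetical simplifications.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class LocalResourceData:
    """One report from a device's monitoring agent (hypothetical shape)."""
    device_id: str
    consumed: float   # resources already used by simulation tasks
    available: float  # resources still free for allocation


@dataclass(frozen=True)
class GlobalResourceData:
    used: float
    unused: float


def summarize(reports: Iterable[LocalResourceData]) -> GlobalResourceData:
    """Sum per-device reports into cluster-wide used/unused totals."""
    used = 0.0
    unused = 0.0
    for report in reports:
        used += report.consumed
        unused += report.available
    return GlobalResourceData(used=used, unused=unused)


# Illustrative values for a three-device cluster (made up for this sketch).
reports = [
    LocalResourceData("device-1", consumed=40.0, available=60.0),
    LocalResourceData("device-2", consumed=10.0, available=90.0),
    LocalResourceData("device-3", consumed=50.0, available=150.0),
]
global_data = summarize(reports)
```

In a periodic setup, `summarize` would be called once per reporting cycle with the latest report from each device.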
S120, determining a resource quota predicted value corresponding to each tenant based on the historical resource usage data of each tenant, the global resource data, and a pre-trained resource quota prediction model.
The tenant is a virtual concept, and a tenant can be understood as a team, and a team includes at least one user. Each user in the tenant has a requirement for performing simulation tasks based on the resources of the simulation service device. The number of tenants may be one or more, and is not particularly limited herein.
The historical resource usage data is the resource data of each tenant prior to the current resource allocation. It comprises at least one of the tenant's historical quota, the tenant's historical actual resource usage, and the quota adjustment record data of each user in the tenant. In this embodiment, a period duration may be preset, and resource quotas are allocated once per period, for example once a week. The tenant's historical quota is the resource quota corresponding to each period within a preset time before the current resource allocation. For example, if the preset time is one month and the preset period duration is one week, so that the month contains 4 weeks, then for one tenant the historical quota can be represented as [historical quota 1, historical quota 2, historical quota 3, historical quota 4], where the first element is the resource quota the tenant obtained in the first week, the second element is the resource quota obtained in the second week, the third element is the resource quota obtained in the third week, and the fourth element is the resource quota obtained in the fourth week. The tenant's historical actual resource usage is the actual resource usage of the tenant in each period within the preset time before the current resource allocation. It is represented in the same way as the tenant's historical quota, and the details are not repeated here.
For a user in the tenant, if the user's resource quota is adjusted within a period, the quota adjustment record data of that user can be represented as "from 100 to 150". This indicates that the resource quota of 100 obtained by the user was not enough to complete the simulation task, so the user applied for an additional 50.
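A quota adjustment record of the form shown above could be turned into numbers before it is fed to a model. The sketch below is only an assumption about how such a record might be parsed: the `QuotaAdjustment` type and the `parse_adjustment` helper are hypothetical, and the pattern tolerates the record being written with or without spaces.

```python
import re
from typing import NamedTuple


class QuotaAdjustment(NamedTuple):
    old_quota: int
    new_quota: int

    @property
    def extra_requested(self) -> int:
        """Additional quota the user applied for."""
        return self.new_quota - self.old_quota


# Accepts both "from100to150" and "from 100 to 150".
_RECORD = re.compile(r"from\s*(\d+)\s*to\s*(\d+)")


def parse_adjustment(record: str) -> QuotaAdjustment:
    """Parse a quota adjustment record into its old and new quota values."""
    match = _RECORD.fullmatch(record.strip())
    if match is None:
        raise ValueError(f"unrecognized quota adjustment record: {record!r}")
    return QuotaAdjustment(int(match.group(1)), int(match.group(2)))


adjustment = parse_adjustment("from 100 to 150")
```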
The resource quota prediction model can be a pre-trained model that processes the historical resource usage data of each tenant and the global resource data to obtain the resource quota predicted value corresponding to each tenant. The resource quota prediction model may employ any neural network model. For example, a long short-term memory network (LSTM) may be used, which is a recurrent neural network suited to processing and predicting events separated by relatively long intervals and delays in a time series. Generally, the number of network layers of the LSTM model can be set to 1 to 4. More layers generally improve the prediction accuracy of the model, but they also increase its complexity and the time spent on prediction. To reduce the complexity of the model while maintaining accuracy, the number of network layers can be chosen by weighing accuracy, training time, and prediction time against one another, so as to obtain a resource quota prediction model with reasonable parameters.
It should be particularly noted that, for the pre-trained resource quota prediction model, some parameter values are determined from a limited training set. In practical application, to make the predictions fit the actual situation more closely, the parameters of the resource quota prediction model may be optimized based on actually generated data, and prediction is then performed with the optimized model. In addition, as the method continues to be used, the users in each tenant keep using the resources in their accounts to execute simulation tasks. The historical data accumulated in this way continuously expands the training set of the resource quota prediction model, and the prediction performance of the model can be improved step by step through this ongoing feedback and retraining.
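One way to weigh accuracy, training time, and prediction time when choosing the layer count is a simple weighted score. The patent does not give a formula, so everything below is an assumption made for illustration: the `Candidate` type, the weights, the normalization by the slowest candidate, and the measurements themselves are all hypothetical.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Candidate:
    num_layers: int
    accuracy: float         # validation accuracy in [0, 1], higher is better
    train_seconds: float    # lower is better
    predict_seconds: float  # lower is better


def score(c: Candidate, max_train: float, max_predict: float,
          w_acc: float = 0.6, w_train: float = 0.2, w_pred: float = 0.2) -> float:
    """Reward accuracy and penalize time, each time normalized to (0, 1]."""
    return (w_acc * c.accuracy
            - w_train * c.train_seconds / max_train
            - w_pred * c.predict_seconds / max_predict)


def pick_num_layers(candidates: Sequence[Candidate]) -> int:
    """Return the layer count of the best-scoring candidate."""
    max_train = max(c.train_seconds for c in candidates)
    max_predict = max(c.predict_seconds for c in candidates)
    best = max(candidates, key=lambda c: score(c, max_train, max_predict))
    return best.num_layers


# Made-up measurements for 1 to 4 layers, purely to exercise the function.
candidates = [
    Candidate(1, accuracy=0.70, train_seconds=100.0, predict_seconds=0.01),
    Candidate(2, accuracy=0.88, train_seconds=200.0, predict_seconds=0.02),
    Candidate(3, accuracy=0.90, train_seconds=400.0, predict_seconds=0.04),
    Candidate(4, accuracy=0.91, train_seconds=800.0, predict_seconds=0.08),
]
chosen = pick_num_layers(candidates)
```

With these illustrative numbers the two-layer model wins, because the small accuracy gains of the deeper models do not offset their longer training and prediction times under the chosen weights. Different weights would shift the choice.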
The resource quota predicted value is the resource quota for the next period that the resource quota prediction model outputs for each tenant. For example, if there are 3 tenants, namely tenant 1, tenant 2, and tenant 3, the resource quota predicted values may be represented as [tenant 1 predicted value, tenant 2 predicted value, tenant 3 predicted value], where the first element is the resource quota predicted value of tenant 1 for the next period, the second element is that of tenant 2, and the third element is that of tenant 3.
Specifically, the historical resource usage data and the global resource data of each tenant may be input into the resource quota prediction model, and the resource quota prediction model processes the historical resource usage data and the global resource data of each tenant to obtain a resource quota prediction value corresponding to each tenant.
It should be particularly noted that the above embodiment describes how resource quotas are allocated to each tenant when historical resource usage data exists. If resource quotas are being allocated for the first time in practical application, they may be allocated according to the quota application amount provided by each tenant. Historical resource usage data is generated after this first allocation. In subsequent allocations, the resource quota predicted value for the next period is determined based on the historical resource usage data, the global resource data, and the resource quota prediction model, and resource quotas are allocated based on the predicted value.
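The first-allocation fallback described above can be sketched as a small dispatch function. This is an illustration under assumptions: `allocate_quotas` and its parameters are hypothetical names, and `mean_usage_predictor` is only a stand-in for the trained resource quota prediction model, which the text does not specify at this level of detail.

```python
from statistics import mean
from typing import Callable, Dict, List, Mapping

History = Mapping[str, List[float]]  # tenant -> actual usage in past periods
Predictor = Callable[[History, Dict[str, float]], Dict[str, float]]


def allocate_quotas(requested: Mapping[str, float],
                    history: History,
                    global_data: Dict[str, float],
                    predict: Predictor) -> Dict[str, float]:
    """Use requested amounts on the first allocation, predictions afterwards."""
    has_history = any(len(history.get(tenant, [])) > 0 for tenant in requested)
    if not has_history:
        # First allocation: no usage history exists yet.
        return dict(requested)
    return predict(history, global_data)


def mean_usage_predictor(history: History,
                         global_data: Dict[str, float]) -> Dict[str, float]:
    """Placeholder for the trained model; ignores global_data for simplicity."""
    return {tenant: mean(usage) for tenant, usage in history.items()}


requested = {"tenant-1": 100.0, "tenant-2": 50.0}
global_data = {"used": 100.0, "unused": 300.0}

first = allocate_quotas(requested, {}, global_data, mean_usage_predictor)
later = allocate_quotas(
    requested,
    {"tenant-1": [60.0, 80.0], "tenant-2": [30.0, 50.0]},
    global_data,
    mean_usage_predictor,
)
```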
S130, for each tenant, if a simulation task corresponding to the current tenant is received, determining, based on the to-be-used resource data corresponding to the simulation task and the resource quota predicted value of the current tenant, whether to send the simulation task to the simulation service cluster, so as to process the simulation task based on the simulation service cluster.
The simulation task originates from a user in the tenant: users in a tenant submit simulation tasks under that tenant. The to-be-used resource data is the resource data that the simulation task needs to use, determined based on the simulation task and a pre-established rule engine.
In a specific application process, a system administrator may execute a task of global resource data allocation, the system administrator may allocate global data resources to each tenant, each tenant is configured with a tenant administrator, and then the tenant administrator reasonably allocates the resources allocated to the tenant to each user in the tenant, and a schematic diagram of relationships among the system administrator, the tenants, and the users is shown in fig. 3.
In this embodiment, resource quotas are allocated in the same way inside every tenant, so a single tenant is taken as an example. Within a tenant, the tenant administrator allocates and adjusts the resource quotas of the users. The tenant administrator can distribute the resource quota predicted value of the current tenant among the users in the tenant, but the total amount allocated to the users cannot exceed the resource quota predicted value of the current tenant. When a user submits a simulation task, the system calls a rule engine for simulation task resource consumption according to the configuration of the simulation task, calculates the amount of each type of resource that the simulation task is expected to consume, and refers to the result as the to-be-used resource data. The system then checks whether the resource balance of the current user covers the to-be-used resource data. If so, the simulation task is created normally, and the task scheduling center issues it to a simulation service device node in the simulation service cluster for execution. If not, the simulation task cannot be sent to the simulation service cluster. In that case, the system prompts the user that the balance is insufficient; the user can submit a resource application to the tenant, the tenant administrator issues the corresponding resource quota to the user account after approval, and the user can then submit the simulation task using the adjusted resource quota.
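The two checks in this paragraph, the tenant-level cap on user quotas and the per-task balance check, can be sketched as follows. The function names, the dictionary layout keyed by resource type, and the choice to deduct the required resources from the balance on success are assumptions of this sketch; the text only states that the balance is checked.

```python
from typing import Dict

Resources = Dict[str, float]  # amount per resource type, e.g. {"cpu": 8.0}


def within_tenant_quota(tenant_quota: Resources,
                        user_quotas: Dict[str, Resources]) -> bool:
    """True if the quotas given to all users stay within the tenant's quota."""
    for kind, limit in tenant_quota.items():
        total = sum(quota.get(kind, 0.0) for quota in user_quotas.values())
        if total > limit:
            return False
    return True


def submit_task(balance: Resources, required: Resources) -> bool:
    """Admit the task if the user's balance covers every required resource.

    On success the required amounts are deducted (an assumption of this
    sketch). On failure the balance is left untouched, and the user would
    need to apply to the tenant administrator for more quota.
    """
    if any(balance.get(kind, 0.0) < amount for kind, amount in required.items()):
        return False
    for kind, amount in required.items():
        balance[kind] = balance.get(kind, 0.0) - amount
    return True


tenant_quota = {"cpu": 32.0, "gpu": 4.0}
user_quotas = {
    "alice": {"cpu": 16.0, "gpu": 2.0},
    "bob": {"cpu": 16.0, "gpu": 2.0},
}
quota_ok = within_tenant_quota(tenant_quota, user_quotas)

balance = dict(user_quotas["alice"])
accepted = submit_task(balance, {"cpu": 8.0, "gpu": 1.0})
rejected = submit_task(balance, {"cpu": 8.0, "gpu": 2.0})
```

The second submission is rejected because only one GPU unit remains in the balance after the first task is admitted.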
The technical scheme provided by the embodiment of the invention comprises the following steps. First, the resource data corresponding to the at least one simulation service device is summarized based on a target monitoring platform, and the global used resource data and the global unused resource data corresponding to the simulation service cluster are determined. Next, a resource quota predicted value corresponding to each tenant is determined based on the historical resource usage data of each tenant, the global resource data, and a pre-trained resource quota prediction model. Finally, for each tenant, if a simulation task corresponding to the current tenant is received, whether to send the simulation task to the simulation service cluster is determined based on the to-be-used resource data corresponding to the simulation task and the resource quota predicted value of the current tenant, so that the simulation task is processed based on the simulation service cluster.
Example two
Fig. 4 is a flowchart of a simulation task resource management method provided in the second embodiment of the present invention. On the basis of the above embodiments, this embodiment further refines steps S120 and S130 of the foregoing embodiment, and it may be combined with the alternatives in one or more of the above embodiments. As shown in fig. 4, the method includes:
S210, summarizing the resource data corresponding to the at least one simulation service device based on the target monitoring platform, and determining the global resource data corresponding to the simulation service cluster.
S220, determining a matrix to be processed based on the historical resource usage data of each tenant and the global resource data.
The matrix to be processed is obtained by splicing the matrices determined from the historical resource usage data of each tenant and from the global resource data.
In this embodiment, the historical resource usage data includes the historical quotas of the tenants, the historical actual resource usage of the tenants, and the quota adjustment record data of the users in the tenants. The historical quotas of the tenants may be characterized by a matrix in which, for example, different rows represent the historical quotas of different tenants and different columns represent the historical quotas corresponding to different times. The historical actual resource usage of the tenants can also be characterized by a matrix, in the same way as the historical quotas. The quota adjustment record data of the users in the tenants can likewise be represented by a matrix. The global resource data includes global used resource data and global unused resource data, so it may also be characterized in matrix form. For example, the global resource data may be represented as [100 300], where the first element characterizes the global used resource data and the second element characterizes the global unused resource data. The matrices determined from the historical resource usage data of each tenant and from the global resource data are then spliced to obtain the matrix to be processed.
For example, if the historical quotas of the tenants are represented by a matrix A1, the historical actual resource usage of the tenants by a matrix A2, the quota adjustment record data of the users in the tenants by a matrix A3, and the global resource data by a matrix A4, the matrix to be processed can be represented as [A1 A2 A3 A4].
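The splicing of the four matrices can be sketched with plain Python lists. The text does not say how matrices with different row counts are aligned, so this sketch makes two loudly labeled assumptions: the quota adjustment data is aggregated to one value per tenant, and the single global row is repeated for every tenant so that all blocks have one row per tenant. The helper name `hstack` and all numbers are hypothetical.

```python
from typing import List

Matrix = List[List[float]]


def hstack(*blocks: Matrix) -> Matrix:
    """Concatenate matrices column-wise; all must have the same row count."""
    rows = len(blocks[0])
    if any(len(block) != rows for block in blocks):
        raise ValueError("all blocks must have the same number of rows")
    return [[value for block in blocks for value in block[i]]
            for i in range(rows)]


# Two tenants, four weekly periods (illustrative values only).
a1 = [[100.0, 100.0, 120.0, 120.0],   # historical quotas
      [50.0, 50.0, 60.0, 60.0]]
a2 = [[90.0, 110.0, 115.0, 100.0],    # historical actual usage
      [40.0, 55.0, 50.0, 58.0]]
a3 = [[50.0],                         # extra quota requested (assumed per tenant)
      [0.0]]
global_row = [100.0, 300.0]           # [global used, global unused]
a4 = [list(global_row) for _ in range(len(a1))]  # assumed: repeated per tenant

pending = hstack(a1, a2, a3, a4)
```

Each row of `pending` then holds one tenant's features followed by the shared cluster state.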
S230, inputting the matrix to be processed into the resource quota prediction model to obtain an output matrix.
The output matrix is an output result obtained by inputting the matrix to be processed into the resource quota prediction model and processing the matrix to be processed by the resource quota prediction model. Each element value in the output matrix is used to characterize the resource quota predicted value corresponding to the corresponding tenant, for example, if there are 10 tenants in total, the output matrix may be a matrix of 1 × 10, and each element value in the matrix may characterize the resource quota predicted value corresponding to one tenant.
Specifically, the matrix to be processed may be input into the resource quota prediction model, and the resource quota prediction model processes the matrix to be processed to obtain an output matrix.
S240, determining the resource quota predicted value corresponding to each tenant based on the output matrix.
In this embodiment, a correspondence between each element position of the output matrix and each tenant may be established in advance. After the output matrix is determined, the resource quota predicted value corresponding to each tenant can be read from the output matrix according to this correspondence.
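As a small illustration of this lookup, the sketch below assumes the output matrix is a 1 × N row and that the pre-established correspondence is simply an ordered list of tenant identifiers; `quotas_by_tenant` and the tenant names are hypothetical.

```python
from typing import Dict, List


def quotas_by_tenant(output_matrix: List[List[float]],
                     tenant_ids: List[str]) -> Dict[str, float]:
    """Map each element of a 1 x N output matrix to its tenant by position."""
    row = output_matrix[0]
    if len(row) != len(tenant_ids):
        raise ValueError("output matrix width must equal the number of tenants")
    return dict(zip(tenant_ids, row))


predicted = quotas_by_tenant([[12.5, 40.0, 7.0]],
                             ["tenant-a", "tenant-b", "tenant-c"])
```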
S250, when a user in the tenant submits a simulation task to the system, determining the resource data to be used according to the configuration data corresponding to the simulation task.
The configuration data is at least one configuration requirement, corresponding to the simulation task, that is determined when a user in a tenant submits the simulation task to the system. The configuration data includes at least one of a simulation scenario set, a simulation workflow type, and output content. The simulation scenario set can be understood as the test data set required to carry out the simulation task; the simulation workflow type is the process that executing the simulation task goes through; the output content is the specific content of the final output of the simulation task.
In this embodiment, a rule engine may be established in advance. After a user in a tenant submits a simulation task, the system may call the rule engine with the configuration data of the simulation task to calculate the amount of each type of resource that the simulation task is expected to consume, thereby determining the resource data to be used.
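The text does not disclose the rules themselves, so the following is only a sketch of what such a rule engine might look like. The per-scenario costs, workflow factors, output sizes, and all names (`SimulationConfig`, `estimate_resources`, the workflow and output labels) are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical rule tables; a real deployment would calibrate these values.
SCENARIO_COST = {"cpu_cores": 2.0, "memory_gb": 4.0}        # per scenario
WORKFLOW_FACTOR = {"regression": 1.0, "closed_loop": 1.5}   # per workflow type
OUTPUT_STORAGE_GB = {"metrics": 0.5, "video": 2.0}          # per scenario, per output


@dataclass
class SimulationConfig:
    scenario_set: List[str]
    workflow_type: str
    output_content: List[str]


def estimate_resources(cfg: SimulationConfig) -> Dict[str, float]:
    """Estimate the amount of each resource type a simulation task may consume."""
    count = len(cfg.scenario_set)
    factor = WORKFLOW_FACTOR[cfg.workflow_type]
    estimate = {name: cost * count * factor for name, cost in SCENARIO_COST.items()}
    estimate["storage_gb"] = count * sum(OUTPUT_STORAGE_GB[o] for o in cfg.output_content)
    return estimate


cfg = SimulationConfig(["scene-1", "scene-2", "scene-3"], "closed_loop",
                       ["metrics", "video"])
to_be_used = estimate_resources(cfg)
```

Keeping the rules in plain lookup tables makes it possible to adjust them without changing the estimation logic.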
S260, comparing the to-be-used resource data with the resource quota predicted value.
In this embodiment, if the to-be-used resource data is smaller than the resource quota prediction value, S271 is executed; if the to-be-used resource data is greater than the resource quota prediction value, S272 is performed.
S271, if the to-be-used resource data is smaller than the resource quota predicted value, creating the simulation task and sending the simulation task to the simulation service cluster.
In this embodiment, the system administrator allocates a resource quota to the tenant based on the resource quota predicted value, and the tenant administrator can then issue the allocated resource quota to individual users. The resource quota predicted value is thus divided into multiple shares, and each share can be issued to a corresponding user. For a given user, the share of the resource quota predicted value obtained by the user is compared with the to-be-used resource data corresponding to the submitted simulation task. If the to-be-used resource data is smaller, the quota obtained by the user is sufficient to complete the simulation task; the simulation task can then be created, and the task scheduling center issues it to a simulation service device node in the simulation service cluster for execution.
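How the tenant administrator divides the quota among users is not specified. One possible policy, shown here purely as an assumption, is a split proportional to per-user weights; `split_quota`, the user names, and the weights are hypothetical.

```python
from typing import Dict


def split_quota(tenant_quota: float, weights: Dict[str, float]) -> Dict[str, float]:
    """Divide a tenant's quota among its users in proportion to their weights."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return {user: tenant_quota * weight / total for user, weight in weights.items()}


shares = split_quota(100.0, {"alice": 1, "bob": 3})
```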
S272, if the to-be-used resource data is larger than the resource quota predicted value, feeding back prompt information so that a resource capacity expansion application can be submitted based on the prompt information.
On the basis of the above embodiment, if the to-be-used resource data is larger than the resource quota predicted value, the quota obtained by the current user is not sufficient to complete the simulation task. In this case the system may feed back prompt information, for example a prompt that the user's resource quota balance is insufficient. Based on the prompt information, the user can submit a resource capacity expansion application to the tenant; after approval, the tenant administrator issues the corresponding resource quota to the user account, and the user can then submit the simulation task again under the adjusted resource quota. If the to-be-used resource data is now smaller than the resource quota predicted value, the simulation task can be created, and the task scheduling center issues it to a simulation service device node in the simulation service cluster for execution.
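The comparison in S260 to S272 can be sketched as follows. The text only covers the strictly smaller and strictly larger cases; this sketch assumes that a requirement exactly equal to the quota is also admitted, and that every resource type must fit. The names `Decision` and `decide` and the sample figures are hypothetical.

```python
from enum import Enum
from typing import Dict


class Decision(Enum):
    DISPATCH = "dispatch"                  # create the task and send it to the cluster
    PROMPT_EXPANSION = "prompt_expansion"  # tell the user the quota is insufficient


def decide(required: Dict[str, float], quota: Dict[str, float]) -> Decision:
    # Assumption: equality counts as fitting; a resource type missing from the
    # quota is treated as a quota of zero.
    fits = all(amount <= quota.get(name, 0.0) for name, amount in required.items())
    return Decision.DISPATCH if fits else Decision.PROMPT_EXPANSION


required = {"cpu_cores": 9.0, "memory_gb": 18.0}
first_try = decide(required, {"cpu_cores": 8.0, "memory_gb": 32.0})
# After the tenant administrator approves the expansion application:
after_expansion = decide(required, {"cpu_cores": 16.0, "memory_gb": 32.0})
```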
According to the technical scheme provided by the embodiment of the invention, the resource data corresponding to at least one simulation service device is collected based on the target monitoring platform, the global used resource data and the global unused resource data corresponding to the simulation service cluster are determined, the matrix to be processed is determined based on the historical resource use data and the global resource data of each tenant, the matrix to be processed is input into the resource quota prediction model to obtain the output matrix, and the resource quota prediction value corresponding to each tenant is determined based on the output matrix. And when the user in the tenant submits the simulation task to the system, determining the resource data to be used according to the configuration data corresponding to the simulation task. Further, the size of the resource data to be used is compared with the size of the resource quota predicted value, if the resource data to be used is smaller than the resource quota predicted value, a simulation task is created and sent to the simulation service cluster; and if the to-be-used resource data are larger than the resource quota predicted value, feeding back prompt information to submit a resource capacity expansion application based on the prompt information. The technical scheme provided by the embodiment of the invention solves the technical problem that a small number of users occupy a large number of resources to cause resource shortage of the whole platform when a simulation task is carried out, realizes a resource quota management scheme determined based on model prediction, is beneficial to reasonable use of system resources of the whole simulation platform, improves the efficiency of resource allocation, and reduces resource waste caused by unreasonable resource allocation.
Example three
Fig. 5 is a schematic structural diagram of a simulation task resource management device according to a third embodiment of the present invention; the device can execute the simulation task resource management method provided by any embodiment of the present invention. The device includes: a global data determination module 310, a resource quota determination module 320, and a simulation task determination module 330.
The global data determining module 310 is configured to summarize resource data corresponding to at least one simulation service device based on the target monitoring platform, and determine global resource data corresponding to the simulation service cluster; the global resource data comprise global used resource data and global unused resource data;
a resource quota determining module 320, configured to determine a resource quota prediction value corresponding to each tenant based on historical resource usage data, global resource data of each tenant, and a resource quota prediction model obtained through pre-training;
the simulation task determining module 330 is configured to, for each tenant, determine, when receiving a simulation task corresponding to a current tenant, to-be-used resource data corresponding to the simulation task and a resource quota prediction value of the current tenant, and determine whether to send the simulation task to the simulation service cluster, so as to process the simulation task based on the simulation service cluster.
On the basis of the above technical solutions, the target monitoring platform comprises a monitoring device in communication with each simulation service device. The global data determination module 310 further includes: a local resource determining module and a global resource determining module.
The local resource determining module is used for, for each simulation service device, receiving local resource data corresponding to the current simulation service device based on the monitoring device corresponding to the current simulation service device; the local resource data comprises consumed resource data and available resource data;
and the global resource determining module is used for determining global resource data of the simulation service cluster based on the local resource data of each simulation service device.
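A minimal sketch of this aggregation follows, assuming the global figures are plain sums of the per-device consumed and available values and that resources are reduced to a single number per device; `LocalResourceData`, `aggregate`, and the device names are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class LocalResourceData:
    device_id: str
    consumed: float   # resources already in use on this device
    available: float  # resources still free on this device


def aggregate(reports: Iterable[LocalResourceData]) -> Tuple[float, float]:
    """Return (global used, global unused) summed over all device reports."""
    used = 0.0
    unused = 0.0
    for report in reports:
        used += report.consumed
        unused += report.available
    return used, unused


global_used, global_unused = aggregate([
    LocalResourceData("sim-node-1", 40, 100),
    LocalResourceData("sim-node-2", 35, 120),
    LocalResourceData("sim-node-3", 25, 80),
])
```

With these sample reports the result corresponds to the [100 300] form of global resource data used as an example in the description.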
On the basis of the above technical solutions, the historical resource usage data includes at least one of a tenant historical quota, a tenant historical actual resource usage amount, and quota adjustment record data of each user in a tenant.
On the basis of the above technical solutions, the resource quota determining module 320 further includes: a to-be-processed matrix determining unit, an output matrix determining unit, and a quota predicted value determining unit.
The matrix to be processed determining unit is used for determining a matrix to be processed based on the historical resource use data and the global resource data of each tenant;
the output matrix determining unit is used for inputting the matrix to be processed into the resource quota prediction model to obtain an output matrix; each element value in the output matrix is used for representing a resource quota predicted value corresponding to a corresponding tenant;
and the quota predicted value determining unit is used for determining the resource quota predicted value corresponding to each tenant based on the output matrix.
On the basis of the above technical solutions, the simulation task determining module 330 further includes: a resource to be used determining unit and a simulation task creating unit.
The resource to be used determining unit is used for determining the resource data to be used according to the configuration data corresponding to the simulation task;
and the simulation task creating unit is used for creating the simulation task and sending the simulation task to the simulation service cluster if the to-be-used resource data is smaller than the resource quota predicted value.
On the basis of the above technical solutions, the configuration data includes at least one of a simulation scenario set, a simulation workflow type, and output content.
On the basis of the above technical solutions, the simulation task determining module 330 is further configured to feed back prompt information if the to-be-used resource data is greater than the resource quota predicted value, so as to submit a resource capacity expansion application based on the prompt information.
The technical solution provided by this embodiment of the invention is applied to a simulation service cluster that includes at least one simulation service device. Resource data corresponding to the at least one simulation service device is summarized based on a target monitoring platform, and the global used resource data and global unused resource data corresponding to the simulation service cluster are determined. A resource quota predicted value corresponding to each tenant is determined based on the historical resource usage data of each tenant, the global resource data, and a resource quota prediction model obtained by pre-training. For each tenant, if a simulation task corresponding to the current tenant is received, the to-be-used resource data corresponding to the simulation task and the resource quota predicted value of the current tenant are determined, and it is determined whether to send the simulation task to the simulation service cluster so that the simulation task is processed based on the simulation service cluster. This solves the technical problem that a small number of users occupy a large amount of resources and cause a resource shortage across the whole platform when simulation tasks are run, implements a resource quota management scheme based on model prediction, facilitates reasonable use of the system resources of the whole simulation platform, improves the efficiency of resource allocation, and reduces the resource waste caused by unreasonable allocation.
The simulation task resource management device provided by the embodiment of the disclosure can execute the simulation task resource management method provided by any embodiment of the disclosure, and has corresponding functional modules and beneficial effects of the execution method.
It should be noted that, the units and modules included in the apparatus are merely divided according to functional logic, but are not limited to the above division as long as the corresponding functions can be implemented; in addition, specific names of the functional units are only used for distinguishing one functional unit from another, and are not used for limiting the protection scope of the embodiments of the present disclosure.
Example four
Fig. 6 is a schematic structural diagram of an electronic device according to a fourth embodiment of the present invention. The electronic device 10 is intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular phones, smart phones, wearable devices (e.g., helmets, glasses, watches, etc.), and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be exemplary only, and are not meant to limit implementations of the inventions described and/or claimed herein.
As shown in Fig. 6, the electronic device 10 includes at least one processor 11 and a memory communicatively connected to the at least one processor 11, such as a Read Only Memory (ROM) 12, a Random Access Memory (RAM) 13, and the like, wherein the memory stores a computer program executable by the at least one processor, and the processor 11 can perform various suitable actions and processes according to the computer program stored in the ROM 12 or the computer program loaded from a storage unit 18 into the RAM 13. The RAM 13 can also store various programs and data necessary for the operation of the electronic device 10. The processor 11, the ROM 12, and the RAM 13 are connected to each other via a bus 14. An input/output (I/O) interface 15 is also connected to the bus 14.
A number of components in the electronic device 10 are connected to the I/O interface 15, including: an input unit 16 such as a keyboard, a mouse, or the like; an output unit 17 such as various types of displays, speakers, and the like; a storage unit 18 such as a magnetic disk, an optical disk, or the like; and a communication unit 19 such as a network card, modem, wireless communication transceiver, etc. The communication unit 19 allows the electronic device 10 to exchange information/data with other devices via a computer network, such as the internet, and/or various telecommunication networks.
The processor 11 may be any of a variety of general-purpose and/or special-purpose processing components having processing and computing capabilities. Some examples of the processor 11 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various specialized Artificial Intelligence (AI) computing chips, various processors running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, microcontroller, or the like. The processor 11 performs the various methods and processes described above, such as the simulation task resource management method.
In some embodiments, the simulation task resource management method may be implemented as a computer program tangibly embodied in a computer-readable storage medium, such as the storage unit 18. In some embodiments, part or all of the computer program may be loaded and/or installed onto the electronic device 10 via the ROM 12 and/or the communication unit 19. When the computer program is loaded into the RAM 13 and executed by the processor 11, one or more steps of the simulation task resource management method described above may be performed. Alternatively, in other embodiments, the processor 11 may be configured to perform the simulation task resource management method by any other suitable means (e.g., by means of firmware).
Various implementations of the systems and techniques described above may be realized in digital electronic circuitry, integrated circuitry, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), Systems on a Chip (SOCs), Complex Programmable Logic Devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, receiving data and instructions from, and transmitting data and instructions to, a storage system, at least one input device, and at least one output device.
Computer programs for implementing the methods of the present invention can be written in any combination of one or more programming languages. These computer programs may be provided to a processor of a general purpose computer, special purpose computer, or other programmable simulation task resource management device, such that the computer programs, when executed by the processor, cause the functions/acts specified in the flowchart and/or block diagram block or blocks to be performed. A computer program can execute entirely on a machine, partly on the machine, as a stand-alone software package, partly on the machine and partly on a remote machine, or entirely on the remote machine or server.
In the context of the present invention, a computer-readable storage medium may be a tangible medium that can contain, or store a computer program for use by or in connection with an instruction execution system, apparatus, or device. A computer readable storage medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. Alternatively, the computer readable storage medium may be a machine readable signal medium. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on an electronic device having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the electronic device. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user can be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: Local Area Networks (LANs), Wide Area Networks (WANs), blockchain networks, and the Internet.
The computing system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server can be a cloud server, also called a cloud computing server or a cloud host, and is a host product in a cloud computing service system, so that the defects of high management difficulty and weak service expansibility in the traditional physical host and VPS service are overcome. It should be understood that various forms of the flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in the present invention may be executed in parallel, sequentially, or in different orders, and are not limited herein as long as the desired results of the technical solution of the present invention can be achieved. The above-described embodiments should not be construed as limiting the scope of the invention. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made in accordance with design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (10)

1. A simulation task resource management method is applied to a simulation service cluster, wherein the simulation service cluster comprises at least one simulation service device, and the method comprises the following steps:
summarizing the resource data corresponding to the at least one simulation service device based on a target monitoring platform, and determining global resource data corresponding to the simulation service cluster; the global resource data comprise global used resource data and global unused resource data;
determining a resource quota predicted value corresponding to each tenant based on historical resource usage data of each tenant, the global resource data, and a resource quota prediction model obtained by pre-training;
for each tenant, if a simulation task corresponding to the current tenant is received, determining to-be-used resource data corresponding to the simulation task and a resource quota predicted value of the current tenant, and determining whether to send the simulation task to the simulation service cluster so as to process the simulation task based on the simulation service cluster.
2. The method of claim 1, wherein the target monitoring platform comprises a monitoring device in communication with each simulation service device, and the summarizing the resource data corresponding to the at least one simulation service device based on the target monitoring platform and determining the global resource data corresponding to the simulation service cluster comprises:
for each simulation service device, receiving local resource data corresponding to the current simulation service device based on the monitoring device corresponding to the current simulation service device; wherein the local resource data comprises consumed resource data and available resource data;
and determining global resource data of the simulation service cluster based on the local resource data of each simulation service device.
3. The method of claim 1, wherein the historical resource usage data includes at least one of historical quotas of tenants, historical actual resource usage of tenants, and quota adjustment record data of users within tenants.
4. The method according to claim 3, wherein the determining a resource quota prediction value corresponding to each tenant based on historical resource usage data, global resource data of each tenant, and a resource quota prediction model obtained by pre-training comprises:
determining a matrix to be processed based on historical resource use data and global resource data of each tenant;
inputting the matrix to be processed into the resource quota prediction model to obtain an output matrix; each element value in the output matrix is used for representing a resource quota predicted value corresponding to a corresponding tenant;
and determining a resource quota predicted value corresponding to each tenant based on the output matrix.
5. The method of claim 1, wherein the determining the resource data to be used corresponding to the simulation task and the resource quota prediction value of the current tenant, and determining whether to send the simulation task to the simulation service cluster comprises:
determining the resource data to be used according to the configuration data corresponding to the simulation task;
and if the resource data to be used is smaller than the resource quota predicted value, creating the simulation task and sending the simulation task to the simulation service cluster.
6. The method of claim 5, wherein the configuration data includes at least one of a set of simulation scenarios, a type of simulation workflow, and output content.
7. The method of claim 5, further comprising:
and if the to-be-used resource data is larger than the resource quota predicted value, feeding back prompt information to submit a resource capacity expansion application based on the prompt information.
8. A simulation task resource management device, comprising:
the global data determining module is used for summarizing the resource data corresponding to the at least one simulation service device based on the target monitoring platform and determining the global resource data corresponding to the simulation service cluster; the global resource data comprise global used resource data and global unused resource data;
the resource quota determining module is used for determining a resource quota predicted value corresponding to each tenant on the basis of historical resource use data and global resource data of each tenant and a resource quota prediction model obtained by pre-training;
and the simulation task determining module is used for, for each tenant, when a simulation task corresponding to the current tenant is received, determining to-be-used resource data corresponding to the simulation task and a resource quota predicted value of the current tenant, and determining whether to send the simulation task to the simulation service cluster, so as to process the simulation task based on the simulation service cluster.
9. An electronic device, characterized in that the electronic device comprises:
one or more processors;
a storage device for storing one or more programs,
when the one or more programs are executed by the one or more processors, the one or more processors are caused to implement the simulation task resource management method of any one of claims 1-7.
10. A storage medium containing computer executable instructions for performing the method of simulation task resource management according to any one of claims 1-7 when executed by a computer processor.
CN202211658949.1A 2022-12-22 2022-12-22 Simulation task resource management method, device, equipment and medium Pending CN115952054A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211658949.1A CN115952054A (en) 2022-12-22 2022-12-22 Simulation task resource management method, device, equipment and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211658949.1A CN115952054A (en) 2022-12-22 2022-12-22 Simulation task resource management method, device, equipment and medium

Publications (1)

Publication Number Publication Date
CN115952054A true CN115952054A (en) 2023-04-11

Family

ID=87290213

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211658949.1A Pending CN115952054A (en) 2022-12-22 2022-12-22 Simulation task resource management method, device, equipment and medium

Country Status (1)

Country Link
CN (1) CN115952054A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116932174A (en) * 2023-09-19 2023-10-24 浙江大学 Dynamic resource scheduling method, device, terminal and medium for EDA simulation task
CN116932174B (en) * 2023-09-19 2023-12-08 浙江大学 Dynamic resource scheduling method, device, terminal and medium for EDA simulation task

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination