CN111865644B

CN111865644B - Recommendation method and device of computing resources, electronic equipment and storage medium

Info

Publication number: CN111865644B
Application number: CN201911228948.1A
Authority: CN
Inventors: 朱中涛; 孔建钢; 王琤; 裴文谦; 杨江华; 杨烽
Original assignee: Beijing Small Orange Technology Co ltd
Current assignee: Beijing Small Orange Technology Co ltd
Priority date: 2019-12-04
Filing date: 2019-12-04
Publication date: 2023-04-07
Anticipated expiration: 2039-12-04
Also published as: CN111865644A

Abstract

The application relates to a method and an apparatus for recommending computing resources, an electronic device, and a storage medium. A model file corresponding to a task to be calculated and the data volume input to the model are acquired. A pressure test is performed on each of a plurality of computing resources according to the model parameters in the model file that represent the complexity of the model and according to the data volume input to the model. From the resulting test results, a first performance index is determined for each computing resource when it runs the task to be calculated. A resource configuration scheme suitable for processing the task is then determined and recommended to the user side, based on the first performance indexes of the plurality of computing resources and a second performance index corresponding to the service requirement given by the user. This improves the efficiency of selecting computing resources and allows the user's service requirement to be well matched with the performance indexes of the computing resources.

Description

Recommendation method and device of computing resources, electronic equipment and storage medium

Technical Field

The present application relates to the technical field of computing resource services, and in particular, to a method and an apparatus for recommending computing resources, an electronic device, and a storage medium.

Background

A computing task can be understood as performing computation on input data using a trained model, for example performing face recognition on an input face image using a face recognition model. Executing a computing task requires a certain amount of computing resources. Computing resources generally refer to the Central Processing Unit (CPU), Graphics Processing Unit (GPU), hard disk resources, network resources, and the like that are needed when a computer program runs.

Which type of computing resource to select depends on the computational complexity of the computing task itself and on the cost the client is willing to pay. Generally, the user selects the computing resources for a computing task according to personal experience. This approach places high demands on the user's experience and capability: if the user's experience is insufficient, it is difficult to match the business requirement with the computing power requirement, and selecting the computing resources takes a lot of time, so the efficiency of selecting computing resources is low.

Disclosure of Invention

In view of this, an object of the embodiments of the present application is to provide a method and an apparatus for recommending computing resources, an electronic device, and a storage medium, which can improve efficiency of selecting computing resources and achieve good matching between service requirements of a user and performance indexes of the computing resources.

The application mainly comprises the following aspects:

In a first aspect, an embodiment of the present application provides a method for recommending computing resources, where the method for recommending computing resources includes:

acquiring a model file corresponding to a task to be calculated and the data volume of an input model; the model file comprises model parameters representing the complexity of the model;

respectively performing pressure testing on a plurality of computing resources according to the model parameters and the data volume of the input model to obtain a plurality of testing results, and determining a first performance index corresponding to the task to be computed based on each computing resource according to the plurality of testing results;

and determining a resource configuration scheme recommended to the user side for computing the task to be computed based on the first performance index of each computing resource in the computing resources and the second performance index input by the user side.

In a possible implementation, the recommendation method further comprises determining the model parameters according to the following steps:

parsing the model file according to the model name input by the user side and the name of the deep learning framework used by the model, to determine the model parameters.

In a possible embodiment, if the model is an image recognition model, the data amount of the input model includes at least one of the following data:

the size of the image input to the model, the number of channels of the image input to the model, and the number of images input to the model.

In a possible implementation manner, the performing, according to the model parameter and the data amount of the input model, a pressure test on a plurality of computing resources respectively to obtain a plurality of test results, and determining, according to the plurality of test results, a first performance index corresponding to the task to be calculated that is executed based on each computing resource includes:

for each computing resource of the plurality of computing resources, sending a plurality of service requests for each computing resource;

calculating the average number of the service requests processed by each computing resource in unit time and the average delay time length of each computing resource for processing each service request;

and determining the first performance index corresponding to each computing resource according to the average number and the average delay time corresponding to each computing resource.

In a possible implementation manner, the determining, based on the first performance index of each of the plurality of computing resources and the second performance index input by the user side, a resource configuration scheme recommended to the user side for computing the task to be computed includes:

selecting at least one candidate resource meeting a preset condition from the plurality of computing resources; the performance corresponding to each candidate resource meets the requirement and the corresponding virtual machine is available under the preset condition;

determining a resource configuration scheme recommended to the user side based on the first performance index and the second performance index of each of the at least one candidate resource.

In one possible implementation, the first performance indicator includes a first delay period and the second performance indicator includes a second delay period; the selecting at least one candidate resource satisfying a preset condition from the plurality of computing resources includes:

selecting a plurality of resources to be selected, of which the first delay time length is less than or equal to the second delay time length, from the plurality of computing resources;

and selecting the at least one candidate resource available for the corresponding virtual machine from the plurality of resources to be selected.
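The two selection steps above can be sketched as a pair of filters. This is a minimal illustration, not the applicant's implementation: the `ResourceProfile` type, its field names, and the sample values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResourceProfile:
    """Hypothetical record of one computing resource after the pressure test."""
    name: str            # computing resource type
    latency_ms: float    # first delay duration, measured by the pressure test
    vm_available: bool   # whether a virtual machine with this resource is free


def select_candidates(profiles, required_latency_ms):
    """Return resources that meet the delay requirement and have an available VM."""
    # Step 1: keep resources whose first delay duration is <= the second delay duration.
    to_be_selected = [p for p in profiles if p.latency_ms <= required_latency_ms]
    # Step 2: of those, keep resources whose corresponding virtual machine is available.
    return [p for p in to_be_selected if p.vm_available]


# Made-up sample data for illustration only.
profiles = [
    ResourceProfile("res-a", latency_ms=80.0, vm_available=True),
    ResourceProfile("res-b", latency_ms=40.0, vm_available=False),
    ResourceProfile("res-c", latency_ms=120.0, vm_available=True),
]
candidates = select_candidates(profiles, required_latency_ms=100.0)
print([p.name for p in candidates])
```

Here `res-b` is fast enough but has no available virtual machine, and `res-c` is available but too slow, so only `res-a` remains as a candidate resource.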

In one possible implementation, the first performance metric comprises a first throughput and the second performance metric comprises a second throughput; the determining a resource configuration scheme recommended to the user side based on the first performance indicator and the second performance indicator of each computing resource in the at least one candidate resource comprises:

determining a plurality of candidate configuration schemes according to the second throughput and the first throughput of each of the at least one candidate resource; each candidate configuration scheme comprises at least one candidate resource and the number of virtual machines corresponding to each candidate resource;

determining a candidate configuration scheme satisfying a usage cost requirement as the resource configuration scheme from the plurality of candidate configuration schemes.

In a possible embodiment, the determining, from the plurality of candidate configuration schemes, a candidate configuration scheme that satisfies a usage cost requirement as the resource configuration scheme includes:

determining a usage cost of each of the plurality of candidate configuration schemes based on the computing power cost of each of the at least one candidate resource;

and determining the candidate configuration scheme with the lowest use cost from the plurality of candidate configuration schemes as the resource configuration scheme according to the use cost of each candidate configuration scheme in the plurality of candidate configuration schemes.

In a possible implementation manner, if each candidate configuration scheme includes only one candidate resource, the recommendation method further includes determining the number of virtual machines corresponding to each candidate resource according to the following steps:

and determining the number of the virtual machines corresponding to each candidate resource according to a first numerical value obtained by dividing the second throughput by the first throughput of each candidate resource.
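One way to read this step is given below. The symbols are introduced here for illustration only: $Q_{\text{req}}$ is the second throughput and $Q_i$ is the first throughput of candidate resource $i$. Rounding the quotient up is an assumption; the text states only that the number is determined according to the quotient.

```latex
n_i = \left\lceil \frac{Q_{\text{req}}}{Q_i} \right\rceil ,
\qquad \text{e.g. } Q_{\text{req}} = 25,\; Q_i = 10 \;\Rightarrow\; n_i = \lceil 2.5 \rceil = 3 .
```

Rounding up ensures that the provisioned throughput $n_i Q_i$ is not below the required throughput.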

In a possible implementation manner, if each candidate configuration scheme includes at least two candidate resources, the recommendation method further includes determining the number of virtual machines corresponding to each candidate resource according to the following steps:

determining a second value obtained by adding the first throughputs of at least two candidate resources in the at least one candidate resource;

and determining the number of the virtual machines corresponding to each candidate resource in the at least two candidate resources according to a third numerical value obtained by dividing the second throughput by the second numerical value.
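A minimal sketch of the multi-resource case follows. It assumes that every candidate resource in the scheme receives the same number of virtual machines and that the quotient is rounded up; neither detail is stated in the text. The function name and sample throughputs are hypothetical.

```python
import math


def vms_per_resource(required_qps, measured_qps):
    """Return {resource name: VM count} for a scheme that mixes several resources.

    required_qps: the second throughput (from the user's service requirement).
    measured_qps: {resource name: first throughput} for at least two resources.
    """
    if len(measured_qps) < 2:
        raise ValueError("expected at least two candidate resources")
    combined_qps = sum(measured_qps.values())   # the "second value"
    ratio = required_qps / combined_qps         # the "third value"
    count = max(1, math.ceil(ratio))            # assumption: round up, at least one VM
    return {name: count for name in measured_qps}


# Made-up throughputs for illustration only.
scheme = vms_per_resource(100, {"res-a": 10, "res-b": 30})
print(scheme)
```

With these sample values the combined throughput of one virtual machine of each type is 40, so three of each are needed to reach 100.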

In one possible embodiment, the determining the usage cost of each candidate configuration scheme in the plurality of candidate configuration schemes according to the computing power cost of each candidate resource in the at least one candidate resource includes:

for each candidate configuration scheme in the plurality of candidate configuration schemes, multiplying the number of virtual machines corresponding to each candidate resource in each candidate configuration scheme by the corresponding computing power cost to obtain at least one product value;

and adding the product values of the at least one product value to obtain a fourth numerical value, and determining the fourth numerical value as the use cost of each candidate configuration scheme.
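The cost calculation and the lowest-cost selection described above can be sketched as follows. The prices and scheme contents are invented for illustration, and the function names are hypothetical.

```python
def usage_cost(scheme, unit_cost):
    """Sum of (VM count x computing power cost) over the resources in one scheme."""
    return sum(count * unit_cost[name] for name, count in scheme.items())


def cheapest_scheme(schemes, unit_cost):
    """Return the candidate configuration scheme with the lowest usage cost."""
    return min(schemes, key=lambda s: usage_cost(s, unit_cost))


# Hypothetical computing power cost per virtual machine, in arbitrary units.
unit_cost = {"res-a": 2.0, "res-b": 5.0}

# Each candidate scheme maps a resource name to its number of virtual machines.
schemes = [
    {"res-a": 3},
    {"res-b": 1},
    {"res-a": 1, "res-b": 1},
]

for s in schemes:
    print(s, usage_cost(s, unit_cost))
best = cheapest_scheme(schemes, unit_cost)
print("recommended:", best)
```

With these sample prices the three schemes cost 6.0, 5.0, and 7.0 respectively, so the single `res-b` virtual machine is recommended.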

In a second aspect, an embodiment of the present application further provides a recommendation apparatus for computing resources, where the recommendation apparatus includes:

the acquisition module is used for acquiring a model file corresponding to the task to be calculated and the data volume of the input model; the model file comprises model parameters representing the complexity of the model;

the first determining module is used for respectively carrying out pressure testing on a plurality of computing resources according to the model parameters and the data volume of the input model to obtain a plurality of testing results, and determining a first performance index corresponding to the task to be calculated based on each computing resource according to the plurality of testing results;

and the second determining module is used for determining a resource configuration scheme recommended to the user side for computing the task to be computed based on the first performance index of each computing resource in the plurality of computing resources and the second performance index input by the user side.

In a possible implementation, the recommendation device further includes:

and the third determining module is used for analyzing the model file according to the model name input by the user side and the name of the deep learning frame used by the model to determine the model parameters.

In one possible implementation, the first determining module includes:

a transmitting unit configured to transmit, for each of the plurality of computing resources, a plurality of service requests for each of the computing resources;

a calculation unit, configured to calculate an average number of the service requests processed by each computing resource in a unit time, and an average delay time length for each computing resource to process each of the service requests;

a first determining unit, configured to determine the first performance indicator corresponding to each computing resource according to the average number and the average delay duration corresponding to each computing resource.

In one possible implementation, the second determining module includes:

the selection unit is used for selecting at least one candidate resource meeting a preset condition from the plurality of computing resources; the performance corresponding to each candidate resource meets the requirement and the corresponding virtual machine is available under the preset condition;

a second determining unit, configured to determine, based on the first performance indicator of each candidate resource in the at least one candidate resource and the second performance indicator, a resource configuration scheme recommended to the user end.

In one possible implementation, the first performance indicator includes a first delay period and the second performance indicator includes a second delay period; the selecting unit is configured to select the at least one candidate resource according to the following steps:

In one possible implementation, the first performance metric comprises a first throughput and the second performance metric comprises a second throughput; the second determining unit is configured to determine the resource configuration scheme recommended to the user side according to the following steps:

In a possible implementation manner, the second determining unit is specifically configured to determine the resource allocation scheme according to the following steps:

determining a usage cost of each candidate configuration scheme in the plurality of candidate configuration schemes based on the computing power cost of each candidate resource in the at least one candidate resource;

In a possible implementation manner, if each candidate configuration scheme includes only one candidate resource, the second determining unit is further configured to:

In a possible implementation manner, if each candidate configuration scheme includes at least two candidate resources, the second determining unit is further configured to:

determining a second value obtained by adding first throughputs of at least two candidate resources in the at least one candidate resource;

In a possible implementation, the second determining unit is further configured to determine the usage cost of each candidate configuration scheme according to the following steps:

and adding the product values in the at least one product value to obtain a fourth numerical value, and determining the fourth numerical value as the use cost of each candidate configuration scheme.

In a third aspect, an embodiment of the present application further provides an electronic device, including: a processor, a memory and a bus, wherein the memory stores machine-readable instructions executable by the processor, and when the electronic device runs, the processor and the memory communicate with each other through the bus, and the machine-readable instructions are executed by the processor to perform the steps of the method for recommending computing resources in the first aspect or any of the possible implementation manners of the first aspect.

In a fourth aspect, the present application further provides a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and when the computer program is executed by a processor, the computer program performs the steps of the method for recommending computing resources described in the first aspect or any possible implementation manner of the first aspect.

In the embodiment of the application, a model file corresponding to the task to be calculated and the data volume input to the model are acquired. A pressure test is performed on each of the plurality of computing resources according to the model parameters in the model file that represent the complexity of the model and according to the data volume input to the model. From the resulting test results, the first performance index of each computing resource when running the task to be calculated is determined. A resource configuration scheme suitable for processing the task is then determined and recommended to the user side, based on the first performance indexes of the plurality of computing resources and the second performance index corresponding to the service requirement given by the user. In this way, the efficiency of selecting computing resources is improved, and the user's service requirement is well matched with the performance indexes of the computing resources.

In order to make the aforementioned objects, features and advantages of the present application comprehensible, preferred embodiments accompanied with figures are described in detail below.

Drawings

In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are required to be used in the embodiments will be briefly described below, it should be understood that the following drawings only illustrate some embodiments of the present application and therefore should not be considered as limiting the scope, and for those skilled in the art, other related drawings can be obtained from the drawings without inventive effort.

FIG. 1 is a flow chart illustrating a method for recommending computing resources according to an embodiment of the present application;

FIG. 2 is a flow chart illustrating another method for recommending computing resources provided by an embodiment of the present application;

FIG. 3 is a functional block diagram of a computing resource recommendation apparatus according to an embodiment of the present application;

FIG. 4 is a second functional block diagram of a recommendation apparatus for computing resources according to an embodiment of the present application;

FIG. 5 illustrates a functional block diagram of the first determination module of FIG. 3;

FIG. 6 shows a functional block diagram of the second determination module in FIG. 3;

FIG. 7 shows a schematic structural diagram of an electronic device provided in an embodiment of the present application.

Description of the main element symbols:

In the figures: 300-a recommendation apparatus for computing resources; 310-an acquisition module; 320-a first determination module; 322-a transmitting unit; 324-a calculation unit; 326-a first determining unit; 330-a second determination module; 332-a selecting unit; 334-a second determination unit; 340-a third determination module; 700-an electronic device; 710-a processor; 720-a memory; 730-a bus.

Detailed Description

To make the purpose, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application. It should be understood that the drawings in the present application are for illustrative and descriptive purposes only and are not used to limit the scope of protection of the present application. Additionally, it should be understood that the schematic drawings are not necessarily drawn to scale. The flowcharts used in this application illustrate operations implemented according to some embodiments of the present application. It should be understood that the operations of the flowcharts may be performed out of order, and that steps with no logical dependency on one another may be performed in reverse order or concurrently. One skilled in the art, under the guidance of this application, may add one or more other operations to, or remove one or more operations from, the flowcharts.

In addition, the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. The components of the embodiments of the present application, as generally described and illustrated in the figures herein, could be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the present application, presented in the accompanying drawings, is not intended to limit the scope of the claimed application, but is merely representative of selected embodiments of the application. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the present application without making any creative effort, shall fall within the protection scope of the present application.

To enable those skilled in the art to use the present disclosure, the following embodiments are given in connection with the specific application scenario "recommending a resource configuration scheme to a user end providing a task to be computed", and it will be apparent to those skilled in the art that the general principles defined herein may be applied to other embodiments and application scenarios without departing from the spirit and scope of the present application.

The following method, apparatus, electronic device, or computer-readable storage medium in the embodiments of the present application may be applied to any scenario that requires recommendation of computing resources, and the embodiments of the present application are not limited to a specific application scenario, and any scheme that uses the method and apparatus for recommending computing resources provided in the embodiments of the present application is within the scope of protection of the present application.

It is worth noting that, in existing schemes prior to the present application, a user selects computing resources for a computing task according to personal experience. This approach places high demands on the user's experience and capability: if the user's experience is insufficient, it is difficult to match the business requirement with the computing power requirement, and selecting the computing resources takes a large amount of time, so the selection efficiency is low. For example, a user needs to run a face recognition program (the task to be processed). The user sets the required resource configuration as an 8-core CPU, 16 GB of memory, and one MLU100 cloud inference chip card, and submits this configuration to K8S. K8S schedules a virtual machine equipped with an MLU100 card and creates on that virtual machine a container with a 1-core CPU, 16 GB of memory, one MLU100 card, and 200 GB of storage space. After configuring the runtime environment in the container, the user can execute the face recognition computing task.

In order to solve the above problems, in the embodiment of the present application, a model file corresponding to a task to be computed and a data volume of an input model are obtained, a plurality of computing resources are respectively subjected to a pressure test according to a model parameter representing a model complexity in the model file and the data volume of the input model, and a first performance index corresponding to each computing resource running the task to be computed can be determined according to a plurality of obtained test results, so that a performance index of each resource processing the task to be processed is obtained.

It should be noted that the container service Kubernetes (K8S) is a management tool for application container clusters. It is used to manage containerized applications on multiple hosts in a cloud platform, is simple and efficient to use, and provides application deployment.

For the convenience of understanding of the present application, the technical solutions provided in the present application will be described in detail below with reference to specific embodiments.

Fig. 1 is a flowchart of a method for recommending computing resources according to an embodiment of the present application. As shown in fig. 1, a method for recommending computing resources provided in an embodiment of the present application includes the following steps:

S101: acquiring a model file corresponding to a task to be calculated and the data volume of an input model; the model file comprises model parameters representing the complexity of the model.

In specific implementation, after the model file corresponding to the task to be calculated and the data volume input to the model are received from the user, the model parameters are extracted from the model file. The model parameters represent the model complexity, which generally refers to the depth and width of the model structure; for example, in a neural network model, the more layers the model has and the more nodes each layer has, the more complex the model is. The data volume input to the model depends on the model type. For example, if the model is an image recognition model, the input data are images, so the data volume is the number of images input to the image recognition model, the size of each image, and the like. If the model is a character recognition model, the input data are text documents, so the data volume is the number of text documents input to the character recognition model, the size of each document, and the like.

Here, the computing resources generally refer to CPUs, GPUs, hard disk resources, network resources, and the like required for running a computer program. Examples of GPUs include the inference accelerators Nvidia Tesla P4 and Nvidia Tesla T4; examples of other processors include the artificial intelligence processor Cambricon MLU100 and the AI chip Ascend 910.

It should be noted that the task to be calculated is a task that the user wants to compute using computing resources. A computing task can be understood as performing computation on input data using a trained model. For example, if the model is a face recognition model, the computing task may be performing face recognition on a face image input to the face recognition model using the computing resources; if the model is a character recognition model, the computing task may be performing character recognition on a text document input to the character recognition model using the computing resources.

Further, the recommendation method further comprises the step of determining the model parameters according to the following steps:

In specific implementation, after the model file provided by the user side is received, the model file needs to be parsed to obtain each model parameter of the model in the model file. Specifically, the model file can be parsed according to the model name provided by the user side and the name of the deep learning framework that the user side used in model training, to obtain the model parameters.

Here, the model is, for example, a neural network model whose model name is deep residual network (ResNet). A neural network model generally goes through two stages, training and testing. Training is the process of using a CPU or GPU to extract model parameters from training data and a neural network model (for example, a network such as AlexNet or RNN). Testing runs test data through the trained model (neural network model + model parameters) and then checks the results. Deep learning frameworks such as Caffe (Convolutional Architecture for Fast Feature Embedding), Keras, and TensorFlow uniformly abstract the data involved in each stage of the training process to form a usable framework.

Further, if the model is an image recognition model, the data amount of the input model includes at least one of the following data:

In a specific implementation, the model corresponding to the task to be calculated may be various types of models, such as an image recognition model, a character recognition model, and the like, if the model is an image recognition model, the data input to the model is a plurality of images, and the data amount input to the model includes the size of the image input to the model, the number of channels of the image input to the model, and the number of images input to the model.

In one example, the task to be calculated is a face recognition task, the model is a face recognition model, the data amount of the input model is 100 images, the images are color images, the number of channels of the images is 3 (including a channel R, a channel G and a channel B), and the size of each image is 244 × 244 pixels.
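The data volume in this example can be made concrete by counting the values fed to the model. The sketch below is illustrative only: the `ImageInputVolume` type is hypothetical, and treating each pixel channel as one input value is an assumption about how the data volume would be quantified.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ImageInputVolume:
    """Hypothetical description of the data volume input to an image model."""
    num_images: int
    channels: int
    height: int
    width: int

    def values_per_image(self):
        # One value per pixel per channel.
        return self.channels * self.height * self.width

    def total_values(self):
        return self.num_images * self.values_per_image()


# Figures taken from the example: 100 color images, 3 channels, 244 x 244 pixels.
volume = ImageInputVolume(num_images=100, channels=3, height=244, width=244)
print(volume.values_per_image())
print(volume.total_values())
```

For the figures in the example this gives 178,608 values per image and 17,860,800 values in total.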

S102: performing a pressure test on each of a plurality of computing resources according to the model parameters and the data volume of the input model to obtain a plurality of test results, and determining, according to the plurality of test results, a first performance index corresponding to each computing resource when running the task to be calculated.

In specific implementation, after the data volume input to the model is determined, data matching that data volume are acquired automatically; for example, if the model takes images as input, a plurality of images matching the data volume are acquired. A corresponding model is generated according to the model parameters representing the model complexity. Each computing resource is then used to simulate the task to be calculated, that is, to run the input data through the model, which constitutes the pressure test of that computing resource. A test result is obtained for each computing resource, and the first performance index of each computing resource when running the task to be calculated is determined from its test result.

It should be noted that a pressure test continuously applies load to a computing resource, forcing it to run under limit conditions, and observes how far it can go in order to expose performance defects. A test environment similar to the actual environment is built, and a test program sends an expected number of service requests to the system, either simultaneously or over a certain period. This tests the efficiency of the computing resource under different load conditions and the load it can bear. The performance index of the computing resource is then determined from the test result. Performance indexes include, for example, throughput and delay: throughput is the number of service requests that can be processed per unit time, and delay is the duration between the time a service request is submitted and the time the service result fed back for that request is received.
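The throughput and delay measurements described above can be sketched with a small load generator. This is a simplified illustration, not the applicant's test program: `fake_inference` is a stand-in for running the model on a computing resource, and the request count and concurrency are arbitrary choices.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def stress_test(handle_request, num_requests, concurrency):
    """Send num_requests calls and return (throughput in requests/s, average delay in s)."""

    def timed_call(_):
        # Delay: time from submitting one request to receiving its result.
        start = time.perf_counter()
        handle_request()
        return time.perf_counter() - start

    wall_start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        delays = list(pool.map(timed_call, range(num_requests)))
    elapsed = time.perf_counter() - wall_start

    throughput = num_requests / elapsed     # requests processed per unit time
    avg_delay = sum(delays) / len(delays)   # average delay per request
    return throughput, avg_delay


def fake_inference():
    """Hypothetical stand-in for one model inference on the resource under test."""
    time.sleep(0.005)


qps, delay = stress_test(fake_inference, num_requests=40, concurrency=4)
print(f"throughput: {qps:.1f} requests/s, average delay: {delay * 1000:.1f} ms")
```

In a real test the handler would submit a service request to the computing resource under test, and the run would be repeated at several concurrency levels to observe behavior under different load conditions.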

Here, the first performance index of the computing resource is strongly correlated with the complexity of the model and the data volume of the input model, so that each computing resource can be subjected to a pressure test according to the model parameters and the data volume of the input model, and the first performance index corresponding to the task to be computed operated by each computing resource is further determined.

S103: determining a resource configuration scheme recommended to the user side for computing the task to be computed based on the first performance index of each of the plurality of computing resources and the second performance index input by the user side.

In a specific implementation, after the first performance index of each of the plurality of computing resources is determined, the resource configuration scheme recommended to the user side is determined according to the acquired second performance index corresponding to the service requirement input by the user side. The resource configuration scheme includes at least one computing resource, and each computing resource corresponds to at least one virtual machine. That is, in the scheme recommended to the user side for computing the task to be computed, a plurality of computing resources may need to cooperate, and a plurality of virtual machines corresponding to each computing resource may need to process the task jointly.

It should be noted that, after the user confirms the resource configuration scheme, K8S may schedule the computing resources in the resource configuration scheme to compute the task to be computed.

In an example, if the performance index corresponding to the computing resource Nvidia Tesla P4 is 10 QPS with an 80 ms delay, the performance index corresponding to the computing resource Nvidia Tesla T4 is 30 QPS with a 40 ms delay, and the service requirement is 30 QPS with a 100 ms delay, then 3 Nvidia Tesla P4 cards or 1 Nvidia Tesla T4 card may be selected as the resource configuration scheme recommended to the user.

In the embodiment of the application, the model file corresponding to the task to be calculated and the data volume of the input model are obtained, and a pressure test is performed on each of the plurality of computing resources according to the model parameters, which represent the complexity of the model in the model file, and the data volume of the input model. From the plurality of test results, the first performance index corresponding to running the task to be calculated on each computing resource can be determined; that is, the performance index of each resource when processing the task is obtained. A reasonable resource configuration scheme suitable for processing the task to be calculated can then be determined and recommended to the user side, based on the first performance indexes of the plurality of computing resources and the second performance index corresponding to the service demand given by the user. In this way, the efficiency of selecting computing resources is improved, and the user's service demand is well matched with the performance indexes of the computing resources.

Fig. 2 is a flowchart of another method for recommending computing resources according to an embodiment of the present application. As shown in fig. 2, a method for recommending a computing resource provided in an embodiment of the present application includes the following steps:

S201: acquiring a model file corresponding to a task to be calculated and the data volume of an input model; the model file comprises model parameters representing the complexity of the model.

The description of S201 may refer to the description of S101, and the same technical effect may be achieved, which is not described in detail herein.

S202: for each of the plurality of computing resources, sending a plurality of service requests to the computing resource.

In a specific implementation, the stress test of each of the plurality of computing resources continuously increases the load on the computing resource, forcing it to operate under limit conditions, in order to observe how much load it can sustain. Here, the stress test consists of continuously sending service requests to each computing resource. Specifically, a small number of service requests may be sent per unit time at first, and the number of service requests sent per unit time is then continuously increased until the computing limit of the computing resource is reached.
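The ramp-up described for S202 can be sketched as follows. This is a minimal illustration, not the implementation of the application: the function `ramp_up`, the callable `send_batch`, and the simulated `fake_resource` with a capacity of 10 requests per unit time are all hypothetical names and values introduced for the example, and the stop criterion (fewer requests completed than were sent) is an assumption about what reaching the computing limit means.

```python
def ramp_up(send_batch, start_rate=1, step=1, max_rate=1000):
    """Raise the request rate step by step until the resource stops keeping up.

    send_batch(rate) is a hypothetical callable that sends `rate` service
    requests in one unit of time and returns how many were completed.
    """
    samples = []
    rate = start_rate
    while rate <= max_rate:
        completed = send_batch(rate)
        samples.append((rate, completed))
        if completed < rate:
            # Assumed limit criterion: the resource could not finish
            # every request that was sent in this unit of time.
            break
        rate += step
    return samples


def fake_resource(rate, capacity=10):
    """Simulated computing resource that completes at most `capacity` requests."""
    return min(rate, capacity)


samples = ramp_up(fake_resource)
print(samples[-1])  # last (sent, completed) pair recorded before stopping
```

In a real test, `send_batch` would issue requests against the deployed model on the computing resource under test; the simulated resource only makes the sketch runnable on its own.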

S203: calculating an average number of the service requests processed per unit time by each computing resource, and an average delay time period for each computing resource to process each of the service requests.

In a specific implementation, while each computing resource is subjected to the stress test, the number of service requests processed by the computing resource in each unit time is counted, and the duration between the moment the computing resource receives a service request and the moment the service request has been processed is recorded as the delay. After the stress test is finished, the average number of service requests processed per unit time by each computing resource and the average delay duration for each computing resource to process each service request are calculated.

S204: and determining the first performance index corresponding to each computing resource according to the average number and the average delay time corresponding to each computing resource.

In a specific implementation, for each of the plurality of computing resources, the average number of service requests processed by the computing resource per unit time is the throughput corresponding to the computing resource, and the average delay duration for the computing resource to process each service request is the delay corresponding to the computing resource. The delay and the throughput reflect the performance of each computing resource; that is, the average number and the average delay duration corresponding to each computing resource are used as the first performance index corresponding to that computing resource.
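The averaging in S203 and S204 can be illustrated with a short sketch. The function name `first_performance_index`, the dictionary keys, and the sample measurements are assumptions made for the example; they are not taken from the application.

```python
from statistics import mean


def first_performance_index(per_interval_counts, delays_ms):
    """Combine the stress-test records of one computing resource.

    per_interval_counts: requests processed in each unit of time.
    delays_ms: delay of each individual service request, in milliseconds.
    """
    return {
        "throughput_qps": mean(per_interval_counts),  # average number per unit time
        "delay_ms": mean(delays_ms),                  # average delay duration
    }


# Hypothetical measurements for a single computing resource.
index = first_performance_index([9, 10, 11], [78.0, 80.0, 82.0])
print(index)
```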

Here, by performing a stress test on each computing resource, the method can determine the first performance index corresponding to each computing resource processing the task to be computed, and then recommend a resource configuration scheme to the user according to the first performance index and the service requirement of the user. The user therefore does not need to select the computing resource for processing the task to be computed manually, which saves the user the time spent on selection and improves the efficiency of selecting computing resources.

S205: and determining a resource configuration scheme recommended to the user side for computing the task to be computed based on the first performance index of each computing resource in the plurality of computing resources and the second performance index input by the user side.

The description of S205 may refer to the description of S103, and the same technical effect can be achieved, which is not described in detail herein.

Further, in step S205, determining a resource allocation scheme recommended to the user side for computing the task to be computed based on the first performance index of each of the plurality of computing resources and the second performance index input by the user side, includes the following steps:

Step a: selecting at least one candidate resource meeting a preset condition from the plurality of computing resources; the preset condition is that the performance corresponding to each candidate resource meets the requirement and the corresponding virtual machine is available.

In a specific implementation, after a stress test is performed on each of the plurality of computing resources, the first performance index corresponding to each computing resource can be determined. Then, according to the first performance index of each computing resource and the second performance index of the service requirement given by the user, it can be determined which of the plurality of computing resources have first performance indexes that meet the performance requirement, and those computing resources that meet the performance requirement and whose corresponding virtual machines are available are selected as candidate resources. Here, a virtual machine being available means that the virtual machine corresponding to the computing resource accessed to the K8S platform is not occupied, that is, it is not computing other computing tasks.

Further, the first performance index comprises a first delay duration and the second performance index comprises a second delay duration; in step a, selecting at least one candidate resource meeting a preset condition from the plurality of computing resources comprises the following steps:

selecting, from the plurality of computing resources, a plurality of resources to be selected whose first delay duration is less than or equal to the second delay duration; and selecting, from the plurality of resources to be selected, the at least one candidate resource whose corresponding virtual machine is available.

In a specific implementation, each of the plurality of computing resources whose first delay duration is less than or equal to the second delay duration is determined as a resource to be selected that meets the performance requirement; then, from the plurality of resources to be selected, each resource whose corresponding virtual machine is available is determined as a candidate resource.

In an example, the first delay duration corresponding to the computing resource Nvidia Tesla P4 is 80 ms, the first delay duration corresponding to the computing resource Nvidia Tesla T4 is 40 ms, and the second delay duration in the service requirement given by the user is 60 ms. Since 40 ms ≤ 60 ms < 80 ms, the computing resource Nvidia Tesla T4 meets the performance requirement, and the computing resource Nvidia Tesla P4 does not.
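The two-stage selection of step a can be sketched as a pair of filters. The `Resource` class, its field names, and the `vm_available` values are assumptions for illustration; only the delay figures (80 ms, 40 ms, and the 60 ms requirement) come from the example in the text.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    name: str
    delay_ms: float      # first delay duration measured by the stress test
    vm_available: bool   # assumed flag: the corresponding virtual machine is not occupied


def select_candidates(resources, second_delay_ms):
    """Keep resources that meet the delay requirement and have an available VM."""
    # Stage 1: resources to be selected (first delay <= second delay).
    to_be_selected = [r for r in resources if r.delay_ms <= second_delay_ms]
    # Stage 2: of those, keep the ones whose virtual machine is available.
    return [r for r in to_be_selected if r.vm_available]


resources = [
    Resource("Nvidia Tesla P4", delay_ms=80, vm_available=True),
    Resource("Nvidia Tesla T4", delay_ms=40, vm_available=True),
]
candidates = select_candidates(resources, second_delay_ms=60)
names = [r.name for r in candidates]
print(names)
```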

Step b: determining a resource configuration scheme recommended to the user side based on the first performance index and the second performance index of each of the at least one candidate resource.

In a specific implementation, after candidate resources meeting a preset condition are selected from the plurality of computing resources, which candidate resources are included in the resource configuration scheme recommended to the user side and the number of virtual machines corresponding to each included candidate resource may be determined according to the first performance index and the second performance index of each candidate resource.

Further, the first performance index comprises a first throughput, and the second performance index comprises a second throughput; in step b, determining the resource configuration scheme recommended to the user side based on the first performance index and the second performance index of each of the at least one candidate resource includes the following steps:

Step (1): determining a plurality of candidate configuration schemes according to the second throughput and the first throughput of each of the at least one candidate resource; each candidate configuration scheme comprises at least one candidate resource and the number of virtual machines corresponding to each candidate resource.

In a specific implementation, which candidate resources are included in the resource configuration scheme recommended to the user side and the number of virtual machines corresponding to each included candidate resource may be determined according to the first throughput and the second throughput of each candidate resource.

Further, if each candidate configuration scheme only includes one candidate resource, the recommendation method further includes determining the number of virtual machines corresponding to each candidate resource according to the following steps: and determining the number of the virtual machines corresponding to each candidate resource according to a first numerical value obtained by dividing the second throughput by the first throughput of each candidate resource.

In specific implementation, each candidate configuration scheme includes only one candidate resource, the second throughput may be directly divided by the first throughput corresponding to the candidate resource, and the number of virtual machines corresponding to each candidate resource is determined according to the obtained first numerical value.

It should be noted that, if the obtained first numerical value is not an integer, it is rounded up to the next integer, and that integer is selected as the number of the corresponding virtual machines; for example, if the first numerical value is 3.5, the number of the corresponding virtual machines is 4.

In an example, the candidate configuration scheme includes only one candidate resource, Nvidia Tesla P4, whose first throughput is 10 QPS. If the second throughput corresponding to the service requirement given by the user is 20 QPS, then 20 ÷ 10 = 2, and the number of virtual machines corresponding to the candidate resource Nvidia Tesla P4 is 2.
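For a scheme with a single candidate resource, the virtual machine count reduces to a division followed by rounding up, which matches the 3.5 to 4 example in the text. The function name `vm_count_single` is a hypothetical label for this sketch.

```python
import math


def vm_count_single(second_throughput, first_throughput):
    """Number of virtual machines when the scheme holds one candidate resource.

    The first numerical value is second_throughput / first_throughput;
    a non-integer result is rounded up to the next integer.
    """
    return math.ceil(second_throughput / first_throughput)


print(vm_count_single(20, 10))  # the Nvidia Tesla P4 example: 20 QPS required, 10 QPS per VM
print(vm_count_single(35, 10))  # a first numerical value of 3.5
```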

Further, if each candidate configuration scheme includes at least two candidate resources, the recommendation method further includes determining the number of virtual machines corresponding to each candidate resource according to the following steps:

determining a second value obtained by adding the first throughputs of at least two candidate resources in the at least one candidate resource; and determining the number of the virtual machines corresponding to each candidate resource in the at least two candidate resources according to a third numerical value obtained by dividing the second throughput by the second numerical value.

In a specific implementation, when each candidate configuration scheme includes at least two candidate resources, a second value obtained by directly adding first throughputs of the at least two candidate resources may be obtained, and the second throughput is divided by the second value to obtain a third value, and further, the number of virtual machines corresponding to each candidate resource is determined according to the obtained third value.

It should be noted that, if the obtained third value is not an integer, it is rounded up to the next integer, and that integer is selected as the number of the corresponding virtual machines; for example, if the third value is 2.6, the number of the corresponding virtual machines is 3.

It should be noted that the number of virtual machines corresponding to the candidate resources included in the candidate configuration scheme may be different.

In an example, the candidate configuration scheme includes the candidate resources Nvidia Tesla P4, Nvidia Tesla T4, and MLU100, where the first throughput corresponding to Nvidia Tesla P4 is 10 QPS, the first throughput corresponding to Nvidia Tesla T4 is 30 QPS, and the first throughput corresponding to MLU100 is 20 QPS. If the second throughput corresponding to the service requirement given by the user is 60 QPS, then 60 ÷ (10 + 30 + 20) = 1, and the number of virtual machines corresponding to each of the candidate resources Nvidia Tesla P4, Nvidia Tesla T4, and MLU100 is 1.
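The multi-resource calculation can be sketched in the same way. This sketch assigns the same rounded-up third value to every candidate resource, as in the example; the text also notes that the counts may differ between resources, which this simplified version does not model. The function name `vm_count_multi` is an assumption.

```python
import math


def vm_count_multi(second_throughput, first_throughputs):
    """Virtual machine count per resource when a scheme combines several resources.

    first_throughputs maps a resource name to its first throughput in QPS.
    """
    second_value = sum(first_throughputs.values())   # combined first throughput
    third_value = second_throughput / second_value   # required throughput / combined
    count = math.ceil(third_value)                   # round up if not an integer
    return {name: count for name in first_throughputs}


throughputs = {"Nvidia Tesla P4": 10, "Nvidia Tesla T4": 30, "MLU100": 20}
counts = vm_count_multi(60, throughputs)
print(counts)
```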

Step (2): determining a candidate configuration scheme satisfying a usage cost requirement as the resource configuration scheme from the plurality of candidate configuration schemes.

In specific implementation, for a plurality of candidate configuration schemes meeting service requirements, a candidate configuration scheme with relatively high cost performance is selected as a resource configuration scheme. Here, the user needs to pay for the use of the computing resources, and the cost of different computing resources is different, so that a candidate allocation scheme with relatively low use cost can be recommended to the user as the resource allocation scheme.

Further, the step (2) of determining a candidate configuration scheme satisfying the usage cost requirement from the plurality of candidate configuration schemes as the resource configuration scheme includes the following steps:

determining a use cost of each of the plurality of candidate configuration schemes based on the computational cost of each of the at least one candidate resource; and determining, according to the use cost of each of the plurality of candidate configuration schemes, the candidate configuration scheme with the lowest use cost from the plurality of candidate configuration schemes as the resource configuration scheme.

In a specific implementation, after determining which candidate resources are included in each candidate configuration scheme, the number of virtual machines corresponding to each included candidate resource, and the computational cost of each candidate resource, the use cost of each candidate configuration scheme may be determined, and further, from the plurality of candidate configuration schemes, the candidate configuration scheme with the lowest use cost is determined to be the resource configuration scheme.

Further, determining the use cost of each of the plurality of candidate configuration schemes according to the computational cost of each of the at least one candidate resource comprises:

for each of the plurality of candidate configuration schemes, multiplying the number of virtual machines corresponding to each candidate resource in the candidate configuration scheme by the corresponding computational cost to obtain at least one product value; and adding the at least one product value to obtain a fourth numerical value, and determining the fourth numerical value as the use cost of the candidate configuration scheme.

In specific implementation, the number of virtual machines corresponding to each candidate resource in each candidate configuration scheme is multiplied by the computational cost of the corresponding virtual machine to obtain at least one product value, the product values in the at least one product value are added to obtain a fourth numerical value, and the fourth numerical value is determined as the use cost of each candidate configuration scheme.

In an example, if the candidate configuration scheme includes the candidate resources Nvidia Tesla T4 and MLU100, where Nvidia Tesla T4 corresponds to 2 virtual machines, MLU100 corresponds to 1 virtual machine, the cost of Nvidia Tesla T4 is 5 yuan/hour, and the cost of MLU100 is 3 yuan/hour, then the use cost of the candidate configuration scheme is 5 × 2 + 3 × 1 = 13 yuan per hour.
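Step (2) can be sketched as a sum of products followed by a minimum. The names `usage_cost` and `cheapest`, and the second scheme `scheme_b`, are hypothetical additions for comparison; the prices and the first scheme follow the example in the text.

```python
def usage_cost(scheme, hourly_cost):
    """Fourth numerical value: sum of (VM count x computational cost) per resource."""
    return sum(count * hourly_cost[name] for name, count in scheme.items())


def cheapest(schemes, hourly_cost):
    """Return the candidate configuration scheme with the lowest use cost."""
    return min(schemes, key=lambda scheme: usage_cost(scheme, hourly_cost))


hourly_cost = {"Nvidia Tesla T4": 5, "MLU100": 3}    # yuan per hour, from the example
scheme_a = {"Nvidia Tesla T4": 2, "MLU100": 1}       # scheme from the example
scheme_b = {"Nvidia Tesla T4": 3}                    # hypothetical alternative scheme

print(usage_cost(scheme_a, hourly_cost))
print(cheapest([scheme_a, scheme_b], hourly_cost))
```

If two schemes have the same cost, `min` returns the first one in the list; the text does not specify a tie-breaking rule.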

In the embodiment of the application, the model file corresponding to the task to be calculated and the data volume of the input model are obtained, and a pressure test is performed on each of the plurality of computing resources according to the model parameters, which represent the complexity of the model in the model file, and the data volume of the input model. From the plurality of test results, the first performance index corresponding to running the task to be calculated on each computing resource can be determined; that is, the performance index of each resource when processing the task is obtained. A reasonable resource configuration scheme suitable for processing the task to be calculated can then be determined and recommended to the user side, based on the first performance indexes of the plurality of computing resources and the second performance index corresponding to the service demand given by the user.

Based on the same inventive concept, an embodiment of the present application further provides a computing resource recommendation device corresponding to the computing resource recommendation method provided above. Because the principle by which the device solves the problem is similar to that of the computing resource recommendation method in the embodiment of the present application, the implementation of the device can refer to the implementation of the method, and repeated details are omitted.

Referring to fig. 3 to fig. 6, fig. 3 is a first functional block diagram of a computing resource recommendation device 300 according to an embodiment of the present application; fig. 4 is a second functional block diagram of the computing resource recommendation device 300 according to an embodiment of the present application; fig. 5 shows a functional block diagram of the first determining module 320 in fig. 3; and fig. 6 shows a functional block diagram of the second determining module 330 in fig. 3.

As shown in fig. 3 and 4, the apparatus 300 for recommending computing resources includes:

an obtaining module 310, configured to obtain a model file corresponding to a task to be calculated and a data amount of an input model; the model file comprises model parameters representing the complexity of the model;

a first determining module 320, configured to perform pressure tests on multiple computing resources according to the model parameters and the data size of the input model, respectively, to obtain multiple test results, and determine, according to the multiple test results, a first performance index corresponding to the task to be computed that is executed based on each computing resource;

a second determining module 330, configured to determine, based on the first performance index of each of the plurality of computing resources and the second performance index input by the user side, a resource configuration scheme recommended to the user side for computing the task to be computed.

In a possible implementation manner, as shown in fig. 4, the apparatus 300 for recommending computing resources further includes:

a third determining module 340, configured to parse the model file according to the model name input by the user side and the name of the deep learning framework used by the model, and determine the model parameters.

In one possible implementation, as shown in fig. 5, the first determining module 320 includes:

a sending unit 322, configured to send, for each of the plurality of computing resources, a plurality of service requests for each of the computing resources;

a calculating unit 324, configured to calculate an average number of the service requests processed by each computing resource in a unit time, and an average delay time length for each computing resource to process each service request;

a first determining unit 326, configured to determine the first performance indicator corresponding to each computing resource according to the average number and the average delay duration corresponding to each computing resource.

In one possible implementation, as shown in fig. 6, the second determining module 330 includes:

a selecting unit 332, configured to select at least one candidate resource that meets a preset condition from the multiple computing resources; the preset condition is that the performance corresponding to each candidate resource meets the requirement and the corresponding virtual machine is available;

a second determining unit 334, configured to determine, based on the first performance indicator of each candidate resource in the at least one candidate resource and the second performance indicator, a resource configuration scheme recommended to the user end.

In one possible implementation, as shown in FIG. 6, the first performance metric comprises a first delay period and the second performance metric comprises a second delay period; the selecting unit 332 is configured to select the at least one candidate resource according to the following steps:

In one possible implementation, as shown in fig. 6, the first performance metric comprises a first throughput and the second performance metric comprises a second throughput; the second determining unit 334 is configured to determine the resource configuration scheme recommended to the user end according to the following steps:

In a possible implementation manner, as shown in fig. 6, the second determining unit 334 is specifically configured to determine the resource configuration scheme according to the following steps:

according to the use cost of each candidate configuration scheme in the candidate configuration schemes, determining the candidate configuration scheme with the lowest use cost from the candidate configuration schemes as the resource configuration scheme.

In a possible implementation manner, as shown in fig. 6, if each candidate configuration solution includes only one candidate resource, the second determining unit 334 is further configured to:

In a possible implementation manner, as shown in fig. 6, if each candidate configuration scheme includes at least two candidate resources, the second determining unit 334 is further configured to:

In a possible implementation, as shown in fig. 6, the second determining unit 334 is further configured to determine the usage cost of each candidate configuration solution according to the following steps:

In the embodiment of the application, the obtaining module 310 obtains the model file corresponding to the task to be computed and the data volume of the input model. The first determining module 320 then performs a pressure test on each of the multiple computing resources according to the model parameters, which characterize the complexity of the model in the model file, and the data volume of the input model, and determines from the multiple test results the first performance index corresponding to running the task to be computed on each computing resource; that is, the performance index of each resource when processing the task is obtained. Based on the first performance indexes of the multiple computing resources and the second performance index corresponding to the service demand given by the user, the second determining module 330 can then determine a reasonable resource configuration scheme that is suitable for processing the task to be computed and is recommended to the user side. In this way, the efficiency of selecting computing resources is improved, and the user's service demand is well matched with the performance indexes of the computing resources.

Based on the same inventive concept, referring to fig. 7, which is a schematic structural diagram of an electronic device 700 provided in an embodiment of the present application, the electronic device 700 includes a processor 710, a memory 720 and a bus 730. The memory 720 stores machine-readable instructions executable by the processor 710. When the electronic device 700 is operating, the processor 710 communicates with the memory 720 via the bus 730, and the machine-readable instructions, when executed by the processor 710, perform the steps of the method for recommending computing resources according to any of the above embodiments.

In particular, the machine readable instructions, when executed by the processor 710, may perform the following:

respectively carrying out pressure testing on a plurality of computing resources according to the model parameters and the data volume of the input model to obtain a plurality of testing results, and determining a first performance index corresponding to the task to be calculated based on each computing resource according to the plurality of testing results;

and determining a resource configuration scheme recommended to the user side for computing the task to be computed based on the first performance index of each computing resource in the plurality of computing resources and the second performance index input by the user side.

In the embodiment of the application, the model file corresponding to the task to be calculated and the data volume of the input model are obtained, and a pressure test is performed on each of the plurality of computing resources according to the model parameters, which represent the complexity of the model in the model file, and the data volume of the input model. From the plurality of test results, the first performance index corresponding to running the task to be calculated on each computing resource can be determined; that is, the performance index of each resource when processing the task is obtained. A reasonable resource configuration scheme suitable for processing the task to be calculated can then be determined and recommended to the user side, based on the first performance indexes of the plurality of computing resources and the second performance index corresponding to the service demand given by the user. In this way, the efficiency of selecting computing resources is improved, and the user's service demand is well matched with the performance indexes of the computing resources.

Based on the same application concept, the embodiment of the present application further provides a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and when the computer program is executed by a processor, the steps of the method for recommending computing resources provided by the above embodiment are performed.

Specifically, the storage medium may be a general-purpose storage medium, such as a removable disk or a hard disk. When the computer program on the storage medium is executed, the above method for recommending computing resources can be performed, so that the efficiency of selecting computing resources is improved and the service requirement of the user is well matched with the performance index of the computing resource.

It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the system and the apparatus described above may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again. In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus and method may be implemented in other ways. The above-described apparatus embodiments are merely illustrative. For example, the division of the units is only a division by logical function, and other divisions are possible in actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not implemented. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection of devices or units through some communication interfaces, and may be in an electrical, mechanical or other form.

The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.

In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit.

The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a non-volatile computer-readable storage medium executable by a processor. Based on such understanding, the technical solutions of the present application may be embodied in the form of a software product, which is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the methods described in the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.

The above description covers only specific embodiments of the present application, and the protection scope of the present application is not limited thereto. Any change or substitution that a person skilled in the art can readily conceive of within the technical scope disclosed in the present application shall fall within the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims

1. A recommendation method for computing resources, the recommendation method comprising:

acquiring a model file corresponding to a task to be calculated and the data volume of an input model; the model file comprises model parameters representing the complexity of the model; the model complexity refers to the depth and width of the model structure;

determining, based on a first performance index of each of the computing resources and a second performance index input by a user side, a resource configuration scheme that is recommended to the user side for computing the task to be calculated;

the first performance index and the second performance index are the same type of performance index, and the type of the performance index comprises delay duration and throughput.

2. The recommendation method according to claim 1, further comprising determining the model parameters according to the following steps:

3. The recommendation method according to claim 1, wherein if the model is an image recognition model, the data volume of the input model includes at least one of the following data:

4. The recommendation method according to claim 1, wherein the performing pressure tests on a plurality of computing resources according to the model parameters and the data volume of the input model to obtain a plurality of test results, and determining, according to the plurality of test results, a first performance index corresponding to the task to be calculated when run on each computing resource comprises:

calculating the average number of service requests processed per unit time by each computing resource and the average delay duration with which each computing resource processes each service request;

5. The recommendation method according to claim 1, wherein the determining a resource configuration scheme recommended to the user side for computing the task to be calculated based on the first performance index of each of the plurality of computing resources and the second performance index input by the user side comprises:

6. The recommendation method according to claim 5, wherein the first performance index comprises a first delay duration and the second performance index comprises a second delay duration; the selecting at least one candidate resource satisfying a preset condition from the plurality of computing resources includes:

7. The recommendation method according to claim 5, wherein the first performance index comprises a first throughput and the second performance index comprises a second throughput; the determining a resource configuration scheme recommended to the user side based on the first performance index of each computing resource in the at least one candidate resource and the second performance index comprises:

8. The recommendation method according to claim 7, wherein the determining, from the plurality of candidate configuration schemes, a candidate configuration scheme that satisfies a usage cost requirement as the resource configuration scheme comprises:

9. The recommendation method according to claim 7, wherein if each candidate configuration scheme includes only one candidate resource, the recommendation method further comprises determining the number of virtual machines corresponding to each candidate resource according to the following steps:

10. The recommendation method according to claim 7, wherein if each candidate configuration scheme includes at least two candidate resources, the recommendation method further comprises determining the number of virtual machines corresponding to each candidate resource according to the following steps:

11. The recommendation method according to claim 8, wherein determining the usage cost of each of the plurality of candidate configuration schemes based on the cost of effort of each of the at least one candidate resource comprises:

12. A recommendation device for computing resources, the recommendation device comprising:

an acquisition module, configured to acquire a model file corresponding to a task to be calculated and the data volume of an input model; the model file comprises model parameters representing the complexity of the model; the model complexity refers to the depth and width of the model structure;

a second determining module, configured to determine, based on the first performance index of each computing resource in the plurality of computing resources and a second performance index input by a user side, a resource configuration scheme recommended to the user side for computing the task to be calculated;

13. The recommendation device of claim 12, further comprising:

14. The recommendation device of claim 12, wherein if the model is an image recognition model, the data volume of the input model comprises at least one of:

15. The recommendation device according to claim 12, wherein the first determining module comprises:

16. The recommendation device of claim 12, wherein the second determining module comprises:

a selecting unit, configured to select at least one candidate resource that satisfies a preset condition from the plurality of computing resources; the preset condition is that the performance corresponding to each candidate resource meets the requirement and the corresponding virtual machine is available;

a second determining unit, configured to determine, based on the first performance index of each of the at least one candidate resource and the second performance index, a resource configuration scheme recommended to the user side.

17. The recommendation device of claim 16, wherein the first performance index comprises a first delay duration and the second performance index comprises a second delay duration; the selecting unit is configured to select the at least one candidate resource according to the following steps:

selecting, from the plurality of computing resources, a plurality of to-be-selected resources whose first delay duration is less than or equal to the second delay duration;

18. The recommendation device of claim 16, wherein the first performance index comprises a first throughput and the second performance index comprises a second throughput; the second determining unit is configured to determine the resource configuration scheme recommended to the user side according to the following steps:

19. The recommendation device according to claim 18, wherein the second determining unit is specifically configured to determine the resource configuration scheme according to the following steps:

20. The recommendation device of claim 18, wherein if each candidate configuration scheme includes only one candidate resource, the second determining unit is further configured to:

21. The recommendation device of claim 18, wherein if each candidate configuration scheme comprises at least two candidate resources, the second determining unit is further configured to:

22. The recommendation device according to claim 19, wherein the second determining unit is further configured to determine the usage cost of each candidate configuration scheme according to the following steps:

23. An electronic device, comprising: a processor, a memory and a bus, wherein the memory stores machine-readable instructions executable by the processor; when the electronic device runs, the processor and the memory communicate over the bus; and the machine-readable instructions, when executed by the processor, perform the steps of the recommendation method for computing resources according to any one of claims 1 to 11.

24. A computer-readable storage medium, characterized in that a computer program is stored thereon, and the computer program, when executed by a processor, carries out the steps of the recommendation method for computing resources according to any one of claims 1 to 11.
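
By way of illustration only, and not as part of the claims, the averaging step recited in claim 4 and the delay-based selection recited in claims 16 and 17 might be sketched as follows. This is a minimal sketch under stated assumptions: the record layout `StressSample`, the names `first_performance_index` and `select_candidates`, the resource names, and the units (requests per second, milliseconds) are all hypothetical and do not come from the application; the application does not prescribe any particular data structure or programming language.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Dict, List


@dataclass
class StressSample:
    """One pressure-test window for a computing resource (hypothetical layout)."""
    requests_completed: int   # service requests processed in the window
    window_seconds: float     # length of the window in seconds
    delays_ms: List[float]    # per-request delay durations observed


@dataclass
class FirstIndex:
    """First performance index of a resource: throughput and delay duration."""
    throughput_qps: float
    delay_ms: float


def first_performance_index(samples: List[StressSample]) -> FirstIndex:
    # Average number of service requests processed per unit time.
    per_window_qps = [s.requests_completed / s.window_seconds for s in samples]
    # Average delay duration over every request seen in the test.
    all_delays = [d for s in samples for d in s.delays_ms]
    return FirstIndex(mean(per_window_qps), mean(all_delays))


def select_candidates(indexes: Dict[str, FirstIndex],
                      vm_available: Dict[str, bool],
                      second_delay_ms: float) -> List[str]:
    # Keep a resource only if its first delay duration does not exceed the
    # user's second delay duration and a virtual machine of that type is
    # available (the assumed reading of the "preset condition").
    return [name for name, idx in indexes.items()
            if idx.delay_ms <= second_delay_ms and vm_available.get(name, False)]


# Hypothetical test results for two made-up resource types.
results = {
    "gpu_a": [StressSample(200, 1.0, [4.0, 6.0]), StressSample(100, 1.0, [5.0])],
    "cpu_b": [StressSample(50, 1.0, [30.0, 50.0])],
}
indexes = {name: first_performance_index(s) for name, s in results.items()}
candidates = select_candidates(indexes, {"gpu_a": True, "cpu_b": True}, 10.0)
```

In this sketch the user's delay requirement acts only as a filter; choosing among the surviving candidates by throughput and usage cost, as in claims 7 to 11, is left out because the corresponding steps are not reproduced in the text above.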