CN114217947A - Task execution method and device, electronic equipment and readable storage medium


Info

Publication number: CN114217947A
Application number: CN202111301514.7A
Authority: CN (China)
Legal status: Pending (the legal status is an assumption and is not a legal conclusion)
Prior art keywords: task, calling, calling task, operator, queue
Other languages: Chinese (zh)
Inventors: 王亚男, 刘洋, 王晖, 慕正锋
Current assignee: Beijing Baidu Netcom Science and Technology Co Ltd
Original assignee: Beijing Baidu Netcom Science and Technology Co Ltd
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN202111301514.7A
Publication of CN114217947A

Classifications

    • G06F9/5027: Allocation of resources (e.g. of the central processing unit) to service a request, the resource being a machine (e.g. CPUs, servers, terminals)
    • G06F2209/5011: Pool (indexing scheme relating to resource allocation, G06F9/50)
    • G06F2209/5018: Thread allocation (indexing scheme relating to resource allocation, G06F9/50)

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The present disclosure provides a task execution method, a task execution apparatus, an electronic device, and a readable storage medium, relating to the technical field of artificial intelligence, specifically to computer vision and deep learning, and applicable to video stream processing framework scenarios. The task execution method comprises the following steps: creating a resource pool according to the configuration file of the corresponding operator; adding at least one received calling task to a task queue, and allocating computing resources to the at least one calling task according to the resource pool; and executing the at least one calling task with the respective allocated computing resources, and adding the execution result of the at least one calling task to a result queue. The method and apparatus ensure that an operator has multithreading capability, thereby improving the operator's parallelism during task execution.

Description

Task execution method and device, electronic equipment and readable storage medium
Technical Field
The present disclosure relates to the field of artificial intelligence technology, and more particularly to computer vision and deep learning technology, and can be used in video stream processing framework scenarios. A task execution method, a task execution apparatus, an electronic device, and a readable storage medium are provided.
Background
Multithreaded design is a common technique in engineering code that improves program parallelism and the utilization of hardware resources. For the multithreaded design of an operator, the prior art typically has the framework layer provide a generic multithreading function, and the operator's processing function is written to be executed by multiple threads. However, when an operator has special requirements on thread design, its multithreading capability can only be achieved by writing operator-specific adaptation code or by calling the operator in a multi-process mode, both of which involve cumbersome steps and considerable resource waste.
Disclosure of Invention
According to a first aspect of the present disclosure, there is provided a task execution method comprising: creating a resource pool according to the configuration file of the corresponding operator; adding at least one received calling task to a task queue, and allocating computing resources to the at least one calling task according to the resource pool; and executing the at least one calling task with the respective allocated computing resources, and adding the execution result of the at least one calling task to a result queue.
According to a second aspect of the present disclosure, there is provided a task execution apparatus comprising: a creating unit configured to create a resource pool according to the configuration file of the corresponding operator; a processing unit configured to add at least one received calling task to a task queue and to allocate computing resources to the at least one calling task according to the resource pool; and an execution unit configured to execute the at least one calling task with the respective allocated computing resources and to add the execution result of the at least one calling task to a result queue.
According to a third aspect of the present disclosure, there is provided an electronic device comprising: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method as described above.
According to a fourth aspect of the present disclosure, there is provided a non-transitory computer readable storage medium having stored thereon computer instructions for causing the computer to perform the method as described above.
According to a fifth aspect of the present disclosure, a computer program product is provided, comprising a computer program which, when executed by a processor, implements the method as described above.
With the above technical solution, the present disclosure ensures that an operator has multithreading capability, thereby improving the operator's parallelism during task execution.
It should be understood that the statements in this section do not necessarily identify key or critical features of the embodiments of the present disclosure, nor do they limit the scope of the present disclosure. Other features of the present disclosure will become apparent from the following description.
Drawings
The drawings are included to provide a better understanding of the present solution and are not to be construed as limiting the present disclosure. Wherein:
FIG. 1 is a schematic diagram according to a first embodiment of the present disclosure;
FIG. 2 is a schematic diagram according to a second embodiment of the present disclosure;
FIG. 3 is a schematic diagram according to a third embodiment of the present disclosure;
FIG. 4 is a block diagram of an electronic device for implementing a task execution method of an embodiment of the present disclosure.
Detailed Description
Exemplary embodiments of the present disclosure are described below with reference to the accompanying drawings, in which various details of the embodiments of the disclosure are included to assist understanding, and which are to be considered as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
Fig. 1 is a schematic diagram according to a first embodiment of the present disclosure. As shown in fig. 1, the task execution method of this embodiment specifically includes the following steps:
s101, creating a resource pool according to a configuration file of a corresponding operator;
s102, adding the received at least one calling task to a task queue, and distributing computing resources for the at least one calling task according to the resource pool;
s103, executing the at least one calling task by utilizing the computing resources respectively, and adding an execution result of the at least one calling task to a result queue.
In the task execution method of this embodiment, the execution subject is an operator. The operator first creates a resource pool according to its corresponding configuration file, then adds at least one received calling task to a task queue and allocates computing resources to the at least one calling task according to the created resource pool, and finally executes the at least one calling task with the allocated computing resources and adds the execution result of the at least one calling task to a result queue.
The operators in this embodiment correspond to different types of neural network models, such as a target detection model, an image classification model, and the like, and are used for completing different types of tasks such as target detection, image classification, and the like.
When executing S101, this embodiment first determines the configuration file corresponding to the operator and then creates the operator's resource pool according to that file. The number of the operator's resources is predefined in the configuration file, i.e., the resource pool created in S101 contains a specific number of resources.
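As a rough sketch, step S101 can be pictured as follows, assuming a JSON configuration file with a `resource_num` field; the field name, format, and string placeholders for compute resources are illustrative assumptions, not specified by the disclosure:

```python
import json
import queue

def create_resource_pool(config_text: str) -> queue.Queue:
    """Create a pool holding the number of resources predefined in the config."""
    config = json.loads(config_text)          # parse the operator's configuration file
    pool: queue.Queue = queue.Queue()
    for i in range(config["resource_num"]):   # resource count comes from the config
        pool.put(f"resource-{i}")             # placeholder for a real compute resource
    return pool

pool = create_resource_pool('{"resource_num": 4}')
print(pool.qsize())  # 4
```

A `queue.Queue` is used here because taking a resource out and putting it back later gives the allocate/release behavior the pool needs.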
When executing S101 to determine the configuration file of the corresponding operator, an optional implementation is: determine the model type of the operator, and use the configuration file corresponding to the determined model type as the configuration file of the operator. The operator itself may also carry a configuration file.
That is, this embodiment determines the operator's configuration file according to a preset correspondence between model types and configuration files, which improves the accuracy of the determined configuration file.
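The model-type lookup described above can be sketched as a simple preset mapping; the model-type keys and file names below are hypothetical:

```python
# Preset correspondence between model types and configuration files (illustrative names).
CONFIG_BY_MODEL_TYPE = {
    "target_detection": "detection_pool.conf",
    "image_classification": "classification_pool.conf",
}

def config_file_for(model_type: str) -> str:
    """Return the configuration file registered for the operator's model type."""
    return CONFIG_BY_MODEL_TYPE[model_type]

print(config_file_for("image_classification"))  # classification_pool.conf
```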
In addition, in this embodiment, after the resource pool is created in step S101, the initialization operation of the operator itself may also be performed, so that the operator is ready to execute the received call task.
After executing S101 to create a resource pool, the present embodiment executes S102 to add the received at least one calling task to the task queue, and allocates a computing resource to the received at least one calling task according to the created resource pool.
The at least one calling task received in S102 is sent when the framework layer calls the operator from multiple threads; each calling task corresponds to one thread, and the operator generates the task queue and the result queue in response to the framework layer's calls.
To ensure that different calling tasks can share one task queue when the framework layer calls the operator from multiple threads, an optional implementation of adding the received at least one calling task to the task queue in S102 is: determine the thread count corresponding to the at least one calling task, and use the determined thread count as the identification information of the at least one calling task in the task queue.
When determining the thread count corresponding to the at least one calling task in S102, an optional implementation is: set a thread counter whose count value represents the number of times the framework layer has called the operator, i.e., the count value is incremented by 1 each time the framework layer adds a thread that calls the operator; and use the counter's value at the time the at least one calling task is received as the thread count corresponding to that calling task.
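A minimal sketch of such a thread counter follows; the class name, method name, and locking scheme are assumptions made for illustration:

```python
import threading

class ThreadCounter:
    """Counts how many times the framework layer has called the operator."""
    def __init__(self) -> None:
        self._value = 0
        self._lock = threading.Lock()

    def on_call(self) -> int:
        """Increment on each new calling thread and return the current thread count."""
        with self._lock:
            self._value += 1
            return self._value

counter = ThreadCounter()
ids = [counter.on_call() for _ in range(3)]  # three framework-layer calls
print(ids)  # [1, 2, 3]
```

Each returned value can then serve as the calling task's identification information in the task queue.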
That is, this embodiment uses the framework layer's call count at the time the operator is called as the identification information of the different calling tasks received by the operator, which distinguishes the calling tasks in the task queue simply and effectively.
When allocating computing resources to the received at least one calling task according to the created resource pool in S102, an optional implementation is: allocate computing resources to the at least one calling task if the created resource pool has available computing resources; otherwise, wait until available computing resources exist in the resource pool.
That is, this embodiment dynamically allocates computing resources to calling tasks, ensuring that the computing resources in the resource pool are used effectively.
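The allocate-or-wait behavior maps naturally onto a blocking queue: `pool.get()` blocks until a resource is returned. This sketch (resource names and task bodies are illustrative) runs three calling tasks against a single-resource pool:

```python
import queue
import threading

pool: queue.Queue = queue.Queue()
pool.put("gpu-slot-0")  # a pool containing a single compute resource

results = []

def run_calling_task(task_id: int) -> None:
    resource = pool.get()                    # waits here until a resource is available
    try:
        results.append((task_id, resource))  # stand-in for real task execution
    finally:
        pool.put(resource)                   # release the resource for the next task

threads = [threading.Thread(target=run_calling_task, args=(i,)) for i in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(len(results))  # 3
```

Because there is only one resource, the three tasks execute one at a time, which is exactly the waiting behavior described above.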
After allocating the computing resources to the at least one calling task according to the resource pool in S102, this embodiment executes S103: execute the at least one calling task with the allocated computing resources, and add the execution result of the at least one calling task to the result queue.
When adding the execution result of the at least one calling task to the result queue in S103, an optional implementation is: determine the thread count corresponding to the at least one calling task, and use the determined thread count as the identification information of the execution result of the at least one calling task in the result queue.
The process of executing S103 to determine the thread count of the call task is the same as the process of executing S102 to determine the thread count of the call task in this embodiment, and is not described herein again.
That is, for a given calling task, this embodiment uses the same identification information to distinguish both the calling tasks in the task queue and their corresponding execution results in the result queue, so that results can be queried in the result queue by the identification information of the calling task.
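Tagging an execution result with the same ID used in the task queue can be sketched with plain dictionaries as the two queues; the dictionary representation and the `upper()` stand-in for operator inference are assumptions for illustration only:

```python
task_queue = {}     # thread count -> calling task payload
result_queue = {}   # thread count -> execution result

def submit(thread_count: int, payload: str) -> None:
    """Add a calling task to the task queue under its thread-count ID."""
    task_queue[thread_count] = payload

def execute_all() -> None:
    """Execute every queued calling task; reuse each task's ID for its result."""
    for thread_count, payload in task_queue.items():
        result_queue[thread_count] = payload.upper()  # stand-in for real execution

submit(10, "frame-10")    # the framework layer's 10th call to the operator
execute_all()
print(result_queue[10])   # FRAME-10
```

The caller that submitted the task with ID 10 can later look up `result_queue[10]` without any coordination with other calling threads.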
In addition, this embodiment may further include the following: after the execution result of a calling task is obtained, decrement the thread counter's count value by 1; and when the count value reaches 0, terminate the resource pool and release all of the computing resources in the resource pool.
That is, once the operator's resource pool is created, it is terminated only after the operator has finished executing all received calling tasks, which avoids terminating the pool prematurely while calling tasks are still in progress.
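The count-down-to-zero teardown described above can be sketched as follows; the class name, method name, and `pending_tasks` parameter are assumptions, and the sketch counts only the decrement side for clarity:

```python
import threading

class ResourcePoolHandle:
    """Terminates the pool only after every received calling task has finished."""
    def __init__(self, pending_tasks: int) -> None:
        self.pending = pending_tasks
        self.closed = False
        self._lock = threading.Lock()

    def task_finished(self) -> None:
        with self._lock:
            self.pending -= 1        # one execution result has been obtained
            if self.pending == 0:    # all received calling tasks executed
                self.closed = True   # end the pool and release its resources

handle = ResourcePoolHandle(pending_tasks=2)
handle.task_finished()
print(handle.closed)  # False: a calling task is still in progress
handle.task_finished()
print(handle.closed)  # True: safe to release all computing resources
```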
FIG. 2 is a schematic diagram according to a second embodiment of the present disclosure. As shown in FIG. 2, when the operator adds different calling tasks to the task queue, it uses the thread count corresponding to each calling task as that task's ID in the task queue; for example, the calling task with ID 10 corresponds to the framework layer's 10th call to the operator. After the operator executes each calling task and obtains its execution result, the same thread count is used as the ID of the execution result, which can then be queried by that ID.
FIG. 3 is a schematic diagram according to a third embodiment of the present disclosure. As shown in FIG. 3, the task execution device 300 of this embodiment is located in an operator and includes:
the creating unit 301 is configured to create a resource pool according to the configuration file of the corresponding operator;
the processing unit 302 is configured to add the received at least one calling task to a task queue and to allocate computing resources to the at least one calling task according to the resource pool;
the execution unit 303 is configured to execute the at least one calling task with the computing resources and to add the execution result of the at least one calling task to a result queue.
The creating unit 301 first determines a configuration file of a corresponding operator, and then creates a resource pool of the operator according to the determined configuration file; the number of resources of the operator is predefined in the configuration file determined by the creating unit 301, that is, the resource pool created by the creating unit 301 includes a specific number of resources.
When determining the configuration file of the corresponding operator, the creating unit 301 may adopt an optional implementation manner as follows: determining the model type of an operator; and taking the configuration file corresponding to the determined model type as the configuration file corresponding to the operator. The operator itself in this embodiment may also contain a configuration file.
That is to say, the creating unit 301 determines the configuration file of the operator according to the preset correspondence between the model type and the configuration file, and can improve the accuracy of the determined configuration file.
In addition, after creating the resource pool, the creating unit 301 may further perform an initialization operation of the operator itself, so that the operator is ready to execute the received call task.
After the resource pool is created by the creating unit 301, the processing unit 302 adds the received at least one calling task to the task queue, and allocates the computing resource for the received at least one calling task according to the created resource pool.
At least one call task received by the processing unit 302 is sent when the framework layer calls the operator in multiple threads, and each call task corresponds to one thread.
To ensure that different calling tasks can share one task queue when the framework layer calls the operator from multiple threads, the processing unit 302 may, when adding the received at least one calling task to the task queue, optionally: determine the thread count corresponding to the at least one calling task, and use the determined thread count as the identification information of the at least one calling task in the task queue.
When determining the thread count corresponding to the at least one calling task, the processing unit 302 may optionally: set a thread counter whose count value represents the number of times the framework layer has called the operator, and use the counter's value at the time the at least one calling task is received as the thread count corresponding to that calling task.
That is, the processing unit 302 uses the framework layer's call count at the time the operator is called as the identification information of the different calling tasks received by the operator, which distinguishes the calling tasks in the task queue simply and effectively.
When allocating computing resources to the received at least one calling task according to the created resource pool, the processing unit 302 may optionally: allocate computing resources to the at least one calling task if the created resource pool has available computing resources; otherwise, wait until available computing resources exist in the resource pool.
That is, the processing unit 302 can dynamically allocate computing resources for the invoking task, and can ensure that the computing resources in the resource pool are efficiently utilized.
After the processing unit 302 allocates the computing resources to the at least one calling task according to the resource pool, the execution unit 303 executes the at least one calling task by using the allocated computing resources, and adds the execution result of the at least one calling task to the result queue.
When adding the execution result of the at least one calling task to the result queue, the execution unit 303 may optionally: determine the thread count corresponding to the at least one calling task, and use the determined thread count as the identification information of the execution result of the at least one calling task in the result queue.
The process of determining the thread count of the call task by the execution unit 303 is consistent with the process of determining the thread count of the call task by the processing unit 302, which is not described herein again.
That is, the execution unit 303 distinguishes, for one call task, different call tasks in the task queue and execution results corresponding to the different call tasks in the result queue using the same identification information, so as to perform result query in the result queue according to the identification information corresponding to the call tasks.
In addition, the task execution device 300 of this embodiment may further include a release unit 304 configured to: decrement the thread counter's count value by 1 after the execution result of a calling task is obtained; and terminate the resource pool and release all of its computing resources when the count value reaches 0.
That is, after the creating unit 301 creates the operator's resource pool, the release unit 304 terminates the pool only after determining that the operator has finished executing all received calling tasks, which avoids terminating the pool prematurely while calling tasks are still in progress.
In the technical solution of the present disclosure, the collection, storage, and use of any personal information involved comply with relevant laws and regulations and do not violate public order and good morals.
The present disclosure also provides an electronic device, a readable storage medium, and a computer program product according to embodiments of the present disclosure.
FIG. 4 is a block diagram of an electronic device for implementing the task execution method according to an embodiment of the present disclosure. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular phones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions are meant to be examples only and are not meant to limit implementations of the disclosure described and/or claimed herein.
As shown in FIG. 4, the apparatus 400 includes a computing unit 401 that can perform various appropriate actions and processes according to a computer program stored in a Read Only Memory (ROM) 402 or a computer program loaded from a storage unit 408 into a Random Access Memory (RAM) 403. The RAM 403 can also store various programs and data required for the operation of the device 400. The computing unit 401, the ROM 402, and the RAM 403 are connected to each other via a bus 404. An input/output (I/O) interface 405 is also connected to the bus 404.
A number of components in device 400 are connected to I/O interface 405, including: an input unit 406 such as a keyboard, a mouse, or the like; an output unit 407 such as various types of displays, speakers, and the like; a storage unit 408 such as a magnetic disk, optical disk, or the like; and a communication unit 409 such as a network card, modem, wireless communication transceiver, etc. The communication unit 409 allows the device 400 to exchange information/data with other devices via a computer network, such as the internet, and/or various telecommunication networks.
Computing unit 401 may be a variety of general and/or special purpose processing components with processing and computing capabilities. Some examples of the computing unit 401 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various dedicated Artificial Intelligence (AI) computing chips, various computing units running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, microcontroller, and so forth. The computing unit 401 executes the respective methods and processes described above, such as the task execution method. For example, in some embodiments, the task execution method may be implemented as a computer software program tangibly embodied in a machine-readable medium, such as storage unit 408.
In some embodiments, part or all of the computer program may be loaded and/or installed onto the device 400 via the ROM 402 and/or the communication unit 409. When the computer program is loaded into the RAM 403 and executed by the computing unit 401, one or more steps of the task execution method described above may be performed. Alternatively, in other embodiments, the computing unit 401 may be configured to perform the task execution method by any other suitable means (e.g., by means of firmware).
Various implementations of the systems and techniques described here can be realized in digital electronic circuitry, integrated circuitry, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), system on a chip (SOCs), Complex Programmable Logic Devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include: implemented in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, receiving data and instructions from, and transmitting data and instructions to, a storage system, at least one input device, and at least one output device.
Program code for implementing the methods of the present disclosure may be written in any combination of one or more programming languages. These program codes may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the program codes, when executed by the processor or controller, cause the functions/operations specified in the flowchart and/or block diagram to be performed. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package, partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local Area Networks (LANs), Wide Area Networks (WANs), and the Internet.
The computer system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server can be a cloud server, also called a cloud computing server or a cloud host, which is a host product in a cloud computing service system that overcomes the drawbacks of high management difficulty and weak service scalability in traditional physical hosts and VPS ("Virtual Private Server") services. The server may also be a server of a distributed system, or a server incorporating a blockchain.
It should be understood that the various flows shown above may be used with steps reordered, added, or deleted. For example, the steps described in the present disclosure may be executed in parallel, sequentially, or in a different order; no limitation is imposed here so long as the desired results of the technical solutions disclosed in the present disclosure can be achieved.
The above detailed description should not be construed as limiting the scope of the disclosure. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made in accordance with design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present disclosure should be included in the scope of protection of the present disclosure.

Claims (15)

1. A task execution method, comprising:
creating a resource pool according to the configuration file of the corresponding operator;
adding the received at least one calling task to a task queue, and distributing computing resources for the at least one calling task according to the resource pool;
and executing the at least one calling task by utilizing the computing resources respectively, and adding an execution result of the at least one calling task to a result queue.
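The flow of claim 1 can be sketched as a small Python program; the pool-size key and function names below are illustrative assumptions, not terms from the patent.

```python
from concurrent.futures import ThreadPoolExecutor
from queue import Queue

def run_tasks(tasks, config):
    """Sketch of claim 1: create a resource pool sized from the operator's
    configuration, add the received calling tasks to a task queue, execute
    each task with pool resources, and add each execution result to a
    result queue. The "pool_size" key is an assumption."""
    task_queue, result_queue = Queue(), Queue()
    for task in tasks:
        task_queue.put(task)                               # receive calling tasks
    with ThreadPoolExecutor(max_workers=config["pool_size"]) as pool:
        futures = []
        while not task_queue.empty():
            futures.append(pool.submit(task_queue.get()))  # allocate a pool resource per task
        for f in futures:
            result_queue.put(f.result())                   # collect execution results
    return [result_queue.get() for _ in range(result_queue.qsize())]
```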
2. The method of claim 1, wherein determining the configuration file of the corresponding operator comprises:
determining a model type of the operator;
and taking the configuration file corresponding to the model type as the configuration file corresponding to the operator.
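Claim 2's lookup of a configuration file by the operator's model type can be sketched as a mapping; the model-type names and file names below are illustrative assumptions.

```python
# Hypothetical mapping from model type to resource-pool configuration file.
CONFIG_BY_MODEL_TYPE = {
    "classification": "classification_pool.yaml",
    "detection": "detection_pool.yaml",
}

def config_for_operator(operator):
    """Sketch of claim 2: the configuration file corresponding to the
    operator's model type is used as the operator's configuration file."""
    return CONFIG_BY_MODEL_TYPE[operator["model_type"]]
```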
3. The method of claim 1, wherein the adding the received at least one calling task to a task queue comprises:
determining a thread count corresponding to the at least one calling task;
and using the thread count as identification information of the at least one calling task in the task queue.
4. The method of claim 1, wherein the adding the execution result of the at least one calling task to a result queue comprises:
determining a thread count corresponding to the at least one calling task;
and using the thread count as identification information of the execution result of the at least one calling task in the result queue.
5. The method of claim 3 or 4, wherein the determining a thread count corresponding to the at least one calling task comprises:
setting a thread counter, wherein the counting value of the thread counter represents the number of times the framework layer has called the operator;
and taking the counting value of the thread counter when the at least one calling task is received as the thread count corresponding to the at least one calling task.
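The thread-count tagging of claims 3-5 can be sketched as a counter whose value at the moment a task is received doubles as that task's (and its result's) identifier in both queues; the class and method names are illustrative assumptions.

```python
class ThreadCounter:
    """Sketch of claims 3-5: the counting value is the number of times the
    framework layer has called the operator, and the value when a calling
    task is received identifies that task in the task and result queues."""
    def __init__(self):
        self.value = 0

    def on_call(self, task):
        self.value += 1            # one more operator call by the framework layer
        return (self.value, task)  # the thread count tags the task in the queue
```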
6. The method of claim 5, further comprising:
after obtaining the execution result of a calling task, subtracting 1 from the counting value of the thread counter;
and under the condition that the counting value of the thread counter is determined to be 0, terminating the resource pool and releasing all the computing resources in the resource pool.
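The release mechanism of claim 6 can be sketched as a pool wrapper that decrements its counter per obtained result and shuts the pool down at zero; the class, attribute, and method names are illustrative assumptions.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

class CountedPool:
    """Sketch of claim 6: after each execution result is obtained the
    thread counter is decremented by 1; when it reaches 0 the resource
    pool is terminated and its computing resources are released."""
    def __init__(self, pending, workers=2):
        self.pool = ThreadPoolExecutor(max_workers=workers)
        self.counter = pending            # outstanding calling tasks
        self.lock = threading.Lock()
        self.released = False

    def on_result(self):
        with self.lock:
            self.counter -= 1             # subtract 1 per obtained result
            if self.counter == 0:
                self.pool.shutdown(wait=True)  # end the resource pool
                self.released = True           # all computing resources released
```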
7. A task execution device comprising:
the creating unit is used for creating a resource pool according to the configuration file of the corresponding operator;
the processing unit is used for adding the received at least one calling task to a task queue and distributing computing resources for the at least one calling task according to the resource pool;
and the execution unit is used for executing the at least one calling task by utilizing the computing resources respectively and adding the execution result of the at least one calling task to a result queue.
8. The apparatus according to claim 7, wherein the creating unit, when determining the configuration file of the corresponding operator, specifically performs:
determining a model type of the operator;
and taking the configuration file corresponding to the model type as the configuration file corresponding to the operator.
9. The apparatus according to claim 7, wherein the processing unit, when adding the received at least one calling task to the task queue, specifically performs:
determining a thread count corresponding to the at least one calling task;
and using the thread count as identification information of the at least one calling task in the task queue.
10. The apparatus according to claim 7, wherein the execution unit, when adding the execution result of the at least one calling task to the result queue, specifically executes:
determining a thread count corresponding to the at least one calling task;
and using the thread count as identification information of the execution result of the at least one calling task in the result queue.
11. The apparatus according to claim 9 or 10, wherein the processing unit or the execution unit, when determining the thread count corresponding to the at least one calling task, specifically performs:
setting a thread counter, wherein the counting value of the thread counter represents the number of times the framework layer has called the operator;
and taking the counting value of the thread counter when the at least one calling task is received as the thread count corresponding to the at least one calling task.
12. The apparatus of claim 7, further comprising a release unit,
the release unit being used for subtracting 1 from the counting value of the thread counter after the execution result of one calling task is obtained;
and under the condition that the counting value of the thread counter is determined to be 0, terminating the resource pool and releasing all the computing resources in the resource pool.
13. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-6.
14. A non-transitory computer readable storage medium having stored thereon computer instructions for causing the computer to perform the method of any one of claims 1-6.
15. A computer program product comprising a computer program which, when executed by a processor, implements the method according to any one of claims 1-6.
CN202111301514.7A 2021-11-04 2021-11-04 Task execution method and device, electronic equipment and readable storage medium Pending CN114217947A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111301514.7A CN114217947A (en) 2021-11-04 2021-11-04 Task execution method and device, electronic equipment and readable storage medium

Publications (1)

Publication Number Publication Date
CN114217947A true CN114217947A (en) 2022-03-22

Family

ID=80695638

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111301514.7A Pending CN114217947A (en) 2021-11-04 2021-11-04 Task execution method and device, electronic equipment and readable storage medium

Country Status (1)

Country Link
CN (1) CN114217947A (en)

Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DE10161135A1 (en) * 2001-03-21 2002-09-26 Siemens Ag Method and device for dynamically regulating the allocation of resources to a plurality of data streams competing for these resources in a communication network
US20050076337A1 (en) * 2003-01-10 2005-04-07 Mangan Timothy Richard Method and system of optimizing thread scheduling using quality objectives
CN102591721A (en) * 2011-12-30 2012-07-18 北京新媒传信科技有限公司 Method and system for distributing thread execution task
CN105468452A (en) * 2014-09-04 2016-04-06 中国联合网络通信集团有限公司 Resource pool allocation method and resource scheduler
WO2017166654A1 (en) * 2016-03-31 2017-10-05 乐视控股(北京)有限公司 Resource management method and device
CN107341050A (en) * 2016-04-28 2017-11-10 北京京东尚科信息技术有限公司 Service processing method and device based on dynamic thread pool
CN111679905A (en) * 2020-05-11 2020-09-18 天津大学 Calculation network fusion network model
US20200334544A1 (en) * 2019-04-19 2020-10-22 EMC IP Holding Company LLC Method, device and computer program product for processing machine learning model
CN112348369A (en) * 2020-11-11 2021-02-09 博康智能信息技术有限公司 Multi-target multi-resource dynamic scheduling method for major activity security
CN112527509A (en) * 2020-12-21 2021-03-19 北京百度网讯科技有限公司 Resource allocation method and device, electronic equipment and storage medium
CN112860974A (en) * 2021-01-28 2021-05-28 北京百度网讯科技有限公司 Computing resource scheduling method and device, electronic equipment and storage medium
CN113159091A (en) * 2021-01-20 2021-07-23 北京百度网讯科技有限公司 Data processing method and device, electronic equipment and storage medium
CN113495780A (en) * 2020-04-07 2021-10-12 Oppo广东移动通信有限公司 Task scheduling method and device, storage medium and electronic equipment

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
LIU, Anfeng; CHEN, Zhigang; LONG, Guoping; ZENG, Zhiwen: "A Resource-Optimized Dual-Minimum-Balance Differentiated-Service Scheduling Algorithm for Web Clusters", Journal of Computer Research and Development, no. 11, 28 November 2005 (2005-11-28) *

Similar Documents

Publication Publication Date Title
CN113342345A (en) Operator fusion method and device of deep learning framework
US20230020324A1 (en) Task Processing Method and Device, and Electronic Device
CN112540806A (en) Applet page rendering method and device, electronic equipment and storage medium
CN115759252A (en) Scheduling method, device, equipment and medium of deep learning inference engine
CN112506581A (en) Method and device for rendering small program, electronic equipment and readable storage medium
CN112508768A (en) Single-operator multi-model pipeline reasoning method, system, electronic equipment and medium
CN112905314A (en) Asynchronous processing method and device, electronic equipment, storage medium and road side equipment
CN115421922A (en) Current limiting method, device, equipment, medium and product of distributed system
CN115242731A (en) Message processing method, device, equipment and storage medium
CN113986497A (en) Queue scheduling method, device and system based on multi-tenant technology
CN112860401A (en) Task scheduling method and device, electronic equipment and storage medium
CN114327918B (en) Method and device for adjusting resource amount, electronic equipment and storage medium
CN116661960A (en) Batch task processing method, device, equipment and storage medium
CN116303132A (en) Data caching method, device, equipment and storage medium
CN114217947A (en) Task execution method and device, electronic equipment and readable storage medium
CN114386577A (en) Method, apparatus, and storage medium for executing deep learning model
CN113867920A (en) Task processing method and device, electronic equipment and medium
CN113139891A (en) Image processing method, image processing device, electronic equipment and storage medium
CN114416357A (en) Method and device for creating container group, electronic equipment and medium
CN114090247A (en) Method, device, equipment and storage medium for processing data
CN112541472B (en) Target detection method and device and electronic equipment
CN114428646B (en) Data processing method and device, electronic equipment and storage medium
CN114860455B (en) Request processing method, device, equipment and storage medium
CN115761094A (en) Image rendering method, device and equipment and storage medium
CN114610575A (en) Method, apparatus, device, and medium for calculating updated peak values for branches

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination