CN113204478A - Method, device and equipment for running test unit and storage medium - Google Patents

Method, device and equipment for running test unit and storage medium

Info

Publication number
CN113204478A
Authority
CN
China
Prior art keywords
test
original
test units
grouping
test unit
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110368224.8A
Other languages
Chinese (zh)
Other versions
CN113204478B (en)
Inventor
周威
蓝翔
骆涛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd filed Critical Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN202110368224.8A priority Critical patent/CN113204478B/en
Publication of CN113204478A publication Critical patent/CN113204478A/en
Application granted granted Critical
Publication of CN113204478B publication Critical patent/CN113204478B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/36Preventing errors by testing or debugging software
    • G06F11/3668Software testing
    • G06F11/3672Test management
    • G06F11/3684Test management for test design, e.g. generating new test cases
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/36Preventing errors by testing or debugging software
    • G06F11/3668Software testing
    • G06F11/3672Test management
    • G06F11/3688Test management for test execution, e.g. scheduling of test suites

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Hardware Design (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The disclosure provides a method, apparatus, device and storage medium for running test units, and relates to the field of computer technology, in particular to artificial intelligence technologies such as deep learning. The method for running test units includes the following steps: concurrently running a plurality of test units based on an original grouping of the plurality of test units, and acquiring the running results of the concurrent run, where the original grouping is determined based on the resource utilization rates of the plurality of test units; if there is a test unit whose running result is a running failure, regrouping the plurality of test units to obtain an adjusted grouping of the plurality of test units; and concurrently running the plurality of test units again based on the adjusted grouping, until no test unit fails to run. The present disclosure can improve the running efficiency of the test units.

Description

Method, device and equipment for running test unit and storage medium
Technical Field
The present disclosure relates to the field of computer technology, in particular to artificial intelligence technologies such as deep learning, and more particularly to a method, apparatus, device and storage medium for running a test unit.
Background
The deep learning framework is an important basic tool in the field of artificial intelligence. Because a deep learning framework has complex functions and numerous modules, a large number of test units are introduced to ensure functional stability. To ensure the correctness of code modifications, all test units in the deep learning framework generally need to be run each time the code is changed.
In the related art, all the test units are run in a serial mode.
Disclosure of Invention
The present disclosure provides a method, apparatus, device and storage medium for running test units.
According to an aspect of the present disclosure, there is provided a method for running test units, including: concurrently running a plurality of test units based on an original grouping of the plurality of test units, and acquiring the running results of the concurrent run, where the original grouping is determined based on the resource utilization rates of the plurality of test units; if there is a test unit whose running result is a running failure, regrouping the plurality of test units to obtain an adjusted grouping of the plurality of test units; and concurrently running the plurality of test units again based on the adjusted grouping, until no test unit fails to run.
According to another aspect of the present disclosure, there is provided an apparatus for running test units, including: a first running module, configured to concurrently run a plurality of test units based on an original grouping of the plurality of test units and acquire the running results of the concurrent run, where the original grouping is determined based on the resource utilization rates of the plurality of test units; an adjusting module, configured to regroup the plurality of test units to obtain an adjusted grouping of the plurality of test units if there is a test unit whose running result is a running failure; and a second running module, configured to concurrently run the plurality of test units again based on the adjusted grouping, until no test unit fails to run.
According to another aspect of the present disclosure, there is provided an electronic device including: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of the above aspects.
According to another aspect of the present disclosure, there is provided a non-transitory computer readable storage medium having stored thereon computer instructions for causing the computer to perform the method according to any one of the above aspects.
According to another aspect of the present disclosure, there is provided a computer program product comprising a computer program which, when executed by a processor, implements the method according to any one of the above aspects.
According to the technical solution of the present disclosure, the running efficiency of the test units can be improved.
It should be understood that the statements in this section do not necessarily identify key or critical features of the embodiments of the present disclosure, nor do they limit the scope of the present disclosure. Other features of the present disclosure will become apparent from the following description.
Drawings
The drawings are included to provide a better understanding of the present solution and are not to be construed as limiting the present disclosure. Wherein:
FIG. 1 is a schematic diagram according to a first embodiment of the present disclosure;
FIG. 2 is a schematic diagram according to a second embodiment of the present disclosure;
FIG. 3 is a schematic diagram according to a third embodiment of the present disclosure;
FIG. 4 is a schematic diagram according to a fourth embodiment of the present disclosure;
FIG. 5 is a schematic diagram according to a fifth embodiment of the present disclosure;
FIG. 6 is a schematic diagram according to a sixth embodiment of the present disclosure;
FIG. 7 is a schematic diagram of an electronic device for implementing any of the methods of operating a test cell of embodiments of the present disclosure.
Detailed Description
Exemplary embodiments of the present disclosure are described below with reference to the accompanying drawings, in which various details of the embodiments of the disclosure are included to assist understanding, and which are to be considered as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
Deep learning frameworks such as PaddlePaddle, PyTorch and TensorFlow are open source, which attracts developers around the world to contribute code to them, and they therefore include a large number of test units.
In a deep learning framework, a Compute Unified Device Architecture (CUDA) context is initialized for the graphics card. To avoid problems such as CUDA context resource conflicts and insufficient video memory, in the related art each test unit runs with exclusive use of the graphics card, and different test units are run serially.
As the deep learning framework evolves and its functions grow, the number of test units keeps increasing and so does the total running time. Taking the PaddlePaddle deep learning framework as an example, it contains more than 1200 test units, and one complete run takes more than 60 minutes (min). Developers therefore spend a large amount of time waiting for the test units to finish, which lowers their efficiency. How to reduce the running time of the test units of the deep learning framework is a problem that urgently needs to be solved, and solving it is of great significance for improving developers' research and development efficiency and saving machine resource cost.
To improve efficiency, a concurrent running mode may be adopted. However, if the test units are simply run concurrently at random, some test units may fail to run because of resource conflicts, and the test units would then lose their value for functional verification.
In a deep learning framework, different test units differ greatly in how much of the graphics card they occupy: some test units have a high graphics card utilization rate when running, while others have a very low one, for example occupying only 1% of the graphics card. If the test units with low resource utilization are run serially, each of them still monopolizes the graphics card for a long time, which wastes resources.
In order to solve the problem of low efficiency of the serial operation mode of the test unit in the related art, the present disclosure provides the following embodiments.
Fig. 1 is a schematic diagram according to a first embodiment of the present disclosure. This embodiment provides a method for running test units, including:
101. Concurrently run a plurality of test units based on an original grouping of the plurality of test units, and acquire the running results of the concurrent run, where the original grouping is determined based on the resource utilization rates of the plurality of test units.
102. If there is a test unit whose running result is a running failure, regroup the plurality of test units to obtain an adjusted grouping of the plurality of test units.
103. Concurrently run the plurality of test units again based on the adjusted grouping, until no test unit fails to run.
The execution subject of this embodiment may be a processor in the deep learning framework, and specifically, the method may be executed after the processor receives an instruction sent by a developer to trigger the test unit to run.
During the development of a large-scale software project, developers write a large number of test programs to check whether the software functions work correctly and thereby ensure software quality; these test programs can be called test units. In particular, in an open source community with many collaborators there are numerous developers and a huge amount of code, and the test unit becomes an important means of ensuring software quality and an indispensable link in continuous integration. A test unit may also be referred to as a unit test, a single test, and the like.
In a deep learning framework developed in the open source mode, after a developer modifies code, the developer can trigger the deep learning framework to run all of its test units to verify whether the modified code is correct. The process in which the developer modifies code and triggers the tests can be regarded as an online process, and the process before the developer modifies code can be regarded as an offline process. The original grouping of the test units can be determined in the offline process, and steps 101 to 103 can be performed online. That is, the deep learning framework can be configured with the original grouping in advance; after a developer triggers the test units to run, the framework runs the test units concurrently according to the preconfigured original grouping, and if a test unit fails to run, the original grouping is adjusted to obtain an adjusted grouping and the test units are run concurrently again based on the adjusted grouping.
The deep learning framework may also be configured with a script in advance. The script is executed when a developer triggers the test units to run, that is, after receiving the developer's instruction to run the test units, the deep learning framework starts the script, and the script is configured to perform steps 101 to 103.
The original grouping is the grouping that exists before a developer triggers the test units to run, and the adjusted grouping is the grouping obtained after the test units in the original grouping are adjusted. The original grouping and the adjusted grouping each include at least two groups, each group includes at least one test unit, and at least one corresponding group differs between the original grouping and the adjusted grouping. For example, the original grouping includes group A, containing test unit-1 and test unit-2, and group B, containing test unit-3; if test unit-1 needs to be adjusted into group B, then group A of the adjusted grouping contains test unit-2 and group B of the adjusted grouping contains test unit-1 and test unit-3.
The original grouping of the test units is determined based on the resource utilization rates of the test units. For example, in the offline process the test units may be divided into at least two groups based on the resource utilization rate of each test unit when running, where different groups correspond to different resource utilization rates, and the at least two groups are taken as the original grouping. Specifically, the test units may be run one by one to obtain the resource utilization rate of each test unit when running; based on these rates, the test units with lower resource utilization are placed in one group, the test units with higher resource utilization in another group, and so on.
Based on the resource utilization rates, all of the test units can thus be divided into a plurality of groups.
The resource utilization rate includes, for example, Graphics Processing Unit (GPU) utilization and/or graphics card utilization.
Concurrent running of the test units refers to concurrent processing of the test units within a group. Different groups are run serially with respect to one another, and the concurrency number of test units may differ between groups. For example, the original grouping may contain 4 groups, denoted group A, group B, group C and group D, where the concurrency number in group A is 16, the concurrency number in group B is 4, and so on.
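As an illustration only (not code from the patent), the sketch below shows one way such per-group concurrency could be arranged in Python; the group names, concurrency numbers and the pytest-style command used to run a test unit are assumptions.

# Hedged sketch: groups run serially with respect to each other, while the test
# units inside a group run concurrently with that group's concurrency number.
import subprocess
from concurrent.futures import ThreadPoolExecutor

GROUP_CONCURRENCY = {"A": 16, "B": 4, "C": 2, "D": 1}  # group D is effectively serial

def run_test_unit(unit: str) -> bool:
    """Run one test unit in its own process; True means it passed."""
    return subprocess.run(["python", "-m", "pytest", unit]).returncode == 0

def run_grouping(grouping: dict[str, list[str]]) -> list[str]:
    """Run every group in turn and return the test units that failed."""
    failed = []
    for name, units in grouping.items():
        workers = GROUP_CONCURRENCY.get(name, 1)
        with ThreadPoolExecutor(max_workers=workers) as pool:
            results = list(pool.map(run_test_unit, units))
        failed += [unit for unit, ok in zip(units, results) if not ok]
    return failed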
After the test units are run concurrently, a running result can be obtained. The running result may be either that there is a test unit that failed to run, or that there is no test unit that failed to run.
When there is a test unit that failed to run, something in the original grouping is unreasonable, and the grouping obtained by adjusting the original grouping can be called the adjusted grouping. For example, if group A of the original grouping includes test unit-1 and test unit-2 and test unit-1 fails after the concurrent run, test unit-1 may be removed from group A and moved into another group, so that group A of the adjusted grouping no longer includes test unit-1.
After the adjusted grouping is obtained, the test units can be run concurrently again based on the adjusted grouping, until there is no test unit that fails to run.
In this embodiment, the running efficiency of the test units can be improved through concurrent running. Moreover, when a test unit fails to run, the original grouping is adjusted and the test units are run concurrently again based on the adjusted grouping until no test unit fails, which ensures that the test units still fulfil their testing function.
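For readability, a minimal sketch of the loop formed by steps 101 to 103 follows; the run_grouping and demote callables are placeholders whose behaviour is assumed rather than taken from the patent (run_grouping returns the failed units, demote moves one unit into a higher-usage group).

# Hedged sketch of steps 101-103: run concurrently, regroup failed units, repeat.
from typing import Callable

Grouping = dict[str, list[str]]

def run_until_stable(grouping: Grouping,
                     run_grouping: Callable[[Grouping], list[str]],
                     demote: Callable[[Grouping, str], Grouping]) -> Grouping:
    """Repeat the concurrent run until no test unit fails; return the final grouping."""
    while True:
        failed = run_grouping(grouping)   # step 101: concurrent run per group
        if not failed:                    # step 103: stop once nothing fails
            return grouping
        for unit in failed:               # step 102: adjust the grouping
            grouping = demote(grouping, unit)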
Fig. 2 is a schematic diagram according to a second embodiment of the present disclosure. This embodiment provides a method for running test units, including:
201. In the offline phase, determine the original grouping of the test units.
202. In the online phase, run the test units concurrently.
The original grouping may be determined by the deep learning framework and provided to developers. Alternatively, if a developer adjusts the original grouping determined by the deep learning framework, then for any later developer the original grouping is the grouping already adjusted by the earlier developer. For example, the grouping determined by the deep learning framework includes group A, containing test unit-1, test unit-2 and test unit-3, and group B, containing test unit-4. After a developer triggers the test units to run concurrently, if test unit-1 fails to run, test unit-1 can be moved into group B, so that for later developers the original grouping has group A containing test unit-2 and test unit-3, and group B containing test unit-1 and test unit-4.
Taking the case where the deep learning framework determines the original grouping as an example, the flow of the offline stage may include: running each of the plurality of test units individually to obtain the resource utilization rate of each test unit when running; grouping the plurality of test units based on the resource utilization rate of each test unit when running to obtain the original grouping; and, if conflicting test units exist in the original grouping, moving the conflicting test units into different groups of the original grouping to update the original grouping.
Grouping based on the resource utilization rate avoids resource contention as much as possible, and updating the original grouping improves the accuracy of the original grouping.
Further, when there are multiple conflicting test units and the original grouping includes a first group and a second group, moving the conflicting test units into different groups of the original grouping includes: moving at least one of the conflicting test units from the first group to the second group, where the resource utilization rate of the first group is lower than that of the second group.
Demoting the conflicting test units in this way avoids the running conflicts that may occur when conflicting test units are placed in the same group.
Specifically, as shown in fig. 3, the flow of the offline phase may include:
301. and respectively taking each test unit in the plurality of test units as a current test unit, and operating the current test unit.
The test units can be transported one by one, namely, when the current test unit runs, other test units are kept in a non-running state.
302. And polling whether the current test unit is completely operated, if so, executing 306, and otherwise, executing 303.
The polling may be performed every other preset time, and if the preset time is 0.01 second, whether the current test unit is finished operating or not may be polled every 0.01 second.
303. And collecting the current resource utilization rate.
The current resource usage refers to a resource usage obtained during current detection, and the resource usage includes, for example: graphics card usage, and/or GPU usage.
304. And judging whether the current resource utilization rate is greater than the recorded value, if so, executing 305, otherwise, executing 302 and the subsequent steps again.
305. And updating the record value by using the current resource utilization rate.
The initial value of the recorded value may be set, for example, to 0, and then updated.
306. And judging whether all the test units are operated, if so, re-executing 301 and the subsequent steps.
307. And recording the resource utilization rate corresponding to each test unit.
And the resource utilization rate corresponding to each test unit is an updated recorded value.
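Steps 301 to 307 might look like the following sketch; querying nvidia-smi for GPU utilization and the 0.01-second polling interval are assumptions made purely for illustration.

# Hedged sketch of steps 301-307: run each test unit alone and poll for its peak usage.
import subprocess
import time

def current_gpu_usage() -> float:
    """Return the current GPU utilization in percent (0 if the probe fails)."""
    try:
        out = subprocess.check_output(
            ["nvidia-smi", "--query-gpu=utilization.gpu",
             "--format=csv,noheader,nounits"], text=True)
        return float(out.split()[0])
    except Exception:
        return 0.0

def measure_peak_usage(units: list[str], interval: float = 0.01) -> dict[str, float]:
    """Run the test units one by one and record each unit's peak utilization."""
    peaks = {}
    for unit in units:
        proc = subprocess.Popen(["python", "-m", "pytest", unit])  # step 301
        record = 0.0                                               # initial recorded value
        while proc.poll() is None:                                 # step 302: still running?
            record = max(record, current_gpu_usage())              # steps 303-305
            time.sleep(interval)
        peaks[unit] = record                                       # step 307
    return peaks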
308. Group all the test units based on the resource utilization rate corresponding to each test unit to obtain the original grouping.
Taking division into 4 groups as an example, and denoting the 4 groups as group A, group B, group C and group D, the 4 groups may be divided as follows:
Group A: test units whose graphics card utilization rate and GPU utilization rate are both 0. When writing such test units, developers target deep learning modules that do not involve CUDA logic, so the graphics card is not occupied.
Group B: test units whose graphics card utilization rate and GPU utilization rate are both greater than 0 and less than or equal to 10%; these test units occupy the graphics card lightly.
Group C: test units whose graphics card utilization rate and GPU utilization rate are both greater than 10% and less than or equal to 30%; these test units partially occupy the graphics card.
Group D: test units whose graphics card utilization rate and GPU utilization rate are both greater than 30%; these test units occupy the graphics card heavily.
309. Adjust the conflicting test units in the original grouping to update the original grouping.
A conflict may include a running conflict and/or a CUDA context conflict.
Running conflicts include, for example, reading and writing a file with the same name, and/or a sequential dependency between test units. A dependency exists, for example, when one test unit builds a model and another test unit applies that model. For test units that read and write the same-named file, one or more of them can be demoted to another group; for test units with a dependency between them, the depending test unit (for example, the test unit that applies the model) can be demoted to another group. Demoting means moving a test unit from a group with lower resource utilization to a group with higher resource utilization; in the groups above, moving from group A to group B is a demotion, and the reverse is a promotion. If group A includes conflicting test unit-1 and test unit-2, test unit-1 may be moved to group B.
CUDA context conflicts include, for example, test units of the same type that inherit from the same base class. In a deep learning framework, to avoid repeated development and to reduce overhead, a base class may be written and inherited by multiple subclasses; for example, subclasses implementing test methods for multiple learning rates may correspond to the same base class, and such subclasses need to be placed in different groups.
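As an illustration only (not the patent's implementation), the following sketch combines step 308 (threshold-based grouping) and step 309 (demoting one unit of each conflicting pair); the thresholds mirror the example groups A-D above, a single utilization number per unit is assumed, and the conflict pairs are assumed to have been detected separately.

# Hedged sketch of steps 308-309: build the original grouping from measured peak
# usage, then demote one unit of each conflicting pair to the next group.
NEXT_GROUP = {"A": "B", "B": "C", "C": "D", "D": "D"}  # demotion order (assumed)

def assign_group(usage: float) -> str:
    """Map a unit's peak utilization (percent) to one of the example groups A-D."""
    if usage == 0:
        return "A"        # does not touch the graphics card
    if usage <= 10:
        return "B"        # light occupancy
    if usage <= 30:
        return "C"        # partial occupancy
    return "D"            # heavy occupancy

def build_original_grouping(peaks: dict[str, float]) -> dict[str, list[str]]:
    """Step 308: group all test units by their recorded peak utilization."""
    grouping = {"A": [], "B": [], "C": [], "D": []}
    for unit, usage in peaks.items():
        grouping[assign_group(usage)].append(unit)
    return grouping

def resolve_conflicts(grouping: dict[str, list[str]],
                      conflicts: list[tuple[str, str]]) -> dict[str, list[str]]:
    """Step 309: for each (keep, move) pair that must not share a group,
    demote the second unit to the next group if they currently collide."""
    for keep, move in conflicts:
        for name, units in grouping.items():
            if keep in units and move in units:
                units.remove(move)
                grouping[NEXT_GROUP[name]].append(move)
                break
    return grouping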
The original grouping can be obtained through the above process. After the original grouping is determined, it can be adjusted during online running to obtain the adjusted grouping, and the test units are ultimately run concurrently based on the adjusted grouping. The online process can be as described in steps 101 to 103.
Further, in another embodiment, the online process may include:
Running a preset number of test units concurrently within each group of the original grouping, where the preset number differs between groups. Generally, a group with a lower resource utilization rate has a larger concurrency number, that is, a larger preset number, whereas a group with a higher resource utilization rate has a smaller concurrency number. For example, group A above has the largest concurrency number and group D the smallest.
Because different groups correspond to different concurrency numbers, concurrent running can be arranged reasonably according to the characteristics of each group, which improves efficiency and reduces the possibility of conflicts.
If a test unit fails after the concurrent run, the failed test unit can be moved from its original group into an adjusted group whose resource utilization rate is higher than that of the original group. In other words, a failed test unit can be demoted; because the group it is demoted into has a smaller concurrency number, a further running failure can be avoided as far as possible.
As shown in fig. 4, the method may include:
401. Group A test units: 16 run concurrently.
402. Group B test units: 4 run concurrently.
403. Group C test units: 2 run concurrently.
404. Group D test units: run serially.
The execution order of 401-404 is not limited, and the different groups (group A, group B, group C and group D) are executed serially with respect to one another.
405. Judge whether there is a test unit that failed to run; if so, execute 406, otherwise execute 407.
406. Demote the test units that failed to run to obtain an adjusted grouping. Thereafter, re-execute 401 and the subsequent steps based on the adjusted grouping.
A test unit that failed to run is moved from its original group into a new group to obtain the adjusted grouping. The adjustment may keep the number of groups the same as in the original grouping and simply move the unit between existing groups, or a new group may be added and the failed test unit moved into it. The resource utilization rate of the group the unit is moved into is higher than that of its original group. Assuming the original and adjusted groupings both consist of the groups A, B, C and D described above, if test unit-1 in group A fails to run, test unit-1 may be moved to group B.
407. Determine the final grouping and end.
That is, the grouping in which no test unit fails to run, which may be the original grouping or an adjusted grouping, is taken as the final grouping. Subsequent concurrent runs can then be performed based on this final grouping.
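As one possible shape for the demotion in step 406 (an assumption, not the patent's code), a failed test unit is simply moved to the next group in the A to B to C to D order, where the concurrency number is smaller:

# Hedged sketch of step 406: move a failed test unit into the next (higher-usage,
# lower-concurrency) group; the result is the adjusted grouping.
NEXT_GROUP = {"A": "B", "B": "C", "C": "D", "D": "D"}

def demote(grouping: dict[str, list[str]], unit: str) -> dict[str, list[str]]:
    for name, units in grouping.items():
        if unit in units:
            units.remove(unit)
            grouping[NEXT_GROUP[name]].append(unit)
            break
    return grouping

A demote of this shape could be combined with the run-until-stable loop sketched earlier; the grouping in force when no test unit fails any more is the final grouping of step 407 and can be reused for subsequent runs.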
Taking the PaddlePaddle deep learning platform with its 1200+ test units as an example, this concurrent processing scheme can reduce the total running time of the test units by 25%, improve the utilization of the GPU and the graphics card, and improve the running efficiency of the test units. Adjusting the grouping after a test unit fails to run also ensures the running stability of the deep learning framework's test units without introducing other negative risk factors.
Fig. 5 is a schematic diagram according to a third embodiment of the present disclosure. As shown in fig. 5, the method includes:
501. Determine the original grouping of the test units.
The process of determining the original grouping may be as shown in fig. 3.
502. After receiving an instruction to trigger the test units to run, judge whether a single test unit is triggered according to a preconfigured mapping relationship; if so, execute 503, otherwise execute 504.
A mapping relationship between pieces of code and test units can be configured in advance. A developer triggers the deep learning framework to run the test units after modifying code, and the instruction that triggers the run can carry information about the modified code, so that whether a particular test unit is triggered can be judged from this information and the mapping relationship. For example, suppose the code of a test method with a configurable learning rate corresponds to test unit-1; when the deep learning framework receives an instruction to run the test units that carries code identifier-1, and code identifier-1 corresponds to test unit-1 in the preconfigured mapping relationship, it can be determined that a single test unit, namely test unit-1, is triggered.
503. Run the single test unit.
For example, test unit-1 is triggered and run, while the other test units remain in the non-running state.
It can be understood that the example above triggers a single test unit from a piece of code; if one or more pieces of code correspond to multiple test units, those test units can be triggered and run serially. Further, a threshold may be set on the number of triggered test units: if the number is less than or equal to the threshold, the triggered test units are run serially, and if the number is greater than the threshold, the test units can be run concurrently as in step 504.
504. The test units are run concurrently.
The specific flow can be as shown in 101-103.
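The following sketch illustrates how the preconfigured mapping and the count threshold described above might be used; the mapping contents, the threshold of 3, and the helper names are made-up examples rather than details from the patent.

# Hedged sketch of steps 502-504: map modified code to test units, run them
# serially if there are few enough, otherwise fall back to the concurrent scheme.
from typing import Callable

CODE_TO_UNITS = {"code-id-1": ["test_unit_1"],                 # hypothetical mapping
                 "code-id-2": ["test_unit_2", "test_unit_3"]}
SERIAL_THRESHOLD = 3                                           # assumed threshold

def triggered_units(modified_code_ids: list[str]) -> list[str]:
    """Collect the test units mapped to the modified code identifiers (step 502)."""
    units: list[str] = []
    for code_id in modified_code_ids:
        units += CODE_TO_UNITS.get(code_id, [])
    return units

def dispatch(modified_code_ids: list[str],
             run_serially: Callable[[list[str]], None],
             run_concurrently: Callable[[], None]) -> None:
    units = triggered_units(modified_code_ids)
    if units and len(units) <= SERIAL_THRESHOLD:
        run_serially(units)       # step 503: precise triggering, serial run
    else:
        run_concurrently()        # step 504: concurrent run of all test units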
In this embodiment, the mapping relationship makes it possible to trigger only the test units corresponding to the modified code, realizing precise triggering and improving running efficiency. When the modified code cannot be mapped to specific test units, the concurrent running scheme is adopted, so precise triggering and concurrent running are combined and jointly improve running efficiency.
Fig. 6 is a schematic diagram of a sixth embodiment of the present disclosure, which provides an apparatus for running test units. As shown in fig. 6, the apparatus 600 for running test units includes a first running module 601, an adjusting module 602 and a second running module 603.
The first running module 601 is configured to concurrently run a plurality of test units based on an original grouping of the plurality of test units and acquire the running results of the concurrent run, where the original grouping is determined based on the resource utilization rates of the plurality of test units; the adjusting module 602 is configured to regroup the plurality of test units to obtain an adjusted grouping of the plurality of test units if there is a test unit whose running result is a running failure; and the second running module 603 is configured to concurrently run the plurality of test units again based on the adjusted grouping, until no test unit fails to run.
In some embodiments, the apparatus further includes a statistical module, a grouping module and an updating module.
The statistical module is configured to run each of the plurality of test units individually to obtain the resource utilization rate of each test unit when running; the grouping module is configured to group the plurality of test units based on the resource utilization rate of each test unit when running to obtain the original grouping; and the updating module is configured to, if conflicting test units exist in the original grouping, move the conflicting test units into different groups of the original grouping to update the original grouping.
In some embodiments, the grouping module is specifically configured to: divide the plurality of test units into at least two groups based on the resource utilization rate of each test unit when running, where different groups correspond to different resource utilization rates, and take the at least two groups as the original grouping.
In some embodiments, there are multiple conflicting test units, the original grouping includes a first group and a second group, and the updating module is specifically configured to: move at least one of the conflicting test units from the first group to the second group, where the resource utilization rate of the first group is lower than that of the second group.
In some embodiments, the original grouping includes a plurality of groups, and the first running module is specifically configured to: run a preset number of test units concurrently within each group of the original grouping, where the preset number differs between groups.
In some embodiments, the adjusting module is specifically configured to: move the test units whose running result is a running failure from their original groups into new groups to obtain the adjusted grouping of the plurality of test units, where the concurrency number of the adjusted grouping is smaller than that of the original grouping.
In the embodiments of the present disclosure, the running efficiency of the test units can be improved through concurrent running. When a test unit fails to run, the original grouping is adjusted and the test units are run concurrently again based on the adjusted grouping until no test unit fails, which ensures that the test units still fulfil their testing function. Grouping based on the resource utilization rate avoids resource contention as much as possible, and updating the original grouping improves its accuracy. Demoting conflicting test units avoids the running conflicts that may occur when they are placed in the same group. Because different groups correspond to different concurrency numbers, concurrent running can be arranged reasonably according to the characteristics of each group, which improves efficiency and reduces the possibility of conflicts. Adjusting the grouping after a test unit fails to run ensures the running stability of the deep learning framework's test units.
It is to be understood that in the disclosed embodiments, the same or similar elements in different embodiments may be referenced.
It is to be understood that "first", "second", and the like in the embodiments of the present disclosure are used for distinction only, and do not indicate the degree of importance, the order of timing, and the like.
The present disclosure also provides an electronic device, a readable storage medium, and a computer program product according to embodiments of the present disclosure.
FIG. 7 illustrates a schematic block diagram of an example electronic device 700 that can be used to implement embodiments of the present disclosure. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular telephones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be examples only, and are not meant to limit implementations of the disclosure described and/or claimed herein.
As shown in fig. 7, the electronic device 700 includes a computing unit 701, which can perform various appropriate actions and processes according to a computer program stored in a read-only memory (ROM) 702 or a computer program loaded from a storage unit 708 into a random access memory (RAM) 703. Various programs and data required for the operation of the electronic device 700 can also be stored in the RAM 703. The computing unit 701, the ROM 702 and the RAM 703 are connected to each other through a bus 704. An input/output (I/O) interface 705 is also connected to the bus 704.
A number of components in the electronic device 700 are connected to the I/O interface 705, including: an input unit 706 such as a keyboard, a mouse, or the like; an output unit 707 such as various types of displays, speakers, and the like; a storage unit 708 such as a magnetic disk, optical disk, or the like; and a communication unit 709 such as a network card, modem, wireless communication transceiver, etc. The communication unit 709 allows the electronic device 700 to exchange information/data with other devices via a computer network such as the internet and/or various telecommunication networks.
Computing unit 701 may be a variety of general purpose and/or special purpose processing components with processing and computing capabilities. Some examples of the computing unit 701 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various specialized Artificial Intelligence (AI) computing chips, various computing units running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, microcontroller, and so forth. The computing unit 701 performs the respective methods and processes described above, such as the operation method of the test unit. For example, in some embodiments, the method of operation of the test unit may be implemented as a computer software program tangibly embodied in a machine-readable medium, such as storage unit 708. In some embodiments, part or all of the computer program may be loaded and/or installed onto the electronic device 700 via the ROM 702 and/or the communication unit 709. When the computer program is loaded into the RAM 703 and executed by the computing unit 701, one or more steps of the method of operating a test unit described above may be performed. Alternatively, in other embodiments, the computing unit 701 may be configured by any other suitable means (e.g., by means of firmware) to perform the method of running of the test unit.
Various implementations of the systems and techniques described above may be realized in digital electronic circuitry, integrated circuitry, field programmable gate arrays (FPGAs), application specific integrated circuits (ASICs), application specific standard products (ASSPs), systems on chips (SOCs), complex programmable logic devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include: being implemented in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, receiving data and instructions from, and transmitting data and instructions to, a storage system, at least one input device, and at least one output device.
Program code for implementing the methods of the present disclosure may be written in any combination of one or more programming languages. These program codes may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the program codes, when executed by the processor or controller, cause the functions/operations specified in the flowchart and/or block diagram to be performed. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local Area Networks (LANs), Wide Area Networks (WANs), and the Internet.
The computer system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The Server can be a cloud Server, also called a cloud computing Server or a cloud host, and is a host product in a cloud computing service system, so as to solve the defects of high management difficulty and weak service expansibility in the traditional physical host and VPS service ("Virtual Private Server", or simply "VPS"). The server may also be a server of a distributed system, or a server incorporating a blockchain.
It should be understood that various forms of the flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in the present disclosure may be executed in parallel, sequentially, or in different orders, as long as the desired results of the technical solutions disclosed in the present disclosure can be achieved, and the present disclosure is not limited herein.
The above detailed description should not be construed as limiting the scope of the disclosure. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made in accordance with design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present disclosure should be included in the scope of protection of the present disclosure.

Claims (15)

1. A method for running test units, comprising:
concurrently running a plurality of test units based on an original grouping of the plurality of test units, and acquiring running results of the concurrent run, wherein the original grouping is determined based on resource utilization rates of the plurality of test units;
if there is a test unit whose running result is a running failure, regrouping the plurality of test units to obtain an adjusted grouping of the plurality of test units; and
concurrently running the plurality of test units again based on the adjusted grouping, until no test unit fails to run.
2. The method of claim 1, further comprising:
running each test unit in the plurality of test units individually to obtain the resource utilization rate of each test unit when running;
grouping the plurality of test units based on the resource utilization rate of each test unit when running to obtain the original grouping; and
if conflicting test units exist in the original grouping, moving the conflicting test units into different groups of the original grouping to update the original grouping.
3. The method of claim 2, wherein the grouping the plurality of test units based on the resource utilization rate of each test unit when running to obtain the original grouping comprises:
dividing the plurality of test units into at least two groups based on the resource utilization rate of each test unit when running, wherein different groups correspond to different resource utilization rates, and taking the at least two groups as the original grouping.
4. The method of claim 2, wherein there are a plurality of conflicting test units, the original grouping comprises a first group and a second group, and the moving the conflicting test units into different groups of the original grouping comprises:
moving at least one of the conflicting test units from the first group to the second group, wherein the resource utilization rate of the first group is lower than that of the second group.
5. The method of any of claims 1-4, wherein the original grouping comprises a plurality of groups, and the concurrently running the plurality of test units based on the original grouping of the plurality of test units comprises:
running a preset number of test units concurrently within each group of the original grouping, wherein the preset number differs between groups.
6. The method of any of claims 1-4, wherein the regrouping the plurality of test units to obtain the adjusted grouping of the plurality of test units comprises:
moving the test units whose running result is a running failure from their original groups into new groups to obtain the adjusted grouping of the plurality of test units, wherein the concurrency number of the adjusted grouping is smaller than that of the original grouping.
7. An apparatus for running test units, comprising:
a first running module, configured to concurrently run a plurality of test units based on an original grouping of the plurality of test units and acquire running results of the concurrent run, wherein the original grouping is determined based on resource utilization rates of the plurality of test units;
an adjusting module, configured to regroup the plurality of test units to obtain an adjusted grouping of the plurality of test units if there is a test unit whose running result is a running failure; and
a second running module, configured to concurrently run the plurality of test units again based on the adjusted grouping, until no test unit fails to run.
8. The apparatus of claim 7, further comprising:
a statistical module, configured to run each test unit in the plurality of test units individually to obtain the resource utilization rate of each test unit when running;
a grouping module, configured to group the plurality of test units based on the resource utilization rate of each test unit when running to obtain the original grouping; and
an updating module, configured to, if conflicting test units exist in the original grouping, move the conflicting test units into different groups of the original grouping to update the original grouping.
9. The apparatus of claim 8, wherein the grouping module is specifically configured to:
divide the plurality of test units into at least two groups based on the resource utilization rate of each test unit when running, wherein different groups correspond to different resource utilization rates, and take the at least two groups as the original grouping.
10. The apparatus of claim 8, wherein there are a plurality of conflicting test units, the original grouping comprises a first group and a second group, and the updating module is specifically configured to:
move at least one of the conflicting test units from the first group to the second group, wherein the resource utilization rate of the first group is lower than that of the second group.
11. The apparatus according to any of claims 7-10, wherein the original grouping comprises a plurality of groups, and the first running module is specifically configured to:
run a preset number of test units concurrently within each group of the original grouping, wherein the preset number differs between groups.
12. The apparatus according to any one of claims 7-10, wherein the adjusting module is specifically configured to:
move the test units whose running result is a running failure from their original groups into new groups to obtain the adjusted grouping of the plurality of test units, wherein the concurrency number of the adjusted grouping is smaller than that of the original grouping.
13. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-6.
14. A non-transitory computer readable storage medium having stored thereon computer instructions for causing the computer to perform the method of any one of claims 1-6.
15. A computer program product comprising a computer program which, when executed by a processor, implements the method according to any one of claims 1-6.
CN202110368224.8A 2021-04-06 2021-04-06 Method, device and equipment for operating test unit and storage medium Active CN113204478B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110368224.8A CN113204478B (en) 2021-04-06 2021-04-06 Method, device and equipment for operating test unit and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110368224.8A CN113204478B (en) 2021-04-06 2021-04-06 Method, device and equipment for operating test unit and storage medium

Publications (2)

Publication Number Publication Date
CN113204478A true CN113204478A (en) 2021-08-03
CN113204478B CN113204478B (en) 2022-05-03

Family

ID=77026245

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110368224.8A Active CN113204478B (en) 2021-04-06 2021-04-06 Method, device and equipment for operating test unit and storage medium

Country Status (1)

Country Link
CN (1) CN113204478B (en)

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104583789A (en) * 2012-04-11 2015-04-29 爱德万测试公司 Creation and scheduling of a decision and execution tree of a test cell controller
WO2019126758A1 (en) * 2017-12-22 2019-06-27 Alibaba Group Holding Limited A unified memory organization for neural network processors
CN109344309A (en) * 2018-09-18 2019-02-15 上海唯识律简信息科技有限公司 Extensive file and picture classification method and system are stacked based on convolutional neural networks
CN112036902A (en) * 2020-07-14 2020-12-04 深圳大学 Product authentication method and device based on deep learning, server and storage medium
CN111984545A (en) * 2020-09-24 2020-11-24 北京百度网讯科技有限公司 Method and device for testing stability of detection unit, electronic equipment and storage medium
CN112380131A (en) * 2020-11-20 2021-02-19 北京百度网讯科技有限公司 Module testing method and device and electronic equipment
CN112540914A (en) * 2020-11-27 2021-03-23 北京百度网讯科技有限公司 Execution method, execution device, server and storage medium for unit test

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
E. BUBER et al.: "Performance Analysis and CPU vs GPU Comparison for Deep Learning", 2018 6TH INTERNATIONAL CONFERENCE ON CONTROL ENGINEERING & INFORMATION TECHNOLOGY (CEIT) *
YU YANG et al.: "A scan cell design applicable to concurrent online testing", ACTA ELECTRONICA SINICA *

Also Published As

Publication number Publication date
CN113204478B (en) 2022-05-03


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant