CN107844420B - Binary online simulation and error checking method based on virtual machine snapshot

Info

Publication number: CN107844420B
Application number: CN201711022557.5A
Authority: CN
Inventors: 高翔; 杨小凡; 朱杰媛; 朱岩
Original assignee: Nanjing SAC Automation Co Ltd
Current assignee: Nanjing SAC Automation Co Ltd
Priority date: 2017-10-27
Filing date: 2017-10-27
Publication date: 2020-08-28
Anticipated expiration: 2037-10-27
Also published as: CN107844420A

Abstract

The invention discloses a dichotomy (bisection) online simulation and error checking method based on virtual machine snapshots, comprising the following steps: step one, generating a No. 0 image from the software component to be simulated and debugged online together with its software environment; step two, generating multiple mutually independent snapshots of virtual machine running environments to form a virtual machine image matrix; step three, sorting the virtual machine snapshot images; step four, progressively reducing the number of virtual machine snapshot images by dichotomy; step five, submitting the fault cause for manual judgment. The method makes full use of the characteristics of virtual machines, speeds up fault reproduction, facilitates manual analysis of different running sequences of the same fault, and allows the defect introduction point to be traced back, thereby indicating where the software defect may lie in the source code.

Description

Binary online simulation and error checking method based on virtual machine snapshot

Technical Field

The invention relates to the technical field of software testing and fault positioning, in particular to a dichotomy online simulation and error checking method based on virtual machine snapshots.

Background

With the continuous development of virtualization technology, most application scenarios, including program debugging and testing environments, can now be realized with virtualization.

Software testing consumes a large amount of manpower and material resources in the software development process, and fault localization is one of the most costly activities in testing. In the conventional single-machine approach, a developer introduces a defect into a program; the defect causes an erroneous program state, which in turn causes a program failure in the form of an externally perceptible error. The developer then executes the failing test case, repeatedly sets breakpoints with a debugger, and observes the program state until the erroneous state appears. From there the developer infers the possible sources of infection, locates the defect, and verifies and corrects it.

This conventional approach often requires the person debugging the program to simulate inputs and outputs again and again in order to drive the program into the part of the code where the problem is likely to occur, and, if the problem is related to running time, repeating the test also takes a significant amount of time. The most serious problem is that, when debugging is restarted from the beginning many times, the fault may fail to recur, or the observed behavior may differ on every run, which makes simulation and testing difficult. The main reason is that, in the multi-process, multi-thread environment of an operating system, uncertainty in external I/O delays and similar factors makes operating system scheduling nondeterministic, so the execution order of the defective code is inconsistent; as a result some problems cannot be reproduced reliably, and the defects are difficult to locate and handle.

Virtual machine technology may capture and store a snapshot of a running "machine," which is the state of the virtual machine at a particular time, including the hardware state, software state, operating system state, file system state, memory state, etc. of the machine. This captured state of the "in motion" machine has a complete context of the running state. The virtual machine can then be restored to any previous state by applying the corresponding snapshot to the virtual machine.

What is needed is an online simulation and error checking method that makes full use of the characteristics of virtual machines, speeds up fault reproduction, facilitates manual analysis of different running sequences of the same fault, and traces back to the defect introduction point so as to indicate where the software defect may lie in the source code. The present invention addresses these problems.

Disclosure of Invention

To overcome the shortcomings of the prior art, the invention aims to provide a dichotomy online simulation and error checking method based on virtual machine snapshots.

To achieve this object, the invention adopts the following technical solution:

A dichotomy online simulation and error checking method based on virtual machine snapshots comprises the following steps:

step one: generating a No. 0 image from the software component to be simulated and debugged online together with its software environment;

a) deploying, in a virtual machine management platform, the software component to be simulated and debugged online, the configured target operating system, the configured third-party software environment, the initial network configuration, and the target program system;

b) deploying a stub code script or program;

c) taking this virtual machine as the initial virtual machine image, referred to as the No. 0 image;

step two: generating multiple mutually independent snapshots of virtual machine running environments to form a virtual machine image matrix;

a) letting the probability that the problem occurs be $p$, $p \in (0, 1]$, and, in order to ensure that the problem is reproduced with sufficient probability, deploying

[formula not reproduced] engineering instances, where $k$ is a reliability parameter, $k > 0$;

b) deploying multiple instances of the No. 0 image using a virtual machine management API (application programming interface), and deploying the related virtual network environment; instantiating the No. 0 image as running virtual machines;

c) for each instance of the No. 0 image, using remote SSH control from a management machine to start the stub code program running inside the virtual machine instance;

d) after the stub code has started, the moment at which the target software begins to run is called that instance's time 0, and a virtual machine snapshot, the time-0 virtual machine snapshot, is created for it; if there are N instances of the No. 0 image, a virtual machine snapshot cluster $S = [S_{01}, S_{02}, S_{03}, \ldots, S_{0N}]$ is generated;

e) while the virtual machines are running, the management machine takes snapshots of the N virtual machines at a fixed time interval $T_0$, forming the corresponding No. 1, No. 2, No. 3, ..., No. N image instances;

f) when, after running for some time, a virtual machine suffers a program interruption or another abnormal behavior to be debugged, the stub code records the running conditions of the virtual machine at that moment and reports them to the management machine over the communication channel, and the management machine shuts the virtual machine down; if the last virtual machine image recorded before the error is numbered I, the virtual machine image at the time of the error is taken as image I+1;

g) after waiting for the time $T_{max}$, some virtual machines have reproduced the fault at or before $T_{max}$ while the others have failed to reproduce it, and a virtual machine image matrix is formed [matrix not reproduced], where $K = T_{max}/T_0$;

Because some virtual machine instances reproduce the fault before running to $T_{max}$, the images $S_{xy}$ after that point in time are set to empty; $S_{xy}$ denotes a virtual machine image, where x is the row and y is the column of the virtual machine image matrix;

h) the management machine identifies the position of the program exception according to the stub code and confirms the exception as a target exception according to a preset index;
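The image matrix described in step two can be pictured with a small sketch. The following is only an illustration under stated assumptions: the function name `build_image_matrix`, the string labels, and the use of `None` for an empty image are hypothetical, rows and columns are numbered from 1, and the matrix itself is not reproduced in this text.

```python
from typing import List, Optional


def build_image_matrix(fault_image: List[Optional[int]], k_max: int) -> List[List[Optional[str]]]:
    """Build a toy image matrix: row x is a virtual machine, column y is a snapshot.

    fault_image[x-1] is the number of the image taken when VM x failed (I+1 in
    the text), or None if the VM reached T_max without reproducing the fault.
    k_max plays the role of K = T_max / T_0. All names are hypothetical.
    """
    matrix = []
    for x, last in enumerate(fault_image, start=1):
        # A VM that never failed keeps snapshots up to column K.
        last_index = k_max if last is None else min(last, k_max)
        # Images after the fault point are set to empty (None).
        row = [f"S_{x},{y}" if y <= last_index else None for y in range(1, k_max + 1)]
        matrix.append(row)
    return matrix


if __name__ == "__main__":
    for row in build_image_matrix([3, None], 5):
        print(row)
```

In this sketch a row with trailing `None` entries corresponds to a virtual machine that reproduced the fault, and a fully populated row to one that ran to $T_{max}$ normally.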

step three: sorting the virtual machine snapshot images;

a) the management machine retains the images $S_{abnormal} = [S_{x1}, S_{x2}, S_{x3}, \ldots, S_{xN}]$ corresponding to the virtual machines that reproduced the exception, and deletes the images $S_{normal}$ corresponding to the virtual machines that reached $T_{max}$ without reproducing it;

b) letting the set of virtual machine images in which the fault occurred be $M_{fail}$, for each virtual machine image $i \in M_{fail}$ found to be abnormal, suppose that $S_{x1}, S_{x2}, S_{x3}, \ldots, S_{xk}$ are not empty and $S_{x(k+1)}$ is empty; the management machine then deletes the images $S_{x1}$ to $S_{x,middle}$, where $middle = \mathrm{UpperRound}(k/2)$ and UpperRound denotes the smallest integer greater than or equal to $k/2$; the $S_{x,middle}$ image is then restored in the original virtual machine instance;
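Step three can be sketched for a single matrix row as follows. This is a minimal illustration, not the patented implementation: `prune_row` and `upper_round` are hypothetical names, an empty image is modelled as `None`, and the sketch assumes that $S_{x,middle}$ is handed back as the restore point while the matrix entries up to and including `middle` are emptied.

```python
import math
from typing import List, Optional, Tuple


def upper_round(value: float) -> int:
    """Smallest integer greater than or equal to value (UpperRound in the text)."""
    return math.ceil(value)


def prune_row(row: List[Optional[str]]) -> Optional[Tuple[int, str, List[Optional[str]]]]:
    """Apply the step-three rule to one row S_x1..S_xK of the image matrix.

    Returns None for a row with no empty entry (assumed here to mean the VM
    reached T_max normally, so the whole row is deleted) or with no image at
    all. Otherwise returns (middle, restore_point, pruned_row).
    """
    k = 0
    for image in row:  # count the leading non-empty images S_x1..S_xk
        if image is None:
            break
        k += 1
    if k == 0 or k == len(row):
        return None
    middle = upper_round(k / 2)
    restore_point = row[middle - 1]  # S_x,middle with 1-based numbering
    # Entries S_x1..S_x,middle become empty; S_x,(middle+1)..S_xk are kept.
    pruned = [None] * middle + row[middle:k] + [None] * (len(row) - k)
    return middle, restore_point, pruned


if __name__ == "__main__":
    print(prune_row(["a", "b", "c", "d", "e", None, None]))
```

With five non-empty images the sketch picks `middle = 3`, so the third image becomes the restore point for the next round.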

step four: progressively reducing the number of virtual machine snapshot images by dichotomy;

a) to ensure that every virtual machine image that reproduced the exception enters the next iteration, the $S_{x,middle}$ image of each such virtual machine image i is taken out and at least one running instance of it is deployed; if, after all abnormal virtual machine images have been deployed, the total still does not reach

[formula not reproduced] engineering instances, the remaining running environments needed to reach that number of running instances are deployed, using the images of all virtual machines found to be abnormal;

b) it must be ensured that the virtual machines deployed from the images found to be abnormal number at least

[formula not reproduced] engineering instances;

c) for each instance, the stub code carries out the input and output operations that follow $S_{x,middle}$, and the actual simulation run is performed as in e) to g) of step two; because some virtual machine instances reproduce the fault before running to $T_{max}$, the images $S_{xy}$ after that point in time are set to empty; the virtual machine image matrix S can then be regenerated by the method of step three, with the instances $S_{x,0}$ to $S_{x,middle}$ empty;
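The redeployment rule in a) and b) of step four can be sketched as an allocation problem: every failing image gets at least one instance, and the remainder is filled up to the required total. In the sketch below the required total is passed in as `target_total`, because the expression for it is not reproduced in this text; the round-robin distribution of the remainder and the name `allocate_instances` are assumptions made for illustration.

```python
from typing import Dict, List


def allocate_instances(failing_images: List[str], target_total: int) -> Dict[str, int]:
    """Decide how many running instances to deploy from each S_x,middle image.

    Every failing image gets at least one instance. If the total is still
    below target_total, the remaining instances are spread round-robin over
    all failing images (the round-robin choice is an assumption).
    """
    if not failing_images:
        return {}
    counts = {image: 1 for image in failing_images}
    remaining = max(0, target_total - len(failing_images))
    for i in range(remaining):
        counts[failing_images[i % len(failing_images)]] += 1
    return counts


if __name__ == "__main__":
    print(allocate_instances(["S_1,middle", "S_2,middle", "S_3,middle"], 8))
```

If there are already more failing images than `target_total`, the sketch still deploys one instance per image, which matches the requirement that each of them enters the next iteration.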

step five: submitting the fault cause for manual judgment;

a) when the number of non-empty images in the image set of a virtual machine is found to be fewer than 100 samples and the virtual machine enters the fault state, all images meeting this condition are restored to the target virtual machine;

b) the information logs collected by the stub code over the successive runs of these virtual machines are submitted for manual analysis, and the virtual machine image matrix is provided for manual restore testing and debugging.

In the binary online simulation and error checking method based on the virtual machine snapshot, the function of the stub code includes: starting a target program; monitoring relevant software operation conditions; simulating related network communication, command input and human-computer interface input according to a set script; and generating the memory and call stack information of the target program crash point.

In the above dichotomy online simulation and error checking method based on virtual machine snapshots, the software operating conditions include: memory usage, CPU load, disk I/O, and program crash interception.
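Two of the stub code functions listed above, starting the target program and intercepting a crash, can be sketched with the Python standard library. This is a simplified illustration only: `run_target` and the report fields are hypothetical, a non-zero exit status is assumed to indicate a crash, and resource monitoring, input simulation, and memory or call stack dumps are left out.

```python
import subprocess
import sys
import time
from typing import Dict, List


def run_target(command: List[str], timeout_s: float) -> Dict[str, object]:
    """Start the target program and build a report for the management machine."""
    started = time.monotonic()
    try:
        completed = subprocess.run(
            command, capture_output=True, text=True, timeout=timeout_s
        )
        returncode = completed.returncode
        stderr_tail = completed.stderr[-2000:]  # keep only the end of the error output
        timed_out = False
    except subprocess.TimeoutExpired:
        returncode, stderr_tail, timed_out = None, "", True
    return {
        "command": command,
        "returncode": returncode,
        # Assumption: any non-zero exit status counts as the abnormal behavior.
        "crashed": returncode is not None and returncode != 0,
        "timed_out": timed_out,
        "runtime_s": time.monotonic() - started,
        "stderr_tail": stderr_tail,
    }


if __name__ == "__main__":
    report = run_target([sys.executable, "-c", "import sys; sys.exit(3)"], 30.0)
    print(report["crashed"], report["returncode"])
```

A real stub would send the report to the management machine instead of printing it; the transport is not specified here.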

In the above dichotomy online simulation and error checking method based on virtual machine snapshots, in step two, the probability that the problem occurs is $p$, $p \in (0, 1]$, and, to ensure that the problem is reproduced with sufficient probability,

[formula not reproduced] engineering instances may be deployed, where $k$ is a reliability parameter and $k \ge 5$.
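The expression for the number of instances is not reproduced in this text, and the sketch below does not attempt to reconstruct it. It only illustrates the underlying reasoning with the standard result for independent trials: if each instance reproduces the problem with probability $p$, at least one of $n$ instances reproduces it with probability $1 - (1 - p)^n$. The function names and the `confidence` parameter are hypothetical and are not the patent's reliability parameter $k$.

```python
import math


def reproduction_probability(p: float, n: int) -> float:
    """Probability that at least one of n independent instances reproduces the problem."""
    return 1.0 - (1.0 - p) ** n


def instances_for_confidence(p: float, confidence: float) -> int:
    """Smallest n with reproduction_probability(p, n) >= confidence.

    Assumes the instances are independent and each fails with probability p.
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("p must lie in (0, 1]")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must lie in (0, 1)")
    if p == 1.0:
        return 1
    return max(1, math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p)))


if __name__ == "__main__":
    n = instances_for_confidence(0.1, 0.99)
    print(n, reproduction_probability(0.1, n))
```

The smaller $p$ is, the more instances are needed, which is why the method relies on a pooled virtual machine environment where many instances can be deployed.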

In the binary online simulation and error checking method based on the virtual machine snapshot, in the second step, the virtual machine management API performs snapshot management by using functions provided by LibVirt and Qemu.
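As a rough sketch of how snapshots might be taken and restored through the libvirt Python bindings, the helpers below call `snapshotCreateXML`, `snapshotLookupByName`, and `revertToSnapshot` on a domain object. The helper names, the snapshot naming scheme `S_<row>_<column>`, and the connection details in the comments are assumptions; the patent does not state which calls are used.

```python
import xml.etree.ElementTree as ET


def snapshot_xml(name: str, description: str = "") -> str:
    """Build a minimal <domainsnapshot> description for libvirt."""
    root = ET.Element("domainsnapshot")
    ET.SubElement(root, "name").text = name
    if description:
        ET.SubElement(root, "description").text = description
    return ET.tostring(root, encoding="unicode")


def take_snapshot(dom, row: int, column: int) -> str:
    """Create snapshot S_xy of a running domain and return its name.

    `dom` is expected to be a libvirt virDomain, obtained for example with:
        import libvirt
        conn = libvirt.open("qemu:///system")   # URI is an assumption
        dom = conn.lookupByName("instance-1")   # domain name is hypothetical
    """
    name = f"S_{row}_{column}"
    dom.snapshotCreateXML(snapshot_xml(name, "periodic snapshot"), 0)
    return name


def restore_snapshot(dom, name: str) -> None:
    """Revert the domain to a previously taken snapshot, e.g. S_x,middle."""
    snapshot = dom.snapshotLookupByName(name, 0)
    dom.revertToSnapshot(snapshot, 0)
```

The helpers take the domain object as a parameter so that they can be exercised without a hypervisor; in a real deployment the management machine would call them on each instance at the interval $T_0$.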

In the above dichotomy online simulation and error checking method based on virtual machine snapshots:

a) to ensure a certain recurrence rate, the $S_{x,middle}$ image of each abnormal virtual machine image i is taken out and at least one running instance of it is deployed; if, after all abnormal virtual machine images have been deployed, the total still does not reach

[formula not reproduced] engineering instances, the remaining running environments needed to reach that number of running instances are deployed, using the images of all virtual machines found to be abnormal;

b) it must be ensured that the virtual machines deployed from the images found to be abnormal number at least

[formula not reproduced] engineering instances;

d) step three and a) to c) of step four are repeated, so that the set of candidate virtual machine image instances can be reduced further.
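How quickly the repetition narrows the candidate set can be seen with a small calculation. The sketch assumes, for illustration only, that the fault recurs at the same point in every round, so that a row with $k$ non-empty images keeps $k - \mathrm{UpperRound}(k/2)$ of them after each pass; `bisection_rounds` and the `threshold` parameter are hypothetical names, with 100 taken from the step-five condition.

```python
import math
from typing import List


def bisection_rounds(k: int, threshold: int = 100) -> List[int]:
    """Number of non-empty images in one row after each bisection pass.

    Stops as soon as the count drops below threshold. Assumes each pass
    empties S_x1..S_x,middle with middle = ceil(k / 2).
    """
    if threshold < 1:
        raise ValueError("threshold must be at least 1")
    counts = [k]
    while counts[-1] >= threshold:
        current = counts[-1]
        middle = math.ceil(current / 2)
        counts.append(current - middle)
    return counts


if __name__ == "__main__":
    print(bisection_rounds(1000))
```

Under this assumption the count roughly halves per pass, so the number of passes grows only logarithmically with the initial number of snapshots.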

In the above method, in step five, when the number of non-empty images in the image set of a virtual machine is found to be fewer than 100 samples and the virtual machine enters the fault state, all images meeting this condition are restored to the target virtual machine; the criterion for judging that the virtual machine has entered the fault state is that its running time is less than 1 minute.
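The hand-off condition of step five can be written as a simple predicate. The sketch below is one reading of the text: fewer than 100 non-empty images remain and the virtual machine reaches the fault within 1 minute of running. The function and parameter names are hypothetical.

```python
def ready_for_manual_analysis(
    non_empty_images: int,
    runtime_s: float,
    max_images: int = 100,
    max_runtime_s: float = 60.0,
) -> bool:
    """True when a row of the image matrix can be handed to manual analysis.

    non_empty_images: number of non-empty images left for this virtual machine.
    runtime_s: running time, in seconds, before the fault was reproduced.
    """
    return non_empty_images < max_images and runtime_s < max_runtime_s


if __name__ == "__main__":
    print(ready_for_manual_analysis(62, 42.0))
```

Rows that do not yet satisfy the predicate would go through another pass of steps three and four.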

The advantages of the invention are as follows. The invention provides a dichotomy online simulation and error checking method based on virtual machine snapshots that makes full use of the characteristics of virtual machines, exploiting the virtual machine snapshot function together with the state information of the test case. It speeds up fault reproduction, makes it possible to analyze manually the different running sequences of the same fault (held in different virtual machine instances at the fault point), and allows the defect introduction point to be traced back, thereby indicating where the software defect may lie in the source code. This design greatly improves software productivity and software quality. It also makes full use of the characteristics of modern pooled virtual machine computing resources: the number of deployed instances can be made sufficiently large, and, when the resource pool is large enough, the cost is approximately that of running the same instances on a single machine of the same model, as long as the total running time of all instances is about the same. The method can therefore reduce the use of whole physical machines and thus reduce cost.

Drawings

FIG. 1 is a flow chart of an embodiment of the present invention;

FIG. 2 is a deployment process diagram of an embodiment of the present invention.

Detailed Description

The invention is described in detail below with reference to the figures and the embodiments.

As shown in fig. 1, a binary method online simulation and error checking method based on virtual machine snapshot includes the following steps:

a) in a virtual machine management platform, deploying the configured target operating system to be simulated and debugged online, the third-party software environment, the initial network configuration, and the target program system;

b) deploying a stub code script or program; it should be noted that the functions of the stub code include: starting the target program; monitoring the relevant software operating conditions; simulating the related network communication, command input, and human-machine interface input according to a preset script; and generating memory and call stack information for the crash point of the target program. The software operating conditions include: memory usage, CPU load, disk I/O, and program crash interception.

c) This virtual machine is used as the initial virtual machine image, referred to as the No. 0 image.

A number of engineering instances; where $k$ is a reliability parameter, $k > 0$. As an example from engineering practice, $k \ge 5$; in principle, however, this patent places no limit on the value of $k$, which should depend on factors such as the available hardware resources, the difficulty of reproducing the problem, and the manpower consumed;

b) deploying multiple instances of the No. 0 image using a virtual machine management API (application programming interface), and deploying the related virtual network environment; instantiating the No. 0 image as running virtual machines; as one embodiment, the virtual machine management API performs snapshot management using functions provided by LibVirt and Qemu; it should be noted that this is only one example of a virtual machine management API, and the method described in this patent can be driven through whatever API is provided by the virtualization software and hardware environment actually in use.

d) after the stub code has started, the moment at which the target software begins to run is called that instance's time 0, and a virtual machine snapshot, the time-0 virtual machine snapshot, is created for it; if there are N instances of the No. 0 image, a virtual machine snapshot cluster $S = [S_{01}, S_{02}, S_{03}, \ldots, S_{0N}]$ is generated;

Because some virtual machine instances reproduce the fault before running to $T_{max}$, the images $S_{xy}$ after that point in time are set to empty; $S_{xy}$ denotes a virtual machine image, where x is the row and y is the column of the virtual machine image matrix;

step three: sorting the virtual machine snapshot images;

a) the management machine retains the images $S_{abnormal} = [S_{x1}, S_{x2}, S_{x3}, \ldots, S_{xN}]$ corresponding to the virtual machines that reproduced the exception, and deletes the images $S_{normal}$ corresponding to the virtual machines that reached $T_{max}$ without reproducing it;

b) letting the set of virtual machines in which the fault recurred be $M_{fail}$, for each virtual machine image $i \in M_{fail}$ found to be abnormal, suppose that $S_{x1}, S_{x2}, S_{x3}, \ldots, S_{xk}$ are not empty and $S_{x(k+1)}$ is empty; the management machine then deletes the images $S_{x1}$ to $S_{x,middle}$, where $middle = \mathrm{UpperRound}(k/2)$ and UpperRound denotes the smallest integer greater than or equal to $k/2$; the $S_{x,middle}$ image is then restored in the original virtual machine instance;

a) to ensure that every virtual machine image that reproduced the exception enters the next iteration, the $S_{x,middle}$ image of each such virtual machine i is taken out and at least one running instance of it is deployed; if, after all abnormal virtual machine images have been deployed, the required number of engineering instances has still not been reached, the remaining running environments are deployed using the images of all virtual machines found to be abnormal;

b) it must be ensured that the virtual machines deployed from the images found to be abnormal number at least

[formula not reproduced] engineering instances;

c) for each instance, the stub code carries out the input and output operations that follow $S_{x,middle}$, and the actual simulation run is performed as in e) to g) of step two; because some virtual machine instances reproduce the fault before running to $T_{max}$, the images $S_{xy}$ after that point in time are set to empty; the virtual machine image matrix S can then be regenerated by the method of step three, with the instances $S_{x,0}$ to $S_{x,middle}$ empty;

d) step three and a) to c) of step four are repeated, which ensures that the candidate virtual machine images, that is, the virtual machine images whose instances are not empty, can be reduced further.

Step five: submitting the fault cause for manual judgment;

a) when the number of non-empty images of a virtual machine is found to be fewer than 100 samples and the virtual machine enters the fault state, all images meeting this condition are restored to the target virtual machine; the criterion for judging that the virtual machine has entered the fault state is that its running time is less than 1 minute.

b) The information logs collected by the stub code over the successive runs of these virtual machines are submitted for manual analysis, and the virtual machine image matrix (most of whose entries have been set to empty) is provided for manual restore testing and debugging.

As shown in fig. 2, the deployment process of the method specifically includes:

a) In a virtual machine management platform, the configured target operating system to be simulated and debugged online, the third-party software environment, the initial network configuration, and the target program system are deployed.

b) A stub code script or program is deployed. The function of the stub code is to start the target program and monitor the relevant software operating conditions (memory usage, CPU load, disk I/O, program crash interception, etc.). Where online simulation requires it, the stub code also simulates related network communication, command input, human-machine interface input, and other operations according to a preset script. Most importantly, the stub code generates memory and call stack information for the crash point of the target program.

d) To ensure that the problem recurs, the scheme requires the deployment of multiple identical running environment instances. Let the probability that the problem occurs be $p$, $p \in (0, 1]$. To ensure that the problem is reproduced with a certain probability,

[formula not reproduced] engineering instances may be deployed, where $k$ is a reliability parameter, $k > 0$. As a recommendation from engineering practice, $k \ge 5$.

e) Multiple instances of the No. 0 image are deployed using the virtual machine management API, together with the related virtual network environment; the No. 0 image is instantiated as running virtual machines.

f) For each instance of the No. 0 image, the management machine starts the stub code program running inside the virtual machine instance, for example by remote SSH control.

g) After the stub code has started, the moment at which the target software begins to run is called that instance's time 0, and a virtual machine snapshot, the time-0 virtual machine snapshot, is created for it; if there are N instances of the No. 0 image, a virtual machine snapshot cluster $S = [S_{01}, S_{02}, S_{03}, \ldots, S_{0N}]$ is generated.

The invention provides a dichotomy online simulation and error checking method based on virtual machine snapshots that makes full use of the characteristics of virtual machines, exploiting the virtual machine snapshot function together with the state information of the test case. It speeds up fault reproduction, makes it possible to analyze manually the different running sequences of the same fault (held in different virtual machine instances at the fault point), and allows the defect introduction point to be traced back, thereby indicating where the software defect may lie in the source code. This design greatly improves software productivity and software quality. It also makes full use of the characteristics of modern pooled virtual machine computing resources: the number of deployed instances can be made sufficiently large, and, when the resource pool is large enough, the cost is approximately that of running the same instances on a single machine of the same model, as long as the total running time of all instances is about the same. The method can therefore reduce the use of whole physical machines and thus reduce cost.

The foregoing illustrates and describes the principles, general features, and advantages of the present invention. It should be understood by those skilled in the art that the above embodiments do not limit the present invention in any way, and all technical solutions obtained by using equivalent alternatives or equivalent variations fall within the scope of the present invention.

Claims

1. A dichotomy online simulation and error checking method based on virtual machine snapshots is characterized by comprising the following steps:

b) deploying a stub code script program;

A number of engineering instances; where $k$ is a reliability parameter, $k > 0$;

d) after the stub code has started, the moment at which the target software begins to run is called that instance's time 0, and a virtual machine snapshot, the time-0 virtual machine snapshot, is created for it; if there are N instances of the No. 0 image, a virtual machine snapshot cluster $S = [S_{01}, S_{02}, S_{03}, \ldots, S_{0N}]$ is generated;

step three: sorting the virtual machine snapshot images;

A number of engineering instances;

c) for each instance, the stub code carries out the input and output operations that follow $S_{x,middle}$, and the actual simulation run is performed as in e) to g) of step two; because some virtual machine instances reproduce the fault before running to $T_{max}$, the images $S_{xy}$ after that point in time are set to empty; the virtual machine image matrix S can then be regenerated by the method of step three, with the instances $S_{x,0}$ to $S_{x,middle}$ empty;

Step five: submitting the fault causes for manual judgment;

2. The binary online simulation and error-checking method based on the virtual machine snapshot according to claim 1, wherein the functions of the stub code include: starting a target program; monitoring relevant software operation conditions; simulating related network communication, command input and human-computer interface input according to a set script; and generating the memory and call stack information of the target program crash point.

3. The dichotomy online simulation and error checking method based on the virtual machine snapshot according to claim 2, wherein the software operating conditions comprise: memory usage, CPU load, disk I/O, and program crash interception.

4. The binary online simulation and error-checking method based on virtual machine snapshot as claimed in claim 1, wherein in the second step, the probability of problem occurrence is set as $p$, $p \in (0, 1]$; to ensure probability recurrence, deployments may be made

5. The binary online simulation and error-checking method based on virtual machine snapshots as claimed in claim 1, wherein in step two, the virtual machine management API utilizes functions provided by LibVirt and Qemu to perform snapshot management.

6. The binary online simulation and error-checking method based on virtual machine snapshot according to claim 1,

A number of engineering instances;

7. The dichotomy online simulation and error checking method based on the virtual machine snapshot according to claim 1, wherein in step five, when the number of non-empty images in the image set of a virtual machine is found to be fewer than 100 samples and the virtual machine enters the fault state, all images meeting this condition are restored to the target virtual machine; the criterion for judging that the virtual machine has entered the fault state is that its running time is less than 1 minute.