CN113032271B

CN113032271B - Quantitative determination method and system for redundancy of data samples

Info

Publication number: CN113032271B
Application number: CN202110352339.8A
Authority: CN
Inventors: 汤艳; 关昕; 陈理国; 马由; 李思雨; 陈美萍
Original assignee: CETC 15 Research Institute
Current assignee: CETC 15 Research Institute
Priority date: 2021-03-31
Filing date: 2021-03-31
Publication date: 2023-10-13
Anticipated expiration: 2041-03-31
Also published as: CN113032271A

Abstract

The invention provides a method and a system for quantitatively determining the redundancy of data samples. The method comprises the following steps: determining the uniform value range intervals of each input item and naming each interval; replacing the test data values in each sample with the names of their value range intervals; judging whether identical interval name samples exist; if no identical interval name samples exist, counting the occurrences of each interval name of each input item; calculating the redundancy of each test data sample from the occurrence counts of its interval names; and calculating the redundancy of the test data sample set from the redundancy of each test data sample. The method and the system can quantitatively determine the redundancy of a test data sample set.

Description

Quantitative determination method and system for redundancy of data samples

Technical Field

The invention relates to the technical field of software testing, in particular to a quantitative determination method and a quantitative determination system for redundancy of a data sample.

Background

In software testing, test data samples should cover the combinations of input item values as comprehensively as possible in order to improve test quality, while the number of samples in the test data sample set should be kept small in order to meet the test schedule and improve test efficiency. To balance test quality against test efficiency, some redundant samples are set manually according to specific conditions such as the importance of the item under test and the test schedule requirements, so that the software test covers the combinations of input items as comprehensively as possible while still meeting the schedule.

At present, redundant samples can only be set according to the personal experience of test designers. Redundancy determination is mainly applied to hardware equipment or software systems; in the field of software testing there is no quantitative method for determining the redundancy of test data samples, so the redundancy of existing test data samples cannot be determined.

Disclosure of Invention

The invention aims to provide a quantitative determination method and a quantitative determination system for redundancy of a data sample, which can quantitatively determine redundancy of a test data sample set.

To solve the above technical problem, the invention provides a method for quantitatively determining the redundancy of data samples, which comprises the following steps: determining the uniform value range intervals of each input item and naming each interval; replacing the test data values in each sample with the names of their value range intervals; judging whether identical interval name samples exist; if no identical interval name samples exist, counting the occurrences of each interval name of each input item; calculating the redundancy of each test data sample from the occurrence counts of its interval names; and calculating the redundancy of the test data sample set from the redundancy of each test data sample.

In some embodiments, the method further comprises: after judging whether identical interval name samples exist, if identical interval name samples exist, determining that unacceptable redundant samples exist and ending the determination process.

In some embodiments, calculating the redundancy of each test data sample from the occurrence counts of the interval names comprises: calculating the redundancy of each test data sample according to a first redundancy calculation formula.

In some embodiments, the first redundancy calculation formula is as follows:

$$RS = \min_{1 \le j \le m} RP_j$$

where RS is the redundancy of the test data sample S, m is the number of input items forming the test data sample, and $RP_j$ is the number of times test data value redundancy occurs between the value of the j-th input item of the test data sample S and the values of the j-th input item of the other test data samples in the same test data sample set.

In some embodiments, calculating the redundancy of the test data sample set from the redundancy of each test data sample comprises: calculating the redundancy of the test data sample set according to a second redundancy calculation formula.

In some embodiments, the second redundancy calculation formula is as follows:

$$RC = \max_{1 \le i \le n} RC_i$$

where RC is the redundancy of the test data sample set C, n is the number of test data samples contained in the test data sample set C, and $RC_i$ is the redundancy of the i-th test data sample in the test data sample set C.

In some embodiments, the first redundancy calculation formula and the second redundancy calculation formula both belong to a redundancy data model.

In addition, the invention also provides a quantitative determination system for redundancy of the data samples, which comprises: one or more processors; a storage means for storing one or more programs; the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method for quantitative determination of redundancy of data samples according to the foregoing.

With such a design, the invention has at least the following advantages:

In the field of software testing, the invention can quantitatively determine the redundancy of a test data sample set and verify whether the design of the test data samples meets the redundancy requirement of the test project.

In addition, when several test data sample sets exist, the invention can compare them in terms of redundancy and provide a quantitative index as a basis for selecting among them.

Drawings

The foregoing is merely an overview of the present invention, and the present invention is further described in detail below with reference to the accompanying drawings and detailed description.

Fig. 1 is an overall flowchart of a data sample redundancy quantitative determination method provided by the present invention.

Detailed Description

The preferred embodiments of the present invention will be described below with reference to the accompanying drawings, it being understood that the preferred embodiments described herein are for illustration and explanation of the present invention only, and are not intended to limit the present invention.

In the field of software testing, redundant samples can only be set according to human experience, and there is no quantitative index for the redundancy of test data samples. When a test data sample set is designed or selected, there is no scientific standard for judging redundant test data samples, so their redundancy cannot be measured accurately. The invention therefore aims to provide a quantitative determination method for data sample redundancy, based on the characteristics of test data samples in software testing, in order to solve the problem of quantitatively determining data sample redundancy in this field.

1.1 Determination of redundant samples

In software testing, a test data sample is formed by combining one value for each input item, and all test data samples of a test case form the test data sample set of that test case.

When test data samples are designed, all test data values to be covered by each input item are determined first, and the test data of the input items are then combined into complete test data samples according to certain rules. The redundancy of test data samples can therefore be divided into two levels:

1) Redundancy of input item test data;

2) Redundancy of test data samples.

1.1.1 Input item test data redundancy

Redundancy of input item test data means that, among all test data of a given input item, several test data values fall within the same uniform value range interval of that input item. These values are equivalent for the input item, so the input item is considered to have test data redundancy, and the equivalent test data values are called redundant test data.

Input item test data redundancy can be divided into 2 cases:

1) Redundancy of the same test data value;

2) Redundancy of different test data values.

For most tests, such as functional tests, interface tests, etc., it is meaningless to design the same test data values for the same input item, and thus redundancy of such same test data values should be avoided entirely.

Adding redundancy of different test data values can improve test quality to a certain extent, so this kind of redundancy can be retained according to project importance and schedule requirements.

1.1.2 Test data sample redundancy

Redundancy of a test data sample means that the value of every input item in a given test data sample is redundant with the value of the corresponding input item in other test data samples of the same test data sample set. The test data sample set is then considered to have test data sample redundancy, and that sample is called a redundant sample.

Test data sample redundancy is redundancy between test data samples in the combination types of their input item values. It can be divided into 3 cases:

1) For every input item, the two test data samples have the same test data value, that is, the two samples are completely equal. This is meaningless for most tests, and such redundancy should be avoided;

2) For every input item, the two test data samples have different but redundant test data values, that is, the samples are equivalent in terms of the combination type of input item values and only increase coverage of the value range. This is generally suitable for reliability tests, and such redundancy should be avoided in most other tests;

3) The values of all input items of a given test data sample are redundant with the values of the corresponding input items in 2 or more other test data samples of the test data sample set. Such redundant samples increase the coverage of different combination types of input item values and thereby improve test quality to a certain extent, so they can be retained according to the importance of the item under test and the test schedule requirements.
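The first two cases are pairwise and can be told apart mechanically. The following is a minimal sketch, not the patent's own procedure: the function `classify_pair`, the two example input items and their interval names are all hypothetical, and a pair that mixes identical and merely equivalent values is treated here as case 2 by assumption.

```python
from typing import Any, Callable, Sequence

# A classifier maps a raw test data value to the name of its value range interval.
Classifier = Callable[[Any], str]


def classify_pair(
    a: Sequence[Any], b: Sequence[Any], classifiers: Sequence[Classifier]
) -> str:
    """Classify two samples as case 1, case 2, or not pairwise redundant."""
    if tuple(a) == tuple(b):
        return "case 1"  # every input item has the same value
    names_a = tuple(classify(v) for classify, v in zip(classifiers, a))
    names_b = tuple(classify(v) for classify, v in zip(classifiers, b))
    if names_a == names_b:
        return "case 2"  # every input item falls in the same interval
    return "no pairwise redundancy"


# Hypothetical input items used only for illustration.
def age_interval(age: int) -> str:
    return "minor" if age < 18 else "adult"


def name_interval(name: str) -> str:
    return "non-empty" if name else "empty"


classifiers = [age_interval, name_interval]
print(classify_pair((20, "Ann"), (20, "Ann"), classifiers))  # case 1
print(classify_pair((20, "Ann"), (35, "Bob"), classifiers))  # case 2
print(classify_pair((20, "Ann"), (10, "Bob"), classifiers))  # no pairwise redundancy
```

Case 3 is not a property of a single pair, because the redundancy is spread over 2 or more other samples, so it is not covered by this pairwise check.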

1.2 Redundancy data model

Redundancy is a quantitative index that measures how redundant the test data samples are. It is determined by establishing a redundancy data model and calculating the redundancy of the test data samples in the test data sample set.

The redundancy data model applies only to case 3 of test data sample redundancy. If the test data sample set contains redundancy that has been determined to be avoided, no redundancy calculation is performed on the set, and the set is judged to contain unacceptable redundant samples.

The redundancy data model of the test data sample set is:

$$RC = \max_{1 \le i \le n} RC_i$$

where RC is the redundancy of the test data sample set C, n is the number of test data samples contained in C, and $RC_i$ is the redundancy of the i-th test data sample in C. That is, the redundancy of a test data sample set is the maximum of the redundancies of all test data samples in it.

The redundancy data model of a test data sample in the test data sample set is:

$$RS = \min_{1 \le j \le m} RP_j$$

where RS is the redundancy of the test data sample S, m is the number of input items forming the test data sample, and $RP_j$ is the number of times test data value redundancy occurs between the value of the j-th input item of S and the values of the j-th input item of the other test data samples in the same test data sample set. That is, the redundancy of a test data sample is the minimum, over all of its input items, of the number of times the value of that input item occurs redundantly in the test data sample set (see step 6 of section 1.3). The sample itself is counted, so the smallest possible value is 1.

When the redundancy of the test data sample set is 1, it means that there is no redundancy in the test data sample set.
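As a minimal sketch of the model, the set redundancy is the maximum of the sample redundancies, and each sample redundancy is the minimum occurrence count over its input items, as stated in step 6 of section 1.3. The function names `sample_redundancy` and `set_redundancy` and the counts below are illustrative assumptions, not part of the patent.

```python
from typing import Sequence


def sample_redundancy(occurrence_counts: Sequence[int]) -> int:
    """RS: minimum of the per-input-item occurrence counts RP_j of one sample."""
    if not occurrence_counts:
        raise ValueError("a sample needs at least one input item")
    return min(occurrence_counts)


def set_redundancy(sample_redundancies: Sequence[int]) -> int:
    """RC: maximum of the redundancies RC_i of all samples in the set."""
    if not sample_redundancies:
        raise ValueError("a sample set needs at least one sample")
    return max(sample_redundancies)


# Hypothetical occurrence counts for three samples with two input items each.
counts_per_sample = [(2, 3), (1, 1), (2, 1)]
per_sample = [sample_redundancy(counts) for counts in counts_per_sample]
print(per_sample, set_redundancy(per_sample))  # [2, 1, 1] 2
```

A set in which every sample has at least one input item value that occurs only once yields a redundancy of 1, which matches the statement that 1 means no redundancy.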

1.3 Redundancy determination algorithm

According to the definition of the redundancy data model, the test data sample redundancy determination algorithm proceeds as follows:

1) Divide the full value range of each input item into several non-overlapping uniform value range intervals, such that all values within one interval can be regarded as equivalent;

2) Name each value range interval, such that the interval names of the same input item are unique;

3) Traverse all test data samples in the test data sample set and replace the test data value of each input item with the name of the value range interval it belongs to, obtaining samples that consist of the interval names of the input items (interval name samples);

4) Judge whether 2 or more interval name samples in the test data sample set are completely identical; if so, the test data sample set is judged to contain unacceptable redundant samples; if not, continue with the following steps;

5) For each input item, count the number of times each interval name appears in the test data sample set;

6) The redundancy of each test data sample is the minimum of the occurrence counts of the interval names of its input items;

7) The redundancy of the test data sample set is the maximum of the redundancies of all test data samples in it.
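The seven steps can be sketched end to end as follows. This is an illustration under stated assumptions rather than a definitive implementation: `judge_redundancy`, the `Classifier` type, the use of `None` to signal unacceptable redundant samples, and the two example input items are all hypothetical. Steps 1 and 2 are represented by the classifier functions, which the caller supplies.

```python
from collections import Counter
from typing import Any, Callable, Optional, Sequence

# Steps 1-2: one classifier per input item maps a raw value to its interval name.
Classifier = Callable[[Any], str]


def judge_redundancy(
    samples: Sequence[Sequence[Any]], classifiers: Sequence[Classifier]
) -> Optional[int]:
    """Return the set redundancy, or None if unacceptable redundant samples exist."""
    if not samples:
        raise ValueError("the test data sample set is empty")
    # Step 3: replace each raw value with the name of its value range interval.
    named = [
        tuple(classify(value) for classify, value in zip(classifiers, sample))
        for sample in samples
    ]
    # Step 4: identical interval name samples are unacceptable redundancy.
    if len(set(named)) < len(named):
        return None
    # Step 5: per input item, count how often each interval name occurs.
    counts = [Counter(column) for column in zip(*named)]
    # Step 6: sample redundancy is the minimum count over its input items.
    per_sample = [min(counts[j][name] for j, name in enumerate(row)) for row in named]
    # Step 7: set redundancy is the maximum sample redundancy.
    return max(per_sample)


# Hypothetical input items used only for illustration.
def sign_interval(value: int) -> str:
    return "negative" if value < 0 else "zero" if value == 0 else "positive"


def text_interval(value: str) -> str:
    return "non-empty" if value else "empty"


items = [sign_interval, text_interval]
print(judge_redundancy([(1, "a"), (2, ""), (-1, "b")], items))  # 2
print(judge_redundancy([(1, "a"), (2, "b")], items))  # None
```

In the first call the sample `(1, "a")` shares "positive" with one other sample and "non-empty" with another, so its redundancy is 2. In the second call both samples map to the same interval names, so the check in step 4 stops the calculation.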

2.1 Quantitative determination of redundancy

In a test project for an information management system, there is a use case "add department information". The data requirements for department information are shown in Table 1, and the test data samples of the use case are shown in Table 2.

Table 1 Department information data requirements

Field name	Content	Remarks
Department name	Text information	Non-empty, length (0, 100]
Description of responsibilities	Text information	Length [0, 1000]

Table 2 Test data samples for "add department information"

Sample identification	Department name	Description of responsibilities
S1	Software testing part	Performing software testing
S2	A part (C)	-
S3	(100 characters)	(1000 characters)
F1	-	Performing software testing
F2	(101 characters)	Performing software testing
F3	Software testing part	(1001 characters)

The method of the invention is applied to determine the redundancy of the test data samples of this use case as follows:

Step 1: determine the uniform value range intervals of the input items and name them.

The use case contains 2 input items, department name and description of responsibilities. Each input item is divided into uniform value range intervals that are named uniquely, as shown in Table 3.

Table 3 Value range intervals

Step 2: replace the test data values in the samples with the names of their value range intervals.

The values of the input items in the test data samples in Table 2 are replaced with the corresponding interval names in Table 3, as shown in Table 4.

Table 4 Interval name samples

Sample identification	Department name	Description of responsibilities
S1	Effective equivalence class	Effective equivalence class
S2	Legal left boundary	Legal null value
S3	Legal right boundary	Legal right boundary
F1	Illegal null value	Effective equivalence class
F2	Illegal right boundary	Effective equivalence class
F3	Effective equivalence class	Illegal right boundary
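The replacement in step 2 can be sketched as a pair of classifier functions applied to the lengths of the values in Table 2. The interval boundaries below are assumptions inferred from the length limits in Table 1 and the interval names in Table 4, since the contents of Table 3 are not reproduced in this text. The lengths 5, 1 and 10 are assumed stand-ins for the text values in Table 2, and the function names are hypothetical.

```python
def department_name_interval(length: int) -> str:
    """Assumed intervals for a department name with valid length (0, 100]."""
    if length == 0:
        return "Illegal null value"
    if length == 1:
        return "Legal left boundary"
    if length < 100:
        return "Effective equivalence class"
    if length == 100:
        return "Legal right boundary"
    return "Illegal right boundary"


def responsibilities_interval(length: int) -> str:
    """Assumed intervals for a description with valid length [0, 1000]."""
    if length == 0:
        return "Legal null value"
    if length < 1000:
        return "Effective equivalence class"
    if length == 1000:
        return "Legal right boundary"
    return "Illegal right boundary"


# Table 2 expressed as (department name length, description length).
# 5, 1 and 10 are assumed lengths; 0 stands for an empty value ("-").
table_2_lengths = {
    "S1": (5, 10),
    "S2": (1, 0),
    "S3": (100, 1000),
    "F1": (0, 10),
    "F2": (101, 10),
    "F3": (5, 1001),
}

table_4 = {
    sample_id: (
        department_name_interval(name_length),
        responsibilities_interval(description_length),
    )
    for sample_id, (name_length, description_length) in table_2_lengths.items()
}
for sample_id, names in table_4.items():
    print(sample_id, *names, sep="\t")
```

Under these assumptions the printed rows match Table 4.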

Step 3: judge whether identical interval name samples exist; if not, count the occurrences of each interval name of each input item.

No two test data samples in Table 2 have identical interval name combinations (see Table 4). The number of times each interval name of each input item appears in the test data sample set is shown in Table 5.

Table 5 Number of occurrences of interval names

Sample identification	Department name	Description of responsibilities
S1	Effective equivalence class (2 times)	Effective equivalence class (3 times)
S2	Legal left boundary (1 time)	Legal null value (1 time)
S3	Legal right boundary (1 time)	Legal right boundary (1 time)
F1	Illegal null value (1 time)	Effective equivalence class (3 times)
F2	Illegal right boundary (1 time)	Effective equivalence class (3 times)
F3	Effective equivalence class (2 times)	Illegal right boundary (1 time)
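The counts in Table 5 can be reproduced with one counter per input item. The sketch below copies the interval name samples from Table 4; the variable names are illustrative, and the use of `collections.Counter` is an implementation choice, not something the patent prescribes.

```python
from collections import Counter

EFFECTIVE = "Effective equivalence class"

# Interval name samples copied from Table 4: (department name, description).
table_4 = {
    "S1": (EFFECTIVE, EFFECTIVE),
    "S2": ("Legal left boundary", "Legal null value"),
    "S3": ("Legal right boundary", "Legal right boundary"),
    "F1": ("Illegal null value", EFFECTIVE),
    "F2": ("Illegal right boundary", EFFECTIVE),
    "F3": (EFFECTIVE, "Illegal right boundary"),
}

# Step 3, first half: check that no two interval name samples are identical.
rows = list(table_4.values())
has_identical_samples = len(set(rows)) < len(rows)

# Step 3, second half: count interval names separately for each input item.
name_counts, description_counts = (Counter(column) for column in zip(*rows))

table_5 = {
    sample_id: (name_counts[name], description_counts[description])
    for sample_id, (name, description) in table_4.items()
}
print(has_identical_samples)  # False
print(table_5["S1"], table_5["F3"])  # (2, 3) (2, 1)
```

Counting per input item matters: "Legal right boundary" appears in both columns of S3, but each column is counted on its own, so both counts are 1.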

Step 4: calculate the redundancy of each test data sample.

According to the redundancy calculation formula $RS = \min_{1 \le j \le m} RP_j$, the redundancy of each test data sample of the use case is shown in Table 6.

Table 6 Test data sample redundancy

Sample identification	Redundancy
S1	2
S2	1
S3	1
F1	1
F2	1
F3	1

Step 5: calculate the redundancy of the test data sample set.

According to the redundancy calculation formula $RC = \max_{1 \le i \le n} RC_i$, the redundancy of the test data sample set of the use case is 2.

The test project requires the redundancy of the test data samples to be 1. However, the S1 sample is designated as the main path test, in which all input items take effective equivalence class values; after the S1 sample is excluded, the redundancy of the test data sample set of the use case is 1. The test data sample design of the use case can therefore be determined to meet the redundancy requirement of the test project.
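The effect of excluding S1 can be checked by recomputing the occurrence counts on the reduced set, since removing a sample changes the counts of the remaining ones. The following sketch starts from the interval name samples in Table 4; the helper `redundancies` is a hypothetical name.

```python
from collections import Counter
from typing import Dict, Tuple

EFFECTIVE = "Effective equivalence class"

# Interval name samples copied from Table 4: (department name, description).
table_4: Dict[str, Tuple[str, str]] = {
    "S1": (EFFECTIVE, EFFECTIVE),
    "S2": ("Legal left boundary", "Legal null value"),
    "S3": ("Legal right boundary", "Legal right boundary"),
    "F1": ("Illegal null value", EFFECTIVE),
    "F2": ("Illegal right boundary", EFFECTIVE),
    "F3": (EFFECTIVE, "Illegal right boundary"),
}


def redundancies(named_samples: Dict[str, Tuple[str, ...]]) -> Dict[str, int]:
    """Map each sample id to the minimum interval name count over its input items."""
    counts = [Counter(column) for column in zip(*named_samples.values())]
    return {
        sample_id: min(counts[j][name] for j, name in enumerate(row))
        for sample_id, row in named_samples.items()
    }


with_s1 = redundancies(table_4)
without_s1 = redundancies(
    {sample_id: row for sample_id, row in table_4.items() if sample_id != "S1"}
)
print(with_s1)  # {'S1': 2, 'S2': 1, 'S3': 1, 'F1': 1, 'F2': 1, 'F3': 1}
print(max(with_s1.values()), max(without_s1.values()))  # 2 1
```

With S1 included the result matches Table 6 and the set redundancy of 2. Without S1, the department name "Effective equivalence class" occurs only once (in F3), so every remaining sample has redundancy 1.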

2.2 Beneficial effects obtained by the embodiment of the invention

With the quantitative redundancy determination method, a test project can state a clear, quantitative index requirement for the redundancy of the test design. After the test design is completed, the redundancy of the test data samples can be verified quantitatively through the redundancy of the test data sample set, so that the quality of the test data samples is evaluated in terms of redundancy.

The above description covers only the preferred embodiments of the present invention and is not intended to limit the invention in any way. Simple modifications, equivalent variations or adaptations made by those skilled in the art using the teachings disclosed herein fall within the scope of the present invention.

Claims

1. A method for quantitatively determining redundancy of a data sample, comprising:

determining a uniform value range interval of the input item, and naming the input item according to the value range interval;

replacing the test data value in the sample by using the name of the value range interval;

judging whether identical interval name samples exist or not;

if no identical interval name samples exist, calculating the number of occurrences of each interval name of the input item;

calculating the redundancy of each test data sample according to the number of occurrences of each interval name, comprising:

calculating the redundancy of each test data sample according to a first redundancy calculation formula;

the first redundancy calculation formula is as follows:

$$RS = \min_{1 \le j \le m} RP_j$$

wherein RS is the redundancy of the test data sample S, m is the number of input items forming the test data sample, and $RP_j$ is the number of times test data value redundancy occurs between the value of the j-th input item of the test data sample S and the values of the j-th input item of other test data samples in the same test data sample set;

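calculating the redundancy of the test data sample set according to the redundancy of each test data sample, comprising: calculating the redundancy of the test data sample set according to a second redundancy calculation formula;

the second redundancy calculation formula is as follows:

$$RC = \max_{1 \le i \le n} RC_i$$

wherein RC is the redundancy of the test data sample set C, n is the number of test data samples contained in the test data sample set C, and $RC_i$ is the redundancy of the i-th test data sample in the test data sample set C.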

2. The method for quantitatively determining redundancy in a data sample according to claim 1, further comprising:

after judging whether identical interval name samples exist, if identical interval name samples exist, determining that unacceptable redundant samples exist and ending the determination process.

3. The method of claim 1, wherein the first redundancy calculation formula and the second redundancy calculation formula both belong to a redundancy data model.

4. A data sample redundancy quantitative determination system, comprising:

one or more processors;

a storage means for storing one or more programs;

the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the data sample redundancy quantitative determination method of any one of claims 1 to 3.