CN110618926A - Source code analysis method and source code analysis device

Info

Publication number: CN110618926A
Application number: CN201910378645.1A
Authority: CN
Inventors: 堀旭宏; 市井诚; 利国爱; 川上真澄
Original assignee: Clarion Co Ltd
Current assignee: Faurecia Clarion Electronics Co Ltd
Priority date: 2018-06-19
Filing date: 2019-05-08
Publication date: 2019-12-27
Also published as: JP2019219848A

Abstract

The invention provides a source code analysis method and a source code analysis device that present the program elements in source code that should be divided. A processor (11) calculates code metrics from source code (13B), calculates a Long Method Score based on the code metrics, calculates a maintainability index based on the code metrics, determines the program elements to be divided based on the Long Method Score, classifies those program elements into child reverse patterns (sub-anti-patterns) based on child reverse pattern definition information, and presents improvement positions in the source code based on improvement category information.

Description

Source code analysis method and source code analysis device

Technical Field

The invention relates to a source code analysis method and a source code analysis device.

Background

In recent years, software has mainly been developed by derivative development, in which new software is created by extending or changing already developed parent software. In derivative development, functions are repeatedly extended or changed over the years, so the software becomes complicated and the readability of the source code tends to decline. To remedy this, software reconfiguration (refactoring, also called reconstruction below) is generally performed. Reconfiguration means changing the internal structure of the software without changing its operation. If constituent parts of a program with low maintainability, such as methods, classes, and files (hereinafter, referred to as program elements), are appropriately reconfigured, the software becomes easy to extend or change.

Reverse modes (anti-patterns) are among the techniques that support reconstruction. A reverse mode is a combination of a feature of a program element having low maintainability (hereinafter, referred to as a problem feature) and a pattern for reconstructing a program element that has that problem feature. If a program element having a problem feature described by a reverse mode can be identified in the software to be reconstructed, the reconstruction can be performed by applying the reverse mode.

Patent document 1 describes a method of measuring metrics from a source code, a configuration management system, or an operation log, and evaluating whether or not the source code matches a reverse mode. Thus, it is possible to determine the source code that actually needs to be reconstructed without extracting a complicated source code that does not need to be reconstructed.

Documents of the prior art

Patent document

Patent document 1: Japanese Patent Laid-Open Publication No. 2016-143107

Disclosure of Invention

Technical problem to be solved by the invention

However, the method of patent document 1 determines whether or not a program element matches a reverse pattern using feature quantities of the program element obtained by analyzing the source code (hereinafter, referred to as code metrics), and, if the program element matches, presents it together with a reconstruction flow and a priority. The presented reconstruction flow is merely a correction policy for the program element as a whole, set in advance for each reverse mode. Therefore, the reconstruction worker must personally determine which position in the program element is to be corrected, which demands a high level of skill.

The following description will be made by taking a reverse mode of the Long Method as an example.

The Long Method is a reverse mode for reconstructing a program element whose readability is low because it is an excessively large method. When readability is low, the person correcting the program element not only needs time to understand it, but may also misunderstand its operation and correct it erroneously.

For program elements that match the Long Method, the usual reconstruction flow is method division: the oversized method is split into several smaller methods so that it no longer matches the Long Method.

With the method of patent document 1, it is possible to find a program element that matches the Long Method and present it on a screen together with a reconstruction flow and a priority. However, the reconstruction flow for the Long Method goes no further than proposing method division; it cannot indicate at which position the method should be divided. Therefore, the reconstruction worker must personally determine the division position of the method, which demands a high level of skill.

The present invention has been made in view of the above circumstances, and an object thereof is to provide a source code analysis method and a source code analysis apparatus capable of presenting a program element to be divided included in a source code.

Means for solving the problems

In order to achieve the above object, a source code analysis method according to a first aspect of the present invention is characterized in that: a code metric acquisition unit analyzes source code having a plurality of program elements and generates code metrics regarding the length and complexity of the program elements; a submission information acquisition unit acquires submission information (commit information) on the change history of the program elements; a score calculation unit calculates a score of each program element based on the code metrics and the submission information; a maintainability index calculation unit calculates a maintainability index of each program element based on the code metrics; and a list generation unit generates a list of the program elements that are candidates to be divided into a plurality of parts, based on the score and the maintainability index.

Effects of the invention

The invention can present the program elements included in the source code that should be divided.

Drawings

Fig. 1 is a block diagram showing a hardware configuration of a source code analysis device according to a first embodiment.

Fig. 2 is a block diagram showing a functional configuration of the source code analysis device of fig. 1.

Fig. 3 is a diagram showing an example of code metrics acquired by the source code analysis device in fig. 2.

Fig. 4 is a diagram showing an example of a program element change history input to the source code analysis device of fig. 2.

Fig. 5 is a diagram showing an example of the Long Method Score calculated by the source code analysis device of fig. 2.

Fig. 6 is a diagram showing an example of the maintainability index calculated by the source code analyzer of fig. 2.

Fig. 7 is a diagram showing an example of a Long Method list generated by the source code analysis device in fig. 2.

Fig. 8 is a flowchart showing a process of the code metric acquisition unit of fig. 2.

Fig. 9 is a flowchart showing a process of the Long Method Score calculating unit of fig. 2.

Fig. 10 is a flowchart showing a process of the maintainability index calculation unit.

Fig. 11 is a flowchart showing a process of the Long Method condition setting unit in fig. 2.

Fig. 12 is a flowchart showing a process of the Long Method list generation unit in fig. 2.

Fig. 13 is a diagram showing an example of a histogram in which the horizontal axis represents the maintainability index and the vertical axis represents the number of program elements, which is displayed by the Long Method condition setting unit in fig. 2.

Fig. 14 is a diagram showing an example of a histogram with the horizontal axis of the Long Method Score and the vertical axis of the program element number displayed by the Long Method condition setting unit of fig. 2.

Fig. 15 is a block diagram showing a functional configuration of a source code analysis device according to the second embodiment.

Fig. 16 is a diagram showing an example of error (bug) history information input to the source code analysis device of fig. 15.

Fig. 17 is a flowchart showing a process of the Long Method condition setting unit in fig. 15.

Fig. 18 is a flowchart showing a process of the Long Method list generation unit in fig. 15.

Fig. 19 is a diagram showing an example of a histogram in which the horizontal axis represents the maintainability index and the vertical axis represents the number of program elements, which is displayed by the Long Method condition setting unit in fig. 15.

Fig. 20 is a diagram showing an example of a histogram with the horizontal axis of the Long Method Score and the vertical axis of the program element number displayed by the Long Method condition setting unit of fig. 15.

Fig. 21 is a diagram showing an example of a Long Method list generated by the source code analyzer according to the third embodiment.

Fig. 22 is a block diagram showing a functional configuration of a source code analysis device according to the fourth embodiment.

Fig. 23 is a diagram showing an example of child reverse side pattern definition information input to the source code analysis device of fig. 22.

Fig. 24 is a diagram showing an example of the child reverse side pattern list generated by the source code analysis device in fig. 22.

Fig. 25 is a diagram showing an example of improvement category information input to the source code analysis device of fig. 22.

Fig. 26 is a flowchart showing the processing of the child reverse side pattern classification section of fig. 22.

Fig. 27 is a flowchart showing a process of the improvement position mapping unit of fig. 22.

Fig. 28 is a diagram showing an example of source code output from the source code analysis device according to the sixth embodiment, the source code showing an improved position.

Fig. 29 is a diagram showing an example of screen transition of the source code analysis device according to the eighth embodiment.

Fig. 30 is a diagram showing an example of an analysis object selection screen of the source code analysis device shown in fig. 22.

Fig. 31 is a diagram showing an example of a Long Method condition setting screen of the source code analysis device shown in fig. 22.

Fig. 32 is a diagram showing an example of a Long Method list screen of the source code analysis device in fig. 22.

Fig. 33 is a diagram showing an example of an improved position display screen of the source code analyzer of fig. 22.

Description of the symbols

10: source code analysis device; 11: processor; 12: main storage device; 13: auxiliary storage device; 14: input device; 15: output device; 16: communication device; 17: bus.

Detailed Description

Embodiments are described with reference to the drawings. The embodiments described below are not intended to limit the invention in the claims, and the elements and all combinations thereof described in the embodiments are not necessarily essential to the solving means of the invention.

(first embodiment)

In fig. 1, the source code analysis device 10A includes a processor 11, a main storage device 12, an auxiliary storage device 13, an input device 14, an output device 15, and a communication device 16. The processor 11, the main storage 12, the auxiliary storage 13, the input device 14, the output device 15, and the communication device 16 are communicably connected to each other via a communication unit such as a bus 17.

The processor 11 is constituted by, for example, a CPU (Central Processing Unit) or an MPU (Micro Processing Unit). The processor 11 reads and executes a program stored in the main storage device 12 to realize various functions of the source code analysis device 10A. The main storage device 12 is a device for storing programs and data; examples include a ROM (Read Only Memory), a RAM (Random Access Memory), and an NVRAM (Non-Volatile RAM), which is a non-volatile semiconductor memory.

The auxiliary storage device 13 is, for example, a hard disk drive, an SSD (Solid State Drive), an optical storage device (CD (Compact Disc), DVD (Digital Versatile Disc), or the like), a storage system, a reading/writing device for a recording medium such as an IC card, an SD memory card, or an optical recording medium, or a storage area of a cloud server. Programs and data stored in the auxiliary storage device 13 can be loaded into the main storage device 12 at any time.

The input device 14 is, for example, a keyboard, a mouse, a touch panel, a card reader, an audio input device, or the like. The output device 15 is a user interface for providing various information such as the progress of processing and the result of processing to the user. The output device 15 is, for example, a screen display device (a liquid crystal monitor, an organic EL display, a video card, or the like), an audio output device (a speaker, or the like), a printing device, or the like. For example, the source code analysis device 10A may be configured to input and output information to and from another device via the communication device 16.

The communication device 16 is a wired or wireless communication interface that realizes communication with another device via a communication means such as a LAN (Local Area Network) or the internet. The communication device 16 is, for example, an NIC (Network Interface Card), a wireless communication module, a USB (Universal Serial Bus) module, a serial communication module, or the like.

Here, the auxiliary storage device 13 can store the source code analysis program 13A and the source code 13B. Then, the processor 11 loads the source code analysis program 13A and the source code 13B into the main storage device 12, and executes the source code analysis program 13A, thereby analyzing the source code 13B.

At this time, the processor 11 can analyze the source code 13B having a plurality of program elements to generate code metrics regarding the length and complexity of the program elements, acquire submission information regarding the change history of the program elements, calculate a score of each program element based on the code metrics and the submission information, calculate a maintainability index of each program element based on the code metrics, and generate a list of program elements that are candidates to be divided into a plurality of parts based on the score and the maintainability index. The score may be the Long Method Score.

The processor 11 is also capable of classifying the program elements to be divided into child-reverse patterns based on the child-reverse pattern definition information, and prompting the improvement positions of the source codes based on the improvement category information.

A code metric is a value that quantifies the source code 13B according to a set of criteria for source code. For example, values regarding the length and complexity of a program element, generated by analyzing the source code 13B, may be used as code metrics. A program element is a component of a program, for example a module, class, file, method, field, function, or variable. The Long Method Score expresses numerically the characteristic that the readability of a program element is low because the program element is an excessively large method. The maintainability index expresses numerically a characteristic that affects the workload of changing the program. The child reverse side patterns (sub-anti-patterns) are obtained by subdividing the reverse side pattern according to the type of Long Method. The child reverse side pattern definition information is information defining the child reverse side patterns. The improvement category information is information indicating how to improve the program for each child reverse side pattern.

The execution of the source code analysis program 13A may be shared by a plurality of processors or computers. Alternatively, the processor 11 may instruct a cloud computer or the like to execute all or a part of the source code analysis program 13A via the communication device 16 and receive the execution result.

In fig. 2, the source code analysis device 10A includes a submission information acquisition unit 100, a code metric acquisition unit 101, a Long Method Score calculation unit 102, a maintainability index calculation unit 103, a Long Method condition setting unit 104A, a Long Method list generation unit 105A, and an information storage unit 200.

These functions can be realized, for example, by the processor 11 reading and executing programs stored in the main storage device 12 and the auxiliary storage device 13. Alternatively, these functions can be realized by hardware included in the source code analysis device 10A, for example an ASIC (Application Specific Integrated Circuit).

In the source code analysis, the source code 109, the program element change history 110 of the source code 109, the Long Method Score threshold 112, and the maintainability index threshold 113 are input to the source code analysis apparatus 10A. The code metric acquisition unit 101 receives a source code 109 to be analyzed, generates a code metric 117, and outputs the code metric to the Long Method Score calculation unit 102 and the maintainability index calculation unit 103. The submission-information acquiring unit 100 receives a program element change history 110 of a source code 109 to be analyzed, generates submission information 108, and outputs the submission information to the Long Method Score calculating unit 102.

The Long Method Score calculation unit 102 receives the code metrics 117 and the submission information 108, generates a Long Method Score 118, and outputs the generated Long Method Score to the Long Method condition setting unit 104A and the Long Method list generation unit 105A. The maintainability index calculation unit 103 receives the code metrics 117, generates a maintainability index 119, and outputs the maintainability index to the Long Method condition setting unit 104A and the Long Method list generation unit 105A.

The Long Method condition setting unit 104A receives the Long Method Score 118 and the maintainability index 119, and can set the Long Method Score threshold 112 and the maintainability index threshold 113. At this time, the Long Method condition setting unit 104A can display the relationship between the number of program elements and the maintainability index 119 and the relationship between the number of program elements and the Long Method Score 118. The threshold setter can set the Long Method Score threshold 112 and the maintainability index threshold 113 by referring to these relationships.

The Long Method Score threshold 112 and the maintainability index threshold 113 are input to the Long Method list generation unit 105A. The Long Method list generation unit 105A receives the Long Method Score118, the maintainability index 119, the Long Method Score threshold 112, and the maintainability index threshold 113, and generates a Long Method list 120A. The Long Method list 120A is obtained by recording program elements that match the Long Method as program elements to be divided.

Here, the Long Method list generation unit 105A may determine that a program element whose Long Method Score is equal to or greater than the Long Method Score threshold 112 and whose maintainability index is equal to or less than the maintainability index threshold 113 is a Long Method. Alternatively, the Long Method list generation unit 105A may determine that a program element whose Long Method Score is equal to or greater than the Long Method Score threshold 112 is a Long Method, regardless of its maintainability index.

The information storage unit 200 stores the code metrics 117, the Long Method Score 118, and the maintainability index 119. The auxiliary storage device 13 of fig. 1 can be used as the information storage unit 200. In addition to these pieces of information, the information storage unit 200 stores, as appropriate, information that the code metric acquisition unit 101, the Long Method Score calculation unit 102, the maintainability index calculation unit 103, the Long Method condition setting unit 104A, and the Long Method list generation unit 105A refer to or generate. The information storage unit 200 can manage the information stored in the auxiliary storage device 13 by using, for example, a file system or a DBMS (DataBase Management System).

Here, the source code analysis device 10A can present the program elements to be divided included in the source code 109 by generating the Long Method Score 118 based on the code metrics 117 and the submission information 108, and determining whether or not each program element is a Long Method based on the Long Method Score 118. Therefore, the reconstruction worker can find out where division is needed without having to make that judgment unaided, and can improve the readability of software whose functions have been repeatedly extended and changed.

In fig. 3, the code metrics 117 record, as code metric records 303, the various metric values that the code metric acquisition unit 101 calculates from the source code 109 of the program to be analyzed. Each code metric record 303 includes: a program element name 301 identifying the program element for which the record was calculated; and code metrics 302 holding the various metric values of that program element. For example, for program element 1, the values 75 for metric M1 and 0.3 for metric M2 are recorded.

In fig. 4, the program element change history 110 records a change history 403 each time the program to be analyzed is changed. Each change history 403 includes: a commit number 401 uniquely assigned to identify the change history 403; and one or more changed program element names 402 identifying the program elements that were changed together in the program change corresponding to that change history 403. For example, when the change corresponding to the change history 403 whose commit number 401 is T1 occurred, program element 1, program element 2, and program element 3 listed under the changed program element names 402 were changed at the same time.
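
As an illustration of how a change history of this shape could be turned into per-element submission information, the following Python sketch counts how many commits touched each program element. The record for T1 follows the example above; the record for T2, the dictionary layout, and the function name count_commits are assumptions made for illustration, since the format of the submission information 108 is not specified here.

```python
from collections import Counter

# Change history keyed by commit number. T1 follows the example in the text;
# T2 is a hypothetical extra record added only to make the counts differ.
change_history = {
    "T1": ["program element 1", "program element 2", "program element 3"],
    "T2": ["program element 1"],
}


def count_commits(history):
    """Return the number of commits in which each program element was changed."""
    counts = Counter()
    for changed_elements in history.values():
        # A set ensures an element is counted once per commit.
        counts.update(set(changed_elements))
    return counts


commit_counts = count_commits(change_history)
for name in sorted(commit_counts):
    print(name, commit_counts[name])
```

A plain commit count is only one possible form of submission information; a weighted count could be derived from the same records.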

In fig. 5, the Long Method Score 118 records, as Long Method Score records 503, the values that the Long Method Score calculation unit 102 calculates from the code metrics 117 and the submission information 108. Each Long Method Score record 503 includes: a program element name 501 identifying the program element for which the record was calculated; and a Long Method Score 502 of that program element. For example, for program element 1, a Long Method Score of 57 is recorded. The larger the Long Method Score 502, the more closely the program element matches the Long Method reverse side pattern.

In fig. 6, the maintainability index 119 records, as maintainability index records 603, the values that the maintainability index calculation unit 103 calculates from the code metrics 117. Each maintainability index record 603 includes: a program element name 601 identifying the program element for which the record was calculated; and a maintainability index 602 of that program element. A commonly used calculation formula can be used to calculate the maintainability index 602. The larger the maintainability index 602, the higher the maintainability of the program element.

In fig. 2, the Long Method Score threshold 112 holds the threshold of the Long Method Score 118 used when the Long Method list generation unit 105A determines whether or not each program element is a Long Method. For example, 60 is set as the Long Method Score threshold 112. The maintainability index threshold 113 holds the threshold of the maintainability index 119 used when the Long Method list generation unit 105A determines whether or not each program element is a Long Method. For example, 30 is set as the maintainability index threshold 113.

In fig. 7, the Long Method list 120A is obtained by recording program elements determined to be compatible with the Long Method by the Long Method list generation unit 105A as Long Method records 702. Long Method record 702 includes program element names 701 that correspond to the Long Method. For example, the program elements 1, 2 are recorded in the Long Method list 120A as program elements that conform to the Long Method.

In fig. 8, the code metric acquisition unit 101 reads the source code 109 of the analysis target program (801). Next, the code metric acquisition unit 101 divides the source code 109 into program elements (802). Next, the code metric acquisition unit 101 determines whether or not the code metrics 117 have been acquired for all the program elements (803). If they have not, the code metric acquisition unit 101 acquires the code metrics 117 of a program element for which they have not yet been acquired (804), and returns to 803. The types of code metrics 117 acquired are those used by the Long Method Score calculation unit 102 and the maintainability index calculation unit 103. When the code metrics 117 have been acquired for all the program elements, the processing ends.
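
The flow of fig. 8 can be sketched as follows. This is a minimal illustration that assumes the analysis target is Python source and uses the standard ast module; the language of the analysis target is not specified here, and the sample source, the function name acquire_code_metrics, and the two metrics chosen (line count and a rough branch-based complexity) are assumptions for illustration only.

```python
import ast

# Hypothetical analysis target with two program elements (functions).
SAMPLE_SOURCE = '''
def short(x):
    return x + 1

def branching(x):
    if x > 0:
        for i in range(x):
            x += i
    return x
'''

# Node types counted as branch points in this simplified complexity estimate.
BRANCH_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler)


def acquire_code_metrics(source):
    """Return {program element name: metrics} for every function in source."""
    tree = ast.parse(source)  # 801: read the source code
    elements = [node for node in ast.walk(tree)
                if isinstance(node, ast.FunctionDef)]  # 802: split into elements
    metrics = {}
    for element in elements:  # 803/804: repeat until every element is measured
        branches = sum(isinstance(node, BRANCH_NODES)
                       for node in ast.walk(element))
        metrics[element.name] = {
            # Counts all lines of the function, comments included (simplification).
            "lines": element.end_lineno - element.lineno + 1,
            "complexity": branches + 1,
        }
    return metrics


code_metrics = acquire_code_metrics(SAMPLE_SOURCE)
for name, values in code_metrics.items():
    print(name, values)
```

A production tool would measure non-comment lines and a full cyclomatic complexity; the simplified counts here only show where each step of the flowchart fits.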

In fig. 9, the Long Method Score calculation unit 102 determines whether or not the Long Method Score 118 has been calculated for all program elements (901). If it has not, the Long Method Score calculation unit 102 reads the submission information 108 and the code metrics 117 of the program element (902, 903). However, if the submission information 108 is not required for the Long Method Score calculation (904), reading the submission information 108 may be omitted. Likewise, if the code metrics 117 are not required for the Long Method Score calculation (904), reading the code metrics 117 may be omitted.

In that case, the Long Method Score 118 is calculated from either or both of the submission information 108 and the code metrics 117. The method of calculating the Long Method Score is not limited. The order of reading the submission information 108 (902) and reading the code metrics 117 (903) is not limited.

Next, the Long Method Score calculation unit 102 calculates the Long Method Score 118 based on the submission information 108 and the code metrics 117 of the program element (904), and returns to 901. When the Long Method Score 118 has been calculated for all program elements, the process ends.

The code metrics 117 used in the calculation of the Long Method Score 118 are, for example, the number of non-comment lines, the cyclomatic complexity, the number of comment lines, and the number of local variable declarations. The submission information 108 used in the calculation of the Long Method Score 118 is, for example, the number of changed locations per submission (commit) and the weighted number of submissions.
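
Because the calculation method of the Long Method Score is left open, the following Python sketch shows only one hypothetical possibility: a weighted sum of a few of the inputs named above. The weights, the dictionary keys, and the function name long_method_score are all assumptions for illustration and are not taken from the description.

```python
# Hypothetical weights; the description does not prescribe any particular values.
DEFAULT_WEIGHTS = {
    "lines": 0.5,        # non-comment lines
    "complexity": 2.0,   # cyclomatic complexity
    "local_vars": 1.0,   # local variable declarations
    "commits": 1.5,      # number of commits touching the element
}


def long_method_score(metrics, commit_count, weights=DEFAULT_WEIGHTS):
    """Combine code metrics and submission information into one score."""
    return (weights["lines"] * metrics["lines"]
            + weights["complexity"] * metrics["complexity"]
            + weights["local_vars"] * metrics["local_vars"]
            + weights["commits"] * commit_count)


example_metrics = {"lines": 100, "complexity": 8, "local_vars": 4}
score = long_method_score(example_metrics, commit_count=2)
print(score)  # 50 + 16 + 4 + 3 = 73.0
```

Any monotone combination would serve the same purpose: a larger value should indicate a longer, more complex, more frequently changed method.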

In fig. 10, the maintainability index calculation unit 103 determines whether or not the maintainability index 119 has been calculated for all the program elements (1001). If it has not, the maintainability index calculation unit 103 reads the code metrics 117 of the program element (1002). Next, the maintainability index calculation unit 103 calculates the maintainability index 119 based on the code metrics 117 of the program element (1003), and returns to 1001. When the maintainability index 119 has been calculated for all the program elements, the process ends.

The maintainability index is a code metric representing how easy the software is to maintain. One example is the ease-of-maintenance index (commonly known as the Maintainability Index), which can be calculated, for example, by the following equation.

$$\text{Ease-of-maintenance index} = \Bigl(171 - 5.2 \ln\bigl((N_1 + N_2)\,\log_2(n_1 + n_2)\bigr) - 0.23\,G - 16.2 \ln(\mathrm{LOC})\Bigr) \times \frac{100}{171}$$

Here, N1 is the total number of operators, N2 is the total number of operands, n1 is the number of distinct kinds of operators, n2 is the number of distinct kinds of operands, G is the cyclomatic number, and LOC is the number of non-comment lines. The maintainability index is not limited to the ease-of-maintenance index, and may be any index relating to the maintainability of the program.
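
As a worked illustration, the Python sketch below evaluates the ease-of-maintenance index in its widely used form, in which the argument of the first logarithm is the Halstead volume (total operators plus total operands, multiplied by the base-2 logarithm of the number of distinct operators plus distinct operands). That reading of the equation, the function name, and the sample input values are assumptions for illustration.

```python
import math


def ease_of_maintenance_index(total_operators, total_operands,
                              distinct_operators, distinct_operands,
                              cyclomatic_number, loc):
    """Evaluate the index, scaled so that its upper bound is 100."""
    # Halstead volume: program length times log2 of the vocabulary size.
    volume = ((total_operators + total_operands)
              * math.log2(distinct_operators + distinct_operands))
    raw = (171
           - 5.2 * math.log(volume)
           - 0.23 * cyclomatic_number
           - 16.2 * math.log(loc))
    return raw * 100 / 171


# Hypothetical small and large methods.
small = ease_of_maintenance_index(10, 10, 4, 4, cyclomatic_number=1, loc=5)
large = ease_of_maintenance_index(400, 400, 20, 30, cyclomatic_number=25, loc=200)
print(round(small, 2), round(large, 2))
```

The larger, more complex method yields the smaller value, which agrees with the statement that a larger maintainability index means higher maintainability.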

In fig. 11, the Long Method condition setting unit 104A determines whether or not the Long Method Score 118 and the maintainability index 119 have been read for all program elements (1101). If they have not, the Long Method condition setting unit 104A reads the Long Method Score 118 and the maintainability index 119 of the program element (1102, 1103).

When the Long Method condition setting unit 104A has read the Long Method Score118 and the maintainability index 119 for all the program elements, initial values are set for the Long Method Score threshold 112 and the maintainability index threshold 113 (1104). Here, an arbitrary value may be selected as the initial value.

Next, the Long Method condition setting unit 104A displays a histogram with the maintainability index on the horizontal axis and the number of program elements on the vertical axis (1105). The histogram display assists the threshold setter in setting the Long Method Score threshold 112 and the maintainability index threshold 113.

Next, the Long Method condition setting unit 104A displays a histogram with the Long Method Score on the horizontal axis and the number of program elements on the vertical axis (1106). The histogram display assists the threshold setter in setting the Long Method Score threshold 112 and the maintainability index threshold 113.

Next, the Long Method condition setting unit 104A receives the threshold setter's decision on whether or not to continue setting the thresholds (1107). If the threshold setter decides to continue, the Long Method condition setting unit 104A receives a numerical value entered by the threshold setter for the Long Method Score threshold 112, and sets it as the Long Method Score threshold 112 (1108). Next, the Long Method condition setting unit 104A receives a numerical value entered by the threshold setter for the maintainability index threshold 113, and sets it as the maintainability index threshold 113 (1109).

Next, when the Long Method condition setting unit 104A sets the numerical values input by the threshold value setter as the Long Method Score threshold value 112 and the maintainability index threshold value 113, the process returns to 1105. The Long Method condition setting unit 104A repeats the processing of 1105 to 1109 until it is determined that the threshold value setter does not continue the threshold value setting, and ends the processing when it is determined that the threshold value setter does not continue the threshold value setting.

The order of reading the Long Method Score 118 (1102) and reading the maintainability index 119 (1103) is not limited. The order of displaying the histogram with the maintainability index 119 on the horizontal axis and the number of program elements on the vertical axis (1105) and displaying the histogram with the Long Method Score 118 on the horizontal axis and the number of program elements on the vertical axis (1106) is not limited. The order of the input of the Long Method Score threshold 112 by the threshold setter (1108) and the input of the maintainability index threshold 113 by the threshold setter (1109) is not limited.

In fig. 12, Long Method list generation unit 105A prepares a blank Long Method list 120A (1201). Next, the Long Method list generation unit 105A reads the Long Method Score threshold 112 and the maintainability index threshold 113 of the program element (1202, 1203).

Next, the Long Method list generation unit 105A determines whether or not the Long Method Score 118 and the maintainability index 119 have been compared with the Long Method Score threshold 112 and the maintainability index threshold 113, respectively, for all program elements (1204). If this comparison has not been made for all program elements, the Long Method list generation unit 105A reads the Long Method Score 118 and the maintainability index 119 of the program element (1205, 1206).

Next, the Long Method list generation unit 105A determines whether the Long Method Score 118 is equal to or greater than the Long Method Score threshold 112 and the maintainability index 119 is equal to or less than the maintainability index threshold 113 (1207). When the condition of 1207 is satisfied, the Long Method list generation unit 105A adds the program element and other information to the Long Method list 120A (1208), and the process returns to 1204. If the condition of 1207 is not satisfied, 1208 is skipped and the process returns to 1204. When the Long Method Score 118 and the maintainability index 119 have been compared with the Long Method Score threshold 112 and the maintainability index threshold 113, respectively, for all program elements, the processing ends.
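The flow of 1201 to 1208 can be illustrated by the following minimal sketch. The class name ProgramElement, the function name generate_long_method_list, and the sample values are assumptions made only for illustration and do not appear in the embodiment.

```python
from dataclasses import dataclass


@dataclass
class ProgramElement:
    # Hypothetical container for one program element.
    name: str
    long_method_score: float      # corresponds to the Long Method Score 118
    maintainability_index: float  # corresponds to the maintainability index 119


def generate_long_method_list(elements, score_threshold, index_threshold):
    """Collect the names of elements that satisfy the condition of 1207."""
    long_method_list = []                                  # 1201: blank list
    for element in elements:                               # 1204: until all compared
        if (element.long_method_score >= score_threshold
                and element.maintainability_index <= index_threshold):  # 1207
            long_method_list.append(element.name)          # 1208
    return long_method_list


elements = [
    ProgramElement("program element 1", 75.0, 20.0),
    ProgramElement("program element 2", 80.0, 45.0),
    ProgramElement("program element 3", 40.0, 10.0),
]
result = generate_long_method_list(elements, score_threshold=60, index_threshold=30)
```

With the thresholds 60 and 30 used in the examples of fig. 13 and fig. 14, only the first sample element satisfies both conditions.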

The order of the preparation (1201) of the blank Long Method list 120A, the reading (1202) of the Long Method Score threshold 112, and the reading (1203) of the maintainability index threshold 113 is not limited. The order of the reading (1205) of the Long Method Score 118 and the reading (1206) of the maintainability index 119 is not limited.

In the example of fig. 13, 60 is set as the Long Method Score threshold 112 and 30 is set as the maintainability index threshold 113. The histogram includes a legend 1301, a maintainability index threshold axis 1302, the number of program elements 1303 as the vertical axis, and the maintainability index 1304 as the horizontal axis. The histogram body 1305 includes a histogram 1306 of all program elements and a histogram 1307 of the program elements whose Long Method Score is equal to or greater than the threshold. Among the program elements whose Long Method Score is equal to or greater than the threshold, those located on the left side of the maintainability index threshold axis 1302 are determined to be Long Methods by the Long Method list generation unit 105A. The threshold setter can change the Long Method Score threshold 112 and the maintainability index threshold 113 with reference to the histogram.

In the example of fig. 14, 60 is set as the Long Method Score threshold 112 and 30 is set as the maintainability index threshold 113. The histogram includes a legend 1401, a Long Method Score threshold axis 1402, the number of program elements 1403 as the vertical axis, and the Long Method Score 1404 as the horizontal axis. The histogram body 1405 includes a histogram 1406 of all program elements and a histogram 1407 of the program elements whose maintainability index is equal to or less than the threshold. Among the program elements whose maintainability index is equal to or less than the threshold, those located on the right side of the Long Method Score threshold axis 1402 are determined to be Long Methods by the Long Method list generation unit 105A. The threshold setter can change the Long Method Score threshold 112 and the maintainability index threshold 113 with reference to the histogram.
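As a rough illustration of the data behind a histogram such as that of fig. 13, the following sketch counts program elements per maintainability index bin, once for all elements and once for the elements whose Long Method Score is at or above the threshold. The bin width of 10, the helper name histogram, and the sample values are assumptions; the embodiment does not specify how the histogram is binned or drawn.

```python
from collections import Counter


def histogram(values, bin_width):
    """Count values per bin; each key is the lower edge of a bin."""
    return Counter(int(v // bin_width) * bin_width for v in values)


# Hypothetical (Long Method Score, maintainability index) pairs.
elements = [(75, 20), (80, 45), (40, 10), (65, 28), (90, 5)]
score_threshold, index_threshold = 60, 30

all_indexes = [index for _, index in elements]
high_score_indexes = [index for score, index in elements if score >= score_threshold]

hist_all = histogram(all_indexes, bin_width=10)          # cf. histogram 1306
hist_high = histogram(high_score_indexes, bin_width=10)  # cf. histogram 1307

# Elements of the second histogram lying at or left of the threshold axis
# are the ones that would be judged to be Long Methods.
long_method_count = sum(1 for index in high_score_indexes if index <= index_threshold)
```

Raising or lowering either threshold changes long_method_count, which is the effect the threshold setter observes when adjusting the thresholds against the displayed histogram.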

In this way, by using the Long Method Score 118 and the maintainability index 119, program elements that match the Long Method, which is one of the reverse patterns (anti-patterns), can be found and listed as program elements to be divided. The threshold setter can change the number of program elements extracted as Long Methods by changing the Long Method Score threshold 112 and the maintainability index threshold 113 with reference to the histogram of fig. 13 or fig. 14.

(second embodiment)

The second embodiment shows a case where the Long Method condition setting unit 104A and the Long Method list generation unit 105A of the first embodiment use error history information and a readability reduction word condition as inputs in addition to the inputs shown in the first embodiment.

In fig. 15, source code analysis device 10B includes Long Method condition setting unit 104B and Long Method list generation unit 105B instead of Long Method condition setting unit 104A and Long Method list generation unit 105A of source code analysis device 10A in fig. 2.

In the source code analysis, the source code analysis device 10B receives as input the source code 109, the program element change history 110 of the source code 109, the Long Method Score threshold 112, and the maintainability index threshold 113, and additionally the error history information 111 and the readability reduction word condition 114. The error history information 111 records the contents and causes of errors that occurred in the past in each program element. The readability reduction word condition 114 records the judgment condition that is applied to the error occurrence causes in the error history information 111 when the Long Method list generation unit 105B judges whether or not each program element is a Long Method.

The Long Method condition setting unit 104B accepts the Long Method Score 118, the maintainability index 119, and the error history information 111, and allows the Long Method Score threshold 112, the maintainability index threshold 113, and the readability reduction word condition 114 to be set. At this time, the Long Method condition setting unit 104B can display the relationship between the number of program elements and the maintainability index 119 and the relationship between the number of program elements and the Long Method Score 118, separately for the presence and absence of a readability reduction word. The threshold setter can set the Long Method Score threshold 112, the maintainability index threshold 113, and the readability reduction word condition 114 with reference to these relationships.

The Long Method Score threshold 112, the maintainability index threshold 113, the error history information 111, and the readability reduction word condition 114 are input to the Long Method list generation section 105B. The Long Method list generation unit 105B receives the Long Method Score118, the maintainability index 119, the Long Method Score threshold 112, the maintainability index threshold 113, the error history information 111, and the readability reduction word condition 114, and generates a Long Method list 120A.

Here, the Long Method list generation unit 105B may determine that a program element is a Long Method when its Long Method Score is equal to or greater than the Long Method Score threshold 112, its maintainability index is equal to or less than the maintainability index threshold 113, and its error occurrence cause satisfies the readability reduction word condition 114. Alternatively, the Long Method list generation unit 105B may determine that a program element is a Long Method when its Long Method Score is equal to or greater than the Long Method Score threshold 112 and its error occurrence cause satisfies the readability reduction word condition 114. Alternatively, the Long Method list generation unit 105B may determine that a program element is a Long Method when its maintainability index is equal to or less than the maintainability index threshold 113 and its error occurrence cause satisfies the readability reduction word condition 114.
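The three alternative determinations can be sketched as a single predicate with a selectable variant. The function name is_long_method and the variant labels "all", "score_only", and "index_only" are hypothetical names introduced only for this illustration; whether the error occurrence cause satisfies the readability reduction word condition 114 is passed in as an already evaluated boolean.

```python
def is_long_method(score, index, cause_matches,
                   score_threshold, index_threshold, variant="all"):
    """Evaluate one of the three alternative Long Method determinations."""
    score_ok = score >= score_threshold   # Long Method Score vs. threshold 112
    index_ok = index <= index_threshold   # maintainability index vs. threshold 113
    if variant == "all":         # score, index, and word condition
        return score_ok and index_ok and cause_matches
    if variant == "score_only":  # score and word condition
        return score_ok and cause_matches
    if variant == "index_only":  # index and word condition
        return index_ok and cause_matches
    raise ValueError(f"unknown variant: {variant}")


# A hypothetical element with a high score but a maintainability index above 30.
strict = is_long_method(70, 50, True, 60, 30)
by_score = is_long_method(70, 50, True, 60, 30, variant="score_only")
by_index = is_long_method(70, 50, True, 60, 30, variant="index_only")
```

The sample element is rejected by the first and third determinations but accepted by the second, which shows how the choice of determination changes the extracted set.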

Here, the source code analysis device 10B can present the program elements to be divided included in the source code 109 by generating the Long Method Score118 based on the code metric 117 and the submission information 108, and determining whether or not the program element is a Long Method based on the Long Method Score118 and the error occurrence cause. In this case, if the cause of the error satisfies the readability reduction word condition 114, misreading of the program is likely to occur, and the program element is likely to be a Long Method. Therefore, by referring to the error history information 111 in addition to the Long Method Score118 and the maintainability index 119 to determine whether or not the program element of the source code 109 is a Long Method, the determination accuracy of the Long Method can be improved.

Fig. 16 is a diagram showing an example of error history information input to the source code analysis device of fig. 15.

In fig. 16, error history information 111 exists for each program element. The error history information 111 includes a program element name 1601 indicating the program element to which the error history information 111 corresponds, and past error information 1602 for that program element. The past error information 1602 includes error records 1605 of the program element. Each error record 1605 includes the error content 1603 and the error occurrence cause 1604 of an error of the program element. For example, in the error history information 111 of the program element 1, error contents B1 and B2 are recorded as the error content 1603, and exception processing omission and conditional branch consideration omission are recorded as the error occurrence causes 1604. The error content 1603 may be omitted.
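One possible in-memory representation of this structure is sketched below. The class names ErrorRecord and ErrorHistory and the field names are assumptions for illustration; the embodiment does not prescribe a storage format.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ErrorRecord:
    # Hypothetical counterpart of one error record 1605.
    cause: str                     # error occurrence cause 1604
    content: Optional[str] = None  # error content 1603 (may be omitted)


@dataclass
class ErrorHistory:
    # Hypothetical counterpart of the error history information 111.
    program_element_name: str                                     # 1601
    past_errors: List[ErrorRecord] = field(default_factory=list)  # 1602


history = ErrorHistory(
    "program element 1",
    [
        ErrorRecord("exception processing omission", "B1"),
        ErrorRecord("conditional branch consideration omission", "B2"),
    ],
)
causes = [record.cause for record in history.past_errors]
```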

As the readability reduction word condition 114, for example, "(understanding | considering | confirming | reviewing | studying | correcting) (missing | insufficient | inattentive | forgetting)" is set. In this example, the readability reduction word condition 114 is expressed as a regular expression, but it is not limited to this form and may take any form capable of expressing a matching condition for natural language. The words or combinations of words used in the readability reduction word condition 114 are arbitrary as long as they indicate a reduction in readability.
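A regular-expression form of the condition can be tried out as follows. The pattern below is a hypothetical English rendering chosen so that it matches English cause strings such as those of fig. 16; it is not the literal condition of the embodiment, and the function name satisfies_word_condition is likewise an assumption.

```python
import re

# Hypothetical English rendering of the readability reduction word condition 114.
WORD_CONDITION = re.compile(
    r"(understanding|consideration|confirmation|review|study|correction) "
    r"(omission|insufficiency|carelessness|forgetting)"
)


def satisfies_word_condition(cause: str) -> bool:
    """Return True if the error occurrence cause matches the word condition."""
    return WORD_CONDITION.search(cause) is not None


causes = [
    "exception processing omission",
    "conditional branch consideration omission",
]
matches = [satisfies_word_condition(cause) for cause in causes]
```

Under this assumed pattern, only the second cause matches, because "consideration omission" combines a word from the first group with a word from the second group.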

In fig. 17, the Long Method condition setting unit 104B determines whether or not the Long Method Score 118, the maintainability index 119, and the error history information 111 have been read for all program elements (1701). If they have not been read for all program elements, the Long Method condition setting unit 104B reads the Long Method Score 118, the maintainability index 119, and the error history information 111 of the program element (1702, 1703, 1704).

When the Long Method condition setting unit 104B has read the Long Method Score 118, the maintainability index 119, and the error history information 111 for all the program elements, it sets initial values for the Long Method Score threshold 112, the maintainability index threshold 113, and the readability reduction word condition 114 (1705). Any value may be selected as the initial value.

Next, the Long Method condition setting unit 104B displays, separately for the presence and absence of a readability reduction word, a histogram with the maintainability index as the horizontal axis and the number of program elements as the vertical axis (1706). This histogram helps the threshold setter set the Long Method Score threshold 112, the maintainability index threshold 113, and the readability reduction word condition 114.

Next, the Long Method condition setting unit 104B displays, separately for the presence and absence of a readability reduction word, a histogram with the Long Method Score as the horizontal axis and the number of program elements as the vertical axis (1707). This histogram helps the threshold setter set the Long Method Score threshold 112, the maintainability index threshold 113, and the readability reduction word condition 114.

Next, the Long Method condition setting unit 104B accepts the threshold setter's decision on whether to continue the threshold setting (1708). If the threshold setter continues, the Long Method condition setting unit 104B receives the numerical value that the threshold setter inputs for the Long Method Score threshold 112 and sets it as the Long Method Score threshold 112 (1709). Next, the Long Method condition setting unit 104B receives the numerical value that the threshold setter inputs for the maintainability index threshold 113 and sets it as the maintainability index threshold 113 (1710). Next, the Long Method condition setting unit 104B receives the character string that the threshold setter inputs for the readability reduction word condition 114 and sets it as the readability reduction word condition 114 (1711).

After the Long Method condition setting unit 104B has set the inputs of the threshold setter as the Long Method Score threshold 112, the maintainability index threshold 113, and the readability reduction word condition 114, the process returns to 1706. The Long Method condition setting unit 104B repeats the processing of 1706 to 1711 until the threshold setter decides not to continue the threshold setting, and then ends the processing.
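The repetition of 1706 to 1711 can be modeled as a loop that redisplays the histograms and applies the next set of inputs until the threshold setter stops. In this sketch the interactive input is replaced by a prepared sequence of (score threshold, index threshold, word condition) tuples, and the histogram display is reduced to a counter; the function name run_condition_setting and the dictionary keys are assumptions for illustration.

```python
def run_condition_setting(setter_inputs, initial_conditions):
    """Apply successive setter inputs; an exhausted sequence means 'do not continue'."""
    conditions = dict(initial_conditions)  # 1705: initial values
    display_count = 0
    inputs = iter(setter_inputs)
    while True:
        display_count += 1                 # 1706, 1707: histograms shown (stubbed)
        next_input = next(inputs, None)    # 1708: continue the setting?
        if next_input is None:
            break                          # the setter does not continue
        score_threshold, index_threshold, word_condition = next_input
        conditions["score_threshold"] = score_threshold  # 1709
        conditions["index_threshold"] = index_threshold  # 1710
        conditions["word_condition"] = word_condition    # 1711
    return conditions, display_count


initial = {"score_threshold": 50, "index_threshold": 50, "word_condition": ".*"}
final, shown = run_condition_setting(
    [(60, 30, "consideration omission"), (65, 25, "review omission")],
    initial,
)
```

Each accepted input leads to one more display of the histograms, so two rounds of input result in three displays before the processing ends.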

The order of the reading (1702) of the Long Method Score 118, the reading (1703) of the maintainability index 119, and the reading (1704) of the error history information 111 is not limited. The order of the input (1709) of the Long Method Score threshold 112, the input (1710) of the maintainability index threshold 113, and the input (1711) of the readability reduction word condition 114 by the threshold setter is not limited.

In fig. 18, the Long Method list generation unit 105B prepares a blank Long Method list 120A (1801). Next, the Long Method list generation unit 105B reads the Long Method Score threshold 112, the maintainability index threshold 113, and the readability reduction word condition 114 (1802, 1803, 1804).

Next, the Long Method list generation unit 105B determines whether or not, for all the program elements, the Long Method Score 118, the maintainability index 119, and the error occurrence cause in the error history information 111 have been compared with the Long Method Score threshold 112, the maintainability index threshold 113, and the readability reduction word condition 114, respectively (1805). If this comparison has not been made for all the program elements, the Long Method list generation unit 105B reads the Long Method Score 118, the maintainability index 119, and the error history information 111 of the program element (1806, 1807, 1808).

Next, the Long Method list generation unit 105B determines whether the Long Method Score 118 is equal to or greater than the Long Method Score threshold 112, the maintainability index 119 is equal to or less than the maintainability index threshold 113, and the error occurrence cause satisfies the readability reduction word condition 114 (1809). When the condition of 1809 is satisfied, the Long Method list generation unit 105B adds the program element to the Long Method list 120A (1810), and the process returns to 1805. If the condition of 1809 is not satisfied, 1810 is skipped and the process returns to 1805. When this comparison has been made for all program elements, the processing ends.

The order of the preparation (1801) of the blank Long Method list, the reading (1802) of the Long Method Score threshold 112, the reading (1803) of the maintainability index threshold 113, and the reading (1804) of the readability reduction word condition 114 is not limited. The order of the reading (1806) of the Long Method Score 118, the reading (1807) of the maintainability index 119, and the reading (1808) of the error history information 111 is not limited.

In the example of fig. 19, 60 is set as the Long Method Score threshold 112 and 30 is set as the maintainability index threshold 113. The histogram includes a legend 1901, a maintainability index threshold axis 1302, the number of program elements 1303 as the vertical axis, and the maintainability index 1304 as the horizontal axis. The histogram body 1905 includes a histogram 1306 of all program elements, a histogram 1307 of the program elements whose Long Method Score is equal to or greater than the threshold, and a histogram 1902 of the program elements whose Long Method Score is equal to or greater than the threshold and which include a readability reduction word. Among the program elements whose Long Method Score is equal to or greater than the threshold and which include a readability reduction word, those located on the left side of the maintainability index threshold axis 1302 are determined to be Long Methods by the Long Method list generation unit 105B. The threshold setter can change the Long Method Score threshold 112, the maintainability index threshold 113, and the readability reduction word condition 114 with reference to the histogram.

In the example of fig. 20, 60 is set as the Long Method Score threshold 112 and 30 is set as the maintainability index threshold 113. The histogram includes a legend 2001, a Long Method Score threshold axis 1402, the number of program elements 1403 as the vertical axis, and the Long Method Score 1404 as the horizontal axis. The histogram body 2005 includes a histogram 1406 of all program elements, a histogram 1407 of the program elements whose maintainability index is equal to or less than the threshold, and a histogram 2002 of the program elements whose maintainability index is equal to or less than the threshold and which include a readability reduction word. Among the program elements whose maintainability index is equal to or less than the threshold and which include a readability reduction word, those located on the right side of the Long Method Score threshold axis 1402 are determined to be Long Methods by the Long Method list generation unit 105B. The threshold setter can change the Long Method Score threshold 112, the maintainability index threshold 113, and the readability reduction word condition 114 with reference to the histogram.

In this way, by determining whether or not a program element includes a readability reduction word in addition to using the Long Method Score 118 and the maintainability index 119, the accuracy of determining program elements that match the Long Method, which is one of the reverse patterns, can be improved.

(third embodiment)

In the third embodiment, in addition to the program element name 701 corresponding to the Long Method, the Long Method record 702 of the Long Method list 120A in fig. 7 records the Long Method Score, the maintainability index, and the like of the program element.

In fig. 21, the Long Method list 120B records, as Long Method records 2104, the program elements determined by the Long Method list generation units 105A and 105B to correspond to the Long Method. Each Long Method record 2104 includes the program element name 2101 of the program element corresponding to the Long Method, its Long Method Score 2102, its maintainability index 2103, and the like. Other information may also be included in the Long Method record 2104.

(fourth embodiment)

In the fourth embodiment, after the Long Method list 120A in fig. 7 or the Long Method list 120B in fig. 21 is generated, the program elements in the Long Method lists 120A and 120B are further classified into child reverse side patterns, and source code indicating the improvement position corresponding to each child reverse side pattern is generated.

In fig. 22, the source code analysis device 10C includes a child reverse side pattern classification unit 106 and an improvement position mapping unit 107 in addition to the configuration of the source code analysis device 10B in fig. 15. These functions can be realized as in the case of the first to third embodiments.

In the source code analysis, the source code analysis device 10C inputs child reverse side pattern definition information 115 and improvement category information 116 in addition to the source code 109, the program element change history 110 of the source code 109, the Long Method Score threshold 112 and the maintainability index threshold 113, the error history information 111, and the readability reduction word condition 114.

The child reverse side pattern classification unit 106 receives the source code 109 to be analyzed, the child reverse side pattern definition information 115, the program element change history 110, the code metrics 117, and the Long Method list 120B, and generates a child reverse side pattern list 121. The improvement position mapping unit 107 receives the source code 109 and the improvement category information 116, and generates the source code 122 indicating the improvement position. The improvement position in the source code can be expressed in units of lines or in units of tokens of the program.

The improvement positions of the Long Method can be classified into, for example, the following A) to D).

A) if the processing has breaks, divide the processing at those breaks;

B) if the branch targets have a common part, extract the common part into a function;

C) separate the processing that determines which state to transition to next from the processing performed in each state;

D) when exception processing has been added to the original processing, separate the original processing from the exception processing.

The information storage unit 200 stores the Long Method list 120B and the child reverse side pattern list 121 in addition to the code metrics 117, the Long Method Score 118, and the maintainability index 119. The information storage unit 200 also stores information that is referred to or generated as appropriate by the code metric acquisition unit 101, the Long Method Score calculation unit 102, the maintainability index calculation unit 103, the Long Method condition setting unit 104B, the Long Method list generation unit 105B, the child reverse side pattern classification unit 106, and the improvement position mapping unit 107. The information storage unit 200 can manage the information stored in the secondary storage device 13 by using, for example, a file system or a DBMS (DataBase Management System).

Here, the source code analysis device 10C classifies the program elements to be divided into child reverse side patterns based on the child reverse side pattern definition information 115, and presents the source code 122 indicating the improvement position based on the improvement category information 116. The restructuring operator can therefore know at which position a method should be divided without having to judge the division position personally, and the readability of software whose functions are repeatedly expanded or changed can be improved.

Fig. 22 shows a configuration in which the child reverse side pattern classification unit 106 and the improvement position mapping unit 107 are added to the configuration of the source code analysis device 10B in fig. 15, but the child reverse side pattern classification unit 106 and the improvement position mapping unit 107 may be added to the configuration of the source code analysis device 10A in fig. 2.

In fig. 23, the child reverse side pattern definition information 115 records, as child reverse side pattern definition records 2303, the definition information for classifying the Long Method, which is one of the reverse patterns, into child reverse side patterns. Each child reverse side pattern definition record 2303 includes the name 2301 of the child reverse side pattern and the definition 2302 of the child reverse side pattern.

In the child reverse side pattern definition 2302, one program element may be defined to correspond to a plurality of child reverse side patterns. The description method of the child reverse side pattern definition 2302 is not limited, and the definition may be given by a combination of the source code 109, the code metric 117, and the program element change history 110. One example of a child reverse side pattern is "striped writing", in which a plurality of processing blocks are described as one series of processing. Other examples are "large number of conditional statements", in which the program element consists of a large number of conditional statements, and "state transition", in which the processing body of each state and the processing for the state transition are described in a mixed manner in code that performs state transitions. A further example is "leave", in which exception processing is added to the original processing.

In fig. 24, the child reverse side pattern list 121 includes child reverse side pattern classification records 2402, each of which adds, to a Long Method record 2104 of the Long Method list 120B in fig. 21, the name 2301 of the child reverse side pattern into which the Long Method is classified.

For example, when the program element 1 matches the child reverse side pattern a1 defined by the child reverse side pattern definition 2302 of fig. 23, the name 2301 of the child reverse side pattern a1 is additionally recorded in the child reverse side pattern classification record 2402 of the program element 1. When the program element 2 matches the child reverse side pattern a2 defined by the child reverse side pattern definition 2302 of fig. 23, the name 2301 of the child reverse side pattern a2 is additionally recorded in the child reverse side pattern classification record 2402 of the program element 2.

When the program element corresponds to a plurality of child reverse side patterns, there may be a plurality of child reverse side pattern classification records 2405 having the same program element name but different child reverse side patterns.

Although fig. 24 shows an example in which the name 2301 of the child reverse side pattern into which the Long Method is classified is added to each Long Method record 2104 of the Long Method list 120B of fig. 21, the name 2301 may instead be added to each Long Method record 702 of the Long Method list 120A of fig. 7.

In fig. 25, the improvement category information 116 records, as improvement method records 2503, the program improvement method for each child reverse side pattern defined by the child reverse side pattern definition information 115. Each improvement method record 2503 includes the name 2501 of the child reverse side pattern and the improvement method 2502 for that child reverse side pattern. The description method of the improvement method 2502 is not limited.

In fig. 26, the child reverse side pattern classification unit 106 prepares a blank child reverse side pattern list 121 (2601). Next, the child reverse side pattern classification unit 106 reads the source code 109 of the analysis target program, the child reverse side pattern definition information 115, and the Long Method list 120B (2602, 2603, 2604).

Next, the child reverse side pattern classification unit 106 determines whether or not the code metrics 117 and the program element change history 110 have been read for all the program elements in the Long Method list 120B (2605). If they have not been read for all the program elements in the Long Method list 120B, the child reverse side pattern classification unit 106 reads the code metrics 117 and the program element change history 110 (2606, 2607). The types of code metrics to be acquired are those described in the child reverse side pattern definition information 115.

Next, the child reverse side pattern classification unit 106 determines whether or not the program element in the Long Method list 120B has been checked against every child reverse side pattern definition in the child reverse side pattern definition information 115 (2608). If a definition remains unchecked, the child reverse side pattern classification unit 106 determines whether or not the source code pattern, the code metric 117, and the program element change history 110 of the program element satisfy that child reverse side pattern definition (2609).

When the condition of 2609 is satisfied, the child reverse side pattern classification unit 106 adds the program element and the child reverse side pattern to the child reverse side pattern list 121 (2610), and returns to 2608. If the condition of 2609 is not satisfied, 2610 is skipped and the process returns to 2608. When the determination as to whether or not the source code pattern of the program element, the code metric 117, and the program element change history 110 in the Long Method list 120B satisfy the child reverse side pattern definition has been made for all the child reverse side pattern definition information 115, the process returns to 2605. When the processes 2606 to 2610 have been executed for all the program elements in the Long Method list 120B, the process ends.
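The nested loops of 2601 to 2610 can be sketched as follows. Each child reverse side pattern definition is represented here as a predicate over a dictionary of metrics; the function name classify_child_patterns, the metric keys conditional_count and has_state_variable, and the threshold of 10 are all assumptions made for illustration, since the embodiment leaves the description method of the definition open.

```python
def classify_child_patterns(long_method_names, definitions, metrics_by_element):
    """Return (program element name, pattern name) pairs, cf. list 121."""
    child_pattern_list = []                                # 2601: blank list
    for name in long_method_names:                         # 2605: each Long Method
        metrics = metrics_by_element[name]                 # 2606, 2607: read inputs
        for pattern_name, definition in definitions.items():  # 2608: each definition
            if definition(metrics):                        # 2609: definition satisfied?
                child_pattern_list.append((name, pattern_name))  # 2610
    return child_pattern_list


# Hypothetical definitions keyed by pattern name.
definitions = {
    "large number of conditional statements": lambda m: m["conditional_count"] >= 10,
    "state transition": lambda m: m["has_state_variable"],
}
metrics_by_element = {
    "program element 1": {"conditional_count": 14, "has_state_variable": True},
    "program element 2": {"conditional_count": 3, "has_state_variable": False},
}
records = classify_child_patterns(
    ["program element 1", "program element 2"], definitions, metrics_by_element
)
```

In this sample, the first element yields two records with the same program element name and different patterns, and the second element yields none.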

The order of preparation (2601) of the blank child reverse side pattern list 121, reading (2602) of the source code 109, reading (2603) of the child reverse side pattern definition information 115, and reading (2604) of the Long Method list 120B is not limited. The order of reading (2606) of the code metric 117 and reading (2607) of the program element change history 110 is not limited.

In fig. 27, the improvement position mapping unit 107 reads the source code 109 of the analysis target program, the improvement category information 116, and the child reverse side pattern list 121 (2701, 2702, 2703). Next, the improvement position mapping unit 107 determines whether or not the source code 122 indicating the improvement position has been generated for all the program elements in the child reverse side pattern list 121 (2704). If it has not been generated for all the program elements in the child reverse side pattern list 121, the improvement position mapping unit 107 maps the positions to be divided onto the source code of the program element in accordance with the improvement category information 116 to generate the source code 122 indicating the improvement position (2705), and the process returns to 2704. The manner of expressing the improvement position is not limited. When the source code 122 indicating the improvement position has been generated for all the program elements in the child reverse side pattern list 121, the processing ends.
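One line-unit way of expressing the improvement position is to insert a marker line before each position to be divided. The sketch below assumes that the line numbers to be divided have already been determined by some other means, which the embodiment does not specify; the function name map_improvement_positions and the marker text are hypothetical.

```python
def map_improvement_positions(source_lines, split_line_numbers, improvement_method):
    """Insert a marker line before every 1-based line number to be divided."""
    annotated = []
    for number, line in enumerate(source_lines, start=1):
        if number in split_line_numbers:
            annotated.append(f"# >>> improvement position: {improvement_method}")
        annotated.append(line)
    return annotated


source_lines = [
    "value = read_input()",
    "if value is None:",
    "    value = 0",
    "total = total + value",
]
annotated = map_improvement_positions(
    source_lines, {2}, "separate the exception processing"
)
```

A token-unit expression would instead need character offsets within a line, which this sketch does not attempt.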

The order of the reading (2701) of the source code 109, the reading (2702) of the improvement category information 116, and the reading (2703) of the child reverse side pattern list 121 is not limited. In this way, by displaying the improvement position of the Long Method after the Long Method list is generated, the person correcting the program can be assisted in correcting it.

(fifth embodiment)

In the present embodiment, the Long Method Score calculating unit 102 uses a readability score and a change occurrence score to calculate the Long Method Score. In the first embodiment, where the processing of the Long Method Score calculating unit 102 is shown in the flowchart of fig. 9, the method of calculating the Long Method Score 118 (904) is not limited; in the present embodiment, the calculation uses the readability score and the change occurrence score. The readability score quantifies the readability of the program element, and its calculation method is not limited. The change occurrence score quantifies the degree to which changes have occurred in the program element, and its calculation method is not limited. How the Long Method Score 118 is calculated from the readability score and the change occurrence score is also not limited.

(sixth embodiment)

In fig. 28, the source code 122 showing the improvement position includes a source code 2801 of the program element, an improvement target position 2802 in the source code 2801 of the program element, and a source code 2803 showing an improvement method.

In the example of fig. 28, the source code 2801 of the program element is extracted as a Long Method. At this time, the Long Method list generation unit 105B registers the source code 2801 of the program element in the Long Method list 120B.

The source code 2801 is an instance of the child reverse side pattern in which exception processing R1 and R2 have been added to the original processing. At this time, the child reverse side pattern classification unit 106 registers the source code 2801 of the program element in the child reverse side pattern list 121 under the child reverse side pattern in which exception processing is added to the original processing.

The improvement position mapping unit 107 specifies the exception processing R1 and R2 as an improvement target position 2802 in the source code 2801 of the program element. Then, the improvement position mapping unit 107 presents the code obtained by functionalizing the exception processing R1 and R2 as the source code 2803 indicating the improvement method.

Therefore, the person performing the reconfiguration can split the exception processing R1 and R2 out of the source code 2801 of the program element without having to determine the division position of the method personally, and can improve the readability of software in which the exception processing R1 and R2 have been added to the original processing.
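
As a purely hypothetical illustration of what functionalizing exception processing can look like, the sketch below shows a method with two inline exception branches and the same method after each branch has been moved into its own function. The code is not taken from fig. 28; the function names and the roles chosen for R1 and R2 are assumptions made for the example.

```python
def process_before(values):
    """Original processing with exception processing R1 and R2 written inline."""
    if values is None:            # R1: missing input is treated as empty
        values = []
    total = 0
    for v in values:
        if v < 0:                 # R2: negative values are clamped to zero
            v = 0
        total += v
    return total


def default_if_missing(values):
    """R1 extracted into its own function."""
    return [] if values is None else values


def clamp_negative(v):
    """R2 extracted into its own function."""
    return 0 if v < 0 else v


def process_after(values):
    """Only the original processing remains visible in the method body."""
    total = 0
    for v in default_if_missing(values):
        total += clamp_negative(v)
    return total
```

Both versions return the same result for the same input; only the internal structure changes, which matches the meaning of reconfiguration used in this description.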

(seventh embodiment)

In the present embodiment, the Long Method Score calculating unit 102 uses the harmonic mean of the readability score and the change occurrence score as the Long Method Score. In the fifth embodiment, how the Long Method Score 118 is calculated from the readability score and the change occurrence score is not limited; in the present embodiment, it is their harmonic mean. The methods of calculating the readability score and the change occurrence score themselves remain unrestricted. The harmonic mean is defined as the reciprocal of the arithmetic mean of the reciprocals. That is, in the present embodiment, the arithmetic mean of the reciprocal of the readability score and the reciprocal of the change occurrence score, each obtained by some method, is calculated, and the reciprocal of that mean is used as the Long Method Score 118.

For example, the readability reduction score may be the harmonic mean of four normalized values: the number of lines excluding comments, the cyclomatic complexity, the number of comment lines, and the number of local variable declarations. The change occurrence score may be the harmonic mean of two normalized values: the number of change locations and the number of commits (submissions), weighted per commit. The Long Method Score 118 may then be the harmonic mean of the readability reduction score and the change occurrence score.
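
The harmonic mean described in this embodiment can be written out directly. The sketch below is illustrative only: the function names are hypothetical, and it assumes the input scores have already been normalized to positive values, since the reciprocal is undefined at zero.

```python
from typing import Sequence


def harmonic_mean(values: Sequence[float]) -> float:
    """Reciprocal of the arithmetic mean of the reciprocals."""
    if not values:
        raise ValueError("at least one value is required")
    if any(v <= 0 for v in values):
        raise ValueError("values are assumed to be positive")
    mean_of_reciprocals = sum(1.0 / v for v in values) / len(values)
    return 1.0 / mean_of_reciprocals


def long_method_score(readability_score: float,
                      change_occurrence_score: float) -> float:
    """Long Method Score 118 as the harmonic mean of the two scores."""
    return harmonic_mean([readability_score, change_occurrence_score])
```

A property of the harmonic mean is that it is pulled toward the smaller input, so in this sketch a program element receives a high Long Method Score only when both input scores are high.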

(eighth embodiment)

In fig. 29, the screens of the source code analysis device 10C include an analysis target selection screen 2901, a Long Method condition setting screen 2902, a Long Method list screen 2903, and an improvement position display screen 2904. Transitions are possible in both directions between the analysis target selection screen 2901 and the Long Method condition setting screen 2902, between the Long Method condition setting screen 2902 and the Long Method list screen 2903, and between the Long Method list screen 2903 and the improvement position display screen 2904. In addition, the improvement position display screen 2904 can transition to the analysis target selection screen 2901.
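
The screen transitions of fig. 29 can be summarized as a small transition table. The sketch below is an illustration only; the `Screen` enumeration and the `can_transition` helper are hypothetical names, and the enumeration values merely reuse the reference numerals of the screens.

```python
from enum import Enum


class Screen(Enum):
    ANALYSIS_TARGET_SELECTION = 2901
    LONG_METHOD_CONDITION_SETTING = 2902
    LONG_METHOD_LIST = 2903
    IMPROVEMENT_POSITION_DISPLAY = 2904


# Allowed destinations for each screen, following the description of fig. 29.
ALLOWED = {
    Screen.ANALYSIS_TARGET_SELECTION: {Screen.LONG_METHOD_CONDITION_SETTING},
    Screen.LONG_METHOD_CONDITION_SETTING: {Screen.ANALYSIS_TARGET_SELECTION,
                                           Screen.LONG_METHOD_LIST},
    Screen.LONG_METHOD_LIST: {Screen.LONG_METHOD_CONDITION_SETTING,
                              Screen.IMPROVEMENT_POSITION_DISPLAY},
    Screen.IMPROVEMENT_POSITION_DISPLAY: {Screen.LONG_METHOD_LIST,
                                          Screen.ANALYSIS_TARGET_SELECTION},
}


def can_transition(src: Screen, dst: Screen) -> bool:
    """Return True if the screen may change from src to dst."""
    return dst in ALLOWED[src]
```

Note that the shortcut from the improvement position display screen back to the analysis target selection screen is one-way: the table has no entry from 2901 to 2904.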

In fig. 30, the analysis target selection screen 2901 includes an analysis tab 3001 and a setting tab 3002. The analysis tab 3001 includes: an analysis target source code selection unit 3003 for selecting the analysis target source code 109; an error history information selection unit 3004 for selecting the error history information 111; a program element change history selection unit 3005 for selecting the program element change history 110; and an analysis start button 3006.

For each analysis, the user of the source code analysis device 10C selects the respective data using the analysis target source code selection unit 3003, the error history information selection unit 3004, and the program element change history selection unit 3005, and presses the analysis start button 3006 to start the analysis. When the user presses the analysis start button 3006, the code metric calculation unit 101, the Long Method Score calculation unit 102, the maintainability index calculation unit 103, and the Long Method condition setting unit 104B in the source code analysis device 10C operate, and the screen of the source code analysis device 10C transitions to the Long Method condition setting screen 2902.

The setting tab 3002 includes: an improvement category information selection unit 3007 for selecting the improvement category information 116; and a child reverse side pattern definition information selection unit 3008 for selecting the child reverse side pattern definition information 115. When using the source code analysis device 10C for the first time, the user selects the respective data using the improvement category information selection unit 3007 and the child reverse side pattern definition information selection unit 3008. In the second and subsequent uses, the user selects the data with the same selection units only when the improvement category information 116 or the child reverse side pattern definition information 115 needs to be changed.

In fig. 31, the Long Method condition setting screen 2902 includes a threshold setting tab 3101, a histogram tab 3102, a return-to-analysis-target-selection button 3113, and a setting completion button 3114.

The threshold setting tab 3101 includes: a Long Method Score threshold value selection unit 3105 for setting the Long Method Score threshold value 112; a maintainability index threshold value selection unit 3106 for setting the maintainability index threshold value 113; and a readability-reducing word condition selection unit 3107 for selecting the readability-reducing word condition 114.

When the user of the source code analysis device 10C changes the values and conditions of the Long Method Score threshold value selection unit 3105, the maintainability index threshold value selection unit 3106, and the readability-reducing word condition selection unit 3107, the contents of the maintainability index histogram tab 3103 and the Long Method Score histogram tab 3104 in the histogram tab 3102 change accordingly.

The histogram tab 3102 includes a maintainability index histogram tab 3103 and a Long Method Score histogram tab 3104. The contents shown in fig. 19 are drawn in the maintainability index histogram tab 3103, and the contents shown in fig. 20 are drawn in the Long Method Score histogram tab 3104.

The user of the source code analysis device 10C sets the values and conditions of the Long Method Score threshold value selection unit 3105, the maintainability index threshold value selection unit 3106, and the readability-reducing word condition selection unit 3107 with reference to the contents of the maintainability index histogram tab 3103 and the Long Method Score histogram tab 3104. When the user presses the setting completion button 3114 after completing the settings, the Long Method list generation unit 105B and the child reverse side pattern classification unit 106 in the source code analysis device 10C operate, and the screen transitions to the Long Method list screen 2903. On the other hand, when the user presses the return-to-analysis-target-selection button 3113, the screen transitions to the analysis target selection screen 2901.

In fig. 32, the Long Method list screen 2903 includes a child reverse side pattern list display unit 3201 and a return-to-Long Method condition setting button 3202. The child reverse side pattern list display unit 3201 displays the child reverse side pattern list 121. When the user of the source code analysis device 10C selects a Long Method program element in any child reverse side pattern classification record 2401 of the child reverse side pattern list 121, the improvement position mapping unit 107 operates, and the screen transitions to the improvement position display screen 2904. On the other hand, when the user presses the return-to-Long Method condition setting button 3202, the screen transitions to the Long Method condition setting screen 2902.

In fig. 33, the improvement position display screen 2904 includes a source code display unit 3301 showing the improvement position, a return-to-analysis-target-selection button 3302, and a return-to-Long Method list button 3303. The source code display unit 3301 displays the source code 122 showing the improvement position. By viewing the improvement position display screen 2904, the user of the source code analysis device 10C can learn both the improvement position and the improvement method for the source code.

When the user of the source code analysis device 10C presses the return-to-analysis-target-selection button 3302, the screen transitions to the analysis target selection screen 2901. On the other hand, when the user presses the return-to-Long Method list button 3303, the screen transitions to the Long Method list screen 2903.

Claims

1. A source code analysis method, characterized by:

a code metric acquisition unit analyzes a source code having a plurality of program elements and generates a code metric regarding the length and complexity of the program elements;

a submission information acquisition unit acquires submission information on a change history of the program element;

a score calculation unit calculates a score of the program element based on the code metric and the submission information;

a maintainability index calculation unit calculates a maintainability index of the program element based on the code metric; and

a list generation unit generates a list of the program elements that are candidates to be divided into a plurality of parts, based on the score and the maintainability index.

2. The source code analysis method of claim 1, wherein:

the score is a Long Method Score.

3. The source code analysis method of claim 1, wherein:

the list generation unit generates the list using a score threshold and/or a maintainability index threshold.

4. The source code analysis method of claim 3, wherein:

a condition setting unit outputs a graph having an axis of the score or the maintainability index and an axis of the number of program elements corresponding to the score or the maintainability index, and receives the setting of the score threshold and/or the maintainability index threshold from a user.

5. The source code analysis method of claim 4, wherein:

the graph is a graph in which both the score and the maintainability index can be distinguished.

6. The source code analysis method of claim 1, wherein:

the score calculation unit further calculates a score of each program element using error history information regarding occurrence of an error in the program element.

7. The source code analysis method of claim 6, wherein:

the error history information includes a cause of the error,

the score calculation unit further calculates a score of the program element using a readability-reducing term condition that specifies a specific term that may be included in the error cause.

8. The source code analysis method of claim 1, wherein:

child reverse side pattern definition information defining a child reverse side pattern is provided, and

a child reverse side pattern classification unit generates a child reverse side pattern list that classifies the program elements described in the score into child reverse side patterns, based on the score, the source code, and the code metric.

9. The source code analysis method of claim 8, wherein:

an improvement position mapping unit generates a source code indicating an improvement position based on the child reverse side pattern list and the improvement category.

10. A source code analysis apparatus, comprising:

a score calculation section that calculates a score of the program element based on the code metric and the submission information;

a maintainability index calculation unit that calculates a maintainability index of the program element based on the code metric; and

a list generation unit that generates a list of the program elements that are candidates to be divided into a plurality of parts, based on the score and the maintainability index.