CN114579130A - Automatic inference method for node.JS code segment environment dependency based on program analysis - Google Patents

Publication number
CN114579130A
Authority
CN
China
Prior art keywords
dependency
package
node
code
analysis
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011374137.5A
Other languages
Chinese (zh)
Inventor
张卫丰
黄泽龙
***
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University of Posts and Telecommunications
Original Assignee
Nanjing University of Posts and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2020-11-30
Filing date
2020-11-30
Publication date
2022-06-03
Application filed by Nanjing University of Posts and Telecommunications
Priority to CN202011374137.5A
Publication of CN114579130A
Current legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/40 Transformation of program code
    • G06F8/41 Compilation
    • G06F8/43 Checking; Contextual analysis
    • G06F8/433 Dependency analysis; Data or control flow analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/40 Transformation of program code
    • G06F8/41 Compilation
    • G06F8/42 Syntactic analysis

Landscapes

  • Engineering & Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Stored Programmes (AREA)

Abstract

The invention relates to a method for automatically inferring the environment dependencies of Node.js code fragments based on program analysis, comprising the following steps. First, a knowledge base of known npm packages is constructed according to SourceRank in the Libraries.io dataset. Second, information related to package dependencies is discovered using a combination of static analysis, dynamic analysis, and association rule mining; this information is modeled as an interdependency graph according to the relationships among its elements, and the graph is stored in a graph database. Then, for a given new Node.js code fragment, the target code is parsed and a list of all imported resources is extracted; the list is mapped back to a set of installable software packages; and the discovered dependencies are placed in the correct order by an inference algorithm for direct and transitive dependencies that follows the installation order, yielding the final result.

Description

Automatic inference method for node.JS code segment environment dependency based on program analysis
Technical Field
The invention belongs to the technical field of computers, in particular to the field of software technology. It provides a method for automatically inferring the environment dependencies of Node.js code fragments based on program analysis, which can effectively address the problem that code shared on the major code-sharing platforms often cannot run and is difficult to reproduce, thereby promoting technical exchange around code and improving development efficiency.
Background
Node.js is an event-driven I/O server-side JavaScript environment built on Google's V8 engine, which executes JavaScript very quickly and performs very well. Within a few years, Node.js has therefore grown into a mature development platform and attracted many developers. In addition, developers can use Node.js to develop mobile Web frameworks.
Node.js succeeded not only because it uses the same syntax as front-end JavaScript, which directly attracted a large number of front-end developers as its initial users, but also because its built-in package manager, npm, is extremely powerful. npm is the largest software registry in the world, with approximately 3 billion downloads per week and more than 600,000 packages. Open-source developers on every continent use npm to share packages and to draw on each other's work. The structure of a package lets developers track dependencies and versions easily. npm manages the dependencies of a Node.js project well and makes it exceptionally easy for developers to publish their own packages. The cost is therefore low whether you use other people's packages or publish packages for others.
Code open-sourcing is attracting growing attention among developers as open-source code communities flourish. Platforms such as Stack Overflow, GitHub Gist, and the Jupyter community have very many active registered users and have accumulated thousands of shared code fragments. This amounts to a very useful app store for users, who are free to download the code fragments or projects they need, which is a great convenience for developers.
The large-scale adoption of Node.js has increased the need for communication among developers. Platforms such as Stack Overflow, GitHub Gist, and the Jupyter community give users a way to exchange techniques and code. Answers to many questions on these platforms come with corresponding code fragments, which have usually been validated by the respondent and can be used to solve the problem.
While code sharing provides many benefits, shared code often cannot run and is difficult to reproduce. According to research, more than 50% of the code on the Gist platform cannot run smoothly in a native environment. Although the code itself may contain errors, the main cause is generally the inconsistency between different users' runtime environments, of which missing dependency libraries are the most important problem. Parnin's survey shows that developers typically spend no less than 20 minutes configuring the environment dependencies a piece of code needs in order to run. A method for automatically inferring the environment dependencies of Node.js code fragments is therefore of strong practical significance for promoting technical exchange and improving development efficiency.
Disclosure of Invention
The invention provides a method for automatically inferring the environment dependencies of Node.js code fragments based on program analysis. First, the invention is concerned with the relationship between function calls in a Node.js code fragment and the dependency packages that contain the corresponding function declarations. Second, the invention uses an offline knowledge base to correctly infer the dependencies of a target script. This knowledge base contains packages, their versions and resources, and the relationships among them. It is constructed by applying static and dynamic analysis to known packages in the Libraries.io dataset: static analysis enumerates the known resources of a package for later retrieval, and dynamic analysis gathers information about transitive dependencies. Then, association rule mining over the dependencies of public Python projects leverages developer-generated knowledge of system-level transitive dependencies. Finally, for a given unfamiliar code fragment, an inference algorithm for direct and transitive dependencies that follows the installation order is applied on the basis of the offline knowledge base. In view of the above problems, the work and contributions of the invention are as follows:
1. A technique for computing package dependencies from three knowledge sources: static analysis, dynamic analysis, and mining of the system-level transitive dependency knowledge generated by developer behavior. The invention analyzes the top ten thousand most heavily used npm packages according to SourceRank in the Libraries.io dataset. Packages are selected by SourceRank so that the most commonly used libraries are included, because common libraries affect most of the package ecosystem and the ecosystem as a whole is too large to analyze exhaustively. Known resources of these common packages are discovered and enumerated through static analysis. Some software packages may not list their dependencies correctly; we also address this case using the top ten thousand packages by SourceRank in the Libraries.io dataset.
2. An inference algorithm for direct and transitive dependencies that follows the installation order. Inferring the environment dependencies of a code fragment is subject to an additional constraint: the inferred dependencies must be returned in the correct order, otherwise errors can arise from the direct and transitive dependencies among the packages. The inference algorithm first extracts the imported resources from the target application, then queries the knowledge base generated by static analysis to determine the set of packages to which those resources may belong, and finally traverses the dependency graph among these packages to determine the transitive dependencies.
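The association rule mining named in contribution 1 is not specified further in the text. As a minimal sketch, assuming each mined project is represented as a set of package names and using the standard definitions of support, confidence, and lift, the metrics for a single rule could be computed as follows. The function name `rule_metrics` and the toy project data are hypothetical and serve only as illustration.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Compute standard association-rule metrics for antecedent -> consequent.

    `transactions` is a list of sets; each set holds the packages one project uses.
    """
    total = len(transactions)
    both = sum(1 for t in transactions if antecedent in t and consequent in t)
    with_antecedent = sum(1 for t in transactions if antecedent in t)
    with_consequent = sum(1 for t in transactions if consequent in t)

    support = both / total                      # share of projects using both
    confidence = both / with_antecedent if with_antecedent else 0.0
    # Lift compares the confidence with the baseline frequency of the consequent.
    lift = confidence / (with_consequent / total) if with_consequent else 0.0
    return {"support": support, "confidence": confidence, "lift": lift, "count": both}


# Toy data (made up for this example): dependency sets of four projects.
projects = [
    {"express", "body-parser"},
    {"express", "body-parser", "lodash"},
    {"express"},
    {"lodash"},
]
metrics = rule_metrics(projects, "express", "body-parser")
```

A lift above 1 suggests that the two packages co-occur more often than chance would predict, which is the kind of signal a mined transitive dependency would rest on.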
Drawings
FIG. 1 is the final modeled interdependency graph in the knowledge base of the invention.
FIG. 2 is a flow chart of the inference algorithm for direct and transitive dependencies following the installation order according to the invention.
FIG. 3 is a schematic diagram of the automatic inference of Node.js code fragment environment dependencies based on program analysis according to the invention.
Detailed Description
The invention specifically comprises the following steps:
1) The most commonly used packages are first selected according to their SourceRank in the Libraries.io dataset. The known resources of each package are enumerated by static analysis for later retrieval.
2) For software packages that do not list their dependencies correctly, we use dynamic analysis. An attempt is made to install the package using npm install; the resources that install successfully are recorded, the error output is parsed for the resources that fail to install, and based on that output the dynamic analysis process enters the dependency records into the knowledge base.
3) We model the knowledge base as an interdependency graph and store it in a graph database. Nodes represent objects in the knowledge base, and directed edges represent the relationships between them.
4) Given a Node.js code fragment, the target code is parsed and a list of all imported resources is extracted, mainly by constructing an abstract syntax tree (AST) of the source code.
5) Once the resources used by the code are known, they can be mapped back to a set of installable software packages. We perform this reverse lookup by querying our knowledge base and the package management system for potentially matching records.
6) After the required dependency packages are obtained in 5), the discovered dependencies are placed in the correct order according to the direct and transitive dependencies of the packages recorded in the interdependency graph of 3), giving the final result.
The static analysis in step 1) proceeds as follows: for each of the top ten thousand npm packages by SourceRank in the Libraries.io dataset, an installation is attempted using npm install, and if it succeeds the package's resources are recorded. For the small fraction of packages that cannot be installed, we try to download the package distribution manually and parse it.
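The text does not define what counts as a package resource. A minimal sketch, assuming a resource is any module path that a script could import from an already installed package directory (the package name itself plus each .js file beneath it), might look like this. The function name `enumerate_resources` and the path convention are assumptions, and the sketch does not run npm install itself.

```python
import os


def enumerate_resources(package_dir, package_name):
    """List importable module paths exposed by an installed package directory."""
    resources = {package_name}  # the bare package name is itself importable
    for root, dirs, files in os.walk(package_dir):
        # Skip nested node_modules: those files belong to other packages.
        dirs[:] = [d for d in dirs if d != "node_modules"]
        for file_name in files:
            if not file_name.endswith(".js"):
                continue
            relative = os.path.relpath(os.path.join(root, file_name), package_dir)
            module_path = relative[: -len(".js")].replace(os.sep, "/")
            resources.add(f"{package_name}/{module_path}")
    return sorted(resources)
```

The returned list is what would be written into the knowledge base as the resources owned by the installed version.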
In step 2), some software packages may not list their dependencies correctly, which prevents npm from resolving them automatically during installation. When an installation fails, we parse its error output for messages such as "no module name <name>" or "cannot find <name>", which indicate a dependency on a missing package, and enter the corresponding dependency records into the knowledge base according to these hints.
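The error-parsing step can be sketched as follows. This is an illustration under assumptions rather than the patented implementation: it matches the "Cannot find module '<name>'" and "Cannot find package '<name>'" messages that Node.js prints for unresolved imports, and the helper names `package_name` and `missing_dependencies` are hypothetical.

```python
import re

# Node.js reports unresolved imports as: Error: Cannot find module '<name>'
MISSING_MODULE = re.compile(r"Cannot find (?:module|package) ['\"]([^'\"]+)['\"]")


def package_name(resource):
    """Reduce an import path to its package: 'a/b' -> 'a', '@s/p/x' -> '@s/p'."""
    parts = resource.split("/")
    return "/".join(parts[:2]) if resource.startswith("@") else parts[0]


def missing_dependencies(error_output):
    """Return the packages that the error output reports as missing, in order."""
    found = []
    for name in MISSING_MODULE.findall(error_output):
        if name.startswith((".", "/")):
            continue  # relative or absolute paths are local files, not packages
        package = package_name(name)
        if package not in found:
            found.append(package)
    return found


log = (
    "Error: Cannot find module 'left-pad'\n"
    "Error: Cannot find module './local'\n"
    "Error: Cannot find module '@babel/core/lib/x'\n"
    "Error: Cannot find module 'left-pad'\n"
)
missing = missing_dependencies(log)
```

Each name in `missing` would become a dependency record for the package whose installation produced the log.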
In step 3), the interdependency graph mainly uses four kinds of node: package nodes, version nodes, resource nodes, and association nodes (see FIG. 1). Every known version of a package is represented as a version node, which is labeled as a version and stores the package's version number. A resource node is owned by a version node, as indicated by a directed edge from the version node. Association nodes represent the mined association rules; they are labeled as associations and maintain metadata recording confidence, support, lift, and count.
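The four node kinds can be illustrated with a small in-memory structure standing in for the graph database. Everything here is an assumption made for the example: the class names, the relation names (HAS_VERSION, OWNS, DEPENDS_ON, ASSOCIATED_BY), the package names, and the metadata values are placeholders, not taken from the invention.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Node:
    label: str  # "package", "version", "resource" or "association"
    key: str


@dataclass
class DependencyGraph:
    edges: dict = field(default_factory=dict)     # (source, relation) -> [targets]
    metadata: dict = field(default_factory=dict)  # node -> property dict

    def add_edge(self, source, relation, target):
        self.edges.setdefault((source, relation), []).append(target)

    def targets(self, source, relation):
        return list(self.edges.get((source, relation), []))


graph = DependencyGraph()
pkg = Node("package", "demo-pkg")
ver = Node("version", "demo-pkg@1.0.0")
res = Node("resource", "demo-pkg/lib/util")
dep = Node("package", "demo-dep")
rule = Node("association", "demo-pkg -> demo-native-lib")

graph.metadata[ver] = {"number": "1.0.0"}  # version nodes store the version number
# Placeholder metadata values for an association rule.
graph.metadata[rule] = {"confidence": 0.9, "support": 0.2, "lift": 3.0, "count": 18}

graph.add_edge(pkg, "HAS_VERSION", ver)   # package -> version
graph.add_edge(ver, "OWNS", res)          # version -> resource it owns
graph.add_edge(ver, "DEPENDS_ON", dep)    # version -> package it depends on
graph.add_edge(pkg, "ASSOCIATED_BY", rule)
```

In a real graph database the same shape would map to labeled nodes and typed directed relationships.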
In step 4), the target application is parsed and a list of all imported resources is extracted. We do this by building an abstract syntax tree (AST) of the source code.
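Building a JavaScript AST requires a JavaScript parser, which the Python standard library does not provide. The sketch below therefore swaps in a regular-expression scan as a rough stand-in for the AST step, only to show the shape of the output (an ordered, de-duplicated list of imported module specifiers). Unlike an AST, it can be fooled by require calls that appear inside comments or string literals. The function name `extract_imports` is hypothetical.

```python
import re

IMPORT_PATTERN = re.compile(
    r"""require\(\s*['"]([^'"]+)['"]\s*\)"""  # const x = require('x')
    r"""|\bfrom\s+['"]([^'"]+)['"]"""         # import x from 'x'
    r"""|\bimport\s+['"]([^'"]+)['"]"""       # import 'x'
)


def extract_imports(source):
    """Return imported module specifiers in order of first appearance."""
    seen = []
    for match in IMPORT_PATTERN.finditer(source):
        # Exactly one of the three alternatives captured a specifier.
        name = next(group for group in match.groups() if group)
        if name not in seen:
            seen.append(name)
    return seen


snippet = """
const express = require('express');
import _ from "lodash";
import 'dotenv/config';
const fs = require("fs");
const again = require('express');
"""
imports = extract_imports(snippet)
```

Built-in modules such as fs also appear in the list; deciding that they need no installation is left to the later lookup step.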
In step 5), once the resources used by the application are known, they are mapped back to a set of installable software packages. We perform this reverse lookup by querying our knowledge base and the package management system for potentially matching records. A match between a resource required by the application and an installable package may be established by a full or partial match against one or more known resources in the knowledge base. In addition, we check whether a package with the same name as the required resource exists; that is, after the reverse lookup is complete, the package name is normalized to match the name used by the package management system.
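One way to read the full-match, partial-match, and same-name checks is sketched below. The assumptions are stated plainly: the knowledge base is reduced to a dictionary from resource to package names, the package management system to a set of known names, a partial match means matching a shorter path prefix, and normalization means trimming and lower-casing. All names are hypothetical.

```python
def normalize(name):
    """Assumed normalization: trim whitespace and lower-case the name."""
    return name.strip().lower()


def lookup_packages(resource, resource_index, registry_names):
    """Map one imported resource to candidate installable packages."""
    # 1) Full match against a known resource.
    if resource in resource_index:
        return sorted(resource_index[resource])
    # 2) Partial match: try progressively shorter path prefixes.
    parts = resource.split("/")
    for end in range(len(parts) - 1, 0, -1):
        prefix = "/".join(parts[:end])
        if prefix in resource_index:
            return sorted(resource_index[prefix])
    # 3) Same-name check against the package management system.
    base = "/".join(parts[:2]) if resource.startswith("@") else parts[0]
    candidate = normalize(base)
    return [candidate] if candidate in registry_names else []


# Toy knowledge base and registry (made up for this example).
resource_index = {
    "lodash": {"lodash"},
    "lodash/fp": {"lodash"},
    "demo-lib/core": {"demo-lib"},
}
registry_names = {"left-pad", "@demo/tool"}
```

An empty result means the resource could not be attributed to any known package and would need manual attention.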
In step 6), knowing only the packages that correspond to the top-level resources is often not sufficient to configure the environment properly, because those packages may themselves depend on other packages. Assuming that the interdependency graph contains all necessary relationships, the set P of packages that must be installed is the union of the set S of resolved direct dependencies and the set R of packages reachable from S, that is, P = S ∪ R.
However, it is not sufficient to calculate P alone. We must also maintain the correct ordering of dependencies so that each package is installed before any other package that depends on it. We do this by performing a depth-first search rooted at each package p ∈ S.
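The ordering requirement can be met with a post-order depth-first search: a package is appended to the result only after everything it depends on has been appended. The sketch below assumes the dependency edges have already been read from the interdependency graph into a dictionary. The function name `install_order`, the cycle check, and the package names are additions made for illustration.

```python
def install_order(direct_dependencies, depends_on):
    """Return every package in P = S ∪ R, dependencies before dependents."""
    order = []
    visited = set()
    in_progress = set()

    def visit(package):
        if package in visited:
            return
        if package in in_progress:
            raise ValueError(f"dependency cycle detected at {package}")
        in_progress.add(package)
        for dependency in depends_on.get(package, []):
            visit(dependency)
        in_progress.discard(package)
        visited.add(package)
        order.append(package)  # post-order: appended after all its dependencies

    for package in direct_dependencies:  # one search rooted at each p in S
        visit(package)
    return order


# Hypothetical dependency edges: package -> packages it depends on.
depends_on = {
    "app-server": ["router", "logger"],
    "router": ["path-utils"],
    "logger": [],
    "path-utils": [],
}
order = install_order(["app-server", "logger"], depends_on)
```

Installing the packages in the returned order guarantees that no package is installed before one of its dependencies.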

Claims (7)

1. A method for automatically inferring the environment dependencies of Node.js code fragments based on program analysis, comprising the following steps: first, a knowledge base of known npm packages is constructed according to SourceRank in the Libraries.io dataset; second, information related to package dependencies is discovered using a combination of static analysis, dynamic analysis, and association rule mining, the information is modeled as an interdependency graph according to the relationships among its elements, and the graph is stored in a graph database; then, for a given new Node.js code fragment, the target code is parsed and a list of all imported resources is extracted, the list is mapped back to a set of installable software packages, and the discovered dependencies are placed in the correct order by an inference algorithm for direct and transitive dependencies that follows the installation order, yielding the final result.
2. The method for automatically inferring Node.js code fragment environment dependencies based on program analysis according to claim 1, characterized by the following steps:
1) The most commonly used packages are first selected according to their SourceRank in the Libraries.io dataset. The known resources of each package are enumerated by static analysis for later retrieval.
2) For software packages that do not list their dependencies correctly, we use dynamic analysis. An attempt is made to install the package using npm install; the resources that install successfully are recorded, the error output is parsed for the resources that fail to install, and based on that output the dynamic analysis process enters the dependency records into the knowledge base.
3) We model the knowledge base as an interdependency graph and store it in a graph database. Nodes represent objects in the knowledge base, and directed edges represent the relationships between them.
4) Given a Node.js code fragment, the target code is parsed and a list of all imported resources is extracted, mainly by constructing an abstract syntax tree (AST) of the source code. Once the resources used by the code are known, they can be mapped back to a set of installable software packages. We perform this reverse lookup by querying our knowledge base and the package management system for potentially matching records.
5) After the required dependency packages are obtained in 4), the discovered dependencies are placed in the correct order according to the direct and transitive dependencies of the packages recorded in the interdependency graph of 3), giving the final result.
3. The method for automatically inferring Node.js code fragment environment dependencies based on program analysis according to claim 2, characterized in that step 1) uses a technique that computes package dependencies from a knowledge source of system-level transitive dependencies produced by static analysis. The most commonly used packages are selected according to their SourceRank in the Libraries.io dataset, and the known resources of each package are enumerated by static analysis for later retrieval; that is, an offline knowledge base is built.
4. The method for automatically inferring Node.js code fragment environment dependencies based on program analysis according to claim 2, characterized in that in step 2) dynamic analysis is used to resolve software packages that do not list their dependencies correctly. Some software packages may not list their dependencies correctly, which prevents npm from resolving them automatically during installation. When an installation fails, we parse its error output for messages such as "no module name <name>" or "cannot find <name>", which indicate a dependency on a missing package, and enter the corresponding dependency records into the knowledge base according to these hints.
5. The method for automatically inferring Node.js code fragment environment dependencies based on program analysis according to claim 2, characterized in that in step 3) the knowledge base is modeled as an interdependency graph. The interdependency graph mainly uses four kinds of node: package nodes, version nodes, resource nodes, and association nodes (see FIG. 1). Every known version of a package is represented as a version node, which is labeled as a version and stores the package's version number. A resource node is owned by a version node, as indicated by a directed edge from the version node. Association nodes represent the mined association rules; they are labeled as associations and maintain metadata recording confidence, support, lift, and count.
6. The method for automatically inferring Node.js code fragment environment dependencies based on program analysis according to claim 2, characterized in that in step 4), for a given new Node.js code fragment, the target code is parsed and a list of all imported resources is extracted, and the reverse lookup is performed by querying our knowledge base and the package management system for potentially matching records. A match between a resource required by the application and an installable package may be established by a full or partial match against one or more known resources in the knowledge base. In addition, we check whether a package with the same name as the required resource exists; that is, after the reverse lookup is complete, the package name is normalized to match the name used by the package management system.
7. The method for automatically inferring Node.js code fragment environment dependencies based on program analysis according to claim 2, characterized in that step 5) uses an inference algorithm for direct and transitive dependencies that follows the installation order. Knowing only the packages that correspond to the top-level resources is often not sufficient to configure the environment properly, because those packages may themselves depend on other packages. Assuming that the interdependency graph contains all necessary relationships, the set P of packages that must be installed is the union of the set S of resolved direct dependencies and the set R of packages reachable from S.
However, it is not sufficient to calculate P alone. We must also maintain the correct ordering of dependencies so that each package is installed before any other package that depends on it. We do this by performing a depth-first search rooted at each package p ∈ S.
The invention automatically infers the environment dependencies of Node.js code fragments based on program analysis. First, the invention is concerned with the relationship between function calls in a Node.js code fragment and the dependency packages that contain the corresponding function declarations. Second, the invention uses an offline knowledge base to correctly infer the dependencies of a target script. This knowledge base contains packages, their versions and resources, and the relationships among them. It is constructed by applying static and dynamic analysis to known packages in the Libraries.io dataset: static analysis enumerates the known resources of a package for later retrieval, and dynamic analysis gathers information about transitive dependencies. Then, association rule mining over the dependencies of public Python projects leverages developer-generated knowledge of system-level transitive dependencies. Finally, for a given unfamiliar code fragment, an inference algorithm for direct and transitive dependencies that follows the installation order is applied on the basis of the offline knowledge base.
CN202011374137.5A 2020-11-30 2020-11-30 Automatic inference method for node.JS code segment environment dependency based on program analysis Pending CN114579130A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011374137.5A CN114579130A (en) 2020-11-30 2020-11-30 Automatic inference method for node.JS code segment environment dependency based on program analysis

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011374137.5A CN114579130A (en) 2020-11-30 2020-11-30 Automatic inference method for node.JS code segment environment dependency based on program analysis

Publications (1)

Publication Number Publication Date
CN114579130A true CN114579130A (en) 2022-06-03

Family

ID=81767118

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011374137.5A Pending CN114579130A (en) 2020-11-30 2020-11-30 Automatic inference method for node.JS code segment environment dependency based on program analysis

Country Status (1)

Country Link
CN (1) CN114579130A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116484356A (en) * 2023-04-26 2023-07-25 安元科技股份有限公司 Npm packet hierarchical authorization management method and device based on RBAC authority model

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination