CN112579463B - Solidity intelligent contract-oriented defect prediction method - Google Patents
Solidity intelligent contract-oriented defect prediction method
- Publication number
- CN112579463B
- Authority
- CN
- China
- Prior art keywords
- defect
- solidity
- prediction
- information
- regression
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/36—Preventing errors by testing or debugging software
- G06F11/3604—Software analysis for verifying properties of programs
- G06F11/3608—Software analysis for verifying properties of programs using formal methods, e.g. model checking, abstract interpretation
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02P—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
- Y02P90/00—Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
- Y02P90/30—Computing systems specially adapted for manufacturing
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Software Systems (AREA)
- Computer Hardware Design (AREA)
- Quality & Reliability (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
- General Factory Administration (AREA)
Abstract
The invention discloses a defect prediction method for Solidity smart contracts, applied in the technical field of software defect prediction. First, metrics of code modules are extracted from Solidity source code and each code module is labeled with its number of defects, thereby constructing a defect prediction dataset. Then, to address the class imbalance in the Solidity defect prediction dataset, the data are preprocessed with an oversampling method. Finally, a defect count prediction model and a defect tendency prediction model are constructed separately, and the performance of the models is evaluated. By combining the metric set with Solidity smart contract defect detection results to construct a Solidity smart contract defect prediction dataset, the invention better describes the characteristics of Solidity smart contracts, and the performance differences of different models on the defect count prediction and defect tendency prediction problems are each verified on this dataset.
Description
Technical Field
The invention relates to the technical field of software defect prediction, and in particular to a defect prediction method for Solidity smart contracts.
Background
Blockchain is the core supporting technology of digital cryptocurrency systems, of which Bitcoin is the representative example. The core advantage of blockchain technology is decentralization, which offers a solution to the high cost, low efficiency, and insecure data storage found in centralized institutions. Research on and application of blockchain technology are growing explosively; government departments, financial institutions, technology companies, and capital markets are all exploring ways to solve practical problems with blockchain technology.
A smart contract is a core component of a blockchain: a digital agreement whose terms are written as algorithms and programs, which runs on the blockchain and executes automatically according to its rules. Smart contracts were first proposed in 1994 and attracted wide attention with the advent of blockchain technology. Writing smart contracts makes more complex applications possible and thereby extends the functionality of the blockchain. Smart contracts are now beginning to play a role in traditional financial assets and in asset management and contract management in social systems, for example in equity crowdfunding or in voting agreements built on smart contracts.
While extending the functionality of the blockchain, smart contracts also introduce security risks, and defects in smart contracts can cause huge financial losses. For example, in November 2017 the Parity wallet was attacked, freezing ether worth about 285 million US dollars; in 2016, 3 million ether belonging to The DAO, then the largest crowdfunding project, were illegally transferred. Unlike traditional software, a smart contract is very difficult to patch after deployment, so quality assurance techniques for smart contracts have received wide attention in both industry and academia.
Software defect prediction is an effective complement to defect detection. By analyzing software code or the development process, designing metrics related to defects, and applying methods such as machine learning, it predicts the defect tendency or the defect count of software modules. The prediction results are used to optimize the allocation of defect detection resources, or to judge whether the system has been tested sufficiently as a basis for deciding whether the software can be delivered, thereby promoting software quality.
To our knowledge, however, defect prediction has not yet been studied in the field of smart contracts. Applying software defect prediction techniques to smart contracts faces the following challenges:
There is no smart contract defect prediction dataset.
Existing metric sets focus on code complexity and object-oriented program characteristics, whereas smart contracts are a new kind of program concerned with changes in currency, and no dedicated metric set currently describes their characteristics.
Therefore, how to provide a defect prediction method for Solidity smart contracts is a problem that those skilled in the art need to solve.
Disclosure of Invention
In view of this, the present invention provides a defect prediction method for Solidity smart contracts. It combines a metric set with Solidity defect detection results to construct a Solidity smart contract defect prediction dataset, which better describes the characteristics of Solidity smart contracts, and on this dataset it verifies the performance differences of different models on the defect count prediction and defect tendency prediction problems. For the defect tendency prediction problem, it further analyzes whether processing the class-imbalanced dataset with an oversampling technique improves prediction performance.
In order to achieve the above object, the present invention provides the following technical solutions:
A defect prediction method for Solidity smart contracts comprises the following specific steps:
extracting metric information from the source code, performing defect detection on the source code to obtain Solidity defect detection information, and combining the two kinds of information for each contract/library to form a defect dataset;
and performing prediction on the Solidity smart contracts to be predicted by using a regression model and a classification model.
Preferably, in the defect prediction method for Solidity smart contracts, the metric information covers: the functions, methods, variable types, and attributes of Solidity smart contracts, and the constraints of the Solidity language.
Preferably, in the defect prediction method for Solidity smart contracts, the specific step of extracting the metric information is: the CKBD metric information, which targets object-oriented features and code complexity, is combined with the SC-Sol metric information, which targets the functions, methods, variable types, and attributes of Solidity smart contracts and the constraints of the Solidity language, to obtain the CKBD-SC-Sol metric set.
Preferably, in the defect prediction method for Solidity smart contracts, the specific step of obtaining defect information is: the Solidity smart contract defect detection information is organized into the number of defects of each type contained in each contract/library; for the defect count dataset, the defect count of each contract/library is the sum of its defect counts over all types; for the defect tendency dataset, the number of defects of each type is binarized, that is, a contract/library with a defect count greater than 1 is labeled 1, and 0 otherwise.
Preferably, in the defect prediction method for Solidity smart contracts, the defect count of Solidity smart contracts is predicted by using a regression model; the regression model is one of linear regression, Bayesian ridge regression, decision tree regression, random forest regression, K-nearest-neighbor regression, gradient boosting regression, support vector machine regression, and the like.
Preferably, in the defect prediction method for Solidity smart contracts, the defect tendency of Solidity smart contracts is predicted by using a classification model; the classification model is one of a Bernoulli naive Bayes classifier, a Gaussian naive Bayes classifier, a K-nearest-neighbor classifier, a decision tree classifier, a random forest classifier, a support vector machine classifier, and the like.
Compared with the prior art, the defect prediction method for Solidity smart contracts provided by the invention first extracts the metrics of code modules from Solidity source code and labels each code module with its number of defects, thereby constructing a defect prediction dataset; then, to address the class imbalance in the Solidity defect prediction dataset, it preprocesses the data with an oversampling method; finally, it constructs a defect count prediction model and a defect tendency prediction model separately and evaluates the performance of the models. By combining the metric set with Solidity defect detection information to construct a Solidity smart contract defect prediction dataset, the invention better describes the characteristics of Solidity smart contracts, and on this dataset it verifies the performance differences of different models on the defect count prediction and defect tendency prediction problems.
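The oversampling step mentioned above can be illustrated with a minimal sketch. The text does not say which oversampling method is used, so the example below assumes plain random oversampling (duplicating minority-class rows until the classes are balanced); the function name `random_oversample` and the toy data are hypothetical.

```python
import random
from collections import Counter
from typing import List, Sequence, Tuple


def random_oversample(
    rows: Sequence[Sequence[float]],
    labels: Sequence[int],
    seed: int = 0,
) -> Tuple[List[Sequence[float]], List[int]]:
    """Duplicate randomly chosen minority-class rows until every class
    has as many rows as the largest class (assumed technique)."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for label, count in counts.items():
        # All original rows that carry this label
        pool = [r for r, l in zip(rows, labels) if l == label]
        for _ in range(target - count):
            out_rows.append(rng.choice(pool))
            out_labels.append(label)
    return out_rows, out_labels


# Toy data: four non-defective modules (0) and one defective module (1)
rows = [[1.0], [2.0], [3.0], [4.0], [5.0]]
labels = [0, 0, 0, 0, 1]
balanced_rows, balanced_labels = random_oversample(rows, labels)
print(Counter(balanced_labels))
```

In practice oversampling is usually applied to the training split only, so that duplicated rows do not leak into the evaluation data.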
Drawings
To illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings required for describing the embodiments or the prior art are briefly introduced below. The drawings described below are only embodiments of the present invention, and a person skilled in the art can obtain other drawings from them without inventive effort.
FIG. 1 is a flow chart of the method of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those skilled in the art on the basis of these embodiments without inventive effort fall within the scope of protection of the invention.
A defect prediction method for Solidity smart contracts, as shown in FIG. 1, comprises the following specific steps:
extracting metric information from the source code, performing defect detection on the source code to obtain defect detection information, and combining the two kinds of information for each contract/library to form a defect prediction dataset;
and performing prediction on the Solidity smart contracts to be predicted by using a regression model and a classification model.
Specifically, first, the CKBD-SC-Sol metric information is extracted from the source code and combined with the Solidity defect detection results to form a Solidity smart contract defect dataset.
Second, 7 regression models, namely linear regression, Bayesian ridge regression, decision tree regression, random forest regression, K-nearest-neighbor regression, gradient boosting regression, and support vector machine regression, are applied to the defect count prediction problem for Solidity smart contracts.
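As an illustration of the defect count prediction problem, the following minimal sketch fits the simplest of the listed models, linear regression, on a single metric using the closed-form least-squares solution. The metric (lines of code), the toy values, and the use of mean absolute error as the evaluation measure are assumptions made for the example; the text does not specify them.

```python
from typing import List, Tuple


def fit_simple_linear_regression(x: List[float], y: List[float]) -> Tuple[float, float]:
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope: float, intercept: float, x: float) -> float:
    """Predicted defect count for a module with metric value x."""
    return slope * x + intercept


def mean_absolute_error(y_true: List[float], y_pred: List[float]) -> float:
    """Average absolute difference between true and predicted counts."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical training data: lines of code per contract and its defect count
loc = [10.0, 20.0, 30.0, 40.0]
defects = [1.0, 2.0, 3.0, 4.0]

slope, intercept = fit_simple_linear_regression(loc, defects)
train_pred = [predict(slope, intercept, v) for v in loc]
mae = mean_absolute_error(defects, train_pred)
print(slope, intercept, mae)
```

A real experiment would use the full metric vector of each contract/library and compare all 7 models on held-out data; the single-feature form is used here only to keep the arithmetic visible.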
Third, for the defect tendency prediction problem for Solidity smart contracts, 6 classification models are applied, namely a Bernoulli naive Bayes classifier, a Gaussian naive Bayes classifier, a K-nearest-neighbor classifier, a decision tree classifier, a random forest classifier, and a support vector machine classifier.
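To make the defect tendency prediction problem concrete, here is a minimal sketch of one of the listed classifiers, K-nearest neighbors, written with the standard library only. The two features (lines of code and number of functions), the toy training rows, and the choice `k=3` are hypothetical; the text does not give hyperparameters or feature values.

```python
import math
from collections import Counter
from typing import Sequence


def knn_classify(
    train_x: Sequence[Sequence[float]],
    train_y: Sequence[int],
    query: Sequence[float],
    k: int = 3,
) -> int:
    """Label `query` by majority vote among its k nearest training rows
    (Euclidean distance)."""
    by_distance = sorted(
        zip(train_x, train_y),
        key=lambda pair: math.dist(pair[0], query),
    )
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# Hypothetical metric rows: (lines of code, number of functions)
train_x = [(10.0, 1.0), (12.0, 1.0), (11.0, 2.0),
           (200.0, 15.0), (220.0, 18.0), (210.0, 16.0)]
# Defect tendency labels: 1 = defect-prone, 0 = not defect-prone
train_y = [0, 0, 0, 1, 1, 1]

large_contract = knn_classify(train_x, train_y, (205.0, 17.0))
small_library = knn_classify(train_x, train_y, (9.0, 1.0))
print(large_contract, small_library)
```

Because Euclidean distance is dominated by the feature with the largest range, metrics would normally be scaled to comparable ranges before a distance-based model is applied.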
Further, the metric information covers: object-oriented features, code complexity, the functions, methods, variable types, and attributes of Solidity smart contracts, and the constraints of the Solidity language.
To further optimize the technical solution, the specific step of extracting the metric information is: the CKBD metric information, which targets object-oriented features and code complexity, is combined with the SC-Sol metric information, which targets the functions, methods, variable types, and attributes of Solidity smart contracts and the constraints of the Solidity language, to obtain the CKBD-SC-Sol metric set.
Further, since no defect prediction dataset for Solidity smart contracts exists yet, a dataset is built first in order to construct the defect prediction models: the source code of Solidity smart contracts is obtained from XBlock and Etherscan, the CKBD-SC-Sol metric set of each Solidity smart contract is extracted using the AST analysis tool solidity-parser-antlr, and the extracted CKBD-SC-Sol metric information is combined with the corresponding defect detection information to construct the Solidity defect prediction dataset.
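The combination step described above amounts to a join on the contract/library name. The sketch below assumes the metrics and the defect detection results have already been exported as dictionaries keyed by that name; the metric names (`loc`, `num_functions`, `num_payable`), the defect type names, and the rule that a module missing from the defect report has no reported defects are all hypothetical choices for illustration.

```python
from typing import Dict, List


def build_defect_dataset(
    metrics: Dict[str, Dict[str, float]],
    defect_counts: Dict[str, Dict[str, int]],
) -> List[dict]:
    """Join metric values and per-type defect counts on the
    contract/library name, one output row per module."""
    rows = []
    for name, metric_values in sorted(metrics.items()):
        # Assumption: a module absent from the defect report has no defects
        per_type = defect_counts.get(name, {})
        rows.append({
            "name": name,
            **metric_values,
            "defects_by_type": dict(per_type),
        })
    return rows


# Hypothetical extracted metrics, keyed by contract/library name
metrics = {
    "Token": {"loc": 120.0, "num_functions": 9.0, "num_payable": 1.0},
    "SafeMath": {"loc": 40.0, "num_functions": 4.0, "num_payable": 0.0},
}
# Hypothetical defect detection results, keyed the same way
defect_counts = {"Token": {"reentrancy": 1, "unchecked_call": 2}}

dataset = build_defect_dataset(metrics, defect_counts)
for row in dataset:
    print(row)
```

Keying both inputs by the same module name is what lets each metric row be paired with its own defect report, which is the per-contract/library correspondence the method relies on.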
To further optimize the technical solution, the specific step of obtaining defect information is: a blockchain smart contract detection platform takes Solidity source code or an Ethereum contract address as input and, after analysis and detection, outputs the defect types and the corresponding code line numbers; the defect reports output by the platform are organized, by defect type, into the number of defects of each type contained in each contract/library; for the defect count dataset, the defect count of each contract/library is the sum of its defect counts over all types; for the defect tendency dataset, the number of defects of each type is binarized, that is, a contract/library with a defect count greater than 1 is labeled 1, and 0 otherwise.
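The two labeling rules above can be sketched as follows. The example assumes the binarization is applied to the summed count of a contract/library and follows the wording "greater than 1" literally through a `threshold` parameter; both the function name `make_labels` and the defect type names are hypothetical.

```python
from typing import Dict, Tuple


def make_labels(defects_by_type: Dict[str, int], threshold: int = 1) -> Tuple[int, int]:
    """Return (defect_count, defect_tendency) for one contract/library.

    defect_count is the sum over all defect types; defect_tendency is 1
    when that count is greater than `threshold`, otherwise 0.
    """
    defect_count = sum(defects_by_type.values())
    defect_tendency = 1 if defect_count > threshold else 0
    return defect_count, defect_tendency


# Hypothetical per-type defect counts for three modules
print(make_labels({"reentrancy": 1, "unchecked_call": 2}))
print(make_labels({}))
print(make_labels({"reentrancy": 1}))
```

Passing `threshold=0` would instead label every module with at least one defect as 1, which is the more common convention in defect prediction; the default here only mirrors the wording of the text.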
In this specification, the embodiments are described progressively: each embodiment focuses on its differences from the others, and for identical or similar parts the embodiments may be referred to one another. Since the device disclosed in the embodiments corresponds to the method disclosed in the embodiments, its description is relatively brief, and the relevant points can be found in the description of the method.
The above description of the disclosed embodiments enables any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
Claims (4)
1. A defect prediction method for Solidity smart contracts, characterized by comprising the following specific steps:
extracting metric information from the source code, obtaining defect information, and combining them to form a defect dataset; the metric information covers: object-oriented features, code complexity, the functions, methods, variable types, and attributes of Solidity smart contracts, and the constraints of the Solidity language; the specific step of obtaining defect information is: according to the Solidity defect detection information, organizing, by defect type, the number of defects of each type contained in each contract/library; for the defect count dataset, the defect count of each contract/library is the sum of its defect counts over all types; for the defect tendency dataset, binarizing the number of defects of each type, that is, labeling a contract/library with a defect count greater than 1 as 1, and otherwise as 0; obtaining the source code of Solidity smart contracts from XBlock and Etherscan, extracting the CKBD-SC-Sol metric set of the Solidity smart contracts by using the AST analysis tool solidity-parser-antlr, and combining the extracted CKBD-SC-Sol metric information with the corresponding defect detection information to construct a Solidity defect prediction dataset;
and performing prediction on the Solidity smart contracts to be predicted by using a regression model and a classification model.
2. The defect prediction method for Solidity smart contracts according to claim 1, wherein the specific step of extracting the metric information is: the CKBD metric information, which targets object-oriented features and code complexity, is combined with the SC-Sol metric information, which targets the functions, methods, variable types, and attributes of Solidity smart contracts and the constraints of the Solidity language, to obtain the CKBD-SC-Sol metric set.
3. The defect prediction method for Solidity smart contracts according to claim 1, wherein the defect count of Solidity smart contracts is predicted by using a regression model; wherein the regression model is one of linear regression, Bayesian ridge regression, decision tree regression, random forest regression, and K-nearest-neighbor regression.
4. The defect prediction method for Solidity smart contracts according to claim 1, wherein the defect tendency of Solidity smart contracts is predicted by using a classification model; the classification model is one of a Bernoulli naive Bayes classifier, a Gaussian naive Bayes classifier, a K-nearest-neighbor classifier, a decision tree classifier, a random forest classifier, and a support vector machine classifier.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011562073.1A CN112579463B (en) | 2020-12-25 | 2020-12-25 | Solidity intelligent contract-oriented defect prediction method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112579463A | 2021-03-30 |
CN112579463B | 2024-05-24 |
Family
ID=75139676
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011562073.1A Active CN112579463B (en) | 2020-12-25 | 2020-12-25 | Solidity intelligent contract-oriented defect prediction method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112579463B (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114331396A (en) * | 2021-12-28 | 2022-04-12 | 中国科学技术大学 | Automatic protocol security attribute extraction method and system for Ethereum smart contract |
CN114510431B (en) * | 2022-04-20 | 2022-07-05 | 武汉理工大学 | Workload-aware intelligent contract defect prediction method, system and equipment |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106201871A (en) * | 2016-06-30 | 2016-12-07 | 重庆大学 | Based on the Software Defects Predict Methods that cost-sensitive is semi-supervised |
CN108664402A (en) * | 2018-05-14 | 2018-10-16 | 北京航空航天大学 | A kind of failure prediction method based on software network feature learning |
CN109977682A (en) * | 2019-04-01 | 2019-07-05 | 中山大学 | A kind of block chain intelligence contract leak detection method and device based on deep learning |
CN110543419A (en) * | 2019-08-28 | 2019-12-06 | 杭州趣链科技有限公司 | intelligent contract code vulnerability detection method based on deep learning technology |
CN111240993A (en) * | 2020-01-20 | 2020-06-05 | 北京航空航天大学 | Software defect prediction method based on module dependency graph |
CN111339535A (en) * | 2020-02-17 | 2020-06-26 | 扬州大学 | Vulnerability prediction method and system for intelligent contract codes, computer equipment and storage medium |
CN111506504A (en) * | 2020-04-13 | 2020-08-07 | 扬州大学 | Software development process measurement-based software security defect prediction method and device |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP3824423B1 (en) * | 2018-07-20 | 2024-08-21 | Coral Protocol | Blockchain transaction safety using smart contracts |
Also Published As
Publication number | Publication date |
---|---|
CN112579463A (en) | 2021-03-30 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
TA01 | Transfer of patent application right | Effective date of registration: 20220920. Address after: Room 3006, Building 2, Tianchang Garden, No. 34 Beiyuan Road, Chaoyang District, Beijing 100000. Applicant after: Dabu Technology (Beijing) Co.,Ltd. Address before: 100192 Beijing city Haidian District Qinghe small Camp Road No. 12. Applicant before: BEIJING INFORMATION SCIENCE AND TECHNOLOGY University |
GR01 | Patent grant | ||