CN112508254B - Method for determining investment prediction data of transformer substation engineering project - Google Patents


Publication number
CN112508254B
Authority
CN
China
Prior art keywords
transformer substation
investment prediction
project
substation
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011375669.0A
Other languages
Chinese (zh)
Other versions
CN112508254A (en
Inventor
管维亚
李国文
张旺
袁竞峰
卢文飞
张嘉澍
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
State Grid Jiangsu Electric Power Design Consultation Co ltd
Southeast University
Economic and Technological Research Institute of State Grid Jiangsu Electric Power Co Ltd
Original Assignee
State Grid Jiangsu Electric Power Design Consultation Co ltd
Southeast University
Economic and Technological Research Institute of State Grid Jiangsu Electric Power Co Ltd
Application filed by State Grid Jiangsu Electric Power Design Consultation Co ltd, Southeast University, Economic and Technological Research Institute of State Grid Jiangsu Electric Power Co Ltd
Priority to CN202011375669.0A
Publication of CN112508254A
Application granted
Publication of CN112508254B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G06N20/20 Ensemble learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0639 Performance analysis of employees; Performance analysis of enterprise or organisation operations
    • G06Q10/06393 Score-carding, benchmarking or key performance indicator [KPI] analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/06 Asset management; Financial planning or analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/06 Energy or water supply
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y04 INFORMATION OR COMMUNICATION TECHNOLOGIES HAVING AN IMPACT ON OTHER TECHNOLOGY AREAS
    • Y04S SYSTEMS INTEGRATING TECHNOLOGIES RELATED TO POWER NETWORK OPERATION, COMMUNICATION OR INFORMATION TECHNOLOGIES FOR IMPROVING THE ELECTRICAL POWER GENERATION, TRANSMISSION, DISTRIBUTION, MANAGEMENT OR USAGE, i.e. SMART GRIDS
    • Y04S10/00 Systems supporting electrical power generation, transmission or distribution
    • Y04S10/50 Systems or methods supporting the power network operation or management, involving a certain degree of interaction with the load-side end user applications

Landscapes

  • Business, Economics & Management (AREA)
  • Engineering & Computer Science (AREA)
  • Human Resources & Organizations (AREA)
  • Economics (AREA)
  • Strategic Management (AREA)
  • Theoretical Computer Science (AREA)
  • Development Economics (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Marketing (AREA)
  • General Business, Economics & Management (AREA)
  • Game Theory and Decision Science (AREA)
  • Operations Research (AREA)
  • Tourism & Hospitality (AREA)
  • Educational Administration (AREA)
  • Software Systems (AREA)
  • Finance (AREA)
  • Accounting & Taxation (AREA)
  • Health & Medical Sciences (AREA)
  • Quality & Reliability (AREA)
  • General Engineering & Computer Science (AREA)
  • Water Supply & Treatment (AREA)
  • Computing Systems (AREA)
  • Evolutionary Computation (AREA)
  • Mathematical Physics (AREA)
  • Data Mining & Analysis (AREA)
  • Public Health (AREA)
  • Medical Informatics (AREA)
  • General Health & Medical Sciences (AREA)
  • Primary Health Care (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • Technology Law (AREA)
  • Supply And Distribution Of Alternating Current (AREA)

Abstract

The invention discloses a method for determining investment prediction data of a transformer substation engineering project. The method comprises: constructing a substation engineering project investment prediction index system; collecting and processing the investment prediction index data; constructing a substation engineering project investment prediction model based on the XGBoost algorithm; and inputting substation engineering project data into the model and determining the project investment data from the model output. The index system is established with the aim of improving substation engineering project investment prediction, and the investment prediction data produced by the XGBoost-based model give decision makers accurate reference data. The method thereby supports optimized decision-support techniques for construction investment prediction and addresses the problem of coordinating data and accuracy in substation engineering project investment prediction.

Description

Method for determining investment prediction data of transformer substation engineering project
Technical Field
The invention relates to the technical field of investment decision-making of transformer substation engineering projects, in particular to a method for determining investment prediction data of transformer substation engineering projects.
Background
Power transmission and transformation engineering is the main component of electric power infrastructure construction and the main direction of national power grid construction investment. Within power transmission and transformation engineering, transformer substation projects are among the most complex in terms of investment. Investment prediction is a main task of the early decision-making stage of a substation project. With the healthy, rapid and sustained development of the economy and the growing demand for electricity across society, large power enterprises have built transmission and distribution networks on a large scale to raise capacity, meet that growing demand and relieve the power supply shortage in China. As an important driver of national economic growth, the construction of substation engineering projects has drawn wide attention. As the power industry scales up, the national economy develops rapidly and modernization is gradually realized, power supply must be highly reliable and system operation highly safe. Substations are channels for transmitting electric energy; how the national power grid company can plan substation engineering projects reasonably, and thereby make sound investment predictions and decisions that meet the growing demand for electric energy, is an important and urgent problem in the development of power grid enterprises.
Under current social conditions, the construction of substation engineering projects has the following characteristics. First, in some regions the volume of substation construction is huge and the construction tasks are very demanding. Second, the natural environment is harsh and construction is difficult, so project investment is hard to control. As the center of gravity of social and economic development shifts, more and more newly developed hydropower and wind power projects are located in remote southwestern or western regions, where the harsh natural environment directly increases the construction difficulty of substation projects and raises the engineering investment. Third, the promulgation of the Property Law has made corridor clearance for substation projects more difficult, which directly makes project cost control harder. These characteristics directly affect the investment costs of the national grid company. To change the current state of substation engineering construction, inefficient working modes must be replaced through technical and management innovation. In particular, investment control should be strengthened at the planning stage of a substation engineering project: project investment prediction and control should be enhanced and the lean level of investment control raised, ultimately improving the economic and social benefits of the national grid company.
In conclusion, the realization of the investment prediction of the transformer substation engineering project is beneficial to improving the efficiency of the planning investment of the transformer substation engineering project and the accuracy of the investment scale planning.
The models and methods commonly used in substation engineering investment prediction include regression analysis, neural network models, matrix models and the like. Because substation engineering investment has many influencing factors and the relations among them are nonlinear, traditional statistical methods such as regression analysis cannot overcome their own limitations when applied to nonlinear data, and their prediction precision is low. To handle the nonlinearity, high dimensionality and small samples found in socioeconomic systems, neural network models are currently widely used in project investment prediction. Although neural network models predict well, their parameter settings are complex and their interpretability is poor, and with small samples they are prone to overfitting and poor generalization. Ensuring a given precision in project investment prediction therefore places too many restrictive conditions on a neural network model, which makes it unsuitable for substation engineering planning investment prediction.
XGBoost is a novel ensemble learning algorithm based on Boosting. Compared with traditional machine learning algorithms, it offers high running speed, strong generalization capability, high prediction precision and good robustness. In addition, XGBoost models are relatively interpretable and can be used for prediction on small samples. XGBoost is currently widely applied in fields such as runoff prediction, credit card transaction prediction and fault monitoring, but it has not yet been applied in depth to substation engineering investment prediction.
Disclosure of Invention
In view of the above problems, the invention provides a method for determining investment prediction data of a transformer substation engineering project. First, a substation engineering project investment prediction index system is constructed according to the attributes of substation engineering projects. Second, data relevant to substation engineering project investment prediction are collected. Third, a substation engineering project investment prediction model is constructed based on the XGBoost algorithm and the effectiveness of the model is evaluated. Finally, substation engineering project data are input into the model, and the project investment data are determined from the model output.
In order to achieve the purpose of the invention, a method for determining investment prediction data of a transformer substation engineering project is provided, and comprises the following steps:
S10, constructing a substation engineering project investment prediction index system;
S20, collecting and processing investment prediction index data of the substation engineering project;
S30, constructing a substation engineering project investment prediction model based on the XGBoost algorithm;
S40, inputting substation engineering project data into the substation engineering project investment prediction model, and determining the substation engineering project investment data according to the output of the model.
In one embodiment, constructing a substation project investment prediction index system comprises:
S11, sorting out the composition and construction process of the substation engineering project, and preliminarily constructing a substation engineering project investment prediction index system;
S12, extracting the principal components of the substation engineering project investment prediction indices by applying principal component analysis to the preliminary index system;
S13, correcting the principal components of the investment prediction indices to determine the final substation engineering project investment prediction index system.
In one embodiment, collecting and processing substation project investment prediction index data includes:
S21, converting the Chinese names of the substation engineering project investment prediction indices into a set form so that a computer can identify them accurately;
S22, filling missing values in the converted data by regression interpolation;
S23, identifying abnormal values in the filled data and eliminating them to obtain the substation engineering project investment prediction index data.
In one embodiment, constructing a substation engineering project investment prediction model based on an XGBoost algorithm includes:
S31, dividing the substation engineering project investment prediction index data into a training set and a test set in a ratio of 7:3;
S32, in order to eliminate the influence of the division mode and the random ordering of the samples on the prediction result, further dividing the training set obtained in step S31 into K folds, where K is chosen so that N/K > 3D, N represents the number of samples, and D represents the number of features;
S33, constructing an investment prediction model based on the XGBoost algorithm, training the tree ensemble model used in XGBoost in an additive manner, and stopping splitting once the depth threshold of a tree is reached, to obtain the substation engineering project investment prediction model;
S34, setting the parameters of the substation engineering project investment prediction model;
S35, inputting the test set into the substation engineering project investment prediction model for verification, evaluating the model by comparing its predicted results with the actual results, and, after the evaluation is passed, fixing the substation engineering project investment prediction model according to its operating parameters.
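The additive training in step S33 can be illustrated with a minimal sketch. This is plain gradient boosting on squared error with depth-limited regression trees, written with NumPy only; it is not XGBoost itself, which additionally uses second-order gradient information and regularization terms. The function names, the learning rate, the number of trees, the depth threshold and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def fit_tree(X, residual, depth, max_depth):
    """Fit a regression tree to residuals; stop splitting at max_depth."""
    if depth >= max_depth or len(residual) < 2:
        return {"leaf": float(residual.mean())}
    best = None
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j])[:-1]:          # candidate thresholds
            left = X[:, j] <= t
            sse = (((residual[left] - residual[left].mean()) ** 2).sum()
                   + ((residual[~left] - residual[~left].mean()) ** 2).sum())
            if best is None or sse < best[0]:
                best = (sse, j, t, left)
    if best is None:                               # no valid split exists
        return {"leaf": float(residual.mean())}
    _, j, t, left = best
    return {"feature": j, "threshold": t,
            "left": fit_tree(X[left], residual[left], depth + 1, max_depth),
            "right": fit_tree(X[~left], residual[~left], depth + 1, max_depth)}

def predict_tree(tree, x):
    """Walk from the root to a leaf and return the leaf value."""
    while "leaf" not in tree:
        branch = "left" if x[tree["feature"]] <= tree["threshold"] else "right"
        tree = tree[branch]
    return tree["leaf"]

def fit_boosting(X, y, n_trees=50, learning_rate=0.3, max_depth=2):
    """Additive training: each new tree is fitted to the current residuals."""
    base = float(y.mean())
    pred = np.full(len(y), base)
    trees = []
    for _ in range(n_trees):
        tree = fit_tree(X, y - pred, 0, max_depth)
        pred = pred + learning_rate * np.array([predict_tree(tree, x) for x in X])
        trees.append(tree)
    return {"base": base, "rate": learning_rate, "trees": trees}

def predict_boosting(model, X):
    """Sum the base value and the scaled contribution of every tree."""
    return np.array([
        model["base"] + model["rate"] * sum(predict_tree(t, x) for t in model["trees"])
        for x in X
    ])

# Synthetic stand-in data (not substation data): two features, one target.
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(60, 2))
y = 3 * X[:, 0] + 5 * (X[:, 1] > 5)
model = fit_boosting(X, y)
train_mse = float(np.mean((predict_boosting(model, X) - y) ** 2))
print(train_mse, float(y.var()))
```

In practice the `xgboost` package would replace this sketch; the point here is only that the prediction is a sum of trees, each one added to reduce the error left by the trees before it.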
In the above method, a substation engineering project investment prediction index system is constructed, the investment prediction index data are collected and processed, a substation engineering project investment prediction model based on the XGBoost algorithm is constructed, substation engineering project data are input into the model, and the project investment data are determined from the model output. The index system is established with the aim of improving substation engineering project investment prediction. The investment prediction data obtained from the XGBoost-based model can provide accurate reference data for decision makers and support optimized decision-support techniques for construction investment prediction. The method addresses the problem of coordinating data and accuracy in substation engineering project investment prediction, helps improve the accuracy of substation engineering project investment decisions, and supports the high-quality development of power grid enterprises.
Drawings
FIG. 1 is a flow chart of a method of determining substation project investment prediction data for one embodiment;
FIG. 2 is a general flow diagram of a substation project investment prediction model based on the XGBoost algorithm of one embodiment;
FIG. 3 is a substation project construction flow diagram of one embodiment;
FIG. 4 is a substation investment prediction preliminary index system diagram of one embodiment;
FIG. 5 is a flow chart of the principal component analysis used to screen substation engineering investment prediction indices in one embodiment;
FIG. 6 is a final index system diagram of substation engineering investment prediction for one embodiment;
FIG. 7 is a flow diagram of substation engineering investment prediction index data processing for one embodiment;
FIG. 8 is a K-fold partitioning process diagram of a training set, according to one embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application will be further described in detail with reference to the accompanying drawings and examples. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the present application.
Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment may be included in at least one embodiment of the present application. The appearances of such phrases in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. Those of skill in the art will explicitly and implicitly appreciate that the embodiments described herein may be combined with other embodiments.
Traditional statistical methods such as regression analysis cannot overcome their own limitations when applied to nonlinear data, and their prediction precision is low. Neural network models predict well, but their parameter settings are complex and their interpretability is poor, and with small samples they are prone to overfitting and poor generalization. Substation engineering is an important component of electric power infrastructure and is characterized by large investment amounts, a single product structure and long project periods. Investment decisions are influenced by multi-dimensional factors, and predicting substation investment effectively against the background of the new round of electricity reform is very important for the high-benefit development of electric power infrastructure. Addressing substation engineering project investment prediction, the invention examines the investment process and project attributes of substation engineering projects in detail, constructs a substation engineering project investment prediction index system, and constructs a substation engineering project investment prediction model based on the XGBoost algorithm. This helps improve the precision of substation engineering project investment prediction, provides decision makers with more accurate reference information, and supports high-benefit investment in infrastructure projects.
Referring to fig. 1, fig. 1 is a flowchart of a method for determining investment prediction data of a substation project according to one embodiment, including the following steps:
S10, constructing a substation engineering project investment prediction index system.
S20, collecting and processing investment prediction index data of the engineering project of the transformer substation.
S30, constructing a substation engineering project investment prediction model based on an XGBoost algorithm.
S40, inputting substation project data into the substation project investment prediction model, and determining the substation project investment data according to the output result of the substation project investment prediction model.
In one embodiment, step S40 includes the steps of:
Step S41, obtaining objective state data for the indices determined in step S10. That is, based on the index system obtained in step S10, the objective state data of each index are determined according to the actual conditions of the substation engineering project.
Step S42, inputting the objective index state data obtained in step S41 into the model obtained in step S30 to obtain the investment prediction data.
In one embodiment, constructing a substation project investment prediction index system comprises:
S11, sorting out the composition and construction process of the substation engineering project, and preliminarily constructing a substation engineering project investment prediction index system;
S12, extracting the principal components of the substation engineering project investment prediction indices by applying principal component analysis to the preliminary index system;
S13, correcting the principal components of the investment prediction indices to determine the final substation engineering project investment prediction index system; in particular, the correction may be made through expert interviews.
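As a rough illustration of the principal component extraction in step S12, the following sketch standardizes an index matrix, eigendecomposes its correlation matrix and keeps enough components to reach a cumulative explained-variance threshold. The 0.85 threshold, the function name and the synthetic data are assumptions; the source does not state a retention criterion, and the expert correction of step S13 is outside the scope of the code.

```python
import numpy as np

def screen_principal_components(X, var_threshold=0.85):
    """Standardize the index matrix X (samples x indices) and run PCA.

    Returns the number of components needed to reach var_threshold,
    the explained-variance ratio of every component, and the loadings
    of the retained components.
    """
    X = np.asarray(X, dtype=float)
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)   # z-score each index
    corr = np.cov(Z, rowvar=False)                      # correlation matrix
    eigvals, eigvecs = np.linalg.eigh(corr)             # symmetric eigendecomposition
    order = np.argsort(eigvals)[::-1]                   # largest variance first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    ratio = eigvals / eigvals.sum()
    n_keep = int(np.searchsorted(np.cumsum(ratio), var_threshold)) + 1
    n_keep = min(n_keep, len(ratio))
    return n_keep, ratio, eigvecs[:, :n_keep]

# Synthetic stand-in data: four indices driven by two underlying factors.
rng = np.random.default_rng(0)
factors = rng.normal(size=(50, 2))
noise = 0.05 * rng.normal(size=(50, 2))
X = np.column_stack([
    factors[:, 0], 2 * factors[:, 0] + noise[:, 0],
    factors[:, 1], -factors[:, 1] + noise[:, 1],
])
n_keep, ratio, loadings = screen_principal_components(X)
print(n_keep, ratio.round(3))
```

Indices with large loadings on the retained components are the natural candidates to carry forward into the final index system.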
In one embodiment, collecting and processing substation project investment prediction index data includes:
S21, converting the Chinese names of the substation engineering project investment prediction indices into a set form so that a computer can identify them accurately. The set form may be English letters plus numbers. This step converts the feature names: the Chinese names of the investment prediction indices are converted into English letters and numbers to facilitate recognition and processing by a computer.
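A minimal sketch of the name conversion in step S21 follows. The `X1`, `X2`, ... coding scheme and the three index names are illustrative placeholders, not the patent's actual index list.

```python
def encode_index_names(names, prefix="X"):
    """Map each original index name to a letter-plus-number code."""
    return {name: f"{prefix}{i}" for i, name in enumerate(names, start=1)}

# Hypothetical index names, used only to show the mapping.
chinese_names = ["主变压器容量", "建筑面积", "配电装置型式"]
name_map = encode_index_names(chinese_names)
reverse_map = {code: name for name, code in name_map.items()}
print(name_map)
```

Keeping the reverse mapping makes it possible to report results under the original index names after modelling.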
S22, filling missing values in the converted data by regression interpolation. This step handles missing values. Directly acquired data often contain many missing values as well as noise and other problems. The missing values in the original data are therefore filled by regression interpolation so that the data can be used more effectively. Specifically, the attribute with missing values is taken as the dependent variable and the other related attributes as independent variables; a regression model is built on the relation between them and used to predict the missing values, which completes the interpolation.
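The regression interpolation described in step S22 might look like the sketch below, which fits an ordinary least-squares model on the complete rows and predicts the missing entries of one column. It assumes the predictor columns themselves are complete and that a linear model is adequate; the function name and the toy data are hypothetical.

```python
import numpy as np

def regression_impute(data, target_col):
    """Fill NaNs in one column using a linear regression on the other columns."""
    data = np.array(data, dtype=float)
    missing = np.isnan(data[:, target_col])
    predictors = np.delete(data, target_col, axis=1)
    A = np.column_stack([np.ones(len(data)), predictors])  # intercept + predictors
    # Fit on rows where the target is observed.
    coef, *_ = np.linalg.lstsq(A[~missing], data[~missing, target_col], rcond=None)
    # Predict the target where it is missing.
    data[missing, target_col] = A[missing] @ coef
    return data

# Toy data following y = 2x + 1, with one missing y value.
raw = [[1.0, 3.0], [2.0, 5.0], [3.0, float("nan")], [4.0, 9.0]]
filled = regression_impute(raw, target_col=1)
print(filled[2, 1])
```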
S23, identifying abnormal values in the filled data and eliminating them to obtain the substation engineering project investment prediction index data. This step identifies and handles abnormal values, also called outliers, which are usually caused by entry errors and the like. Abnormal values are identified as follows: for continuous values, according to the 3σ principle, data lying more than 3 standard deviations from the mean are regarded as outliers; for discrete values, the data are clustered, and data that do not fall within any cluster are regarded as outliers.
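For continuous values, the 3-standard-deviation rule in step S23 can be sketched with the standard library alone. The function name and the sample values are illustrative, and the clustering-based check for discrete values is not covered here.

```python
import statistics

def drop_three_sigma_outliers(values):
    """Split values into those within 3 standard deviations of the mean and the rest."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        return list(values), []
    kept = [v for v in values if abs(v - mean) <= 3 * std]
    removed = [v for v in values if abs(v - mean) > 3 * std]
    return kept, removed

# Hypothetical index values with one entry error (1000 instead of about 100).
values = [100, 102, 98, 101, 99, 103, 97, 100, 101, 99,
          100, 102, 98, 101, 99, 100, 103, 97, 100, 1000]
kept, removed = drop_three_sigma_outliers(values)
print(removed)
```

Note that the rule needs a reasonable sample size: with very few observations a single extreme value inflates the standard deviation so much that it can never lie more than 3 standard deviations from the mean.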
In one embodiment, constructing a substation engineering project investment prediction model based on an XGBoost algorithm includes:
S31, dividing the substation engineering project investment prediction index data into a training set and a test set in a ratio of 7:3. That is, before the XGBoost-based substation engineering project investment prediction model is constructed, the data set is divided into a training set and a test set. The data set processed in step S20 may be divided by means of the sklearn package in Python.
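The 7:3 division can be sketched as follows using only the standard library, so that the example stays self-contained. With scikit-learn installed, `sklearn.model_selection.train_test_split` with `test_size=0.3` performs the same job. The seed and the stand-in rows are assumptions.

```python
import random

def split_train_test(rows, train_ratio=0.7, seed=42):
    """Shuffle row indices reproducibly and split them by train_ratio."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)
    cut = round(len(rows) * train_ratio)
    train = [rows[i] for i in indices[:cut]]
    test = [rows[i] for i in indices[cut:]]
    return train, test

# Stand-in for 100 processed project records.
rows = list(range(100))
train, test = split_train_test(rows)
print(len(train), len(test))
```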
S32, in order to eliminate the influence of the division mode and the random ordering of the samples on the prediction result, further dividing the training set obtained in step S31 into K folds, where K is chosen so that N/K > 3D, N represents the number of samples, and D represents the number of features.
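The constraint N/K > 3D bounds how many folds can be used. A small sketch of choosing the largest admissible K and generating contiguous folds is given below; it assumes N is the number of training samples, and the numbers N = 140 and D = 8 are purely illustrative.

```python
def max_folds(n_samples, n_features):
    """Largest integer K satisfying n_samples / K > 3 * n_features."""
    return (n_samples - 1) // (3 * n_features)

def k_fold_indices(n_samples, k):
    """Split range(n_samples) into k contiguous folds of near-equal size."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds

n_samples, n_features = 140, 8           # hypothetical training-set shape
k = max_folds(n_samples, n_features)     # 140 / 5 = 28 > 24, but 140 / 6 < 24
folds = k_fold_indices(n_samples, k)
print(k, [len(f) for f in folds])
```

If `max_folds` returns less than 2, the training set is too small for the constraint to be met with K-fold division at the given number of features.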
S33, constructing an investment prediction model based on the XGBoost algorithm, training the tree ensemble model used in XGBoost in an additive manner, and stopping splitting once the depth threshold of a tree is reached, to obtain the substation engineering project investment prediction model.
S34, setting the parameters of the substation engineering project investment prediction model.
S35, inputting the test set into the substation engineering project investment prediction model for verification, evaluating the model by comparing its predicted results with the actual results, and, after the evaluation is passed, fixing the model according to its operating parameters. This step uses the separate test set to validate the model. The evaluation indices of the XGBoost-based substation engineering project investment prediction model are mainly the mean absolute error (MAE), the mean absolute percentage error (MAPE) and the goodness of fit (R²).
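The three evaluation indices named in step S35 follow their standard definitions and can be computed as in the sketch below. The actual and predicted investment figures are made-up numbers for illustration, not results from the patent.

```python
def mean_absolute_error(actual, predicted):
    """MAE: average absolute difference between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def mean_absolute_percentage_error(actual, predicted):
    """MAPE in percent; assumes no actual value is zero."""
    return 100 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

def r_squared(actual, predicted):
    """Goodness of fit: 1 minus residual sum of squares over total sum of squares."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1 - ss_res / ss_tot

# Hypothetical test-set investments and model predictions (arbitrary units).
actual = [100, 200, 300, 400]
predicted = [110, 190, 300, 420]
print(mean_absolute_error(actual, predicted))
print(mean_absolute_percentage_error(actual, predicted))
print(r_squared(actual, predicted))
```

Lower MAE and MAPE and an R² closer to 1 indicate a better fit; which thresholds count as passing the evaluation is not specified in the source.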
In one embodiment, the method for determining investment prediction data of a substation engineering project can also be understood with reference to fig. 2. A substation engineering data set is first established from the basic power data sources of each locality and the data types are divided; a substation engineering investment prediction index system is then constructed; the data related to the investment prediction indexes are collected and processed; finally, a substation engineering investment prediction model based on the XGBoost algorithm is constructed, and its accuracy is evaluated and verified. The method mainly comprises the following steps:
Step 1) constructing a substation engineering project investment prediction index system;
Step 2) collecting and processing the substation engineering project investment prediction index data;
Step 3) constructing a substation engineering project investment prediction model based on the XGBoost algorithm;
Step 4) inputting substation engineering project data and determining the substation engineering project investment data.
The above steps are described in detail below with reference to figs. 2 to 8.
Step 1) Constructing a substation engineering project investment prediction index system. Specifically, the implementation of step 1) includes the following steps:
Step 1-1) Sorting out the composition and construction process of a substation engineering project, and preliminarily constructing a substation engineering project investment prediction index system. A substation project mainly comprises two parts: building engineering and electrical equipment installation engineering. The building engineering mainly comprises buildings and structures; the electrical equipment installation engineering mainly comprises the installation, testing and commissioning of primary and secondary equipment. The specific construction process is shown in fig. 3. The invention therefore classifies the substation investment prediction indexes into two categories: electrical equipment and building engineering. The building engineering indexes mainly reflect the building composition, structure, building material consumption, foundation treatment mode and the like. The electrical equipment indexes mainly reflect the composition, type, quantity and power distribution type of the electrical equipment. After preliminary screening, the preliminary substation investment prediction index system shown in fig. 4 is constructed.
Step 1-2) Extracting the principal components of the substation engineering project investment prediction indexes by principal component analysis. Specifically, the implementation of step 1-2) includes the following steps:
Step 1-2-1) Constructing the original data matrix and standardizing the data. Assume that there are m substation projects, each with n investment prediction indexes. First, the original data matrix X is constructed:

$$X = \begin{pmatrix} x_{11} & x_{12} & \cdots & x_{1n} \\ x_{21} & x_{22} & \cdots & x_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ x_{m1} & x_{m2} & \cdots & x_{mn} \end{pmatrix}$$

Then the data are standardized to eliminate the influence of dimension and order of magnitude on the evaluation. The Z-score method is used:

$$x_{ki}^{*} = \frac{x_{ki} - \bar{X}_i}{\sqrt{\operatorname{var}(X_i)}}, \quad k = 1, \dots, m;\ i = 1, \dots, n$$

where $\bar{X}_i$ and $\operatorname{var}(X_i)$ are respectively the mean and the variance of the i-th substation investment prediction index.
Step 1-2-2) Calculating the correlation coefficient matrix and the eigenvector group. The correlation coefficient matrix R is calculated from the matrix standardized in step 1-2-1):

$$R = \begin{pmatrix} r_{11} & r_{12} & \cdots & r_{1n} \\ r_{21} & r_{22} & \cdots & r_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ r_{n1} & r_{n2} & \cdots & r_{nn} \end{pmatrix}$$

where $r_{ij}$ is the correlation coefficient of variables $x_i$ and $x_j$. From the correlation coefficient matrix, the eigenvalues $\lambda_1, \lambda_2, \dots, \lambda_n$ and the corresponding orthonormal eigenvectors $\beta_1, \beta_2, \dots, \beta_n$ are calculated; the eigenvector group is $B = (\beta_1, \beta_2, \dots, \beta_n)$.
Step 1-2-3) Calculating the cumulative variance contribution rate, extracting the principal components and calculating the weights. After the eigenvectors are obtained, the first p principal components are extracted according to the magnitude of the cumulative variance contribution rate which, with the eigenvalues sorted in descending order, is:

$$\eta_p = \frac{\sum_{k=1}^{p} \lambda_k}{\sum_{k=1}^{n} \lambda_k}$$

The weight value of the extracted principal component can be calculated according to the following formula:
Table 1 below shows the key indexes and corresponding weights of the building engineering and electrical equipment parts screened out by principal component analysis.
Table 1 Principal components of the substation engineering investment prediction index system
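As an illustration of steps 1-2-1) to 1-2-3), the following is a minimal NumPy sketch: Z-score standardization, correlation matrix, eigen-decomposition, and the cumulative variance contribution rate. The function name `pca_on_correlation`, the 0.85 threshold and the random demonstration data are assumptions made for this example and do not come from the patent. The weight calculation is left out because its formula is not reproduced in this text.

```python
import numpy as np


def pca_on_correlation(X, threshold=0.85):
    """Eigen-analysis of the correlation matrix of an (m, n) data matrix.

    threshold is a hypothetical cut-off for the cumulative contribution rate.
    """
    X = np.asarray(X, dtype=float)
    # Step 1-2-1: Z-score standardization of every indicator (column).
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # Step 1-2-2: correlation coefficient matrix and its eigen-decomposition.
    R = np.corrcoef(Z, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(R)        # eigh: R is symmetric
    order = np.argsort(eigvals)[::-1]           # sort eigenvalues descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Step 1-2-3: cumulative variance contribution rate.
    cum_rate = np.cumsum(eigvals) / eigvals.sum()
    # Smallest p whose cumulative rate reaches the threshold.
    p = min(int(np.searchsorted(cum_rate, threshold)) + 1, len(eigvals))
    return eigvals, eigvecs, p, cum_rate


# Made-up data: 50 projects, 4 indicators, two of them strongly correlated.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(50, 4))
X_demo[:, 1] = 2.0 * X_demo[:, 0] + rng.normal(scale=0.1, size=50)
eigvals, eigvecs, p, cum_rate = pca_on_correlation(X_demo)
```

Because the eigenvalues of a correlation matrix sum to the number of indicators, `cum_rate` always ends at 1, which gives a quick sanity check on the decomposition.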
Step 1-3) Correcting the indexes through expert interviews to determine the final substation engineering project investment prediction index system. Expert interviews are conducted on the basis of the preliminary indexes in Table 1, and the indexes are corrected accordingly. The invention adds a category of basic substation characteristic indexes, comprising: rated voltage class, construction time, construction site, substation type, total building area of the station, construction nature, and whether the substation is an intelligent substation. In the electrical equipment part, the copper bar and flat steel consumption, which have little influence on investment, are removed; the number and capacity of main transformers are added, together with indexes that strongly influence the total investment, such as the high-voltage side power distribution type, the wiring pattern, the number of circuit breakers, and the bus PT bays, bridge bays, and main transformer incoming or outgoing line bays. In the building engineering part, the cable channels, which have little influence on the total investment, are deleted; indexes with a large influence on investment are added, such as the foundation treatment scheme, the building area of the main control (comprehensive) building, the steel consumption of steel structures and supports, and the foundation concrete quantity of the main transformer and of the incoming and outgoing lines. Finally, the substation engineering project investment prediction index system shown in fig. 4 is constructed.
Step 2) Collecting and processing the substation engineering project investment prediction index data to obtain a data structure suitable for modeling, as shown in fig. 5. Specifically, the implementation of step 2) includes the following steps:
Step 2-1) Feature name conversion. The Chinese names of the substation engineering project investment prediction indexes are converted into English letters and numbers to facilitate recognition and processing by a computer. The corresponding final substation engineering investment prediction index system is shown in fig. 6, and the flow chart of the investment prediction index data processing is shown in fig. 7.
Label codes are assigned to the substation engineering project investment prediction indexes as combinations of letters and numbers. The details are shown in Table 2 below.
Table 2 Conversion table of the substation engineering project investment prediction index system
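Purely as an illustration of letter-plus-number label codes, the sketch below assigns a code to each index name. The group prefixes `A` and `B`, the grouping itself and the helper `build_label_codes` are hypothetical and are not the codes of Table 2; only the index names are taken from the description above.

```python
def build_label_codes(groups):
    """Map every index name to a code made of a group letter and a running number."""
    codes = {}
    for prefix, names in groups.items():
        for k, name in enumerate(names, start=1):
            codes[name] = f"{prefix}{k}"
    return codes


# Hypothetical grouping; prefixes and order are assumptions for this example.
groups = {
    "A": ["rated voltage class", "substation type"],
    "B": ["number of main transformers", "capacity of main transformers"],
}
codes = build_label_codes(groups)
print(codes["substation type"])   # A2
```

Keeping the mapping in a single dictionary makes it easy to translate model outputs back into the original index names.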
Step 2-2) Missing value processing. Directly collected data usually contain many missing values, noise and other problems. In this step the missing values in the original data are therefore filled in by regression imputation, so that the data can be used more effectively. The specific method is as follows: the attribute with missing values is taken as the dependent variable and the other related attributes as independent variables; a regression model is established from the relation between them and used to predict the missing values, which completes the imputation.
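The regression imputation described above can be sketched as follows. This is a simplified example that assumes a linear regression fitted by ordinary least squares and assumes that the other attributes are complete; the function name `regression_impute` and the toy data are not from the patent.

```python
import numpy as np


def regression_impute(X, target_col):
    """Fill NaNs in one column using a linear regression on the other columns."""
    X = np.asarray(X, dtype=float).copy()
    missing = np.isnan(X[:, target_col])
    others = [c for c in range(X.shape[1]) if c != target_col]
    # Design matrix: intercept plus the other (assumed complete) attributes.
    A = np.column_stack([np.ones(X.shape[0]), X[:, others]])
    # Fit on the rows where the target attribute is observed.
    coef, *_ = np.linalg.lstsq(A[~missing], X[~missing, target_col], rcond=None)
    # Predict the missing entries from the fitted model.
    X[missing, target_col] = A[missing] @ coef
    return X


# Toy data in which column 1 equals 2 * column 0 + 1; one value is missing.
data = np.array([[1.0, 3.0], [2.0, 5.0], [3.0, np.nan], [4.0, 9.0]])
filled = regression_impute(data, target_col=1)
```

In this toy case the observed rows lie exactly on a line, so the imputed entry is approximately 7.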
Step 2-3) Identification and processing of abnormal values. Abnormal values, commonly referred to as "outliers", are typically caused by entry errors and the like. They are identified as follows: for continuous values, according to the 3σ principle, data lying more than 3 standard deviations from the mean are regarded as outliers; for discrete values, the data are clustered, and data that do not fall within any cluster are regarded as outliers.
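For continuous values, the 3σ rule can be sketched in a few lines of NumPy. The function name `three_sigma_mask` and the sample values are assumptions for this example; the clustering-based check for discrete values is not shown.

```python
import numpy as np


def three_sigma_mask(values):
    """Return a boolean mask that is True where a value lies more than
    3 sample standard deviations away from the mean."""
    values = np.asarray(values, dtype=float)
    mean = values.mean()
    std = values.std(ddof=1)
    return np.abs(values - mean) > 3 * std


# Twenty plausible values and one obvious entry error (made-up numbers).
values = np.append(np.linspace(9.0, 11.0, 20), 1000.0)
mask = three_sigma_mask(values)
cleaned = values[~mask]
```

Note that a single extreme value inflates both the mean and the standard deviation, so in very small samples the 3σ rule can fail to flag it; the rule is more reliable on data sets of the size used here.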
Step 3) Constructing a substation engineering project investment prediction model based on the XGBoost algorithm. Specifically, the implementation of step 3) includes the following steps:
Step 3-1) Dividing the training set and the test set. Before the XGBoost-based substation engineering project investment prediction model is constructed, the data set is divided into a training set and a test set at a ratio of 7:3. The data set obtained in step 2) is divided by means of the sklearn package in Python, as shown in Table 3.
Table 3 Data set division

| | Training set | Test set | Total |
| --- | --- | --- | --- |
| Quantity | 449 | 193 | 642 |
| Proportion | 70% | 30% | 100% |
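A 7:3 division with scikit-learn can be sketched as below. The random placeholder data, the five feature columns and the `random_state` value are assumptions for this example; only the sample count of 642 and the 7:3 ratio come from the text.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder data standing in for the processed index data (642 samples).
rng = np.random.default_rng(0)
X = rng.normal(size=(642, 5))      # 5 feature columns is an arbitrary choice
y = rng.normal(size=642)

# test_size=0.3 gives the 7:3 ratio; random_state fixes the shuffle.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)
print(len(X_train), len(X_test))   # 449 193
```

For 642 samples this yields 449 training samples and 193 test samples, because scikit-learn rounds the test share up and assigns the remainder to the training set.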
Step 3-2) K-fold partitioning of the training set. To eliminate the influence of the division mode and of the random order of the samples on the prediction result, the training set obtained in step 3-1) is further divided into K mutually exclusive subsets; each time, (K-1) subsets are selected as the training set and the remaining subset as the test set, as shown in fig. 8. The values should satisfy N/K > 3D, where N denotes the number of samples and D the number of features.
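The constraint N/K > 3D bounds the number of folds from above. The sketch below computes the largest admissible K and builds the folds with scikit-learn's `KFold`. The helper `max_folds` and the feature count D = 20 are hypothetical; the patent does not state which K was used.

```python
import numpy as np
from sklearn.model_selection import KFold


def max_folds(n_samples, n_features):
    """Largest K >= 2 with n_samples / K > 3 * n_features, else None."""
    # N / K > 3D  <=>  3D * K <= N - 1  <=>  K <= (N - 1) // (3D)
    k = (n_samples - 1) // (3 * n_features)
    return k if k >= 2 else None


N, D = 449, 20                       # D = 20 is a made-up feature count
K = max_folds(N, D)
X_train = np.zeros((N, D))           # placeholder for the training features

kf = KFold(n_splits=K, shuffle=True, random_state=0)
# Each split uses K-1 subsets for training and one subset for validation.
fold_sizes = [len(val_idx) for _, val_idx in kf.split(X_train)]
```

With these numbers K is 7, so every validation fold holds 64 or 65 samples, which is more than 3D = 60.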
Step 3-3) Constructing an investment prediction model based on the XGBoost algorithm. The tree ensemble model used in XGBoost is trained in an additive manner until a stop condition is met. The invention uses Python to construct the XGBoost-based substation engineering project investment prediction model; the Python code is given in Table 4 below. The construction of the model comprises the following 5 steps:
(1) determining a training set according to the selected sample division mode;
(2) taking the CART (classification and regression tree) as the base learner, and defining an objective function and a gain function;
(3) determining the tree $f_t$ added in the t-th iteration by calculating the optimal tree structure and the optimal split nodes;
(4) completing the iterative process and obtaining all trees produced by training;
(5) integrating all tree models in additive form to obtain the XGBoost prediction model.
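The additive training idea in steps (1) to (5) can be illustrated with a deliberately simplified gradient boosting loop. This sketch is not XGBoost and is not the patent's code: it uses scikit-learn regression trees and a plain squared-error loss, without XGBoost's regularized objective or gain-based split search. All names, data and settings are assumptions for this example.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor


def fit_additive_trees(X, y, n_rounds=50, learning_rate=0.1, max_depth=3):
    """Fit an additive ensemble of regression trees (squared-error loss)."""
    y = np.asarray(y, dtype=float)
    base = float(y.mean())                  # initial constant prediction
    pred = np.full(y.shape, base)
    trees = []
    for _ in range(n_rounds):
        residual = y - pred                 # negative gradient of squared loss
        tree = DecisionTreeRegressor(max_depth=max_depth, random_state=0)
        tree.fit(X, residual)               # step (3): tree f_t for round t
        pred += learning_rate * tree.predict(X)
        trees.append(tree)                  # step (4): keep every trained tree
    return base, learning_rate, trees


def predict_additive(model, X):
    """Step (5): sum the contributions of all trees."""
    base, learning_rate, trees = model
    out = np.full(len(X), base)
    for tree in trees:
        out += learning_rate * tree.predict(X)
    return out


# Made-up regression data for demonstration only.
rng = np.random.default_rng(0)
X_demo = rng.uniform(0, 10, size=(200, 3))
y_demo = 3.0 * X_demo[:, 0] + np.sin(X_demo[:, 1]) + rng.normal(0, 0.1, 200)
model = fit_additive_trees(X_demo, y_demo)
mse = float(np.mean((predict_additive(model, X_demo) - y_demo) ** 2))
```

Each round fits a new tree to what the current ensemble still gets wrong, which is the sense in which the final model is a sum of trees.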
TABLE 4 Python codes of substation project investment prediction model based on XGBoost algorithm
Step 3-4) Setting and tuning the model parameters. The parameters of the XGBoost-based substation engineering project investment prediction model are set in this step. The parameter settings are shown in Table 5, and the specific contents are as follows:
booster selects the model used in each iteration. Of the options gbtree and gblinear, gbtree performs better, so gbtree is selected as the iteration model.
learning_rate is the learning rate, in the range [0, 1]. The smaller the parameter, the slower the computation; the larger the parameter, the more likely the model fails to converge. The invention takes 0.005.
max_depth is the maximum depth of each tree, in the range [0, +∞). The larger the parameter, the more likely overfitting occurs, because the model learns more specific, local patterns from the samples. The invention takes 7.
n_estimators is the number of trees in XGBoost. The larger the number, the better the model performance, but beyond a certain point the improvement is limited and the speed of the algorithm suffers. The invention takes 5000.
colsample_bytree is the column sampling rate, i.e. the feature sampling rate, in the range (0, 1]. The features used to generate each tree are sampled by column, similar to random forests. The invention takes 1.
min_child_weight is the minimum sum of sample weights in each leaf, in the range [0, +∞). The larger the parameter, the more conservative the algorithm and the less likely overfitting is. The invention takes 11.
lambda is the L2 regularization parameter, used to control the regularization part of XGBoost, in the range [0, +∞). The larger the parameter, the less likely overfitting is. The invention takes 1.
gamma is the loss threshold, a parameter that controls the number of leaves: it specifies the minimum reduction of the loss function required for a node to split, in the range [0, +∞). The larger the parameter, the more conservative the algorithm and the less likely overfitting is. This embodiment takes 0.
Table 5 Parameter settings of the substation engineering project investment prediction model based on the XGBoost algorithm

| Parameter name | Value | Meaning |
| --- | --- | --- |
| booster | gbtree | Model used in each iteration |
| learning_rate | 0.005 | Learning rate |
| scale_pos_weight | 1 | Balance between positive and negative classes |
| max_depth | 7 | Maximum depth of each tree |
| n_estimators | 5000 | Number of decision trees in the model |
| colsample_bytree | 1 | Column sampling rate; proportion of columns randomly sampled for each tree |
| min_child_weight | 11 | Minimum sum of sample weights in a leaf |
| lambda | 1 | L2 regularization parameter controlling the regularization part of XGBoost |
| gamma | 0 | Loss threshold; controls the number of leaves |
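The settings of Table 5 could be passed to XGBoost's scikit-learn wrapper roughly as follows. This is a sketch under assumptions: the use of `XGBRegressor`, the `reg:squarederror` objective and the helper name `build_model` are not stated in the source. In the wrapper the L2 term is named `reg_lambda`, since `lambda` is a reserved word in Python.

```python
# Parameter values taken from Table 5; key names follow XGBoost's
# scikit-learn wrapper, where 'lambda' is spelled 'reg_lambda'.
XGB_PARAMS = {
    "booster": "gbtree",
    "learning_rate": 0.005,
    "scale_pos_weight": 1,
    "max_depth": 7,
    "n_estimators": 5000,
    "colsample_bytree": 1,
    "min_child_weight": 11,
    "reg_lambda": 1,
    "gamma": 0,
}


def build_model(params=None):
    """Create an untrained regressor; requires the xgboost package."""
    from xgboost import XGBRegressor   # imported lazily on purpose

    # The squared-error objective is an assumption for a cost regression task.
    return XGBRegressor(objective="reg:squarederror", **(params or XGB_PARAMS))
```

A typical use would be `build_model().fit(X_train, y_train)` followed by `predict(X_test)`, with one model per prediction target.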
Step 3-5) Evaluating the XGBoost investment prediction model on the test set. The separate test set is used to validate the model, which is evaluated by comparing the predictions it outputs with the actual results. The evaluation indexes of the XGBoost-based substation engineering project investment prediction model mainly comprise the goodness of fit (R²), the mean absolute error (MAE) and the mean absolute percentage error (MAPE).
The first evaluation index of the model is the goodness of fit (R²), whose formula is as follows:

$$R^2 = 1 - \frac{\sum_{i=1}^{M} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{M} (y_i - \bar{y})^2}$$

where $y_i$ is the actual value, $\hat{y}_i$ the predicted value, $\bar{y}$ the mean of the actual values and $M$ the number of test samples. The closer the value of R² is to 1, the better the model's predictions fit the actual observations; if R² reaches 0.8 or above, the accuracy of the model is considered acceptable. R² measures only the overall prediction accuracy of the model and does not reflect well the actual error of the predicted values. Therefore, the mean absolute error (MAE) and the mean absolute percentage error (MAPE) are also introduced to evaluate the error between the predicted and true values, as follows:

$$\mathrm{MAE} = \frac{1}{M} \sum_{i=1}^{M} \left| y_i - \hat{y}_i \right|$$

$$\mathrm{MAPE} = \frac{100\%}{M} \sum_{i=1}^{M} \left| \frac{y_i - \hat{y}_i}{y_i} \right|$$

The MAE ranges over [0, +∞); it equals 0 when the predicted values match the true values exactly, i.e. for a perfect model, and it grows as the error grows. MAPE is a variant of MAE expressed as a percentage, which makes it easier to interpret than other statistics. Its range is [0, +∞), and a MAPE of 0% indicates a perfect model.
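The three evaluation indexes can be computed with a few lines of NumPy using their standard definitions. The function name `evaluate` and the sample values are assumptions for this example.

```python
import numpy as np


def evaluate(y_true, y_pred):
    """Return MAE, MAPE (in percent) and R^2 for a set of predictions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_true - y_pred
    mae = float(np.mean(np.abs(err)))
    # MAPE is undefined when an actual value is zero.
    mape = float(100.0 * np.mean(np.abs(err / y_true)))
    ss_res = float(np.sum(err ** 2))
    ss_tot = float(np.sum((y_true - y_true.mean()) ** 2))
    return {"MAE": mae, "MAPE": mape, "R2": 1.0 - ss_res / ss_tot}


# Made-up actual and predicted values.
scores = evaluate([100, 200, 300, 400], [110, 190, 300, 420])
# MAE = 10.0, MAPE = 5.0 (percent), R2 = 0.988
```

Because MAPE divides by the actual value, it becomes very large when actual values are close to zero, which is worth checking before comparing MAPE across prediction targets.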
Table 6 below gives the values of these evaluation indexes, the goodness of fit (R²), the mean absolute error (MAE) and the mean absolute percentage error (MAPE), for the substation engineering project investment prediction model of the invention. According to the results, the goodness of fit is 0.81 for the building engineering fee, 0.85 for the electrical equipment fee, 0.90 for the installation engineering fee, 0.83 for the static investment and 0.82 for the dynamic investment. The goodness of fit of all prediction targets is above 0.8; the MAE and MAPE results likewise indicate that the deviation between the predicted and true values is small, that the overall prediction performance of the model is good, and that its reliability is high.
Table 6 Evaluation indexes of the substation engineering project investment prediction model based on the XGBoost algorithm

| Prediction target | MAE | MAPE | R² |
| --- | --- | --- | --- |
| Building engineering fee | 221.35 | 39694.43 | 0.81 |
| Electrical equipment fee | 379.93 | 40111.54 | 0.85 |
| Installation engineering fee | 81.08 | 36.12 | 0.90 |
| Static investment | 683.33 | 26.92 | 0.83 |
| Dynamic investment | 704.54 | 27.46 | 0.82 |
Step 4) Inputting substation engineering project data and determining the substation engineering project investment data. Specifically, the implementation of step 4) includes the following steps:
Step 4-1) Obtaining the objective state data of the indexes obtained in step 1). Based on the index system obtained in step 1), the objective state data of the indexes are determined from the objective conditions of the substation engineering project.
Step 4-2) Inputting the objective state data obtained in step 4-1) into the model obtained in step 3) to obtain the investment prediction data.
The technical features of the above embodiments may be combined arbitrarily. For brevity, not all possible combinations of these technical features are described; however, as long as a combination of technical features contains no contradiction, it should be considered within the scope of this description.
It should be noted that the terms "first/second/third" in the embodiments of the present application merely distinguish similar objects and do not denote a specific order of the objects. Where permitted, the specific order or sequence of "first/second/third" may be interchanged, so that the embodiments of the present application described herein can be implemented in sequences other than those illustrated or described herein.
The terms "comprising" and "having" and any variations thereof, in embodiments of the present application, are intended to cover non-exclusive inclusions. For example, a process, method, apparatus, article, or device that comprises a list of steps or modules is not limited to the particular steps or modules listed and may optionally include additional steps or modules not listed or inherent to such process, method, article, or device.
The above examples represent only a few embodiments of the present application. They are described in some detail, but are not to be construed as limiting the scope of the invention. It should be noted that those skilled in the art can make various modifications and improvements without departing from the spirit of the present application, all of which fall within the scope of protection of the present application. Accordingly, the scope of protection of the present application is determined by the appended claims.

Claims (3)

1. A method for determining investment prediction data of a substation engineering project, characterized by comprising the following steps:
S10, constructing a substation engineering project investment prediction index system;
S20, collecting and processing substation engineering project investment prediction index data;
S30, constructing a substation engineering project investment prediction model based on the XGBoost algorithm;
S40, inputting substation engineering project data into the substation engineering project investment prediction model, and determining the substation engineering project investment data according to the output of the substation engineering project investment prediction model;
wherein constructing the substation engineering project investment prediction model based on the XGBoost algorithm comprises the following steps:
S31, dividing the substation engineering project investment prediction index data into a training set and a test set at a ratio of 7:3;
S32, to eliminate the influence of the division mode and of the random order of the samples on the prediction result, further dividing the training set obtained in step S31 into K parts, where the values satisfy N/K > 3D, N denoting the number of samples and D the number of features;
S33, constructing an investment prediction model based on the XGBoost algorithm, training the tree ensemble model used in XGBoost in an additive manner, and stopping splitting when the depth threshold of a tree is reached, to obtain the substation engineering project investment prediction model;
S34, setting the parameters of the substation engineering project investment prediction model;
S35, inputting the test set into the substation engineering project investment prediction model for verification, evaluating the model by comparing its predictions with the actual results, and, after the evaluation is passed, determining the substation engineering project investment prediction model according to its operating parameters.
2. The method for determining investment prediction data of a substation engineering project according to claim 1, wherein constructing the substation engineering project investment prediction index system comprises:
S11, sorting out the composition and construction process of the substation engineering project, and preliminarily constructing a substation engineering project investment prediction index system;
S12, extracting the principal components of the substation engineering project investment prediction indexes from the index system by principal component analysis;
S13, correcting the principal components of the substation engineering project investment prediction indexes to determine the final substation engineering project investment prediction index system.
3. The method for determining investment prediction data of a substation engineering project according to claim 1, wherein collecting and processing the substation engineering project investment prediction index data comprises:
S21, converting the Chinese names of the substation engineering project investment prediction indexes into a set form so that a computer can identify them accurately;
S22, filling the missing values in the converted data by regression imputation;
S23, identifying abnormal values in the filled data and eliminating the identified abnormal values to obtain the substation engineering project investment prediction index data.
CN202011375669.0A 2020-11-30 2020-11-30 Method for determining investment prediction data of transformer substation engineering project Active CN112508254B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011375669.0A CN112508254B (en) 2020-11-30 2020-11-30 Method for determining investment prediction data of transformer substation engineering project

Publications (2)

Publication Number Publication Date
CN112508254A CN112508254A (en) 2021-03-16
CN112508254B true CN112508254B (en) 2024-03-29

Family

ID=74968086

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011375669.0A Active CN112508254B (en) 2020-11-30 2020-11-30 Method for determining investment prediction data of transformer substation engineering project

Country Status (1)

Country Link
CN (1) CN112508254B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113762791B (en) * 2021-09-13 2023-08-01 郑州铁路职业技术学院 Railway engineering cost management system
CN114037538B (en) * 2021-11-15 2023-11-03 国网湖北省电力有限公司经济技术研究院 Method and system for controlling investment balance of power grid infrastructure project

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110288142A (en) * 2019-06-18 2019-09-27 国网上海市电力公司 A kind of engineering based on XGBoost algorithm is exceeded the time limit prediction technique
CN110992113A (en) * 2019-12-23 2020-04-10 国网湖北省电力有限公司 Neural network intelligent algorithm-based project cost prediction method for capital construction transformer substation
AU2020101854A4 (en) * 2020-08-17 2020-09-24 China Communications Construction Co., Ltd. A method for predicting concrete durability based on data mining and artificial intelligence algorithm

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
基于熵权法-DEA模型的电网建设项目投资效率研究;俞越中;方向;张旺;沈小伟;;陕西电力(11);全文 *

Also Published As

Publication number Publication date
CN112508254A (en) 2021-03-16

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant