CN111612231A - Method and device for fusion processing of distribution network line re-jump models - Google Patents
- Publication number
- CN111612231A (application number CN202010402397.2A)
- Authority
- CN
- China
- Prior art keywords
- jump
- distribution network
- network line
- data
- models
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/04—Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2411—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/243—Classification techniques relating to the number of classes
- G06F18/24323—Tree-organised classifiers
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/25—Fusion techniques
- G06F18/254—Fusion techniques of classification results, e.g. of results related to same input data
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/06—Energy or water supply
-
- H—ELECTRICITY
- H02—GENERATION; CONVERSION OR DISTRIBUTION OF ELECTRIC POWER
- H02J—CIRCUIT ARRANGEMENTS OR SYSTEMS FOR SUPPLYING OR DISTRIBUTING ELECTRIC POWER; SYSTEMS FOR STORING ELECTRIC ENERGY
- H02J3/00—Circuit arrangements for ac mains or ac distribution networks
-
- H—ELECTRICITY
- H02—GENERATION; CONVERSION OR DISTRIBUTION OF ELECTRIC POWER
- H02J—CIRCUIT ARRANGEMENTS OR SYSTEMS FOR SUPPLYING OR DISTRIBUTING ELECTRIC POWER; SYSTEMS FOR STORING ELECTRIC ENERGY
- H02J2203/00—Indexing scheme relating to details of circuit arrangements for AC mains or AC distribution networks
- H02J2203/20—Simulating, e g planning, reliability check, modelling or computer assisted design [CAD]
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y04—INFORMATION OR COMMUNICATION TECHNOLOGIES HAVING AN IMPACT ON OTHER TECHNOLOGY AREAS
- Y04S—SYSTEMS INTEGRATING TECHNOLOGIES RELATED TO POWER NETWORK OPERATION, COMMUNICATION OR INFORMATION TECHNOLOGIES FOR IMPROVING THE ELECTRICAL POWER GENERATION, TRANSMISSION, DISTRIBUTION, MANAGEMENT OR USAGE, i.e. SMART GRIDS
- Y04S10/00—Systems supporting electrical power generation, transmission or distribution
- Y04S10/50—Systems or methods supporting the power network operation or management, involving a certain degree of interaction with the load-side end user applications
Abstract
The application relates to the technical field of power grid equipment manufacturing, and in particular to a method and a device for fusion processing of distribution network line re-jump models. The method comprises the following steps: acquiring basic data that influence repeated tripping of the distribution network line, and preprocessing the basic data; dividing the preprocessed basic data into a training set and a test set, and acquiring feature values of the training set and the test set; training a support vector machine-based re-jump model, a logistic regression-based re-jump model, and a decision tree-based re-jump model on the basis of the training set, the test set, and the feature values; taking the ROC curve and the KS value respectively as convergence conditions to obtain 2 optimal support vector machine re-jump models, 2 optimal logistic regression re-jump models, and 2 optimal decision tree re-jump models; combining the prediction results of the optimal re-jump models of the different algorithms to obtain 8 prediction results, each composed of elements of the category target set {Y, N}; and dividing the re-jump probability into four classes, including high, medium, and low, based on the number of Y in the prediction result.
Description
Technical Field
The application relates to the technical field of power grid equipment manufacturing, and in particular to a method and a device for fusion processing of distribution network line re-jump models.
Background
As the power supply area of the grid expands, lines acquire many branches and long supply radii, and equipment ages, the defensive capacity of the distribution network declines and the network becomes prone to tripping caused by various internal and external factors, so that distribution network lines trip repeatedly or even frequently. The distribution network has a complex line structure, many types of associated equipment, and wide supply coverage, and problems such as equipment aging faults cause frequent tripping events. Tripping of a distribution network line threatens the safety of the distribution network and of the users it serves, and brings various hidden hazards. Therefore, a sound line condition and a reasonable operating state are the basis for ensuring safe operation of the distribution network.
Because the load on the distribution network and the number of its lines keep increasing, tripping faults occur easily and may even become frequent, with a large impact on people's daily lives. The traditional approach is to inspect all distribution network lines manually for hidden hazards under a strict patrol inspection system, which requires knowing the condition of the line equipment and eliminating hidden hazards in time.
However, the conventional approach requires a great deal of manpower, does not always find faults effectively, and cannot effectively predict or prevent re-jumping of distribution network lines.
Disclosure of Invention
The application provides a method and a device for fusion processing of distribution network line re-jump models, which combine internal data and external data and use a fusion processing method to predict re-jumping of a distribution network line and obtain the probability that the line will re-jump.
The embodiment of the application is realized as follows:
A first aspect of the embodiments of the present application provides a method for fusion processing of distribution network line re-jump models, where the method includes:
acquiring basic data that influence repeated tripping of the distribution network line, and preprocessing the basic data;
dividing the preprocessed basic data into a training set and a test set, and acquiring feature values of the training set and the test set;
training a support vector machine-based re-jump model, a logistic regression-based re-jump model, and a decision tree-based re-jump model based on the training set, the test set, and the feature values;
taking the ROC curve and the KS value respectively as convergence conditions to obtain 2 optimal support vector machine re-jump models, 2 optimal logistic regression re-jump models, and 2 optimal decision tree re-jump models;
combining the prediction results of the optimal re-jump models of the different algorithms to obtain 8 different prediction results, where each prediction result is composed of elements of the category target set {Y, N};
and dividing the distribution network line re-jump probability into four classes, including high, medium, and low, based on the number of Y in the prediction result.
A second aspect of the embodiments of the present application provides a device for fusion processing of distribution network line re-jump models, which includes a memory, a processor, and a computer program stored on the memory, where the processor executes the computer program to perform the method according to any one of the implementations provided in the first aspect of the embodiments of the present application.
A third aspect of the embodiments of the present application provides a computer-readable storage medium storing computer instructions, at least part of which, when executed by a processor, implement the method according to any one of the implementations provided in the first aspect of the embodiments of the present application.
The beneficial effects of this application are as follows: collecting and preprocessing the data that influence distribution network re-jumping improves the accuracy, integrity, and consistency of the data; furthermore, a joint hypothesis testing algorithm is used for feature selection, and several algorithms are chosen as base classifiers to build multiple models for training, so that the probability of distribution network line re-jumping can be predicted; further, the optimal models that fit the data are selected by means of the ROC curve, the KS value, and the like, so that each data set corresponds to its own optimal model; furthermore, by combining the various prediction results, the probability that a distribution network line will re-jump in the future is estimated in advance, which helps address the line tripping problem, reduces the safety threat to the distribution network and to the users it serves, reduces hidden hazards, and allows the whole network to run stably.
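The role of the ROC curve and the KS value as model selection criteria can be illustrated with a minimal Python sketch. It is an illustration only, under stated assumptions: the function names `ks_statistic` and `roc_auc` are hypothetical, labels are encoded as 1 (re-jump, Y) and 0 (no re-jump, N), the KS value is taken as the largest gap between the true positive rate and the false positive rate over all score thresholds, and the ROC curve is summarized by its area (AUC). The source does not state how the two criteria are evaluated.

```python
def ks_statistic(y_true, scores):
    """KS value: largest gap between TPR and FPR over all score thresholds."""
    pos = sum(1 for y in y_true if y == 1)
    neg = len(y_true) - pos
    if pos == 0 or neg == 0:
        raise ValueError("both classes must be present")
    best = 0.0
    for t in sorted(set(scores), reverse=True):
        # Samples with score >= t are predicted positive at threshold t.
        tp = sum(1 for y, s in zip(y_true, scores) if s >= t and y == 1)
        fp = sum(1 for y, s in zip(y_true, scores) if s >= t and y == 0)
        best = max(best, abs(tp / pos - fp / neg))
    return best


def roc_auc(y_true, scores):
    """Area under the ROC curve via pairwise comparison (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    labels = [1, 0, 1, 0]
    model_scores = [0.9, 0.8, 0.3, 0.1]
    print("KS :", ks_statistic(labels, model_scores))
    print("AUC:", roc_auc(labels, model_scores))
```

Under these assumptions, a candidate model with a larger KS value or AUC on held-out data would be preferred, which yields one selected model per criterion for each algorithm.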
Drawings
In order to explain the technical solution of the present application more clearly, the drawings used in the embodiments are briefly described below. Those skilled in the art can obtain other drawings based on these drawings without creative effort.
FIG. 1 is a schematic diagram of a system for fusion processing of distribution network line re-jump models according to some embodiments of the present application;
FIG. 2 is a schematic diagram of an exemplary computing device shown in accordance with some embodiments of the present application;
FIG. 3 shows a schematic flow chart of a method for fusion processing of distribution network line re-jump models in an embodiment of the present application.
Detailed Description
Certain exemplary embodiments will now be described to provide an overall understanding of the principles of the structure, function, manufacture, and use of the devices and methods disclosed herein. One or more examples of these embodiments are illustrated in the accompanying drawings. Those of ordinary skill in the art will understand that the devices and methods specifically described herein and illustrated in the accompanying drawings are non-limiting exemplary embodiments and that the scope of the various embodiments of the present invention is defined solely by the claims. Features illustrated or described in connection with one exemplary embodiment may be combined with features of other embodiments. Such modifications and variations are intended to be included within the scope of the present invention.
Reference throughout this specification to "embodiments," "some embodiments," "one embodiment," or "an embodiment," or the like, means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment. Thus, appearances of the phrases "in various embodiments," "in some embodiments," "in at least one other embodiment," or "in an embodiment" or the like throughout this specification are not necessarily all referring to the same embodiment. Furthermore, the particular features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. Thus, the particular features, structures, or characteristics shown or described in connection with one embodiment may be combined, in whole or in part, with the features, structures, or characteristics of one or more other embodiments, without limitation. Such modifications and variations are intended to be included within the scope of the present invention.
FIG. 1 is a schematic diagram of a distribution network line re-jump model fusion processing system 100 according to some embodiments of the present application. The system 100 is a platform capable of automatically predicting the probability that a distribution network line will re-jump. The system 100 may include a server 110, at least one storage device 120, at least one network 130, and one or more data collection devices 150-1, 150-2. The server 110 may include a processing engine 112.
In some embodiments, the server 110 may be a single server or a group of servers. The server farm may be centralized or distributed (e.g., server 110 may be a distributed system). In some embodiments, the server 110 may be local or remote. For example, server 110 may access data stored in storage device 120 via network 130. Server 110 may be directly connected to storage device 120 to access the stored data. In some embodiments, the server 110 may be implemented on a cloud platform. The cloud platform may include a private cloud, a public cloud, a hybrid cloud, a community cloud, a distributed cloud, multiple clouds, the like, or any combination of the above. In some embodiments, server 110 may be implemented on a computing device as illustrated in FIG. 2 herein, including one or more components of computing device 200.
In some embodiments, the server 110 may include a processing engine 112. The processing engine 112 may process information and/or data related to a service request to perform one or more of the functions described herein. For example, the processing engine 112 may be configured to obtain the distribution network basic data transmitted by the data collection device 150 and send the data to the storage device 120 via the network 130 to update the data stored there. In some embodiments, the processing engine 112 may include one or more processors. The processing engine 112 may include one or more hardware processors, such as a central processing unit (CPU), an application-specific integrated circuit (ASIC), an application-specific instruction set processor (ASIP), a graphics processing unit (GPU), a physics processing unit (PPU), a digital signal processor (DSP), a field-programmable gate array (FPGA), a programmable logic device (PLD), a controller, a microcontroller unit, a reduced instruction set computer (RISC), a microprocessor, or the like, or any combination of the above.
In some embodiments, the storage device 120 may be connected to the network 130 to communicate with one or more components of the distribution network line re-jump model fusion processing system 100. One or more components of the system 100 may access data or instructions stored in the storage device 120 through the network 130. In some embodiments, the storage device 120 may be directly connected to or in communication with one or more components of the system 100. In some embodiments, the storage device 120 may be part of the server 110.
The network 130 may facilitate the exchange of information and/or data. In some embodiments, one or more components of the distribution network line re-jump model fusion processing system 100 may send information and/or data to other components of the system 100 over the network 130. For example, the server 110 may obtain the distribution network basic data from the data collection device 150 via the network 130. In some embodiments, the network 130 may be a wired network, a wireless network, or a combination thereof. In some embodiments, the network 130 may include one or more network access points. For example, the network 130 may include wired or wireless network access points, such as base stations and/or Internet exchange points 130-1, 130-2, and so forth. Through an access point, one or more components of the system 100 may be connected to the network 130 to exchange data and/or information.
The data collection device 150 may collect fault and defect data, complaint and repair data, equipment overload data, routing inspection data, meteorological monitoring data, and the like. In some embodiments, the data collection device 150 may send the collected distribution network basic data to one or more devices in the distribution network line re-jump model fusion processing system 100. For example, the data collection device 150 may send the distribution network basic data to the server 110 for processing, or store the data in the storage device 120.
FIG. 2 is a schematic diagram of an exemplary computing device 200 shown in accordance with some embodiments of the present application. The server 110, storage device 120, and data collection device 150 may be implemented on a computing device 200. For example, the processing engine 112 may be implemented on the computing device 200 and configured to implement the functionality disclosed herein.
For ease of understanding, only one processor is exemplarily depicted in fig. 2. However, it should be noted that the computing device 200 in the present application may include multiple processors, and thus the operations and/or methods described in the present application that are implemented by one processor may also be implemented by multiple processors, collectively or independently. For example, if in the present application a processor of computing device 200 performs steps 1 and 2, it should be understood that steps 1 and 2 may also be performed by two different processors of computing device 200, either collectively or independently.
FIG. 3 shows a schematic flow chart of a method for fusion processing of distribution network line re-jump models in an embodiment of the present application.
A traditional machine learning method searches a rich function space for a single classifier that is as close as possible to the actual classification function. The main idea of fusion learning is instead to fuse the classification results of several single classifiers when classifying a sample space, so as to obtain a better classification result than any single classifier. If a single classifier is regarded as one decision maker, fusion learning amounts to a group of decision makers jointly making a decision on an event.
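The "group of decision makers" idea, applied to the {Y, N} outputs of the three base models, can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the implementation of the application: the names `fuse` and `count_yes` are hypothetical, one Y/N output per algorithm is assumed, and the result is reported only as a tier index equal to the number of Y votes (0 to 3), because how the four tiers map to named classes is not spelled out in the text.

```python
from itertools import product

CLASSES = ("Y", "N")  # category target set {Y, N}


def count_yes(predictions):
    """Return how many base models predicted a re-jump ('Y')."""
    return sum(1 for p in predictions if p == "Y")


def fuse(svm_pred, lr_pred, tree_pred):
    """Combine three base-model outputs into a tier index 0..3.

    The tier index is the number of 'Y' votes; mapping the index to a
    named risk class is left to the caller (assumption of this sketch).
    """
    preds = (svm_pred, lr_pred, tree_pred)
    if any(p not in CLASSES for p in preds):
        raise ValueError("predictions must be 'Y' or 'N'")
    return count_yes(preds)


# All 2**3 = 8 possible combined prediction results.
ALL_COMBINATIONS = list(product(CLASSES, repeat=3))

if __name__ == "__main__":
    for combo in ALL_COMBINATIONS:
        print(combo, "->", fuse(*combo))
```

Three binary outputs give eight possible combinations, and counting the Y votes collapses them into four tiers, which matches the four-way division described for the method.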
In step 301, basic data that influence repeated tripping of the distribution network line are collected and preprocessed.
In some embodiments, the basic data specifically include fault and defect data, complaint and repair data, equipment overload data, routing inspection data, and meteorological monitoring data.
The distribution network refers to an electric power network which receives electric energy from a transmission network or a regional power plant and distributes the electric energy to various users on site through distribution facilities or step by step according to voltage. The distribution network is composed of overhead lines, cables, towers, distribution transformers, isolating switches, reactive power compensators, a plurality of accessory facilities and the like, and plays a role in distributing electric energy in the power network.
The fault data is from faults of various devices in the distribution network, such as overhead lines, cables, towers, distribution transformers, disconnectors, reactive power compensators, distribution network switchgear, and the like.
The complaint and repair data come from a complaint telephone information source, an internet information source, a radio station information source, a file information source, a mail information source and a notification information source, such as a customer service complaint hot line from a distribution network, an application client APP and the like.
Equipment overload data arise, for example, when substations belonging to two different sub-areas perform loop closing and power dispatching: depending on the system operating conditions and the grid parameters, an excessive loop-closing current may cause equipment overload, relay protection maloperation, short-circuit current exceeding the permitted level, or accident escalation through an electromagnetic loop network, thereby affecting grid safety.
Inspection data come, for example, from conventional insulator detection, in which the insulators in a substation are checked online by manual inspection; from inspection of power transmission and transformation equipment carried out with an inspection trolley; or from real-time insulator monitoring, in which sensors installed in the insulator mounting section are combined with sensor data acquisition and communication means.
In some embodiments, preprocessing the basic data includes one or a combination of: cleaning and denoising the basic data, supplementing missing data, smoothing noisy data, identifying and deleting outliers, and normalization.
During preprocessing, abnormal data, such as null data and unreasonable data, are replaced with second sample data acquired by other sensors at the same time. Normalization is a way of simplifying computation: a dimensional expression is converted into a dimensionless one and becomes a scalar, and it is widely used in calculations. Noisy data are smoothed; for example, an original image may be smoothed with a low-pass filter whose frequency response is matched to the modulation transfer function of the sensor, which improves the smoothing effect.
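The preprocessing operations described above can be sketched in a few lines of Python. The sketch rests on several assumptions that are not in the source: the helper names are hypothetical, missing readings are encoded as `None` or NaN, normalization is min-max scaling to [0, 1], and smoothing is a centered moving average over a one-dimensional series (a simple low-pass filter, standing in for the filter mentioned in the text).

```python
import math


def fill_missing(primary, secondary):
    """Replace missing readings with the value a second sensor
    recorded at the same time."""
    filled = []
    for a, b in zip(primary, secondary):
        missing = a is None or (isinstance(a, float) and math.isnan(a))
        filled.append(b if missing else a)
    return filled


def min_max_normalize(values):
    """Scale values to the dimensionless range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant series carries no scale
    return [(v - lo) / (hi - lo) for v in values]


def moving_average(values, window=3):
    """Centered moving average; the window shrinks at the edges."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        chunk = values[max(0, i - half): i + half + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


if __name__ == "__main__":
    load = fill_missing([1.0, None, 3.0], [9.0, 2.0, 9.0])
    print(load)
    print(min_max_normalize(load))
    print(moving_average(load))
```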
In step 302, the preprocessed basic data are divided into a training set and a test set, and feature values of the training set and the test set are obtained.
In some embodiments, the preprocessed basic data is divided into a training set and a test set, and the training set contains a preset number of samples for training the re-jump probability prediction model.
In the field of machine learning, it is generally necessary to divide the samples into three separate parts, namely a training set, a validation set, and a test set. The training set is used to fit the model, the validation set is used to determine the network structure or the parameters that control model complexity, and the test set is used to check the performance of the finally selected optimal model.
In some embodiments, the data may be divided such that the training set comprises 50% of the total samples and each of the other two sets comprises 25%, all three being drawn randomly from the samples. The re-jump probability prediction model is trained on the training set to obtain the optimal network model parameters with globally minimized error.
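The 50% / 25% / 25% random division can be sketched as follows. The function name `split_samples`, the fixed seed, and the rounding rule (fractions are truncated and any remainder goes to the test set) are assumptions made for illustration.

```python
import random


def split_samples(samples, train_frac=0.5, valid_frac=0.25, seed=0):
    """Randomly split samples into training, validation, and test sets."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    n_train = int(len(shuffled) * train_frac)
    n_valid = int(len(shuffled) * valid_frac)
    train = shuffled[:n_train]
    valid = shuffled[n_train:n_train + n_valid]
    test = shuffled[n_train + n_valid:]  # remainder
    return train, valid, test


if __name__ == "__main__":
    train, valid, test = split_samples(range(100))
    print(len(train), len(valid), len(test))
```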
In some embodiments, feature value selection is performed by employing a joint hypothesis testing algorithm.
Hypothesis testing, also known as statistical hypothesis testing, is a statistical inference method used to determine whether differences between samples, or between a sample and the population, are caused by sampling error or by intrinsic differences. The significance test is the most common method of hypothesis testing and the most basic form of statistical inference: an assumption is made about a characteristic of the population, and statistical reasoning on sampled data is then used to decide whether the assumption should be rejected or accepted. Commonly used hypothesis tests include the Z test, the t test, the chi-square test, the F test, and the like. The basic idea of hypothesis testing is the "small probability event" principle, and its method of inference is a proof by contradiction of a probabilistic nature.
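As one illustration of hypothesis-test-based feature screening, the sketch below applies the chi-square test named above to a single binary feature. It is not the joint hypothesis testing algorithm of the application, whose details are not given; the function names are hypothetical, the contingency table layout is an assumption, and all row and column totals are assumed to be nonzero. The critical value 3.841 corresponds to one degree of freedom at a significance level of 0.05.

```python
CRITICAL_0_05_DF1 = 3.841  # chi-square critical value, df = 1, alpha = 0.05


def chi_square_2x2(table):
    """Pearson chi-square statistic for a 2x2 contingency table.

    table = [[a, b], [c, d]]; rows: feature present / absent,
    columns: line re-jumped (Y) / did not re-jump (N).
    Assumes every row and column total is nonzero.
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat


def is_significant(table):
    """Reject independence of feature and label at the 0.05 level."""
    return chi_square_2x2(table) > CRITICAL_0_05_DF1


if __name__ == "__main__":
    print(chi_square_2x2([[30, 10], [10, 30]]))
    print(is_significant([[20, 20], [20, 20]]))
```

A feature whose table rejects independence would be kept as a candidate feature value; a feature that looks independent of the label would be dropped.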
In step 303, a support vector machine-based re-jump model, a logistic regression-based re-jump model, and a decision tree-based re-jump model are trained based on the training set, the test set, and the feature values.
An SVM (Support Vector Machine) is a generalized linear classifier that performs binary classification of data in a supervised learning manner; its decision boundary is the maximum-margin hyperplane solved for the learning samples. The SVM uses a hinge loss function to calculate the empirical risk and adds a regularization term to the solving system to optimize the structural risk, making it a sparse and robust classifier. The SVM can also perform nonlinear classification by means of a kernel method.
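As a non-limiting illustration, the hinge-loss empirical risk plus a regularization term mentioned above might be evaluated for a linear SVM as follows. The function name `svm_objective`, the L2 form of the regularizer, and the default regularization strength are assumptions made for this example; the kernel method is not shown.

```python
def svm_objective(weights, bias, samples, labels, reg=0.1):
    """Regularized hinge loss of a linear SVM; labels must be +1 or -1.

    Hypothetical helper: the mean hinge loss is the empirical risk and
    0.5 * reg * ||w||^2 is the assumed regularization term.
    """
    hinge_total = 0.0
    for x, y in zip(samples, labels):
        score = sum(w * xi for w, xi in zip(weights, x)) + bias
        # Samples classified correctly with margin >= 1 contribute zero loss.
        hinge_total += max(0.0, 1.0 - y * score)
    empirical_risk = hinge_total / len(samples)
    regularizer = 0.5 * reg * sum(w * w for w in weights)
    return empirical_risk + regularizer
```

Only samples on or inside the margin contribute to the loss, which is the source of the sparsity noted above.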
LR (logistic regression) is a generalized linear regression. The dependent variable of LR may be binary or multi-class; the binary case is more common and easier to interpret, and the multi-class case can be handled with the softmax method. Binary logistic regression is the most common in practice. LR can be used for identifying risk factors, for prediction, and for discrimination.
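As a non-limiting illustration, binary logistic regression prediction and the softmax function used for the multi-class case might be sketched as follows. The function names, the 0.5 decision threshold, and the use of the labels Y and N as outputs are assumptions made for this example.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_rejump(weights, bias, features, threshold=0.5):
    """Return ("Y" or "N", probability) for one sample (hypothetical helper)."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    p = sigmoid(z)
    return ("Y" if p >= threshold else "N"), p


def softmax(scores):
    """Convert multi-class scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the maximum to avoid overflow in exp()
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]
```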
In some embodiments, a certain proportion of the preprocessed basic data is extracted as the training set of the first-level decision tree of the neural network model, and the remaining data is used as its test set. For example, 70% of the data may be extracted as the training set of the first-level decision tree of the random forest classifier, and the remaining 30% used as its test set. In some embodiments, the first-level decision tree of the random forest classifier may be constructed by bootstrapping, that is, sampling with replacement n times from the training set to form n training sample sets, so that some samples may be drawn multiple times while others may not be drawn at all.
A decision tree involves the following common quantities: information entropy and information gain. Information entropy measures the amount of information: the larger the amount of information, the larger the corresponding entropy, and the smaller the amount of information, the smaller the corresponding entropy. As the decision tree grows from the root node to the final leaf nodes, the information entropy decreases, and the decrease at each step is called the information gain.
In some embodiments, the decision tree may be pruned to prevent overfitting of the subsequent random forest model. A complete decision tree is not the best tree for classifying and predicting new data objects, because it fits the training data too closely: as the tree grows, the number of samples handled at each branch keeps decreasing, and the tree becomes less and less representative of the data as a whole. When the root node is split, all samples are processed; at deeper splits, only the samples in the respective groups are processed. As the tree grows and the sample counts shrink, the data characteristics captured by deeper nodes become increasingly specific to individual samples; the resulting loss of general representativeness, which prevents the tree from being applied to the classification and prediction of new data, is called overfitting. It is therefore necessary to handle this with pruning techniques, which include pre-pruning and post-pruning.
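As a non-limiting illustration of the decision tree quantities described above, information entropy and the information gain of a candidate split might be computed as follows. The function names and the use of the labels Y and N are assumptions made for this example; an actual CART implementation may use a different impurity measure.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())


def information_gain(parent, children):
    """Entropy of the parent minus the size-weighted entropy of the children."""
    total = len(parent)
    weighted = sum(len(child) / total * entropy(child) for child in children)
    return entropy(parent) - weighted


parent = ["Y", "Y", "N", "N"]
perfect_split = [["Y", "Y"], ["N", "N"]]
useless_split = [["Y", "N"], ["Y", "N"]]
```

A split that separates the classes completely removes all of the parent's entropy, while a split that leaves each child as mixed as the parent yields no gain.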
Pre-pruning means that, while the decision tree is being constructed, each node is evaluated before it is split; if splitting the current node would not improve the generalization performance of the decision tree model, the node is not split and is marked as a leaf node. Compared with an unpruned decision tree, a pre-pruned decision tree leaves many branches unexpanded, which reduces the risk of overfitting and significantly reduces both the training time overhead and the testing time overhead of the decision tree. On the other hand, although the current split may not improve generalization performance, subsequent splits built on it might, so pre-pruning carries a risk of underfitting.
Post-pruning means that the whole decision tree is constructed first, and the non-leaf nodes are then examined from bottom to top; if replacing the subtree rooted at a node with a leaf node improves the generalization performance, the subtree is replaced with a leaf node.
Comparing pre-pruning with post-pruning, a post-pruned decision tree generally retains more branches than a pre-pruned one; in general, it has a lower risk of underfitting and better generalization performance. However, post-pruning is performed after the complete decision tree has been constructed and examines all non-leaf nodes one by one from bottom to top, so its training time overhead is larger than that of an unpruned or pre-pruned decision tree.
In step 304, the ROC curve and the KS value are used as convergence conditions to obtain 2 optimal re-jump models of the support vector machine, 2 optimal re-jump models of logistic regression, and 2 optimal re-jump models of the decision tree.
Three algorithms, namely the Support Vector Machine (SVM), Logistic Regression (LR), and the decision tree (CART), are selected as base classifiers. The original data are divided into a test set and a training set according to a proportion, and the two sets are used as input for model training on the selected features, thereby completing the probability prediction of distribution network line re-jump. The optimal model for the data is judged according to the ROC curve, the KS value, and the like; that is, each data set corresponds to one optimal model.
The ROC (Receiver Operating Characteristic) curve is also called the sensitivity curve: every point on the curve reflects the same sensitivity, being a response to the same signal stimulus obtained under a different decision criterion. The curve is plotted with the false alarm probability (false positive rate) on the horizontal axis and the hit probability (true positive rate) on the vertical axis, using the results obtained under different decision criteria for a given stimulus condition.
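As a non-limiting illustration, the points of an ROC curve and the area under it might be computed from predicted scores as follows. The function names `roc_points` and `auc`, the encoding of a re-jump as label 1, and the use of the area under the curve as the comparison value are assumptions made for this example.

```python
def roc_points(labels, scores):
    """Return (false_positive_rate, true_positive_rate) pairs, one per threshold.

    Assumed encoding: label 1 means re-jump (Y), label 0 means no re-jump (N).
    """
    positives = sum(1 for y in labels if y == 1)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both classes must be present")
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Samples with equal scores cross the threshold together.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area
```

Each decision threshold yields one point, so lowering the threshold traces the curve from (0, 0) to (1, 1).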
KS (Kolmogorov-Smirnov) is a non-parametric statistical test for continuous distributions. The one-sample test is often used to check whether a single sample follows a known distribution: the cumulative frequency distribution of the sample data is compared with a particular theoretical distribution, and if the difference between the two is small, the sample is inferred to come from that distribution family. The two-sample KS test compares the cumulative distributions of two data sets. The KS test operates directly on the n observed values of the original data, so it makes fuller use of the data, and it is mainly used for continuous, quantitative data with a unit of measurement. The KS test is robust, does not depend on the location of the mean, is insensitive to the scale of the data, and has a wide range of application.
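As a non-limiting illustration of the two-sample comparison described above, a KS value might be computed as the largest gap between the cumulative score distributions of the re-jump and no re-jump samples. The function name `ks_value` and the encoding of a re-jump as label 1 are assumptions made for this example; the application does not specify how its KS value is calculated.

```python
import bisect


def ks_value(labels, scores):
    """Maximum gap between the empirical CDFs of scores for the two classes.

    Assumed encoding: label 1 means re-jump (Y), label 0 means no re-jump (N).
    """
    pos = sorted(s for s, y in zip(scores, labels) if y == 1)
    neg = sorted(s for s, y in zip(scores, labels) if y == 0)
    if not pos or not neg:
        raise ValueError("both classes must be present")
    best = 0.0
    for t in sorted(set(scores)):
        # Fraction of each class with a score less than or equal to t.
        cdf_pos = bisect.bisect_right(pos, t) / len(pos)
        cdf_neg = bisect.bisect_right(neg, t) / len(neg)
        best = max(best, abs(cdf_pos - cdf_neg))
    return best
```

A value near 1 indicates that the model's scores separate the two classes well, while a value near 0 indicates that the two score distributions are nearly identical.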
When the ROC curve is used as the convergence condition, 1 optimal re-jump model of the support vector machine, 1 optimal re-jump model of logistic regression, and 1 optimal re-jump model of the decision tree can be obtained.
When the KS value is used as the convergence condition, 1 optimal re-jump model of the support vector machine, 1 optimal re-jump model of logistic regression, and 1 optimal re-jump model of the decision tree can likewise be obtained.
In the distribution network line re-jump prediction provided by the application, the final prediction result must take into account the prediction results of all 3 algorithm models; when each of the 3 algorithm models has 2 optimal models, the various combinations of their recorded predictions need to be considered.
In some embodiments, in the prediction result, Y represents that the distribution network line is re-jumped, and N represents that the distribution network line is not re-jumped.
For example, if the output result of the support vector machine optimal re-jump model is Y, the output result of the logistic regression optimal re-jump model is N, and the output result of the decision tree optimal re-jump model is Y, the final prediction result is represented as { Y, N, Y }.
In step 305, the optimal re-hopping model prediction results of different algorithms are combined to obtain 8 different prediction results, and the prediction results are composed of elements in the category target set { Y, N }.
The results predicted by each model from the category target set { Y, N } are combined, and there are 8 possible combinations in total: { Y, Y, Y }, { Y, Y, N }, { Y, N, Y }, { N, Y, Y }, { Y, N, N }, { N, Y, N }, { N, N, Y }, and { N, N, N }.
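As a non-limiting illustration, the 8 combinations can be enumerated as the Cartesian product of the category target set { Y, N } over the three models. The tuple ordering (support vector machine, logistic regression, decision tree) and the variable names are assumptions made for this example.

```python
from itertools import product

# Assumed position of each model's output within a combined prediction result.
MODEL_ORDER = ("svm", "logistic_regression", "decision_tree")

# Every way the three models can each output Y or N: 2 ** 3 = 8 results.
combinations = list(product("YN", repeat=len(MODEL_ORDER)))

for combo in combinations:
    print("{ " + ", ".join(combo) + " }")
```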
In step 306, the distribution network line re-jump probability is divided into four types, namely high, relatively high, medium, and low, based on the number of Y in the prediction result.
In some embodiments, a prediction result containing 3 Y is defined as a high distribution network line re-jump probability; a result containing 2 Y is defined as a relatively high re-jump probability; a result containing 1 Y is defined as a medium re-jump probability; and a result containing 0 Y is defined as a low re-jump probability.
In this way, every one of the 8 combined prediction results is assigned to exactly one of the four types according to the number of Y it contains.
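As a non-limiting illustration, the mapping from the number of Y in a combined prediction result to the four probability types might be sketched as follows. The dictionary `LEVELS`, the function name `rejump_level`, and the English level names are assumptions made for this example.

```python
# Number of Y in the combined result -> re-jump probability type.
LEVELS = {3: "high", 2: "relatively high", 1: "medium", 0: "low"}


def rejump_level(predictions):
    """Classify a combined result such as ("Y", "N", "Y") by counting Y."""
    if len(predictions) != 3 or any(p not in ("Y", "N") for p in predictions):
        raise ValueError("expected three predictions, each 'Y' or 'N'")
    return LEVELS[list(predictions).count("Y")]


level = rejump_level(("Y", "N", "Y"))
```

Because only the count of Y matters, results such as { Y, Y, N } and { N, Y, Y } fall into the same type.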
An embodiment of the application further provides a device for fusion processing of distribution network line re-jump models, which comprises a memory, a processor, and a computer program stored on the memory, wherein the processor, when executing the computer program, performs the method for fusion processing of distribution network line re-jump models according to the embodiments of the application.
An embodiment of the application further provides a computer-readable storage medium storing computer instructions, wherein, when at least part of the computer instructions are executed by a processor, the method for fusion processing of distribution network line re-jump models according to the application is implemented.
The method has the following advantages. By acquiring and preprocessing the data that influence distribution network line re-jump, the accuracy, integrity, and consistency of the data can be improved. Furthermore, a joint hypothesis testing algorithm is used for feature selection, and multiple algorithms are selected as base classifiers to construct multiple models for training, so that probability prediction of distribution network line re-jump can be realized. Furthermore, the optimal models for the data are judged by means of the ROC curve, the KS value, and the like, so that each data set corresponds to one optimal model. Furthermore, by combining the various prediction results, the probability that a distribution network line will re-jump in the future is estimated in advance, which addresses the problem of distribution network line tripping, reduces the safety threat to the distribution network and to the users it serves, reduces hidden dangers in all respects, and allows the whole network to run stably.
Moreover, those skilled in the art will appreciate that aspects of the present application may be illustrated and described in terms of several patentable species or situations, including any new and useful combination of processes, machines, manufacture, or materials, or any new and useful improvement thereon. Accordingly, various aspects of the present application may be embodied entirely in hardware, entirely in software (including firmware, resident software, micro-code, etc.) or in a combination of hardware and software. The above hardware or software may be referred to as "data blocks," "modules," "engines," "units," "components," or "systems." Furthermore, aspects of the present application may be represented as a computer product, including computer readable program code, embodied in one or more computer readable media.
The computer storage medium may comprise a propagated data signal with the computer program code embodied therewith, for example, on baseband or as part of a carrier wave. The propagated signal may take any of a variety of forms, including electromagnetic, optical, etc., or any suitable combination. A computer storage medium may be any computer-readable medium that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code located on a computer storage medium may be propagated over any suitable medium, including radio, cable, fiber optic cable, RF, or the like, or any combination of the preceding.
Computer program code required for the operation of various portions of the present application may be written in any one or more programming languages, including an object oriented programming language such as Java, Scala, Smalltalk, Eiffel, JADE, Emerald, C++, C#, VB.NET, Python, and the like, a conventional programming language such as C, Visual Basic, Fortran 2003, Perl, COBOL 2002, PHP, ABAP, a dynamic programming language such as Python, Ruby, and Groovy, or other programming languages, and the like. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the latter scenario, the remote computer may be connected to the user's computer through any form of network, such as a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet), or in a cloud computing environment, or as a service, such as software as a service (SaaS).
Additionally, the order in which elements and sequences of the processes described herein are processed, the use of alphanumeric characters, or the use of other designations, is not intended to limit the order of the processes and methods described herein, unless explicitly claimed. While various presently contemplated embodiments of the invention have been discussed in the foregoing disclosure by way of example, it is to be understood that such detail is solely for that purpose and that the appended claims are not limited to the disclosed embodiments, but, on the contrary, are intended to cover all modifications and equivalent arrangements that are within the spirit and scope of the embodiments herein. For example, although the system components described above may be implemented by hardware devices, they may also be implemented by software-only solutions, such as installing the described system on an existing server or mobile device.
Similarly, it should be noted that in the preceding description of embodiments of the application, various features are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure and aiding in the understanding of one or more of the embodiments. This method of disclosure, however, is not intended to require more features than are expressly recited in the claims. Indeed, the embodiments may be characterized as having fewer than all of the features of a single embodiment disclosed above.
The entire contents of each patent, patent application publication, and other material cited in this application, such as articles, books, specifications, publications, documents, and the like, are hereby incorporated by reference into this application, except for application history documents that are inconsistent with or conflict with the contents of this application, and except for documents (whether present or later appended to this application) that limit the broadest scope of the claims of this application. It is noted that if the descriptions, definitions, and/or use of terms in the material attached to this application are inconsistent with or contrary to those of this application, the descriptions, definitions, and/or use of terms in this application shall control.
Claims (9)
1. A method for fusion processing of distribution network line re-jump models, characterized by comprising the following steps:
acquiring basic data influencing the repeated tripping of the distribution network line, and preprocessing the basic data;
dividing the preprocessed basic data into a training set and a test set, and acquiring characteristic values of the training set and the test set;
training a support vector machine-based re-jump model, training a logistic regression-based re-jump model and training a decision tree-based re-jump model based on the training set, the test set and the characteristic values;
respectively taking the ROC curve and the KS value as convergence conditions to obtain 2 optimal re-jump models of the support vector machine, 2 optimal re-jump models of logistic regression and 2 optimal re-jump models of a decision tree;
combining the optimal re-jump model prediction results of different algorithms to obtain 8 different prediction results, wherein the prediction results are composed of elements in a category target set { Y, N };
and dividing the distribution network line re-jump probability into four types, namely high, relatively high, medium and low, based on the number of Y in the prediction result.
2. The method for fusion processing of the distribution network line re-jump models as claimed in claim 1, wherein the basic data specifically includes: fault defect data, complaint and warranty data, equipment overload data, patrol inspection data and the like, and meteorological monitoring data.
3. The method for fusion processing of the distribution network line re-jump models as claimed in claim 1, wherein the preprocessing comprises: cleaning and denoising the basic data, filling in missing data, smoothing noisy data, identifying and deleting outliers, and normalizing the data.
4. The method for fusion processing of the distribution network line re-hop models according to claim 1, wherein the characteristic values are obtained by selecting characteristics by using a joint hypothesis testing algorithm.
5. The method for fusion processing of the distribution network line re-hopping models as claimed in claim 1, wherein the 8 different prediction results are specifically:
{Y, Y, Y}, {Y, Y, N}, {Y, N, Y}, {N, Y, Y}, {Y, N, N}, {N, Y, N}, {N, N, Y}, {N, N, N}.
6. The method for fusion processing of distribution network line re-jump models according to claim 1, wherein in the prediction result, Y represents a re-jump of the distribution network line, and N represents no re-jump of the distribution network line.
7. The method for fusion processing of distribution network line re-jump models according to claim 1, wherein the distribution network line re-jump probability is divided into four types, namely high, relatively high, medium and low, based on the number of Y in the prediction result, and specifically comprises:
a prediction result containing 3 Y is defined as a high distribution network line re-jump probability; a result containing 2 Y is defined as a relatively high re-jump probability; a result containing 1 Y is defined as a medium re-jump probability; and a result containing 0 Y is defined as a low re-jump probability.
8. A distribution network line re-jump model fusion processing device is characterized by comprising a memory, a processor and a computer program stored on the memory, wherein the processor executes the computer program to execute the distribution network line re-jump model fusion processing method according to any one of claims 1 to 7.
9. A computer-readable storage medium, wherein the computer-readable storage medium stores computer instructions, and when at least part of the computer instructions are executed by a processor, the method for merging the network line re-hop models in accordance with any one of claims 1 to 7 is implemented.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010402397.2A CN111612231B (en) | 2020-05-13 | 2020-05-13 | Distribution network line re-jump model fusion processing method and device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111612231A | 2020-09-01 |
CN111612231B | 2023-09-01 |
Family
ID=72200296
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010402397.2A Active CN111612231B (en) | 2020-05-13 | 2020-05-13 | Distribution network line re-jump model fusion processing method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111612231B (en) |
Also Published As
Publication number | Publication date |
---|---|
CN111612231B (en) | 2023-09-01 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||