CN111612231A - Method and device for fusion processing of distribution network line re-jump models - Google Patents

Publication number
CN111612231A
Authority
CN
China
Prior art keywords
jump
distribution network
network line
data
models
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010402397.2A
Other languages
Chinese (zh)
Other versions
CN111612231B (en
Inventor
聂鼎
宋忧乐
范黎涛
王洪林
骆怡
林广宏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Electric Power Research Institute of Yunnan Power Grid Co Ltd
Original Assignee
Electric Power Research Institute of Yunnan Power Grid Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Electric Power Research Institute of Yunnan Power Grid Co Ltd
Priority to CN202010402397.2A
Publication of CN111612231A
Application granted
Publication of CN111612231B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2411 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/243 Classification techniques relating to the number of classes
    • G06F18/24323 Tree-organised classifiers
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G06F18/254 Fusion techniques of classification results, e.g. of results related to same input data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/06 Energy or water supply
    • H ELECTRICITY
    • H02 GENERATION; CONVERSION OR DISTRIBUTION OF ELECTRIC POWER
    • H02J CIRCUIT ARRANGEMENTS OR SYSTEMS FOR SUPPLYING OR DISTRIBUTING ELECTRIC POWER; SYSTEMS FOR STORING ELECTRIC ENERGY
    • H02J3/00 Circuit arrangements for ac mains or ac distribution networks
    • H ELECTRICITY
    • H02 GENERATION; CONVERSION OR DISTRIBUTION OF ELECTRIC POWER
    • H02J CIRCUIT ARRANGEMENTS OR SYSTEMS FOR SUPPLYING OR DISTRIBUTING ELECTRIC POWER; SYSTEMS FOR STORING ELECTRIC ENERGY
    • H02J2203/00 Indexing scheme relating to details of circuit arrangements for AC mains or AC distribution networks
    • H02J2203/20 Simulating, e g planning, reliability check, modelling or computer assisted design [CAD]
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y04 INFORMATION OR COMMUNICATION TECHNOLOGIES HAVING AN IMPACT ON OTHER TECHNOLOGY AREAS
    • Y04S SYSTEMS INTEGRATING TECHNOLOGIES RELATED TO POWER NETWORK OPERATION, COMMUNICATION OR INFORMATION TECHNOLOGIES FOR IMPROVING THE ELECTRICAL POWER GENERATION, TRANSMISSION, DISTRIBUTION, MANAGEMENT OR USAGE, i.e. SMART GRIDS
    • Y04S10/00 Systems supporting electrical power generation, transmission or distribution
    • Y04S10/50 Systems or methods supporting the power network operation or management, involving a certain degree of interaction with the load-side end user applications

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Economics (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Strategic Management (AREA)
  • Human Resources & Organizations (AREA)
  • Marketing (AREA)
  • Health & Medical Sciences (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Power Engineering (AREA)
  • Public Health (AREA)
  • Water Supply & Treatment (AREA)
  • Primary Health Care (AREA)
  • General Health & Medical Sciences (AREA)
  • Quality & Reliability (AREA)
  • Operations Research (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Game Theory and Decision Science (AREA)
  • Development Economics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The application relates to the technical field of power grid equipment manufacturing, and in particular to a method and a device for fusion processing of distribution network line re-jump (repeated tripping) models. The method comprises the following steps: acquiring basic data that influence re-jumps of the distribution network line, and preprocessing the basic data; dividing the preprocessed basic data into a training set and a test set, and obtaining feature values of the training set and the test set; training a support vector machine-based re-jump model, a logistic regression-based re-jump model, and a decision tree-based re-jump model on the basis of the training set, the test set, and the feature values; taking the ROC curve and the KS value respectively as convergence conditions to obtain 2 optimal support vector machine re-jump models, 2 optimal logistic regression re-jump models, and 2 optimal decision tree re-jump models; combining the prediction results of the optimal re-jump models of the different algorithms to obtain 8 prediction results, each composed of elements of the category target set {Y, N}; and dividing the re-jump probability into four categories, including high, medium, and low, based on the number of Y in the prediction result.

Description

Method and device for fusion processing of distribution network line re-jump models
Technical Field
The application relates to the technical field of power grid equipment manufacturing, and in particular to a method and a device for fusion processing of distribution network line re-jump models.
Background
As the power supply area of the grid expands, lines acquire many branches, supply radii grow long, and equipment ages. The defensive capacity of the distribution network declines, the network becomes prone to tripping caused by various internal and external factors, and distribution network lines trip repeatedly or even frequently. The distribution network line structure is complex, involves many types of equipment, and covers a wide supply area, and problems such as equipment aging faults cause frequent tripping events. Tripping of a distribution network line threatens the safety of the distribution network and the users it serves, and introduces various hidden dangers. Therefore, a good and reasonable line operating state is the basis for the safe operation of the distribution network.
Because the load on the distribution network and the number of distribution network lines keep increasing, tripping faults occur easily and may even become frequent, with a large impact on people's lives. The traditional approach is to inspect all distribution network lines manually for hidden troubles under a strict patrol inspection system, which requires knowing the condition of the line equipment and eliminating hidden troubles in time.
However, the traditional approach requires a great deal of manpower, does not always find faults effectively, and cannot effectively predict or prevent re-jumps of distribution network lines.
Disclosure of Invention
The application provides a method and a device for fusion processing of distribution network line re-jump models, which combine internal and external data and use a fusion processing method to predict re-jumps of a distribution network line and obtain the probability that the line will re-jump.
The embodiments of the application are realized as follows:
A first aspect of the embodiments of the present application provides a method for fusion processing of distribution network line re-jump models, the method comprising:
acquiring basic data that influence re-jumps of the distribution network line, and preprocessing the basic data;
dividing the preprocessed basic data into a training set and a test set, and obtaining feature values of the training set and the test set;
training a support vector machine-based re-jump model, a logistic regression-based re-jump model, and a decision tree-based re-jump model based on the training set, the test set, and the feature values;
taking the ROC curve and the KS value respectively as convergence conditions to obtain 2 optimal support vector machine re-jump models, 2 optimal logistic regression re-jump models, and 2 optimal decision tree re-jump models;
combining the prediction results of the optimal re-jump models of the different algorithms to obtain 8 different prediction results, each composed of elements of the category target set {Y, N};
and dividing the distribution network line re-jump probability into four categories, including high, medium, and low, based on the number of Y in the prediction result.
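The last two steps can be illustrated with a minimal sketch. It assumes one reading of the text: each combined result is a triple of Y/N labels from the three model families (SVM, logistic regression, decision tree), which gives 2^3 = 8 possible results and a Y count of 0 to 3, hence four categories. The function names and the numeric level index are hypothetical, since the text does not name all four categories.

```python
from itertools import product

CLASSES = ("Y", "N")  # category target set from the text


def count_y(prediction):
    """Number of base models that predict a re-jump (Y)."""
    return sum(1 for label in prediction if label == "Y")


def risk_level(prediction):
    """Map a three-model prediction to a level from 0 (no Y) to 3 (all Y).

    The numeric levels are an assumption; the text only states that the
    four categories depend on the number of Y.
    """
    if len(prediction) != 3 or any(p not in CLASSES for p in prediction):
        raise ValueError("expected three labels drawn from {Y, N}")
    return count_y(prediction)


# All 2**3 = 8 possible combined results for (SVM, LR, decision tree).
ALL_RESULTS = list(product(CLASSES, repeat=3))
LEVEL_OF = {result: risk_level(result) for result in ALL_RESULTS}
```

For example, `risk_level(("Y", "N", "Y"))` counts two Y votes and returns level 2.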
A second aspect of the embodiments of the present application provides a device for fusion processing of distribution network line re-jump models, comprising a memory, a processor, and a computer program stored on the memory, wherein the processor executes the computer program to perform the method of any implementation provided in the first aspect of the embodiments of the present application.
A third aspect of the embodiments of the present application provides a computer-readable storage medium storing computer instructions, at least part of which, when executed by a processor, implement the method of any implementation provided in the first aspect of the embodiments of the present application.
The beneficial effects of this application are as follows: collecting and preprocessing the data that influence distribution network re-jumps improves the accuracy, integrity, and consistency of the data; further, a joint hypothesis testing algorithm is used for feature selection, and multiple algorithms are selected as base classifiers to build and train multiple models, so that the probability of a distribution network line re-jump can be predicted; further, the optimal models that fit the data are determined using the ROC curve, the KS value, and the like, so that each set of data corresponds to its own optimal model; furthermore, combining the various prediction results makes it possible to estimate in advance the probability that a distribution network line will re-jump, which helps address line tripping, reduces the safety threat to the distribution network and to the users it serves, reduces hidden dangers, and allows the whole network to run stably.
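As an illustration of the two selection criteria named above, the following sketch computes the area under the ROC curve and the KS value from predicted scores. It uses the standard definitions (the KS value as the largest gap between true-positive and false-positive rates), which is assumed to be what the text intends. The function names are hypothetical, labels are taken to be 1 for Y and 0 for N, and the text does not say how the criteria are applied during training.

```python
def roc_points(labels, scores):
    """Return (fpr, tpr) pairs swept over descending score thresholds."""
    pos = sum(labels)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("need at least one positive and one negative sample")
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Consume every sample tied at this score before emitting a point.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


def ks_value(points):
    """KS statistic: largest gap between TPR and FPR along the curve."""
    return max(tpr - fpr for fpr, tpr in points)
```

A candidate model with a larger AUC or a larger KS value on held-out data would be preferred under the respective criterion.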
Drawings
To explain the technical solution of the present application more clearly, the drawings used in the embodiments are briefly described below. Those skilled in the art can obtain other drawings from these drawings without creative effort.
Fig. 1 is a schematic diagram of a system for fusion processing of distribution network line re-jump models according to some embodiments of the present application;
Fig. 2 is a schematic diagram of an exemplary computing device according to some embodiments of the present application;
Fig. 3 is a schematic flow chart of a method for fusion processing of distribution network line re-jump models in an embodiment of the present application.
Detailed Description
Certain exemplary embodiments will now be described to provide an overall understanding of the principles of the structure, function, manufacture, and use of the devices and methods disclosed herein. One or more examples of these embodiments are illustrated in the accompanying drawings. Those of ordinary skill in the art will understand that the devices and methods specifically described herein and illustrated in the accompanying drawings are non-limiting exemplary embodiments and that the scope of the various embodiments of the present invention is defined solely by the claims. Features illustrated or described in connection with one exemplary embodiment may be combined with features of other embodiments. Such modifications and variations are intended to be included within the scope of the present invention.
Reference throughout this specification to "embodiments," "some embodiments," "one embodiment," or "an embodiment," or the like, means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment. Thus, appearances of the phrases "in various embodiments," "in some embodiments," "in at least one other embodiment," or "in an embodiment" or the like throughout this specification are not necessarily all referring to the same embodiment. Furthermore, the particular features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. Thus, the particular features, structures, or characteristics shown or described in connection with one embodiment may be combined, in whole or in part, with the features, structures, or characteristics of one or more other embodiments, without limitation. Such modifications and variations are intended to be included within the scope of the present invention.
Fig. 1 is a schematic diagram of a system 100 for fusion processing of distribution network line re-jump models according to some embodiments of the present application. The distribution network line re-jump model fusion processing system 100 is a platform capable of automatically predicting the probability that a distribution network line will re-jump. The system 100 may include a server 110, at least one storage device 120, at least one network 130, and one or more data collection devices 150-1, 150-2. The server 110 may include a processing engine 112.
In some embodiments, the server 110 may be a single server or a group of servers. The server group may be centralized or distributed (e.g., server 110 may be a distributed system). In some embodiments, the server 110 may be local or remote. For example, server 110 may access data stored in storage device 120 via network 130. Alternatively, server 110 may be directly connected to storage device 120 to access the stored data. In some embodiments, the server 110 may be implemented on a cloud platform. The cloud platform may include a private cloud, a public cloud, a hybrid cloud, a community cloud, a distributed cloud, a multi-cloud, or the like, or any combination thereof. In some embodiments, server 110 may be implemented on a computing device as illustrated in Fig. 2 herein, including one or more components of computing device 200.
In some embodiments, the server 110 may include a processing engine 112. Processing engine 112 may process information and/or data related to a service request to perform one or more of the functions described herein. For example, the processing engine 112 may be configured to obtain the distribution network basic data transmitted by the data collection device 150 and send the data to the storage device 120 via the network 130 to update the data stored therein. In some embodiments, processing engine 112 may include one or more processors. The processing engine 112 may include one or more hardware processors, such as a central processing unit (CPU), an application-specific integrated circuit (ASIC), an application-specific instruction set processor (ASIP), a graphics processing unit (GPU), a physics processing unit (PPU), a digital signal processor (DSP), a field-programmable gate array (FPGA), a programmable logic device (PLD), a controller, a microcontroller unit, a reduced instruction set computer (RISC), a microprocessor, or the like, or any combination thereof.
Storage device 120 may store data and/or instructions. In some embodiments, storage device 120 may store distribution network basic data obtained from data collection device 150. In some embodiments, storage device 120 may store data and/or instructions that server 110 may execute or use to implement the methods of the embodiments described herein. In some embodiments, storage device 120 may include mass storage, removable storage, volatile read-write memory, read-only memory (ROM), or the like, or any combination thereof. In some embodiments, storage device 120 may be implemented on a cloud platform. For example, the cloud platform may include a private cloud, a public cloud, a hybrid cloud, a community cloud, a distributed cloud, a multi-cloud, or the like, or any combination thereof.
In some embodiments, the storage device 120 may be connected to the network 130 to communicate with one or more components of the distribution network line re-jump model fusion processing system 100. One or more components of the system 100 may access data or instructions stored in the storage device 120 through the network 130. In some embodiments, the storage device 120 may be directly connected to, or in communication with, one or more components of the system 100. In some embodiments, storage device 120 may be part of server 110.
The network 130 may facilitate the exchange of information and/or data. In some embodiments, one or more components of the distribution network line re-jump model fusion processing system 100 may send information and/or data to other components of the system 100 over the network 130. For example, the server 110 may obtain the distribution network basic data from the data collection device 150 via the network 130. In some embodiments, the network 130 may be a wired network, a wireless network, or a combination thereof. In some embodiments, the network 130 may include one or more network access points. For example, the network 130 may include wired or wireless network access points, such as base stations and/or Internet exchange points 130-1, 130-2, and so forth. Through an access point, one or more components of the system 100 may be connected to the network 130 to exchange data and/or information.
The data collection device 150 may collect fault and defect data, complaint and repair-report data, equipment overload data, inspection data, meteorological monitoring data, and the like. In some embodiments, the data collection device 150 may send the collected distribution network basic data to one or more devices in the distribution network line re-jump model fusion processing system 100. For example, the data collection device 150 may send the distribution network basic data to the server 110 for processing, or store the data in the storage device 120.
Fig. 2 is a schematic diagram of an exemplary computing device 200 according to some embodiments of the present application. The server 110, storage device 120, and data collection device 150 may be implemented on a computing device 200. For example, the processing engine 112 may be implemented on the computing device 200 and configured to implement the functionality disclosed herein.
Computing device 200 may include any components used to implement the systems described herein. For example, the processing engine 112 may be implemented on the computing device 200 by its hardware, software programs, firmware, or a combination thereof. For convenience, only one computer is depicted in the figures, but the computing functions described herein in connection with the distribution network line re-jump model fusion processing system 100 can be implemented in a distributed manner by a set of similar platforms to distribute the processing load of the system.
Computing device 200 may include a communication port 250 that connects to a network to enable data communication. Computing device 200 may include a processor 220, in the form of one or more processors, that executes program instructions. An exemplary computer platform may include an internal bus 210 and various forms of program memory and data storage, for example a hard disk 270 and read-only memory (ROM) 230 or random access memory (RAM) 240, for storing the various data files that are processed and/or transmitted by the computer. An exemplary computing device may include program instructions stored in read-only memory 230, random access memory 240, and/or other types of non-transitory storage media that are executed by processor 220. The methods and/or processes of the present application may be embodied in the form of program instructions. Computing device 200 also includes an input/output component 260 that supports input/output between the computer and other components. Computing device 200 may also receive the programs and data of the present disclosure via network communication.
For ease of understanding, only one processor is depicted in Fig. 2. However, it should be noted that the computing device 200 in the present application may include multiple processors, and thus the operations and/or methods described in the present application as implemented by one processor may also be implemented by multiple processors, jointly or independently. For example, if in the present application a processor of computing device 200 performs steps 1 and 2, it should be understood that steps 1 and 2 may also be performed by two different processors of computing device 200, either jointly or independently.
Fig. 3 shows a schematic flow chart of a method for fusion processing of distribution network line re-jump models in an embodiment of the present application.
Traditional machine learning searches a rich function space for a single classifier that is as close as possible to the actual classification function. The main idea of fusion learning is instead to fuse the classification results of several single classifiers when classifying a sample space, so as to obtain a better classification result than any single classifier. If a single classifier is regarded as one decision maker, fusion learning corresponds to a group of decision makers jointly deciding on an event.
In step 301, basic data that influence re-jumps of the distribution network line are collected and preprocessed.
In some embodiments, the basic data specifically include: fault and defect data, complaint and repair-report data, equipment overload data, inspection data and the like, and meteorological monitoring data.
The distribution network is the electric power network that receives electric energy from a transmission network or a regional power plant and distributes it to users, either locally or stage by stage according to voltage level, through distribution facilities. The distribution network is composed of overhead lines, cables, towers, distribution transformers, isolating switches, reactive power compensators, a number of auxiliary facilities, and the like, and performs the function of distributing electric energy in the power network.
Fault data come from faults of the various devices in the distribution network, such as overhead lines, cables, towers, distribution transformers, isolating switches, reactive power compensators, distribution network switchgear, and the like.
Complaint and repair-report data come from complaint telephone sources, internet sources, radio station sources, file sources, mail sources, and notification sources, for example the customer service complaint hotline of the distribution network, a client application (app), and the like.
Equipment overload data arise, for example, when substations belonging to two different sub-areas perform loop closing for power dispatching. Depending on the system operating conditions and grid parameters, an excessive loop-closing current may cause risks such as equipment overload, relay protection malfunction, short-circuit current exceeding limits, and accident escalation caused by an electromagnetic loop network, thereby affecting grid safety.
Inspection data come, for example, from conventional insulator detection, in which insulators in a substation are checked online by manual inspection; power transmission and transformation equipment may also be inspected using an inspection trolley; or insulators may be monitored in real time by installing sensors in the insulator mounting section and combining sensor data acquisition with communication means.
In some embodiments, the preprocessing of the basic data includes one or a combination of: cleaning and denoising the basic data, supplementing missing data, smoothing noisy data, identifying and deleting outliers, and normalization.
The preprocessing replaces abnormal data, such as null data and unreasonable data, with second sample data acquired by other sensors at the same time. Normalization is a way of simplifying computation: a dimensional expression is converted into a dimensionless one, yielding a scalar, and it is used in many computations. Noisy data can be smoothed, for example by applying a low-pass filter to an original image. Matching the frequency response of that low-pass filter to the modulation transfer function of the sensor improves the smoothing effect on the original image.
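The three preprocessing operations just described can be sketched as follows. This is only an illustration under stated assumptions: the function names, the validity range, and the use of min-max scaling are hypothetical choices, and a moving average stands in as a simple low-pass filter rather than the sensor-matched filter mentioned in the text.

```python
def fill_from_backup(primary, backup, lower, upper):
    """Replace null or out-of-range readings with the same-time backup reading.

    Assumes the backup (second sensor) reading is itself valid.
    """
    cleaned = []
    for value, alt in zip(primary, backup):
        bad = value is None or not (lower <= value <= upper)
        cleaned.append(alt if bad else value)
    return cleaned


def min_max_normalize(values):
    """Rescale values to the dimensionless range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def moving_average(values, window=3):
    """Simple low-pass smoothing: mean over a centered window, truncated at the edges."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        chunk = values[max(0, i - half): i + half + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed
```

A typical order would be to fill abnormal readings first, then smooth, then normalize, although the text does not prescribe an order.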
In step 302, the preprocessed basic data are divided into a training set and a test set, and feature values of the training set and the test set are obtained.
In some embodiments, the preprocessed basic data are divided into a training set and a test set, and the training set contains a preset number of samples used to train the re-jump probability prediction model.
In the field of machine learning, it is generally necessary to divide the samples into three separate parts: a training set, a validation set, and a test set. The training set is used to fit the model, the validation set is used to determine the network structure or the parameters that control model complexity, and the test set is used to check the performance of the finally selected optimal model.
In some embodiments, the data may be divided so that the training set comprises 50% of the total samples and the other two sets comprise 25% each, all three being drawn at random from the samples. The re-jump probability prediction model is then trained on the training set to obtain the optimal network model parameters that globally minimize the error.
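A minimal sketch of the 50% / 25% / 25% random split described above is given below. The function name, the fixed seed, and the rounding rule (any remainder goes to the test set) are assumptions made for illustration.

```python
import random


def split_samples(samples, train_frac=0.5, valid_frac=0.25, seed=0):
    """Randomly partition samples into training, validation and test subsets."""
    if train_frac <= 0 or valid_frac < 0 or train_frac + valid_frac >= 1:
        raise ValueError("fractions must leave room for a non-empty test set")
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # seeded for repeatability
    n_train = int(len(shuffled) * train_frac)
    n_valid = int(len(shuffled) * valid_frac)
    train = shuffled[:n_train]
    valid = shuffled[n_train:n_train + n_valid]
    test = shuffled[n_train + n_valid:]  # whatever remains
    return train, valid, test
```

The three subsets are disjoint and together cover every input sample exactly once.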
In some embodiments, feature value selection is performed by employing a joint hypothesis testing algorithm.
Hypothesis testing, also known as statistical hypothesis testing, is a statistical inference method used to determine whether differences between samples, or between a sample and the population, are caused by sampling error or by intrinsic differences. The significance test is the most common method of hypothesis testing and the most basic form of statistical inference. Its rationale is to make an assumption about a characteristic of the population and then, through statistical reasoning on sampled data, infer whether the assumption should be rejected or accepted. Commonly used hypothesis testing methods include the Z test, t test, chi-square test, F test, and the like. The basic idea of hypothesis testing is the "small probability event" principle, and its method of statistical inference is a proof by contradiction of a probabilistic nature.
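The text does not specify which joint hypothesis testing algorithm is used for feature selection. As one hedged illustration, the sketch below applies the chi-square test of independence, one of the tests listed above, to a binary feature and a binary re-jump label, and keeps the feature if independence is rejected at the 5% level. The function names and the 0/1 encoding are assumptions.

```python
# Chi-square critical value for 1 degree of freedom at alpha = 0.05.
CRITICAL_005_DF1 = 3.841


def chi_square_2x2(feature, label):
    """Pearson chi-square statistic for two binary (0/1) sequences."""
    n = len(feature)
    observed = [[0, 0], [0, 0]]
    for f, y in zip(feature, label):
        observed[f][y] += 1
    row = [sum(observed[0]), sum(observed[1])]
    col = [observed[0][0] + observed[1][0], observed[0][1] + observed[1][1]]
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row[i] * col[j] / n
            if expected == 0:
                raise ValueError("a row or column of the table is empty")
            stat += (observed[i][j] - expected) ** 2 / expected
    return stat


def is_relevant(feature, label, critical=CRITICAL_005_DF1):
    """Reject independence (keep the feature) if the statistic exceeds the critical value."""
    return chi_square_2x2(feature, label) > critical
```

A statistic above the critical value means that, if the feature and the label were truly independent, data this extreme would be a "small probability event", so independence is rejected.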
In step 303, a support vector machine-based re-jump model, a logistic regression-based re-jump model, and a decision tree-based re-jump model are trained based on the training set, the test set, and the feature values.
An SVM (support vector machine) is a generalized linear classifier that performs binary classification of data by supervised learning; its decision boundary is the maximum-margin hyperplane solved from the learning samples. The SVM uses the hinge loss function to compute the empirical risk and adds a regularization term to the optimization problem to control the structural risk, giving a classifier with sparsity and robustness. The SVM can also perform nonlinear classification through the kernel method.
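The hinge loss and regularization term mentioned above can be written out for a linear SVM as a short sketch. The function names and the regularization weight `reg` are hypothetical, labels are assumed to be encoded as +1 (Y) and -1 (N), and no solver or kernel is included.

```python
def hinge_loss(weights, bias, x, y):
    """Hinge loss max(0, 1 - y * (w.x + b)) for a label y in {-1, +1}."""
    margin = y * (sum(w * xi for w, xi in zip(weights, x)) + bias)
    return max(0.0, 1.0 - margin)


def svm_objective(weights, bias, samples, labels, reg=0.01):
    """Mean hinge loss (empirical risk) plus an L2 penalty (structural risk)."""
    empirical = sum(hinge_loss(weights, bias, x, y)
                    for x, y in zip(samples, labels)) / len(samples)
    penalty = reg * sum(w * w for w in weights)
    return empirical + penalty
```

Samples classified correctly with a margin of at least 1 contribute zero loss, which is the source of the sparsity noted in the text: only samples on or inside the margin affect the solution.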
LR (logistic regression) is a generalized linear regression model. Its dependent variable may be binary or multi-class; the binary case is more common and easier to interpret, while the multi-class case can be handled with the softmax method. In practice, binary logistic regression is used most often. LR can be used for finding risk factors, for prediction, and for discrimination.
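A minimal sketch of the two cases described above: the sigmoid function turns a linear score into a binary class probability, and softmax generalizes this to several classes. The function names and the example weights are hypothetical.

```python
import math

def sigmoid(z):
    """Logistic function mapping a real score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(weights, bias, features):
    """Binary logistic regression: probability of the positive class."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return sigmoid(z)

def softmax(scores):
    """Multi-class generalization: normalized exponentials of the scores."""
    m = max(scores)                       # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

p_rejump = predict_proba([1.0, -1.0], 0.0, [2.0, 2.0])
class_probs = softmax([1.0, 2.0, 3.0])
```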
In some embodiments, a certain proportion of data is extracted from the preprocessed basic data set as the training set of the first-level decision tree of the neural network model, and the remaining data is used as the test set of the first-level decision tree of the neural network model. For example, 70% of the data may be extracted as the training set of the first-level decision tree of the random forest classifier, and the remaining 30% may be used as its test set. In some embodiments, the first-level decision tree of the random forest classifier may be constructed by bootstrapping, that is, sampling with replacement from the training set n times to form n training sample sets, so that within a given set some samples may be drawn several times and others may not be drawn at all. The decision tree involves the following common quantities: information entropy and information gain. Information entropy represents the amount of information: the larger the amount of information, the larger the entropy, and the smaller the amount of information, the smaller the entropy. As the decision tree grows from the root node to the final leaf nodes, the information entropy decreases, and the amount by which it decreases at each step is called the information gain. In some embodiments, the decision tree may be pruned to prevent overfitting of the subsequent random forest model.
A complete decision tree is not the best tree for classifying and predicting new data objects. The reason is that the complete tree fits the training data too closely: as the tree grows, the number of samples handled at each branch keeps decreasing, and so does the degree to which the tree represents the data as a whole. Branching at the root node processes all samples, whereas branching further down processes only the samples in the respective groups. As the tree grows and the sample counts shrink, the data features captured by deeper nodes become increasingly specific to individual samples; the resulting loss of general representativeness, which makes the tree unsuitable for classifying and predicting new data, is called overfitting. It is therefore desirable to address this by pruning, which includes pre-pruning and post-pruning.
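The bootstrap sampling, information entropy, and information gain used when growing the decision tree can be sketched as below. The helper names are hypothetical, and the base-2 logarithm is an assumption (the text does not state the logarithm base).

```python
import math
import random
from collections import Counter

def bootstrap_sample(data, rng):
    """Draw len(data) samples with replacement; some items repeat, some are left out."""
    return rng.choices(data, k=len(data))

def entropy(labels):
    """Information entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(parent, children):
    """Entropy of the parent node minus the weighted entropy of its child nodes."""
    n = len(parent)
    weighted = sum(len(ch) / n * entropy(ch) for ch in children)
    return entropy(parent) - weighted

parent_labels = ["Y", "Y", "N", "N"]
gain = information_gain(parent_labels, [["Y", "Y"], ["N", "N"]])
sample = bootstrap_sample(parent_labels, random.Random(0))
```

In this example the split separates the two classes perfectly, so the entropy drops from 1 bit to 0 and the information gain equals the full parent entropy.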
Pre-pruning means that, while the decision tree is being constructed, each node is evaluated before it is split; if splitting the current node does not improve the generalization performance of the decision tree model, the node is not split and is marked as a leaf node. Compared with an unpruned decision tree, pre-pruning leaves many branches unexpanded, which reduces the risk of overfitting and also significantly reduces the training and testing time of the decision tree. On the other hand, although the current split may not improve generalization performance, subsequent splits built on it might; pre-pruning therefore carries a risk of underfitting.
Post-pruning means that the whole decision tree is constructed first and the non-leaf nodes are then examined from the bottom up; if replacing the subtree rooted at a node with a leaf node improves the generalization performance, the subtree is replaced with a leaf node.
Comparing the two, a post-pruned decision tree generally retains more branches than a pre-pruned one and, in general, has a lower risk of underfitting and better generalization performance. However, post-pruning is performed after the decision tree has been fully constructed and examines all non-leaf nodes one by one from the bottom up, so its training time overhead is greater than that of both the unpruned and the pre-pruned decision tree.
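The bottom-up post-pruning procedure described above can be sketched as follows. This is an illustrative reduced-error style variant under stated assumptions: the tree is a nested dictionary, generalization performance is estimated as accuracy on a held-out validation set, and each internal node stores the majority training label in a hypothetical `"majority"` field. None of these representation details come from the text.

```python
def predict(node, x):
    """Walk from the given node down to a leaf and return its label."""
    while "label" not in node:
        branch = "left" if x[node["feature"]] <= node["threshold"] else "right"
        node = node[branch]
    return node["label"]

def accuracy(tree, X, y):
    return sum(predict(tree, xi) == yi for xi, yi in zip(X, y)) / len(y)

def post_prune(node, root, X_val, y_val):
    """Examine non-leaf nodes bottom-up; keep a replacement leaf only if accuracy improves."""
    if "label" in node:
        return
    post_prune(node["left"], root, X_val, y_val)
    post_prune(node["right"], root, X_val, y_val)
    before = accuracy(root, X_val, y_val)
    saved = dict(node)                    # shallow copy, used to undo the replacement
    node.clear()
    node["label"] = saved["majority"]     # tentatively replace the subtree with a leaf
    if accuracy(root, X_val, y_val) <= before:
        node.clear()
        node.update(saved)                # no improvement: restore the subtree

# Hypothetical tree and validation data.
tree = {
    "feature": 0, "threshold": 0.5, "majority": "Y",
    "left": {"label": "N"},
    "right": {"feature": 1, "threshold": 0.5, "majority": "Y",
              "left": {"label": "Y"}, "right": {"label": "N"}},
}
X_val = [[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [1.0, 1.0]]
y_val = ["N", "Y", "Y", "Y"]
post_prune(tree, tree, X_val, y_val)
```

Here the lower subtree is collapsed into a leaf because doing so raises the validation accuracy, while the root split is kept because removing it would lower the accuracy.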
In step 304, the ROC curve and the KS value are used as convergence conditions to obtain 2 optimal re-jump models of the support vector machine, 2 optimal re-jump models of logistic regression, and 2 optimal re-jump models of the decision tree.
Three algorithms, the support vector machine (SVM), logistic regression (LR), and the decision tree (CART), are selected as base classifiers. The original data is divided proportionally into a test set and a training set, which are used as input for model training on the selected features to complete the probability prediction of distribution network line re-jumps. The optimal model for the data is then judged according to the ROC curve, the KS value, and the like, so that each set of data corresponds to one optimal model.
The ROC (receiver operating characteristic) curve is a curve on which every point reflects the same sensitivity: all points are responses to the same signal stimulus, obtained under different decision criteria. It is plotted with the false alarm probability (false positive rate) on the horizontal axis and the hit probability (true positive rate) on the vertical axis, using the results obtained under different decision criteria for a given stimulus condition.
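To make the construction of the ROC curve concrete, the sketch below sweeps the decision threshold over the predicted scores, records one (false alarm rate, hit rate) point per threshold, and integrates the curve with the trapezoidal rule. The function names and the example scores are hypothetical, and summarizing the curve by its area (AUC) is an assumption; the text does not say how the curve is turned into a single criterion.

```python
def roc_points(scores, labels):
    """Return (false positive rate, true positive rate) points, one per threshold.

    labels are 1 for a re-jump (positive) and 0 otherwise.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, l in zip(scores, labels) if s >= threshold and l == 1)
        fp = sum(1 for s, l in zip(scores, labels) if s >= threshold and l == 0)
        points.append((fp / neg, tp / pos))
    return points

def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

scores = [0.9, 0.8, 0.3, 0.2]
auc_perfect = auc(roc_points(scores, [1, 1, 0, 0]))
auc_mixed = auc(roc_points(scores, [1, 0, 1, 0]))
```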
KS (Kolmogorov-Smirnov) is a non-parametric statistical test for continuous distributions. The one-sample test is often used to check whether a sample follows a known distribution: the cumulative frequency distribution of the sample data is compared with a specific theoretical distribution, and if the difference between the two is small, the sample is inferred to come from that distribution family. The two-sample KS test compares the cumulative distributions of two data sets. The KS test can be applied directly to the n observed values of the original data, so it makes fuller use of the data, and it is mainly used for continuous, quantitative data with a unit of measurement. The KS test is robust, does not depend on the position of the mean, is insensitive to the scale of the data, and has a wide range of application.
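The two-sample KS statistic described above is the largest gap between the two empirical cumulative distributions. The sketch below computes it directly; applying it to the model scores of re-jump and non-re-jump samples is one common way to obtain a "KS value" for a classifier, but that reading is an assumption, as are the function name and the example scores.

```python
def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic: maximum gap between the empirical CDFs."""
    def ecdf(sample, x):
        return sum(1 for v in sample if v <= x) / len(sample)
    # The gap can only change at observed values, so checking those is enough.
    grid = set(sample_a) | set(sample_b)
    return max(abs(ecdf(sample_a, x) - ecdf(sample_b, x)) for x in grid)

# Hypothetical model scores for lines that re-jumped and lines that did not.
scores_rejump = [0.9, 0.8]
scores_no_rejump = [0.3, 0.2]
ks_value = ks_statistic(scores_rejump, scores_no_rejump)
```

A larger value means the model separates the two groups more clearly; identical samples give a value of 0.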
When the ROC curve is used as the convergence condition, 1 optimal re-jump model of the support vector machine, 1 optimal re-jump model of logistic regression, and 1 optimal re-jump model of the decision tree are obtained.
When the KS value is used as the convergence condition, 1 optimal re-jump model of the support vector machine, 1 optimal re-jump model of logistic regression, and 1 optimal re-jump model of the decision tree are likewise obtained.
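One possible reading of the two statements above is that, for each algorithm, one model is chosen by its ROC performance and one by its KS value, giving 2 optimal models per algorithm. The sketch below follows that reading. It is only an assumption about how the criteria are applied; the record layout, the names, and the numbers are illustrative and are not results from the source.

```python
def select_optimal_models(candidates):
    """Pick one candidate by ROC area and one by KS value.

    Each candidate is a hypothetical record with "name", "auc", and "ks" keys.
    """
    best_by_roc = max(candidates, key=lambda c: c["auc"])
    best_by_ks = max(candidates, key=lambda c: c["ks"])
    return best_by_roc, best_by_ks

# Illustrative candidates for a single algorithm (for example, two SVM settings).
svm_candidates = [
    {"name": "svm_a", "auc": 0.81, "ks": 0.52},
    {"name": "svm_b", "auc": 0.79, "ks": 0.58},
]
roc_best, ks_best = select_optimal_models(svm_candidates)
```

Running the same selection for the logistic regression and decision tree candidates would yield the 2 + 2 + 2 optimal models mentioned in step 304.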
In the distribution network line re-jump prediction provided by the application, the final prediction result must take into account the prediction results of all 3 algorithm models; when each of the 3 algorithm models has 2 optimal models, the various combinations of the recorded predictions need to be considered.
In some embodiments, in the prediction result, Y represents that the distribution network line is re-jumped, and N represents that the distribution network line is not re-jumped.
For example, if the output of the optimal support vector machine re-jump model is Y, the output of the optimal logistic regression re-jump model is N, and the output of the optimal decision tree re-jump model is Y, the final prediction result is represented as {Y, N, Y}.
In step 305, the prediction results of the optimal re-jump models of the different algorithms are combined to obtain 8 different prediction results, each composed of elements of the category target set {Y, N}.
The results predicted by each model from the category target set {Y, N} are combined. With three models and two possible outputs each, there are $2 \times 2 \times 2 = 8$ combined results: {Y, Y, Y}, {Y, Y, N}, {Y, N, Y}, {N, Y, Y}, {Y, N, N}, {N, Y, N}, {N, N, Y}, and {N, N, N}.
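The 8 combinations listed above are simply the Cartesian product of the target set {Y, N} taken once per model. A short sketch, with the ordering (SVM, LR, decision tree) taken from the example earlier in the text:

```python
from itertools import product

# One output per model, in the order (SVM, logistic regression, decision tree).
combinations = list(product("YN", repeat=3))
```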
In step 306, the distribution network line re-jump probability is divided into four types, namely high, relatively high, medium, and low, based on the number of Y in the prediction result.
In some embodiments, a prediction result containing 3 Y is defined as a high distribution network line re-jump probability; one containing 2 Y as a relatively high re-jump probability; one containing 1 Y as a medium re-jump probability; and one containing 0 Y as a low re-jump probability.
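The mapping from the number of Y outputs to a probability level can be sketched as a simple lookup. The function name `fuse` and the label strings are assumptions: the second-highest level is labelled "relatively high" here, and the level for a single Y is taken to be "medium".

```python
# Number of Y outputs among the three models -> re-jump probability level.
LEVEL_BY_Y_COUNT = {3: "high", 2: "relatively high", 1: "medium", 0: "low"}

def fuse(svm_out, lr_out, tree_out):
    """Combine the three model outputs ("Y" or "N") into one probability level."""
    result = (svm_out, lr_out, tree_out)
    return LEVEL_BY_Y_COUNT[result.count("Y")]

level = fuse("Y", "N", "Y")
```

Only the count of Y matters, so {Y, N, Y}, {Y, Y, N}, and {N, Y, Y} all map to the same level.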
The embodiment of the application also provides a device for fusion processing of distribution network line re-jump models, comprising a memory, a processor, and a computer program stored on the memory, wherein the processor, when executing the computer program, performs the fusion processing method of the distribution network line re-jump models according to the embodiments of the application.
An embodiment of the present application further provides a computer-readable storage medium storing computer instructions; when at least part of the computer instructions are executed by a processor, the distribution network line re-jump model fusion processing method according to the present application is implemented.
The method has the following advantages. Acquiring and preprocessing the data that influences distribution network line re-jumps improves the accuracy, integrity, and consistency of the data. Further, a joint hypothesis testing algorithm is used for feature selection, and several algorithms are selected as base classifiers to construct and train multiple models, so that the probability of distribution network line re-jumps can be predicted. Further, the optimal models for the data are judged by the ROC curve, the KS value, and the like, so that each set of data corresponds to one optimal model. Furthermore, by combining the various prediction results, the probability of future distribution network line re-jumps is assessed in advance, so that the line tripping problem is addressed, the safety threat to the distribution network and to the users it serves is reduced, hidden dangers are reduced in all respects, and the whole network can run stably.
Moreover, those skilled in the art will appreciate that aspects of the present application may be illustrated and described in terms of several patentable species or situations, including any new and useful combination of processes, machines, manufacture, or materials, or any new and useful improvement thereon. Accordingly, various aspects of the present application may be embodied entirely in hardware, entirely in software (including firmware, resident software, micro-code, etc.) or in a combination of hardware and software. The above hardware or software may be referred to as a "data block," "module," "engine," "unit," "component," or "system." Furthermore, aspects of the present application may be represented as a computer product, including computer readable program code, embodied in one or more computer readable media.
The computer storage medium may comprise a propagated data signal with the computer program code embodied therewith, for example, on baseband or as part of a carrier wave. The propagated signal may take any of a variety of forms, including electromagnetic, optical, etc., or any suitable combination. A computer storage medium may be any computer-readable medium that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code located on a computer storage medium may be propagated over any suitable medium, including radio, cable, fiber optic cable, RF, or the like, or any combination of the preceding.
Computer program code required for the operation of various portions of the present application may be written in any one or more programming languages, including an object-oriented programming language such as Java, Scala, Smalltalk, Eiffel, JADE, Emerald, C++, C#, VB.NET, or Python, a conventional programming language such as C, Visual Basic, Fortran 2003, Perl, COBOL 2002, PHP, or ABAP, a dynamic programming language such as Python, Ruby, or Groovy, or other programming languages. The program code may execute entirely on the user's computer, partly on the user's computer as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the latter scenario, the remote computer may be connected to the user's computer through any form of network, such as a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet), or in a cloud computing environment, or as a service, such as software as a service (SaaS).
Additionally, the order in which elements and sequences of the processes described herein are processed, the use of alphanumeric characters, or the use of other designations, is not intended to limit the order of the processes and methods described herein, unless explicitly claimed. While various presently contemplated embodiments of the invention have been discussed in the foregoing disclosure by way of example, it is to be understood that such detail is solely for that purpose and that the appended claims are not limited to the disclosed embodiments, but, on the contrary, are intended to cover all modifications and equivalent arrangements that are within the spirit and scope of the embodiments herein. For example, although the system components described above may be implemented by hardware devices, they may also be implemented by software-only solutions, such as installing the described system on an existing server or mobile device.
Similarly, it should be noted that in the preceding description of embodiments of the application, various features are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure and aiding in the understanding of one or more of the embodiments. This method of disclosure, however, is not intended to require more features than are expressly recited in the claims. Indeed, the embodiments may be characterized as having fewer than all of the features of a single embodiment disclosed above.
The entire contents of each patent, patent application, patent application publication, and other material cited in this application, such as articles, books, specifications, publications, and documents, are hereby incorporated by reference into this application, except for application history documents that are inconsistent with or conflict with the contents of this application, and except for documents (whether currently or later attached to this application) that limit the broadest scope of the claims of this application. It is noted that if the descriptions, definitions, and/or use of terms in the material attached to this application are inconsistent with or contrary to those stated in this application, the descriptions, definitions, and/or use of terms in this application shall control.

Claims (9)

1. A method for fusion processing of distribution network line re-jump models, characterized by comprising the following steps:
acquiring basic data influencing the repeated tripping of the distribution network line, and preprocessing the basic data;
dividing the preprocessed basic data into a training set and a test set, and acquiring characteristic values of the training set and the test set;
training a support vector machine-based re-jump model, training a logistic regression-based re-jump model and training a decision tree-based re-jump model based on the training set, the test set and the characteristic values;
respectively taking the ROC curve and the KS value as convergence conditions to obtain 2 optimal re-jump models of the support vector machine, 2 optimal re-jump models of logistic regression and 2 optimal re-jump models of a decision tree;
combining the prediction results of the optimal re-jump models of the different algorithms to obtain 8 different prediction results, wherein the prediction results are composed of elements in a category target set {Y, N};
and dividing the distribution network line re-jump probability into four types, namely high, relatively high, medium and low, based on the number of Y in the prediction result.
2. The method for fusion processing of distribution network line re-jump models as claimed in claim 1, wherein the basic data specifically includes: fault and defect data, complaint and warranty data, equipment overload data, inspection data and the like, and meteorological monitoring data.
3. The method for fusion processing of distribution network line re-jump models as claimed in claim 1, wherein the preprocessing comprises: cleaning and denoising the basic data, filling in missing data, smoothing noisy data, identifying and deleting outliers, and normalizing.
4. The method for fusion processing of distribution network line re-jump models according to claim 1, wherein the characteristic values are obtained by feature selection using a joint hypothesis testing algorithm.
5. The method for fusion processing of distribution network line re-jump models as claimed in claim 1, wherein the 8 different prediction results are specifically:
{Y, Y, Y}, {Y, Y, N}, {Y, N, Y}, {N, Y, Y}, {Y, N, N}, {N, Y, N}, {N, N, Y}, {N, N, N}.
6. The method for fusion processing of distribution network line re-jump models according to claim 1, wherein in the prediction result, Y represents a re-jump of the distribution network line, and N represents no re-jump of the distribution network line.
7. The method for fusion processing of distribution network line re-jump models according to claim 1, wherein dividing the distribution network line re-jump probability into four types, namely high, relatively high, medium and low, based on the number of Y in the prediction result specifically comprises:
a prediction result containing 3 Y is defined as a high distribution network line re-jump probability; one containing 2 Y is defined as a relatively high distribution network line re-jump probability; one containing 1 Y is defined as a medium distribution network line re-jump probability; and one containing 0 Y is defined as a low distribution network line re-jump probability.
8. A distribution network line re-jump model fusion processing device is characterized by comprising a memory, a processor and a computer program stored on the memory, wherein the processor executes the computer program to execute the distribution network line re-jump model fusion processing method according to any one of claims 1 to 7.
9. A computer-readable storage medium, wherein the computer-readable storage medium stores computer instructions, and when at least part of the computer instructions are executed by a processor, the method for fusion processing of distribution network line re-jump models according to any one of claims 1 to 7 is implemented.
CN202010402397.2A 2020-05-13 2020-05-13 Distribution network line re-jump model fusion processing method and device Active CN111612231B (en)

Publications (2)

Publication Number Publication Date
CN111612231A true CN111612231A (en) 2020-09-01
CN111612231B CN111612231B (en) 2023-09-01

Family

ID=72200296

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010402397.2A Active CN111612231B (en) 2020-05-13 2020-05-13 Distribution network line re-jump model fusion processing method and device

Country Status (1)

Country Link
CN (1) CN111612231B (en)

Also Published As

Publication number Publication date
CN111612231B (en) 2023-09-01

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant