CN112580260A

CN112580260A - Method and device for predicting water flow of pipe network and computer readable storage medium

Info

Publication number: CN112580260A
Application number: CN202011533624.1A
Authority: CN
Inventors: 林凡; 张秋镇; 黄富铿
Original assignee: GCI Science and Technology Co Ltd
Current assignee: GCI Science and Technology Co Ltd
Priority date: 2020-12-22
Filing date: 2020-12-22
Publication date: 2021-03-30

Abstract

The invention discloses a method, a device and a computer-readable storage medium for predicting the water flow of a pipe network. The method comprises: obtaining a sample; clustering the sample based on a clustering algorithm to obtain a final total number of clusters and a data set corresponding to each category; inputting the data set corresponding to each category into k long-short term memory (LSTM) network models for training, where k is equal to the final total number of clusters, to obtain k trained LSTM network models; inputting the data of each pipeline, acquired in real time, into the trained clustering algorithm to determine the category of each pipeline; inputting the data of each pipeline into the trained LSTM network model corresponding to its category to obtain the short-term water flow of each pipeline; and accumulating the short-term water flow of each pipeline to obtain the total water flow of the pipe network. This solves the problem that a single LSTM network model has insufficient prediction accuracy in different environments.

Description

Method and device for predicting water flow of pipe network and computer readable storage medium

Technical Field

The invention relates to the technical field of pipe network monitoring, in particular to a method and a device for predicting water flow of a pipe network and a computer readable storage medium.

Background

Research on short-term water flow prediction for large-scale pipe networks enables a water service company to better understand the flow patterns of such a network, to detect conditions such as pipeline breakage and water leakage in advance, and to dispatch personnel for inspection early so as to prevent major water supply outages.

The existing general method is water flow prediction based on an LSTM algorithm, which relies on a single model. However, the performance of the same LSTM model differs greatly between environments, so using a single LSTM model when the environment changes leads to low accuracy in the predicted water flow of the pipe network.

Disclosure of Invention

In order to solve the above problems, embodiments of the present invention provide a method and an apparatus for predicting a water flow rate of a pipe network, and a computer-readable storage medium, which can solve the problem that a prediction result of the water flow rate of the pipe network in the prior art is inaccurate.

An embodiment of the present invention provides a method for predicting a water flow rate of a pipe network, including:

acquiring collected water flow characteristic data of a pipe network as a sample;

clustering the samples based on a clustering algorithm, calculating the Davies-Bouldin index under different total numbers of clusters, selecting the total number of clusters that minimizes the Davies-Bouldin index as the final total number of clusters, and obtaining a data set corresponding to each clustered category;

establishing k long-short term memory network models, and inputting the data sets corresponding to the clustered categories into the k long-short term memory network models respectively for training to obtain k trained long-short term memory network models, wherein each trained long-short term memory network model corresponds to one category, and k is equal to the final total number of clusters;

collecting data of each pipeline of the pipe network in real time, inputting the data of each pipeline into the trained clustering algorithm, and determining the category of the data of each pipeline;

and respectively inputting the data of each pipeline into the trained long-short term memory network model corresponding to the type of the data of each pipeline to obtain the short-term water flow of each pipeline, and accumulating the short-term water flow of each pipeline to obtain the total water flow of the current large-scale pipe network.

As an improvement of the above solution, the samples include a plurality of sample individuals, the water flow characteristic data of the pipe network includes water pressure, water flow speed, current water temperature, and diameter of the pipe, and each sample individual is expressed as Z = (E, V, T, M, P), where E is the water pressure, V is the water flow speed, T is the current water temperature, and M is the diameter of the pipe.

As an improvement of the above scheme, the clustering of the samples based on a clustering algorithm, the calculating of the Davies-Bouldin index under different total numbers of clusters, and the selecting of the total number of clusters that minimizes the Davies-Bouldin index as the final total number of clusters specifically include:

step S11, randomly selecting k sample individuals from the samples as k class centers, wherein each class center corresponds to a category, and k is the total number of the current clusters;

step S12, calculating the distance from the rest of the sample individuals to each class center, and distributing the rest of the sample individuals to each class to obtain a data set corresponding to each class, wherein the distribution result is that the distance from each sample individual to the class center of the class to which the sample individual belongs is smaller than the distance from the sample individual to the class centers of other classes;

step S13, updating the class centers of all classes: for each class, selecting the sample individual with the smallest sum of distances to the other sample individuals in that class as the new class center;

step S14, repeating the steps S12 to S13 until all class centers are not changed;

step S15, determining the Davies-Bouldin index under the current total number of clusters based on the current total number of clusters;

and step S16, updating the value of k so that k = k + 1, and repeating steps S11 to S15 until k is equal to a preset value, to obtain the Davies-Bouldin indexes under different total numbers of clusters, and selecting the total number of clusters corresponding to the minimum Davies-Bouldin index as the final total number of clusters.
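Steps S11 to S14 can be sketched as follows. This is a minimal illustration in plain Python and not the implementation of the invention: the function names, the `seed` and `max_iter` parameters, and the handling of an empty class are assumptions of the sketch. Assignment uses the squared distance and the center update uses the Euclidean distance, following the detailed description.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two feature tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def assign(samples, medoids):
    """Step S12: put each sample in the class of its nearest class center."""
    clusters = [[] for _ in medoids]
    for s in samples:
        nearest = min(range(len(medoids)), key=lambda i: sq_dist(s, medoids[i]))
        clusters[nearest].append(s)
    return clusters

def update(clusters, medoids):
    """Step S13: per class, pick the member with the smallest summed
    Euclidean distance to the other members of that class."""
    new_medoids = []
    for members, old in zip(clusters, medoids):
        if not members:  # assumption: keep the old center if a class is empty
            new_medoids.append(old)
            continue
        new_medoids.append(
            min(members, key=lambda c: sum(sq_dist(c, m) ** 0.5 for m in members))
        )
    return new_medoids

def k_medoids(samples, k, seed=0, max_iter=100):
    """Steps S11 to S14 for one value of k; returns (class centers, data sets)."""
    rng = random.Random(seed)
    medoids = rng.sample(samples, k)        # step S11: random initial centers
    for _ in range(max_iter):               # step S14: repeat until stable
        clusters = assign(samples, medoids)
        new_medoids = update(clusters, medoids)
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids, assign(samples, medoids)
```

Because every class center is itself a sample individual, the result is less sensitive to outliers than a mean-based center, at the cost of a quadratic-time update within each class. Like any K-Medoids run, the result depends on the random initial centers.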

As an improvement of the above scheme, the determining, based on the current total number of clusters, of the Davies-Bouldin index under the current total number of clusters specifically includes:

determining the Davies-Bouldin index under the current total number of clusters by:

wherein I_DB represents the Davies-Bouldin index, k is the current total number of clusters, C_i represents the average distance from the sample individuals in class i to the class center of class i, C_j represents the average distance from the sample individuals in class j to the class center of class j, and F_{i,j} represents the Euclidean distance between the class center of class i and the class center of class j.
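The formula itself is not reproduced in this text. Assuming it is the standard Davies-Bouldin definition, which matches the symbols defined above, it would read:

```latex
I_{DB} = \frac{1}{k} \sum_{i=1}^{k} \max_{j \neq i} \frac{C_i + C_j}{F_{i,j}}
```

Under this definition, compact classes (small C_i) whose centers are far apart (large F_{i,j}) give a small index, which is consistent with selecting the total number of clusters that minimizes it. The maximum is taken over classes j other than i, so the index requires at least two classes.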

Another embodiment of the present invention correspondingly provides a device for predicting water flow in a pipe network, including:

the sample acquisition module is used for acquiring the acquired water flow characteristic data of the pipe network as a sample;

the clustering module is used for clustering the samples based on a clustering algorithm, calculating the Davies-Bouldin index under different total numbers of clusters, selecting the total number of clusters that minimizes the Davies-Bouldin index as the final total number of clusters, and obtaining a data set corresponding to each clustered category;

the long-short term memory model training module is used for establishing k long-short term memory network models, inputting the data sets corresponding to the clustered categories into the k long-short term memory network models respectively for training to obtain k trained long-short term memory network models, wherein each trained long-short term memory network model corresponds to one category, and k is equal to the final total number of clusters;

the classification determining module is used for acquiring data of each pipeline of the pipe network in real time, inputting the data of each pipeline into the trained clustering algorithm and determining the category of the data of each pipeline;

and the total water flow calculation module is used for respectively inputting the data of each pipeline into the trained long-short term memory network model corresponding to the type of the data of each pipeline to obtain the short-term water flow of each pipeline, and accumulating the short-term water flow of each pipeline to obtain the total water flow of the current large pipe network.

As an improvement of the above solution, in the sample obtaining module, the sample includes a plurality of sample individuals, the water flow characteristic data of the pipe network includes water pressure, water flow speed, current water temperature, and diameter of the pipe, and each sample individual is expressed as Z = (E, V, T, M, P), where E is the water pressure, V is the water flow speed, T is the current water temperature, and M is the diameter of the pipe.

As an improvement of the above scheme, the clustering module specifically includes:

a class center selecting unit for performing step S11: randomly selecting k sample individuals from the samples as k class centers, wherein each class center corresponds to one class, and k is the total number of the current clusters;

a sample distribution unit for executing step S12: calculating the distance from the rest of the sample individuals to each class center, and distributing the rest of the sample individuals to each class to obtain a data set corresponding to each class, wherein the distribution result is that the distance from each sample individual to the class center of the class to which the sample individual belongs is smaller than the distance from the sample individual to the class centers of other classes;

a class center updating unit configured to execute step S13: updating the class centers of all classes by selecting, for each class, the sample individual with the smallest sum of distances to the other sample individuals in that class as the new class center;

a class center determining unit for performing step S14: repeating steps S12 through S13 until all class centers are no longer changed;

a Davies-Bouldin index determining unit for performing step S15: determining the Davies-Bouldin index under the current total number of clusters based on the current total number of clusters;

a total-number-of-clusters determining unit configured to execute step S16: updating the value of k so that k = k + 1, and repeating steps S11 to S15 until k is equal to a preset value, to obtain the Davies-Bouldin indexes under different total numbers of clusters, and selecting the total number of clusters corresponding to the minimum Davies-Bouldin index as the final total number of clusters.

As an improvement of the above scheme, the Davies-Bouldin index determining unit is specifically configured to:

determine the Davies-Bouldin index under the current total number of clusters by:

Another embodiment of the present invention correspondingly provides a computer-readable storage medium, where the computer-readable storage medium includes a stored computer program, where when the computer program runs, the apparatus in which the computer-readable storage medium is located is controlled to execute the method for predicting water flow rate of a pipe network described above.

The embodiment of the invention provides a method for predicting the water flow of a pipe network. Collected water flow characteristic data of the pipe network is acquired as a sample; the sample is clustered based on a clustering algorithm, the Davies-Bouldin index is calculated under different total numbers of clusters, the total number of clusters that minimizes the Davies-Bouldin index is selected as the final total number of clusters, and a data set corresponding to each clustered category is obtained. k long-short term memory network models are established, and the data sets corresponding to the clustered categories are respectively input into them for training to obtain k trained long-short term memory network models, wherein each trained model corresponds to one category and k is equal to the final total number of clusters. Data of each pipeline of the pipe network is collected in real time and input into the trained clustering algorithm to determine the category of the data of each pipeline; the data of each pipeline is then input into the trained long-short term memory network model corresponding to that category to obtain the short-term water flow of each pipeline, and the short-term water flows of all pipelines are accumulated to obtain the total water flow of the current large-scale pipe network. Because the pipe network samples are classified by a clustering algorithm and the data set of each category trains its own long-short term memory network model, data from pipelines in different categories can be input into the corresponding trained model for calculation. This solves the problem that a single long-short term memory network model has insufficient prediction accuracy in different environments, effectively improves the short-term water flow prediction accuracy for each pipeline, and thereby improves the prediction accuracy of the total water flow of the pipe network.

Drawings

Fig. 1 is a schematic flow chart of a method for predicting water flow of a pipe network according to an embodiment of the present invention;

Fig. 2 is an architecture diagram of model training provided by a method for predicting water flow in a pipe network according to an embodiment of the present invention;

Fig. 3 is a block diagram of a device for predicting water flow in a pipe network according to an embodiment of the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

Referring to fig. 1, a flow chart of a method for predicting a water flow rate of a pipe network according to an embodiment of the present invention is schematically shown.

The method for predicting the water flow of the pipe network provided by the embodiment of the invention comprises the following steps:

step S0, acquiring collected water flow characteristic data of the pipe network as a sample;

step S1, clustering the samples based on a clustering algorithm, calculating the Davies-Bouldin index under different total numbers of clusters, selecting the total number of clusters that minimizes the Davies-Bouldin index as the final total number of clusters, and obtaining a data set corresponding to each clustered category;

step S2, K long-short term memory network models are established, and data sets corresponding to the clustered categories are respectively input into the K long-short term memory network models for training to obtain K trained long-short term memory network models, wherein each trained long-short term memory network model corresponds to one category, and K is equal to the final total number of clusters;

step S3, collecting data of each pipeline of the pipe network in real time, inputting the data of each pipeline into the trained clustering algorithm, and determining the category of the data of each pipeline;

and step S4, respectively inputting the data of each pipeline into the trained long-short term memory network model corresponding to the type of the data of each pipeline to obtain the short-term water flow of each pipeline, and accumulating the short-term water flow of each pipeline to obtain the total water flow of the current large pipe network.
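The prediction stage (steps S3 and S4) amounts to routing each pipeline's data to the model of its category and summing the results. The sketch below shows only that routing. `classify` and the entries of `models` are hypothetical stand-ins for the trained clustering algorithm and the k trained long-short term memory network models, and the data layout is an assumption.

```python
def predict_total_flow(pipe_data, classify, models):
    """Route each pipeline to the model of its category and sum the flows.

    pipe_data: dict mapping a pipeline id to its feature tuple.
    classify:  callable, features -> category index (stand-in for step S3).
    models:    list of callables, one per category, features -> short-term flow.
    """
    per_pipe = {}
    for pipe_id, features in pipe_data.items():
        category = classify(features)                    # step S3
        per_pipe[pipe_id] = models[category](features)   # step S4, per pipeline
    return per_pipe, sum(per_pipe.values())              # step S4, network total
```

Returning the per-pipeline values alongside the total keeps the individual predictions available, for example for leak checks on a single pipeline.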

As an alternative embodiment, the water flow characteristic data of the pipe network includes water pressure, water flow speed, current water temperature, and diameter of the pipe, and the sample includes a plurality of sample individuals, each sample individual being represented as Z = (E, V, T, M, P), where E is the water pressure, V is the water flow speed, T is the current water temperature, and M is the diameter of the pipe.

In implementation, the water flow characteristic data (water pressure, water flow speed, current water temperature and pipe diameter) at a plurality of historical moments can be collected through sensors to obtain a plurality of sample individuals; to make the sample more representative, the number of sample individuals is at least 1000.

As an optional implementation manner, step S1, "clustering the samples based on a clustering algorithm, calculating the Davies-Bouldin index under different total numbers of clusters, and selecting the total number of clusters that minimizes the Davies-Bouldin index as the final total number of clusters", specifically includes steps S11 to S16 described above.

Specifically, in step S12, "calculating the distance from the remaining sample individuals to each class center", the distance is calculated as $|Z_n - u_i|^2$, where $Z_n$ denotes the n-th sample individual and $u_i$ denotes the i-th class center.

Specifically, in step S13, "for each class, selecting the sample individual with the smallest sum of distances to the other sample individuals in that class as the new class center", the distance refers to the Euclidean distance between two sample individuals.

Specifically, in step S16, "until k is equal to a preset value", the preset value may be determined according to actual measurement conditions; in the embodiment of the present invention, the value of k should not exceed 20.
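The sweep over k in step S16 can be sketched as a loop that clusters once per candidate value and keeps the value with the smallest index. `cluster_fn` and `score_fn` are hypothetical callables standing in for steps S11 to S14 and step S15. Starting at k = 2 is an assumption of this sketch: the standard Davies-Bouldin index compares each class with another class, so it is not defined for a single class.

```python
def select_k(samples, cluster_fn, score_fn, k_min=2, k_max=20):
    """Return (best k, {k: index}) for k_min <= k <= k_max.

    cluster_fn(samples, k) -> (centers, clusters)   # steps S11 to S14
    score_fn(centers, clusters) -> float            # step S15, lower is better
    """
    scores = {}
    for k in range(k_min, k_max + 1):
        centers, clusters = cluster_fn(samples, k)
        scores[k] = score_fn(centers, clusters)
    best = min(scores, key=scores.get)   # smallest index wins
    return best, scores
```

The default upper bound of 20 follows the limit on k stated for this embodiment; it must not exceed the number of sample individuals.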

As an optional implementation manner, step S15, "determining the Davies-Bouldin index under the current total number of clusters based on the current total number of clusters", specifically includes:

determining the Davies-Bouldin index under the current total number of clusters by:
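The formula is not reproduced in this text. As an illustration only, the sketch below computes the index under the assumption that it is the standard Davies-Bouldin definition, using the quantities named in the disclosure: C_i as the average distance from the members of class i to its class center, and F_{i,j} as the Euclidean distance between two class centers. The function names are hypothetical, and the sketch assumes that no class is empty and that no two class centers coincide.

```python
import math

def euclid(a, b):
    """Euclidean distance between two feature tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def davies_bouldin(centers, clusters):
    """Davies-Bouldin index for k class centers and their data sets."""
    k = len(centers)
    if k < 2:
        raise ValueError("the index needs at least two classes")
    # C_i: mean distance from the members of class i to its class center
    scatter = [
        sum(euclid(s, c) for s in members) / len(members)
        for c, members in zip(centers, clusters)
    ]
    total = 0.0
    for i in range(k):
        # worst-case ratio (C_i + C_j) / F_ij over all other classes j
        total += max(
            (scatter[i] + scatter[j]) / euclid(centers[i], centers[j])
            for j in range(k)
            if j != i
        )
    return total / k
```

A lower value means tighter and better separated classes, which is why the total number of clusters with the minimum index is selected.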

In the embodiment of the present invention, referring to Fig. 2, which is an architecture diagram of the model training in the method for predicting water flow of a pipe network provided by the embodiment of the present invention: before training, 1000 sample individuals are first selected from the water flow characteristic data collected by the sensors, giving the sample [Z₁, Z₂, …, Z₁₀₀₀₀]. Based on the K-Medoids clustering algorithm, k class centers u_i are randomly selected from [Z₁, Z₂, …, Z₁₀₀₀₀], where k is the total number of clusters and its initial value is 1. The DB index (that is, the Davies-Bouldin index described above) is calculated for different values of k through step S1, so that [Z₁, Z₂, …, Z₁₀₀₀₀] is clustered and the final total number of clusters k is determined. The resulting k categories of data are then input into k long-short term memory network models respectively for training, to obtain k trained long-short term memory network models.

In a specific implementation, in the step of training the long-short term memory network models, each LSTM model may be set up by calling the LSTM toolkit in the neural network library Keras. Each LSTM model has 1000 neurons in the hidden layer and 1 neuron in the output layer; the format of the training set input to the LSTM model is Z = (E, V, T, M, P); the number of training samples per batch, batch_size, is 1000; and the number of training epochs, epochs, is 500. During forward propagation, the activation of each neuron is randomly dropped with probability 0.5 (dropout), which gives the model stronger generalization.
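The random stopping of activations with probability 0.5 corresponds to dropout. The plain-Python sketch below illustrates the idea on a list of activations. Rescaling the surviving activations by 1 / (1 - rate), known as inverted dropout, is an assumption of the sketch, since the text does not say whether any rescaling is applied; the function name and the `rng` parameter are hypothetical.

```python
import random

def dropout(activations, rate=0.5, rng=None):
    """Zero each activation with probability `rate`; rescale the survivors."""
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    rng = rng or random.Random()
    keep = 1.0 - rate
    # rng.random() is uniform on [0, 1): values below `rate` are dropped
    return [a / keep if rng.random() >= rate else 0.0 for a in activations]
```

Dropout of this kind is applied only during training; at prediction time all activations are kept.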

Specifically, in step S3, "collecting data of each pipeline of the pipe network in real time, inputting the data of each pipeline into the trained clustering algorithm, and determining the category of the data of each pipeline", it should be understood that sensors are disposed on each pipeline of the pipe network to obtain the water flow characteristic data N = (E, V, T, M) of each pipeline. The trained clustering algorithm classifies its input data and determines the category to which it belongs, so inputting the data of each pipeline into the trained clustering algorithm determines the category of the data of each pipeline.
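The text does not state how the trained clustering algorithm assigns a category to new data. One natural reading, assumed here, is that a new reading takes the category of its nearest class center, using the same squared distance as step S12. The function name and argument layout are hypothetical.

```python
def classify(features, centers):
    """Return the index of the class center closest to `features`."""
    return min(
        range(len(centers)),
        key=lambda i: sum((x - c) ** 2 for x, c in zip(features, centers[i])),
    )
```

For example, with `centers = [(0.0, 0.0), (10.0, 10.0)]`, a reading of `(1.0, 2.0)` falls in category 0 and a reading of `(9.0, 8.0)` in category 1.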

The method for predicting the water flow of the pipe network, provided by the embodiment of the invention, is based on a K-Medoids clustering algorithm and a long-short term memory network model algorithm, can solve the problems that the traditional unsupervised clustering algorithm is insufficient in robustness under abnormal data and needs to manually search for an optimal K value, and also solves the problem that a single long-short term memory network model is insufficient in prediction accuracy under different environments.

Referring to fig. 3, which is a block diagram of a structure of a prediction apparatus 100 for pipe network water flow rate according to an embodiment of the present invention, the prediction apparatus 100 for pipe network water flow rate according to the embodiment of the present invention includes:

the sample acquisition module 1 is used for acquiring the acquired water flow characteristic data of the pipe network as a sample;

the clustering module 2 is used for clustering the samples based on a clustering algorithm, calculating the Davies-Bouldin index under different total numbers of clusters, selecting the total number of clusters that minimizes the Davies-Bouldin index as the final total number of clusters, and obtaining a data set corresponding to each clustered category;

the long-short term memory model training module 3 is used for establishing k long-short term memory network models, inputting the data sets corresponding to the clustered categories into the k long-short term memory network models respectively for training, and obtaining k trained long-short term memory network models, wherein each trained long-short term memory network model corresponds to one category, and k is equal to the final total number of clusters;

the classification determining module 4 is used for acquiring data of each pipeline of the pipe network in real time, inputting the data of each pipeline into the trained clustering algorithm, and determining the category of the data of each pipeline;

and the total water flow calculation module 5 is used for respectively inputting the data of each pipeline into the long-short term memory network model corresponding to the type of the data of each pipeline to obtain the short-term water flow of each pipeline, and accumulating the short-term water flow of each pipeline to obtain the total water flow of the current large-scale pipe network.

As an optional embodiment, in the sample obtaining module 1, the water flow characteristic data of the pipe network includes water pressure, water flow speed, current water temperature, and diameter of the pipe, and the sample includes a plurality of sample individuals, each sample individual being represented as Z = (E, V, T, M, P), where E is the water pressure, V is the water flow speed, T is the current water temperature, and M is the diameter of the pipe.

As an optional implementation manner, the clustering module 2 specifically includes:

a class center selecting unit that performs step S11: randomly selecting k sample individuals from the samples as k class centers, wherein each class center corresponds to one class, and k is the total number of the current clusters;

As an optional implementation manner, the Davies-Bouldin index determining unit is specifically configured to:

determine the Davies-Bouldin index under the current total number of clusters by:

Specifically, in step S12 executed by the sample distribution unit, "calculating the distance from the remaining sample individuals to each class center", the distance is calculated as $|Z_n - u_i|^2$, where $Z_n$ denotes the n-th sample individual and $u_i$ denotes the i-th class center.

Specifically, in step S13 executed by the class center updating unit, "for each class, selecting the sample individual with the smallest sum of distances to the other sample individuals in that class as the new class center", the distance refers to the Euclidean distance between two sample individuals.

Specifically, in step S16 executed by the total clustering number determining unit, "until k equals to a preset value", the preset value may be determined according to an actual measurement situation, and in the embodiment of the present invention, the value of k should not exceed 20.

In the embodiment of the present invention, referring to Fig. 2, which is an architecture diagram of the model training in the method for predicting water flow of a pipe network provided by the embodiment of the present invention: before training, 1000 sample individuals are first selected from the water flow characteristic data collected by the sensors, giving the sample [Z₁, Z₂, …, Z₁₀₀₀₀]. Based on the K-Medoids clustering algorithm, k class centers u_i are randomly selected from [Z₁, Z₂, …, Z₁₀₀₀₀], where k is the total number of clusters and its initial value is 1. The DB index (that is, the Davies-Bouldin index described above) is calculated for different values of k through step S1, so that [Z₁, Z₂, …, Z₁₀₀₀₀] is clustered and the final total number of clusters k is determined. The resulting k categories of data are then input into k long-short term memory network models respectively for training, to obtain k trained long-short term memory network models.

Another embodiment of the present invention accordingly provides a computer-readable storage medium, which includes a stored computer program, wherein when the computer program runs, the apparatus on which the computer-readable storage medium is located is controlled to perform the steps S0 to S4 described above.

The computer-readable storage medium may be a magnetic disk, an optical disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), or the like.

It should be noted that the above-described device embodiments are merely illustrative, where the units described as separate parts may or may not be physically separate, and the parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. In addition, in the drawings of the embodiment of the apparatus provided by the present invention, the connection relationship between the modules indicates that there is a communication connection between them, and may be specifically implemented as one or more communication buses or signal lines. One of ordinary skill in the art can understand and implement it without inventive effort.

While the foregoing is directed to the preferred embodiment of the present invention, it will be understood by those skilled in the art that various changes and modifications may be made without departing from the spirit and scope of the invention.

Claims

1. A method for predicting water flow of a pipe network is characterized by comprising the following steps:

establishing K long-short term memory network models, and inputting the data sets corresponding to the clustered categories into the K long-short term memory network models respectively for training to obtain K trained long-short term memory network models, wherein each trained long-short term memory network model corresponds to one category, and K is equal to the final total clustering number;

2. The method for predicting water flow of a pipe network according to claim 1, wherein the samples comprise a plurality of sample individuals, the water flow characteristic data of the pipe network comprises water pressure, water flow speed, current water temperature and diameter of the pipe, and each sample individual is expressed as Z = (E, V, T, M, P), where E is the water pressure, V is the water flow speed, T is the current water temperature and M is the diameter of the pipe.

3. The method for predicting water flow of a pipe network according to claim 2, wherein the clustering of the samples based on a clustering algorithm, the calculating of the Davies-Bouldin index under different total numbers of clusters, and the selecting of the total number of clusters that minimizes the Davies-Bouldin index as the final total number of clusters specifically comprise:

4. The method for predicting water flow of a pipe network according to claim 3, wherein the determining of the Davies-Bouldin index under the current total number of clusters based on the current total number of clusters specifically comprises:

determining the Davies-Bouldin index under the current total number of clusters by:

5. A device for predicting water flow in a pipe network, comprising:

and the total water flow calculation module is used for inputting the data of each pipeline into the long-short term memory network model corresponding to the type of the data of each pipeline to obtain the short-term water flow of each pipeline, and accumulating the short-term water flow of each pipeline to obtain the total water flow of the current large-scale pipe network.

6. The device for predicting water flow of a pipe network according to claim 5, wherein, in the sample obtaining module, the samples comprise a plurality of sample individuals, the water flow characteristic data of the pipe network comprises water pressure, water flow speed, current water temperature, and diameter of the pipe, and each sample individual is represented as Z = (E, V, T, M, P), where E is the water pressure, V is the water flow speed, T is the current water temperature, and M is the diameter of the pipe.

7. The prediction device of pipe network water flow according to claim 6, wherein the clustering module specifically comprises:

8. The prediction device of pipe network water flow according to claim 7, wherein the Davies-Bouldin index determining unit is specifically configured to:

determine the Davies-Bouldin index under the current total number of clusters by:

9. A computer-readable storage medium, comprising a stored computer program, wherein the computer program, when executed, controls an apparatus in which the computer-readable storage medium is located to perform the method for predicting water flow in a pipe network according to any one of claims 1 to 4.