CN113205368B - Industrial and commercial customer clustering method based on time sequence water consumption data - Google Patents
- Publication number: CN113205368B
- Application number: CN202110569868.3A
- Authority: CN (China)
- Prior art keywords: industrial, commercial, water consumption, value, class
- Legal status: Active (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/02—Marketing; Price estimation or determination; Fundraising
- G06Q30/0201—Market modelling; Market analysis; Collecting market data
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
- G06F18/232—Non-hierarchical techniques
- G06F18/2321—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
- G06F18/23213—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/044—Recurrent networks, e.g. Hopfield networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/06—Energy or water supply
Abstract
The invention discloses a method for clustering industrial and commercial customers based on time-series water consumption data, comprising the following steps: 1. constructing daily water consumption data for the industrial and commercial customers and preprocessing the data; 2. learning a representation of the time-series water consumption data with an LSTM model; 3. clustering the customers by water use trend; 4. within each trend cluster, further clustering the customers by water use range; 5. visualizing the clustering results. Through the LSTM model, the invention learns the water use patterns and trend information hidden in the customers' time-series water consumption data and uses them as the customers' water use feature representation; combined with the kmeans algorithm, it clusters the customers accurately and quickly on two factors, water use trend and water use range.
Description
Technical Field
The invention relates to the technical field of user clustering, in particular to a time sequence data-based industrial and commercial customer clustering method.
Background
In existing research on user clustering, the kmeans algorithm performs well on static data. However, it measures the similarity between industrial and commercial customers with the Euclidean distance, which ignores the order of the time points: it captures only similarity in the amount of water consumed and cannot describe how water use characteristics change over time.
The water consumption data of industrial and commercial customers is time-series data: consumption is recorded at fixed intervals, such as daily, weekly or monthly, and the records hide latent time-series water use features of many customers, such as the water use cycle and the water use pattern. In time-series data, the state of a sample at a given moment is related to its state at the preceding and following moments, so the difficulty lies in analyzing the change rules in the data while taking those neighboring states into account.
Disclosure of Invention
The invention aims to overcome the defects of the prior art by providing a method for clustering industrial and commercial customers based on time-series data. An LSTM model learns and represents the water use patterns hidden in the customers' time-series water consumption data, and the kmeans algorithm then clusters the customers on two aspects, water use trend and water use range. This improves clustering accuracy and helps mine the change rules and trends hidden in the time-series data, so that the customers' water use patterns are characterized accurately and completely.
In order to achieve the purpose, the invention adopts the following technical scheme:
The invention relates to a method for clustering industrial and commercial customers based on time-series water consumption data, characterized by comprising the following steps:
Step 1, constructing daily water consumption data of the industrial and commercial customers;
Step 1.1, obtaining the remote water meter data of the industrial and commercial customers, and extracting from it the customer id, water meter update time, cumulative water flow, remote water meter address and customer name;
Step 1.2, converting each remote water meter address into longitude and latitude to obtain the longitude and latitude information of the customer;
Step 1.3, splitting the remote water meter data by customer id to obtain m water meter data files, each named after a customer id, and sorting the data in each file by water meter update time, where m represents the total number of industrial and commercial customers;
Step 1.4, for each customer and each day, taking the difference between the cumulative flow value at the last water meter update time and the cumulative flow value at the first water meter update time of that day, thereby constructing the daily water consumption vectors of the m customers over t days, $x_i = (x_i^1, x_i^2, \ldots, x_i^t)$, where $x_i^t$ represents the daily water consumption value of the i-th customer on day t and t represents the number of water consumption days; the sample feature set formed by the daily water consumption vectors of the m customers is denoted $X = \{x_i \mid i = 1, 2, \ldots, m\}$;
Step 1.5, detecting and processing abnormal values in the sample feature set $X$ to obtain the sample feature set $X'$ after abnormal-value processing;
Step 1.6, processing the missing values of the sample feature set $X'$ to obtain the sample feature set $X''$ after missing-value processing;
Step 2, representing the features of the time-series water consumption data based on an LSTM model;
Step 2.1, normalizing the sample feature set $X''$ after missing-value processing to obtain the normalized sample feature set, denoted $\tilde{X} = \{\tilde{x}_i \mid i = 1, 2, \ldots, m\}$, where $\tilde{x}_i = (\tilde{x}_i^1, \tilde{x}_i^2, \ldots, \tilde{x}_i^t)$ is the normalized daily water consumption vector of the i-th customer over t days and $\tilde{x}_i^t$ is the normalized daily water consumption value of the i-th customer on day t;
Step 2.2, pre-training an LSTM model;
dividing the normalized sample feature set $\tilde{X}$ into a training set and a verification set, and determining the epoch value, the batch-size value and the prediction step-size value for training the LSTM model;
inputting the training set into the LSTM model to obtain a prediction sequence for the verification set, then calculating the root-mean-square error between the prediction sequence output by the LSTM model and the verification set, which completes one training pass of the LSTM model; training stops when the number of training passes reaches the epoch value, and the trained LSTM model is used as the customer time-series water use feature extraction model;
Step 2.3, inputting the daily water consumption data of all industrial and commercial customers into the time-series water use feature extraction model and outputting the water use feature vectors $Y = \{y_i \mid i = 1, 2, \ldots, m\}$, where $y_i = (y_i^1, y_i^2, \ldots, y_i^n)$ is the water use feature vector of the i-th customer, $y_i^n$ is its n-th dimension feature value, and n is the dimension of the water use feature vector;
Step 3, applying the kmeans clustering algorithm to the water use feature vectors $Y = \{y_i \mid i = 1, 2, \ldots, m\}$ to cluster the industrial and commercial customers based on water use trend;
Step 3.1, determining the optimal number of clusters by combining the elbow method and the silhouette coefficient method, and denoting it K;
Step 3.2, based on the optimal number of clusters K, taking the water use feature vector $y_i$ of the i-th customer as a sample and inputting it into the kmeans algorithm, so that the customers represented in Y are grouped into K clusters; the coordinates of the K cluster centers are randomly initialized using formula (1):

$$c_k = (c_k^1, c_k^2, \ldots, c_k^n), \quad k = 1, 2, \ldots, K \tag{1}$$

In formula (1), $c_k$ denotes the center of the k-th cluster and $c_k^n$ denotes the n-th dimension coordinate of the k-th cluster center;
Step 3.3, calculating the Euclidean distance $d(y_i, c_k)$ from the sample $y_i$ to the k-th cluster center $c_k$ using formula (2), thereby obtaining the Euclidean distance from $y_i$ to the center of every cluster:

$$d(y_i, c_k) = \sqrt{\sum_{l=1}^{n} \left(y_i^l - c_k^l\right)^2} \tag{2}$$

Step 3.4, according to the Euclidean distances from the sample $y_i$ to the cluster centers, assigning $y_i$ to the cluster whose center is nearest;
Step 3.5, after all samples have been assigned to their clusters, K classes are obtained; the set of customers in the k-th class is denoted $C_k = \{y_j^{(k)} \mid j = 1, 2, \ldots, S_k\}$, where $y_j^{(k)} = (y_j^{(k),1}, y_j^{(k),2}, \ldots, y_j^{(k),n})$ represents the feature vector of the j-th customer in the k-th class, $S_k$ represents the number of customers in the k-th class, and $y_j^{(k),n}$ represents the n-th dimension feature value of the j-th customer in the k-th class;
calculating the mean of the feature vectors of the customers in the k-th class using formula (6), thereby obtaining the updated cluster center $\bar{c}_k = (\bar{c}_k^1, \bar{c}_k^2, \ldots, \bar{c}_k^n)$, and assigning it to $c_k$, $k = 1, 2, \ldots, K$:

$$\bar{c}_k^{\,n} = \frac{1}{S_k} \sum_{j=1}^{S_k} y_j^{(k),n} \tag{6}$$

In formula (6), $\bar{c}_k^{\,n}$ denotes the n-th dimension coordinate of the updated k-th cluster center;
Step 3.6, repeating steps 3.3 to 3.5 until the cluster centers no longer change, then outputting the final cluster centers and the customer ids in each class, so that customers with similar water use trends are grouped into one class;
Step 4, clustering the industrial and commercial customers based on water use range;
Step 4.1, based on the result of the water use trend clustering, obtaining the set of customers in each class, $B_k = \{b_j \mid j = 1, 2, \ldots, S'_k\}$, where $b_j$ represents the j-th customer in the k-th class, $k \in \{1, 2, \ldots, K\}$, and $S'_k$ represents the number of customers in the k-th class when the clustering algorithm converges;
Step 4.2, obtaining the normalized real daily water consumption values of the j-th customer $b_j$ in the k-th class, denoted $\tilde{x}_{b_j} = (\tilde{x}_{b_j}^1, \tilde{x}_{b_j}^2, \ldots, \tilde{x}_{b_j}^t)$, where $\tilde{x}_{b_j}^t$ represents the normalized real daily water consumption value of $b_j$ on day t, $j = 1, 2, \ldots, S'_k$; repeating steps 3.1 to 3.6 to re-cluster the real daily water consumption values of the customers within each class, so that customers with similar water use trends and similar water consumption are grouped into one class;
Step 5, visualizing the clustering result;
Step 5.1, for each class in the clustering result of step 4, taking the mean vector of the daily water consumption vectors of all customers in the class as the class center, calculating the mean daily water consumption of the customers in the class, and plotting, class by class in a two-dimensional coordinate system, the water consumption of the customers in the class, the class center and the mean daily water consumption;
Step 5.2, taking the class centers of all clusters calculated in step 5.1 and plotting the K class centers together in one two-dimensional coordinate system;
Step 5.3, obtaining the customer ids in each class of the clustering result of step 4.2, and drawing a map from the customers' longitude and latitude information to visualize the geographical position of the customers in each class.
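The iteration in steps 3.2 to 3.6 can be sketched in a few lines of NumPy. This is a minimal illustration rather than the patented implementation: the function name `kmeans`, the `max_iter` and `seed` parameters, and the choice to draw the initial centers from the samples themselves are assumptions (the text only says the centers are randomly initialized).

```python
import numpy as np

def kmeans(Y, K, max_iter=100, seed=0):
    """Plain k-means on feature vectors Y (shape m x n)."""
    rng = np.random.default_rng(seed)
    # Step 3.2: random initial centers, drawn here from the samples (assumption).
    centers = Y[rng.choice(len(Y), size=K, replace=False)].astype(float)
    for _ in range(max_iter):
        # Step 3.3: Euclidean distance from every sample to every center.
        dists = np.linalg.norm(Y[:, None, :] - centers[None, :, :], axis=2)
        # Step 3.4: assign each sample to its nearest center.
        labels = dists.argmin(axis=1)
        # Step 3.5: new center = mean of the members (old center kept if a cluster is empty).
        new_centers = np.array([
            Y[labels == k].mean(axis=0) if np.any(labels == k) else centers[k]
            for k in range(K)
        ])
        # Step 3.6: stop once the centers no longer change.
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

# Toy feature vectors: two customers near 0 and two near 10 on the first axis.
Y = np.array([[0.0, 0.0], [1.0, 0.0], [10.0, 0.0], [11.0, 0.0]])
labels, centers = kmeans(Y, K=2)
```

For step 4, the same function could be called once per trend cluster on the normalized daily consumption vectors of that cluster's members.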
Compared with the prior art, the invention has the beneficial effects that:
1. The invention takes differences of the cumulative flow data and fills missing and abnormal values from adjacent data to construct the daily water consumption data of each industrial and commercial customer. This completes the preprocessing of the time-series water consumption data, improves the quality of the data used for mining, and helps improve the efficiency of actual data mining.
2. The invention constructs and trains an LSTM model that learns the water use patterns hidden in the customers' time-series water consumption data and represents each time series as a static feature vector of a specified dimension. This gives a feature representation of each customer's water use trend, describes the customer's water use change rules accurately, and improves the accuracy of the subsequent clustering algorithm.
3. The invention uses the kmeans algorithm to compute the similarity of the water use feature vectors produced by the LSTM model, thereby clustering the customers by water use trend. Customers with similar water use trends fall into one class, which helps mine and analyze the different water use change rules that appear in different clusters.
4. For customers with similar water use trends, the invention applies the kmeans algorithm again to the real daily water consumption data, clustering the customers by water use range. Customers with similar trends and similar ranges fall into one class, which helps analyze and compare the differences in water use range among customers within the same trend cluster, and thus characterizes the customers' water use patterns accurately and completely.
Drawings
FIG. 1 is a process flow diagram for industrial and commercial customer clustering in accordance with the present invention;
FIG. 2 is a diagram of the single cell state structure of the long short term memory model (LSTM) of the present invention;
FIG. 3 is a flow chart of the kmeans clustering algorithm of the present invention.
Detailed Description
In this embodiment, a method for clustering industrial and commercial businesses based on time-series water consumption data, specifically, as shown in fig. 1, is performed according to the following steps:
step1, building daily water consumption data of industrial and commercial businesses;
step 1.1, obtaining remote water meter data of industrial and commercial customers, and extracting industrial and commercial customer id, water meter updating time, accumulated water flow, industrial and commercial customer remote water meter address and industrial and commercial customer name in the remote water meter data;
In the specific implementation, the remote water meter data records the cumulative water flow of all industrial and commercial customers over the 366 days from 2020-01-01 to 2020-12-31, with each meter's cumulative flow updated every hour on the hour. The raw data is unordered and stored in a single large csv file, so the required columns (customer id, water meter update time, cumulative water flow, remote water meter address and customer name) are extracted first to reduce the file size;
step 1.2, carrying out longitude and latitude conversion on the remote water meter address of the industrial and commercial tenant to obtain longitude and latitude information of the industrial and commercial tenant;
In this embodiment, the customer's address is converted into longitude and latitude information by calling the high-resolution map API, as follows:
Step1, constructing the request URL of the high-resolution map and entering the address in the keywords field of the URL, thereby obtaining the URL for the address.
Step2, sending a request to the URL; in Python, `requests.get(url).text` returns the page information corresponding to the URL as string data.
Step3, since the page data obtained in Step2 is returned in json format, parsing it with `json.loads()` in Python to convert it into dictionary data.
Step4, extracting the longitude and latitude information from the dictionary obtained in Step3 according to its keys and values.
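Step3 and Step4 above can be illustrated offline as follows. The response layout used here, a `geocodes` list whose entries carry a `location` string of the form "longitude,latitude", is a hypothetical example and is not taken from the patent; the function name `extract_lng_lat` is likewise illustrative. The HTTP request of Step2 is omitted so that the sketch runs without network access.

```python
import json

def extract_lng_lat(page_text):
    """Parse a geocoding response string and return (longitude, latitude), or None."""
    data = json.loads(page_text)              # Step3: JSON string -> dictionary
    geocodes = data.get("geocodes") or []     # hypothetical key holding the matches
    if not geocodes:
        return None                           # address could not be resolved
    # Step4: hypothetical "longitude,latitude" string stored under "location"
    lng, lat = geocodes[0]["location"].split(",")
    return float(lng), float(lat)

# A made-up response string standing in for requests.get(url).text
sample = '{"status": "1", "geocodes": [{"location": "117.227,31.820"}]}'
coords = extract_lng_lat(sample)
```

Returning `None` for an unresolved address lets the caller decide whether to skip that customer on the map in step 5.3.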
Step 1.3, remote water meter data are divided according to industrial and commercial customer id to obtain m water meter data files named by the industrial and commercial customer id, and all data in the water meter data files are arranged according to the sequence of water meter updating time; wherein m represents the total number of industrial and commercial businesses;
In the specific implementation, the data remaining after the key columns are extracted is still very large, which would greatly reduce processing efficiency. The large csv file is therefore split by the customer id column into separate files, each named after a customer's id, giving the independent remote water meter records of each industrial and commercial customer.
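A minimal pandas sketch of this split is given below. The column names `customer_id`, `update_time` and `cumulative_flow`, the function name and the output directory are assumptions made for illustration; the patent does not give the actual csv header.

```python
import os
import tempfile
import pandas as pd

def split_by_customer(df, out_dir, id_col="customer_id", time_col="update_time"):
    """Write one csv per customer id, sorted by water meter update time."""
    os.makedirs(out_dir, exist_ok=True)
    paths = []
    for cust_id, part in df.groupby(id_col):
        path = os.path.join(out_dir, f"{cust_id}.csv")   # file named after the id
        part.sort_values(time_col).to_csv(path, index=False)
        paths.append(path)
    return paths

# Tiny unordered example standing in for the large csv file
records = pd.DataFrame({
    "customer_id": ["A", "B", "A"],
    "update_time": pd.to_datetime(
        ["2020-01-01 01:00", "2020-01-01 00:00", "2020-01-01 00:00"]),
    "cumulative_flow": [12.0, 7.0, 10.0],
})
out_dir = tempfile.mkdtemp()
paths = split_by_customer(records, out_dir)
```

For a file too large to load at once, the same grouping could be applied chunk by chunk with `pd.read_csv(..., chunksize=...)`, appending to each customer's file.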
Step 1.4, for each customer and each day, taking the difference between the cumulative flow value at the last water meter update time and the cumulative flow value at the first water meter update time of that day, thereby constructing the daily water consumption vectors of the m customers over t days, $x_i = (x_i^1, x_i^2, \ldots, x_i^t)$, where $x_i^t$ represents the daily water consumption value of the i-th customer on day t and t represents the number of water consumption days; the sample feature set formed by the daily water consumption vectors of the m customers is denoted $X = \{x_i \mid i = 1, 2, \ldots, m\}$;
In this embodiment, there are 2158 industrial and commercial customers in total with complete water records for the 366 days from 2020-01-01 to 2020-12-31. Suppose the cumulative flow of customer A at 00:00 on January 1, 2020 is x and its cumulative flow at 24:00 on January 1, 2020 is y; then the daily water consumption of customer A on January 1, 2020 is (y - x). The daily water consumption of all customers for all 366 days is calculated in the same way.
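As a sketch of this differencing for a single customer, assuming the hourly readings are held in a pandas Series indexed by timestamp (the function name and the toy values are illustrative), the first and last readings of each calendar day can be subtracted as in step 1.4:

```python
import pandas as pd

def daily_consumption(readings):
    """readings: cumulative flow of one customer, indexed by update timestamp."""
    readings = readings.sort_index()
    per_day = readings.groupby(readings.index.date)
    # last reading of the day minus first reading of the day
    return per_day.last() - per_day.first()

readings = pd.Series(
    [100.0, 103.0, 108.0, 108.0, 115.0],
    index=pd.to_datetime([
        "2020-01-01 00:00", "2020-01-01 12:00", "2020-01-01 23:00",
        "2020-01-02 00:00", "2020-01-02 23:00",
    ]),
)
daily = daily_consumption(readings)
```

Stacking the resulting series of all m customers row by row would give the m x t sample feature set X.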
Step 1.5, detecting and processing abnormal values in the sample feature set $X$ to obtain the sample feature set $X'$ after abnormal-value processing;
Because the remote water meter sometimes fails to record the cumulative flow at the first or the last moment of a day, abnormal values appear when the daily water consumption is calculated. Such abnormal values are replaced with a special value (null);
In the specific implementation, abnormal remote water meter records can leave a customer's cumulative flow at 00:00 or at 24:00 equal to 0, so that a correct daily water consumption value cannot be calculated. A conditional check is therefore applied when the daily water consumption is calculated: if the cumulative flow at 00:00 equals 0 or the cumulative flow at 24:00 equals 0, the daily water consumption is set to a null value; otherwise, daily water consumption = cumulative flow at 24:00 - cumulative flow at 00:00;
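The conditional check can be written as a small helper. The function name is illustrative, and treating an already missing reading (`None`) like a zero reading is an added assumption, not something the text specifies.

```python
def daily_usage(flow_start, flow_end):
    """Daily consumption from the 00:00 and 24:00 cumulative readings, or None."""
    # A reading of 0 marks an abnormal meter record; None (assumption) a missing one.
    if flow_start is None or flow_end is None or flow_start == 0 or flow_end == 0:
        return None
    return flow_end - flow_start

normal = daily_usage(100.0, 108.5)
abnormal = daily_usage(0, 108.5)
```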
Step 1.6, processing the missing values of the sample feature set $X'$ to obtain the sample feature set $X''$ after missing-value processing;
The remote water meter records may lack a customer's water consumption record for an entire day, so that the daily water consumption of that day cannot be calculated correctly. The missing daily water consumption value is assigned a special value (null);
In the specific implementation, abnormal remote water meter records can cause all of a customer's data for a given day to be lost, so the daily water consumption of that day cannot be calculated. A conditional check is therefore applied: if the record for a given date (xx year-xx month-xx day) is null, the daily water consumption of that day is set to a null value;
In this embodiment, the null values can be filled with the pandas `fillna()` method in Python. Because industrial and commercial customers are generally large water users, their consumption is relatively stable and the daily values of adjacent days are similar. The call can therefore be `fillna(method="ffill", axis=1)`, which fills a null value with the previous (column) value in the same row, or `fillna(method="bfill", axis=1)`, which fills it with the next (column) value in the same row;
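A small example of this filling on a customer-by-day table follows. It uses `DataFrame.ffill` and `DataFrame.bfill`, which perform the same forward and backward fills as the `fillna(method=...)` calls; chaining the two so that a gap on the first day is also filled is a choice made for this sketch, and the table values are made up.

```python
import numpy as np
import pandas as pd

# Rows are customers, columns are days; NaN marks a null daily value.
usage = pd.DataFrame(
    [[5.0, np.nan, 7.0],
     [np.nan, 3.0, 4.0]],
    index=["A", "B"],
    columns=["2020-01-01", "2020-01-02", "2020-01-03"],
)
# Forward fill along each row, then backward fill what is still empty
# (customer B has no earlier day to copy from on 2020-01-01).
filled = usage.ffill(axis=1).bfill(axis=1)
```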
step2, representing the characteristics of time sequence water consumption data based on an LSTM model;
Step 2.1, normalizing the sample feature set $X''$ after missing-value processing to obtain the normalized sample feature set, denoted $\tilde{X} = \{\tilde{x}_i \mid i = 1, 2, \ldots, m\}$, where $\tilde{x}_i = (\tilde{x}_i^1, \tilde{x}_i^2, \ldots, \tilde{x}_i^t)$ is the normalized daily water consumption vector of the i-th customer over t days and $\tilde{x}_i^t$ is the normalized daily water consumption value of the i-th customer on day t;
The formula of the normalization processing is as follows:

$$\tilde{x} = \frac{x - x_{\min}}{x_{\max} - x_{\min}} \tag{1}$$

In formula (1), $\tilde{x}$ is the normalized value of the variable, $x$ is the original value of the variable, and $x_{\max}$ and $x_{\min}$ are the maximum and minimum values in the raw data, respectively;
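The min-max normalization can be sketched as below. Applying it to one customer's vector at a time, and mapping a constant series to zeros to avoid division by zero, are assumptions of this example; the text does not say whether the maximum and minimum are taken per customer or over the whole data set.

```python
import numpy as np

def min_max_normalize(x):
    """Scale a daily water consumption vector to the range [0, 1]."""
    x = np.asarray(x, dtype=float)
    x_min, x_max = x.min(), x.max()
    if x_max == x_min:
        # Constant series: no spread to scale by (assumption: return zeros).
        return np.zeros_like(x)
    return (x - x_min) / (x_max - x_min)

normalized = min_max_normalize([2.0, 4.0, 6.0])
```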
step 2.2, pre-training an LSTM model;
dividing the normalized sample feature set $\tilde{X}$ into a training set and a verification set, and determining the epoch value, the batch-size value and the prediction step-size value for training the LSTM model;
inputting the training set into the LSTM model to obtain a prediction sequence for the verification set, then calculating the root-mean-square error between the prediction sequence output by the LSTM model and the verification set, which completes one training pass of the model; training stops when the number of training passes reaches the predetermined epoch value, and the trained LSTM model is used as the customer time-series water use feature extraction model;
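The split and the error measure can be illustrated as follows. The 80/20 chronological split and the function names are assumptions, and a naive "repeat the last training value" forecast stands in for the LSTM output so that the example stays self-contained.

```python
import numpy as np

def rmse(predicted, actual):
    """Root-mean-square error between a prediction sequence and the true values."""
    predicted = np.asarray(predicted, dtype=float)
    actual = np.asarray(actual, dtype=float)
    return float(np.sqrt(np.mean((predicted - actual) ** 2)))

def split_train_validation(series, train_ratio=0.8):
    """Chronological split: the earliest part trains, the rest verifies."""
    cut = int(len(series) * train_ratio)
    return series[:cut], series[cut:]

series = np.linspace(0.0, 1.0, 10)           # toy normalized daily values
train, validation = split_train_validation(series)
# Placeholder forecast (not an LSTM): repeat the last training value.
naive_prediction = np.full(len(validation), train[-1])
error = rmse(naive_prediction, validation)
```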
in specific implementation, the structure diagram of the single cell state of the LSTM model is shown in fig. 2, and the pre-training LSTM model comprises the following steps:
step1: training the forget gate, expressed as:
$f_t = \sigma(W_{ft} x_t + W_{fh} h_{t-1} + b_f)$
where $x_t$ is the input sample, $f_t$ is the forget gate value, $\sigma(\cdot)$ denotes the activation function (the sigmoid function is used), $W_{ft}$ and $W_{fh}$ are the weight coefficients between the forget gate and $x_t$ and $h_{t-1}$ respectively, $h_{t-1}$ is the hidden state at time t-1, and $b_f$ is the forget gate bias coefficient;
step2: training the input gate, expressed as:
$g_t = \sigma(W_{gt} x_t + W_{gh} h_{t-1} + b_g)$
where $g_t$ is the input gate value, $W_{gt}$ and $W_{gh}$ are the weight coefficients between the input gate and $x_t$ and $h_{t-1}$ respectively, and $b_g$ is the input gate bias coefficient;
step3: updating the memory cell, expressed as:
$s_t = f_t s_{t-1} + g_t \tanh(W_{st} x_t + W_{sh} h_{t-1} + b_s)$
where $s_t$ is the cell state, $W_{st}$ and $W_{sh}$ are the weight coefficients between the cell and $x_t$ and $h_{t-1}$ respectively, and $b_s$ is the corresponding bias coefficient of the cell;
step4: updating the current state of the output gate, where the activation function is the tanh function;
step5: repeating step1 to step4 until the model converges.
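The gate equations of step1 to step3 can be traced with a scalar sketch in which the input, hidden state and cell state are single numbers. The source gives no formula for the output gate in step4, so the standard form (a sigmoid gate multiplied by the tanh of the cell state) is assumed here; the parameter names `W_ot`, `W_oh`, `b_o` and all numeric values are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x_t, h_prev, s_prev, p):
    """One LSTM time step for scalar input and state, following step1 to step4."""
    f_t = sigmoid(p["W_ft"] * x_t + p["W_fh"] * h_prev + p["b_f"])   # forget gate
    g_t = sigmoid(p["W_gt"] * x_t + p["W_gh"] * h_prev + p["b_g"])   # input gate
    candidate = math.tanh(p["W_st"] * x_t + p["W_sh"] * h_prev + p["b_s"])
    s_t = f_t * s_prev + g_t * candidate                              # cell state
    # Assumed standard output gate; the source only names tanh as the activation.
    o_t = sigmoid(p["W_ot"] * x_t + p["W_oh"] * h_prev + p["b_o"])
    h_t = o_t * math.tanh(s_t)
    return h_t, s_t


NAMES = ("W_ft", "W_fh", "b_f", "W_gt", "W_gh", "b_g",
         "W_st", "W_sh", "b_s", "W_ot", "W_oh", "b_o")
params = {name: 0.5 for name in NAMES}   # hypothetical weights, not trained values

h, s = 0.0, 0.0
for x in [0.2, 0.4, 0.6]:                # hypothetical normalized daily values
    h, s = lstm_step(x, h, s, params)
print(h, s)
```

Training (step5) would adjust the entries of `params` by gradient descent; the sketch covers only the forward computation.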
Step 2.3, inputting the daily water consumption data of all industrial and commercial customers into the customer time-series water-use feature extraction model, and outputting the water-use feature vectors $Y = \{y_i \mid i = 1,2,\dots,m\}$ of the industrial and commercial customers; where $y_i = (y_i^{1}, y_i^{2}, \dots, y_i^{n})$ is the water-use feature vector of the i-th industrial and commercial customer, $y_i^{n}$ is its n-th dimension feature value, and n is the dimension of the water-use feature vector; in one embodiment, the trained model is saved in a test.h5 file, which stores the parameters and other information of the finally determined LSTM model; the dimension of the model output vector is set to 64, that is, the 366-dimensional original daily water consumption data is converted into a 64-dimensional feature vector that serves as the feature of each industrial and commercial customer.
Step3, using the kmeans clustering algorithm to cluster the industrial and commercial customers by water-use trend based on their water-use feature vectors $Y = \{y_i \mid i = 1,2,\dots,m\}$, as shown in fig. 3;
step 3.1, determining the optimal number of clusters by combining the elbow method and the silhouette coefficient method, and recording it as K; the formula of the elbow method is as follows:
$$SSE = \sum_{i=1}^{K} \sum_{p \in C_i} \lVert p - m_i \rVert^{2} \tag{2}$$
in formula (2), K is the number of clusters, $C_i$ denotes the i-th cluster, p is a sample point in $C_i$, $m_i$ is the centroid of $C_i$ (the mean of all samples in $C_i$), and SSE is the sum of squared clustering errors over all samples;
as the number of clusters k increases, the samples are divided more finely and each cluster becomes more compact, so the sum of squared errors SSE gradually decreases; when k is smaller than the true number of clusters, increasing k greatly increases the compactness of each cluster, so SSE drops sharply; once k reaches the true number of clusters, the drop in SSE shrinks abruptly and then levels off as k continues to increase; the plot of SSE against k therefore has the shape of an elbow, and the k value at the elbow is the true number of clusters of the data;
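The SSE of formula (2) can be computed directly from a partition of the samples. The sketch below assumes every cluster is non-empty and uses hypothetical two-dimensional points; it shows the sharp drop in SSE when k moves from one cluster to the two groups actually present in the data.

```python
def sse(clusters):
    """Sum of squared errors (formula (2)); clusters is a list of lists of points."""
    total = 0.0
    for points in clusters:
        dim = len(points[0])
        # Centroid m_i: the mean of all samples in the cluster.
        centroid = [sum(p[d] for p in points) / len(points) for d in range(dim)]
        total += sum(
            sum((p[d] - centroid[d]) ** 2 for d in range(dim)) for p in points
        )
    return total


# Hypothetical data with two well-separated groups.
one_cluster = [[[0.0, 0.0], [0.0, 2.0], [10.0, 0.0], [10.0, 2.0]]]
two_clusters = [[[0.0, 0.0], [0.0, 2.0]], [[10.0, 0.0], [10.0, 2.0]]]

print(sse(one_cluster))   # 104.0
print(sse(two_clusters))  # 4.0
```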
next, the formula of the silhouette coefficient method is as follows:
$$S(i) = \frac{b(i) - a(i)}{\max\{a(i),\, b(i)\}} \tag{3}$$
in formula (3), S(i) is the silhouette coefficient of the i-th sample; a(i) is the intra-cluster dissimilarity, i.e. the mean distance from the i-th sample to the other samples in its own cluster; b(i) is the inter-cluster dissimilarity, i.e. the minimum, over all clusters that do not contain the i-th sample, of the mean distance from the i-th sample to all samples in that cluster; the mean of S(i) over all samples is called the silhouette coefficient of the clustering result;
$S(i) \in [-1, 1]$; the closer the silhouette coefficient is to 1, the more reasonable the clustering of sample i, so a k value with a larger silhouette coefficient is selected;
in this embodiment, the plots of SSE against k and of the silhouette coefficient against k can be computed and drawn at the same time, and an optimal K value is determined by combining the elbow of the SSE plot with a local optimum of the silhouette coefficient plot, so that intra-class similarity in the clustering result is as high as possible and inter-class similarity is as low as possible;
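A small sketch of formula (3) follows, computing the mean silhouette coefficient for a labelled set of points with the Euclidean distance. It assumes at least two clusters; scoring a singleton cluster as 0 is a common convention rather than something the source specifies. The sample points and labels are hypothetical.

```python
import math


def silhouette_score(points, labels):
    """Mean of S(i) = (b(i) - a(i)) / max(a(i), b(i)) over all samples."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)   # assumed convention for singleton clusters
            continue
        # a(i): mean distance to the other members (distance to itself is 0).
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b(i): smallest mean distance to the members of any other cluster.
        b = min(
            sum(math.dist(p, q) for q in other) / len(other)
            for other_lab, other in clusters.items()
            if other_lab != lab
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom else 0.0)
    return sum(scores) / len(scores)


points = [[0.0, 0.0], [0.0, 1.0], [10.0, 0.0], [10.0, 1.0]]
good = silhouette_score(points, [0, 0, 1, 1])   # natural grouping
bad = silhouette_score(points, [0, 1, 0, 1])    # deliberately poor grouping
print(good, bad)
```

The natural grouping scores close to 1 and the poor grouping scores below 0, which is the behaviour the selection rule in step 3.1 relies on.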
step 3.2, based on the optimal number of clusters K, taking the water-use feature vector $y_i$ of the i-th industrial and commercial customer as a sample to be clustered and inputting it into the kmeans algorithm, so that the industrial and commercial customers represented in Y are grouped into K clusters; the coordinates of the K cluster centers are randomly initialized using formula (4):
$$c_k = (c_k^{1}, c_k^{2}, \dots, c_k^{n}), \quad k = 1,2,\dots,K \tag{4}$$
in formula (4), $c_k$ denotes the center of the k-th cluster, and $c_k^{n}$ denotes the coordinate value of the n-th dimension of the k-th cluster center;
step 3.3, calculating the Euclidean distance $d(y_i, c_k)$ from the sample $y_i$ to the k-th cluster center $c_k$ using formula (5), thereby obtaining the Euclidean distance from the sample $y_i$ to the center of each cluster:
$$d(y_i, c_k) = \sqrt{\sum_{l=1}^{n} \left(y_i^{l} - c_k^{l}\right)^{2}} \tag{5}$$
step 3.4, according to the Euclidean distances from the sample $y_i$ to the centers of the clusters, assigning the sample $y_i$ to the cluster with the shortest Euclidean distance;
step 3.5, assigning all samples to the clusters to which they belong to obtain K classes, and recording the set of industrial and commercial customers in the k-th class as $Y_k = \{y_{k,j} \mid j = 1,2,\dots,S_k\}$, where $y_{k,j} = (y_{k,j}^{1}, y_{k,j}^{2}, \dots, y_{k,j}^{n})$ is the feature vector of the j-th industrial and commercial customer in the k-th class, $S_k$ is the number of industrial and commercial customers in the k-th class, and $y_{k,j}^{n}$ is the n-th dimension feature value of the j-th industrial and commercial customer in the k-th class;
calculating the mean of the feature vectors of the industrial and commercial customers in the k-th class using formula (6), thereby obtaining the updated cluster center $\hat{c}_k$, and assigning it to $c_k$, $k = 1,2,\dots,K$;
$$\hat{c}_k^{n} = \frac{1}{S_k} \sum_{j=1}^{S_k} y_{k,j}^{n} \tag{6}$$
in formula (6), $\hat{c}_k^{n}$ denotes the coordinate value of the n-th dimension of the updated k-th cluster center;
step 3.6, repeating steps 3.3 to 3.5 until the cluster centers no longer change, and outputting the final cluster centers and the ids of the industrial and commercial customers in each class, so that industrial and commercial customers with similar water-use trends are grouped into one class;
in a specific implementation, step 3 clusters the industrial and commercial customers with the kmeans algorithm, using the Euclidean distance as the similarity measure, based on the water-use trend feature vectors learned and output by the long short-term memory model in step 2; at this point, the LSTM network has learned what to store, discard and read in the long-term state of each customer's time-series water-use sequence and has captured the long-term water-use trend in the time-series water-use data, so applying the kmeans algorithm to the 64-dimensional feature vectors output by the model clusters the industrial and commercial customers by water-use trend;
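Steps 3.2 to 3.6 can be sketched as a plain k-means loop. This is an illustrative implementation under stated assumptions, not the patent's code: the initial centers are drawn from the samples themselves (the source only says they are initialized randomly), an empty cluster keeps its previous center, and the two-dimensional sample vectors stand in for the 64-dimensional LSTM features.

```python
import math
import random


def kmeans(samples, k, seed=0, max_iter=100):
    """Plain k-means with the Euclidean distance (steps 3.2 to 3.6)."""
    rng = random.Random(seed)
    # Step 3.2 (assumption): use k randomly chosen samples as initial centers.
    centers = [list(c) for c in rng.sample(samples, k)]
    labels = [0] * len(samples)
    for _ in range(max_iter):
        # Steps 3.3 and 3.4: assign each sample to its nearest center.
        labels = [
            min(range(k), key=lambda j: math.dist(y, centers[j])) for y in samples
        ]
        # Step 3.5: recompute each center as the mean of its members.
        new_centers = []
        for j in range(k):
            members = [y for y, lab in zip(samples, labels) if lab == j]
            if not members:
                new_centers.append(centers[j])   # keep the center of an empty cluster
                continue
            dim = len(members[0])
            new_centers.append(
                [sum(m[d] for m in members) / len(members) for d in range(dim)]
            )
        # Step 3.6: stop when the centers no longer change.
        if new_centers == centers:
            break
        centers = new_centers
    return centers, labels


# Hypothetical feature vectors forming two obvious groups.
samples = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
           [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]]
centers, labels = kmeans(samples, k=2)
print(labels)
```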
step4, clustering industrial and commercial customers based on the range of water consumption;
step 4.1, based on the result of the water-use trend clustering, obtaining the set of industrial and commercial customers in each class as $B_k = \{b_j \mid j = 1,2,\dots,S'_k\}$, where $b_j$ denotes the j-th industrial and commercial customer in the k-th class, $k \in \{1,2,\dots,K\}$, and $S'_k$ denotes the number of industrial and commercial customers in the k-th class when the clustering algorithm converges;
step 4.2, obtaining the normalized real daily water consumption value on day t of the j-th industrial and commercial customer $b_j$ in the k-th class, $j = 1,2,\dots,S'_k$; repeating the process from step 3.1 to step 3.6 to re-cluster the real daily water consumption values of the industrial and commercial customers in each class, so that industrial and commercial customers with similar water-use trends and similar water consumption are grouped into one class;
in this embodiment, the clustering result of step 3 groups many industrial and commercial customers with very similar water-use trends but widely different water-use ranges into one class, whereas the aim is to group customers with similar water-use trends and only slightly different water-use ranges into one class; therefore, after the trend-based clustering of step 3, the kmeans algorithm is applied again to the original daily water consumption data of the customers within each class; at this stage the water-use pattern in the time-series data no longer needs to be captured, so the sensitivity of the kmeans algorithm to numerical values is exploited to distinguish water consumption magnitudes, and the customers are re-clustered by water-use range within each trend class; during this clustering, the optimal number of clusters is again determined by combining the elbow method and the silhouette coefficient method, so that intra-class similarity is as high as possible and inter-class similarity is as low as possible. Finally, industrial and commercial customers with similar water-use trends and similar water-use ranges are grouped into one class;
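The reason a second, magnitude-based pass is needed can be illustrated with two hypothetical customers whose consumption curves have the same shape but differ by a factor of 100. The sketch assumes per-series min-max scaling as a stand-in for the trend view and raw values for the range view; the data are invented for illustration.

```python
import math


def min_max(series):
    """Per-series min-max scaling, used here as a stand-in for the trend view."""
    lo, hi = min(series), max(series)
    return [(v - lo) / (hi - lo) for v in series]


# Two hypothetical customers: same shape, very different volume.
small = [1.0, 2.0, 3.0, 2.0]
large = [100.0, 200.0, 300.0, 200.0]

shape_gap = math.dist(min_max(small), min_max(large))   # distance between shapes
volume_gap = math.dist(small, large)                    # distance between raw values

print(shape_gap)    # 0.0
print(volume_gap > 400)
```

The shape distance is zero while the raw distance is large, so a Euclidean k-means on magnitude-preserving values separates the two customers within a trend class. This only works if the values used in step 4 retain magnitude, for example raw values or values scaled with a global minimum and maximum; that reading is an inference from the text, not a statement in the source.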
step5, visualizing a clustering result;
step 5.1, taking the mean vector of the daily water consumption vectors of all industrial and commercial customers in each class of the step 4 clustering result as the class center, calculating the mean daily water consumption of the customers in each class, and visualizing, class by class, the water consumption of the customers, the class center and the mean daily water consumption in a two-dimensional coordinate system;
in this embodiment, with the date on the x axis and the daily water consumption value on the y axis, the real daily water consumption, the class center coordinates and the mean daily water consumption of all industrial and commercial customers in the class are visualized;
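The quantities plotted in step 5.1 can be computed as below. The class center is the element-wise mean of the members' daily vectors; reading "mean daily water consumption" as the average of that center over all days is an assumption, and the member series are hypothetical.

```python
def class_center(series_list):
    """Element-wise mean of the daily water consumption vectors in one class."""
    n = len(series_list)
    return [sum(day_values) / n for day_values in zip(*series_list)]


# Hypothetical class with two customers and three days of data.
members = [[10.0, 12.0, 8.0], [14.0, 16.0, 12.0]]

center = class_center(members)              # one value per day (the y values)
daily_mean = sum(center) / len(center)      # assumed reading: average over days

print(center)      # [12.0, 14.0, 10.0]
print(daily_mean)  # 12.0
```

Plotting `center` against the dates, together with each member series and a horizontal line at `daily_mean`, gives the per-class view described in the embodiment.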
step 5.2, acquiring the class centers in all the clusters calculated in the step 5.1, and simultaneously visualizing the K class centers by drawing a two-dimensional coordinate system;
in the embodiment, all class center coordinate vectors are visualized by taking the date as an x axis and the daily water consumption value as a y axis;
step 5.3, obtaining the ids of the industrial and commercial customers in each class of the step 4.2 clustering result, and drawing a map from the longitude and latitude information of the customers to visualize the geographical positions of the customers in each class.
In a specific implementation, visualizing the geographical positions of the industrial and commercial customers in each class comprises the following steps:
step1: generating a longitude and latitude dictionary from the ids of the industrial and commercial customers in each class and the corresponding longitude and latitude information, where the key is the customer id and the value is the longitude and latitude;
Step2: creating a canvas, i.e. the area in which the map is displayed, by importing Geo from the pyecharts charting tool; displaying the number of industrial and commercial customers in the class directly above this area by counting the data records in the class, where each customer is one record; and setting formats such as the size, background color and font size of the canvas;
step3: using the geo.add() function with the parameter maptype='合肥' (Hefei) to load the map resource package of Hefei city into the canvas;
step4: on the Hefei city map drawn in step3, marking the entries of the longitude and latitude dictionary obtained in step1 one by one as scatter points, and setting formats such as the size, shape and color of the scatter points;
step5: obtaining the name of each industrial and commercial customer from its id, and displaying the name and the longitude and latitude of the customer on its scatter point as a label;
step6: saving the map and visualization results drawn in step1 to step5 as an html file, thereby completing the visualization of the geographical positions of the industrial and commercial customers in each class.
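The data preparation in step1 and step5 can be sketched without any plotting library. The record layout, field names, ids, customer names and coordinates below are all hypothetical; the resulting list of points is what would then be handed to the map-drawing calls of step2 to step4.

```python
def build_coord_dict(records):
    """Step1: map each customer id to its (longitude, latitude) pair."""
    return {r["id"]: (r["lng"], r["lat"]) for r in records}


def scatter_points(class_ids, coords, names):
    """Collect the name and coordinates of every customer in one class."""
    points = []
    for cid in class_ids:
        if cid not in coords:
            continue                      # skip customers without a geocoded address
        lng, lat = coords[cid]
        points.append({"name": names.get(cid, cid), "lng": lng, "lat": lat})
    return points


# Hypothetical records; coordinates are illustrative only.
records = [
    {"id": "A001", "lng": 117.27, "lat": 31.86},
    {"id": "A002", "lng": 117.30, "lat": 31.82},
]
names = {"A001": "Customer One", "A002": "Customer Two"}

coords = build_coord_dict(records)
points = scatter_points(["A001", "A002", "A999"], coords, names)
print(len(points), points[0])
```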
Claims (1)
1. A method for clustering industrial and commercial customers based on time-series water consumption data, characterized by comprising the following steps:
step1, constructing daily water consumption data of industrial and commercial customers;
step 1.1, obtaining remote water meter data of industrial and commercial customers, and extracting from the remote water meter data the customer id, the water meter update time, the accumulated water flow, the remote water meter address and the customer name;
step 1.2, converting the remote water meter address of each industrial and commercial customer into longitude and latitude to obtain the longitude and latitude information of the customer;
step 1.3, dividing the remote water meter data by customer id to obtain m water meter data files named by customer id, and sorting all data in each water meter data file by water meter update time; where m denotes the total number of industrial and commercial customers;
step 1.4, for each industrial and commercial customer and each day, taking the difference between the accumulated flow value at the last water meter update time and the accumulated flow value at the first water meter update time of the day, thereby constructing the daily water consumption vectors over t days of the m industrial and commercial customers, $x_i = (x_i^{1}, x_i^{2}, \dots, x_i^{t})$, where $x_i^{t}$ denotes the daily water consumption value of the i-th industrial and commercial customer on day t and t denotes the number of days of water use; the sample feature set formed by the daily water consumption vectors of the m industrial and commercial customers is recorded as $X = \{x_i \mid i = 1,2,\dots,m\}$;
Step 1.5, detecting and handling abnormal values in the sample feature set X to obtain a sample feature set X' after abnormal-value processing;
step 1.6, handling the missing values of the processed sample feature set X' to obtain a sample feature set X' after missing-value processing;
step2, representing the characteristics of time sequence water consumption data based on an LSTM model;
step 2.1, normalizing the sample feature set X' obtained after missing-value processing to obtain a normalized sample feature set, each element of which is the normalized daily water consumption value of the i-th industrial and commercial customer on day t;
step 2.2, pre-training an LSTM model;
the normalized sample feature set is divided into a training set and a validation set, and the epoch value, the batch-size value and the prediction step-size value for LSTM model training are determined;
the validation set is input into the LSTM model to obtain a prediction sequence for the validation set, and the root-mean-square error between the prediction sequence output by the LSTM model and the validation set is calculated, which completes one round of training of the LSTM model; training stops when the number of training rounds reaches the epoch value, and the trained LSTM model is used as the customer time-series water-use feature extraction model;
step 2.3, inputting the daily water consumption data of all industrial and commercial customers into the customer time-series water-use feature extraction model, thereby outputting the water-use feature vectors $Y = \{y_i \mid i = 1,2,\dots,m\}$ of the industrial and commercial customers; where $y_i = (y_i^{1}, y_i^{2}, \dots, y_i^{n})$ is the water-use feature vector of the i-th industrial and commercial customer, $y_i^{n}$ is its n-th dimension feature value, and n is the dimension of the water-use feature vector;
step3, using the kmeans clustering algorithm to cluster the industrial and commercial customers by water-use trend based on their water-use feature vectors $Y = \{y_i \mid i = 1,2,\dots,m\}$;
step 3.1, determining the optimal number of clusters by combining the elbow method and the silhouette coefficient method, and recording it as K;
step 3.2, based on the optimal number of clusters K, taking the water-use feature vector $y_i$ of the i-th industrial and commercial customer as a sample to be clustered and inputting it into the kmeans algorithm, so that the industrial and commercial customers represented in Y are grouped into K clusters; the coordinates of the K cluster centers are randomly initialized using formula (1):
$$c_k = (c_k^{1}, c_k^{2}, \dots, c_k^{n}), \quad k = 1,2,\dots,K \tag{1}$$
in formula (1), $c_k$ denotes the center of the k-th cluster, and $c_k^{n}$ denotes the coordinate value of the n-th dimension of the k-th cluster center;
step 3.3, calculating the Euclidean distance $d(y_i, c_k)$ from the sample $y_i$ to the k-th cluster center $c_k$ using formula (2), thereby obtaining the Euclidean distance from the sample $y_i$ to the center of each cluster:
$$d(y_i, c_k) = \sqrt{\sum_{l=1}^{n} \left(y_i^{l} - c_k^{l}\right)^{2}} \tag{2}$$
step 3.4, according to the Euclidean distances from the sample $y_i$ to the centers of the clusters, assigning the sample $y_i$ to the cluster with the shortest Euclidean distance;
step 3.5, assigning all samples to the clusters to which they belong to obtain K classes, and recording the set of industrial and commercial customers in the k-th class as $Y_k = \{y_{k,j} \mid j = 1,2,\dots,S_k\}$, where $y_{k,j} = (y_{k,j}^{1}, y_{k,j}^{2}, \dots, y_{k,j}^{n})$ is the feature vector of the j-th industrial and commercial customer in the k-th class, $S_k$ is the number of industrial and commercial customers in the k-th class, and $y_{k,j}^{n}$ is the n-th dimension feature value of the j-th industrial and commercial customer in the k-th class;
calculating the mean of the feature vectors of the industrial and commercial customers in the k-th class using formula (3), thereby obtaining the updated cluster center $\hat{c}_k$, and assigning it to $c_k$, $k = 1,2,\dots,K$;
$$\hat{c}_k^{n} = \frac{1}{S_k} \sum_{j=1}^{S_k} y_{k,j}^{n} \tag{3}$$
in formula (3), $\hat{c}_k^{n}$ denotes the coordinate value of the n-th dimension of the updated k-th cluster center;
step 3.6, repeating steps 3.3 to 3.5 until the cluster centers no longer change, and outputting the final cluster centers and the ids of the industrial and commercial customers in each class, so that industrial and commercial customers with similar water-use trends are grouped into one class;
step4, clustering industrial and commercial customers based on the range of water consumption;
step 4.1, based on the result of the water-use trend clustering, obtaining the set of industrial and commercial customers in each class as $B_k = \{b_j \mid j = 1,2,\dots,S'_k\}$, where $b_j$ denotes the j-th industrial and commercial customer in the k-th class, $k \in \{1,2,\dots,K\}$, and $S'_k$ denotes the number of industrial and commercial customers in the k-th class when the clustering algorithm converges;
step 4.2, obtaining the normalized real daily water consumption value on day t of the j-th industrial and commercial customer $b_j$ in the k-th class, $j = 1,2,\dots,S'_k$; repeating the process from step 3.1 to step 3.6 to re-cluster the real daily water consumption values of the industrial and commercial customers in each class, so that industrial and commercial customers with similar water-use trends and similar water consumption are grouped into one class;
step5, visualizing a clustering result;
step 5.1, taking the mean vector of the daily water consumption vectors of all industrial and commercial customers in each class of the step 4 clustering result as the class center, calculating the mean daily water consumption of the customers in each class, and visualizing, class by class, the water consumption of the customers, the class center and the mean daily water consumption in a two-dimensional coordinate system;
step 5.2, obtaining the class centers of all the clusters calculated in step 5.1, and visualizing the K class centers together in a two-dimensional coordinate system;
step 5.3, obtaining the ids of the industrial and commercial customers in each class of the step 4.2 clustering result, and drawing a map from the longitude and latitude information of the customers to visualize the geographical positions of the customers in each class.
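Step 1.4 of the claim, which derives daily consumption from cumulative meter readings, can be sketched as follows. The timestamp format, the reading values and the function name are assumptions; the sketch groups readings by calendar day and subtracts the first cumulative value of the day from the last.

```python
from collections import defaultdict
from datetime import datetime


def daily_consumption(readings):
    """readings: list of (timestamp string, cumulative flow) for one customer.

    Returns {ISO date: last cumulative value - first cumulative value of that day}.
    """
    by_day = defaultdict(list)
    for ts, flow in readings:
        t = datetime.strptime(ts, "%Y-%m-%d %H:%M:%S")   # assumed timestamp format
        by_day[t.date()].append((t, flow))
    result = {}
    for day, items in sorted(by_day.items()):
        items.sort()                                     # order by update time
        result[day.isoformat()] = items[-1][1] - items[0][1]
    return result


# Hypothetical cumulative readings for one meter.
readings = [
    ("2020-01-01 23:55:00", 112.5),
    ("2020-01-01 00:05:00", 100.0),
    ("2020-01-02 00:05:00", 112.5),
    ("2020-01-02 23:50:00", 120.0),
]
usage = daily_consumption(readings)
print(usage)  # {'2020-01-01': 12.5, '2020-01-02': 7.5}
```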
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110569868.3A CN113205368B (en) | 2021-05-25 | 2021-05-25 | Industrial and commercial customer clustering method based on time sequence water consumption data |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113205368A CN113205368A (en) | 2021-08-03 |
CN113205368B true CN113205368B (en) | 2022-11-29 |
Family
ID=77023043
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110569868.3A Active CN113205368B (en) | 2021-05-25 | 2021-05-25 | Industrial and commercial customer clustering method based on time sequence water consumption data |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113205368B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117952281B (en) * | 2024-03-26 | 2024-05-28 | 广东先知大数据股份有限公司 | User water demand prediction method, device and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||