CN116629142A - Lightning positioning track prediction method, system and storage medium based on transformer mechanism - Google Patents


Info

Publication number: CN116629142A (application CN202310904397.6A)
Authority: CN (China)
Prior art keywords: time domain, attention module, frequency domain, value, self
Legal status: Granted; Active (status as listed by Google Patents, not a legal conclusion)
Other languages: Chinese (zh)
Other versions: CN116629142B
Inventors: 魏振春; 裴文浩; 向念文; 吕增威; 李科杰; 丁煦; 石雷
Current and Original Assignee: Hefei University of Technology
Application filed by Hefei University of Technology; priority to CN202310904397.6A

Classifications

    • G06F30/27: Design optimisation, verification or simulation using machine learning, e.g. artificial intelligence, neural networks, support vector machines [SVM]
    • G01R29/0842: Measurements related to lightning, e.g. measuring electric disturbances, warning systems
    • G01W1/10: Devices for predicting weather conditions
    • G06N3/0442: Recurrent networks characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU]
    • G06N3/0455: Auto-encoder networks; encoder-decoder networks
    • G06N3/048: Activation functions
    • G06N3/08: Learning methods
    • Y02A90/10: Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation


Abstract

The invention relates to the technical field of artificial intelligence, in particular to a lightning positioning track prediction method, system and storage medium based on a Transformer mechanism. In the lightning location track prediction method based on the Transformer mechanism, the Transformer is combined with a time-series decomposition method: the decomposition method captures the overall outline of the time series, while the Transformer captures its finer structure, so that the model can capture a global view of the time series. During time domain feature extraction, a small number of dot products that contribute the dominant attention are selected through M(q(i), K1) for subsequent attention extraction, which reduces the complexity of the algorithm, improves the efficiency of computing resources, and improves the prediction capability of the model.

Description

Lightning positioning track prediction method, system and storage medium based on transformer mechanism
Technical Field
The invention relates to the technical field of artificial intelligence, in particular to a lightning positioning track prediction method, system and storage medium based on a Transformer mechanism.
Background
Lightning is a high-intensity electromagnetic pulse phenomenon that occurs frequently in nature, and because of its great impact it receives wide attention from the meteorology, aerospace, aviation, electric power and petroleum sectors. Among these, the power grid is particularly vulnerable to lightning because of its wide-area distribution, with geometrical dimensions of up to thousands of kilometres. Lightning strikes are an important factor affecting the safe operation of the power grid. Accurate prediction of the lightning track is required to ensure safe operation of the grid and to prepare lightning protection in advance. The difficulty of lightning track prediction is that the influence of all external factors on the motion track of a thundercloud cannot be described by a formula, and the thundercloud track is difficult to predict accurately over a long horizon from real-time data.
In the prior art, many valuable research results have been obtained on lightning prediction, but existing methods either cannot predict the future thundercloud motion track at all or can only predict it over a short horizon from existing data. Before lightning activity arrives, construction workers on electric towers must be evacuated, the lightning risk level must be assessed so that the next scheduling decisions for tower tasks can be made, and active lightning protection must be deployed; all of this takes considerable time if worker safety is to be ensured and losses minimized. The time before lightning activity is precious, so a method capable of long-horizon prediction of the lightning track is urgently needed.
Disclosure of Invention
In order to overcome the defect that long-horizon thundercloud prediction cannot be achieved in the prior art, the invention provides a lightning positioning track prediction method based on a Transformer mechanism, which solves the long-horizon prediction problem well, accurately judges the thundercloud motion track, and makes precise lightning protection possible.
According to the lightning positioning track prediction method based on the Transformer mechanism, the input data are defined as historical data containing T continuous time points, and the historical data at each time point comprise lightning data and weather data; frequency domain features and time domain features are extracted from the input data and spliced in the time domain to obtain first spliced data, and the first spliced data are decomposed; the input data are also decomposed, frequency domain features and time domain features are extracted from the decomposed input data and spliced in the time domain to obtain second spliced data, and the second spliced data are decomposed; decoding combines the decomposition data of the first spliced data and the decomposition data of the second spliced data to obtain the thundercloud distribution at t time points after the input data as a prediction result Y;
the extraction of the time domain features is realized through a sparse time domain self-attention module, which converts the input data into d-dimensional word vectors and then applies different linear transformations to the word vectors to obtain a matrix Q1, a matrix K1 and a matrix V1; let the i-th element in the matrix Q1 be q(i) and the j-th element in the matrix K1 be k(j); the sparse time domain self-attention module calculates an evaluation value M(q(i), K1) between each element q(i) in the matrix Q1 and the matrix K1 according to a preset evaluation formula; it then selects the n elements q(i) with the largest evaluation values M(q(i), K1) as target elements and forms the selected n target elements into a matrix Q(1,1); finally, taking Q(1,1) as the Q value, K1 as the K value and V1 as the V value, it executes a self-attention mechanism combining the Q value, K value and V value to acquire the time domain features of its input data.
Preferably, the evaluation formula is:

M(q(i), K1) = A1 − A2 = ln( Σ_{r=1..L(K)} k(q(i), k(r)) ) − (1/L(K)) · Σ_{j=1..L(K)} ln k(q(i), k(j))

A1 and A2 are transition terms; k(q(i), k(j)) and k(q(i), k(r)) are set kernel functions; k(r) is the r-th element in the matrix K1; M(q(i), K1) describes the distance between the attention probability distribution of q(i) and the uniform distribution; L(K) is the number of elements of the K value.
Preferably, k(q(i), k(j)) and k(q(i), k(r)) are asymmetric exponential kernel functions:

k(q(i), k(j)) = exp( q(i)·k(j)^T / √d )

where d is the dimension of the word vector extracted by the sparse time domain self-attention module, and the superscript T represents the transpose of the vector.
Preferably, the sparse time domain self-attention module takes Q(1,1) as the Q value, K1 as the K value and V1 as the V value, and then inputs the Q value, K value and V value into an activation function; the activation result of the activation function is the time domain feature.
Preferably, the activation function selects a SoftMax activation function.
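The selection procedure described above can be sketched in NumPy. This is an illustrative implementation, not the patent's code: with the asymmetric exponential kernel, M(q(i), K1) reduces to a log-sum-exp over the keys minus their mean score, and the n queries with the largest M form Q(1,1) before a standard SoftMax attention. Function names and array shapes are assumptions of this sketch.

```python
import numpy as np

def sparsity_measure(Q, K):
    # Evaluation value M(q(i), K1) for every row q(i) of Q: the distance
    # between the attention distribution of q(i) and a uniform distribution.
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                    # q(i)·k(j)^T / sqrt(d)
    return np.log(np.exp(scores).sum(axis=1)) - scores.mean(axis=1)

def sparse_attention(Q, K, V, n):
    # Keep only the n queries with the largest M value (the target elements
    # forming the matrix Q(1,1)), then run SoftMax attention with them.
    M = sparsity_measure(Q, K)
    top = np.sort(np.argsort(M)[-n:])                # indices of the n target elements
    Q11 = Q[top]                                     # matrix Q(1,1)
    scores = Q11 @ K.T / np.sqrt(Q.shape[-1])
    w = np.exp(scores - scores.max(axis=1, keepdims=True))
    w /= w.sum(axis=1, keepdims=True)                # SoftMax activation
    return w @ V, top
```

Only n rows of attention scores are computed against K1, which is where the claimed reduction in complexity comes from.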
Preferably, a thundercloud prediction model is first constructed; the thundercloud prediction model predicts the thundercloud distribution at the next t time points from known input data x containing T continuous time points. Input data x are then constructed from the known historical data and input into the thundercloud prediction model to acquire the thundercloud distribution at subsequent time points;
the thundercloud prediction model comprises an encoder and a decoder; the encoder is used for extracting time domain features and frequency domain features of input data x, splicing the time domain features and the frequency domain features on the time domain to obtain first spliced data, carrying out residual processing on the first spliced data by the encoder, and carrying out seasonal-trend item decomposition on the processed data to output seasonal items xs1 and trend items xt1;
the decoder includes: the system comprises a second time sequence decomposition unit, a second sparse time domain self-attention module, a second frequency domain self-attention module, a second residual error network, a third time sequence decomposition unit, a third frequency domain self-attention module, a third residual error network, a fourth time sequence decomposition unit, a second feedforward network and a fourth residual error network; the second sparse time domain self-attention module adopts the sparse time domain self-attention module;
the second time sequence decomposition unit performs season-trend item decomposition on the input data x and outputs a season item xs2 and a trend item xt2; the second sparse time domain self-attention module is used for extracting the time domain characteristics of xs2, and the second frequency domain self-attention module is used for extracting the frequency domain characteristics of xs2 and converting the frequency domain characteristics into time domain characteristics for output; the time domain features output by the second sparse time domain self-attention module and the time domain features output by the second frequency domain self-attention module form second spliced data, the second spliced data and the season term xs2 are subjected to residual processing through a second residual network, the residual features output by the second residual network are input into a third time sequence decomposition unit, the third time sequence decomposition unit carries out season-trend term decomposition, and the season term xs3 and the trend term xt3 are output;
the third frequency domain self-attention module takes a seasonal term xs3 as a Q value, takes values transformed by different linear transformation modes of the seasonal term xs1 as a K value and a V value, and extracts frequency domain features from the xs1 and converts the frequency domain features into time domain features to be output; carrying out residual processing on the time domain features and the season terms xs3 output by the third frequency domain self-attention module through a third residual network, carrying out season-trend term decomposition on the residual features output by the third residual network through a fourth time sequence decomposition unit, and outputting season terms xs4 and trend terms xt4 by the fourth time sequence decomposition unit; the values of the season term xs4 processed by the second feedforward network, the trend terms xt2, xt3 and xt4 and the season term xs4 are all input into a fourth residual network, the fourth residual network outputs a prediction result Y after residual processing, and the prediction result Y is the thundercloud distribution at t time points after the input data x.
Preferably, the encoder comprises a first sparse time domain self-attention module, a first frequency domain self-attention module, a first residual network, a first feed forward network and a first time sequence decomposition unit; the system comprises a first sparse time domain self-attention module, a first residual error network, a first feedforward network and a first time sequence decomposition unit, wherein the input of the first frequency domain self-attention module is connected with the input of the first sparse time domain self-attention module, and the output of the first frequency domain self-attention module is connected with the input of the first residual error network; the first sparse time domain self-attention module adopts the sparse time domain self-attention module;
the first sparse time domain self-attention module is used for extracting time domain features of input data x; the first frequency domain self-attention module is used for extracting frequency domain features of input data x, converting the frequency domain features into time domain features and outputting the time domain features; the time domain features output by the first sparse time domain self-attention module and the time domain features output by the first frequency domain self-attention module are spliced to form first spliced data, the first spliced data are processed through a first residual error network and a first feedforward network in sequence to obtain data z, the data z is subjected to season-trend item decomposition through a first time sequence decomposition unit, and the first time sequence decomposition unit outputs a season item xs1 and a trend item xt1.
Preferably, the first frequency-domain self-attention module, the second frequency-domain self-attention module and the third frequency-domain self-attention module have the same structure and are collectively called as frequency-domain self-attention modules;
the frequency domain attention module firstly acquires a Q value, a K value and a V value of an attention mechanism by combining input data, then respectively projects the Q value, the K value and the V value to a frequency domain, and then carries out random sampling on the frequency domain to make a random sampling small matrix corresponding to the Q value be SF (Q), the random sampling small matrix corresponding to the K value be SF (K) and the random sampling small matrix corresponding to the V value be SF (V); then the frequency domain attention module inputs the frequency domain features learned by SF (Q), SF (K) and SF (V) through dot product into an activation function for activation, the activated frequency domain features are filled to the original dimensions of Q value, K value and V value, and the filled frequency domain features are converted into time domain features through inverse Fourier transform and output.
The invention also provides a lightning positioning track prediction system and a storage medium based on the Transformer mechanism, which carry the above lightning positioning track prediction method so that it can be deployed and applied.
The lightning positioning track prediction system based on the Transformer mechanism comprises a processor and a memory; a computer program and a thundercloud prediction model are stored in the memory, the processor is connected with the memory, and the processor executes the computer program so as to perform the lightning positioning track prediction method based on the Transformer mechanism.
The storage medium stores a computer program and a thundercloud prediction model, and the computer program, when executed, implements the lightning positioning track prediction method based on the Transformer mechanism.
The invention has the advantages that:
(1) According to the lightning positioning track prediction method based on the Transformer mechanism, a Transformer model formed by an encoder and a decoder is combined with a time-series decomposition method: the decomposition method captures the overall outline of the time series, while the Transformer captures its finer structure, so that the model can capture a global view of the time series. During time domain feature extraction, a small number of dot products that contribute the dominant attention are selected through M(q(i), K1) for subsequent attention extraction, which reduces the complexity of the algorithm, improves the efficiency of computing resources, and improves the prediction capability of the model.
(2) The invention combines the Transformer with time-series analysis and frequency domain analysis, so that the model preserves the global properties and statistics of the whole time series, helping the thundercloud prediction model better capture the global character of the sequence. The invention can handle long time series and forecast the thundercloud motion track over a long future horizon, so that lightning protection can be prepared in time.
(3) The sparse time domain self-attention module captures long-range dependency coupling through the ProbSparse self-attention mechanism, greatly reducing the time and space complexity of the algorithm and making accurate long-horizon prediction of the thundercloud track possible.
(4) The frequency domain self-attention module projects the input sequence, i.e. the input data on the original time domain, to the frequency domain and then randomly samples the frequency domain. This greatly reduces the length of the input vector and thus the computational complexity.
Drawings
FIG. 1 is a block diagram of a thundercloud prediction model;
FIG. 2 is a training flow diagram of a thundercloud predictive model;
FIG. 3 is a graph of the trend of longitude of the centroid of a thundercloud in an embodiment;
FIG. 4 is a graph of the latitude trend of the centroid of the thundercloud in an embodiment.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Referring to fig. 1, the training method of the lightning location track prediction model based on the Transformer mechanism according to the present embodiment includes the following steps:
s1, constructing a basic model, wherein the basic model comprises an encoder and a decoder;
the encoder comprises a first sparse time domain self-attention module, a first frequency domain self-attention module, a first residual error network, a first feedforward network and a first time sequence decomposition unit; the system comprises a first sparse time domain self-attention module, a first residual error network, a first feedforward network and a first time sequence decomposition unit, wherein the input of the first frequency domain self-attention module is connected with the input of the first sparse time domain self-attention module, and the output of the first frequency domain self-attention module is connected with the input of the first residual error network;
the first sparse time domain self-attention module is used for extracting time domain features of input data x;
the first frequency domain self-attention module is used for extracting frequency domain features of input data x, converting the frequency domain features into time domain features and outputting the time domain features;
the method comprises the steps of splicing time domain features output by a first sparse time domain self-attention module and time domain features output by a first frequency domain self-attention module to form first spliced data, processing the first spliced data through a first residual error network and a first feedforward network in sequence to obtain data z, carrying out season-trend item decomposition on the data z through a first time sequence decomposition unit, and outputting a season item xs1 and a trend item xt1 through the first time sequence decomposition unit.
In this embodiment, the first sparse time domain self-attention module and the first frequency domain self-attention module extract the time domain feature and the frequency domain feature of the input data x respectively, so that feature extraction of the data on the frequency domain and the time domain is realized, the first frequency domain self-attention module converts the frequency domain feature into the time domain feature, feature stitching is facilitated, the stitched data is processed by the subsequent first residual network, and feature extraction and representation of the data are more sufficient.
In the embodiment, the distribution difference of network input and output is reduced by using the seasonal-trend term decomposition, so that the model is more suitable for the condition that the distribution of time series changes along with the time change, the extrapolation capability of the model is improved, and the prediction accuracy is improved.
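The season-trend item decomposition performed by the time sequence decomposition units can be sketched as a moving-average split: a moving average extracts the trend term xt, and the residual x − xt is taken as the seasonal term xs. The kernel size and the replicate edge padding below are illustrative assumptions, not values specified by the patent.

```python
import numpy as np

def series_decompose(x, kernel=5):
    # Season-trend decomposition: a moving average over `kernel` time steps
    # gives the trend term xt; the residual x - xt is the seasonal term xs.
    pad = kernel // 2
    xpad = np.concatenate([np.repeat(x[:1], pad, axis=0),
                           x,
                           np.repeat(x[-1:], pad, axis=0)], axis=0)
    xt = np.stack([xpad[i:i + kernel].mean(axis=0) for i in range(len(x))])
    xs = x - xt
    return xs, xt
```

By construction xs + xt reproduces the input exactly, so the decomposition loses no information; it only redistributes it between a smooth component and a residual component.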
The decoder includes: the system comprises a second time sequence decomposition unit, a second sparse time domain self-attention module, a second frequency domain self-attention module, a second residual error network, a third time sequence decomposition unit, a third frequency domain self-attention module, a third residual error network, a fourth time sequence decomposition unit, a second feedforward network and a fourth residual error network;
the second time sequence decomposition unit performs season-trend item decomposition on the input data x and outputs a season item xs2 and a trend item xt2; the output of the second time sequence decomposition unit is respectively connected with the input of the second sparse time domain self-attention module, the input of the second frequency domain self-attention module and the input of the fourth residual error network;
the output of the second time sequence decomposition unit, the output of the second sparse time domain self-attention module and the output of the second frequency domain self-attention module are all connected with the input of a second residual error network; the output of the second residual error network is connected with the input of the third time sequence decomposition unit;
the second sparse time domain self-attention module is used for extracting the time domain characteristics of xs2, and the second frequency domain self-attention module is used for extracting the frequency domain characteristics of xs2 and converting the frequency domain characteristics into time domain characteristics for output; carrying out residual processing on the time domain features output by the second sparse time domain self-attention module, the time domain features output by the second frequency domain self-attention module and the season term xs2 through a second residual network, inputting the residual features output by the second residual network into a third time sequence decomposition unit, carrying out season-trend term decomposition by the third time sequence decomposition unit, and outputting a season term xs3 and a trend term xt3;
the input of the third frequency domain self-attention module is respectively connected with the output of the third time sequence decomposition unit and the output of the first time sequence decomposition unit; the third frequency domain self-attention module takes a seasonal term xs3 as a Q value, takes values transformed by different linear transformation modes of the seasonal term xs1 as a K value and a V value, and extracts frequency domain features from the xs1 and converts the frequency domain features into time domain features to be output.
The output of the third time sequence decomposition unit and the output of the third frequency domain self-attention module are both connected with the input of a third residual error network, the output of the third residual error network is connected with the input of a fourth time sequence decomposition unit, and the output of the fourth time sequence decomposition unit is respectively connected with the input of a second feedforward network and the input of a fourth residual error network; the output of the second feed forward network is connected to the input of the fourth residual network.
Carrying out residual processing on the time domain features and the season terms xs3 output by the third frequency domain self-attention module through a third residual network, carrying out season-trend term decomposition on the residual features output by the third residual network through a fourth time sequence decomposition unit, and outputting season terms xs4 and trend terms xt4 by the fourth time sequence decomposition unit;
the values of the season term xs4 processed by the second feedforward network, the trend terms xt2, xt3 and xt4 and the season term xs4 are all input into a fourth residual network, and the fourth residual network outputs a prediction result Y after residual processing.
The input of the basic model is used for acquiring input data x, and the output of the basic model is used for outputting a prediction result Y; namely, the input of the first sparse time domain self-attention module, the input of the first frequency domain self-attention module and the input of the second time sequence decomposition unit are mutually connected to serve as the input of a basic model; the output of the basic model is the output of the fourth residual network.
The input data x is historical data comprising T continuous time points, the historical data at each time point comprises lightning data and weather data, the lightning data comprises thundercloud distribution and field intensity at the corresponding time point, and the weather data comprises wind speed;
Y = {Y(1), Y(2), …, Y(t)}
Y(t) = {y(1,t); y(2,t); …; y(L(Y(t)),t)}
Y(1), Y(2), …, Y(t) represent the thundercloud data at the t time points following the input data x;
y(1,t), y(2,t), …, y(L(Y(t)),t) represent the set of centroids of the L(Y(t)) thundercloud clusters predicted at the t-th time point after the input data x;
specifically, assuming that the last time point of the input data x is T, Y(1) represents the thundercloud data at time point T+1, Y(2) the thundercloud data at time point T+2, and Y(t) the thundercloud data at time point T+t; y(1,t) represents the centroid of the 1st thundercloud at time T+t, y(2,t) the centroid of the 2nd thundercloud at time T+t, and y(L(Y(t)),t) the centroid of the L(Y(t))-th thundercloud at time T+t.
S2, constructing an empirical sample (x, Y) from historical data and having the basic model perform machine learning on the empirical sample (x, Y) to obtain a converged basic model as the thundercloud prediction model, which is used for predicting the thundercloud distribution at the next t time points from known input data x containing T continuous time points.
The first sparse time domain self-attention module and the second sparse time domain self-attention module are collectively called as a sparse time domain self-attention module, and the two modules have the same structure.
The sparse time domain self-attention module is used for extracting time domain attention vectors of the input data. Specifically, the sparse time domain self-attention module first converts the input data into d-dimensional word vectors, then transforms the word vectors into a matrix Q1 with a first linear transformation, into a matrix K1 with a second linear transformation, and into a matrix V1 with a third linear transformation. Let the i-th element in the matrix Q1 be q(i), the j-th element in the matrix K1 be k(j), and the number of elements in the matrix K1 be L(K). The sparse time domain self-attention module calculates an evaluation value M(q(i), K1) between each element q(i) in the matrix Q1 and the matrix K1 according to a preset evaluation formula; it then selects the n elements q(i) with the largest evaluation values M(q(i), K1) as target elements and forms the selected n target elements into a matrix Q(1,1). Taking Q(1,1), K1 and V1 as the Q value, K value and V value of a self-attention mechanism respectively, the module inputs them into an activation function, and the activation function outputs the time domain features; the activation function may specifically be a SoftMax activation function.
The evaluation formula is:
M(q(i), K1) = A1 − A2, where A1 = ln Σ_{r=1}^{L(K)} k(q(i), k(r)) and A2 = (1/L(K)) · Σ_{j=1}^{L(K)} ln k(q(i), k(j));
A1 and A2 are transition terms; k(q(i), k(j)) and k(q(i), k(r)) are asymmetric exponential kernel functions, which form part of the probability distribution formula in conventional transformers. M(q(i), K1) is used to describe the distance between the probability distribution of attention and the uniform distribution, and a larger M(q(i), K1) indicates a more active attention value. k(j) is the j-th element in the matrix K1, k(r) is the r-th element in the matrix K1, 1 ≤ j ≤ L(K), and 1 ≤ r ≤ L(K).
In this embodiment, during time domain feature extraction, only the few dot products that contribute the dominant attention are selected, in descending order of M(q(i), K1), for attention computation. This reduces algorithm complexity and makes better use of computing resources, thereby improving the model's prediction capability.
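A minimal NumPy sketch of the sparsity measurement and top-n selection described above. The matrix shapes, the choice of n, and the handling of unselected queries (here given the mean of V, as in Informer-style ProbSparse attention) are assumptions, since the patent leaves them unspecified:

```python
import numpy as np

def sparse_attention(Q1, K1, V1, n):
    """Top-n sparse self-attention over matrices Q1, K1, V1.

    M(q(i), K1) = ln(sum_r exp(q(i)·k(r)ᵀ/√d)) − mean_j(q(i)·k(j)ᵀ/√d)
    measures how far query i's attention distribution is from uniform;
    only the n most active queries perform full attention. Unselected
    queries fall back to the mean of V1 (an Informer-style choice,
    assumed here, not stated in the patent).
    """
    d = Q1.shape[-1]
    scores = Q1 @ K1.T / np.sqrt(d)                     # (L_Q, L_K)
    M = np.log(np.exp(scores).sum(axis=-1)) - scores.mean(axis=-1)
    top = np.argsort(-M)[:n]                            # n largest M(q(i), K1)
    A = np.exp(scores[top])
    A = A / A.sum(axis=-1, keepdims=True)               # SoftMax over keys
    out = np.repeat(V1.mean(axis=0, keepdims=True), Q1.shape[0], axis=0)
    out[top] = A @ V1                                   # rows of matrix Q(1,1)
    return out
```

When n equals the number of queries, the result coincides with ordinary SoftMax attention; the savings come from choosing n much smaller than L(Q).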
The data input by the first sparse time domain self-attention module is input data x, and the data input by the second sparse time domain self-attention module is seasonal term xs2.
The first frequency-domain self-attention module, the second frequency-domain self-attention module and the third frequency-domain self-attention module have the same structure and are collectively called as the frequency-domain self-attention module.
In this embodiment, the frequency domain attention module first obtains a Q value, a K value and a V value from its input data, and projects each of them onto the frequency domain. It then randomly samples a small matrix on the frequency domain: the Q value is updated to a randomly sampled small matrix SF(Q) of its frequency domain projection, the K value is updated to a small matrix SF(K) of its frequency domain projection, and the V value is updated to a small matrix SF(V) of its frequency domain projection. The module then computes frequency domain features from SF(Q), SF(K) and SF(V) through dot products and inputs them into an activation function, fills the activated frequency domain features back to the original dimensions of the Q value, K value and V value, and converts the filled frequency domain features into time domain features through an inverse Fourier transform for output. The original Q value, K value and V value are those obtained by the frequency domain attention module from its input data; it can be seen that the Q value, K value and V value are matrices of the same size.
The working process formula of the frequency domain self-attention module is expressed as follows:
F-Attention(Q, K, V) = F^(-1)(Padding(Softmax((SF(Q)·SF(K)^T)·SF(V))))
wherein SF(Q) represents a small matrix randomly sampled from the projection of the Q value input to the frequency domain attention module on the frequency domain, SF(K) represents a small matrix randomly sampled from the projection of the K value on the frequency domain, and SF(V) represents a small matrix randomly sampled from the projection of the V value on the frequency domain; the superscript T represents transpose; Softmax is the activation function, which maps (SF(Q)·SF(K)^T)·SF(V) to values between 0 and 1; Padding represents a filling operation that fills the activated frequency domain feature Softmax((SF(Q)·SF(K)^T)·SF(V)) to the size of the Q value; the Q value, the K value and the V value are matrices of the same size, and F^(-1) represents the inverse Fourier transform.
The frequency domain attention module projects an input sequence, namely input data, on an original time domain to a frequency domain, and then randomly samples the frequency domain. Thus, the length of the input vector can be greatly reduced, and the calculation complexity is further reduced.
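The process described by the formula above can be sketched as follows, assuming NumPy, a real FFT along the sequence axis, and softmax applied to the magnitudes of the complex product (the patent does not specify how the complex values are normalized, so that step is an assumption):

```python
import numpy as np

def frequency_attention(Q, K, V, n_modes=8, seed=0):
    """Sketch of F-Attention(Q,K,V) = F⁻¹(Padding(Softmax((SF(Q)·SF(K)^T)·SF(V)))).

    Q, K, V: real (L, d) matrices of the same size. Each is projected to
    the frequency domain (rfft along the sequence axis), a small random
    subset of frequency modes is sampled, the attention-like product is
    computed on that subset, softmax is applied to the magnitudes of the
    complex product (an assumption), the result is padded back to the
    original frequency size, and an inverse FFT returns time domain
    features. n_modes is an illustrative choice.
    """
    rng = np.random.default_rng(seed)
    L = Q.shape[0]
    Fq, Fk, Fv = (np.fft.rfft(M, axis=0) for M in (Q, K, V))
    idx = np.sort(rng.choice(Fq.shape[0], size=min(n_modes, Fq.shape[0]),
                             replace=False))
    SFq, SFk, SFv = Fq[idx], Fk[idx], Fv[idx]           # random small matrices
    prod = (SFq @ SFk.conj().T) @ SFv                   # (SF(Q)·SF(K)^T)·SF(V)
    mag = np.abs(prod)
    soft = np.exp(mag - mag.max(axis=-1, keepdims=True))
    soft = soft / soft.sum(axis=-1, keepdims=True)      # Softmax, values in (0, 1]
    padded = np.zeros_like(Fq)                          # Padding to Q's size
    padded[idx] = soft
    return np.fft.irfft(padded, n=L, axis=0)            # F⁻¹: back to time domain
```

Because only n_modes of the L//2+1 frequency bins enter the matrix products, the cost of the attention computation no longer grows with the full sequence length.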
In this embodiment, the input of the first frequency domain attention module is input data x, and the first frequency domain attention module performs three different linear transformations on the input data x to obtain a Q value, a K value, and a V value, respectively.
The input of the second frequency domain attention module is a season term xs2, and the second frequency domain attention module performs three different linear transformations on xs2 to obtain a Q value, a K value and a V value respectively.
The third frequency domain self-attention module takes the seasonal term xs3 as a Q value, and takes values obtained by transforming the seasonal term xs1 through different linear transformation modes as a K value and a V value.
The data input by the first frequency domain self-attention module is input data x; the data input by the second frequency domain self-attention module is a season term xs2, and the data input by the third frequency domain self-attention module is season terms xs1 and xs3.
The frequency domain self-attention module extracts frequency domain features of input data through Fourier transformation, and then the frequency domain features are converted into time domain features through inverse Fourier transformation so as to be spliced with the time domain features output by the sparse time domain self-attention module or time sequence data so as to be convenient for subsequent calculation.
Thus, the feature extraction of the data in the frequency domain and the time domain is realized, so that the feature extraction and the representation of the data are more sufficient.
The thundercloud prediction model provided by the invention is verified by combining a specific embodiment.
In this embodiment, historical data of a region in southwest is selected to construct an empirical sample (x, Y), wherein Y in the empirical sample is a thundercloud distribution represented by radar basic reflectivity data.
The empirical samples (x, Y) are partitioned into a training set and a testing set.
In this embodiment, 2 comparison models were also constructed: an LSTM model (a recurrent neural network) and a conventional Transformer model.
In this embodiment, the thundercloud prediction model provided by the invention and the two comparison models are trained on the training set until convergence, and the accuracy of the converged models is then tested on the test set.
In this embodiment, when a model test is performed on the test set, the centroid of the thundercloud is calculated as a true value according to the true value Y of the thundercloud distribution, the centroids of the thundercloud distribution output by the 3 models are calculated, and then the longitude and latitude of the centroids obtained by the 3 models at each time point are compared with the longitude and latitude of the true value of the centroid.
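The centroid computation used in this comparison can be sketched as follows, assuming a reflectivity-weighted centroid over a thresholded radar grid; the 35 dBZ cutoff and the weighting scheme are illustrative assumptions, not values taken from the patent:

```python
import numpy as np

def cloud_centroid(refl, lons, lats, threshold=35.0):
    """Reflectivity-weighted centroid (longitude, latitude) of a thundercloud.

    refl: 2-D radar basic reflectivity grid (dBZ), shape (len(lats), len(lons));
    lons/lats: 1-D coordinate axes. The 35 dBZ threshold is an illustrative
    convective cutoff, not specified by the patent. Returns None when no
    cell exceeds the threshold.
    """
    mask = refl >= threshold
    if not mask.any():
        return None
    w = np.where(mask, refl, 0.0)                 # reflectivity as weights
    lon_grid, lat_grid = np.meshgrid(lons, lats)
    total = w.sum()
    return (lon_grid * w).sum() / total, (lat_grid * w).sum() / total
```

The longitude/latitude pair returned by this kind of routine is what is compared, per time point, against the ground-truth centroid in figs. 3 and 4.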
In this embodiment, parameters of the thundercloud prediction model provided by the invention are set as follows: t=254, t=14.
In this embodiment, in order to synchronize lightning location information with the time of radar basic reflectivity, 6 minutes is selected uniformly as a unit time, that is, the time step between two adjacent time points is 6 minutes. It can be seen that the present invention predicts a thundercloud distribution over the future 14×6=84 minutes.
The test results in this example are shown in fig. 3 and fig. 4. Fig. 3 shows the predicted longitude of the thundercloud centroid at each time point, and fig. 4 shows the predicted latitude of the thundercloud centroid at each time point. In fig. 3 and fig. 4, GroundTruth is the true value of the centroid.
As can be seen from fig. 3 and fig. 4, in this embodiment, the time-varying curve of the thundercloud distribution obtained by the thundercloud prediction model is closer to the real track of the thundercloud; compared with the LSTM model and the traditional Transformer model, long-term prediction of the thundercloud is greatly improved, making long-term thundercloud prediction practical.
It will be understood by those skilled in the art that the present invention is not limited to the details of the foregoing exemplary embodiments, but includes other specific forms of the same or similar structures that may be embodied without departing from the spirit or essential characteristics thereof. The present embodiments are, therefore, to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference sign in a claim should not be construed as limiting the claim concerned.
Furthermore, it should be understood that although the present specification is described in terms of embodiments, not every embodiment contains only a single independent technical solution; this manner of description is adopted merely for clarity, and the specification should be taken as a whole, with the technical solutions in the embodiments combined as appropriate to form other implementations understandable to those skilled in the art.
The technology, shape, and construction parts of the present invention, which are not described in detail, are known in the art.

Claims (10)

1. A lightning location track prediction method based on a transformer mechanism is characterized in that input data are defined to be historical data containing T continuous time points, and the historical data at each time point comprise lightning data and weather data; extracting frequency domain features and time domain features aiming at input data, splicing the frequency domain features and the time domain features in the time domain to obtain first spliced data, and decomposing the first spliced data; decomposing based on the input data, extracting frequency domain features and time domain features aiming at the decomposed data of the input data, splicing the frequency domain features and the time domain features in the time domain to obtain second spliced data, and decomposing the second spliced data; decoding by combining the decomposition data of the first splicing data and the decomposition data of the second splicing data to obtain thundercloud distribution at t time points after the input data as a prediction result Y;
the extraction of the time domain features is realized through a sparse time domain self-attention module, the sparse time domain self-attention module converts input data into word vectors of d dimension, and then different linear transformations are adopted to transform the word vectors so as to obtain a matrix Q1, a matrix K1 and a matrix V1; let the ith element in the matrix Q1 be Q (i), and the jth element in the matrix K1 be K (j); the sparse time domain self-attention module calculates an evaluation value M (Q (i), K1) of each element Q (i) in the matrix Q1 and the matrix K1 according to a preset evaluation formula; the sparse time domain self-attention module selects n elements Q (i) with larger corresponding evaluation values M (Q (i), K1) as target elements, and forms a matrix Q (1, 1) by the selected n target elements; the sparse time domain self-attention module takes Q (1, 1) as a Q value, takes K1 as a K value, takes V1 as a V value, and then executes a self-attention mechanism by combining the Q value, the K value and the V value so as to acquire the time domain characteristics of the data input by the sparse time domain self-attention module.
2. The lightning location trajectory prediction method based on a transformer mechanism of claim 1, wherein the evaluation formula is:
M(q(i), K1) = A1 − A2, with A1 = ln Σ_{r=1}^{L(K)} k(q(i), k(r)) and A2 = (1/L(K)) · Σ_{j=1}^{L(K)} ln k(q(i), k(j));
A1 and A2 are transition terms; k(q(i), k(j)) and k(q(i), k(r)) are set kernel functions; k(r) is the r-th element in the matrix K1; M(q(i), K1) is used to describe the distance between the probability distribution of attention and the uniform distribution; L(K) is the number of elements of the K value.
3. The lightning location trajectory prediction method based on a transformer mechanism according to claim 2, wherein k(q(i), k(j)) and k(q(i), k(r)) are asymmetric exponential kernel functions;
k(q(i), k(j)) = exp(q(i)·k(j)^T/√d); k(q(i), k(r)) = exp(q(i)·k(r)^T/√d); where d is the dimension of the word vector extracted by the sparse time domain self-attention module, and the superscript T represents the transpose of the vector.
4. The lightning location track prediction method based on a transformer mechanism according to claim 3, wherein the sparse time domain self-attention module takes Q (1, 1) as a Q value, takes K1 as a K value, takes V1 as a V value, and inputs the Q value, the K value and the V value into an activation function, and an activation result of the activation function is a time domain feature.
5. The lightning location trajectory prediction method based on a transformer mechanism of claim 4, wherein the activation function is a SoftMax activation function.
6. The lightning location trajectory prediction method based on a transformer mechanism according to claim 3, wherein a thundercloud prediction model is first constructed for predicting the thundercloud distribution at the next t time points in combination with known input data x containing T consecutive time points; then, constructing input data x by combining known historical data, and inputting the input data x into the thundercloud prediction model to acquire the thundercloud distribution at subsequent time points;
the thundercloud prediction model comprises an encoder and a decoder; the encoder is used for extracting time domain features and frequency domain features of input data, splicing the time domain features and the frequency domain features on the time domain to obtain first spliced data, carrying out residual error processing on the first spliced data by the encoder, and carrying out seasonal-trend item decomposition on the processed data to output a seasonal item xs1 and a trend item xt1;
the decoder includes: the system comprises a second time sequence decomposition unit, a second sparse time domain self-attention module, a second frequency domain self-attention module, a second residual error network, a third time sequence decomposition unit, a third frequency domain self-attention module, a third residual error network, a fourth time sequence decomposition unit, a second feedforward network and a fourth residual error network; the second sparse time domain self-attention module adopts the sparse time domain self-attention module;
the second time sequence decomposition unit performs season-trend item decomposition on the input data x and outputs a season item xs2 and a trend item xt2; the second sparse time domain self-attention module is used for extracting the time domain characteristics of xs2, and the second frequency domain self-attention module is used for extracting the frequency domain characteristics of xs2 and converting the frequency domain characteristics into time domain characteristics for output; the time domain features output by the second sparse time domain self-attention module and the time domain features output by the second frequency domain self-attention module form second spliced data, the second spliced data and the season term xs2 are subjected to residual processing through a second residual network, the residual features output by the second residual network are input into a third time sequence decomposition unit, the third time sequence decomposition unit carries out season-trend term decomposition, and the season term xs3 and the trend term xt3 are output;
the third frequency domain self-attention module takes a seasonal term xs3 as a Q value, takes values transformed by different linear transformation modes of the seasonal term xs1 as a K value and a V value, and extracts frequency domain features from the xs1 and converts the frequency domain features into time domain features to be output; carrying out residual processing on the time domain features and the season terms xs3 output by the third frequency domain self-attention module through a third residual network, carrying out season-trend term decomposition on the residual features output by the third residual network through a fourth time sequence decomposition unit, and outputting season terms xs4 and trend terms xt4 by the fourth time sequence decomposition unit; the values of the season term xs4 processed by the second feedforward network, the trend terms xt2, xt3 and xt4 and the season term xs4 are all input into a fourth residual network, the fourth residual network outputs a prediction result Y after residual processing, and the prediction result Y is the thundercloud distribution at t time points after the input data x.
7. The lightning location trajectory prediction method based on a transformer mechanism of claim 6, wherein the encoder comprises a first sparse time domain self-attention module, a first frequency domain self-attention module, a first residual network, a first feedforward network and a first time sequence decomposition unit; the first sparse time domain self-attention module, the first residual network, the first feedforward network and the first time sequence decomposition unit are connected in sequence, the input of the first frequency domain self-attention module is connected with the input of the first sparse time domain self-attention module, and the output of the first frequency domain self-attention module is connected with the input of the first residual network; the first sparse time domain self-attention module adopts the sparse time domain self-attention module;
the first sparse time domain self-attention module is used for extracting time domain features of input data x; the first frequency domain self-attention module is used for extracting frequency domain features of input data x, converting the frequency domain features into time domain features and outputting the time domain features; the time domain features output by the first sparse time domain self-attention module and the time domain features output by the first frequency domain self-attention module are spliced to form first spliced data, the first spliced data are processed through a first residual error network and a first feedforward network in sequence to obtain data z, the data z is subjected to season-trend item decomposition through a first time sequence decomposition unit, and the first time sequence decomposition unit outputs a season item xs1 and a trend item xt1.
8. The lightning location trajectory prediction method based on a transformer mechanism of claim 7, wherein the first frequency domain self-attention module, the second frequency domain self-attention module and the third frequency domain self-attention module are identical in structure and collectively referred to as frequency domain self-attention modules;
the frequency domain attention module firstly acquires a Q value, a K value and a V value of an attention mechanism by combining input data, then respectively projects the Q value, the K value and the V value to a frequency domain, and randomly samples a small matrix on the frequency domain to enable the random sampling small matrix corresponding to the Q value to be SF (Q), the random sampling small matrix corresponding to the K value to be SF (K) and the random sampling small matrix corresponding to the V value to be SF (V); then the frequency domain attention module inputs the frequency domain features learned by SF (Q), SF (K) and SF (V) through dot product into an activation function for activation, the activated frequency domain features are filled to the original dimensions of Q value, K value and V value, and the filled frequency domain features are converted into time domain features through inverse Fourier transform and output.
9. A lightning location trajectory prediction system based on a transformer mechanism, comprising a processor and a memory, wherein the memory stores a computer program and a thundercloud prediction model, the processor is connected to the memory, and the processor is configured to execute the computer program to perform the lightning location trajectory prediction method based on the transformer mechanism according to any one of claims 1 to 8.
10. A storage medium, characterized in that a computer program and a thundercloud prediction model are stored, which computer program, when being executed, is adapted to implement a lightning location trajectory prediction method based on a transformer mechanism according to any one of claims 1-8.
CN202310904397.6A 2023-07-24 2023-07-24 Lightning positioning track prediction method and system based on transformer mechanism Active CN116629142B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310904397.6A CN116629142B (en) 2023-07-24 2023-07-24 Lightning positioning track prediction method and system based on transformer mechanism


Publications (2)

Publication Number Publication Date
CN116629142A true CN116629142A (en) 2023-08-22
CN116629142B CN116629142B (en) 2023-09-29

Family

ID=87636903

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310904397.6A Active CN116629142B (en) 2023-07-24 2023-07-24 Lightning positioning track prediction method and system based on transformer mechanism

Country Status (1)

Country Link
CN (1) CN116629142B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117875535A (en) * 2024-03-13 2024-04-12 中南大学 Method and system for planning picking and delivering paths based on historical information embedding

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200401888A1 (en) * 2019-06-24 2020-12-24 Tata Consultancy Services Limited Time series prediction with confidence estimates using sparse recurrent mixture density networks
CN113204903A (en) * 2021-04-29 2021-08-03 国网电力科学研究院武汉南瑞有限责任公司 Method for predicting thunder and lightning
US20210326723A1 (en) * 2020-04-21 2021-10-21 Microsoft Technology Licensing, Llc Predicted forecast offset from remote location sensor
CN114218870A (en) * 2021-12-22 2022-03-22 大连理工大学 Wind speed prediction method based on variational modal decomposition and attention mechanism
CN114548595A (en) * 2022-03-03 2022-05-27 成都信息工程大学 Strong convection weather physical characteristic quantity prediction method and system based on attention mechanism
CN115423080A (en) * 2022-09-16 2022-12-02 中国电信股份有限公司 Time series prediction method, time series prediction device, electronic device, and medium
CN115730716A (en) * 2022-11-16 2023-03-03 国网江苏省电力有限公司镇江供电分公司 Method for predicting medium-term and long-term power consumption of communication base station based on improved Transformer model
CN115761261A (en) * 2022-11-27 2023-03-07 东南大学 Short-term rainfall prediction method based on radar echo diagram extrapolation
CN116227560A (en) * 2023-02-06 2023-06-06 中国矿业大学 Time sequence prediction model and method based on DTW-former


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
SABRINA GUASTAVINO et al.: "Prediction of severe thunderstorm events with ensemble deep learning and radar data", Scientific Reports *
LI Xiaopeng: "Research on spatiotemporal lightning prediction with multi-source data based on attention mechanism", China Master's Theses Full-text Database, Basic Sciences, pages 009-149 *


Also Published As

Publication number Publication date
CN116629142B (en) 2023-09-29


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant