CN115423048B - Traffic flow anomaly detection method and system based on pattern similarity - Google Patents
- Publication number
- CN115423048B (application CN202211365058.7A)
- Authority
- CN
- China
- Prior art keywords
- traffic flow
- similarity
- time sequence
- mode
- flow data
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/049—Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
-
- G—PHYSICS
- G08—SIGNALLING
- G08G—TRAFFIC CONTROL SYSTEMS
- G08G1/00—Traffic control systems for road vehicles
- G08G1/01—Detecting movement of traffic to be counted or controlled
- G08G1/0104—Measuring and analyzing of parameters relative to traffic conditions
- G08G1/0125—Traffic data processing
- G08G1/0129—Traffic data processing for creating historical data or processing based on historical data
Abstract
The invention discloses a traffic flow anomaly detection method and system based on pattern similarity, relating to the technical field of traffic flow anomaly detection models. The method comprises: extracting temporal features from traffic flow data with an improved long short-term memory neural network; segmenting the traffic flow data with a sliding window, clustering the segments, and taking the short-term sequence corresponding to each cluster center as a pattern feature; calculating the temporal similarity between the temporal features of different spatial positions; determining the pattern feature closest to each pattern feature and weighting the nearest neighbor distances of the resulting pattern feature pairs to obtain the pattern similarity of different spatial positions; determining the sequence similarity from the temporal similarity and the pattern similarity, and constructing a dynamic traffic flow relationship graph over different times and spatial positions from the sequence similarity; and detecting abnormal traffic flow states using the dynamic traffic flow relationship graph and the temporal similarity, thereby improving the accuracy of traffic flow anomaly detection.
Description
Technical Field
The invention relates to the technical field of traffic flow anomaly detection models, in particular to a traffic flow anomaly detection method and system based on pattern similarity.
Background
With the development of big data technology, artificial intelligence is widely applied to traffic flow anomaly detection and traffic flow prediction. Accurately detecting abnormal traffic flow conditions not only provides a useful decision reference for traffic management departments but also offers pedestrians more suitable route choices, which helps relieve traffic pressure.
Traffic flow at an intersection is affected by many factors, such as time, weather and traffic policy, and shows obvious periodicity. Existing traffic flow anomaly detection algorithms based on machine learning have at least the following three problems:
(1) A single recurrent neural network model cannot effectively extract the information in the historical traffic flow sequence.
(2) Existing traffic flow anomaly detection considers only the traffic condition of a single intersection and ignores the associated influence of other intersections.
(3) There is no effective measure for the similarity of traffic flows between different roads.
Disclosure of Invention
To solve the above problems, the invention provides a traffic flow anomaly detection method and system based on pattern similarity, which extract temporal features and pattern features from traffic flow data and construct a dynamic traffic flow relationship graph in order to judge traffic flow anomalies and improve the accuracy of traffic flow anomaly detection.
In order to achieve the above purpose, the present invention adopts the following technical scheme:
In a first aspect, the present invention provides a traffic flow anomaly detection method based on pattern similarity, including:
acquiring traffic flow data;
extracting temporal features from the traffic flow data with an improved long short-term memory (LSTM) neural network, wherein the improved network obtains the temporal features by a weighted sum of the hidden states obtained at different moments;
segmenting the traffic flow data with a sliding window to obtain a set of short-term sequences, clustering the set, and taking the short-term sequence corresponding to the cluster center of each category as a pattern feature;
calculating the temporal similarity between the temporal features of different spatial positions;
determining, for each pattern feature, the pattern feature closest to it so as to form pattern feature pairs, and weighting the nearest neighbor distances of the pattern feature pairs to obtain the pattern similarity of different spatial positions;
determining the sequence similarity from the temporal similarity and the pattern similarity, and constructing a dynamic traffic flow relationship graph over different times and different spatial positions from the sequence similarity;
and detecting abnormal traffic flow states using the dynamic traffic flow relationship graph and the temporal similarity.
When the hidden states obtained at different moments are weighted and summed to obtain the temporal feature, each weight is determined by the correlation between the corresponding hidden state and the traffic flow data. The weight formula involves the traffic flow data $x_t$ of day $t$, the hidden states, a correlation function, a parameter to be learned, the number of days $T$ of input traffic flow data, and a transpose operation.
Optionally, the temporal similarity between the temporal features of different spatial positions is calculated as follows: the temporal features of spatial position $a$ and spatial position $b$ on day $t$ are concatenated and passed through a network composed of a weight matrix to be learned and the activation function tanh.
In an alternative embodiment, in the process of weighting the nearest neighbor distance of the pattern feature pair, the weight is the number of elements included in the category of the pattern feature.
Alternatively, the sequence similarity is determined by summing weighted time-series similarity and pattern similarity.
As an alternative embodiment, the process of constructing the dynamic traffic flow relationship graph includes:
constructing a relationship graph of different spatial positions at the same time according to the sequence similarity of the traffic flow data of different spatial positions;
introducing a connectivity matrix between the traffic flow data of different spatial positions, and constructing the dynamic traffic flow relationship graph from the relationship graph and the connectivity matrix;
wherein the construction formula involves a parameter to be learned, the connectivity matrix, the activation function tanh, the current time and the time indicated by the prior data, the time difference between these two times, and a decreasing function of that time difference.
Optionally, the connectivity matrix is defined between $X_a$ and $X_b$, where $X_a$ is the traffic flow data of spatial position $a$ and $X_b$ is the traffic flow data of spatial position $b$.
In a second aspect, the present invention provides a traffic flow anomaly detection system based on pattern similarity, including:
a data acquisition module configured to acquire traffic flow data;
a temporal feature extraction module configured to extract temporal features from the traffic flow data with an improved long short-term memory (LSTM) neural network, wherein the improved network obtains the temporal features by a weighted sum of the hidden states obtained at different moments;
a pattern feature extraction module configured to segment the traffic flow data with a sliding window to obtain a set of short-term sequences, cluster the set, and take the short-term sequence corresponding to the cluster center of each category as a pattern feature;
a temporal similarity determining module configured to calculate the temporal similarity between the temporal features of different spatial positions;
a pattern similarity determining module configured to determine, for each pattern feature, the pattern feature closest to it so as to form pattern feature pairs, and to weight the nearest neighbor distances of the pattern feature pairs to obtain the pattern similarity of different spatial positions;
a dynamic relationship graph construction module configured to determine the sequence similarity from the temporal similarity and the pattern similarity, and to construct a dynamic traffic flow relationship graph over different times and different spatial positions from the sequence similarity;
an anomaly detection module configured to detect abnormal traffic flow states using the dynamic traffic flow relationship graph and the temporal similarity.
In a third aspect, the invention provides an electronic device comprising a memory and a processor and computer instructions stored on the memory and running on the processor, which when executed by the processor, perform the method of the first aspect.
In a fourth aspect, the present invention provides a computer readable storage medium storing computer instructions which, when executed by a processor, perform the method of the first aspect.
Compared with the prior art, the invention has the beneficial effects that:
The invention provides a traffic flow anomaly detection method and system based on pattern similarity. An improved long short-term memory neural network extracts temporal features, while pattern features are extracted so that the periodic characteristics of the traffic flow data are taken into account. After similarity is calculated on both kinds of features, a dynamic traffic flow relationship graph is constructed that considers both the association between different spatial positions and the influence of different times on the current association. Finally, a graph attention network judges traffic flow anomalies, which improves the accuracy of traffic flow anomaly detection.
Additional aspects of the invention will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the invention.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the invention.
Fig. 1 is a flow chart of the traffic flow anomaly detection method based on pattern similarity provided in embodiment 1 of the present invention;
Fig. 2 is a schematic diagram of the dynamic relationship graph provided in embodiment 1 of the present invention;
Fig. 3 is a flow chart of anomaly determination provided in embodiment 1 of the present invention.
Detailed Description
The invention is further described below with reference to the drawings and examples.
It should be noted that the following detailed description is exemplary and is intended to provide further explanation of the invention. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.
It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of exemplary embodiments according to the present invention. As used herein, unless the context clearly indicates otherwise, the singular forms also are intended to include the plural forms, and furthermore, it is to be understood that the terms "comprises" and "comprising" and any variations thereof are intended to cover non-exclusive inclusions, such as, for example, processes, methods, systems, products or devices that comprise a series of steps or units, are not necessarily limited to those steps or units that are expressly listed, but may include other steps or units that are not expressly listed or inherent to such processes, methods, products or devices.
Embodiments of the invention and features of the embodiments may be combined with each other without conflict.
Embodiment 1
This embodiment proposes a traffic flow anomaly detection method based on pattern similarity which, as shown in Fig. 1, includes:
acquiring traffic flow data;
extracting temporal features from the traffic flow data with an improved long short-term memory (LSTM) neural network, wherein the improved network obtains the temporal features by a weighted sum of the hidden states obtained at different moments;
segmenting the traffic flow data with a sliding window to obtain a set of short-term sequences, clustering the set, and taking the short-term sequence corresponding to the cluster center of each category as a pattern feature;
calculating the temporal similarity between the temporal features of different spatial positions;
determining, for each pattern feature, the pattern feature closest to it so as to form pattern feature pairs, and weighting the nearest neighbor distances of the pattern feature pairs to obtain the pattern similarity of different spatial positions;
determining the sequence similarity from the temporal similarity and the pattern similarity, and constructing a dynamic traffic flow relationship graph over different times and different spatial positions from the sequence similarity;
and detecting abnormal traffic flow states using the dynamic traffic flow relationship graph and the temporal similarity.
In this embodiment, the traffic flow data within $T$ days is defined as $X = \{x_1, x_2, \ldots, x_T\}$, where the traffic flow data of day $t$ is $x_t = \{x_t^1, x_t^2, \ldots, x_t^N\}$, $x_t^n$ is the data of the $n$-th minute of day $t$, and $N$ is the length of the traffic flow data within one day.
Because traffic flow data is affected by many complex factors, this embodiment models the acquired traffic flow data with an improved long short-term memory (LSTM) neural network to obtain a vector representation of its temporal characteristics, that is, to extract the temporal features.
LSTM is an improved variant of the recurrent neural network (RNN) that is widely used in time series modeling; its gating units mitigate the vanishing gradient problem of RNNs to some extent. For each traffic flow data $x_t$, the LSTM is modeled by formulas (1) to (6), which define, in order, the input gate, the forget gate, the output gate, the long-term memory and the short-term memory.
In these formulas, $W_i$, $W_f$, $W_C$ and $W_o$ are parameters to be learned, $C_t$ is the cell state, $\tilde{C}_t$ is the intermediate quantity of the cell state, $\odot$ is the Hadamard product, and $h_t$ is the hidden state. To keep the description of the improved algorithm simple, the formulas omit the bias terms.
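For reference, the textbook bias-free LSTM can be written with the symbols defined above. This is only a sketch under the assumption that formulas (1) to (6) follow the standard form; the sigmoid $\sigma$, the gate symbols $i_t$, $f_t$, $o_t$ and the concatenation $[h_{t-1}, x_t]$ are conventional notation and are not taken from the source text.

```latex
\begin{aligned}
i_t &= \sigma\!\left(W_i\,[h_{t-1}, x_t]\right) && \text{(input gate)} \\
f_t &= \sigma\!\left(W_f\,[h_{t-1}, x_t]\right) && \text{(forget gate)} \\
o_t &= \sigma\!\left(W_o\,[h_{t-1}, x_t]\right) && \text{(output gate)} \\
\tilde{C}_t &= \tanh\!\left(W_C\,[h_{t-1}, x_t]\right) && \text{(candidate cell state)} \\
C_t &= f_t \odot C_{t-1} + i_t \odot \tilde{C}_t && \text{(long-term memory)} \\
h_t &= o_t \odot \tanh(C_t) && \text{(short-term memory)}
\end{aligned}
```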
In most existing algorithms, the last hidden state $h_t$ is taken as the output of the LSTM, which tends to ignore the features contained in the earlier hidden states.
Therefore, in this embodiment, the hidden states at different moments are fused by a weighted sum to obtain the temporal feature. The weight of each hidden state is defined by its correlation with $x_t$, so that hidden states with larger correlation receive larger weights; this improves the ability of the output to represent $x_t$, as shown in formulas (7) to (9).
In these formulas, the weight of each hidden state is determined by the correlation function, which contains a parameter to be learned.
After all traffic flow data has been processed in this way, the vector representation of $x_t$, that is, the temporal feature $v_t$, is obtained; the whole process is summarized as formula (10).
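The weighted fusion of hidden states can be sketched as follows. This is a minimal illustration rather than the patented formulas: it assumes a bilinear correlation score h^T W x followed by a softmax, and the names `attention_pool`, `hidden`, `W` and `x` are hypothetical.

```python
import math

def attention_pool(hidden_states, x, W):
    """Fuse hidden states into one feature vector by a softmax-weighted sum.

    hidden_states: list of vectors of length d_h
    x:             input vector of length d_x
    W:             d_h x d_x matrix (list of rows), a stand-in for the learned parameter
    """
    # W x, a vector of length d_h
    Wx = [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) for row in W]
    # Assumed correlation score of each hidden state with x: h^T W x
    scores = [sum(h_i * wx_i for h_i, wx_i in zip(h, Wx)) for h in hidden_states]
    # Numerically stable softmax turns the scores into weights that sum to 1
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Weighted sum of the hidden states gives the fused feature
    d_h = len(hidden_states[0])
    fused = [sum(w * h[k] for w, h in zip(weights, hidden_states)) for k in range(d_h)]
    return weights, fused

hidden = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # toy hidden states
x = [1.0, 2.0]                                  # toy input
W = [[1.0, 0.0], [0.0, 1.0]]                    # identity as a placeholder parameter
weights, v = attention_pool(hidden, x, W)
```

With these toy values the scores are 1, 2 and 3, so the third hidden state receives the largest weight.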
Pattern features are approximately identical short-term sequences that recur in the historical data. This embodiment therefore proposes a pattern feature extraction method based on segmentation and clustering to capture periodic features.
First, a sliding window divides the traffic flow data into a plurality of short-term sequences. Specifically:
the sliding window divides the traffic flow data $x_t$ of day $t$ into $M$ windows to construct the short-term sequence set of day $t$, in which each short-term sequence has the window length $L$ and $M = N - L + 1$.
Then, the short-term sequence set is clustered according to the distance between short-term sequences to capture the recurring short-term sequences, that is, the pattern features.
Specifically, all short-term sequences are gathered into one set and clustered to capture the pattern features.
Short-term sequences that belong to the same category are approximately identical. The cluster centers of all categories are taken as the pattern features of the traffic flow data of day $t$, where each element is the cluster center of one category and $g$ is the number of categories.
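The segmentation and clustering step can be sketched as below. The source does not name a clustering algorithm, so plain k-means is an assumption here; note that k-means centers are averages of the member windows rather than observed sequences. The function names and the toy data are hypothetical.

```python
def sliding_windows(series, L):
    """Split one day of data (length N) into M = N - L + 1 overlapping windows."""
    return [series[m:m + L] for m in range(len(series) - L + 1)]

def sq_dist(p, q):
    """Squared Euclidean distance between two equally long sequences."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, g, iters=20):
    """Plain k-means. Returns g cluster centers and the size of each cluster."""
    step = max(1, len(points) // g)
    centers = [list(p) for p in points[::step][:g]]  # evenly spaced initial centers
    for _ in range(iters):
        clusters = [[] for _ in range(g)]
        for p in points:
            nearest = min(range(g), key=lambda c: sq_dist(p, centers[c]))
            clusters[nearest].append(p)
        for c, members in enumerate(clusters):
            if members:  # keep the old center if a cluster is empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return centers, [len(members) for members in clusters]

day = [0, 0, 0, 10, 10, 10, 0, 0, 0, 10, 10, 10]   # toy data, N = 12
windows = sliding_windows(day, L=3)                 # M = 12 - 3 + 1 = 10 windows
pattern_features, sizes = kmeans(windows, g=2)
```

The cluster sizes are returned alongside the centers because the later weighting step uses the number of elements in each category.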
In this embodiment, similarity is calculated separately for the temporal features and the pattern features; the sequence similarity is then determined from the temporal similarity and the pattern similarity, and the balance between the two similarities is controlled.
In this embodiment, the temporal similarity is calculated between the temporal features of the traffic flow data of different spatial positions (such as different traffic intersections) at the same time. The temporal features of spatial position $a$ and spatial position $b$ on day $t$ are concatenated and passed through a network composed of a weight matrix to be learned and the activation function tanh.
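A minimal sketch of this computation, assuming the learned weight matrix reduces the concatenated features to a single score; the vector `w` and the function name `temporal_similarity` are hypothetical placeholders.

```python
import math

def temporal_similarity(v_a, v_b, w):
    """tanh of a learned linear map applied to the concatenation [v_a ; v_b]."""
    concat = list(v_a) + list(v_b)
    return math.tanh(sum(wi * ci for wi, ci in zip(w, concat)))

v_a = [0.2, 0.5]                 # temporal feature of position a (toy values)
v_b = [0.1, 0.4]                 # temporal feature of position b (toy values)
w = [0.3, -0.2, 0.5, 0.1]        # placeholder for the learned weights
s_ab = temporal_similarity(v_a, v_b, w)
```

Because of the tanh activation, the resulting score always lies strictly between -1 and 1.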
In this embodiment, once the pattern features of the traffic flow data of all spatial positions have been obtained, the pattern similarity is determined by calculating the distance between the pattern features of the traffic flow data $X_a$ of spatial position $a$ and the pattern features of the traffic flow data $X_b$ of spatial position $b$.
The pattern features within a set have no inherent order, and the pattern feature sets of two spatial positions may contain different numbers of elements, so the correspondence between the elements of the two sets is not easy to determine during calculation. To keep the algorithm simple and robust, this embodiment computes the nearest neighbor distance of each pattern feature, which handles the lack of a one-to-one correspondence between the pattern features of different traffic flow data sequences.
The nearest neighbor distance $D_{1NN}$ is the distance between a pattern feature and the pattern feature closest to it; formula (11) defines it over the $i$-th pattern feature of one set and the $j$-th pattern feature of the other set.
When the $j$-th pattern feature of spatial position $b$ is the nearest neighbor of the $i$-th pattern feature of spatial position $a$, the Euclidean distance between them is the nearest neighbor distance of that feature; the nearest neighbor distances of all pattern features of $a$ relative to those of $b$ are represented as an array.
Notably, the nearest neighbor relation is not symmetric: when one pattern feature is the nearest neighbor of another, the reverse does not necessarily hold.
Therefore, two arrays are needed: one holding the nearest neighbor distances of all pattern features of $a$ relative to those of $b$, and one holding the nearest neighbor distances of all pattern features of $b$ relative to those of $a$.
To make the distance measure between the two pattern feature sets symmetric, the two arrays are merged into one array whose length is the sum of the numbers of pattern features of $a$ and $b$; the merged array contains the nearest neighbor distances of all pattern features of both sets. The question is then which value best represents the distance between the two sets. If the maximum value of the merged array is chosen, noise peaks that appear at random in the data seriously distort the distance; if the minimum value is chosen, most pairs of sets can hardly be distinguished.
To take the influence of all patterns into account as far as possible, this embodiment weights all values in the merged array to obtain the pattern similarity. When the pattern feature sets of $a$ and $b$ are constructed, the number of elements in the category of each pattern feature is recorded; these counts are merged in the same way as the distances and serve as the weights. The weighting function is shown in formula (13).
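The symmetric, size-weighted nearest neighbor distance described above can be sketched as follows. Normalizing by the total weight is an assumption, since the exact form of formula (13) is not reproduced in this text; all function and variable names are hypothetical.

```python
import math

def euclid(p, q):
    """Euclidean distance between two pattern features of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def nn_distances(src, dst):
    """For every pattern feature in src, the distance to its nearest neighbor in dst."""
    return [min(euclid(p, q) for q in dst) for p in src]

def weighted_pattern_distance(feats_a, sizes_a, feats_b, sizes_b):
    """Merge both directions of nearest neighbor distances and weight by cluster size."""
    # Concatenating a->b and b->a makes the measure symmetric in a and b
    distances = nn_distances(feats_a, feats_b) + nn_distances(feats_b, feats_a)
    weights = list(sizes_a) + list(sizes_b)
    # Assumed normalization: weighted mean of the merged distances
    return sum(w * d for w, d in zip(weights, distances)) / sum(weights)

feats_a, sizes_a = [[0.0, 0.0], [3.0, 4.0]], [8, 2]   # two patterns at position a
feats_b, sizes_b = [[0.0, 0.0]], [10]                 # one pattern at position b
d_ab = weighted_pattern_distance(feats_a, sizes_a, feats_b, sizes_b)
d_ba = weighted_pattern_distance(feats_b, sizes_b, feats_a, sizes_a)
```

In this toy case the rare pattern at position a (size 2) sits at distance 5 from its nearest neighbor, but its small weight keeps the overall value at 0.5, which illustrates why weighting is less sensitive to outliers than taking the maximum.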
In this embodiment, the sequence similarity is determined from the temporal similarity and the pattern similarity, as shown in formula (14).
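One simple way to realize a weighted sum of the two similarities is a convex combination. The expression below is an illustrative assumption, not the published formula (14); the symbols $S^{\mathrm{seq}}_{ab}$, $S^{\mathrm{time}}_{ab}$, $S^{\mathrm{pat}}_{ab}$ and the balance coefficient $\lambda$ are hypothetical.

```latex
S^{\mathrm{seq}}_{ab} = \lambda\, S^{\mathrm{time}}_{ab} + (1 - \lambda)\, S^{\mathrm{pat}}_{ab},
\qquad 0 \le \lambda \le 1
```

A larger $\lambda$ emphasizes the learned temporal features, while a smaller $\lambda$ emphasizes the recurring patterns.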
In this embodiment, a relationship graph of different spatial positions (traffic intersections) at the same time is first constructed from the sequence similarity; the relationship graphs at different times are then processed by a dynamic-graph-based LSTM to construct a dynamic traffic flow relationship graph that contains temporal characteristics.
Specifically, a relationship graph of different spatial positions at the same time (the same day) is constructed from the sequence similarity of the traffic flow data of different spatial positions, as shown in formula (15); each entry of the relationship graph constructed for day $t$ is the sequence similarity between the traffic flow data of spatial position $a$ and spatial position $b$.
The relationship graph reflects the associations on day $t$ but ignores the influence of other times on the current associations. For this reason, a dynamic relationship graph construction method is designed with reference to the gating structure of LSTM, as shown in Fig. 2, and named Dynamic-Graph-based LSTM (DGLSTM); this part in effect optimizes the input data, and the corresponding formula is formula (16).
In formula (16), the connectivity matrix between different spatial positions guides the construction of the dynamic relationship graph and is defined by formula (17); the remaining quantities are a parameter to be learned, the time difference between the current time and the time indicated by the prior data, and a decreasing function of that time difference. The decreasing function assigns weights to the prior data, so that the prior data is gradually forgotten as the time interval grows.
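The role of the decreasing function can be illustrated with a small sketch. The specific choice 1 / log(e + Δt) is an assumption taken from common time-aware LSTM variants and is not the function defined in the patent; `decay` and `discount_prior_graph` are hypothetical names.

```python
import math

def decay(delta_t):
    """Decreasing weight for prior data; equals 1 at delta_t = 0 (assumed form)."""
    return 1.0 / math.log(math.e + delta_t)

def discount_prior_graph(prior_graph, t_now, t_prior):
    """Scale every entry of a prior relationship graph by the decay of its age."""
    w = decay(t_now - t_prior)
    return [[w * value for value in row] for row in prior_graph]

prior = [[1.0, 0.4], [0.4, 1.0]]                               # toy relationship graph
recent = discount_prior_graph(prior, t_now=10, t_prior=9)     # one day old
old = discount_prior_graph(prior, t_now=10, t_prior=1)        # nine days old
```

The older graph contributes less to the current step than the recent one, which is the gradual forgetting behavior described above.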
DGLSTM is expressed by formulas (18) to (23), which define, in order, the input gate, the forget gate, the output gate, the long-term memory and the short-term memory.
After DGLSTM, the dynamic relationship graph is obtained. Compared with the relationship graph of a single day, the dynamic relationship graph not only reflects the associations among the current time series but also incorporates the influence of the historical relationship graphs. For convenience of description, its construction process is summarized as formula (24).
In this embodiment, a graph attention network (GAT) is used to judge traffic flow anomalies: it aggregates the influence between similar sequences to capture the implicit information in the dynamic relationship graph. Compared with the conventional graph convolutional network (GCN), GAT can aggregate the influence of similar sequences selectively; its expression is given in formula (25).
In formula (25), the temporal feature of the traffic flow data of the $k$-th intersection is obtained through the LSTM; the sequence similarity between intersection $k$ and intersection $p$ is the value in row $k$, column $p$ of the dynamic relationship graph; a weight matrix is to be learned; and the output of GAT for intersection $k$ is its implicit information.
The process of obtaining the implicit information of all intersections through GAT is represented by formula (26).
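The aggregation step can be sketched as a similarity-weighted sum over intersections. This is a simplified illustration that uses the entries of the dynamic relationship graph directly as aggregation coefficients and assumes a tanh activation; it is not the exact formula (25), and the names `aggregate`, `G`, `V` and `W` are hypothetical.

```python
import math

def aggregate(G, V, W):
    """For every intersection k, compute tanh(sum_p G[k][p] * (W v_p))."""
    # Project each temporal feature v_p with the weight matrix W
    projected = [[sum(w * x for w, x in zip(row, v)) for row in W] for v in V]
    hidden = []
    for k in range(len(V)):
        summed = [sum(G[k][p] * projected[p][j] for p in range(len(V)))
                  for j in range(len(W))]
        hidden.append([math.tanh(s) for s in summed])
    return hidden

G = [[1.0, 0.0], [0.0, 1.0]]      # toy relationship graph: each intersection sees only itself
V = [[0.5, 0.0], [0.0, 0.5]]      # toy temporal features of two intersections
W = [[1.0, 0.0], [0.0, 1.0]]      # identity as a placeholder weight matrix
H = aggregate(G, V, W)
```

With an identity graph each output depends only on its own feature; off-diagonal similarities would mix in the features of related intersections.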
For anomaly determination, a label is defined, and a single fully connected (FC) layer is used as the prediction function to predict it from the implicit information, as shown in Fig. 3; the prediction formula is formula (27).
In this embodiment, the graph attention network is trained with the cross-entropy loss, as shown in formula (28), where the loss $L$ is computed from the true category and the predicted value of intersection $k$ at time $t$ and is minimized to reduce the gap between the predicted values and the true categories.
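A minimal sketch of the prediction layer and the loss, assuming binary labels (normal or abnormal), a sigmoid output and a mean over samples; the function names and the toy inputs are hypothetical.

```python
import math

def fc_predict(h, w, b):
    """Single fully connected layer followed by a sigmoid (binary case assumed)."""
    z = sum(wi * hi for wi, hi in zip(w, h)) + b
    return 1.0 / (1.0 + math.exp(-z))

def cross_entropy(y_true, y_pred, eps=1e-12):
    """Mean binary cross-entropy between labels in {0, 1} and probabilities."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)  # avoid log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(y_true)

p = fc_predict([0.2, -0.1], w=[0.0, 0.0], b=0.0)   # zero weights give probability 0.5
loss = cross_entropy([1, 0], [p, p])
```

An untrained layer with zero weights predicts 0.5 for every intersection, which yields a loss of log 2; training lowers the loss by moving the predictions toward the true categories.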
Because anomaly detection is a typical classification task, this embodiment evaluates the prediction performance of the graph attention network with two metrics widely accepted for classification tasks: accuracy (ACC) and the Matthews correlation coefficient (MCC).
ACC intuitively expresses the prediction performance of the model, as shown in formula (29):
$$\mathrm{ACC} = \frac{TP + TN}{TP + TN + FP + FN} \tag{29}$$
where a true positive (TP) is a result for which both the predicted and the true value are normal; a true negative (TN) is a result for which both the predicted and the true value are abnormal; a false positive (FP) is a result predicted as normal that is actually abnormal; and a false negative (FN) is a result predicted as abnormal that is actually normal.
MCC evaluates the classification performance of a model and is in effect a correlation coefficient between the actual classification and the predicted classification. Its value lies between -1 and +1: +1 indicates perfect prediction, 0 indicates prediction no better than random, and -1 indicates complete disagreement between prediction and observation. MCC is calculated as shown in formula (30):
$$\mathrm{MCC} = \frac{TP \cdot TN - FP \cdot FN}{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}} \tag{30}$$
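Both metrics follow directly from the four confusion-matrix counts, as the short sketch below shows. The counts are made-up illustration values, and returning 0 when the MCC denominator vanishes is a common convention adopted here as an assumption.

```python
import math

def accuracy(tp, tn, fp, fn):
    """Share of correct predictions among all predictions."""
    return (tp + tn) / (tp + tn + fp + fn)

def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; 0.0 if the denominator is zero."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

# Illustrative counts: "positive" means normal, as in the definitions above
acc_value = accuracy(tp=90, tn=5, fp=3, fn=2)
mcc_value = mcc(tp=90, tn=5, fp=3, fn=2)
```

On imbalanced data such as this toy case, accuracy is high because normal samples dominate, while MCC is noticeably lower because it also accounts for how well the rare abnormal class is recognized.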
The method of this embodiment was compared with three conventional methods: the graph attention network (GAT), the temporal convolutional network (TCN) and the gated recurrent unit (GRU) network. The method of this embodiment achieved the highest values on both metrics and ranked first among the compared methods, which confirms its effectiveness.
Example 2
The embodiment provides a traffic flow anomaly detection system based on pattern similarity, which comprises:
a data acquisition module configured to acquire traffic flow data;
a time sequence feature extraction module configured to extract time sequence features from the traffic flow data using an improved long short-term memory (LSTM) neural network, wherein the improved LSTM neural network obtains the time sequence features by taking a weighted sum of the hidden states obtained at different moments;
a pattern feature extraction module configured to segment the traffic flow data with a sliding window to obtain a set of short-term sequences, cluster the set, and take the short-term sequence corresponding to the cluster center of each category as a pattern feature;
a time sequence similarity determining module configured to calculate the time sequence similarity between the time sequence features of different spatial positions;
a pattern similarity determining module configured to determine, for each pattern feature, the closest pattern feature so as to form a pattern feature pair, and to weight the nearest-neighbor distances of the pattern feature pairs to obtain the pattern similarity of different spatial positions;
a dynamic relationship graph construction module configured to determine the sequence similarity from the time sequence similarity and the pattern similarity, and to construct traffic flow dynamic relationship graphs for different times and different spatial positions according to the sequence similarity; and
an anomaly detection module configured to detect abnormal traffic flow states using the traffic flow dynamic relationship graph and the time sequence similarity.
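The pattern feature extraction step in the module list above (sliding-window segmentation followed by clustering) can be sketched as follows. This section does not name the clustering algorithm, so plain k-means with a deterministic initialisation is assumed here; the function names, the window width, the step, and the choice of the member closest to each center as the representative sequence are all illustrative assumptions.

```python
def sliding_windows(series, width, step=1):
    """Cut `series` into short-term sequences of length `width`."""
    return [list(series[i:i + width])
            for i in range(0, len(series) - width + 1, step)]

def sq_dist(p, q):
    return sum((x - y) ** 2 for x, y in zip(p, q))

def assign(points, centers):
    """Group points by the index of their nearest center."""
    groups = [[] for _ in centers]
    for p in points:
        nearest = min(range(len(centers)), key=lambda c: sq_dist(p, centers[c]))
        groups[nearest].append(p)
    return groups

def kmeans(points, k, iters=50):
    # Deterministic start: k evenly spaced points (an illustrative choice).
    centers = [list(points[(i * len(points)) // k]) for i in range(k)]
    for _ in range(iters):
        groups = assign(points, centers)
        updated = [[sum(col) / len(g) for col in zip(*g)] if g else c
                   for g, c in zip(groups, centers)]
        if updated == centers:
            break
        centers = updated
    return centers, assign(points, centers)

def pattern_features(series, width, k, step=1):
    """Return (representative sequence, cluster size) for each non-empty cluster."""
    windows = sliding_windows(series, width, step)
    centers, groups = kmeans(windows, k)
    features = []
    for center, group in zip(centers, groups):
        if group:
            representative = min(group, key=lambda w: sq_dist(w, center))
            features.append((representative, len(group)))
    return features

# Hypothetical flow series with a low regime and a high regime.
flow = [1, 1, 1, 1, 9, 9, 9, 9, 1, 1, 1, 1]
features = pattern_features(flow, width=2, k=2, step=2)
print(features)
```

Keeping the cluster size next to each representative sequence is convenient because the size can later serve as the weight when pattern feature pairs are compared.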
It should be noted that the above modules correspond to the steps described in Embodiment 1; the examples and application scenarios implemented by the modules are the same as those of the corresponding steps, but are not limited to what is disclosed in Embodiment 1. It should also be noted that the modules described above may be implemented as part of a system in a computer system, for example as a set of computer-executable instructions.
In further embodiments, there is also provided:
an electronic device comprising a memory and a processor and computer instructions stored on the memory and running on the processor, which when executed by the processor, perform the method described in embodiment 1. For brevity, the description is omitted here.
It should be understood that in this embodiment the processor may be a central processing unit (CPU), or another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general-purpose processor may be a microprocessor or any conventional processor.
The memory may include read only memory and random access memory and provide instructions and data to the processor, and a portion of the memory may also include non-volatile random access memory. For example, the memory may also store information of the device type.
A computer readable storage medium storing computer instructions which, when executed by a processor, perform the method described in embodiment 1.
The method in Embodiment 1 may be performed directly by a hardware processor, or by a combination of hardware and software modules in the processor. The software modules may be located in a storage medium well known in the art, such as random access memory, flash memory, read-only memory, programmable read-only memory, electrically erasable programmable memory, or registers. The storage medium is located in the memory; the processor reads the information in the memory and, in combination with its hardware, performs the steps of the above method. To avoid repetition, a detailed description is not provided here.
Those of ordinary skill in the art will appreciate that the units and algorithm steps of the examples described in connection with this embodiment can be implemented as electronic hardware or as a combination of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends on the particular application and the design constraints imposed on the solution. Skilled artisans may implement the described functionality in different ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
While the foregoing description of the embodiments of the present invention has been presented in conjunction with the drawings, it should be understood that it is not intended to limit the scope of the invention, but rather, it is intended to cover all modifications or variations within the scope of the invention as defined by the claims of the present invention.
Claims (8)
1. A traffic flow anomaly detection method based on pattern similarity, characterized by comprising the following steps:
acquiring traffic flow data;
extracting time sequence features from the traffic flow data using an improved long short-term memory (LSTM) neural network, wherein the improved LSTM neural network obtains the time sequence features by taking a weighted sum of the hidden states obtained at different moments;
segmenting the traffic flow data with a sliding window to obtain a set of short-term sequences, clustering the set, and taking the short-term sequence corresponding to the cluster center of each category as a pattern feature;
calculating the time sequence similarity between the time sequence features of different spatial positions;
determining, for each pattern feature, the closest pattern feature so as to form a pattern feature pair, and weighting the nearest-neighbor distances of the pattern feature pairs to obtain the pattern similarity of different spatial positions;
determining the sequence similarity from the time sequence similarity and the pattern similarity, and constructing traffic flow dynamic relationship graphs for different times and different spatial positions according to the sequence similarity;
detecting abnormal traffic flow states using the traffic flow dynamic relationship graph and the time sequence similarity;
wherein, in the process of obtaining the time sequence features by taking a weighted sum of the hidden states obtained at different moments, the weight is determined according to the correlation between the hidden states at different moments and the traffic flow data. The weight is as follows:
wherein $x_t$ is the traffic flow data of day $t$, and the remaining symbols denote, respectively, the hidden state, the correlation function, the parameters to be learned, the number of days of input traffic flow data, and the transposition operation;
the process of calculating the time sequence similarity for the time sequence features of different spatial positions is as follows:
wherein the symbols denote, respectively, the time sequence feature of spatial position $a$ on day $t$, the time sequence feature of spatial position $b$ on day $t$, a network composed of a weight matrix to be learned and the activation function tanh, and the concatenation of the two time sequence features.
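The two computations named in claim 1 can be illustrated with a small sketch. The exact formulas are not legible in this text, so the code below swaps in a generic softmax attention: the correlation score `tanh(w_h . h_t + w_x . x_t)`, the softmax normalisation over the input days, and the scalar similarity `tanh(w . [f_a, f_b])` are all assumptions chosen only to match the prose description (a learned correlation between hidden state and input, a weighted sum of hidden states, and a tanh network applied to the concatenated features). All names and parameter shapes are hypothetical.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def softmax(scores):
    peak = max(scores)                      # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def time_sequence_feature(hidden_states, inputs, w_h, w_x):
    """Weighted sum of hidden states over the input days.

    Assumed score: tanh(w_h . h_t + w_x . x_t), normalised with a softmax.
    """
    scores = [math.tanh(dot(w_h, h) + dot(w_x, x))
              for h, x in zip(hidden_states, inputs)]
    weights = softmax(scores)
    dim = len(hidden_states[0])
    feature = [sum(w * h[i] for w, h in zip(weights, hidden_states))
               for i in range(dim)]
    return feature, weights

def time_sequence_similarity(feature_a, feature_b, w):
    """Assumed form: tanh of a learned linear map on the concatenated features."""
    return math.tanh(dot(w, feature_a + feature_b))  # list `+` concatenates

# Hypothetical two-day example with 2-dimensional hidden states.
hidden = [[1.0, 0.0], [0.0, 1.0]]
flows = [[0.0], [2.0]]
feat, alphas = time_sequence_feature(hidden, flows, w_h=[0.0, 0.0], w_x=[1.0])
sim = time_sequence_similarity(feat, feat, w=[1.0, 1.0, 1.0, 1.0])
print(feat, alphas, sim)
```

In a trained model the vectors `w_h`, `w_x`, and `w` would be learned jointly with the LSTM; here they are fixed by hand so that the arithmetic can be followed.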
2. The traffic flow anomaly detection method based on pattern similarity according to claim 1, wherein in the process of weighting nearest neighbor distances of pattern feature pairs, the weight is the number of elements contained in the category of the pattern feature.
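The weighting in claim 2 can be sketched as follows: each pattern feature of one spatial position is paired with its nearest pattern feature at the other position, and the nearest-neighbor distance is weighted by the number of elements in that pattern feature's category. The Euclidean distance, the normalisation by the total weight, and the mapping `1 / (1 + d)` from distance to similarity are assumptions added for illustration; the claim fixes only the weight.

```python
import math

def euclid(p, q):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(p, q)))

def pattern_similarity(patterns_a, patterns_b):
    """Similarity of position a to position b.

    Each argument is a list of (sequence, cluster_size) pairs.
    """
    total_weight = sum(size for _, size in patterns_a)
    weighted_distance = 0.0
    for sequence, size in patterns_a:
        # Nearest pattern feature at the other position forms the pair.
        nearest = min(euclid(sequence, other) for other, _ in patterns_b)
        weighted_distance += size * nearest
    mean_distance = weighted_distance / total_weight
    return 1.0 / (1.0 + mean_distance)   # assumed distance-to-similarity mapping

# Hypothetical pattern features for three spatial positions.
pos_a = [([1, 1], 4), ([9, 9], 2)]
pos_b = [([1, 1], 3), ([9, 9], 3)]
pos_c = [([1, 1], 5), ([5, 5], 1)]
print(pattern_similarity(pos_a, pos_b), pattern_similarity(pos_a, pos_c))
```

As written, the measure is not symmetric, because the pairs and weights are taken from the first argument; averaging the two directions is one way to symmetrise it if a symmetric graph is needed.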
3. The traffic flow anomaly detection method based on pattern similarity according to claim 1, wherein the sequence similarity is determined by summing up weighted time sequence similarity and pattern similarity.
4. The traffic flow anomaly detection method based on pattern similarity as claimed in claim 1, wherein the process of constructing the traffic flow dynamic relationship graph comprises:
constructing a relationship diagram of different spatial positions at the same time according to the sequence similarity of traffic flow data of different spatial positions;
introducing a connectivity matrix between the traffic flow data of different spatial positions, and constructing the traffic flow dynamic relationship graph according to the relationship graph and the connectivity matrix.
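Claims 3 and 4 can be illustrated together for a single time step. The sketch below forms the sequence similarity as a weighted sum of the two similarities and then keeps an edge only where the similarity is high enough and the connectivity matrix marks the two positions as connected. The weight of 0.5, the threshold, and the element-wise masking are assumptions; the claims do not fix these choices in this text, and all names are hypothetical.

```python
def sequence_similarity(time_sim, pattern_sim, weight=0.5):
    """Weighted sum of two n x n similarity matrices (weights assumed to sum to 1)."""
    n = len(time_sim)
    return [[weight * time_sim[i][j] + (1.0 - weight) * pattern_sim[i][j]
             for j in range(n)] for i in range(n)]

def dynamic_relationship_graph(seq_sim, connectivity, threshold=0.5):
    """Adjacency matrix for one time step.

    An edge (i, j) is kept only if the positions are connected and their
    sequence similarity reaches the threshold (assumed masking rule).
    """
    n = len(seq_sim)
    adjacency = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j and connectivity[i][j] and seq_sim[i][j] >= threshold:
                adjacency[i][j] = seq_sim[i][j]
    return adjacency

# Hypothetical similarities for three spatial positions on one day.
time_sim = [[1.0, 0.8, 0.2], [0.8, 1.0, 0.6], [0.2, 0.6, 1.0]]
pattern_sim = [[1.0, 0.6, 0.4], [0.6, 1.0, 0.9], [0.4, 0.9, 1.0]]
connectivity = [[0, 1, 0], [1, 0, 0], [0, 0, 0]]   # only positions 0 and 1 are linked

seq_sim = sequence_similarity(time_sim, pattern_sim)
graph = dynamic_relationship_graph(seq_sim, connectivity)
print(graph)
```

Repeating the construction for every day yields one graph per time step, which is what makes the relationship graph dynamic.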
5. The traffic flow anomaly detection method based on pattern similarity as claimed in claim 4, wherein the connectivity matrix is:
6. A traffic flow anomaly detection system based on pattern similarity, comprising:
a data acquisition module configured to acquire traffic flow data;
a time sequence feature extraction module configured to extract time sequence features from the traffic flow data using an improved long short-term memory (LSTM) neural network, wherein the improved LSTM neural network obtains the time sequence features by taking a weighted sum of the hidden states obtained at different moments;
a pattern feature extraction module configured to segment the traffic flow data with a sliding window to obtain a set of short-term sequences, cluster the set, and take the short-term sequence corresponding to the cluster center of each category as a pattern feature;
a time sequence similarity determining module configured to calculate the time sequence similarity between the time sequence features of different spatial positions;
a pattern similarity determining module configured to determine, for each pattern feature, the closest pattern feature so as to form a pattern feature pair, and to weight the nearest-neighbor distances of the pattern feature pairs to obtain the pattern similarity of different spatial positions;
a dynamic relationship graph construction module configured to determine the sequence similarity from the time sequence similarity and the pattern similarity, and to construct traffic flow dynamic relationship graphs for different times and different spatial positions according to the sequence similarity;
an anomaly detection module configured to detect abnormal traffic flow states using the traffic flow dynamic relationship graph and the time sequence similarity;
wherein, in the process of obtaining the time sequence features by taking a weighted sum of the hidden states obtained at different moments, the weight is determined according to the correlation between the hidden states at different moments and the traffic flow data. The weight is as follows:
wherein $x_t$ is the traffic flow data of day $t$, and the remaining symbols denote, respectively, the hidden state, the correlation function, the parameters to be learned, the number of days of input traffic flow data, and the transposition operation;
the process of calculating the time sequence similarity for the time sequence features of different spatial positions is as follows:
wherein the symbols denote, respectively, the time sequence feature of spatial position $a$ on day $t$, the time sequence feature of spatial position $b$ on day $t$, a network composed of a weight matrix to be learned and the activation function tanh, and the concatenation of the two time sequence features.
7. An electronic device comprising a memory and a processor and computer instructions stored on the memory and running on the processor, which when executed by the processor, perform the method of any one of claims 1-5.
8. A computer readable storage medium storing computer instructions which, when executed by a processor, perform the method of any of claims 1-5.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211365058.7A CN115423048B (en) | 2022-11-03 | 2022-11-03 | Traffic flow anomaly detection method and system based on pattern similarity |
Publications (2)
Publication Number | Publication Date |
---|---|
CN115423048A (en) | 2022-12-02 |
CN115423048B (en) | 2023-04-25 |
Family
ID=84207956
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202211365058.7A Active CN115423048B (en) | 2022-11-03 | 2022-11-03 | Traffic flow anomaly detection method and system based on pattern similarity |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115423048B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116361635B (en) * | 2023-06-02 | 2023-10-10 | 中国科学院成都文献情报中心 | Multidimensional time sequence data anomaly detection method |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2022047658A1 (en) * | 2020-09-02 | 2022-03-10 | 大连大学 | Log anomaly detection system |
WO2022160902A1 (en) * | 2021-01-28 | 2022-08-04 | 广西大学 | Anomaly detection method for large-scale multivariate time series data in cloud environment |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CA2841422A1 (en) * | 2011-07-20 | 2013-01-24 | Elminda Ltd. | Method and system for estimating brain concussion |
US20200097808A1 (en) * | 2018-09-21 | 2020-03-26 | International Business Machines Corporation | Pattern Identification in Reinforcement Learning |
CN111145541B (en) * | 2019-12-18 | 2021-10-22 | 深圳先进技术研究院 | Traffic flow data prediction method, storage medium, and computer device |
CN112801404B (en) * | 2021-02-14 | 2024-03-22 | 北京工业大学 | Traffic prediction method based on self-adaptive space self-attention force diagram convolution |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |