CN113627383A - Pedestrian loitering re-identification method for panoramic intelligent security - Google Patents

Pedestrian loitering re-identification method for panoramic intelligent security

Info

Publication number
CN113627383A
Authority
CN
China
Prior art keywords
pedestrian
loitering
panoramic
identification
picture
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110978611.3A
Other languages
Chinese (zh)
Inventor
张楠
黄绩
程德强
寇旗旗
赵凯
吕晨
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China University of Mining and Technology CUMT
Original Assignee
China University of Mining and Technology CUMT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China University of Mining and Technology (CUMT)
Priority to CN202110978611.3A
Publication of CN113627383A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/232 Non-hierarchical techniques
    • G06F18/2321 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Probability & Statistics with Applications (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a pedestrian loitering re-identification method for panoramic intelligent security, which mainly comprises four parts: video picture acquisition, picture quality evaluation, pedestrian detection and re-identification, and loitering judgment. The first part, picture acquisition, previews the security video in real time, captures pictures by the second, and stores them in memory. The second part, picture quality evaluation, screens out captured pictures that are blurred, heavily occluded, or otherwise unusable. The third part combines pedestrian detection and pedestrian re-identification: the captured panoramic picture is fed into a single neural network that jointly handles the two tasks. The fourth part is pedestrian loitering judgment, which decides whether a pedestrian is loitering by checking whether the camera id is the same and how long the interval between pictures is.

Description

Pedestrian loitering re-identification method for panoramic intelligent security
Technical Field
The invention belongs to the technical field of pedestrian loitering re-identification, and particularly relates to a pedestrian loitering re-identification method for panoramic intelligent security.
Background
With the rapid development of modern information technology and the arrival of new-infrastructure policies, the concept of the "smart city" is gradually becoming reality. A smart city integrates the real world and the digital world on top of the digital city, the Internet of Things and cloud computing, enabling intelligent management and operation of the city. Intelligent security is a principal application scenario of the smart city, and the video surveillance system plays a key role in building panoramic intelligent security. For videos in intelligent video surveillance, unattended intelligent analysis such as target detection, classification, identification, tracking, feature point extraction and motion estimation is attracting increasing attention. Pedestrian loitering detection determines whether a person stays in a place longer than a certain period of time or moves along an abnormal trajectory (for example, repeatedly walking back and forth in one place), and is a common video analysis technique.
Pedestrian loitering re-identification is currently a hot topic in intelligent security and is mainly divided into four parts: video picture processing, pedestrian detection, pedestrian re-identification and loitering judgment. Pedestrian loitering detection is usually treated as a sub-problem of pedestrian abnormal-behavior recognition and has certain shortcomings. First, traditional pedestrian behavior recognition methods manually extract features of abnormal pedestrian behavior and feed them into a simple classifier such as a support vector machine; this consumes considerable manpower and financial resources, detects loitering behavior poorly, and is not tailored to the specific problem. Second, traditional abnormal-behavior recognition mainly obtains pedestrians by foreground-background modeling, which is easily disturbed by noisy background information and therefore has a high error rate. For pedestrian detection, the traditional foreground-background separation approach based on background modeling also performs poorly and needs improvement. Existing pedestrian re-identification technology can compensate for the limited field of view of fixed cameras; combined with pedestrian detection and tracking, it can be applied to panoramic intelligent security and effectively assist in detecting loitering pedestrians. Pedestrian re-identification methods are mainly divided into unsupervised and supervised pedestrian re-identification.
Because the identities of pedestrians in surveillance video are unknown, only unsupervised pedestrian re-identification can be used. At present, unsupervised pedestrian re-identification still has much room for improvement in discriminating the similarity between samples and in recognizing hard samples (pedestrians with similar appearance and clothes but different identities). Result uncertainty caused by different viewpoints, low resolution, illumination change, occlusion, background clutter and unreliable bounding-box generation also needs to be addressed by a suitable method. To summarize the existing shortcomings: traditional pedestrian loitering detection performs poorly when a tracked pedestrian is lost or when a pedestrian leaves the video picture and later returns; and the panoramic picture obtained from a real-time surveillance video stream contains a large amount of background information, so feeding it into the pedestrian re-identification step results in poor identification efficiency due to the noisy background. Therefore a pedestrian loitering re-identification method for panoramic intelligent security needs to be designed to solve these problems.
Disclosure of Invention
The invention aims to provide a pedestrian loitering re-identification method for panoramic intelligent security, which can solve the problems.
The technical scheme adopted by the invention is as follows:
a pedestrian loitering re-identification method for panoramic intelligent security comprises the following steps:
a. collecting panoramic pictures of the real-time security monitoring video;
b. carrying out quality evaluation on the acquired panoramic picture;
c. carrying out pedestrian detection and re-identification combined processing on the screened panoramic picture;
d. carrying out pedestrian loitering judgment on the pseudo label assignment result obtained in the pedestrian detection and re-identification step.
The invention is further improved in that: in step a, the security video panoramic picture is acquired with the Python version of OpenCV; the cv2 and NumPy libraries are prepared, real-time preview of the security camera is realized from its pushed stream, frames are read from the camera to capture the panoramic picture, and the picture is saved.
The invention is further improved in that: in step b, picture quality evaluation is based on an image evaluation model; whether the pedestrians in the panoramic picture are blurred or heavily occluded is judged, and video screenshots whose evaluation results are below a scoring threshold are screened out.
The invention is further improved in that: in step c, a new deep learning framework is adopted to jointly process pedestrian detection and pedestrian re-identification.
The invention is further improved in that: the new deep learning framework is mainly divided into five modules: a convolutional neural network (CNN) module, a pedestrian detection module, a pooling layer module, a mutual-nearest-neighbor-based pseudo label assignment module and a loss function module; pedestrian loitering judgment is based on the pseudo label assignment result: within a group of similar pedestrian features sharing the same pseudo label, it is first judged whether the camera ids of the pedestrian pictures are the same; if they differ, the possibility of loitering is excluded; if the camera id is the same, the frame-number interval between the pedestrian pictures is then judged.
The invention is further improved in that: the basic model of the convolutional neural network (CNN) module is ResNet50; the backbone consists of the first four layers of ResNet50; an attention mechanism is added at the first layer of the network and at the last convolutional layer used for feature extraction; and Instance Normalization (IN) is added before the ReLU of each residual block to suppress the influence of the image background:

$$ y_{tijk} = \frac{x_{tijk} - \mu_{ti}}{\sqrt{\sigma_{ti}^{2} + \varepsilon}}, \qquad \mu_{ti} = \frac{1}{HW} \sum_{l=1}^{W} \sum_{m=1}^{H} x_{tilm}, \qquad \sigma_{ti}^{2} = \frac{1}{HW} \sum_{l=1}^{W} \sum_{m=1}^{H} \left( x_{tilm} - \mu_{ti} \right)^{2} $$

where $x_{tijk}$ denotes the $tijk$-th element of the activation map, $j$ and $k$ index the spatial dimensions, $i$ is the feature channel, $t$ is the index of the image in the batch, $\varepsilon$ is a small constant, and $H$ and $W$ are the height and width respectively.
The invention is further improved in that: the pedestrian detection module operates on the feature map; it transforms the pedestrian features in the feature map with a 512 × 3 × 3 convolutional layer, uses anchors and a SoftMax classifier at each position of the feature map to predict whether an anchor box contains a pedestrian, and further comprises a linear regressor for adjusting the anchor box position.
The invention is further improved in that: the pooling layer module comprises region-of-interest pooling (RoI Pooling) and global average pooling (Global Average Pooling).
The invention is further improved in that: the mutual-nearest-neighbor-based pseudo label assignment module calculates the mutual nearest-neighbor relations of all feature vectors and then, using transitivity, divides the whole feature vector space into a number of different clusters to obtain the pseudo labels.
The invention is further improved in that: the loss function module comprises a cross-entropy loss function and a triplet loss function.
Beneficial effects:
firstly, aiming at the poor detection performance of existing pedestrian loitering methods, deep learning is combined with the traditional method for the first time: pedestrian detection and re-identification are completed within a single CNN convolutional neural network, which simplifies the complexity of detecting pedestrian loitering with a purely deep-learning method and improves the loitering detection accuracy of the traditional method;
secondly, when the panoramic picture obtained from the real-time surveillance camera passes through the pedestrian detection module, the large amount of background information it contains interferes with the extraction of pedestrian features. The invention uses an attention mechanism and IN (Instance Normalization): by adding the attention mechanism after the ResNet50 convolutional layers and adding an IN module inside the residual blocks, the influence of the picture background is weakened and more attention is paid to extracting pedestrian features.
Drawings
FIG. 1 is a system block diagram of a pedestrian loitering re-identification method for panoramic intelligent security;
FIG. 2 is a security video picture capture flow diagram of the present invention;
FIG. 3 is a pedestrian detection and re-identification framework of the present invention;
FIG. 4 is a ResNet50 residual block with IN added (left) and a ResNet50 residual block without IN added (right) of the present invention;
FIG. 5 is a network abstraction backbone model of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments.
The invention provides a pedestrian loitering re-identification method applied to panoramic intelligent security. The method mainly comprises four parts: video picture acquisition, picture quality evaluation, pedestrian detection and re-identification, and loitering judgment. The first part, picture acquisition, previews the security video in real time, captures pictures by the second, and stores them in memory. The second part, picture quality evaluation, screens out captured pictures that are blurred, heavily occluded, or otherwise unusable. The third part combines pedestrian detection and pedestrian re-identification: the captured panoramic picture is fed into a single neural network that jointly handles the two tasks. The fourth part is pedestrian loitering judgment, which decides whether a pedestrian is loitering by checking whether the camera id is the same and how long the interval between pictures is. The overall process is shown in FIG. 1.
A first part: video picture capture
The security video picture acquisition process is shown in fig. 2. We use the python version of OpenCV to implement a real-time video preview screenshot. Firstly, cv2 and a NumPy library are prepared, real-time preview is realized on a security camera in a flow pushing mode, and the mode of the camera is as follows:videoCapture=cv2.VideoCapture(1). Because the moving speed of the pedestrian does not reach a very high speed, the pedestrian can be extracted according to the second, the pedestrian can be intercepted once every two frames of the camera according to the frames per second of the camera, and the stored pictures are placed in an updatable storage.
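A minimal sketch of this capture loop is given below, assuming the camera is reachable as a local device index or a pull-stream URL; the output directory, the file naming and the one-second interval are illustrative choices rather than values fixed by the description:

    import os
    import time
    import cv2  # OpenCV, as named in the description

    def capture_panoramas(source=1, out_dir="frames", seconds_between=1):
        """Preview a security stream and save one panoramic frame per interval."""
        os.makedirs(out_dir, exist_ok=True)
        videoCapture = cv2.VideoCapture(source)            # e.g. 1 or "rtsp://..."
        fps = videoCapture.get(cv2.CAP_PROP_FPS) or 25     # fall back if FPS is unknown
        step = max(1, int(round(fps * seconds_between)))   # frames between captures
        frame_id = 0
        while True:
            ok, frame = videoCapture.read()
            if not ok:
                break
            cv2.imshow("preview", frame)                   # real-time preview
            if frame_id % step == 0:                       # capture roughly once per second
                name = f"{int(time.time())}_{frame_id}.jpg"
                cv2.imwrite(os.path.join(out_dir, name), frame)
            frame_id += 1
            if cv2.waitKey(1) & 0xFF == ord("q"):
                break
        videoCapture.release()
        cv2.destroyAllWindows()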
A second part: picture quality assessment
In the pictures intercepted by the panoramic security video in seconds, the pedestrian characteristics and the effect possibly reflected by some pictures are not good. For example, the pedestrian cannot distinguish specific personal features in a fuzzy manner, and the pedestrian is blocked by buildings, obstacles, passing vehicles and the like to have most of main features, and the pedestrian cannot be selected in such cases because the pedestrian in the next step cannot be effectively detected and identified. An image evaluation model is adopted to evaluate pedestrian images in the captured images, judge the fuzzy degree and the integrity degree of the pedestrians, and screen out the video screenshots of which the evaluation results are lower than a grading threshold value.
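The patent does not specify the image evaluation model; the sketch below stands in for it with a simple variance-of-Laplacian blur score, and both the scoring function and the threshold value are assumptions made only for illustration:

    import cv2
    import numpy as np

    def sharpness_score(image_bgr):
        """Higher means sharper; variance of the Laplacian is a common blur proxy."""
        gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
        return cv2.Laplacian(gray, cv2.CV_64F).var()

    def filter_frames(paths, score_threshold=100.0):
        """Keep only screenshots whose quality score reaches the threshold."""
        kept = []
        for p in paths:
            img = cv2.imread(p)
            if img is None:
                continue
            if sharpness_score(img) >= score_threshold:
                kept.append(p)
        return kept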
And a third part: pedestrian detection and re-identification
The part is a core part of pedestrian loitering detection, the pedestrian detection and the pedestrian re-identification are regarded as two independent tasks in the past task, the pedestrian detection and the pedestrian re-identification are jointly processed, and a new deep learning framework is provided as shown in fig. 3.
The framework is mainly divided into 5 modules, the first Module is a convolutional neural network CNN, the basic model of the CNN is ResNet50, in (input normalization) is added into a residual block in order to eliminate the influence of an image background, and in order to extract pedestrian features more intensively, Attention mechanisms are added into the ResNet network, namely Channel Attention Module and Spatial Attention Module respectively. The second module is pedestrian detection, which uses convolution layer to convert the pedestrian character in the character graph, uses anchor point and SoftMax classifier to predict whether the anchor frame contains pedestrian at each position of the character graph, and it also includes a linear regression to adjust the position of the anchor frame. The third module is a Pooling layer, which is to send the region of 1024 × 14 obtained from the feature map into a region-of-interest Pooling layer (Rol Pooling), then pass through a Global Average Pooling layer (Global Average Pooling), and finally integrate to obtain 2048-dimensional feature vectors. The fourth module is based on Mutual neighbor Pseudo label assignment (Mutual neighbor Neighbors Pseudo label), which can better explore the similarity between samples. The last module is a penalty function that calculates the penalty based on the assigned pseudo label.
The first module: CNN convolutional neural network
The basic model of the CNN is ResNet50; the backbone actually consists of the first four layers of ResNet50, each containing a set of residual blocks, and IN (Instance Normalization, a variant of Batch Normalization, BN) is added just before the ReLU of each residual block, as shown in FIG. 4.
The residual block is thus reconstructed by adding IN before the ReLU. IN is a variant of Batch Normalization (BN); the computational difference between them is that IN normalizes features using the statistics of a single sample rather than the statistics of a batch of samples. IN is mainly used in style transfer to filter instance-specific contrast out of the content, and adding it can significantly improve model performance. It can be written as:
$$ y_{tijk} = \frac{x_{tijk} - \mu_{ti}}{\sqrt{\sigma_{ti}^{2} + \varepsilon}}, \qquad \mu_{ti} = \frac{1}{HW} \sum_{l=1}^{W} \sum_{m=1}^{H} x_{tilm}, \qquad \sigma_{ti}^{2} = \frac{1}{HW} \sum_{l=1}^{W} \sum_{m=1}^{H} \left( x_{tilm} - \mu_{ti} \right)^{2} $$

where $x_{tijk}$ denotes the $tijk$-th element of the activation map, $j$ and $k$ index the spatial dimensions, $i$ is the feature channel, $t$ is the index of the image in the batch, $\varepsilon$ is a small constant, and $H$ and $W$ are the height and width respectively.
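As a cross-check of the formulas above, the following NumPy sketch computes IN exactly as defined, normalizing each sample and channel over its spatial dimensions; the (T, C, H, W) tensor layout and the value of ε are illustrative assumptions:

    import numpy as np

    def instance_norm(x, eps=1e-5):
        """Instance Normalization: per-sample, per-channel normalization over H and W.

        x has shape (T, C, H, W): batch index t, feature channel i, spatial dims (H, W).
        """
        mu = x.mean(axis=(2, 3), keepdims=True)                 # mu_{ti}
        var = ((x - mu) ** 2).mean(axis=(2, 3), keepdims=True)  # sigma^2_{ti}
        return (x - mu) / np.sqrt(var + eps)                    # y_{tijk}

    # Example: a batch of 2 activation maps with 4 channels of size 8x8.
    y = instance_norm(np.random.randn(2, 4, 8, 8))
    assert np.allclose(y.mean(axis=(2, 3)), 0.0, atol=1e-6)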
The CNN backbone starts with a 7 × 7 convolutional layer (the first convolutional layer, conv1), followed by four blocks (conv2_x to conv5_x) containing 3, 4, 6 and 3 residual units respectively. We add the attention mechanism at the first layer of the network and at the last convolutional layer used for feature extraction; the ResNet50 backbone with these additions is shown in FIG. 5. Given an input image, the stem CNN produces feature maps with 1024 channels at 1/16 of the resolution of the original image.
A second module: pedestrian detection
The panoramic picture generates a feature map through a convolutional neural network CNN, and pedestrian detection is to predict a pedestrian boundary frame on the basis of the feature map. We add 512 x 3 convolution layer on the feature map, then anchor classification and anchor regression to predict the pedestrian interested region, and generate many interested region bounding boxes in the pedestrian region prediction stage, and the output of this stage is the bounding box list of the possible positions of the pedestrian. We first transformed the pedestrian features using 512 x 3 convolutional layers, using 9 anchor points and a SoftMax classifier at each position of the feature map to predict whether each bounding box contains a pedestrian. Then, to further determine the anchor frame, the linear regressor is used to adjust the anchor frame position, and we will leave the first 128 adjusted bounding boxes after non-maximum suppression as the final choice. Since pedestrian detection will inevitably contain some false alarms and misalignment cases, the SoftMax classifier and linear regression are again used to exclude non-people and refine the location.
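A minimal sketch of such a detection head is shown below in PyTorch (an assumed framework; the patent only names the operations). It applies the 512 × 3 × 3 convolution, then per-anchor SoftMax classification and box regression over 9 anchors at every feature-map position; the class name and channel counts are illustrative, and proposal generation plus non-maximum suppression are omitted:

    import torch
    import torch.nn as nn

    class PedestrianRPNHead(nn.Module):
        """Anchor classification + regression head over a 1024-channel feature map."""
        def __init__(self, in_channels=1024, num_anchors=9):
            super().__init__()
            self.conv = nn.Conv2d(in_channels, 512, kernel_size=3, padding=1)  # 512 x 3 x 3
            self.cls = nn.Conv2d(512, num_anchors * 2, kernel_size=1)  # pedestrian / background
            self.reg = nn.Conv2d(512, num_anchors * 4, kernel_size=1)  # box offsets

        def forward(self, feat):
            h = torch.relu(self.conv(feat))
            scores = self.cls(h)                       # (N, 2*A, H, W)
            n, _, hh, ww = scores.shape
            scores = scores.view(n, 2, -1, hh, ww)     # split the 2-way logits per anchor
            probs = torch.softmax(scores, dim=1)       # SoftMax over {background, pedestrian}
            deltas = self.reg(h)                       # linear regression of anchor positions
            return probs, deltas

    # Usage on a 1024-channel feature map at 1/16 resolution of a 640x480 frame:
    probs, deltas = PedestrianRPNHead()(torch.randn(1, 1024, 30, 40))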
A third module: pooling layer
The pooling layer includes two layers of region-of-interest pooling and global average pooling, and the pedestrian region prediction stage generates many regions of interest, which may slow down performance and processing speed, where region-of-interest pooling is needed. Region of interest pooling for each region of interest from the input list, a portion of the corresponding input feature map is taken and scaled to some predefined size, the scaling being done by: (1) dividing the prediction region into equal sized portions (the number of which is the same as the output dimension); (2) finding the maximum value of each part; (3) these maxima are copied to the output. Finally, from a list of bounding rectangle boxes with different sizes, a list of corresponding feature maps with a fixed size can be obtained quickly. The dimensionality of the region of interest pooling output does not actually depend on the size of the input element map, nor on the region of interest size, which is determined only by the number of portions into which we divide the predicted pedestrian.
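The three scaling steps above can be written directly as the following NumPy sketch, a simplified max-pooling variant that assumes box coordinates are already given in feature-map cells (real implementations interpolate sub-cell boundaries):

    import numpy as np

    def roi_max_pool(feature_map, box, out_size=(14, 14)):
        """RoI pooling by the three steps in the text: split, take max, copy to output.

        feature_map: (C, H, W) array; box: (x1, y1, x2, y2) in feature-map coordinates.
        """
        c = feature_map.shape[0]
        x1, y1, x2, y2 = [int(v) for v in box]
        region = feature_map[:, y1:y2, x1:x2]
        out_h, out_w = out_size
        # (1) divide the region into equally sized parts (same count as the output size)
        ys = np.linspace(0, region.shape[1], out_h + 1, dtype=int)
        xs = np.linspace(0, region.shape[2], out_w + 1, dtype=int)
        out = np.zeros((c, out_h, out_w), dtype=feature_map.dtype)
        for i in range(out_h):
            for j in range(out_w):
                part = region[:, ys[i]:max(ys[i + 1], ys[i] + 1),
                                 xs[j]:max(xs[j + 1], xs[j] + 1)]
                # (2) find the maximum of each part and (3) copy it to the output
                out[:, i, j] = part.max(axis=(1, 2))
        return out

    # A 1024-channel map pooled to the fixed 1024 x 14 x 14 size used later in the network.
    pooled = roi_max_pool(np.random.randn(1024, 40, 60), box=(5, 3, 33, 35))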
The region-of-interest pooling layer obtains 1024 × 14 × 14 regions from the feature map; these are then sent through the remaining ResNet50 layers (conv4_4 to conv5_3) followed by a global average pooling layer, and finally integrated into 2048-dimensional feature vectors.
A fourth module: mutual nearest neighbor based pseudo label allocation
After 2048-dimensional feature vectors are obtained, we learn the similarity between pedestrian feature vectors based on a Mutual Nearest Neighbors Pseudo label assignment Method (MNNPL). The MNNPL method is based on a transitive k-nearest-neighbor relationship, where k-nearest-neighbor means that two samples are located in k-nearest-neighbor of each other, and is expressed as follows:
Figure 966746DEST_PATH_IMAGE016
wherein
Figure 428951DEST_PATH_IMAGE017
Representing the nearest neighbor of k,
Figure 839073DEST_PATH_IMAGE018
the subscript of (b) represents the image index.
The MNNPL method performs well in clustering. In particular, mutual nearest neighbor requires two samples to be among each other's k nearest neighbors; when k is small this is a strong constraint, which can be used to address the small inter-class distance caused by similar clothing. Meanwhile, there is a certain correlation between viewing angles: although the front and back of a pedestrian differ considerably, both are somewhat similar to the side, so the side can be used to establish a connection between the front and the back. For example, take the front, side and back views of a pedestrian and let $f_{front}$, $f_{side}$ and $f_{back}$ be the corresponding features. Since the back of a pedestrian may differ significantly from the front, $f_{front}$ and $f_{back}$ are not mutual k-nearest neighbors. However, both the front and the back are similar to the side, so $f_{front}$ and $f_{side}$ are mutual k-nearest neighbors, as are $f_{back}$ and $f_{side}$; that is, label($f_{front}$) = label($f_{side}$) and label($f_{side}$) = label($f_{back}$). By transitivity, label($f_{front}$) = label($f_{back}$). Thus the transitive mutual nearest-neighbor relation alleviates the large intra-class distance caused by viewpoint. The overall process of MNNPL is roughly as follows: first, compute the mutual nearest-neighbor relations of all feature vectors; second, divide the whole feature vector space into a number of different clusters using transitivity and obtain the pseudo labels.
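A compact sketch of this assignment procedure is given below, under the assumption that cosine similarity is used to rank neighbors (the patent does not name the distance) and that transitivity is applied by taking connected components of the mutual-neighbor graph:

    import numpy as np

    def mnnpl_pseudo_labels(features, k=5):
        """Assign pseudo labels by mutual k-nearest neighbors plus transitivity.

        features: (N, D) array of feature vectors; returns (N,) integer labels.
        """
        f = features / np.linalg.norm(features, axis=1, keepdims=True)
        sim = f @ f.T                                    # cosine similarity
        np.fill_diagonal(sim, -np.inf)                   # exclude self
        knn = np.argsort(-sim, axis=1)[:, :k]            # N_k(x_i)
        n = len(features)
        in_knn = np.zeros((n, n), dtype=bool)
        in_knn[np.arange(n)[:, None], knn] = True
        mutual = in_knn & in_knn.T                       # x_i in N_k(x_j) and x_j in N_k(x_i)
        # Transitive closure via connected components (iterative DFS).
        labels = -np.ones(n, dtype=int)
        cur = 0
        for s in range(n):
            if labels[s] >= 0:
                continue
            stack = [s]
            while stack:
                u = stack.pop()
                if labels[u] >= 0:
                    continue
                labels[u] = cur
                stack.extend(np.flatnonzero(mutual[u] & (labels < 0)).tolist())
            cur += 1
        return labels

    labels = mnnpl_pseudo_labels(np.random.randn(100, 2048), k=5)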
A fifth module: loss function
The cross entropy loss and triplet loss functions are applied simultaneously in the training. After all 2048-dimensional feature vectors are clustered to generate pseudo labels, the pseudo labels and classification results generated by a classifier are used together to calculate cross entropy loss, and the cross entropy loss is used
Figure 148066DEST_PATH_IMAGE026
The sampling method forms small batches, i.e. each small batch is composed of k pictures of p pedestrians, and the cross entropy loss can be written as follows:
Figure 895443DEST_PATH_IMAGE027
Figure 168161DEST_PATH_IMAGE028
is an image
Figure 51803DEST_PATH_IMAGE029
Belong to the label
Figure 400876DEST_PATH_IMAGE030
The prediction probability of (2).
In this batch, in addition to calculating the classification penalty, the hard triplet penalty is also calculated as follows:
Figure 792805DEST_PATH_IMAGE031
wherein
Figure 744581DEST_PATH_IMAGE032
Figure 358096DEST_PATH_IMAGE033
Figure 737125DEST_PATH_IMAGE034
Respectively features extracted from the anchor image, the positive examples and the negative examples,
Figure 114885DEST_PATH_IMAGE035
is the super-edge parameter.
The total loss is the sum of the above two losses, which is defined as
Figure 745718DEST_PATH_IMAGE036
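The two losses can be sketched as follows in PyTorch (an assumed framework): the cross-entropy term uses the classifier logits against the pseudo labels, and the batch-hard triplet term takes, for each anchor in a p × k batch, its hardest positive and hardest negative; the margin value here is illustrative:

    import torch
    import torch.nn.functional as F

    def total_loss(logits, embeddings, pseudo_labels, margin=0.3):
        """L = L_ce + L_tri over a p x k mini-batch with assigned pseudo labels."""
        l_ce = F.cross_entropy(logits, pseudo_labels)

        dist = torch.cdist(embeddings, embeddings)               # pairwise L2 distances
        same = pseudo_labels[:, None] == pseudo_labels[None, :]
        eye = torch.eye(len(embeddings), dtype=torch.bool, device=embeddings.device)
        pos = dist.masked_fill(~same | eye, float('-inf')).max(dim=1).values  # hardest positive
        neg = dist.masked_fill(same, float('inf')).min(dim=1).values          # hardest negative
        l_tri = F.relu(pos - neg + margin).mean()                # hard triplet loss

        return l_ce + l_tri

    # Example with p=4 identities x k=4 images, 2048-d features and 4-way classifier logits.
    y = torch.arange(4).repeat_interleave(4)
    loss = total_loss(torch.randn(16, 4), torch.randn(16, 2048), y)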
The fourth part: loitering judgment
In the step of pedestrian detection and re-identification, the obtained panoramic picture is input into a convolutional neural network to detect pedestrians and extract the characteristics of the pedestrians, k nearest neighbors of all characteristic vectors are calculated through a KMNN algorithm, and finally the k nearest neighbors are divided into a plurality of clusters by utilizing transmissibility and pseudo labels are distributed. In this process, we find a class of pictures with features similar to pedestrians. According to the final pseudo label distribution result, carrying out wandering judgment. Because each panoramic picture has some attribute information such as camera id, frame number and the like, in a group of similar pedestrian features with the same pseudo tag, whether the camera id of the pedestrian picture is the same or not is judged firstly, and if the camera id of the pedestrian picture is different, the possibility of wandering pedestrians is eliminated; if the camera id is the same, judging whether the frame number interval of the pedestrian picture is larger than M (M is a constant, the normal non-loitering stay time is the number of frames per second), and if the frame number interval of the pedestrian picture is larger than M, determining that the pedestrian has loitering behavior; if M is less than M, it is considered that the pedestrian is less likely to wander.
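A sketch of this decision rule is shown below; each captured picture is assumed to carry (camera_id, frame_number) metadata, and M is the frame-count threshold corresponding to a normal, non-loitering stay (both the record format and the example threshold are illustrative):

    from itertools import combinations

    def loitering_in_cluster(records, m_frames):
        """records: list of (camera_id, frame_number) for pictures sharing one pseudo label.

        Returns True if any two sightings come from the same camera and are more than
        m_frames apart, i.e. the pedestrian reappears after a long interval.
        """
        for (cam_a, frame_a), (cam_b, frame_b) in combinations(records, 2):
            if cam_a != cam_b:
                continue                      # different cameras: not counted as loitering
            if abs(frame_a - frame_b) > m_frames:
                return True                   # same camera, interval longer than M frames
        return False

    # Example: 25 fps camera, M corresponding to a 60-second normal stay.
    M = 25 * 60
    print(loitering_in_cluster([("cam3", 120), ("cam3", 40000), ("cam7", 500)], M))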
The invention has the beneficial effects that:
firstly, the traditional method is combined with deep learning: panoramic picture frames are continuously acquired from the real-time security video by the traditional method, the acquired panoramic pictures are fed into the deep learning model, and the two tasks of pedestrian detection and pedestrian re-identification are combined. This not only simplifies the complex network that a purely deep-learning approach would need to detect pedestrian loitering, but also improves the loitering detection accuracy of the traditional method;
secondly, to counter the influence of background information in pedestrian pictures, a method of simultaneously adding an IN module and an attention mechanism to the ResNet50 network is proposed for the first time: the IN (Instance Normalization) module suppresses the influence of the picture background, while the attention mechanism focuses more on extracting pedestrian features and pays less attention to background information. Adding the IN module and the attention mechanism together effectively reduces the influence of background information on the final pedestrian re-identification result.
The above are merely preferred embodiments of the present invention.

Claims (10)

1. A pedestrian loitering re-identification method for panoramic intelligent security, characterized by comprising the following steps:
a. collecting panoramic pictures of the real-time security monitoring video;
b. carrying out quality evaluation on the acquired panoramic picture;
c. carrying out pedestrian detection and re-identification combined processing on the screened panoramic picture;
d. carrying out pedestrian loitering judgment on the pseudo label assignment result obtained in the pedestrian detection and re-identification step.
2. The pedestrian loitering re-identification method for panoramic intelligent security according to claim 1, characterized in that: in step a, the security video panoramic picture is acquired with the Python version of OpenCV; the cv2 and NumPy libraries are prepared, real-time preview of the security camera is realized from its pushed stream, frames are read from the camera to capture the panoramic picture, and the picture is saved.
3. The pedestrian loitering re-identification method for panoramic intelligent security according to claim 1, characterized in that: in step b, picture quality evaluation is based on an image evaluation model; whether the pedestrians in the panoramic picture are blurred or heavily occluded is judged, and video screenshots whose evaluation results are below a scoring threshold are screened out.
4. The pedestrian loitering re-identification method for panoramic intelligent security according to claim 1, characterized in that: in step c, a new deep learning framework is adopted to jointly process pedestrian detection and pedestrian re-identification.
5. The pedestrian loitering re-identification method for panoramic intelligent security according to claim 4, characterized in that: the new deep learning framework is mainly divided into five modules: a convolutional neural network (CNN) module, a pedestrian detection module, a pooling layer module, a mutual-nearest-neighbor-based pseudo label assignment module and a loss function module; pedestrian loitering judgment is based on the pseudo label assignment result: within a group of similar pedestrian features sharing the same pseudo label, it is first judged whether the camera ids of the pedestrian pictures are the same; if they differ, the possibility of loitering is excluded; if the camera id is the same, the frame-number interval between the pedestrian pictures is then judged.
6. The pedestrian loitering re-identification method for panoramic intelligent security according to claim 5, characterized in that: the basic model of the convolutional neural network (CNN) module is ResNet50; the backbone consists of the first four layers of ResNet50; an attention mechanism is added at the first layer of the network and at the last convolutional layer used for feature extraction; and Instance Normalization (IN) is added before the ReLU of each residual block to suppress the influence of the image background:

$$ y_{tijk} = \frac{x_{tijk} - \mu_{ti}}{\sqrt{\sigma_{ti}^{2} + \varepsilon}}, \qquad \mu_{ti} = \frac{1}{HW} \sum_{l=1}^{W} \sum_{m=1}^{H} x_{tilm}, \qquad \sigma_{ti}^{2} = \frac{1}{HW} \sum_{l=1}^{W} \sum_{m=1}^{H} \left( x_{tilm} - \mu_{ti} \right)^{2} $$

where $x_{tijk}$ denotes the $tijk$-th element of the activation map, $j$ and $k$ index the spatial dimensions, $i$ is the feature channel, $t$ is the index of the image in the batch, $\varepsilon$ is a small constant, and $H$ and $W$ are the height and width respectively.
7. The pedestrian loitering re-identification method for panoramic intelligent security according to claim 5, characterized in that: the pedestrian detection module operates on the feature map; it transforms the pedestrian features in the feature map with a 512 × 3 × 3 convolutional layer, uses anchors and a SoftMax classifier at each position of the feature map to predict whether an anchor box contains a pedestrian, and further comprises a linear regressor for adjusting the anchor box position.
8. The pedestrian loitering re-identification method for panoramic intelligent security according to claim 5, characterized in that: the pooling layer module comprises region-of-interest pooling (RoI Pooling) and global average pooling (Global Average Pooling).
9. The pedestrian loitering re-identification method for panoramic intelligent security according to claim 5, characterized in that: the mutual-nearest-neighbor-based pseudo label assignment module calculates the mutual nearest-neighbor relations of all feature vectors and then, using transitivity, divides the whole feature vector space into a number of different clusters to obtain the pseudo labels.
10. The pedestrian loitering re-identification method for panoramic intelligent security according to claim 5, characterized in that: the loss function module comprises a cross-entropy loss function and a triplet loss function.
CN202110978611.3A 2021-08-25 2021-08-25 Pedestrian loitering re-identification method for panoramic intelligent security Pending CN113627383A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110978611.3A CN113627383A (en) 2021-08-25 2021-08-25 Pedestrian loitering re-identification method for panoramic intelligent security

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110978611.3A CN113627383A (en) 2021-08-25 2021-08-25 Pedestrian loitering re-identification method for panoramic intelligent security

Publications (1)

Publication Number Publication Date
CN113627383A true CN113627383A (en) 2021-11-09

Family

ID=78387606

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110978611.3A Pending CN113627383A (en) 2021-08-25 2021-08-25 Pedestrian loitering re-identification method for panoramic intelligent security

Country Status (1)

Country Link
CN (1) CN113627383A (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109948425A (en) * 2019-01-22 2019-06-28 中国矿业大学 A kind of perception of structure is from paying attention to and online example polymerize matched pedestrian's searching method and device
US20210232813A1 (en) * 2020-01-23 2021-07-29 Tongji University Person re-identification method combining reverse attention and multi-scale deep supervision
CN112069940A (en) * 2020-08-24 2020-12-11 武汉大学 Cross-domain pedestrian re-identification method based on staged feature learning
CN112785572A (en) * 2021-01-21 2021-05-11 上海云从汇临人工智能科技有限公司 Image quality evaluation method, device and computer readable storage medium
CN112733814A (en) * 2021-03-30 2021-04-30 上海闪马智能科技有限公司 Deep learning-based pedestrian loitering retention detection method, system and medium

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
TONG XIAO et al.: "Joint Detection and Identification Feature Learning for Person Search", arXiv, pages 1-10 *
YANWEN CHONG et al.: "Learning domain invariant and specific representation for cross-domain person re-identification", Applied Intelligence, pages 5219-5232 *

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114662521A (en) * 2021-11-16 2022-06-24 成都考拉悠然科技有限公司 Method and system for detecting wandering behavior of pedestrian


Legal Events

PB01 Publication
SE01 Entry into force of request for substantive examination