CN109583315B - Multichannel rapid human body posture recognition method for intelligent video monitoring

Info

Publication number
CN109583315B
Authority
CN
China
Prior art keywords
video
network
forwarding
thread
specific steps
Prior art date
Legal status
Active
Application number
CN201811299870.8A
Other languages
Chinese (zh)
Other versions
CN109583315A (en)
Inventor
赵霞
李磊
于重重
管文化
赵松
冯泽骁
Current Assignee
Beijing Technology and Business University
Original Assignee
Beijing Technology and Business University
Priority date
Filing date
Publication date
Application filed by Beijing Technology and Business University filed Critical Beijing Technology and Business University
Priority to CN201811299870.8A priority Critical patent/CN109583315B/en
Publication of CN109583315A publication Critical patent/CN109583315A/en
Application granted granted Critical
Publication of CN109583315B publication Critical patent/CN109583315B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 - Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20 - Movements or behaviour, e.g. gesture recognition
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/24 - Classification techniques
    • G06F18/241 - Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2411 - Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/40 - Extraction of image or video features
    • G06V10/46 - Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
    • G06V10/462 - Salient features, e.g. scale invariant feature transforms [SIFT]
    • G06V10/464 - Salient features, e.g. scale invariant feature transforms [SIFT] using a plurality of salient features, e.g. bag-of-words [BoW] representations
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 - Scenes; Scene-specific elements
    • G06V20/40 - Scenes; Scene-specific elements in video content
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 - Scenes; Scene-specific elements
    • G06V20/40 - Scenes; Scene-specific elements in video content
    • G06V20/46 - Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Psychiatry (AREA)
  • Social Psychology (AREA)
  • Human Computer Interaction (AREA)
  • Image Analysis (AREA)
  • Closed-Circuit Television Systems (AREA)

Abstract

The invention realizes a multichannel rapid human body posture recognition method oriented to intelligent video monitoring and builds the corresponding recognition system architecture. A forwarding server receives a client request, acquires the video stream from a network video recorder, and selects key frames for format conversion; it performs rapid moving-target detection and human body detection on the key frames and sends video frames containing a human body to an intelligent analysis server, which performs gesture recognition and returns the result to the forwarding server; the forwarding server forwards the recognition result and the video stream to the client, which displays them on its interface. While supporting multiple video channels simultaneously, the method achieves faster human body detection and gesture recognition, shields the influence of complex backgrounds and clothing, and markedly improves recognition speed and accuracy; it therefore has good application prospects and market value.

Description

Multichannel rapid human body posture recognition method for intelligent video monitoring
Technical Field
The invention relates to human body gesture recognition in surveillance video, in particular to a multichannel rapid human body gesture recognition method for intelligent video monitoring, and belongs to the fields of real-time streaming media and computer vision.
Background
At present, the main function of most video monitoring systems is to record and display the scene at a monitored site; they cannot intelligently analyze the site in real time, so security personnel must watch the live pictures, judge people's behavior and the nature of the scene themselves, and are subject to fatigue, omissions and misjudgments that vary from person to person. Existing video monitoring systems therefore contribute greatly to after-the-fact evidence collection, but in automatically identifying and discovering incidents on site they fall short of the timeliness, accuracy, intelligence and efficiency demanded by current security early-warning work.
Intelligent video monitoring is a new technology in the video monitoring field. Using computer vision, image processing and pattern recognition techniques, it automatically analyzes the video images captured by cameras, detects targets in the scene, recognizes and tracks them, analyzes and understands their behavior on that basis, and provides information useful for monitoring and early warning. Intelligent video monitoring systems are applied not only in security but also have broad application prospects and great economic value in traffic, military, finance, industry and other fields.
In many systems requiring intelligent video analysis and processing, such as action classification, abnormal behavior detection and automatic driving, describing human body gestures and predicting human behavior is important. Human body posture recognition can be applied to home monitoring and teaching, for example recognizing falls of elderly people living alone or assessing students' state in class. By analyzing the video streams of a monitoring system in real time, special human body gestures are identified and a timely response is made, such as rescuing an elderly person or adjusting teaching in a classroom.
Human body gesture recognition is the process of, given an image or video, detecting the positions of the human skeletal joints in it and classifying and labeling the human posture according to the structural characteristics of those joints. Human skeletal joint detection is the key step. With the development of deep learning, the detection of human skeletal joints has improved continuously, been applied across related fields of computer vision, and attracted sustained attention from researchers.
The patent "Pedestrian detection method based on video monitoring" (CN201010227766.5) uses extended gradient-histogram features and the Adaboost algorithm to detect pedestrians rapidly, then uses gradient-histogram features and a support vector machine to further verify the detected pedestrians. Patent CN20110566809.1 discloses a pedestrian detection method for intelligent video monitoring that trains a pedestrian detector with a support vector machine, classifies each pedestrian detection window in a picture with the SVM, and fuses the windows into the final detection result. The patent "Intelligent detection and early warning method for salient events in monitoring video with active visual attention" (CN201710181799.2) establishes a bottom-up rapid extraction method for primary visual-attention information and an active detection model for dynamic targets, tracks salient targets with a particle swarm algorithm, and builds an active early-warning model for salient events in the monitoring video, realizing an intelligent detection and early-warning system based on a visual attention model. The patent "A human body gesture recognition method based on OptiTrack" (CN201711120678.3) extracts gesture features of training samples with a locally linear embedding algorithm and, using a dimension-reduction approach, brings key semantic frames into the training samples' gesture features, classifying the key-frame features to recognize gestures. The patent "Learner gesture recognition method" (CN201710457825.X) separates the portrait from the background, extracts the learner's contour from the binarized image with mathematical morphology, extracts features with Zernike moments, trains the feature vectors with a support vector machine, and recognizes the learner's gesture.
These results focus mainly on pedestrian-detection and event-detection algorithms, extract features and detect with traditional machine learning, and do not address how to fuse such algorithms with the video forwarding flow of a video monitoring system. The present method acquires real-time image frames directly from the video monitoring equipment, detects the positions of human joint points with a deep-neural-network skeletal-joint detection algorithm, extracts structural human skeleton information and performs gesture recognition, avoiding the interference of complex picture backgrounds and clothing on the recognition result. The invention designs a pipelined mechanism for forwarding, detecting and recognizing multiple video streams, so that the server can forward each channel's video frames efficiently while rapidly detecting human targets and recognizing human postures.
Disclosure of Invention
The invention aims to realize a multichannel rapid human body posture recognition method oriented to intelligent video monitoring and to build the corresponding system architecture. The forwarding server receives a client request, acquires the video stream from the network video recorder, and selects key frames for format conversion; it performs rapid moving-target detection and human body detection on the key frames and sends video frames containing a human body to the intelligent analysis server, which performs gesture recognition and returns the result to the forwarding server; the forwarding server forwards the recognition result and the video stream to the client, and the client displays the result on its interface. Specifically, the method of the present invention comprises the steps of:
A. building a multichannel rapid human body posture recognition system architecture oriented to intelligent video monitoring;
A1. the system comprises a client, a video forwarding server (forwarding server for short), an intelligent analysis server and a network video recorder;
A2. the client sends a request for acquiring the video stream to the forwarding server, and displays the video image and the identification result to the user;
A3. the forwarding server receives a client request, acquires a requested video stream from the network video recorder and forwards the requested video stream to the client;
A4. the forwarding server performs rapid moving-target detection and rapid human body detection, sends a gesture recognition request to the intelligent analysis server, receives the recognition result and sends it to the client;
A5. the intelligent analysis server receives the identification request, identifies the gesture and returns an identification result to the forwarding server;
A6. the forwarding server and the intelligent analysis server exchange video data and recognition information through a network control port and a data port;
B. the forwarding server receives a client request, acquires the video stream, forwards it to the client, and creates a detection sub-thread for rapid detection (a sketch of the forwarding data packet and per-channel ring buffer follows these steps); the specific steps are as follows:
B1. the main thread receives a client request and acquires a video stream of a requested channel from the network video recorder;
B2. the main thread creates a ring buffer queue and a forwarding sub-thread for each video channel;
B2.1. creating ring buffer queues to store the forwarding data packets of the corresponding channels;
B2.2. creating a forwarding data packet for each acquired video frame and mounting it on the ring buffer queue of its channel;
the forwarding data packet comprises a data head and a video frame buffer; the data head includes but is not limited to the video frame size, format, time t, gesture recognition result information, and a key-frame flag nIDR, where nIDR = 1 denotes a key frame and nIDR = 0 a non-key frame;
B2.3. creating a forwarding sub-thread that takes video frames from the ring buffer queue and forwards them to the client for real-time display;
B3. creating a detection sub-thread, which acquires video frames from the ring buffer queue, performs rapid detection, and sends gesture recognition requests;
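As an illustration of steps B2.1-B2.3 and B3, a minimal Python sketch of one possible layout for the forwarding data packet and the per-channel ring buffer follows; the class and field names are hypothetical, not taken from the patent:

```python
import threading
from collections import deque
from dataclasses import dataclass
from typing import Optional

@dataclass
class ForwardPacket:
    # Data head of step B2.2 (field names are illustrative)
    size: int               # video frame size in bytes
    fmt: str                # e.g. "h264"
    t: float                # frame timestamp
    nIDR: int               # 1 = key frame, 0 = non-key frame
    result: Optional[dict]  # gesture recognition result, filled in at step G1
    frame: bytes            # video frame buffer

class ChannelRingBuffer:
    """Per-channel ring buffer shared by the forwarding and detection sub-threads."""
    def __init__(self, capacity: int = 64):
        self._q = deque(maxlen=capacity)  # oldest packets are dropped automatically
        self._lock = threading.Lock()

    def put(self, pkt: ForwardPacket) -> None:
        with self._lock:
            self._q.append(pkt)

    def latest(self) -> Optional[ForwardPacket]:
        with self._lock:
            return self._q[-1] if self._q else None
```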
C. the detection sub-thread corresponding to the channel acquires the key frame and performs format conversion, and the specific steps are as follows:
C1. the detection sub-thread acquires a forwarding data packet from a ring buffer queue of the channel to which the detection sub-thread belongs;
C2. selecting video frames with nIDR = 1 as key frames for subsequent processing;
C3. decoding the H.264 format video frame into YUV format, and converting into JPG format;
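As an illustration of steps C2-C3, the sketch below selects key frames and converts them to JPG using PyAV; the library choice is an assumption (the patent names none), and the RTSP URL is a placeholder:

```python
import av  # PyAV, a Python wrapper around FFmpeg's decoders

container = av.open("rtsp://nvr.example/ch1")  # placeholder NVR channel URL
for i, frame in enumerate(container.decode(video=0)):
    if not frame.key_frame:   # step C2: keep only frames flagged nIDR = 1
        continue
    # step C3: the decoder yields YUV pixel data; convert and store as JPG
    frame.to_image().save(f"ch1_key_{i:06d}.jpg")
```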
D. the detection sub-thread carries out the rapid detection of the moving target on the key frame, and the specific steps are as follows:
D1. compute the gray-level differences of 3 consecutive video frames; the specific steps are as follows:
D1.1. let the grayscale images of the video frames at times t_{n-1}, t_n and t_{n+1} be denoted F_{n-1}, F_n and F_{n+1};
D1.2. compute the differences between frames F_n and F_{n-1}, and between F_{n+1} and F_n, taken respectively as the foreground images D_{n-1} and D_n at times t_{n-1} and t_n;
D2. perform rapid moving-target detection on the foreground images, recording an image that contains a moving target as R_n'; the specific steps are as follows:
D2.1. intersect D_{n-1} and D_n to obtain D_n';
D2.2. binarize each pixel of D_n' against a preset threshold T1 to obtain the binarized image R_n, where the value of T1 includes but is not limited to 10; the specific steps are as follows:
D2.2.1. let d be the gray value of a pixel in D_n';
D2.2.2. if d > T1, set R_n = 255 at that pixel, i.e. a moving-target point;
D2.2.3. if d <= T1, set R_n = 0 at that pixel, i.e. a background point;
D2.3. count the number of pixels in R_n with value 255; if the count exceeds a preset threshold T2, a moving target is considered present in the video and the image is recorded as R_n'; for an image resolution of 464 x 464, T2 takes the value 30000;
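Steps D1-D2 describe the classic three-frame-difference detector; a minimal OpenCV sketch under that reading (the function name is hypothetical) follows:

```python
import cv2
import numpy as np

T1, T2 = 10, 30000  # thresholds of steps D2.2 and D2.3 (for 464 x 464 frames)

def moving_target(f_prev, f_cur, f_next):
    """f_prev, f_cur, f_next: grayscale frames F_{n-1}, F_n, F_{n+1}."""
    d_prev = cv2.absdiff(f_cur, f_prev)       # D1.2: foreground image D_{n-1}
    d_cur = cv2.absdiff(f_next, f_cur)        # D1.2: foreground image D_n
    d_inter = cv2.bitwise_and(d_prev, d_cur)  # D2.1: intersection D_n'
    _, r_n = cv2.threshold(d_inter, T1, 255, cv2.THRESH_BINARY)  # D2.2
    has_target = int(np.count_nonzero(r_n)) > T2                 # D2.3
    return r_n, has_target
```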
E. the detection sub-thread detects human body of Rn' and sends a gesture recognition request to the intelligent analysis server, and the specific steps are as follows:
E1. loading parameters of a deep neural network model (network for short) including anchor frame information;
the anchor frame information refers to the widths and heights of the anchor frames obtained by clustering when the network is trained; the anchor frames are the N bounding boxes with the highest probability of occurrence, used to predict targets, where N includes but is not limited to 5;
E2. rescaling the input image to a resolution of 464 x 464 and passing it through the network's convolution and pooling layers to obtain a feature map with a resolution of 13 x 13;
E3. the target frame is predicted for the feature map through a network prediction layer, and the specific steps are as follows:
E3.1. for each pixel on the feature map, predicting the information of M target frames, the target-frame classes, and the corresponding confidences and class probabilities using the anchor frames;
the information of the target frame comprises: the offset of the center of the target frame relative to the pixel, and the width and height of the target frame, are denoted (x, y, w, h);
the confidence level represents the accuracy of the predicted target frame position information;
the class probability represents the probability of predicting the class of the target frame as a human body;
E3.2. filtering out target frames with confidence below a preset threshold T, wherein the threshold T comprises but is not limited to 0.7;
E3.3. applying non-maximum suppression to the remaining target frames to remove duplicate frames;
E3.4. selecting a target frame with highest category probability, and outputting coordinates of a lower left corner and an upper right corner;
E4. if the image frames detected in step E3 contain a human body, the original image frames containing a human body and the target frame information are packaged into a gesture recognition request every K frames and sent to the intelligent analysis server, where K >= 1;
E5. if no human body is detected in step E3, no subsequent processing is performed;
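A NumPy sketch of the box selection of steps E3.2-E3.4 follows; the IoU threshold of 0.5 used for the suppression is an assumption, since the patent does not state one:

```python
import numpy as np

def box_area(b):
    return (b[:, 2] - b[:, 0]) * (b[:, 3] - b[:, 1])

def select_human_box(boxes, conf, cls_prob, t_conf=0.7, t_iou=0.5):
    """boxes: (n, 4) corner coordinates (x1, y1, x2, y2); conf, cls_prob: (n,)."""
    keep = conf >= t_conf  # E3.2: drop frames with confidence below T
    boxes, conf, cls_prob = boxes[keep], conf[keep], cls_prob[keep]
    order, kept = conf.argsort()[::-1], []
    while order.size:      # E3.3: non-maximum suppression over the rest
        i, rest = order[0], order[1:]
        kept.append(i)
        x1 = np.maximum(boxes[i, 0], boxes[rest, 0])
        y1 = np.maximum(boxes[i, 1], boxes[rest, 1])
        x2 = np.minimum(boxes[i, 2], boxes[rest, 2])
        y2 = np.minimum(boxes[i, 3], boxes[rest, 3])
        inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
        iou = inter / (box_area(boxes[i:i + 1]) + box_area(boxes[rest]) - inter)
        order = rest[iou <= t_iou]
    if not kept:
        return None
    best = max(kept, key=lambda j: cls_prob[j])  # E3.4: highest class probability
    return boxes[best]                           # corner coordinates of the box
```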
F. the intelligent analysis server receives the gesture recognition request, performs gesture recognition processing, and returns a result to the forwarding server, and the specific steps are as follows:
F1. the main thread creates an identification sub-thread and an identification buffer queue for each video channel, and the specific steps are as follows:
F1.1. the main thread receives a gesture recognition request sent from a forwarding server, and mounts the received image frames to a recognition buffer queue corresponding to the channel;
F1.2. the identification sub-thread corresponding to the channel acquires the image frames in the identification buffer queue and marks the image frames as an original picture S;
F2. the recognition sub-thread loads a deep neural network model composed of four cascaded network stages and extracts the feature information of the human joints in the original picture S; the specific steps are as follows:
F2.1. the first-stage network comprises two network paths N_11 and N_12 and generates the feature maps F^1_1 ~ F^1_14; the specific steps are as follows:
F2.1.1. the N_11 network extracts features from the original picture S using several residual modules and outputs the 14 feature maps F_{11-1} ~ F_{11-14}; the N_12 network first downsamples the original picture S, passes it through several residual modules, then upsamples, outputting the 14 feature maps F_{12-1} ~ F_{12-14}; each feature map corresponds to one joint point, located at its highest Gaussian response;
the 14 joint points corresponding to the 14 feature maps are: head, neck, right shoulder, right elbow, right wrist, left shoulder, left elbow, left wrist, right hip, right knee, right ankle, left hip, left knee, left ankle;
F2.1.2. on the basis of the feature maps from F2.1.1, an encoder and a decoder are introduced to generate the weighted feature maps F'_{11-1} ~ F'_{11-14} and F'_{12-1} ~ F'_{12-14} with weights W (the encoder is sketched after step F2.1.4); the specific steps are as follows:
F2.1.2.1. the encoder uniformly divides each feature map of F2.1.1 into several regions, called feature regions a; the code corresponding to each region is y;
the set of feature regions a is denoted R_D = {a_1, …, a_i, …, a_L}, where a_i is the i-th region, 1 <= i <= L, and L is the number of regions; a_i ∈ R_D; R_D is the complete feature map with D as the cutting unit, where D is the number of pixels per region; for a region size of 14 x 14, D = 196;
the code y is a 14-dimensional vector, 14 being the number of joints; the code of the i-th region is denoted y_i, and the j-th element of y_i is denoted y_ij: y_ij = 1 indicates that the i-th region contains the j-th joint point, y_ij = 0 that it does not; the set of codes is denoted Y_D = {y_1, …, y_i, …, y_L}, 1 <= i <= L;
F2.1.2.2. the decoder calculates the weights W_{11-1} ~ W_{11-14} and W_{12-1} ~ W_{12-14} for the feature maps of F2.1.1: each W is the set of the weights w of all feature regions in the feature map, denoted W = {w_1, …, w_i, …, w_L}; the weight w_i of each feature region represents the scale coefficient that region carries when it is input to the next stage of processing, and is calculated from the feature region a and the code y;
F2.1.3. fuse the feature maps F'_{11-1} ~ F'_{11-14} and F'_{12-1} ~ F'_{12-14} to obtain the output feature maps F^1_1 ~ F^1_14 of this stage;
F2.1.4. the original picture S and the feature maps F^1_1 ~ F^1_14 serve as the input to the next-stage network;
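A minimal sketch of the encoder of steps F2.1.2.1-F2.1.2.2: it produces the region set R_D and the indicator codes Y_D; the patent does not give the decoder's exact weight formula, so the sketch stops where the weights w_i would be computed from a and y:

```python
import numpy as np

def encode_feature_map(fmap, joints, region=14):
    """fmap: (H, W) feature map; joints: list of 14 (x, y) joint coordinates.
    Returns R_D (feature regions a_i) and Y_D (14-dim indicator codes y_i)."""
    H, W = fmap.shape
    R_D, Y_D = [], []
    for r0 in range(0, H - region + 1, region):      # uniform division into
        for c0 in range(0, W - region + 1, region):  # regions of D = region*region pixels
            R_D.append(fmap[r0:r0 + region, c0:c0 + region])
            y = np.zeros(14, dtype=int)
            for j, (jx, jy) in enumerate(joints):
                if c0 <= jx < c0 + region and r0 <= jy < r0 + region:
                    y[j] = 1  # y_ij = 1: region i contains joint point j
            Y_D.append(y)
    return R_D, Y_D  # the decoder derives the weights w_i from these
```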
F2.2. the second-stage sub-network comprises two network paths N_21 and N_22; taking the feature maps F^1_1 ~ F^1_14 output by the first stage and the original picture S as input, the specific steps under F2.1 are repeated to obtain the outputs F^2_1 ~ F^2_14;
F2.3. the third-stage sub-network comprises two network paths N_31 and N_32; taking the feature maps F^2_1 ~ F^2_14 output by the second stage and the original picture S as input, the specific steps under F2.1 are repeated to obtain the outputs F^3_1 ~ F^3_14;
F2.4. the fourth-stage sub-network comprises two network paths N_41 and N_42; taking the feature maps F^3_1 ~ F^3_14 output by the third stage and the original picture S as input, the specific steps under F2.1 are repeated to obtain the outputs F^4_1 ~ F^4_14;
F2.5. using the feature maps F^4_1 ~ F^4_14, extract the name and coordinates of each joint point, connect adjacent joints according to physiological common sense, calculate the distances between joint points, and construct the skeleton diagram; the skeleton diagram comprises the names and coordinates of all joint points and the distances between them;
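Step F2.5 can be sketched as taking each joint at the argmax of its Gaussian-response map and measuring the connected segments; the adjacency list below is one plausible reading of "physiological common sense" and is not fixed by the patent:

```python
import math
import numpy as np

JOINTS = ["head", "neck", "r_shoulder", "r_elbow", "r_wrist", "l_shoulder",
          "l_elbow", "l_wrist", "r_hip", "r_knee", "r_ankle", "l_hip",
          "l_knee", "l_ankle"]  # order of the maps F^4_1 ~ F^4_14

BONES = [("head", "neck"), ("r_shoulder", "r_elbow"), ("r_elbow", "r_wrist"),
         ("l_shoulder", "l_elbow"), ("l_elbow", "l_wrist"),
         ("r_hip", "r_knee"), ("r_knee", "r_ankle"),
         ("l_hip", "l_knee"), ("l_knee", "l_ankle")]

def build_skeleton(heatmaps):
    """heatmaps: (14, H, W) array, one response map per joint point."""
    coords = {}
    for name, hm in zip(JOINTS, heatmaps):
        y, x = np.unravel_index(int(hm.argmax()), hm.shape)  # highest response
        coords[name] = (int(x), int(y))
    lengths = {b: math.dist(coords[b[0]], coords[b[1]]) for b in BONES}
    return coords, lengths  # joint names, coordinates and joint-to-joint distances
```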
F3. the recognition sub-thread classifies the gesture of the skeleton map by using an SVM classifier, and the specific steps are as follows:
F3.1. loading the trained SVM classifier; the recognizable poses include but are not limited to kneeling, lying, sitting and standing;
F3.2. classifying the skeleton diagram, and returning the identification result to the forwarding server; the recognition result comprises a skeleton diagram and the gesture category and accuracy of the skeleton diagram;
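Step F3 can be sketched with scikit-learn (an assumed library choice); the feature encoding below, a vector of the nine segment lengths, is likewise an assumption, since the patent does not fix the classifier's exact input:

```python
import numpy as np
from sklearn.svm import SVC

POSES = ["kneeling", "lying", "sitting", "standing"]

# Illustrative stand-in training data: (n, 9) segment-length vectors with labels.
rng = np.random.default_rng(0)
X_train = rng.uniform(10, 100, size=(200, 9))
y_train = rng.integers(0, len(POSES), size=200)

clf = SVC(kernel="rbf", probability=True)  # probability=True enables per-class scores
clf.fit(X_train, y_train)

def classify_skeleton(segment_lengths):
    """segment_lengths: the 9 distances from the skeleton diagram (step F2.5)."""
    probs = clf.predict_proba([segment_lengths])[0]
    best = int(np.argmax(probs))
    return POSES[best], float(probs[best])  # gesture category and its accuracy
```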
G. the main thread of the forwarding server receives the identification result and forwards the identification result to the client, and the client displays the result on an interface; the method comprises the following specific steps:
G1. the main thread receives the recognition result and writes it into the latest forwarding data packet in the ring buffer queue of the corresponding channel;
G2. the forwarding sub-thread of the corresponding channel acquires the forwarding data packet from the ring buffer queue and sends it to the client;
G3. the client receives the forwarding data packet containing the identification result, extracts the video frame and the identification information, and displays the video frame and the identification information on the client interface.
The invention provides a rapid, real-time human body detection and gesture recognition method based on deep neural networks, comprising three stages: rapid moving-target detection, rapid human body detection, and human body gesture recognition. A multi-channel concurrency mechanism and a distributed processing mechanism let the three tasks run in parallel, so that a multi-core processor system can better exploit its parallel computing capacity, and human body detection and gesture recognition run faster while multiple video channels are supported simultaneously. Using a deep neural network for gesture recognition, extracting the human joint information and constructing the skeleton information shields the influence of noise such as complex backgrounds and clothing, markedly improves recognition speed and accuracy, and reduces hardware cost, giving the method good application prospects and market value.
Drawings
Fig. 1: a flow chart of a multichannel rapid human body gesture recognition method for intelligent video monitoring;
Detailed Description
The present invention will be described in detail with reference to the accompanying drawings and examples.
The invention is further described below, following the implementation steps, with yoga teaching video gesture recognition as an example. The experimental environment is as follows:
[Tables in Figures GSB0000178620060000061 and GSB0000178620060000071 of the original document list the experimental environment configuration.]
The method flow is shown in Fig. 1 and comprises the following steps: 1) build the multichannel rapid human body gesture recognition system oriented to intelligent video monitoring; 2) the detection sub-thread performs rapid moving-target detection; 3) the detection sub-thread performs rapid human body detection; 4) rapid human body posture recognition: the intelligent analysis server performs rapid gesture recognition on the human-body video frames with a neural network and sends the recognition result to the forwarding server; 5) the client presents the gesture recognition result: the forwarding server forwards the recognition result and the corresponding video stream to the client, which receives the forwarding data packet containing the result and displays the gesture recognition result on its interface. The invention is further described step by step with reference to the system example as follows:
1. building a multichannel rapid human body posture recognition system architecture oriented to intelligent video monitoring;
1.1. the system comprises a client, a video forwarding server (forwarding server for short), an intelligent analysis server and a network video recorder;
1.2. the client sends a request to the forwarding server to acquire the yoga teaching video, and displays the video images and recognition results to the user;
1.3. the forwarding server receives a client request, acquires yoga teaching video stream from the network video recorder and forwards the yoga teaching video stream to the client;
1.4. the forwarding server performs rapid moving-target detection and rapid human body detection, sends a gesture recognition request to the intelligent analysis server, receives the recognition result and sends it to the client;
1.5. the intelligent analysis server receives the identification request, identifies the gesture and returns an identification result to the forwarding server;
1.6. the forwarding server and the intelligent analysis server exchange video data and recognition information through a network control port and a data port;
2. the forwarding server acquires the video stream and its detection sub-thread performs rapid moving-target detection on key frames; the specific steps are as follows:
2.1. the main thread receives the client request and acquires the video stream of the requested channel from the network video recorder;
2.2. the main thread creates a ring buffer queue and a forwarding sub-thread for each video channel;
2.3. the detection sub-thread corresponding to the channel acquires the video frame and performs format conversion, and the specific steps are as follows:
2.3.1. the detection sub-thread acquires a forwarding data packet from the channel ring buffer queue to which the detection sub-thread belongs;
2.3.2. selecting video frames with nIDR = 1 as key frames for subsequent processing;
2.3.3. decoding the H.264 format video frame into YUV format, and converting into JPG format video frame;
2.4. the detection sub-thread carries out the rapid detection of the moving target on the key frame, and the specific steps are as follows:
2.4.1. calculate the gray-level differences of 3 consecutive video frames to obtain the foreground images D_{n-1} and D_n at times t_{n-1} and t_n;
2.4.2. perform intersection and binarization on the foreground images to obtain the image R_n;
2.4.3. count the number of pixels in R_n with value 255; if the count exceeds the set threshold 30000, a moving target is considered present in the video, and the image is recorded as R_n';
3. the detection sub-thread detects human body of Rn' and sends a gesture recognition request to the intelligent analysis server, and the specific steps are as follows:
3.1. loading parameters of a deep neural network model (network for short) including anchor frame information;
the anchor frame information refers to the widths and heights of the anchor frames obtained by clustering when the network was trained, namely (10, 13), (16, 30), (33, 23), (30, 61), (62, 45), (59, 119), (116, 90), (156, 198) and (373, 326);
3.2. rescale the input image to a resolution of 464 x 464 and pass it through the network's convolution and pooling layers to obtain a feature map with a resolution of 13 x 13;
3.3. the target frame is predicted for the feature map through a network prediction layer, and the specific steps are as follows:
3.3.1. for each pixel on the feature map, predicting information of 5 target frames, target frame categories and corresponding confidence and category probabilities by using anchor frames;
finally, the (x, y, w, h) information of the 5 target frames with the highest class probabilities is (249.5, 346, 16, 449), (249.5, 462, 15, 449), (249.5, 461.5, 15, 449), (249.5, 232, 82, 449) and (249.5, 404.5, 23, 449);
3.3.2. filtering a target frame with the confidence coefficient lower than a preset threshold value T, wherein the threshold value T is 0.7;
3.3.3. applying non-maximum suppression to the remaining target frames to remove duplicate frames;
3.3.4. selecting a target frame with highest category probability, and outputting coordinates of a lower left corner and an upper right corner;
finally, the target frame with the highest class probability is obtained, with probability 0.97645 and lower-left and upper-right corner coordinates (0, 338) and (499, 354) respectively;
3.4. the image frames detected in step 3.3 contain a human body; the original image frames containing a human body and the target frame information are packaged into a gesture recognition request every 5 frames and sent to the intelligent analysis server;
4. the intelligent analysis server receives the gesture recognition request, performs gesture recognition processing, and returns a result to the forwarding server, and the specific steps are as follows:
4.1. the main thread creates an identification sub-thread and an identification buffer queue for each video channel, and the specific steps are as follows:
4.1.1. the main thread receives a gesture recognition request sent from a forwarding server, and mounts the received image frames to a recognition buffer queue corresponding to the channel;
4.1.2. the identification sub-thread corresponding to the channel acquires the image frames in the identification buffer queue and marks the image frames as an original picture S;
4.2. the recognition sub-thread loads a deep neural network model composed of four cascaded network stages and extracts the feature information of the human joints in the original picture S; the specific steps are as follows:
4.2.1. the first-stage network comprises two network paths N_11 and N_12 and generates the feature maps F^1_1 ~ F^1_14; the specific steps are as follows:
4.2.1.1. the N_11 network extracts features from the original picture S using several residual modules and outputs the 14 feature maps F_{11-1} ~ F_{11-14}; the N_12 network first downsamples the original picture S, passes it through several residual modules, then upsamples, outputting the 14 feature maps F_{12-1} ~ F_{12-14}; each feature map corresponds to one joint point, located at its highest Gaussian response;
the 14 joint points corresponding to the 14 feature maps are: head, neck, right shoulder, right elbow, right wrist, left shoulder, left elbow, left wrist, right hip, right knee, right ankle, left hip, left knee, left ankle;
4.2.2. on the basis of the feature maps from 4.2.1.1, an encoder and a decoder are introduced to generate the weighted feature maps F'_{11-1} ~ F'_{11-14} and F'_{12-1} ~ F'_{12-14} with weights W; the specific steps are as follows:
4.2.2.1. the encoder uniformly divides each feature map of 4.2.1.1 into several regions, called feature regions a; the code corresponding to each region is y;
the set of feature regions a is denoted R_D = {a_1, …, a_i, …, a_L}, where a_i is the i-th region, 1 <= i <= L, and L is the number of regions; a_i ∈ R_D; R_D is the complete feature map with D as the cutting unit, where D is the number of pixels per region; for a region size of 14 x 14, D = 196;
the code y is a 14-dimensional vector, 14 being the number of joints; the code of the i-th region is denoted y_i, and the j-th element of y_i is denoted y_ij: y_ij = 1 indicates that the i-th region contains the j-th joint point, y_ij = 0 that it does not; the set of codes is denoted Y_D = {y_1, …, y_i, …, y_L}, 1 <= i <= L;
4.2.2.2. the decoder calculates the weights W_{11-1} ~ W_{11-14} and W_{12-1} ~ W_{12-14} for the feature maps of 4.2.1.1: each W is the set of the weights w of all feature regions in the feature map, denoted W = {w_1, …, w_i, …, w_L}; the weight w_i of each feature region represents the scale coefficient that region carries when it is input to the next stage of processing, and is calculated from the feature region a and the code y;
4.2.3. fuse the feature maps F'_{11-1} ~ F'_{11-14} and F'_{12-1} ~ F'_{12-14} to obtain the output feature maps F^1_1 ~ F^1_14 of this stage;
4.2.4. the original picture S and the feature maps F^1_1 ~ F^1_14 serve as the input to the next-stage network;
4.2.5. the second-stage sub-network comprises two network paths N_21 and N_22; taking the feature maps F^1_1 ~ F^1_14 output by the first stage and the original picture S as input, the specific steps under step 4.2.1 are repeated to obtain the outputs F^2_1 ~ F^2_14;
4.2.6. the third-stage sub-network comprises two network paths N_31 and N_32; taking the feature maps F^2_1 ~ F^2_14 output by the second stage and the original picture S as input, the specific steps under step 4.2.1 are repeated to obtain the outputs F^3_1 ~ F^3_14;
4.2.7. the fourth-stage sub-network comprises two network paths N_41 and N_42; taking the feature maps F^3_1 ~ F^3_14 output by the third stage and the original picture S as input, the specific steps under step 4.2.1 are repeated to obtain the outputs F^4_1 ~ F^4_14;
4.2.8. using the feature maps F^4_1 ~ F^4_14, extract the names and coordinates of each joint point, connect adjacent joints according to physiological common sense, calculate the distances between joint points, and construct the skeleton diagram, as follows:
taking the upper-left corner of the picture as the coordinate origin, the coordinates of the head, neck, right shoulder, right elbow, right wrist, left shoulder, left elbow, left wrist, right hip, right knee, right ankle, left hip, left knee and left ankle are (62, 123), (98, 120), (108, 95), (138, 67), (162, 85), (107, 144), (115, 169), (82, 161), (166, 103), (244, 105), (299, 113), (166, 131), (248, 127) and (300, 131) respectively; the calculated segment lengths for (head, neck), (right shoulder, right elbow), (right elbow, right wrist), (left shoulder, left elbow), (left elbow, left wrist), (right hip, right knee), (right knee, right ankle), (left hip, left knee) and (left knee, left ankle) are 36.12, 41.04, 30.00, 26.25, 33.96, 78.03, 55.58, 82.10 and 52.15 respectively;
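The listed segment lengths follow directly from the listed coordinates; a short check:

```python
import math

# Joint coordinates from step 4.2.8, origin at the picture's upper-left corner
pts = {"head": (62, 123), "neck": (98, 120),
       "r_shoulder": (108, 95), "r_elbow": (138, 67), "r_wrist": (162, 85),
       "l_shoulder": (107, 144), "l_elbow": (115, 169), "l_wrist": (82, 161),
       "r_hip": (166, 103), "r_knee": (244, 105), "r_ankle": (299, 113),
       "l_hip": (166, 131), "l_knee": (248, 127), "l_ankle": (300, 131)}

bones = [("head", "neck"), ("r_shoulder", "r_elbow"), ("r_elbow", "r_wrist"),
         ("l_shoulder", "l_elbow"), ("l_elbow", "l_wrist"),
         ("r_hip", "r_knee"), ("r_knee", "r_ankle"),
         ("l_hip", "l_knee"), ("l_knee", "l_ankle")]

for a, b in bones:
    print(a, "-", b, round(math.dist(pts[a], pts[b]), 2))
# prints 36.12, 41.04, 30.0, 26.25, 33.96, 78.03, 55.58, 82.1, 52.15
```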
4.3. the recognition sub-thread classifies the gesture of the skeleton map by using an SVM classifier, and the specific steps are as follows:
4.3.1. loading the trained SVM classifier and classifying the skeleton diagram; the recognition results are: sitting: 92%, lying: 85%, standing: 95%, kneeling: 80.3%; the gesture category with the highest accuracy, standing, is selected;
4.3.2. the skeleton information, the gesture category "standing" and the accuracy 95% are returned to the forwarding server as the recognition result;
5. the main thread of the forwarding server receives the identification result and forwards the identification result to the client, and the client displays the result on an interface; the method comprises the following specific steps:
5.1. the main thread receives the recognition result and writes it into the latest forwarding data packet in the ring buffer queue of the corresponding channel;
5.2. the forwarding sub-thread of the corresponding channel acquires the forwarding data packet from the ring buffer queue and sends it to the client;
5.3. the client receives the forwarding data packet containing the identification result, extracts the video frame and the identification information, and displays the video frame and the identification information on the client interface.
Finally, it should be noted that the examples are disclosed to aid further understanding of the invention, but those skilled in the art will appreciate that various substitutions and modifications are possible without departing from the spirit and scope of the invention and the appended claims. Therefore, the invention should not be limited to the disclosed embodiments; the scope of protection of the invention is defined by the appended claims.

Claims (1)

1. A multichannel rapid human body posture identification method for intelligent video monitoring comprises the following steps:
A. building a multichannel rapid human body posture recognition system architecture oriented to intelligent video monitoring;
A1. the system comprises a client, a video forwarding server (forwarding server for short), an intelligent analysis server and a network video recorder;
A2. the client sends a request for acquiring the video stream to the forwarding server, and displays the video image and the identification result to the user;
A3. the forwarding server receives a client request, acquires a requested video stream from the network video recorder and forwards the requested video stream to the client;
A4. the forwarding server performs rapid moving-target detection and rapid human body detection, sends a gesture recognition request to the intelligent analysis server, receives the recognition result and sends it to the client;
A5. the intelligent analysis server receives the identification request, identifies the gesture and returns an identification result to the forwarding server;
A6. the forwarding server and the intelligent analysis server exchange video data and recognition information through a network control port and a data port;
B. the forwarding server receives a client request, acquires the video stream, forwards it to the client, and creates a detection sub-thread for rapid detection; the specific steps are as follows:
B1. the main thread receives a client request and acquires a video stream of a requested channel from the network video recorder;
B2. the main thread creates a ring buffer queue and a forwarding sub-thread for each video channel;
B2.1. creating ring buffer queues to store the forwarding data packets of the corresponding channels;
B2.2. creating a forwarding data packet for each acquired video frame and mounting it on the ring buffer queue of its channel;
the forwarding data packet comprises a data head and a video frame buffer; the data head includes but is not limited to the video frame size, format, time t, gesture recognition result information, and a key-frame flag nIDR, where nIDR = 1 denotes a key frame and nIDR = 0 a non-key frame;
B2.3. creating a forwarding sub-thread that takes video frames from the ring buffer queue and forwards them to the client for real-time display;
B3. creating a detection sub-thread, which acquires video frames from the ring buffer queue, performs rapid detection, and sends gesture recognition requests;
C. the detection sub-thread corresponding to the channel acquires the key frame and performs format conversion, and the specific steps are as follows:
C1. the detection sub-thread acquires a forwarding data packet from a ring buffer queue of the channel to which the detection sub-thread belongs;
C2. selecting video frames with nIDR = 1 as key frames for subsequent processing;
C3. decoding the H.264 format video frame into YUV format, and converting into JPG format;
D. the detection sub-thread carries out the rapid detection of the moving target on the key frame, and the specific steps are as follows:
D1. compute the gray-level differences of 3 consecutive video frames; the specific steps are as follows:
D1.1. let the grayscale images of the video frames at times t_{n-1}, t_n and t_{n+1} be denoted F_{n-1}, F_n and F_{n+1};
D1.2. compute the differences between frames F_n and F_{n-1}, and between F_{n+1} and F_n, taken respectively as the foreground images D_{n-1} and D_n at times t_{n-1} and t_n;
D2. perform rapid moving-target detection on the foreground images, recording an image that contains a moving target as R_n'; the specific steps are as follows:
D2.1. intersect D_{n-1} and D_n to obtain D_n';
D2.2. binarize each pixel of D_n' against a preset threshold T1 to obtain the binarized image R_n, where the value of T1 includes but is not limited to 10; the specific steps are as follows:
D2.2.1. let d be the gray value of a pixel in D_n';
D2.2.2. if d > T1, set R_n = 255 at that pixel, i.e. a moving-target point;
D2.2.3. if d <= T1, set R_n = 0 at that pixel, i.e. a background point;
D2.3. count the number of pixels in R_n with value 255; if the count exceeds a preset threshold T2, a moving target is considered present in the video and the image is recorded as R_n'; for an image resolution of 464 x 464, T2 takes the value 30000;
E. the detection sub-thread detects human body of Rn' and sends a gesture recognition request to the intelligent analysis server, and the specific steps are as follows:
E1. loading parameters of a deep neural network model (network for short) including anchor frame information;
the anchor frame information refers to the widths and heights of the anchor frames obtained by clustering when the network is trained; the anchor frames are the N bounding boxes with the highest probability of occurrence, used to predict targets, where N includes but is not limited to 5;
E2. rescaling the input image to a resolution of 464 x 464 and passing it through the network's convolution and pooling layers to obtain a feature map with a resolution of 13 x 13;
E3. the target frame is predicted for the feature map through a network prediction layer, and the specific steps are as follows:
E3.1. for each pixel on the feature map, predicting the information of M target frames, the target-frame classes, and the corresponding confidences and class probabilities using the anchor frames;
the information of the target frame comprises: the offset of the center of the target frame relative to the pixel, and the width and height of the target frame, are denoted (x, y, w, h);
the confidence level represents the accuracy of the predicted target frame position information;
the class probability represents the probability of predicting the class of the target frame as a human body;
E3.2. filtering out target frames with confidence below a preset threshold T, wherein the threshold T comprises but is not limited to 0.7;
E3.3. applying non-maximum suppression to the remaining target frames to remove duplicate frames;
E3.4. selecting a target frame with highest category probability, and outputting coordinates of a lower left corner and an upper right corner;
E4. if the image frames detected in step E3 contain a human body, the original image frames containing a human body and the target frame information are packaged into a gesture recognition request every K frames and sent to the intelligent analysis server, where K >= 1;
E5. if no human body is detected in step E3, no subsequent processing is performed;
F. the intelligent analysis server receives the gesture recognition request, performs gesture recognition processing, and returns a result to the forwarding server, and the specific steps are as follows:
F1. the main thread creates an identification sub-thread and an identification buffer queue for each video channel, and the specific steps are as follows:
F1.1. the main thread receives a gesture recognition request sent from a forwarding server, and mounts the received image frames to a recognition buffer queue corresponding to the channel;
F1.2. the identification sub-thread corresponding to the channel acquires the image frames in the identification buffer queue and marks the image frames as an original picture S;
F2. the recognition sub-thread loads a deep neural network model composed of four cascaded network stages and extracts the feature information of the human joints in the original picture S; the specific steps are as follows:
F2.1. the first-stage network comprises two network paths N_11 and N_12 and generates the feature maps F^1_1 ~ F^1_14; the specific steps are as follows:
F2.1.1. the N_11 network extracts features from the original picture S using several residual modules and outputs the 14 feature maps F_{11-1} ~ F_{11-14}; the N_12 network first downsamples the original picture S, passes it through several residual modules, then upsamples, outputting the 14 feature maps F_{12-1} ~ F_{12-14}; each feature map corresponds to one joint point, located at its highest Gaussian response;
the 14 joint points corresponding to the 14 feature maps are: head, neck, right shoulder, right elbow, right wrist, left shoulder, left elbow, left wrist, right hip, right knee, right ankle, left hip, left knee, left ankle;
F2.1.2. on the basis of the feature maps from F2.1.1, an encoder and a decoder are introduced to generate the weighted feature maps F'_{11-1} ~ F'_{11-14} and F'_{12-1} ~ F'_{12-14} with weights W; the specific steps are as follows:
F2.1.2.1. the encoder uniformly divides each feature map of F2.1.1 into several regions, called feature regions a; the code corresponding to each region is y;
the set of feature regions a is denoted R_D = {a_1, …, a_i, …, a_L}, where a_i is the i-th region, 1 <= i <= L, and L is the number of regions; a_i ∈ R_D; R_D is the complete feature map with D as the cutting unit, where D is the number of pixels per region; for a region size of 14 x 14, D = 196;
the code y is a 14-dimensional vector, 14 being the number of joints; the code of the i-th region is denoted y_i, and the j-th element of y_i is denoted y_ij: y_ij = 1 indicates that the i-th region contains the j-th joint point, y_ij = 0 that it does not; the set of codes is denoted Y_D = {y_1, …, y_i, …, y_L}, 1 <= i <= L;
F2.1.2.2. the decoder calculates the weights W_{11-1} ~ W_{11-14} and W_{12-1} ~ W_{12-14} for the feature maps of F2.1.1: each W is the set of the weights w of all feature regions in the feature map, denoted W = {w_1, …, w_i, …, w_L}; the weight w_i of each feature region represents the scale coefficient that region carries when it is input to the next stage of processing, and is calculated from the feature region a and the code y;
F2.1.3. fuse the feature maps F'_{11-1} ~ F'_{11-14} and F'_{12-1} ~ F'_{12-14} to obtain the output feature maps F^1_1 ~ F^1_14 of this stage;
F2.1.4. the original picture S and the feature maps F^1_1 ~ F^1_14 serve as the input to the next-stage network;
F2.2. the second-stage sub-network comprises two network paths N_21 and N_22; taking the feature maps F^1_1 ~ F^1_14 output by the first stage and the original picture S as input, the specific steps under F2.1 are repeated to obtain the outputs F^2_1 ~ F^2_14;
F2.3. the third-stage sub-network comprises two network paths N_31 and N_32; taking the feature maps F^2_1 ~ F^2_14 output by the second stage and the original picture S as input, the specific steps under F2.1 are repeated to obtain the outputs F^3_1 ~ F^3_14;
F2.4. the fourth-stage sub-network comprises two network paths N_41 and N_42; taking the feature maps F^3_1 ~ F^3_14 output by the third stage and the original picture S as input, the specific steps under F2.1 are repeated to obtain the outputs F^4_1 ~ F^4_14;
F2.5. using the feature maps F^4_1 ~ F^4_14, extract the name and coordinates of each joint point, connect adjacent joints according to physiological common sense, calculate the distances between joint points, and construct the skeleton diagram; the skeleton diagram comprises the names and coordinates of all joint points and the distances between them;
F3. the recognition sub-thread classifies the gesture of the skeleton map by using an SVM classifier, and the specific steps are as follows:
F3.1. loading the trained SVM classifier; the recognizable poses include but are not limited to kneeling, lying, sitting and standing;
F3.2. classifying the skeleton diagram, and returning the identification result to the forwarding server; the recognition result comprises a skeleton diagram and the gesture category and accuracy of the skeleton diagram;
G. the main thread of the forwarding server receives the identification result and forwards the identification result to the client, and the client displays the result on an interface; the method comprises the following specific steps:
G1. the main thread receives the recognition result and writes it into the latest forwarding data packet in the ring buffer queue of the corresponding channel;
G2. the forwarding sub-thread of the corresponding channel acquires the forwarding data packet from the ring buffer queue and sends it to the client;
G3. the client receives the forwarding data packet containing the identification result, extracts the video frame and the identification information, and displays the video frame and the identification information on the client interface.
CN201811299870.8A 2018-11-02 2018-11-02 Multichannel rapid human body posture recognition method for intelligent video monitoring Active CN109583315B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811299870.8A CN109583315B (en) 2018-11-02 2018-11-02 Multichannel rapid human body posture recognition method for intelligent video monitoring

Publications (2)

Publication Number Publication Date
CN109583315A CN109583315A (en) 2019-04-05
CN109583315B true CN109583315B (en) 2023-05-12

Family

ID=65921471

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811299870.8A Active CN109583315B (en) 2018-11-02 2018-11-02 Multichannel rapid human body posture recognition method for intelligent video monitoring

Country Status (1)

Country Link
CN (1) CN109583315B (en)

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111860392B (en) * 2020-07-28 2021-04-20 珠海安联锐视科技股份有限公司 Thermodynamic diagram statistical method based on target detection and foreground detection
CN112017384A (en) * 2020-08-05 2020-12-01 山东大学 Automatic alarm method and system for real-time area monitoring
CN112287840B (en) * 2020-10-30 2022-07-22 焦点科技股份有限公司 Method and system for intelligently acquiring exercise capacity analysis data
CN112954449B (en) * 2021-01-29 2023-03-24 浙江大华技术股份有限公司 Video stream processing method, system, electronic device and storage medium
US11645874B2 (en) 2021-06-23 2023-05-09 International Business Machines Corporation Video action recognition and modification
CN113611387B (en) * 2021-07-30 2023-07-14 清华大学深圳国际研究生院 Motion quality assessment method based on human body pose estimation and terminal equipment
CN115187918B (en) * 2022-09-14 2022-12-13 中广核贝谷科技有限公司 Method and system for identifying moving object in monitoring video stream
CN115665359B (en) * 2022-10-09 2023-04-25 西华县环境监察大队 Intelligent compression method for environment monitoring data
CN116304179B (en) * 2023-05-19 2023-08-11 北京大学 Data processing system for acquiring target video
CN117115718B (en) * 2023-10-20 2024-01-09 思创数码科技股份有限公司 Government affair video data processing method, system and computer readable storage medium

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2017000115A1 (en) * 2015-06-29 2017-01-05 北京旷视科技有限公司 Person re-identification method and device
CN107527045A (en) * 2017-09-19 2017-12-29 桂林安维科技有限公司 A kind of human body behavior event real-time analysis method towards multi-channel video
CN107832672A (en) * 2017-10-12 2018-03-23 北京航空航天大学 A kind of pedestrian's recognition methods again that more loss functions are designed using attitude information

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Tracking and Recognition of Human Motion Postures Based on GMM (基于GMM的人体运动姿态的追踪与识别); Wei Yanxin et al.; Journal of Beijing Institute of Fashion Technology (Natural Science Edition); 2018-06-30 (No. 02); full text *

Also Published As

Publication number Publication date
CN109583315A (en) 2019-04-05

Similar Documents

Publication Publication Date Title
CN109583315B (en) Multichannel rapid human body posture recognition method for intelligent video monitoring
Kim et al. Vision-based human activity recognition system using depth silhouettes: A smart home system for monitoring the residents
CN109543695B (en) Population-density population counting method based on multi-scale deep learning
Wang et al. Detection of abnormal visual events via global optical flow orientation histogram
US10140508B2 (en) Method and apparatus for annotating a video stream comprising a sequence of frames
CN103824070B (en) A kind of rapid pedestrian detection method based on computer vision
US20180114071A1 (en) Method for analysing media content
Wu et al. A detection system for human abnormal behavior
WO2011080900A1 (en) Moving object detection device and moving object detection method
Naik et al. Deep-violence: individual person violent activity detection in video
CN110852179B (en) Suspicious personnel invasion detection method based on video monitoring platform
Janku et al. Fire detection in video stream by using simple artificial neural network
Jemilda et al. Moving object detection and tracking using genetic algorithm enabled extreme learning machine
Aldahoul et al. A comparison between various human detectors and CNN-based feature extractors for human activity recognition via aerial captured video sequences
CN110188718B (en) Unconstrained face recognition method based on key frame and joint sparse representation
Kale et al. Suspicious activity detection using transfer learning based resnet tracking from surveillance videos
CN114870384A (en) Taijiquan training method and system based on dynamic recognition
Dahirou et al. Motion Detection and Object Detection: Yolo (You Only Look Once)
Nosheen et al. Efficient Vehicle Detection and Tracking using Blob Detection and Kernelized Filter
Ponika et al. Developing a YOLO based Object Detection Application using OpenCV
IL260438A (en) System and method for use in object detection from video stream
CN117475353A (en) Video-based abnormal smoke identification method and system
Alzahrani et al. Anomaly detection in crowds by fusion of novel feature descriptors
Bhardwaj et al. Modified Neural Network-based Object Classification in Video Surveillance System.
Srinivasa et al. Fuzzy edge-symmetry features for improved intruder detection

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant