CN113887384B - Pedestrian track analysis method, device, equipment and medium based on multi-track fusion - Google Patents

Publication number
CN113887384B
CN113887384B CN202111150133.3A CN202111150133A CN113887384B CN 113887384 B CN113887384 B CN 113887384B CN 202111150133 A CN202111150133 A CN 202111150133A CN 113887384 B CN113887384 B CN 113887384B
Authority: CN (China)
Prior art keywords: target, pixel, image, human body, frame
Legal status: Active
Application number: CN202111150133.3A
Other languages: Chinese (zh)
Other versions: CN113887384A
Inventor
李会璟
赖众程
王晟宇
谢鹏
Current Assignee: Ping An Bank Co Ltd
Original Assignee: Ping An Bank Co Ltd
Application filed by Ping An Bank Co Ltd
Priority to CN202111150133.3A
Publication of CN113887384A (application)
Application granted; publication of CN113887384B (grant)

Abstract

The invention relates to artificial intelligence technology and discloses a pedestrian track analysis method based on multi-track fusion, which comprises the following steps: building a three-dimensional model of a preset space; analyzing surveillance videos captured by cameras in different directions in the preset space to obtain the position information, human face features, and human body features of targets in the videos of the different cameras; and judging, according to the position information, human face features, and human body features, whether the movement tracks in different cameras belong to the same target. In addition, the invention also relates to blockchain technology: the monitoring pictures can be stored in nodes of a blockchain. The invention further provides a pedestrian track analysis device based on multi-track fusion, an electronic device, and a storage medium. The invention can improve the accuracy of pedestrian track recognition.

Description

Pedestrian track analysis method, device, equipment and medium based on multi-track fusion
Technical Field
The invention relates to the technical field of artificial intelligence, in particular to a pedestrian track analysis method and device based on multi-track fusion, electronic equipment and a computer readable storage medium.
Background
As people attach increasing importance to personal safety, devices such as cameras and video recorders are increasingly used to monitor and analyze living environments, for example, by analyzing camera video and then distinguishing the track of a target pedestrian from the tracks of many other pedestrians. However, the monitoring image of a camera is a two-dimensional plane, so the monitoring range of a single camera is limited, and the tracks of the same person in the images monitored by different cameras may not even overlap. How to analyze monitoring images to determine whether the tracks monitored by different cameras belong to the same pedestrian has therefore become an urgent problem to be solved.
Most existing methods for tracking and distinguishing a target pedestrian's track capture the pedestrian's face with cameras, judge through face recognition technology whether the pedestrians are the same person, and then analyze the monitoring pictures of different cameras according to the judgment result so as to track the pedestrian. However, due to factors such as technology or cost, most cameras have insufficient resolution, and a clear face is difficult to capture at low resolution, so the face recognition effect is not ideal. Judging tracks by face recognition alone therefore cannot accurately determine whether two tracks from different cameras belong to the same person.
Disclosure of Invention
The invention provides a pedestrian track analysis method and device based on multi-track fusion and a computer readable storage medium, and mainly aims to solve the problem of low accuracy in pedestrian track identification.
In order to achieve the above object, the present invention provides a pedestrian track analysis method based on multi-track fusion, including:
acquiring monitoring pictures of cameras at different positions in a preset space, and establishing a three-dimensional model of the preset space according to the monitoring pictures;
acquiring a first monitoring video of a camera at a first position in the preset space, and identifying the position information of a first target in each frame of image in the first monitoring video;
mapping the position information of the first target in each frame of image into the three-dimensional model to obtain a first motion trail of the first target;
Extracting a first human body characteristic and a first human face characteristic of the first target from each frame of image;
acquiring a second monitoring video of a camera at a second position in the preset space, and generating a second motion trail, a second human body characteristic and a second human face characteristic of a second target according to the second monitoring video;
And calculating the coincidence degree of the first target and the second target according to the first motion trail, the second motion trail, the first human body characteristic, the second human body characteristic, the first human face characteristic and the second human face characteristic, and determining whether the first motion trail and the second motion trail belong to the motion trail of the same person according to the coincidence degree.
Optionally, the establishing a three-dimensional model of the preset space according to the monitoring picture includes:
Acquiring shooting pictures for shooting the same target from different angles in the monitoring pictures;
selecting a shooting picture corresponding to one angle one by one as a target picture, and randomly selecting any pixel point of the target from the target picture as a target pixel point;
generating a vector in the direction of the target pixel point by taking the camera shooting the shooting picture as the origin;
Measuring a horizontal included angle between the vector and the horizontal direction, and measuring a vertical included angle between the vector and the vertical direction;
Calculating the space coordinates of the camera shooting the shooting picture according to the modulus of the vector, the horizontal included angle and the vertical included angle;
And constructing a three-dimensional coordinate system by taking the target pixel point as an origin, taking the space coordinates of cameras at different positions as known coordinates, and determining the three-dimensional coordinate system as a three-dimensional model of the preset space.
Optionally, the identifying the location information of the first target in each frame of image in the first surveillance video includes:
One frame of image is selected from the first monitoring video one by one to serve as a target image;
Performing convolution and pooling operation on the target image to obtain the image characteristics of the target image;
and determining the position of the image feature in the target image as the position information of the first target.
Optionally, the mapping the position information of the target in each frame of image to the three-dimensional model to obtain a first motion trail of the first target includes:
Constructing a plane coordinate system in the target image by taking the central pixel of the target image as an origin;
counting position coordinates corresponding to position information contained in the target image from the plane coordinate system;
Mapping the position coordinates into the three-dimensional model by using a preset mapping function to obtain three-dimensional coordinates of the position information in the three-dimensional model;
And connecting the three-dimensional coordinates of the position information of the target in each frame of image in the three-dimensional model to obtain a first motion trail of the first target in the three-dimensional model.
Optionally, the extracting the first human feature and the first human face feature of the first target from each frame of image includes:
Cutting each frame of image in the first monitoring video according to the position information to obtain a human body image area corresponding to each frame of image;
Selecting a human body image area corresponding to one frame of image in the first monitoring video one by one as a target area, generating global features of the target area according to pixel gradients in the target area, and taking the global features as the first human body features;
calculating the probability value of each pixel point in the human body image area as a human face pixel by using a preset activation function, and determining the area where the pixel point with the probability value larger than a preset threshold value is located as a human face area;
the face areas are selected one by one by utilizing a preset sliding window to obtain pixel windows;
and generating local features of the target area according to pixel values in each pixel window, and taking the local features as the first face features.
Optionally, the generating the global feature of the target region according to the pixel gradient in the target region includes:
Counting the pixel value of each pixel point in the target area;
Taking the maximum pixel value and the minimum pixel value in the pixel values as input of a preset mapping function, and mapping the pixel value of each pixel point in the target area into a preset range by utilizing the mapping function;
and calculating the pixel gradient of each row of pixels in the target area after mapping, converting the pixel gradient of each row of pixels into row vectors, and splicing the row vectors into global features of the target area.
Optionally, the generating the local feature of the target area according to the pixel value in each pixel window includes:
selecting one pixel point from the pixel window one by one as a target pixel point;
judging whether the pixel value of the target pixel point is an extremum in the pixel window;
When the pixel value of the target pixel point is not an extremum in the pixel window, returning to the step of selecting one pixel point from the pixel window one by one as the target pixel point;
When the pixel value of the target pixel point is an extremum in the pixel window, determining the target pixel point as a key point;
And vectorizing pixel values of all key points in all pixel windows, and collecting the obtained vectors as local features of the target area.
In order to solve the above problems, the present invention further provides a pedestrian trajectory analysis device based on multi-trajectory fusion, the device comprising:
the three-dimensional model construction module is used for acquiring monitoring pictures of cameras at different positions in a preset space and constructing a three-dimensional model of the preset space according to the monitoring pictures;
The position identification module is used for acquiring a first monitoring video of a camera at a first position in the preset space and identifying the position information of a first target in each frame of image in the first monitoring video;
the first track analysis module is used for mapping the position information of the first target in each frame of image into the three-dimensional model to obtain a first motion track of the first target, and extracting the first human body characteristic and the first human face characteristic of the first target from each frame of image;
The second track analysis module is used for acquiring a second monitoring video of the camera at a second position in the preset space and generating a second motion track, a second human body characteristic and a second human face characteristic of a second target according to the second monitoring video;
The track judging module is used for calculating the coincidence degree of the first target and the second target according to the first motion track, the second motion track, the first human body feature, the second human body feature, the first human face feature and the second human face feature, and confirming whether the first motion track and the second motion track belong to the motion track of the same person according to the coincidence degree.
In order to solve the above-mentioned problems, the present invention also provides an electronic apparatus including:
At least one processor; and
A memory communicatively coupled to the at least one processor; wherein,
The memory stores a computer program executable by the at least one processor to enable the at least one processor to perform the multi-track fusion based pedestrian track analysis method described above.
In order to solve the above-mentioned problems, the present invention also provides a computer-readable storage medium having stored therein at least one computer program that is executed by a processor in an electronic device to implement the above-mentioned pedestrian trajectory analysis method based on multi-trajectory fusion.
According to embodiments of the invention, a three-dimensional model of a preset space can be constructed from monitoring pictures at different positions of the space, and the first target and the second target are analyzed from different shooting angles, so that the motion trails of the first target and the second target are mapped into the three-dimensional model. The human body features and human face features of the first target and the second target are then combined for comprehensive judgment, determining whether the motion trail of the first target and the motion trail of the second target belong to the same person and realizing accurate analysis of the motion trail. Therefore, the pedestrian track analysis method, device, electronic device, and computer readable storage medium based on multi-track fusion can solve the problem of low accuracy in pedestrian track identification.
Drawings
FIG. 1 is a flow chart of a pedestrian track analysis method based on multi-track fusion according to an embodiment of the present invention;
FIG. 2 is a flowchart illustrating a method for analyzing position information of a first object according to an embodiment of the present invention;
FIG. 3 is a schematic flow chart of analyzing a first motion trajectory according to an embodiment of the present invention;
FIG. 4 is a functional block diagram of a pedestrian trajectory analysis device based on multi-trajectory fusion according to an embodiment of the present invention;
Fig. 5 is a schematic structural diagram of an electronic device for implementing the pedestrian track analysis method based on multi-track fusion according to an embodiment of the present invention.
The achievement of the objects, functional features and advantages of the present invention will be further described with reference to the accompanying drawings, in conjunction with the embodiments.
Detailed Description
It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
The embodiment of the application provides a pedestrian track analysis method based on multi-track fusion. The execution subject of the method includes at least one of a server, a terminal, or other electronic device that can be configured to execute the method provided by the embodiments of the application. In other words, the method may be performed by software or hardware installed in a terminal device or a server device, and the software may be a blockchain platform. The server includes but is not limited to: a single server, a server cluster, a cloud server, a cloud server cluster, and the like. The server may be an independent server, or may be a cloud server that provides basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communications, middleware services, domain name services, security services, content delivery networks (Content Delivery Network, CDN), big data, and artificial intelligence platforms.
Referring to fig. 1, a flow chart of a pedestrian track analysis method based on multi-track fusion according to an embodiment of the invention is shown. In this embodiment, the pedestrian track analysis method based on multi-track fusion includes:
S1, acquiring monitoring pictures of cameras at different positions in a preset space, and establishing a three-dimensional model of the preset space according to the monitoring pictures.
In the embodiment of the present invention, the preset space may be any space that can be monitored by a camera, for example, a bedroom, a hospital monitoring area, a working room, or an outdoor park.
In detail, the monitoring pictures refer to pictures obtained by cameras in different directions in the preset space monitoring that space from a plurality of directions, for example, pictures obtained by using cameras to monitor the preset space from at least two directions among the east, south, west, and north of the preset space.
Specifically, the monitoring pictures can be grabbed, by using computer statements with a data-grabbing function (such as Java statements, Python statements, and the like), from the data storage areas corresponding to the cameras at different positions in the preset space, wherein the data storage areas include, but are not limited to, databases, blockchain nodes, and network caches.
In the embodiment of the invention, the monitoring pictures include pictures monitored by cameras at different positions in the preset space. The spatial positions among different pictures are not unified, so they cannot be used directly for subsequent unified analysis of the pedestrian track. Therefore, a three-dimensional model of the preset space can be established from the monitoring pictures, so that the cameras at different positions can unify the coordinate dimensions of the objects in the pictures they monitor, thereby improving the accuracy of pedestrian track analysis.
In the embodiment of the present invention, the establishing a three-dimensional model of the preset space according to the monitoring picture includes:
Acquiring shooting pictures for shooting the same target from different angles in the monitoring pictures;
selecting a shooting picture corresponding to one angle one by one as a target picture, and randomly selecting any pixel point of the target from the target picture as a target pixel point;
generating a vector in the direction of the target pixel point by taking the camera shooting the shooting picture as the origin;
Measuring a horizontal included angle between the vector and the horizontal direction, and measuring a vertical included angle between the vector and the vertical direction;
Calculating the space coordinates of the camera shooting the shooting picture according to the modulus of the vector, the horizontal included angle and the vertical included angle;
And constructing a three-dimensional coordinate system by taking the target pixel point as an origin, taking the space coordinates of cameras at different positions as known coordinates, and determining the three-dimensional coordinate system as a three-dimensional model of the preset space.
In detail, any pixel point on the same target can be selected from the pictures captured by cameras at different positions as the origin, realizing unification of the coordinates of each camera; a vector is then generated from the camera shooting the shooting picture, as the origin, toward the direction of the target pixel point, and the law of cosines is used to measure the horizontal included angle between the vector and the horizontal direction and the vertical included angle between the vector and the vertical direction.
Specifically, the target pixel point may be used as the origin of the preset space, the coordinate information of each camera in the preset space is determined according to the horizontal included angle, the vertical included angle, and the modulus of the vector, and a three-dimensional coordinate system including the origin and each camera is further constructed as the three-dimensional model of the preset space.
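The calculation of a camera's spatial coordinates described above can be sketched as a spherical-to-Cartesian conversion. The axis conventions below (vertical angle measured from the vertical axis, horizontal angle in the ground plane, angles in radians) are assumptions for illustration, since the text does not fix them:

```python
import math

def camera_coordinates(modulus, horizontal_angle, vertical_angle):
    """Place the camera relative to the target pixel point (the origin)
    from the vector's modulus and its two included angles (radians).
    Axis conventions are an assumption, not fixed by the patent."""
    # Projection of the vector onto the horizontal plane.
    horizontal_len = modulus * math.sin(vertical_angle)
    x = horizontal_len * math.cos(horizontal_angle)
    y = horizontal_len * math.sin(horizontal_angle)
    # Height component along the vertical axis.
    z = modulus * math.cos(vertical_angle)
    return (x, y, z)
```

For example, a vector of modulus 1 lying in the horizontal plane (vertical angle π/2, horizontal angle 0) places the camera at (1, 0, 0).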
S2, acquiring a first monitoring video of a camera at a first position in the preset space, and identifying the position information of a first target in each frame of image in the first monitoring video.
In the embodiment of the present invention, the first surveillance video is a picture of the preset space captured by a camera at a first position in the preset space, and the step of obtaining the first surveillance video of the camera at the first position in the preset space is consistent with the step of obtaining the surveillance pictures of the cameras at different positions in the preset space in S1, which is not described herein.
In the embodiment of the invention, each frame of image in the first monitoring video can be respectively analyzed to obtain the position information of the first target in each frame of image, wherein the first target can be an object, a pedestrian and the like which are monitored by the camera at the first position and move in the preset space.
In the embodiment of the present invention, referring to fig. 2, the identifying the location information of the first target in each frame of image in the first surveillance video includes:
s21, selecting one frame of images from the first monitoring video one by one as a target image;
s22, carrying out convolution and pooling operation on the target image to obtain the image characteristics of the target image;
s23, determining the position of the image feature in the target image as the position information of the first target.
In detail, convolution, pooling, and similar operations may be performed on the target image by using an artificial intelligence model with a feature extraction function, including but not limited to the VGG-Net model, the R-CNN model, and the like, to obtain a plurality of image features in the target image.
Specifically, after the image feature of the target image is obtained, the position information of the image feature in the target image may be used as the position information of the first target, so as to perform the above operation on each frame of image in the first surveillance video, and obtain the position information of the first target in each frame of image in the first surveillance video.
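As a minimal, self-contained sketch of this convolution-and-pooling step, the operations can be written in plain NumPy; a real system would use a trained model such as VGG-Net, so the single hand-written kernel and the sizes here are purely illustrative assumptions:

```python
import numpy as np

def locate_target(image, kernel):
    """One convolution pass plus one 2x2 max-pooling pass; the location
    of the strongest response is taken as the target position.
    Illustrative stand-in for a trained feature-extraction network."""
    kh, kw = kernel.shape
    h, w = image.shape
    # Valid cross-correlation of the kernel over the image.
    feat = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(feat.shape[0]):
        for j in range(feat.shape[1]):
            feat[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    # 2x2 max pooling with stride 2.
    ph, pw = feat.shape[0] // 2, feat.shape[1] // 2
    pooled = feat[:ph * 2, :pw * 2].reshape(ph, 2, pw, 2).max(axis=(1, 3))
    # Strongest pooled response, mapped back to the feature-map scale.
    r, c = np.unravel_index(np.argmax(pooled), pooled.shape)
    return (r * 2, c * 2)
```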
And S3, mapping the position information of the first target in each frame of image into the three-dimensional model to obtain a first motion trail of the first target.
In the embodiment of the present invention, the position information of the first target in each frame of the first surveillance video extracted in step S2 is position information in the plane picture acquired by the camera corresponding to the first surveillance video, which is poorly suited to spatial track analysis. Therefore, the position information of the target in each frame of image can be mapped into the three-dimensional model to obtain the first motion trail of the first target in the three-dimensional model.
In the embodiment of the invention, the position information of the target in each frame of image can be analyzed by using a YOLO network, and the position information can be tracked by using the DeepSORT tracking technique, so as to obtain the change track of the position information in the three-dimensional model, that is, the first motion trail.
In another embodiment of the present invention, referring to fig. 3, the mapping the position information of the object in each frame of image to the three-dimensional model to obtain a first motion track of the first object includes:
S31, constructing a plane coordinate system in the target image by taking a central pixel of the target image as an origin;
s32, counting position coordinates corresponding to position information contained in the target image from the plane coordinate system;
s33, mapping the position coordinates into the three-dimensional model by using a preset mapping function to obtain three-dimensional coordinates of the position information in the three-dimensional model;
and S34, connecting the three-dimensional coordinates of the position information of the target in each frame of image in the three-dimensional model to obtain a first motion track of the first target in the three-dimensional model.
In detail, a plane coordinate system may be constructed in the target image with the center pixel point as an origin, and then a position coordinate corresponding to the position information is calculated in the plane coordinate system, and the position coordinate is mapped into the three-dimensional model by using a preset mapping function, so as to obtain a three-dimensional coordinate of the position information in the three-dimensional model, where the mapping function includes, but is not limited to, a gaussian function and a map function.
Specifically, after the position information of the object in each frame in the first surveillance video is mapped into the three-dimensional model, all the mapped three-dimensional coordinates in the three-dimensional model can be connected by using a smooth curve, and the curve obtained by connection is used as the first motion trail of the first object.
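The mapping-and-connecting procedure of S31 to S34 can be sketched as follows; the planar mapping used in the example (dropping every point onto the ground plane z = 0) is a placeholder, since the patent leaves the preset mapping function open:

```python
def build_trajectory(frame_positions, mapping):
    """Apply a mapping function to each frame's plane coordinates and
    connect the resulting 3-D points in frame order. A production
    system would additionally smooth the resulting polyline."""
    return [mapping(x, y) for (x, y) in frame_positions]

# Illustrative mapping: place every point on the ground plane (z = 0).
traj = build_trajectory([(0, 0), (1, 2), (2, 4)], lambda x, y: (x, y, 0.0))
```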
S4, extracting the first human body characteristics and the first human face characteristics of the first target from each frame of image.
In a practical application scenario, if the motion trail of a pedestrian is analyzed only by using image features, the analysis condition is too narrow and the accuracy of the analysis result is low. Therefore, in order to improve the accuracy of the final pedestrian track analysis, the first human body feature and the first human face feature of the first target can be extracted from each frame of image, which facilitates subsequent analysis of the pedestrian track by combining the motion trail, the human body features, and the human face features.
In detail, the first human body feature refers to morphological characteristics of the first target, such as being fat, thin, tall, or short; the first human face feature refers to local features of the face portion of the first target, such as face texture and face key points.
In the embodiment of the present invention, the extracting the first human body feature and the first human face feature of the first target from each frame of image includes:
Cutting each frame of image in the first monitoring video according to the position information to obtain a human body image area corresponding to each frame of image;
Selecting a human body image area corresponding to one frame of image in the first monitoring video one by one as a target area, generating global features of the target area according to pixel gradients in the target area, and taking the global features as the first human body features;
calculating the probability value of each pixel point in the human body image area as a human face pixel by using a preset activation function, and determining the area where the pixel point with the probability value larger than a preset threshold value is located as a human face area;
the face areas are selected one by one by utilizing a preset sliding window to obtain pixel windows;
and generating local features of the target area according to pixel values in each pixel window, and taking the local features as the first face features.
In one embodiment of the present invention, the global features of the target image may be extracted by HOG (Histogram of Oriented Gradients), DPM (Deformable Part Model), LBP (Local Binary Patterns), or the like, or may be extracted by a pre-trained artificial intelligence model with an image feature extraction function, including but not limited to the VGG-Net model and the U-Net model.
In another embodiment of the present invention, the generating the global feature of the target area according to the pixel gradient in the target area includes:
Counting the pixel value of each pixel point in the target area;
Taking the maximum pixel value and the minimum pixel value in the pixel values as input of a preset mapping function, and mapping the pixel value of each pixel point in the target area into a preset range by utilizing the mapping function;
and calculating the pixel gradient of each row of pixels in the target area after mapping, converting the pixel gradient of each row of pixels into row vectors, and splicing the row vectors into global features of the target area.
Illustratively, the preset mapping function may be:

Y_i = (X_i - min(X)) / (max(X) - min(X))

wherein Y_i is the pixel value of the i-th pixel in the target area after being mapped to the preset range, X_i is the pixel value of the i-th pixel in the target area, max(X) is the maximum pixel value in the target area, and min(X) is the minimum pixel value in the target area.
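A minimal implementation of this min-max mapping, assuming the preset range is [0, 1]:

```python
def normalize_pixels(pixels):
    """Map each pixel value X_i into [0, 1] via min-max normalization:
    Y_i = (X_i - min(X)) / (max(X) - min(X))."""
    lo, hi = min(pixels), max(pixels)
    if hi == lo:
        # Flat region: every value maps to the bottom of the range.
        return [0.0 for _ in pixels]
    return [(x - lo) / (hi - lo) for x in pixels]
```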
Further, a probability value of each pixel point in the human body image area being a human face pixel can be calculated by using a preset activation function, and the area where the pixel points with probability values greater than a preset threshold are located is selected from the human body image area as the human face area, wherein the preset activation function includes, but is not limited to, the softmax activation function, the sigmoid activation function, and the relu activation function.
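As a hedged sketch of this thresholding step, assuming the sigmoid activation and a threshold of 0.5 (the patent leaves both choices open), and assuming per-pixel raw scores are already available from some upstream model:

```python
import math

def face_region_mask(scores, threshold=0.5):
    """Turn each pixel's raw score into a face probability with a
    sigmoid, then keep pixels whose probability exceeds the threshold.
    The raw scores themselves would come from an upstream model."""
    probs = [[1.0 / (1.0 + math.exp(-s)) for s in row] for row in scores]
    return [[p > threshold for p in row] for row in probs]
```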
In the embodiment of the present invention, the pixel gradient of each row of pixels in the mapped target area may be calculated by using a preset gradient algorithm, where the gradient algorithm includes, but is not limited to, a two-dimensional discrete derivative algorithm, the Sobel operator, and the like.
In the embodiment of the application, the pixel gradient of each row of pixels can be converted into the row vector and spliced into the global feature of the target area.
For example, the selected target area includes three rows of pixels: the pixel gradients of the first row of pixels are a, b, c; the pixel gradients of the second row of pixels are d, e, f; and the pixel gradients of the third row of pixels are g, h, i. The pixel gradients of each row of pixels can then be used as row vectors and spliced into the following global feature:

[a b c]
[d e f]
[g h i]
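The row-gradient splicing can be sketched as follows; simple forward differences stand in for the patent's unspecified gradient algorithm (which could equally be the Sobel operator):

```python
def global_feature(region):
    """Per-row pixel gradients (forward differences) converted to row
    vectors and spliced into one flat global feature."""
    rows = []
    for row in region:
        # Gradient of one row of (already normalized) pixel values.
        rows.append([row[i + 1] - row[i] for i in range(len(row) - 1)])
    # Splice the row vectors together.
    return [g for row in rows for g in row]
```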
Further, the generating the local feature of the target area according to the pixel value in each pixel window includes:
selecting one pixel point from the pixel window one by one as a target pixel point;
judging whether the pixel value of the target pixel point is an extremum in the pixel window;
When the pixel value of the target pixel point is not an extremum in the pixel window, returning to the step of selecting one pixel point from the pixel window one by one as the target pixel point;
When the pixel value of the target pixel point is an extremum in the pixel window, determining the target pixel point as a key point;
And vectorizing pixel values of all key points in all pixel windows, and collecting the obtained vectors as local features of the target area.
In the embodiment of the present application, the sliding window may be a pre-constructed selection frame with a certain area, which may be used to frame-select pixels in the target area, for example, a square selection frame 10 pixels high and 10 pixels wide.
In detail, the extremum includes a maximum value and a minimum value, and when the pixel value of the target pixel point is the maximum value or the minimum value in the pixel window, the target pixel point is determined to be the key point of the pixel window.
Specifically, the step of vectorizing the pixel values of all the key points in the pixel window is consistent with the step of calculating the pixel gradient of each row of pixels in the mapped target area and converting the pixel gradient of each row of pixels into a row vector, which is not described again.
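The key-point selection described above can be sketched as follows; stepping the window without overlap is one assumption the text leaves open, and the window size is illustrative:

```python
def local_feature(region, win=2):
    """Slide a win x win pixel window over the target area; inside each window,
    pixels whose value equals the window's maximum or minimum are key points,
    and their pixel values are collected as the local feature."""
    h, w = len(region), len(region[0])
    feature = []
    for top in range(0, h - win + 1, win):
        for left in range(0, w - win + 1, win):
            window = [region[r][c]
                      for r in range(top, top + win)
                      for c in range(left, left + win)]
            lo, hi = min(window), max(window)
            # Key points: the extremal pixels of this window.
            feature.append([v for v in window if v in (lo, hi)])
    return feature
```

The per-window key-point vectors are then collected as the local features of the target area.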
S5, acquiring a second monitoring video of the camera at a second position in the preset space, and generating a second motion trail, second human body characteristics and second human face characteristics of a second target according to the second monitoring video.
In the embodiment of the invention, the camera at the second position is any camera which is positioned at a different position from the camera at the first position in the preset space.
In detail, the step of obtaining the second surveillance video of the camera at the second position in the preset space and generating the second motion track, the second human body feature and the second human face feature of the second target according to the second surveillance video is consistent with the step of extracting the first motion track, the first human body feature and the first human face feature of the first target from the first surveillance video in S2 to S4, and is not repeated again.
S6, calculating the coincidence ratio of the first target and the second target according to the first motion trail, the second motion trail, the first human body characteristic, the second human body characteristic, the first human face characteristic and the second human face characteristic, and confirming whether the first motion trail and the second motion trail belong to the motion trail of the same person according to the coincidence ratio.
In the embodiment of the invention, a preset distance function may be used to respectively calculate the distance value between the first motion track and the second motion track, the distance value between the first human body feature and the second human body feature, and the distance value between the first human face feature and the second human face feature. The coincidence degree of the first target and the second target is then calculated from the three distance values, and whether the first target and the second target are the same target is judged according to the coincidence degree.
For example, the distance value between the first motion trajectory and the second motion trajectory may be calculated using the following distance algorithm:
wherein D is a distance value between the first motion track and the second motion track, n is the first motion track, and m is the second motion track.
In other embodiments of the present invention, the distance value between the first motion track and the second motion track may be calculated by any algorithm having a distance value calculation function, such as the Euclidean distance algorithm or the cosine distance algorithm.
In detail, the step of calculating the distance value between the first human feature and the second human feature, and the step of calculating the distance value between the first human face feature and the second human face feature are consistent with the step of calculating the distance value between the first motion trail and the second motion trail, which are not described herein in detail.
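The patent's own distance formula is not reproduced in this text; as a sketch using one of the alternatives the text explicitly names, a Euclidean distance over two equal-length trajectories sampled in the three-dimensional model might look like this (the (x, y, z) point format is an assumption):

```python
import math

def trajectory_distance(track_a, track_b):
    """Euclidean distance between two trajectories, each given as an
    equal-length list of (x, y, z) points in the three-dimensional model."""
    return math.sqrt(sum(
        (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
        for (ax, ay, az), (bx, by, bz) in zip(track_a, track_b)))
```

The same function can be applied to the human body feature vectors and the face feature vectors by treating them as point sequences.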
In the embodiment of the present invention, the calculating the overlap ratio of the first target and the second target according to the first motion track, the second motion track, the first human feature, the second human feature, the first face feature and the second face feature includes:
Calculating the coincidence degree of the first target and the second target by using the following weight algorithm:
R = α*A + β*B + γ*C

wherein R is the coincidence ratio of the first target and the second target, A is the distance value between the first motion track and the second motion track, B is the distance value between the first human body feature and the second human body feature, C is the distance value between the first human face feature and the second human face feature, and α, β and γ are preset weight coefficients.
In the embodiment of the invention, the contact ratio can be compared with the preset contact threshold, and when the contact ratio is larger than the preset contact threshold, the track of the first target is determined to be consistent with the track of the second target, so that the first target is determined to be identical with the second target; when the overlap ratio is smaller than or equal to the preset overlap threshold, determining that the track of the first target is inconsistent with the track of the second target, and further determining that the first target is not identical with the second target.
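A sketch of the weighted combination and the threshold decision; the weight coefficients and the coincidence threshold below are illustrative placeholders, not values from the patent:

```python
def coincidence(dist_track, dist_body, dist_face, alpha, beta, gamma):
    """Weighted combination of the three distance values between the
    first target and the second target."""
    return alpha * dist_track + beta * dist_body + gamma * dist_face

def is_same_target(dist_track, dist_body, dist_face,
                   alpha=0.4, beta=0.3, gamma=0.3, threshold=0.8):
    """Judge the two targets to be the same person when the coincidence
    value exceeds the preset coincidence threshold."""
    value = coincidence(dist_track, dist_body, dist_face, alpha, beta, gamma)
    return value > threshold
```

Note that if the three inputs are raw distances, smaller values mean closer matches; in practice they would typically be converted into similarity scores before this weighting, so that a larger coincidence value indeed indicates the same person.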
According to the embodiment of the invention, a three-dimensional model of the space can be constructed from monitoring pictures taken at different positions in the preset space, and the first target and the second target can be analyzed from different shooting angles, so that their motion tracks are mapped into the three-dimensional model. The human body features and the human face features of the first target and the second target are then combined for comprehensive judgment, so as to determine whether the motion track of the first target and the motion track of the second target belong to the same person, realizing accurate analysis of motion tracks. Therefore, the pedestrian track analysis method based on multi-track fusion can solve the problem of low accuracy of pedestrian track identification.
Fig. 4 is a functional block diagram of a pedestrian track analysis device based on multi-track fusion according to an embodiment of the present invention.
The pedestrian trajectory analysis device 100 based on multi-trajectory fusion can be installed in an electronic device. According to the functions implemented, the pedestrian trajectory analysis device 100 based on multi-trajectory fusion may include a three-dimensional model construction module 101, a position recognition module 102, a first trajectory analysis module 103, a second trajectory analysis module 104, and a trajectory judgment module 105. A module of the invention, which may also be referred to as a unit, refers to a series of computer program segments that are stored in the memory of the electronic device, can be executed by the processor of the electronic device, and can perform a fixed function.
In the present embodiment, the functions concerning the respective modules/units are as follows:
The three-dimensional model construction module 101 is configured to obtain monitoring pictures of cameras at different positions in a preset space, and establish a three-dimensional model of the preset space according to the monitoring pictures;
The position identifying module 102 is configured to obtain a first surveillance video of a camera at a first position in the preset space, and identify position information of a first target in each frame of image in the first surveillance video;
The first track analysis module 103 is configured to map the position information of the first target in each frame of image to the three-dimensional model, so as to obtain a first motion track of the first target, and extract a first human feature and a first human face feature of the first target from each frame of image;
the second track analysis module 104 is configured to obtain a second surveillance video of the camera at a second position in the preset space, and generate a second motion track, a second human body feature and a second face feature of a second target according to the second surveillance video;
The track judging module 105 is configured to calculate a contact ratio between the first target and the second target according to the first motion track, the second motion track, the first human feature, the second human feature, the first human face feature, and the second human face feature, and confirm whether the first motion track and the second motion track belong to a motion track of the same person according to the contact ratio.
In detail, each module in the pedestrian track analysis device 100 based on multi-track fusion in the embodiment of the present invention adopts the same technical means as the pedestrian track analysis method based on multi-track fusion described in fig. 1 to 3, and can generate the same technical effects, which is not described herein.
Fig. 5 is a schematic structural diagram of an electronic device for implementing a pedestrian track analysis method based on multi-track fusion according to an embodiment of the present invention.
The electronic device 1 may comprise a processor 10, a memory 11, a communication bus 12 and a communication interface 13, and may further comprise a computer program stored in the memory 11 and executable on the processor 10, such as a pedestrian trajectory analysis program based on multi-trajectory fusion.
The processor 10 may be formed by an integrated circuit in some embodiments, for example, a single packaged integrated circuit, or may be formed by a plurality of integrated circuits packaged with the same function or different functions, including one or more central processing units (Central Processing Unit, CPU), microprocessors, digital processing chips, graphics processors, and combinations of various control chips. The processor 10 is the control unit (Control Unit) of the electronic device: it connects the various components of the entire electronic device using various interfaces and lines, and executes various functions of the electronic device and processes data by running or executing programs or modules stored in the memory 11 (for example, executing a pedestrian trajectory analysis program based on multi-trajectory fusion) and by calling data stored in the memory 11.
The memory 11 includes at least one type of readable storage medium, including a flash memory, a removable hard disk, a multimedia card, a card type memory (e.g., SD or DX memory, etc.), a magnetic memory, a magnetic disk, an optical disk, etc. The memory 11 may in some embodiments be an internal storage unit of the electronic device, such as a mobile hard disk of the electronic device. The memory 11 may also be an external storage device of the electronic device in other embodiments, such as a plug-in mobile hard disk, a smart media card (Smart Media Card, SMC), a secure digital (Secure Digital, SD) card or a flash card (Flash Card) provided on the electronic device. Further, the memory 11 may also include both an internal storage unit and an external storage device of the electronic device. The memory 11 may be used not only for storing application software installed in the electronic device and various types of data, such as the code of the pedestrian trajectory analysis program based on multi-trajectory fusion, but also for temporarily storing data that has been output or is to be output.
The communication bus 12 may be a peripheral component interconnect (Peripheral Component Interconnect, PCI) bus, an extended industry standard architecture (Extended Industry Standard Architecture, EISA) bus, or the like. The bus may be classified into an address bus, a data bus, a control bus, etc. The bus is arranged to enable connection and communication between the memory 11, the at least one processor 10, and other components.
The communication interface 13 is used for communication between the electronic device and other devices, and includes a network interface and a user interface. Optionally, the network interface may include a wired interface and/or a wireless interface (e.g., a WI-FI interface, a Bluetooth interface, etc.), typically used to establish a communication connection between the electronic device and other electronic devices. The user interface may include a display (Display) and an input unit such as a keyboard (Keyboard); optionally, the user interface may also include a standard wired interface or a wireless interface. Optionally, in some embodiments, the display may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (Organic Light-Emitting Diode) touch device, or the like. The display may also be referred to as a display screen or display unit, and is used for displaying information processed in the electronic device and for displaying a visual user interface.
Fig. 5 shows only an electronic device with components, it being understood by a person skilled in the art that the structure shown in fig. 5 does not constitute a limitation of the electronic device 1, and may comprise fewer or more components than shown, or may combine certain components, or may be arranged in different components.
For example, although not shown, the electronic device may further include a power source (such as a battery) for supplying power to the respective components, and preferably, the power source may be logically connected to the at least one processor 10 through a power management device, so that functions of charge management, discharge management, power consumption management, and the like are implemented through the power management device. The power supply may also include one or more of any of a direct current or alternating current power supply, recharging device, power failure detection circuit, power converter or inverter, power status indicator, etc. The electronic device may further include various sensors, bluetooth modules, wi-Fi modules, etc., which are not described herein.
It should be understood that the embodiments described are for illustrative purposes only and do not limit the scope of the patent application.
The pedestrian trajectory analysis program based on multi-trajectory fusion stored in the memory 11 in the electronic device 1 is a combination of a plurality of instructions, which when executed in the processor 10, can implement:
acquiring monitoring pictures of cameras at different positions in a preset space, and establishing a three-dimensional model of the preset space according to the monitoring pictures;
acquiring a first monitoring video of a camera at a first position in the preset space, and identifying the position information of a first target in each frame of image in the first monitoring video;
mapping the position information of the first target in each frame of image into the three-dimensional model to obtain a first motion trail of the first target;
Extracting a first human body characteristic and a first human face characteristic of the first target from each frame of image;
acquiring a second monitoring video of a camera at a second position in the preset space, and generating a second motion trail, a second human body characteristic and a second human face characteristic of a second target according to the second monitoring video;
And calculating the contact ratio of the first target and the second target according to the first motion trail, the second motion trail, the first human body characteristic, the second human body characteristic, the first human face characteristic and the second human face characteristic, and determining whether the first motion trail and the second motion trail belong to the motion trail of the same person according to the contact ratio.
In particular, the specific implementation method of the above instructions by the processor 10 may refer to the description of the relevant steps in the corresponding embodiment of the drawings, which is not repeated herein.
Further, the modules/units integrated in the electronic device 1 may be stored in a computer readable storage medium if implemented in the form of software functional units and sold or used as separate products. The computer readable storage medium may be volatile or nonvolatile. For example, the computer readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash disk, a removable hard disk, a magnetic disk, an optical disk, a computer memory, and a read-only memory (Read-Only Memory, ROM).
The present invention also provides a computer readable storage medium storing a computer program which, when executed by a processor of an electronic device, can implement:
acquiring monitoring pictures of cameras at different positions in a preset space, and establishing a three-dimensional model of the preset space according to the monitoring pictures;
acquiring a first monitoring video of a camera at a first position in the preset space, and identifying the position information of a first target in each frame of image in the first monitoring video;
mapping the position information of the first target in each frame of image into the three-dimensional model to obtain a first motion trail of the first target;
Extracting a first human body characteristic and a first human face characteristic of the first target from each frame of image;
acquiring a second monitoring video of a camera at a second position in the preset space, and generating a second motion trail, a second human body characteristic and a second human face characteristic of a second target according to the second monitoring video;
And calculating the contact ratio of the first target and the second target according to the first motion trail, the second motion trail, the first human body characteristic, the second human body characteristic, the first human face characteristic and the second human face characteristic, and determining whether the first motion trail and the second motion trail belong to the motion trail of the same person according to the contact ratio.
In the several embodiments provided in the present invention, it should be understood that the disclosed apparatus, device and method may be implemented in other manners. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the modules is merely a logical function division, and there may be other manners of division when actually implemented.
The modules described as separate components may or may not be physically separate, and components shown as modules may or may not be physical units, may be located in one place, or may be distributed over multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional module in the embodiments of the present invention may be integrated in one processing unit, each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units can be realized in the form of hardware, or in the form of hardware plus software functional modules.
It will be evident to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that the present invention may be embodied in other specific forms without departing from the spirit or essential characteristics thereof.
The present embodiments are, therefore, to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference signs in the claims shall not be construed as limiting the claim concerned.
The blockchain is a novel application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanisms, and encryption algorithms. The blockchain (Blockchain) is essentially a decentralized database: a string of data blocks generated in association using cryptographic methods, each of which contains a batch of network transaction information used to verify the validity (anti-counterfeiting) of its information and to generate the next block. The blockchain may include a blockchain underlying platform, a platform product services layer, an application services layer, and the like.
The embodiment of the application can acquire and process the related data based on artificial intelligence technology. Artificial intelligence (Artificial Intelligence, AI) is the theory, method, technique, and application system that uses a digital computer or a digital-computer-controlled machine to simulate, extend, and expand human intelligence, sense the environment, acquire knowledge, and use knowledge to obtain optimal results.
Furthermore, it is evident that the word "comprising" does not exclude other elements or steps, and that the singular does not exclude a plurality. A plurality of units or means recited in the system claims can also be implemented by means of software or hardware by means of one unit or means. The terms first, second, etc. are used to denote a name, but not any particular order.
Finally, it should be noted that the above-mentioned embodiments are merely for illustrating the technical solution of the present invention and not for limiting the same, and although the present invention has been described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that modifications and equivalents may be made to the technical solution of the present invention without departing from the spirit and scope of the technical solution of the present invention.

Claims (7)

1. A pedestrian trajectory analysis method based on multi-trajectory fusion, the method comprising:
acquiring monitoring pictures of cameras at different positions in a preset space, and establishing a three-dimensional model of the preset space according to the monitoring pictures;
acquiring a first monitoring video of a camera at a first position in the preset space, and identifying the position information of a first target in each frame of image in the first monitoring video;
mapping the position information of the first target in each frame of image into the three-dimensional model to obtain a first motion trail of the first target;
Extracting a first human body characteristic and a first human face characteristic of the first target from each frame of image;
acquiring a second monitoring video of a camera at a second position in the preset space, and generating a second motion trail, a second human body characteristic and a second human face characteristic of a second target according to the second monitoring video;
Calculating the contact ratio of the first target and the second target according to the first motion trail, the second motion trail, the first human body feature, the second human body feature, the first human face feature and the second human face feature, and determining whether the first motion trail and the second motion trail belong to the motion trail of the same person according to the contact ratio;
The extracting the first human body feature and the first human face feature of the first target from each frame of image includes: cutting each frame of image in the first monitoring video according to the position information to obtain a human body image area corresponding to each frame of image; selecting a human body image area corresponding to one frame of image in the first monitoring video one by one as a target area, generating global features of the target area according to pixel gradients in the target area, and taking the global features as the first human body features; calculating the probability value of each pixel point in the human body image area as a human face pixel by using a preset activation function, and determining the area where the pixel point with the probability value larger than a preset threshold value is located as a human face area; the face areas are selected one by utilizing a preset sliding window, and a pixel window is obtained; generating local features of the target area according to pixel values in each pixel window, and taking the local features as the first face features;
The generating the global feature of the target region according to the pixel gradient in the target region includes: counting the pixel value of each pixel point in the target area; taking the maximum pixel value and the minimum pixel value in the pixel values as input of a preset mapping function, and mapping the pixel value of each pixel point in the target area into a preset range by utilizing the mapping function; calculating the pixel gradient of each row of pixels in the target area after mapping, converting the pixel gradient of each row of pixels into row vectors, and splicing the row vectors into global features of the target area;
The generating the local feature of the target area according to the pixel value in each pixel window comprises the following steps: selecting one pixel point from the pixel window one by one as a target pixel point; judging whether the pixel value of the target pixel point is an extremum in the pixel window; when the pixel value of the target pixel point is not an extremum in the pixel window, returning to the step of selecting one pixel point from the pixel window one by one as the target pixel point; when the pixel value of the target pixel point is an extremum in the pixel window, determining the target pixel point as a key point; and vectorizing pixel values of all key points in all pixel windows, and collecting the obtained vectors as local features of the target area.
2. The pedestrian trajectory analysis method based on multi-trajectory fusion as claimed in claim 1, wherein the creating a three-dimensional model of the preset space from the monitoring screen includes:
Acquiring shooting pictures for shooting the same target from different angles in the monitoring pictures;
selecting a shooting picture corresponding to one angle one by one as a target picture, and randomly selecting any pixel point of the target from the target picture as a target pixel point;
generating a vector in the direction of the target pixel point by taking a camera shooting the shooting picture as an original point;
Measuring a horizontal included angle between the vector and the horizontal direction, and measuring a vertical included angle between the vector and the vertical direction;
Calculating the space coordinates of a camera shooting the shooting picture according to the module length of the vector, the horizontal included angle and the vertical included angle;
And constructing a three-dimensional coordinate system by taking the target pixel point as an origin, taking the space coordinates of cameras at different positions as known coordinates, and determining the three-dimensional coordinate system as a three-dimensional model of the preset space.
3. The pedestrian trajectory analysis method based on multi-trajectory fusion as claimed in claim 1, wherein said identifying the position information of the first target in each frame of image in the first surveillance video includes:
One frame of image is selected from the first monitoring video one by one to serve as a target image;
Performing convolution and pooling operation on the target image to obtain the image characteristics of the target image;
and determining the position of the image feature in the target image as the position information of the first target.
4. The method for analyzing the trajectory of the pedestrian based on the multi-trajectory fusion as claimed in claim 1, wherein the mapping the position information of the first object in each frame of image to the three-dimensional model to obtain the first motion trajectory of the first object includes:
Constructing a plane coordinate system in the target image by taking the central pixel of the target image as an origin;
counting position coordinates corresponding to position information contained in the target image from the plane coordinate system;
Mapping the position coordinates into the three-dimensional model by using a preset mapping function to obtain three-dimensional coordinates of the position information in the three-dimensional model;
And connecting the three-dimensional coordinates of the position information of the target in each frame of image in the three-dimensional model to obtain a first motion trail of the first target in the three-dimensional model.
5. A pedestrian trajectory analysis device based on multi-trajectory fusion for implementing the pedestrian trajectory analysis method based on multi-trajectory fusion as claimed in any one of claims 1 to 4, characterized in that the device comprises:
the three-dimensional model construction module is used for acquiring monitoring pictures of cameras at different positions in a preset space and constructing a three-dimensional model of the preset space according to the monitoring pictures;
The position identification module is used for acquiring a first monitoring video of a camera at a first position in the preset space and identifying the position information of a first target in each frame of image in the first monitoring video;
the first track analysis module is used for mapping the position information of the first target in each frame of image into the three-dimensional model to obtain a first motion track of the first target, and extracting the first human body characteristic and the first human face characteristic of the first target from each frame of image;
The second track analysis module is used for acquiring a second monitoring video of the camera at a second position in the preset space and generating a second motion track, a second human body characteristic and a second human face characteristic of a second target according to the second monitoring video;
The track judging module is used for calculating the coincidence degree of the first target and the second target according to the first motion track, the second motion track, the first human body feature, the second human body feature, the first human face feature and the second human face feature, and confirming whether the first motion track and the second motion track belong to the motion track of the same person according to the coincidence degree.
6. An electronic device, the electronic device comprising:
At least one processor; and
A memory communicatively coupled to the at least one processor; wherein,
The memory stores a computer program executable by the at least one processor to enable the at least one processor to perform the multi-trajectory fusion-based pedestrian trajectory analysis method of any one of claims 1 to 4.
7. A computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the pedestrian trajectory analysis method based on multi-trajectory fusion as claimed in any one of claims 1 to 4.
CN202111150133.3A 2021-09-29 Pedestrian track analysis method, device, equipment and medium based on multi-track fusion Active CN113887384B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111150133.3A CN113887384B (en) 2021-09-29 Pedestrian track analysis method, device, equipment and medium based on multi-track fusion

Publications (2)

Publication Number Publication Date
CN113887384A CN113887384A (en) 2022-01-04
CN113887384B true CN113887384B (en) 2024-06-28

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107330371A (en) * 2017-06-02 2017-11-07 深圳奥比中光科技有限公司 Method, apparatus and storage device for acquiring the facial expression of a 3D face model
CN110210276A (en) * 2018-05-15 2019-09-06 腾讯科技(深圳)有限公司 Motion track acquisition method and device, storage medium, and terminal

Similar Documents

Publication Publication Date Title
CN108764048B (en) Face key point detection method and device
CN110458895B (en) Image coordinate system conversion method, device, equipment and storage medium
CN110428448B (en) Target detection tracking method, device, equipment and storage medium
US11238272B2 (en) Method and apparatus for detecting face image
CN112446919B (en) Object pose estimation method and device, electronic equipment and computer storage medium
CN114758249B (en) Target object monitoring method, device, equipment and medium based on field night environment
CN111160307A (en) Face recognition method and face recognition card punching system
WO2021169642A1 (en) Video-based eyeball turning determination method and system
CN114049568A (en) Object shape change detection method, device, equipment and medium based on image comparison
CN113011280A (en) Method and device for detecting person contact distance, computer equipment and storage medium
WO2023284358A1 (en) Camera calibration method and apparatus, electronic device, and storage medium
CN114241338A (en) Building measuring method, device, equipment and storage medium based on image recognition
CN112528903B (en) Face image acquisition method and device, electronic equipment and medium
CN113689475A (en) Cross-border head trajectory tracking method, equipment and storage medium
CN113887384B (en) Pedestrian track analysis method, device, equipment and medium based on multi-track fusion
CN113255456B (en) Inactive living body detection method, inactive living body detection device, electronic equipment and storage medium
KR102540290B1 (en) Apparatus and Method for Person Re-Identification based on Heterogeneous Sensor Camera
CN113887384A (en) Pedestrian trajectory analysis method, device, equipment and medium based on multi-trajectory fusion
CN113705304A (en) Image processing method and device, storage medium and computer equipment
CN111753766A (en) Image processing method, device, equipment and medium
CN117333929B (en) Method and system for identifying abnormal personnel under road construction based on deep learning
CN114359645B (en) Image expansion method, device, equipment and storage medium based on characteristic area
CN113850750A (en) Target track checking method, device, equipment and storage medium
CN117218162B (en) Panoramic tracking vision control system based on ai
CN115862089B (en) Security monitoring method, device, equipment and medium based on face recognition

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant