CN111709382A

CN111709382A - Human body trajectory processing method and device, computer storage medium and electronic equipment

Info

Publication number: CN111709382A
Application number: CN202010567771.4A
Authority: CN
Inventors: 严石伟; 丁凯; 蒋楠
Original assignee: Tencent Technology Shenzhen Co Ltd
Current assignee: Tencent Technology Shenzhen Co Ltd
Priority date: 2020-06-19
Filing date: 2020-06-19
Publication date: 2020-09-25

Abstract

The disclosure relates to the field of artificial intelligence, and provides a human body trajectory processing method and device. The method comprises the following steps: acquiring a face track, a first human body track, a second human body track and a third human body track; binding the face track and the first human body track to obtain a first track mapping relation, and retrieving in a face database according to the face track to obtain a target permanent identity corresponding to the face track; clustering the first human body track and the second human body track, and determining a temporary identity corresponding to a clustering result to obtain a second track mapping relation; retrieving in the first human body track and the second human body track according to the third human body track to obtain a third track mapping relation; and binding the target permanent identity with the corresponding human body track based on the first track mapping relation, the second track mapping relation, the third track mapping relation and the target permanent identity. According to the scheme, the cost can be reduced, and the efficiency and the accuracy of identity recall of human body tracks are improved.

Description

Human body trajectory processing method and device, computer storage medium and electronic equipment

Technical Field

The present disclosure relates to the field of artificial intelligence technologies, and in particular, to a human body trajectory processing method, a human body trajectory processing apparatus, a computer storage medium, and an electronic device.

Background

Smart retail uses Internet and Internet of Things technologies to sense consumption habits, predict consumption trends, guide production and manufacturing, and provide diversified and personalized products and services for consumers. In a smart retail store scenario, the main concern of a mall or shop is the customer's overall strolling track and the customer's behavior of entering and exiting the venue or entering and exiting a store.

At present, in smart retail store schemes, analysis of a customer's overall strolling track and behavior mainly relies on building identity profiles from face tracks shot by cameras, while behavior analysis is carried out on the captured human body tracks. An association between the human body tracks and a permanent identity is thereby established, and the strolling track and behavior across the whole mall and its shops corresponding to that permanent identity are obtained through statistics. However, this method requires videos shot by different types of cameras to be aligned, and human body trajectory analysis can only proceed after all trajectory data have been acquired, so the scheme suffers from poor identity recall, low system throughput, low data processing efficiency and similar problems.

It is to be noted that the information disclosed in the background section above is only used to enhance understanding of the background of the present disclosure.

Disclosure of Invention

The present disclosure aims to provide a human body trajectory processing method, a human body trajectory processing apparatus, a computer storage medium, and an electronic device, so as to improve data processing efficiency and accuracy at least to a certain extent, and further improve an identity recall effect and system throughput.

Additional features and advantages of the disclosure will be set forth in the detailed description which follows, or in part will be obvious from the description, or may be learned by practice of the disclosure.

According to an aspect of the present disclosure, there is provided a human body trajectory processing method, including: acquiring a face track and a first human body track obtained according to a video shot by a first shooting device at a field gate, a second human body track obtained according to a video shot by a second shooting device at the field gate, and a third human body track obtained according to a video shot by a third shooting device at an intra-field gate; binding the face track and a first human body track matched with the face track to obtain a first track mapping relation, and retrieving in a face database according to the face track to obtain a target permanent identity corresponding to the face track; clustering the first human body track and the second human body track, and determining a temporary identity corresponding to a clustering result to obtain a second track mapping relation between the human body track and the temporary identity; retrieving in the first human body trajectory and the second human body trajectory according to the third human body trajectory to obtain a third trajectory mapping relationship between the third human body trajectory and the first human body trajectory or the second human body trajectory; and binding the target permanent identity with the corresponding human body track based on the first track mapping relation, the second track mapping relation, the third track mapping relation and the target permanent identity.

According to an aspect of the present disclosure, there is provided a human body trajectory processing apparatus, the apparatus including: the track acquisition module is used for acquiring a face track and a first human body track which are obtained according to a video shot by a first shooting device at a field gate, a second human body track which is obtained according to a video shot by a second shooting device at the field gate and a third human body track which is obtained according to a video shot by a third shooting device at an intra-field gate; the face retrieval module is used for binding the face track and a first human body track matched with the face track to obtain a first track mapping relation, and retrieving in a face database according to the face track to obtain a target permanent identity corresponding to the face track; the track clustering module is used for clustering the first human body track and the second human body track and determining a temporary identity corresponding to a clustering result so as to obtain a second track mapping relation between the human body track and the temporary identity; a trajectory retrieval module, configured to retrieve from the first human body trajectory and the second human body trajectory according to the third human body trajectory to obtain a third trajectory mapping relationship between the third human body trajectory and the first human body trajectory or the second human body trajectory; and the identity binding module is used for binding the target permanent identity with the corresponding human body track based on the first track mapping relation, the second track mapping relation, the third track mapping relation and the target permanent identity.

According to an aspect of the present disclosure, there is provided a computer storage medium having stored thereon a computer program which, when executed by a processor, implements the human body trajectory processing method of the first aspect described above.

According to an aspect of the present disclosure, there is provided an electronic device including: a processor; and a memory for storing executable instructions of the processor; wherein the processor is configured to perform the human body trajectory processing method of the first aspect via execution of the executable instructions.

As can be seen from the foregoing technical solutions, the human body trajectory processing method, the human body trajectory processing apparatus, the computer storage medium and the electronic device in the exemplary embodiments of the present disclosure have at least the following advantages and positive effects:

In the technical scheme provided by some embodiments of the present disclosure, a first track mapping relation is obtained by binding a face track with the first human body track matched with it, and a target permanent identity is obtained by retrieving in a face database according to the face track; the first human body track and the second human body track are clustered, and a temporary identity corresponding to the clustering result is determined to obtain a second track mapping relation between the human body tracks and the temporary identity; retrieval is performed in the first human body track and the second human body track according to a third human body track to obtain a third track mapping relation between the third human body track and the first human body track or the second human body track; and finally, the target permanent identity is bound with the corresponding human body track based on the first track mapping relation, the second track mapping relation, the third track mapping relation and the target permanent identity. First, the human body track processing method in the embodiments of the disclosure can realize accurate identity recall through the track mapping relations among the face track, the first human body track, the second human body track and the third human body track. Second, the video streams shot by different types of cameras do not need to be aligned before the first human body track and the second human body track are obtained, which avoids the low identity recall accuracy caused by misaligned video streams and greatly reduces labor cost. Third, when the second track mapping relation is obtained, the first human body track and the second human body track are clustered without waiting for all tracks to be obtained, which improves processing efficiency and system throughput. Finally, the method avoids deploying a large number of shooting terminals to obtain a large number of first human body tracks for face binding, which further reduces cost.

It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.

Drawings

The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and together with the description, serve to explain the principles of the disclosure. It is to be understood that the drawings in the following description are merely exemplary of the disclosure, and that other drawings may be derived from those drawings by one of ordinary skill in the art without the exercise of inventive faculty.

In the drawings:

FIG. 1 shows a schematic diagram of an exemplary system architecture to which technical aspects of embodiments of the present disclosure may be applied;

FIG. 2 is a diagram showing an architecture of identity recall in a related art intelligent retail store;

FIG. 3 shows a flow diagram of a human body trajectory processing method according to an embodiment of the present disclosure;

FIG. 4 shows a flow diagram of human body trajectory processing according to an embodiment of the present disclosure;

FIG. 5 illustrates a schematic flow chart of clustering a first human body trajectory and a second human body trajectory to form a trajectory cluster according to an embodiment of the present disclosure;

FIG. 6 is a schematic flowchart illustrating a process of obtaining a temporary identity corresponding to a track cluster according to an embodiment of the present disclosure;

FIG. 7 is a schematic diagram illustrating a flow of data update when the highest retrieval score is greater than a third preset threshold according to an embodiment of the present disclosure;

FIG. 8 is a flow chart illustrating data updating when the highest score is less than or equal to a third predetermined threshold according to an embodiment of the present disclosure;

FIG. 9 illustrates a structural diagram of identity binding according to an embodiment of the present disclosure;

FIG. 10 illustrates a flow diagram for determining a final target permanent identity according to an embodiment of the present disclosure;

FIG. 11 is a schematic structural diagram of a human body trajectory processing device according to an embodiment of the present disclosure;

FIG. 12 shows a schematic structural diagram of an electronic device according to an embodiment of the present disclosure.

Detailed Description

Example embodiments will now be described more fully with reference to the accompanying drawings. Example embodiments may, however, be embodied in many different forms and should not be construed as limited to the examples set forth herein; rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the concept of example embodiments to those skilled in the art.

The described features, structures, or characteristics of the example embodiments may be combined in any suitable manner in one or more embodiments. In the following description, numerous specific details are provided to give a thorough understanding of embodiments of the disclosure. One skilled in the relevant art will recognize, however, that the subject matter of the present disclosure can be practiced without one or more of the specific details, or with other methods, components, devices, steps, and so forth. In other instances, well-known technical solutions have not been shown or described in detail to avoid obscuring aspects of the present disclosure.

Furthermore, the block diagrams shown in the figures are only functional entities and do not necessarily correspond to physically separate entities. That is, these functional entities may be implemented in the form of software, or in one or more hardware modules or integrated circuits, or in different networks and/or processor devices and/or microcontroller devices.

The flow charts shown in the drawings are merely illustrative and do not necessarily include all of the contents and operations/steps, nor do they necessarily have to be performed in the order described. For example, some operations/steps may be decomposed, and some operations/steps may be combined or partially combined, so that the actual execution sequence may be changed according to the actual situation.

Fig. 1 shows a schematic diagram of an exemplary system architecture to which the technical solutions of the embodiments of the present disclosure may be applied.

As shown in fig. 1, the system architecture 100 may include a first shooting terminal 101, a second shooting terminal 102, a third shooting terminal 103, a terminal device 104, a controller 105, and a server 106. The first shooting terminal 101, the second shooting terminal 102, and the third shooting terminal 103 may be terminal devices having a camera unit, such as a gun camera or a dome camera. Specifically, the first shooting terminal 101 may be a gun camera, and the second shooting terminal 102 and the third shooting terminal 103 may be dome cameras. The first shooting terminal 101 and the second shooting terminal 102 may be disposed above the same door, and the third shooting terminal 103 may be disposed above a different door inside the venue, for example, above a store door in a mall. The terminal device 104 may be an electronic device with a display screen, such as a desktop computer, a notebook computer, a tablet computer, or a smart phone. The controller 105 is configured to transmit the video streams captured by the first shooting terminal 101, the second shooting terminal 102, and the third shooting terminal 103 to the server 106.

It should be understood that the numbers of the first shooting terminal 101, the second shooting terminal 102, the third shooting terminal 103, the terminal device 104, the controller 105, and the server 106 in fig. 1 are merely illustrative. There may be any number of each of these according to implementation needs. For example, the server 106 may be an independent server or a server cluster composed of a plurality of servers, and may be used to store information related to human body trajectory processing.
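The deployment described for fig. 1 can be written down as a small data model. The following is a minimal sketch only: the class name `ShootingTerminal`, the field names and the string labels are hypothetical, and the rule that only the gun camera yields face tracks is taken from the embodiment's description of the first shooting terminal rather than from any stated configuration format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ShootingTerminal:
    ref: int        # reference numeral used in fig. 1
    kind: str       # "gun" or "dome" (camera type)
    location: str   # "field_door" or "store_door"

    def track_types(self):
        # Assumption: only the gun camera is used for face snapshots;
        # dome cameras contribute human body tracks only.
        return ("face", "body") if self.kind == "gun" else ("body",)


# One possible deployment matching the description of fig. 1.
TERMINALS = [
    ShootingTerminal(101, "gun", "field_door"),   # first shooting terminal
    ShootingTerminal(102, "dome", "field_door"),  # second shooting terminal
    ShootingTerminal(103, "dome", "store_door"),  # third shooting terminal
]


def terminals_at(location):
    """Return the reference numerals of the terminals at a given location."""
    return [t.ref for t in TERMINALS if t.location == location]
```

In this model, `terminals_at("field_door")` lists the two terminals that share a door, which are the two whose human body tracks are later clustered together.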

In one embodiment of the present disclosure, in a smart retail store scenario, the first shooting terminal 101 and the second shooting terminal 102 may be disposed above a field door to capture track images of customers strolling in the public area; the third shooting terminal 103 may be disposed above the store door of each store to capture track images of customers strolling in a private area (store). The first shooting terminal 101 may be a gun camera; because its snapshot quality is high, it can be used to capture both the faces and the bodies of customers. The second shooting terminal 102 and the third shooting terminal 103 may be dome cameras; because their snapshot quality is relatively poor, they are used to capture human bodies only. The videos shot by the different shooting terminals are fragmented. To obtain a customer's strolling track in the mall and recall the identity of the customer's human body track, the video streams shot by the first shooting terminal 101, the second shooting terminal 102 and the third shooting terminal 103 need to be processed to obtain the human body track of the customer, and the human body track is bound with the customer's permanent identity. Specifically, the video streams captured by the first shooting terminal 101, the second shooting terminal 102, and the third shooting terminal 103 may be transmitted to the server 106 through the controller 105. After receiving the video streams, the server 106 first performs face detection, tracking, registration and filtering on each image in the video stream sent by the first shooting terminal 101 to obtain a face track, performs human body detection, tracking and filtering to obtain a first human body track, and obtains a first track mapping relation by binding the face track and the first human body track of the same customer.

Then, the server 106 may perform human body detection, tracking and filtering on the images in the video streams sent by the second shooting terminal 102 and the third shooting terminal 103 to obtain a second human body track and a third human body track. It then clusters the first human body track and the second human body track, so that first and second human body tracks with similar average quality scores are grouped into the same track cluster, and obtains the temporary identity corresponding to each cluster, that is, the second track mapping relation. It may further retrieve in the first human body track and the second human body track according to the third human body track, mapping the third human body track to the corresponding first or second human body track to obtain the third track mapping relation. Finally, identity binding is performed according to the first track mapping relation, the second track mapping relation, the third track mapping relation and the target permanent identity; that is, the human body track corresponding to the target permanent identity is obtained and bound to that identity, realizing identity recall of the human body track. Further, the server 106 may send the mapping relationship between the target permanent identity and the human body track to the terminal device 104, so that the mall manager or the store owner can formulate a corresponding marketing strategy according to the strolling track of the customer.
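The final binding step can be illustrated by treating each mapping relation as a dictionary. This is a minimal sketch under stated assumptions, not the disclosed implementation: the function name `bind_identities`, the track ID strings and the rule that the first permanent identity found for a cluster wins are all hypothetical; the text above only states which relations are combined.

```python
def bind_identities(first_map, face_to_identity, second_map, third_map):
    """Return {human_body_track_id: permanent_identity}.

    first_map:        face track ID -> first human body track ID
    face_to_identity: face track ID -> permanent identity (face retrieval)
    second_map:       first/second human body track ID -> temporary identity
    third_map:        third human body track ID -> matched first/second track ID
    """
    # Link each temporary identity (cluster) to a permanent identity
    # through a face track bound to a first human body track in it.
    temp_to_perm = {}
    for face_id, body_id in first_map.items():
        perm = face_to_identity.get(face_id)
        temp = second_map.get(body_id)
        if perm is not None and temp is not None:
            temp_to_perm.setdefault(temp, perm)  # assumption: first hit wins

    bound = {}
    # Every first/second track in a linked cluster inherits the identity.
    for body_id, temp in second_map.items():
        if temp in temp_to_perm:
            bound[body_id] = temp_to_perm[temp]
    # Third tracks inherit the identity of the track they were matched to.
    for third_id, matched_id in third_map.items():
        if matched_id in bound:
            bound[third_id] = bound[matched_id]
    return bound


# Hypothetical example data.
first_map = {"face-1": "gun-1"}
face_to_identity = {"face-1": "FaceID-A"}
second_map = {"gun-1": "tmp-7", "dome-4": "tmp-7", "dome-9": "tmp-8"}
third_map = {"shop-2": "dome-4", "shop-3": "dome-9"}
result = bind_identities(first_map, face_to_identity, second_map, third_map)
```

Here `dome-4` and `shop-2` receive `FaceID-A` even though neither was ever bound to a face directly, while `dome-9` and `shop-3` stay unbound because their cluster has no face track.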

It should be noted that the human body trajectory processing method provided by the embodiment of the present disclosure is generally executed by the server 106, and accordingly, the human body trajectory processing device is generally disposed in the server 106. However, in other embodiments of the present disclosure, the terminal device may also have a similar function as the server, so as to execute the human body trajectory processing scheme provided by the embodiments of the present disclosure.

Fig. 2 shows an architecture diagram of identity recall in a smart retail store in the related art. As shown in fig. 2, a gun camera and a dome camera are arranged above a field door, and a dome camera is arranged above a store door; these are referred to as the field gun camera, the field dome camera (ball machine) and the store dome camera, respectively. In step S201, a stream-fetching service is called to pull streams from the field gun camera, the field dome camera and the store dome camera, and the real-time video streams shot by the gun camera and the dome camera at the same field door are aligned (gun-ball video alignment), which keeps the frames of the paired cameras synchronized and facilitates subsequent gun-ball matching. In step S202, visual processing micro-services such as face feature extraction, face retrieval, human body feature extraction and human body retrieval are called to perform face and human body detection, tracking and filtering on each image in the field gun camera video, yielding face tracks and field gun camera human body tracks; the face track and the field gun camera human body track of the same customer at the same field door are then bound. If the binding succeeds, the two tracks are given the same number, that is, Trace ID face = Trace ID gun; if the binding fails, Trace ID face is discarded and not reported. In step S203, human body detection, tracking and filtering are performed on the images in the videos shot by the field dome camera and the store dome camera, and customer behaviors such as entering and exiting the venue or entering and exiting a store are judged, so as to determine the field dome camera human body track and the store dome camera human body track. In step S204, the face backend is queried according to the face features in the successfully bound face track to obtain the permanent identity corresponding to the face track, which is expressed as: Trace ID face → FaceID. In step S205, the successfully bound field gun camera human body track, the field dome camera human body track and the store dome camera human body track are reported to a human body backend, which processes the field human body tracks and the store human body tracks to obtain the relationship between the store human body track and the field human body track. In step S206, identity mapping of the human body track is performed according to the relationship between the face track and the permanent identity and the relationship between the store human body track and the field human body track.

In step S205, the human body backend performs pedestrian re-identification (Person Re-identification), a technology that uses computer vision to determine whether a specific pedestrian is present in an image or a video sequence: given a monitored pedestrian image, images of that pedestrian are retrieved across devices. When the human body tracks are analyzed according to the field gun camera human body track, the field dome camera human body track and the store dome camera human body track, the field gun camera human body track and the field dome camera human body track are first mapped to obtain the field human body track, and the store dome camera human body track is then analyzed against the field human body track to obtain the relationship between the store human body track and the field human body track. When the field human body track is acquired, the field dome camera is slow to process because it has a wide field of view and takes many snapshots. After receiving a field gun camera human body track, the human body backend therefore has to wait for the field dome camera human body track to arrive before gun-ball matching can be performed; the field gun camera human body track and the field dome camera human body track of the same pair of cameras are then matched, and the association between them is established, which can be expressed as: Trace ID ball = function(Trace ID gun), where function denotes a mapping relationship. When the relationship between the store human body track and the field human body track is obtained, retrieval can be performed among the field human body tracks according to the store human body track, which completes the association between the two and can be expressed as: Trace ID shop = function(Trace ID gun, Trace ID ball).
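The waiting behavior of the related-art human body backend can be sketched as follows. This is a simplified illustration under stated assumptions: the bounded wait `max_wait`, the arrival times and the `key` that stands in for "same person at the same pair of cameras" are all hypothetical, and the real system decides matches by appearance rather than by a shared key.

```python
def match_with_wait(gun_arrivals, dome_arrivals, max_wait):
    """Pair gun and dome camera body tracks under a bounded wait.

    gun_arrivals / dome_arrivals: {key: arrival time in seconds}, where
    `key` is a hypothetical stand-in for the appearance-based match.
    Returns (matched, lost): matched maps a dome track ID to a gun track ID,
    lost lists the keys whose tracks could not be paired in time.
    """
    matched, lost = {}, []
    for key, gun_time in gun_arrivals.items():
        dome_time = dome_arrivals.get(key)
        # The gun track waits at most max_wait seconds for its dome track.
        if dome_time is not None and dome_time - gun_time <= max_wait:
            matched[f"ball-{key}"] = f"gun-{key}"
        else:
            lost.append(key)
    return matched, lost


# Hypothetical arrival times: the dome track for "b" is late, "c" never arrives.
gun_arrivals = {"a": 0.0, "b": 1.0, "c": 2.0}
dome_arrivals = {"a": 3.0, "b": 12.0}
matched, lost = match_with_wait(gun_arrivals, dome_arrivals, max_wait=5.0)
```

In this example only one of three gun camera tracks is paired; the late and missing dome camera tracks are simply lost, and every gun camera track occupies the backend while it waits.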

In step S206, identity binding is performed according to the association between the store human body track and the field human body track, the association between the field gun camera human body track and the field dome camera human body track, the association between the face track and the field human body track, and the association between the face track and the permanent identity, which may be expressed as:

Trace ID ball = function(Trace ID gun) = function(Trace ID face) → FaceID

Trace ID shop = function(Trace ID gun, Trace ID ball) = function(Trace ID gun) = function(Trace ID face) → FaceID
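The two chains above are compositions of lookups, which the following sketch makes explicit. It is illustrative only: the function name `resolve_identity`, the dictionary names and the sample track IDs are hypothetical, and each `function` in the formulas is modeled as a plain dictionary.

```python
def resolve_identity(trace_id, shop_to_field, ball_to_gun, gun_to_face, face_to_faceid):
    """Follow shop -> field (gun or ball) -> gun -> face -> FaceID.

    Returns None as soon as a link in the chain is missing.
    """
    field = shop_to_field.get(trace_id, trace_id)  # shop track -> gun or ball track
    gun = ball_to_gun.get(field, field)            # ball track -> gun track
    face = gun_to_face.get(gun)                    # gun track -> bound face track
    if face is None:
        return None
    return face_to_faceid.get(face)


# Hypothetical relations; gun-2 has no bound face track (binding failed).
ball_to_gun = {"ball-1": "gun-1", "ball-2": "gun-2"}
shop_to_field = {"shop-1": "ball-1", "shop-2": "gun-2"}
gun_to_face = {"gun-1": "face-1"}
face_to_faceid = {"face-1": "FaceID-A"}

args = (shop_to_field, ball_to_gun, gun_to_face, face_to_faceid)
shop1_identity = resolve_identity("shop-1", *args)
shop2_identity = resolve_identity("shop-2", *args)
```

`shop-1` resolves to `FaceID-A`, whereas `shop-2` resolves to nothing: a single missing link, here the failed face binding of `gun-2`, breaks the whole chain for every track that depends on it.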

The above scheme has the following defects. (1) Poor algorithm performance: the scheme depends on strict conditions, namely that the gun and dome camera videos are strictly aligned, that most bindings succeed, and that each gun camera human body track waits for the corresponding dome camera human body track. Once the videos are misaligned or many bindings fail, or a gun camera human body track is not matched to a dome camera human body track within the specified time, a large number of dome camera human body tracks are lost, which degrades the identity recall of the dome camera human body tracks. (2) High cost: because the scheme requires the gun and dome camera videos to be strictly aligned, professional alignment personnel are needed to align the videos of many field gun cameras and field dome cameras, which is time-consuming and labor-intensive and leads to high labor cost. Meanwhile, misaligned videos or a large number of binding failures cause many human body tracks to be lost and the algorithm to perform poorly; to make up for this loss, more gun cameras have to be deployed in the mall to capture enough face and human body tracks of customers and thereby improve the identity binding of human body tracks. A gun camera currently costs nearly four times as much as a dome camera, so such a deployment inevitably makes the camera cost too high. (3) Low system throughput: the scheme requires field gun camera human body tracks to wait at the human body backend for a period of time to enable gun-ball matching, which greatly reduces system throughput.

The technical scheme provided by the present disclosure introduces human body track clustering and retrieval. The association between gun camera human body tracks and dome camera human body tracks is completed by clustering the gun camera human body tracks, including those whose binding failed, together with the dome camera human body tracks, without relying on video alignment or waiting logic. This avoids unnecessary loss of gun camera and/or dome camera human body tracks during track matching and thereby safeguards the recall of binding human body tracks to permanent identities. Experiments show that, with fewer gun cameras, the identity recall of human body tracks can be improved from 80% to 85%, system throughput is improved, and the labor cost and camera purchase cost are greatly reduced.
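The idea of clustering gun camera and dome camera human body tracks together, including gun camera tracks whose face binding failed, can be sketched as below. This is only an assumption-laden illustration: this section does not specify the clustering algorithm or the track features, so the greedy assignment, the cosine similarity over hypothetical two-dimensional appearance embeddings, the threshold of 0.9 and the `tmp-` naming are all invented for the example.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def cluster_tracks(features, threshold=0.9):
    """Greedily assign each track to the most similar existing cluster.

    features: {track ID: embedding}. Returns {track ID: temporary identity}.
    A track starts a new cluster when no cluster reaches the threshold.
    """
    centers = []      # one representative embedding per cluster
    assignment = {}
    for track_id, feat in features.items():
        best, best_sim = None, threshold
        for idx, center in enumerate(centers):
            sim = cosine(feat, center)
            if sim >= best_sim:
                best, best_sim = idx, sim
        if best is None:
            centers.append(feat)
            best = len(centers) - 1
        assignment[track_id] = f"tmp-{best}"
    return assignment


# Hypothetical embeddings; gun-2 is a track whose face binding failed.
features = {
    "gun-1": [1.0, 0.0],
    "gun-2": [0.0, 1.0],
    "ball-1": [0.98, 0.05],
    "ball-2": [0.03, 0.99],
}
temporary_ids = cluster_tracks(features)
```

Because tracks are grouped as they arrive, no track has to wait for a counterpart from another camera, and `gun-2` still shares a temporary identity with `ball-2` despite having no bound face track.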

The human body track processing method provided by the embodiment of the disclosure is realized based on image processing of a video shot by a camera, and relates to the technical field of artificial intelligence. Artificial Intelligence (AI) is a theory, method, technique and application system that uses a digital computer or a machine controlled by a digital computer to simulate, extend and expand human Intelligence, perceive the environment, acquire knowledge and use the knowledge to obtain the best results. In other words, artificial intelligence is a comprehensive technique of computer science that attempts to understand the essence of intelligence and produce a new intelligent machine that can react in a manner similar to human intelligence. Artificial intelligence is the research of the design principle and the realization method of various intelligent machines, so that the machines have the functions of perception, reasoning and decision making.

Artificial intelligence technology is a comprehensive discipline that covers a wide range of fields, including both hardware-level and software-level technologies. The artificial intelligence infrastructure generally includes technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technologies, operation/interaction systems, and mechatronics. Artificial intelligence software technologies mainly include computer vision, speech processing, natural language processing, and machine learning/deep learning.

Computer Vision (CV) is a science that studies how to make machines "see". More specifically, it refers to using cameras and computers in place of human eyes to identify, track and measure targets, and to further process the resulting images so that they become more suitable for human observation or for transmission to instruments for detection. As a scientific discipline, computer vision studies related theories and techniques in an attempt to build artificial intelligence systems that can obtain information from images or multidimensional data. Computer vision technologies generally include image processing, image recognition, image semantic understanding, image retrieval, OCR, video processing, video semantic understanding, video content/behavior recognition, three-dimensional object reconstruction, 3D technologies, virtual reality, augmented reality, and simultaneous localization and mapping, and also include common biometric technologies such as face recognition and fingerprint recognition.

Machine Learning (ML) is a multi-disciplinary field that involves probability theory, statistics, approximation theory, convex analysis, algorithm complexity theory and other disciplines. It studies how a computer can simulate or realize human learning behavior in order to acquire new knowledge or skills and reorganize existing knowledge structures to continuously improve its own performance. Machine learning is the core of artificial intelligence and the fundamental way to make computers intelligent, and it is applied across all fields of artificial intelligence. Machine learning and deep learning generally include techniques such as artificial neural networks, belief networks, reinforcement learning, transfer learning, inductive learning, and learning from demonstration.

With continued research and progress, artificial intelligence technology has been developed and applied in many fields, such as smart homes, smart wearable devices, virtual assistants, smart speakers, smart marketing, unmanned driving, automatic driving, unmanned aerial vehicles, robots, smart medical care, smart customer service, and the like.

The scheme provided by the embodiments of the present disclosure relates to artificial intelligence image processing and image recognition technologies, and is described in detail through the following embodiments:

in an embodiment of the present disclosure, a human body trajectory processing method is provided, which overcomes, at least to some extent, the drawbacks of the related art and of the inventors' earlier attempts described above. The method provided in this embodiment may be executed by a device having a computing function, for example a server, which may be the server 106 shown in fig. 1; it may also be executed by a terminal device having a computing function. The method can be used to process human body trajectories in any scene, for example intelligent retail stores, intelligent security, intelligent communities and intelligent catering, so as to improve the efficiency and accuracy of identity recall for human body trajectories and to reduce cost. The technical solution of the present disclosure is described in detail below, taking a server that processes human body trajectories in an intelligent retail store as the execution subject.

Fig. 3 is a flowchart illustrating a human body trajectory processing method according to an exemplary embodiment of the disclosure. Referring to fig. 3, the method for processing a human body trajectory provided in this embodiment specifically includes the following steps:

in step S310, a face trajectory and a first human body trajectory obtained from a video captured by a first shooting terminal at a field door, a second human body trajectory obtained from a video captured by a second shooting terminal at the field door, and a third human body trajectory obtained from a video captured by a third shooting terminal at a store door are acquired.

In one embodiment of the present disclosure, the area of an intelligent retail store can be divided into two categories, namely public areas and private areas. The public areas are field areas such as the entrance and exit of the retail store, elevator entrances, escalators and the atrium, and a position in a field area where a camera can be installed is called a field door. The private areas are store areas such as store entrances and store interiors, and a position in a store area where a camera can be installed is called a store door. In order to record the strolling trajectory of a customer, a plurality of shooting terminals are usually arranged in the public and private areas. The shooting terminals may be gun cameras and dome cameras. A gun camera has a fixed monitoring position and a limited monitoring direction but good capture quality, and can be used to capture both faces and human bodies. A dome camera has a large monitoring range and can generally rotate through 360 degrees, but its capture quality is poorer, so it can be used to capture human bodies. For example, a gun camera and/or a dome camera may be arranged at a field door to capture the trajectories of customers who appear within the shooting range; for a store area, a dome camera may be arranged at the store door in order to capture the flow of people over a wide range.

In one embodiment of the present disclosure, a first shooting terminal and a second shooting terminal may be arranged at a field door, where the first shooting terminal may be a gun camera or a dome camera and the second shooting terminal may be a dome camera or a gun camera, and a third shooting terminal, which is a dome camera, is arranged at a store door. By arranging a gun camera and a dome camera on the same door, the face trajectory and the human body trajectory of a customer can be acquired at the same time. The face trajectory and the human body trajectory of the same customer captured by the gun camera are bound, and the permanent identity of the customer is determined from the face trajectory. Based on this face-body binding and the mapping between the face trajectory and the permanent identity, the human body trajectory can be mapped to the permanent identity, which ensures accurate identity recall for the customer's human body trajectory.

In one embodiment of the present disclosure, in order to obtain the face trajectory and the human body trajectories, the videos shot by the first, second and third shooting terminals may be processed. Specifically, a first video sent by the first shooting terminal, a second video sent by the second shooting terminal and a third video sent by the third shooting terminal are acquired first; then the faces and human bodies contained in each image of the first video are detected and tracked to obtain the face trajectory and the first human body trajectory; the human bodies contained in each image of the second video are detected and tracked to obtain the second human body trajectory; and the human bodies contained in each image of the third video are detected and tracked to obtain the third human body trajectory. Furthermore, the customer's behavior of entering and leaving the field can be judged from the second human body trajectory, and the customer's behavior of entering and leaving a store can be judged from the third human body trajectory. Depending on the analysis requirements, human body trajectories with different behaviors can then be processed to obtain the required analysis result.

Next, how to detect and track faces and human bodies is described, taking a gun camera as the first shooting terminal, a dome camera as the second shooting terminal and a dome camera as the third shooting terminal as an example. For convenience of description, the gun camera and the dome camera on the field door are referred to as the field gun camera and the field dome camera respectively, and the dome camera on the store door is referred to as the store dome camera.

The video captured by the field gun camera contains both faces and human bodies, so the faces and human bodies in each video frame can be detected and tracked to obtain the captured images that form the face trajectory and the first human body trajectory. In addition, the images captured by the field dome camera and the store dome camera contain human bodies, so the human body feature extraction service and the human body retrieval service in the computer vision processing microservice can be called to detect and track the human bodies in the videos captured by the field dome camera and the store dome camera, so as to obtain the second human body trajectory and the third human body trajectory.

In an embodiment of the present disclosure, factors such as lighting, occlusion and angle may degrade the faces or human bodies captured by a camera, which would affect the subsequent matching of face trajectories and human body trajectories. Therefore, after the faces or human bodies in a video are detected and tracked, the faces may be registered and a face quality calculation algorithm may be invoked to calculate face quality scores, and the images forming the face trajectory may be filtered according to a preset face quality score threshold. Similarly, a human body quality calculation algorithm may be invoked to calculate human body quality scores, and the images in the first, second and third human body trajectories may be filtered according to a preset human body quality score threshold, so as to obtain face trajectories and human body trajectories that can be used in subsequent processing.
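The quality filtering described above can be illustrated with a minimal sketch. The `Snapshot` type, the `filter_trajectory` function, the example threshold of 0.8 and the use of a greater-than-or-equal comparison are all assumptions made for illustration; the disclosure only states that images are filtered against a preset quality score threshold and does not specify the quality calculation algorithm.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Snapshot:
    """One captured image of a trajectory (hypothetical structure)."""
    image_id: str
    quality: float  # quality score, assumed to be produced by an upstream scorer


def filter_trajectory(snapshots: List[Snapshot], threshold: float) -> List[Snapshot]:
    """Keep only the snapshots whose quality score reaches the preset threshold.

    Assumption: a snapshot exactly at the threshold is kept.
    """
    return [s for s in snapshots if s.quality >= threshold]


# Hypothetical face trajectory with one low-quality capture.
face_track = [Snapshot("f1", 0.95), Snapshot("f2", 0.40), Snapshot("f3", 0.88)]
usable = filter_trajectory(face_track, threshold=0.8)
print([s.image_id for s in usable])
```

The same function could be applied to the first, second and third human body trajectories with a separate human body quality threshold.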

In step S320, the face trajectory and the first human body trajectory matched with the face trajectory are bound to obtain a first trajectory mapping relationship, and retrieval is performed in a face database according to the face trajectory to obtain a target permanent identity corresponding to the face trajectory.

In an embodiment of the present disclosure, after the face trajectory and the first human body trajectory are obtained, the face trajectory and the first human body trajectory corresponding to the same customer may be bound by an SDK. An SDK is a software development kit, which is generally a collection of development tools used by software engineers to build application software for a specific software package, software framework, hardware platform, operating system, and the like. The bound face trajectory and first human body trajectory can be mapped to each other to obtain the first trajectory mapping relationship, which states that the number of the face trajectory is the same as the number of the first human body trajectory, as expressed by formula (1):

Trace ID face = Trace ID gun (1)

where Trace ID face is the number of the face trajectory, and Trace ID gun is the number of the first human body trajectory.

In one embodiment of the present disclosure, since face features correspond uniquely to permanent identities, retrieval may be performed in the face database according to the face trajectory to obtain the target permanent identity corresponding to the face trajectory. In the embodiment of the present disclosure, a face background and a human body background (that is, two back-end services) may be set up separately. The face background is configured to process face trajectories to obtain the permanent identity corresponding to each face trajectory. The human body background is configured to process human body trajectories, namely to cluster the first and second human body trajectories and to obtain the mapping relationship between the third human body trajectory and the first or second human body trajectory.

In an embodiment of the present disclosure, in order to obtain the target permanent identity corresponding to a face trajectory, the face trajectory may be sent to the face background. After receiving the face trajectory, the face background first obtains the target face feature corresponding to the face trajectory, and then retrieves in the face database according to the target face feature to obtain the corresponding target permanent identity. The face database includes a plurality of face features and the permanent identity corresponding to each face feature. During retrieval, the similarity between the target face feature and each face feature in the face database may be calculated, and the permanent identity corresponding to the face feature with the highest similarity is the target permanent identity corresponding to the face trajectory. The relationship between the face trajectory and the target permanent identity is expressed by formula (2):

Trace ID face→Face ID (2)

where Trace ID face is the face trajectory, Face ID is the target permanent identity, and the Face ID is stored in the form of an identity number.
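The retrieval behind formula (2) can be sketched as a 1:N search that returns the permanent identity whose stored face feature is most similar to the target face feature. This is a minimal sketch under stated assumptions: cosine similarity is used as the similarity measure (the disclosure does not fix the measure for face retrieval), and `retrieve_permanent_identity`, the dictionary layout of `face_db` and the example vectors are hypothetical.

```python
import math
from typing import Dict, List, Tuple


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine similarity of two feature vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve_permanent_identity(
    target_feature: List[float], face_db: Dict[str, List[float]]
) -> Tuple[str, float]:
    """Return (Face ID, similarity) of the most similar entry in the face database.

    Assumption: face_db maps each permanent identity number to one face feature.
    """
    scored = ((face_id, cosine_similarity(target_feature, feature))
              for face_id, feature in face_db.items())
    return max(scored, key=lambda pair: pair[1])


# Hypothetical two-dimensional features, for illustration only.
face_db = {"a": [1.0, 0.0], "b": [0.0, 1.0]}
face_id, similarity = retrieve_permanent_identity([0.9, 0.1], face_db)
print(face_id)
```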

Fig. 4 is a schematic flow chart of human body trajectory processing. As shown in fig. 4, in step S401, a face trajectory numbered 1 and first human body trajectories numbered 1 and 2 are obtained by processing the video shot by the gun camera, second human body trajectories numbered 1 to 4 are obtained by processing the video shot by the field dome camera, and third human body trajectories numbered 1, 2 and 3 are obtained by processing the video shot by the store dome camera; in step S402, face-body binding is performed on the face trajectory and the first human body trajectory; because the face trajectory numbered 1 and the first human body trajectory numbered 1 correspond to the same customer, they can be bound; in step S403, 1:N retrieval is performed according to the bound face trajectory numbered 1 to obtain the target permanent identity corresponding to the face trajectory; as shown in the figure, the face trajectory numbered 1 corresponds to the target permanent identity a.

In step S330, the first human body trajectory and the second human body trajectory are clustered, and a temporary identity corresponding to a clustering result is determined, so as to obtain a second trajectory mapping relationship between the human body trajectory and the temporary identity.

In one embodiment of the present disclosure, since the gun camera and the dome camera are located on the same door, they can both capture images of the same customer. That is, the first human body trajectories and the second human body trajectories may contain trajectories corresponding to the same customer, so the first and second human body trajectories need to be clustered in order to group the trajectories of the same customer into one cluster. Further, quality screening can be performed on the resulting trajectory clusters to determine target trajectory clusters, and identity retrieval can be performed on the target trajectory clusters to obtain the temporary identity corresponding to each target trajectory cluster. In the embodiment of the present disclosure, the gun camera video and the dome camera video do not need to be aligned, and when human body trajectories are reported, all first and second human body trajectories are reported to the human body background for clustering, rather than only those that were successfully bound. This avoids having to deploy more gun cameras in order to capture customers' face trajectories fully and thereby improve the identity binding of human body trajectories, and it also avoids the degraded algorithm performance caused by unaligned videos or by large numbers of human body trajectories failing face binding, which further reduces labor cost and shooting terminal cost.

Returning to fig. 4, in step S404, the entrance and exit behaviors of the customer are determined according to the second human body trajectory numbered 1-4, and the second human body trajectory numbered 1-4 and the first human body trajectories numbered 1 and 2 after the behavior determination are sent to the human body background; in step S405, the human body background clusters the first human body trajectory and the second human body trajectory by calling a clustering microservice to obtain a trajectory cluster; as shown in the figure, a first human body trajectory numbered 1 and second human body trajectories numbered 1, 2, and 3 belong to the same trajectory cluster, and a first human body trajectory numbered 2 and a second human body trajectory numbered 4 belong to the same trajectory cluster; in step S406, performing identity retrieval on a target track cluster in the track clusters to obtain a temporary identity corresponding to the target track cluster; as shown in the figure, a trajectory cluster composed of a first human body trajectory numbered 1 and second human body trajectories numbered 1, 2, and 3 in the two trajectory clusters is a target trajectory cluster, and a temporary identity a' corresponding to the target trajectory cluster can be obtained through identity retrieval.

Next, how to cluster the first and second human body trajectories and how to determine the temporary identity will be described in detail.

During clustering, a clustering microservice in the computer vision processing microservice can be called to process a first human body track and a second human body track, specifically, the first human body track and the second human body track in a preset time period can be pulled from a database, then the pulled first human body track and the pulled second human body track are registered, and the clustering microservice is called to cluster the registered first human body track and the registered second human body track so as to obtain a track cluster. The preset time period may be a time period set according to actual needs, and may be, for example, 20min, 30min, and the like. Fig. 5 is a schematic diagram illustrating a process of clustering a first human body trajectory and a second human body trajectory to form a trajectory cluster, where as shown in fig. 5, the process at least includes steps S501-S502, specifically:

in step S501, feature extraction is performed on the first human body trajectory and the second human body trajectory, respectively, to obtain a first trajectory feature corresponding to the first human body trajectory and a second trajectory feature corresponding to the second human body trajectory.

In an embodiment of the present disclosure, the first human body trajectory includes a plurality of captured images corresponding to the same customer, and the second human body trajectory also includes a plurality of captured images corresponding to the same customer. The first trajectory feature corresponding to the first human body trajectory can be obtained by performing feature extraction, through a machine learning model, on the captured images in the first human body trajectory, and the second trajectory feature corresponding to the second human body trajectory can be obtained in the same way from the captured images in the second human body trajectory. The machine learning model used for feature extraction may be any trained model that can be used for human body feature extraction, for example a convolutional neural network model, a residual network model, and the like, which is not specifically limited in this embodiment of the disclosure.

In step S502, the similarity between the first trajectory features, the similarity between the second trajectory features, and the similarity between the first trajectory features and the second trajectory features are calculated, and the first human body trajectory and the second human body trajectory are clustered according to the similarity to obtain a trajectory cluster.

In an embodiment of the present disclosure, since a customer's strolling route has no regularity, even when human bodies are detected and tracked in the video shot by a single shooting terminal, the human body trajectory corresponding to a customer may not be unique, and a plurality of human body trajectories may correspond to the same customer. It is therefore necessary to cluster the human body trajectories obtained from the same shooting terminal as well as those obtained from different shooting terminals. During clustering, in order to judge which first and second human body trajectories correspond to the same customer, the similarity between different trajectory features can be calculated, including the similarity between first trajectory features, the similarity between second trajectory features, and the similarity between first and second trajectory features. When the calculated similarity is greater than or equal to a preset similarity threshold, the corresponding trajectory features correspond to the same customer; when it is less than the preset similarity threshold, they correspond to different customers. By performing the above processing on all the registered first and second human body trajectories, the human body trajectories are clustered and trajectory clusters are obtained. The similarity may be determined by calculating a cosine distance, a Euclidean distance, and the like, and the preset similarity threshold may be determined according to actual needs, for example 0.9 or 0.95, which is not specifically limited in this embodiment of the disclosure.
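The threshold-based clustering of step S502 can be sketched as follows. This is a minimal illustration, not the clustering microservice itself: it assumes cosine similarity, and it assumes that matches are merged transitively with a union-find structure (if A matches B and B matches C, all three fall into one cluster), which the disclosure does not spell out. The function name `cluster_trajectories`, the trajectory identifiers and the feature vectors are hypothetical.

```python
import math
from typing import Dict, List


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine similarity of two feature vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster_trajectories(features: Dict[str, List[float]], threshold: float) -> List[List[str]]:
    """Group trajectory ids whose pairwise similarity reaches the threshold.

    Assumption: matches are merged transitively (single-linkage style).
    """
    ids = list(features)
    parent = {t: t for t in ids}

    def find(t: str) -> str:
        # Follow parent links to the cluster representative, compressing the path.
        while parent[t] != t:
            parent[t] = parent[parent[t]]
            t = parent[t]
        return t

    # Compare every pair once: gun-gun, ball-ball and gun-ball alike.
    for i, a in enumerate(ids):
        for b in ids[i + 1:]:
            if cosine_similarity(features[a], features[b]) >= threshold:
                parent[find(a)] = find(b)

    clusters: Dict[str, List[str]] = {}
    for t in ids:
        clusters.setdefault(find(t), []).append(t)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical trajectory features from the gun camera and the dome camera.
features = {"gun-1": [1.0, 0.0], "ball-1": [0.99, 0.05], "ball-4": [0.0, 1.0]}
print(cluster_trajectories(features, threshold=0.9))
```

The pairwise comparison is quadratic in the number of trajectories, which is acceptable for a sketch over one preset time window but would need indexing for large volumes.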

In an embodiment of the present disclosure, after the registered first and second human body trajectories are clustered to form trajectory clusters, human body identity retrieval may be performed on the trajectory clusters to obtain the temporary identity corresponding to each trajectory cluster; the correspondence between a trajectory cluster and its temporary identity is the second trajectory mapping relationship. It is noted that the first and second human body trajectories are usually pulled from the database at intervals of a preset time period, which may be, for example, 20 min or 30 min. In a smart retail store, many customers may enter and leave and be captured within the preset time period, so the pulled first and second human body trajectories usually form a plurality of trajectory clusters after clustering, and each trajectory cluster may include one or more human body trajectories. Next, how to obtain the temporary identities of a plurality of trajectory clusters is described. Fig. 6 is a schematic diagram illustrating a process of acquiring the temporary identity corresponding to a trajectory cluster. As shown in fig. 6, the process at least includes steps S601-S603, specifically:

in step S601, an average quality score of each human body trajectory in each trajectory cluster is obtained, and a target trajectory cluster is determined according to the average quality score.

In an embodiment of the present disclosure, after obtaining a plurality of trajectory clusters, the plurality of trajectory clusters may be filtered according to the snapshot quality of the human body trajectory in the trajectory cluster to obtain a target trajectory cluster with higher snapshot quality, and a human body trajectory identity retrieval is performed on the target trajectory cluster to obtain a temporary identity corresponding to the target trajectory cluster.

The process of acquiring the target trajectory cluster is as follows: first, the human body quality score corresponding to each captured image in each human body trajectory is acquired; then the average quality score of each human body trajectory is determined from the human body quality scores and the number of captured images; finally, the average quality score of each human body trajectory in each trajectory cluster is compared with a preset quality score threshold. When the average quality scores of all human body trajectories in a trajectory cluster are greater than or equal to the preset quality score threshold, the trajectory cluster is taken as a target trajectory cluster; when the trajectory cluster contains a human body trajectory whose average quality score is less than the preset quality score threshold, the trajectory cluster is filtered out.

Because one human body trajectory contains a plurality of captured images, after the quality scores corresponding to all the captured images in a human body trajectory are obtained, the average quality score of the trajectory can be obtained by adding all the quality scores and dividing the sum by the total number of captured images. Further, the preset quality score threshold may be set according to actual needs; in order to improve the accuracy of the identity retrieval result, it may be set relatively high, for example 0.9 or 0.93. For example, with the preset quality score threshold set to 0.9, five trajectory clusters A, B, C, D and E are filtered according to the threshold. If, after comparison, the average quality scores of all human body trajectories in trajectory clusters A and C are greater than or equal to 0.9, trajectory clusters A and C may be taken as target trajectory clusters, and trajectory clusters B, D and E are filtered out.
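The selection of target trajectory clusters in step S601 can be sketched as below. The nested dictionary layout, the function names and the example scores are assumptions for illustration; the rule itself (every trajectory in the cluster must have an average quality score at or above the preset threshold) follows the description above.

```python
from typing import Dict, List


def average_quality(scores: List[float]) -> float:
    """Sum of per-image quality scores divided by the number of captured images."""
    return sum(scores) / len(scores)


def select_target_clusters(
    clusters: Dict[str, Dict[str, List[float]]], threshold: float
) -> List[str]:
    """Return the names of clusters in which every trajectory passes the threshold.

    clusters maps cluster name -> {trajectory id -> per-image quality scores}.
    """
    return [
        name
        for name, trajectories in clusters.items()
        if all(average_quality(scores) >= threshold for scores in trajectories.values())
    ]


# Hypothetical quality scores for two clusters.
clusters = {
    "A": {"a": [0.95, 0.91], "b": [0.92]},
    "B": {"c": [0.95, 0.50], "d": [0.99]},  # trajectory c averages below 0.9
}
print(select_target_clusters(clusters, threshold=0.9))
```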

In step S602, a cluster average feature corresponding to the target track cluster is obtained, and a search is performed in the human body identity library according to the cluster average feature, so as to obtain a search score corresponding to the target track cluster.

In an embodiment of the present disclosure, after the target trajectory cluster is obtained, the human body trajectories in the target trajectory cluster may be sorted in descending order of average quality score, a first preset number of human body trajectories are then taken in order, a first average trajectory feature is determined from the trajectory features of these human body trajectories, and the first average trajectory feature is used as the cluster average feature. The first preset number may be set according to actual needs, for example 1 or 2. Taking 2 as an example, if the target trajectory cluster A includes four human body trajectories a, b, c and d whose average quality scores are 0.9, 0.92, 0.91 and 0.95 respectively, the sequence formed after sorting by average quality score is d-b-c-a, and the first two human body trajectories, d and b, are taken from it. Features of the captured images in human body trajectories d and b are extracted to obtain the trajectory features of d and b, and these trajectory features are then added and divided by 2 to obtain the first average trajectory feature corresponding to the target trajectory cluster A, that is, the cluster average feature. If there are multiple target trajectory clusters, the cluster average feature corresponding to each target trajectory cluster is calculated in the same way.
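The worked example above (trajectories a, b, c, d with a first preset number of 2) can be reproduced with a short sketch. The quality scores come from the example in the text; the two-dimensional feature vectors, the tuple layout and the function names are hypothetical and chosen only so the arithmetic is easy to follow.

```python
from typing import Dict, List, Tuple

# trajectory id -> (average quality score, trajectory feature)
Trajectories = Dict[str, Tuple[float, List[float]]]


def top_trajectories(trajectories: Trajectories, count: int) -> List[str]:
    """Ids of the `count` trajectories with the highest average quality score."""
    ranked = sorted(trajectories, key=lambda tid: trajectories[tid][0], reverse=True)
    return ranked[:count]


def cluster_average_feature(trajectories: Trajectories, count: int) -> List[float]:
    """Element-wise mean of the features of the top-`count` trajectories."""
    chosen = [trajectories[tid][1] for tid in top_trajectories(trajectories, count)]
    return [sum(column) / len(chosen) for column in zip(*chosen)]


# Quality scores from the text; feature vectors are made up for illustration.
cluster_a: Trajectories = {
    "a": (0.90, [0.2, 0.2]),
    "b": (0.92, [0.0, 1.0]),
    "c": (0.91, [0.4, 0.4]),
    "d": (0.95, [1.0, 0.0]),
}
print(top_trajectories(cluster_a, 2))         # ['d', 'b']
print(cluster_average_feature(cluster_a, 2))  # [0.5, 0.5]
```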

In an embodiment of the present disclosure, the cluster average feature corresponding to the target trajectory cluster can represent the human body trajectory feature of the customer corresponding to that cluster. Retrieval can therefore be performed in the human body identity library according to the cluster average feature, and whether a human body identity number corresponding to the target trajectory cluster exists in the human body identity library is determined by calculating the retrieval score corresponding to the target trajectory cluster. The human body identity library includes a plurality of human body identity numbers and a plurality of registered human body trajectories corresponding to each human body identity number. The process of obtaining the retrieval score corresponding to the target trajectory cluster is as follows: first, each human body identity number is traversed, and the average quality score of each of the registered human body trajectories corresponding to the human body identity number is calculated; then the registered human body trajectories corresponding to the human body identity number are sorted in descending order of average quality score, and a second preset number of target registered human body trajectories are taken in order; next, a second average trajectory feature corresponding to the human body identity number is calculated from the trajectory features of the target registered human body trajectories; finally, the similarity between the cluster average feature and the second average trajectory feature is calculated to obtain the retrieval score corresponding to the target trajectory cluster.

The method for calculating the average quality score of a registered human body trajectory is the same as the method for calculating the average quality score of a human body trajectory in a trajectory cluster described in the above embodiments, and is not repeated here. The second preset number may be set according to actual needs, for example 1 or 2, and may be the same as or different from the first preset number, which is not specifically limited in this disclosure. The method for calculating the second average trajectory feature is the same as the method for calculating the first average trajectory feature, and is not repeated here. The similarity between the cluster average feature and the second average trajectory feature can be determined by calculating the cosine distance, the Euclidean distance, and the like between them. This similarity is used as the retrieval score corresponding to the target trajectory cluster, and whether a human body identity number corresponding to the target trajectory cluster exists in the human body identity library can be determined according to the retrieval score.

In an embodiment of the present disclosure, since a plurality of human body identity numbers exist in the human body identity library, a plurality of retrieval scores corresponding to the target trajectory cluster can be obtained by the above method, each representing the similarity between the cluster average feature of the target trajectory cluster and the second average trajectory feature corresponding to one human body identity number in the library. In addition, when there are a plurality of target trajectory clusters, each target trajectory cluster can be traversed, and human body identity retrieval is performed for each of them in the human body identity library according to the above flow, so as to obtain the retrieval scores corresponding to each target trajectory cluster.
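The retrieval-score computation of step S602 can be sketched as one score per human body identity number. This is a minimal sketch under stated assumptions: cosine similarity is used as the score, each registered trajectory is represented by an (average quality score, trajectory feature) pair, and the names `retrieval_scores`, `mean_of_best` and the identity numbers `p1`, `p2` are hypothetical.

```python
import math
from typing import Dict, List, Tuple

Trajectory = Tuple[float, List[float]]  # (average quality score, trajectory feature)


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine similarity of two feature vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mean_of_best(trajectories: List[Trajectory], count: int) -> List[float]:
    """Average feature of the `count` registered trajectories with the best quality."""
    best = sorted(trajectories, key=lambda t: t[0], reverse=True)[:count]
    return [sum(column) / len(best) for column in zip(*(feat for _, feat in best))]


def retrieval_scores(
    cluster_feature: List[float],
    identity_library: Dict[str, List[Trajectory]],
    count: int,
) -> Dict[str, float]:
    """Similarity between the cluster average feature and each identity's
    second average trajectory feature, keyed by human body identity number."""
    return {
        identity: cosine_similarity(cluster_feature, mean_of_best(trajectories, count))
        for identity, trajectories in identity_library.items()
    }


# Hypothetical identity library with two registered identities.
library = {
    "p1": [(0.95, [1.0, 0.0]), (0.50, [0.0, 1.0])],
    "p2": [(0.90, [0.0, 1.0])],
}
scores = retrieval_scores([1.0, 0.0], library, count=1)
print(max(scores, key=scores.get))
```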

In step S603, a temporary identity is determined from the search score.

In an embodiment of the present disclosure, after the plurality of retrieval scores corresponding to a target trajectory cluster are obtained, the highest retrieval score may be taken and compared with a third preset threshold. When the highest retrieval score is greater than the third preset threshold, a human body identity number of the customer corresponding to the target trajectory cluster exists in the human body identity library, so the human body identity number corresponding to the highest retrieval score may be used as the temporary identity. When the highest retrieval score is less than or equal to the third preset threshold, no human body identity number of the customer corresponding to the target trajectory cluster exists in the human body identity library, so a temporary identity may be generated according to the human body trajectory with the highest average quality score in the target trajectory cluster. When the temporary identity is generated in this way, an ID generation algorithm such as a unique ID generation algorithm or the snowflake algorithm may be used, which is not specifically limited in the embodiment of the present disclosure.
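The decision in step S603 can be sketched as follows. The strict greater-than comparison against the third preset threshold follows the description above. The use of Python's `uuid.uuid4` as the ID generator is a stand-in chosen for illustration only; the disclosure mentions unique ID generation and the snowflake algorithm without fixing one, and the sketch omits registering the highest-quality trajectory under the new identity. The function name and the `reid-` prefix are hypothetical.

```python
import uuid
from typing import Dict


def determine_temporary_identity(scores: Dict[str, float], third_threshold: float) -> str:
    """Reuse the best-matching human body identity number, or mint a new one.

    scores maps human body identity number -> retrieval score for one target cluster.
    """
    if scores:
        best_identity = max(scores, key=scores.get)
        if scores[best_identity] > third_threshold:
            # The customer is already registered in the human body identity library.
            return best_identity
    # No sufficiently similar identity exists: generate a fresh temporary identity.
    # uuid4 is an assumption standing in for the ID generation algorithm.
    return "reid-" + uuid.uuid4().hex


print(determine_temporary_identity({"p1": 0.97, "p2": 0.40}, third_threshold=0.9))
print(determine_temporary_identity({"p1": 0.62}, third_threshold=0.9).startswith("reid-"))
```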

After the first human body trajectory and the second human body trajectory are clustered, the first human body trajectory and the second human body trajectory corresponding to the same customer can be mapped to corresponding temporary identities, and the mapping relationship between the human body trajectories and the temporary identities is the second trajectory mapping relationship. The second trajectory mapping relationship is specifically shown as formula (3):

(Trace ID gun, Trace ID ball) → REID (3)

Here, Trace ID gun is the first human body track, Trace ID ball is the second human body track, and REID is the temporary identity.

In an embodiment of the present disclosure, after the temporary identity corresponding to the target track cluster is determined, the database and the human identity library may be updated according to the target track cluster and the corresponding temporary identity, the updated human identity library and database may be used for subsequent human identity retrieval and human track clustering, and the accuracy and efficiency of human identity retrieval may be improved by gradually perfecting the human identity library.

Fig. 7 is a schematic flowchart of updating data when the highest retrieval score is greater than the third preset threshold. As shown in fig. 7, in step S701, the human identity number corresponding to the highest retrieval score in the human identity library is taken as an anchor point, and the similarities between all the snapshot images in the target trajectory cluster and the snapshot images corresponding to the anchor point are calculated; in step S702, the similarities are sorted from large to small and filtered according to a first preset filtering threshold; in step S703, when the similarity is smaller than the first preset filtering threshold, the target trajectory cluster is discarded; in step S704, when the similarity is greater than or equal to the first preset filtering threshold, the snapshot images corresponding to the anchor point and all snapshot images in the target trajectory cluster are sorted from large to small according to their quality scores to form a first image sequence; in step S705, a first position is obtained, namely the position of the last image in the first image sequence whose quality score is greater than or equal to a preset quality score; in step S706, it is determined whether the first position is greater than or equal to the total number of snapshot images corresponding to the anchor point; in step S707, when the first position is greater than or equal to the total number of snapshot images corresponding to the anchor point, the snapshot images in the registered human body trajectory corresponding to the anchor point are replaced with all images in the first image sequence up to and including the first position, forming a new human body trajectory corresponding to the anchor point; in step S708, when the first position is less than the total number of snapshot images corresponding to the anchor point, a target image is determined in the first image sequence according to the total number of snapshot images corresponding to the anchor point, and the snapshot images in the registered human body trajectory corresponding to the anchor point are replaced with the target image, forming a new human body trajectory corresponding to the anchor point; in step S709, the new human body trajectory corresponding to the anchor point is added to the update library; in step S710, the human identity library and the database are updated according to the update data in the update library.
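Steps S704 to S708 can be illustrated by the following sketch, which covers only the quality-based merge and not the similarity filtering of steps S701 to S703. The function name and the (snapshot id, quality score) tuples are hypothetical. The sketch also assumes one reading of step S708, namely that the target images are the first N images of the sequence, where N is the total number of snapshot images of the anchor point.

```python
from typing import List, Tuple

Snapshot = Tuple[str, float]  # (snapshot id, quality score), assumed layout


def merge_registered_trajectory(
    anchor: List[Snapshot], cluster: List[Snapshot], preset_quality: float
) -> List[Snapshot]:
    """Build the new registered trajectory for the anchor identity."""
    n_anchor = len(anchor)
    # S704: one sequence sorted by quality score, from large to small.
    sequence = sorted(anchor + cluster, key=lambda s: s[1], reverse=True)
    # S705: 1-based position of the last image meeting the preset quality.
    # In a descending sequence this equals the count of qualifying images.
    first_position = sum(1 for _, q in sequence if q >= preset_quality)
    # S706-S708: keep everything up to the first position, but never fewer
    # images than the anchor originally had (assumed reading of S708).
    keep = first_position if first_position >= n_anchor else n_anchor
    return sequence[:keep]


registered = [("r1", 0.9), ("r2", 0.5)]
new_cluster = [("c1", 0.8), ("c2", 0.7), ("c3", 0.3)]
updated = merge_registered_trajectory(registered, new_cluster, 0.6)
```

In this example three images meet the preset quality score, which is more than the two registered images, so the new trajectory consists of r1, c1 and c2.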

Fig. 8 is a schematic flowchart of updating data when the highest retrieval score is less than or equal to the third preset threshold. As shown in fig. 8, in step S801, the snapshot images in the human body trajectory with the highest average quality score in the target trajectory cluster are taken as an anchor point, and the similarities between all the snapshot images in the target trajectory cluster and the snapshot images corresponding to the anchor point are calculated; in step S802, the similarities are sorted from large to small and filtered according to a second preset filtering threshold; in step S803, when the similarity is smaller than the second preset filtering threshold, the target trajectory cluster is discarded; in step S804, when the similarity is greater than or equal to the second preset filtering threshold, all the snapshot images in the target trajectory cluster are sorted from large to small according to their quality scores to form a second image sequence; in step S805, a second position is obtained, namely the position of the last image in the second image sequence whose quality score is greater than or equal to the preset quality score; in step S806, all images in the second image sequence up to and including the second position are taken as the human body trajectory corresponding to the new temporary identity; in step S807, the new human body trajectory corresponding to the new temporary identity is added to the update library; in step S808, the human identity library and the database are updated according to the update data in the update library.

In step S340, a search is performed in the first human body trajectory and the second human body trajectory according to the third human body trajectory to obtain a third trajectory mapping relationship between the third human body trajectory and the first human body trajectory or the second human body trajectory.

In one embodiment of the present disclosure, a customer not only walks around the public area but also enters stores to shop and make purchases. Therefore, in order to obtain the complete shopping track of the same customer, the third human body track captured by a store camera needs to be associated with the first human body track captured by the gun camera and the second human body track captured by the dome camera that correspond to the same customer.

Returning to fig. 4, in step S407, the store entry and exit behavior of the customer is determined according to the third human body trajectories numbered 1, 2, and 3, and the third human body trajectories after the behavior determination are sent to the human body background; in step S408, the human body background searches the first human body trajectories and the second human body trajectories according to the third human body trajectories to obtain a third trajectory mapping relationship. Since the third human body trajectory represents the store trajectory of a customer, and the first human body trajectory and the second human body trajectory represent the field trajectory of the customer, the field trajectory and the store trajectory of the same customer can be associated by searching the field trajectories with the store trajectory. As shown in the figure, the third human body trajectory numbered 1 corresponds to the first human body trajectory numbered 1, and the third human body trajectories numbered 2 and 3 correspond to the second human body trajectories numbered 1 and 4, respectively.

The field track and the store track are associated and mapped to obtain a third track mapping relationship, which can be specifically realized by the following method flow.

First, feature extraction is performed on the third human body track to obtain a third track feature, and on the first human body track and the second human body track respectively to obtain a fourth track feature and a fifth track feature; then, the similarity between the third track feature and the fourth track feature and the similarity between the third track feature and the fifth track feature are calculated; finally, the third human body track is mapped to the first human body track or the second human body track with which it has the highest similarity, to obtain the third track mapping relation.
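The retrieval described above can be sketched as a nearest-neighbour search over track features. The disclosure does not fix the similarity measure in this passage, so cosine similarity is an assumption here, as are the function names and the track identifiers such as "gun-1" and "ball-4".

```python
import math
from typing import Dict, List, Tuple


def cosine_similarity(u: List[float], v: List[float]) -> float:
    """Cosine similarity of two feature vectors (assumed similarity measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def map_store_trajectory(
    store_feature: List[float], field_features: Dict[str, List[float]]
) -> Tuple[str, float]:
    """Map one third (store) track to the most similar first or second track.

    field_features holds the fourth and fifth track features keyed by a
    hypothetical track identifier; it must not be empty.
    """
    best_id = max(
        field_features,
        key=lambda tid: cosine_similarity(store_feature, field_features[tid]),
    )
    return best_id, cosine_similarity(store_feature, field_features[best_id])


field = {"gun-1": [1.0, 0.0], "ball-4": [0.0, 1.0]}
matched_id, score = map_store_trajectory([0.9, 0.1], field)
```

Here the store track is mapped to "gun-1", the field track whose feature points in nearly the same direction.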

The third trajectory mapping relationship may be expressed as formula (4):

Trace ID shop = function(Trace ID gun, Trace ID ball) (4)

Here, Trace ID shop is the third human body track, Trace ID gun is the first human body track, Trace ID ball is the second human body track, and function is the mapping relation.

In step S350, the target permanent identity is bound with the corresponding human body trajectory based on the first trajectory mapping relationship, the second trajectory mapping relationship, the third trajectory mapping relationship, and the target permanent identity.

Returning to fig. 4, in step S409, the human body trajectories are bound with the target permanent identity according to the relationship between the face trajectory and the target permanent identity, the first trajectory mapping relationship, the second trajectory mapping relationship, and the third trajectory mapping relationship. As shown in the figure, the third human body trajectories numbered 1, 2, and 3 correspond to the first human body trajectory numbered 1 and the second human body trajectories numbered 1 and 4, respectively; the first human body trajectory numbered 1 and the second human body trajectories numbered 1 and 4 belong to the same trajectory cluster and correspond to the temporary identity a'; the first human body trajectory numbered 1 corresponds to the face trajectory numbered 1, and the face trajectory numbered 1 corresponds to the target permanent identity a. It can therefore be determined that the first human body trajectory numbered 1, the second human body trajectories numbered 1 and 4, and the third human body trajectories numbered 1, 2, and 3 all correspond to the target permanent identity a.

In one embodiment of the present disclosure, identity binding may be performed based on relational expressions (1) - (4) to bind a target permanent identity with a corresponding human trajectory. The identity binding is specifically represented as shown in formula (5) and formula (6):

((Trace ID gun, Trace ID ball) → REID) & (Trace ID face = Trace ID gun) & (Trace ID face → FACE ID) → (Trace ID gun, Trace ID ball) → FACE ID (5)

Trace ID shop = function(Trace ID gun, Trace ID ball) → FACE ID (6)

Fig. 9 shows a structural diagram of identity binding. As shown in fig. 9, the relationship between a face track and a target permanent identity may be obtained based on face retrieval; for example, in the figure, the target permanent identity corresponding to the face tracks numbered 1, 2, and 3 is a, and the target permanent identity corresponding to the face tracks numbered 4, 5, and 6 is b. The first track mapping relation can be obtained based on face and human body binding; as shown in the figure, the first human body track numbered 1 corresponds to the face track numbered 1, and the first human body track numbered 6 corresponds to the face track numbered 6. The second track mapping relation can be obtained based on human body clustering; as shown in the figure, after clustering, the second human body tracks numbered 1, 2 and 3 and the first human body track numbered 1 form one target track cluster, whose temporary identity is a', while the second human body tracks numbered 3, 4 and 5 and the first human body track numbered 6 form another target track cluster, whose temporary identity is b'. The third track mapping relation can be obtained based on human body track retrieval; as shown in the figure, the third human body track numbered 1 corresponds to the first human body track numbered 1, and the third human body tracks numbered 2, 3, and 4 correspond to the second human body tracks numbered 1, 2, and 3, respectively.
When identity binding is carried out according to these correspondences, the first human body tracks are first bound with the target permanent identities, that is, the first human body track numbered 1 corresponds to the target permanent identity a, and the first human body track numbered 6 corresponds to the target permanent identity b. Then, according to the mapping relation between the first human body tracks and the target permanent identities, the first and second human body tracks in each cluster are bound with the target permanent identities, that is, the second human body tracks numbered 1, 2 and 3 and the first human body track numbered 1 correspond to the target permanent identity a, and the second human body tracks numbered 3, 4 and 5 and the first human body track numbered 6 correspond to the target permanent identity b. Finally, identity binding is performed according to the third track mapping relation and the mapping relation between the first and second human body tracks and the target permanent identities, that is, the third human body tracks numbered 1, 2 and 3 correspond to the target permanent identity a, and the third human body track numbered 4 corresponds to both target permanent identities a and b.
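The propagation of identities in the fig. 9 example can be sketched as follows. The dictionary layouts and the function name are assumptions chosen for illustration; the data reproduce the track numbers of the example, with "gun" denoting first human body tracks and "ball" denoting second human body tracks.

```python
from collections import defaultdict


def bind_identities(face_to_permanent, gun_to_face, clusters, shop_to_field):
    """Propagate permanent identities from faces to all human body tracks."""
    field_ids = defaultdict(set)
    for members in clusters.values():
        # Permanent identities reachable through face-bound first tracks.
        permanent = {
            face_to_permanent[gun_to_face[g]]
            for g in members["gun"]
            if g in gun_to_face
        }
        # Every first and second track in the cluster inherits them.
        for kind in ("gun", "ball"):
            for number in members[kind]:
                field_ids[(kind, number)] |= permanent
    # Third (store) tracks inherit from the field track they were mapped to.
    shop_ids = {
        shop: set(field_ids.get(field, set()))
        for shop, field in shop_to_field.items()
    }
    return dict(field_ids), shop_ids


# Data of the fig. 9 example.
face_to_permanent = {1: "a", 2: "a", 3: "a", 4: "b", 5: "b", 6: "b"}
gun_to_face = {1: 1, 6: 6}  # first track mapping relation
clusters = {  # second track mapping relation
    "a'": {"gun": [1], "ball": [1, 2, 3]},
    "b'": {"gun": [6], "ball": [3, 4, 5]},
}
shop_to_field = {1: ("gun", 1), 2: ("ball", 1), 3: ("ball", 2), 4: ("ball", 3)}
field_ids, shop_ids = bind_identities(
    face_to_permanent, gun_to_face, clusters, shop_to_field
)
```

Because the second human body track numbered 3 belongs to both clusters, the third human body track numbered 4 ends up with the two candidate identities a and b, which is the case the subsequent screening has to resolve.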

In an embodiment of the present disclosure, the identity recall of the human body trajectory may be implemented through the above-mentioned process, but as can be seen from fig. 9, a plurality of human body trajectories may be clustered into different temporary identities during clustering, so that a plurality of target permanent identities may exist corresponding to the same human body trajectory, and therefore, the plurality of target permanent identities need to be screened to determine the unique target permanent identity corresponding to the same human body trajectory. In the embodiment of the disclosure, the target permanent identities corresponding to the same human body trajectory may be determined through a multi-identity voting algorithm, and specifically, the plurality of target permanent identities may be filtered according to the number of each target permanent identity, the retrieval score corresponding to each target permanent identity, the average quality score corresponding to each target permanent identity, and the shooting time of the snapshot image corresponding to each target permanent identity, so as to determine the final target permanent identity.

Fig. 10 is a schematic flowchart of determining the final target permanent identity. As shown in fig. 10, in step S1001, the number of target permanent identities corresponding to a human body trajectory is counted; in step S1002, it is determined whether the number of target permanent identities is greater than 1; in step S1003, when the number of target permanent identities is equal to 1, that target permanent identity is taken as the final target permanent identity; in step S1004, when the number of target permanent identities is greater than 1, the number of times each different target permanent identity occurs is counted; in step S1005, it is determined whether more than one target permanent identity shares the highest number of occurrences; in step S1006, when only one target permanent identity occurs most frequently, that target permanent identity is taken as the final target permanent identity; in step S1007, when more than one target permanent identity occurs most frequently, the retrieval scores between the target trajectory cluster and each of these target permanent identities are obtained and sorted from large to small; in step S1008, it is determined whether the number of highest retrieval scores is greater than 1; in step S1009, when the number of highest retrieval scores is equal to 1, the target permanent identity corresponding to the highest retrieval score is taken as the final target permanent identity; in step S1010, when the number of highest retrieval scores is greater than 1, the average quality scores of the registered human body trajectories corresponding to the plurality of highest retrieval scores are obtained and sorted from large to small; in step S1011, it is determined whether the number of highest average quality scores is greater than 1; in step S1012, when the number of highest average quality scores is equal to 1, the target permanent identity corresponding to the highest average quality score is taken as the final target permanent identity; in step S1013, when the number of highest average quality scores is greater than 1, the earliest snapshot times of the snapshot images in the human body trajectories corresponding to the plurality of highest average quality scores are obtained and sorted from earliest to latest; in step S1014, the target permanent identity corresponding to the human body trajectory with the earliest snapshot time is taken as the final target permanent identity.
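The cascade of fig. 10 can be sketched as successive tie-breaking. The class and function names, the field names and the use of a numeric timestamp are assumptions made for illustration; the order of the criteria (occurrence count, retrieval score, average quality score, earliest snapshot time) follows the steps above.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class IdentityEvidence:
    retrieval_score: float
    average_quality: float
    earliest_capture_time: float  # assumed to be a numeric timestamp


def vote_final_identity(
    occurrences: List[str], evidence: Dict[str, IdentityEvidence]
) -> str:
    """Pick one final identity; occurrences must not be empty."""
    counts = Counter(occurrences)
    top = max(counts.values())
    # S1004-S1006: identities sharing the highest occurrence count.
    tied = [identity for identity, c in counts.items() if c == top]
    # S1007-S1012: break ties by retrieval score, then by average quality.
    for key in (
        lambda i: evidence[i].retrieval_score,
        lambda i: evidence[i].average_quality,
    ):
        if len(tied) == 1:
            break
        best = max(key(i) for i in tied)
        tied = [i for i in tied if key(i) == best]
    # S1013-S1014: among what remains, the earliest snapshot time wins.
    return min(tied, key=lambda i: evidence[i].earliest_capture_time)


evidence = {
    "a": IdentityEvidence(0.9, 0.8, 100.0),
    "b": IdentityEvidence(0.9, 0.8, 50.0),
}
final_identity = vote_final_identity(["a", "b", "a", "b"], evidence)
```

In this example a and b tie on count, retrieval score and average quality score, so b is chosen because its earliest snapshot is older.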

The multi-dimensional filtering is carried out on the target permanent identities according to the multi-identity voting algorithm, so that one target permanent identity corresponding to the same human body track can be accurately determined, and the identity recall accuracy of the human body track is improved.

The human body track processing method provided by the disclosure can be used in any scene in which the identity of a human body track needs to be recalled. In addition to the smart retail scene in the above embodiment, it can be applied to smart security, smart community, smart catering and other scenes. For example, in a smart catering scene, a food court contains a plurality of stalls, so a gun camera and a dome camera can be arranged at the gate of the food court to shoot customers passing through the gate and record track images of the customers in the public area, while a dome camera is arranged at each stall to shoot track images of the customers at the stall. A plurality of face tracks, first human body tracks, second human body tracks and third human body tracks can then be obtained from the videos shot by the gun camera and the dome camera at the gate and the videos shot by the dome cameras at the stalls, and the human body tracks can be bound to the target permanent identities according to the human body track processing method in the above embodiment, so that identity recall of the human body tracks is achieved.

According to the human body track processing method provided by the disclosure, in a first aspect, the field tracks captured by the gun camera and the dome camera can be associated through human body track clustering and human body track retrieval, which avoids a series of strict conditions, such as requiring the videos of the gun camera and the dome camera to be aligned, requiring a large number of successful bindings when faces and human bodies are bound, and requiring the first human body track to wait for the second human body track, so that the efficiency and accuracy of identity recall of human body tracks and the throughput of the system are improved and the labor cost is reduced; in a second aspect, it is no longer necessary to deploy a large number of gun cameras to fully capture face and human body tracks in order to improve the identity binding effect of human body tracks, so that the cost is further reduced; in a third aspect, the technical scheme of the present disclosure is applicable to current mainstream hardware platforms including PCs, servers, and the like, and thus has wide applicability.

Embodiments of the human body trajectory processing device of the present disclosure are described below, which can be used to execute the human body trajectory processing method of the present disclosure.

Fig. 11 is a schematic structural diagram of a human body trajectory processing device in an exemplary embodiment of the present disclosure. As shown in fig. 11, the human body trajectory processing device 1100 includes: a track acquisition module 1101, a face retrieval module 1102, a track clustering module 1103, a track retrieval module 1104 and an identity binding module 1105.

Wherein: a trajectory acquisition module 1101, configured to acquire a face trajectory and a first human body trajectory obtained from a video captured by a first capturing device at a field gate, a second human body trajectory obtained from a video captured by a second capturing device at the field gate, and a third human body trajectory obtained from a video captured by a third capturing device at a store entrance; a face retrieval module 1102, configured to bind the face trajectory and a first human body trajectory matched with the face trajectory to obtain a first trajectory mapping relationship, and retrieve in a face database according to the face trajectory to obtain a target permanent identity corresponding to the face trajectory; a track clustering module 1103, configured to cluster the first human body trajectory and the second human body trajectory, and determine a temporary identity corresponding to a clustering result, so as to obtain a second trajectory mapping relationship between the human body trajectories and the temporary identity; a trajectory retrieval module 1104, configured to retrieve in the first human body trajectory and the second human body trajectory according to the third human body trajectory to obtain a third trajectory mapping relationship between the third human body trajectory and the first human body trajectory or the second human body trajectory; an identity binding module 1105, configured to bind the target permanent identity with the corresponding human body trajectory based on the first trajectory mapping relationship, the second trajectory mapping relationship, the third trajectory mapping relationship, and the target permanent identity.

In an embodiment of the present disclosure, the trajectory acquisition module 1101 is configured to: acquiring a first video sent by a first shooting terminal, a second video sent by a second shooting terminal and a third video sent by a third shooting terminal; detecting and tracking the face and the human body contained in each image in the first video respectively to obtain the face track and the first human body track; detecting and tracking the human body contained in each image in the second video to obtain a second human body track; and detecting and tracking the human body contained in each image in the third video to obtain the third human body track.

In one embodiment of the present disclosure, the face database includes a plurality of face features and a permanent identity corresponding to each of the face features; the face retrieval module 1102 is configured to: acquiring target face features corresponding to the face track; and calculating the similarity between the target face features and each face feature, and taking the permanent identity corresponding to the face feature with the highest similarity as the target permanent identity.

In an embodiment of the present disclosure, the trajectory clustering module 1103 includes: a logic registration unit, configured to register the first human body trajectory and the second human body trajectory; a clustering unit, configured to invoke a clustering microservice to cluster the registered first human body trajectory and the registered second human body trajectory to obtain a trajectory cluster; and the identity retrieval unit is used for retrieving the human identity according to the track cluster to acquire a temporary identity corresponding to the track cluster, and taking the corresponding relation between the track cluster and the temporary identity as the second track mapping relation.

In an embodiment of the disclosure, the clustering unit is configured to: respectively extracting features of the first human body track and the second human body track to obtain a first track feature corresponding to the first human body track and a second track feature corresponding to the second human body track; calculating the similarity between the first track features, the similarity between the second track features and the similarity between the first track features and the second track features, and clustering the first human body track and the second human body track according to the similarity to obtain the track cluster.

In one embodiment of the present disclosure, there are a plurality of trajectory clusters, and each trajectory cluster includes one or more human body trajectories; the identity retrieval unit comprises: a target track cluster determining unit, configured to acquire the average quality score of each human body track in each track cluster and determine the target track cluster according to the average quality score; a retrieval score determining unit, configured to acquire the cluster average feature corresponding to the target track cluster and retrieve in a human body identity library according to the cluster average feature, so as to acquire a retrieval score corresponding to the target track cluster; and a temporary identity determining unit, configured to determine the temporary identity according to the retrieval score.

In an embodiment of the present disclosure, the target trajectory cluster determining unit is configured to: acquire the human body quality score corresponding to each snapshot image contained in each human body track; determine the average quality score of each human body track according to the human body quality scores and the number of snapshot images; compare the average quality score of each human body track in each track cluster with a preset quality score threshold; and when the average quality scores of all human body tracks in a track cluster are greater than or equal to the preset quality score threshold, take the track cluster as the target track cluster.
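The selection rule of the target trajectory cluster determining unit can be sketched as below. The nested dictionary layout (cluster id, then track id, then the per-snapshot quality scores) and the function names are assumptions, and every track is assumed to contain at least one snapshot image.

```python
from typing import Dict, List


def average_quality(snapshot_scores: List[float]) -> float:
    """Average quality score of one human body track (non-empty list)."""
    return sum(snapshot_scores) / len(snapshot_scores)


def select_target_clusters(
    clusters: Dict[str, Dict[str, List[float]]], quality_threshold: float
) -> List[str]:
    """Return the ids of clusters in which every track meets the threshold."""
    return [
        cluster_id
        for cluster_id, tracks in clusters.items()
        if tracks
        and all(
            average_quality(scores) >= quality_threshold
            for scores in tracks.values()
        )
    ]


clusters = {
    "c1": {"t1": [0.8, 0.9], "t2": [0.7]},
    "c2": {"t3": [0.9], "t4": [0.2, 0.4]},
}
targets = select_target_clusters(clusters, 0.6)
```

Only c1 is selected here, because track t4 in c2 has an average quality score below the threshold.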

In one embodiment of the present disclosure, the retrieval score determining unit is configured to: sort the human body tracks in the target track cluster from large to small according to average quality score; and sequentially acquire a first preset number of human body tracks, determine a first average track feature according to the track features of the first preset number of human body tracks, and take the first average track feature as the cluster average feature.

In one embodiment of the present disclosure, the human body identity library includes a plurality of human body identity numbers and a plurality of registered human body trajectories corresponding to each human body identity number; the retrieval score determining unit is configured to: traverse each human body identity number, and calculate the average quality scores of the plurality of registered human body tracks corresponding to the human body identity number; sort the plurality of registered human body tracks corresponding to the human body identity number from large to small according to average quality score, and sequentially acquire a second preset number of target registered human body tracks; calculate a second average track feature corresponding to the human body identity number according to the track features of the target registered human body tracks; and calculate the similarity between the cluster average feature and the second average track feature to obtain a retrieval score corresponding to the target track cluster.
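The retrieval score computation can be sketched as follows: the highest-quality tracks on each side are averaged into one feature, and the two averages are compared. Cosine similarity is an assumed choice of similarity measure, and the function names and the (average quality score, feature) tuples are hypothetical.

```python
import math
from typing import Dict, List, Tuple

Track = Tuple[float, List[float]]  # (average quality score, track feature)


def mean_feature_of_top(tracks: List[Track], top_n: int) -> List[float]:
    """Average the features of the top_n tracks by average quality score."""
    best = sorted(tracks, key=lambda t: t[0], reverse=True)[:top_n]
    dim = len(best[0][1])
    return [sum(t[1][k] for t in best) / len(best) for k in range(dim)]


def cosine_similarity(u: List[float], v: List[float]) -> float:
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def retrieval_scores(
    cluster: List[Track],
    identity_library: Dict[str, List[Track]],
    first_n: int,
    second_n: int,
) -> Dict[str, float]:
    """Score the target cluster against every identity number in the library."""
    cluster_mean = mean_feature_of_top(cluster, first_n)  # cluster average feature
    return {
        identity: cosine_similarity(
            cluster_mean, mean_feature_of_top(registered, second_n)
        )
        for identity, registered in identity_library.items()
    }


cluster = [(0.9, [1.0, 0.0]), (0.8, [1.0, 0.2]), (0.1, [0.0, 1.0])]
library = {"p1": [(0.9, [1.0, 0.0])], "p2": [(0.9, [0.0, 1.0])]}
scores = retrieval_scores(cluster, library, first_n=2, second_n=1)
```

The low-quality third track of the cluster is excluded from the cluster average feature, so the cluster scores much higher against p1 than against p2.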

In one embodiment of the present disclosure, the temporary identity determination unit is configured to: acquire the highest retrieval score corresponding to the target track cluster, and compare the highest retrieval score with a third preset threshold; when the highest retrieval score is greater than the third preset threshold, take the human body identity number corresponding to the highest retrieval score as the temporary identity; and when the highest retrieval score is less than or equal to the third preset threshold, generate the temporary identity according to the human body track with the highest average quality score in the target track cluster.

In one embodiment of the present disclosure, the trajectory retrieval module 1104 is configured to: performing feature extraction on the third human body track to obtain a third track feature, and performing feature extraction on the first human body track and the second human body track respectively to obtain a fourth track feature and a fifth track feature; calculating the similarity between the third and fourth trajectory features and the similarity between the third and fifth trajectory features; and mapping the third human body track corresponding to the highest similarity with the first human body track or mapping the third human body track corresponding to the highest similarity with the second human body track to obtain the third track mapping relation.

In an embodiment of the present disclosure, the human body trajectory processing device 1100 further includes: a voting module, configured to determine the final target permanent identity corresponding to a human body track according to a multi-identity voting algorithm when the same human body track corresponds to a plurality of target permanent identities.

In one embodiment of the disclosure, the voting module is configured to: filter the plurality of target permanent identities according to the number of occurrences of each target permanent identity, the retrieval score corresponding to each target permanent identity, the average quality score corresponding to each target permanent identity and the shooting time of the snapshot images corresponding to each target permanent identity, so as to determine the final target permanent identity.

FIG. 12 illustrates a schematic structural diagram of a computer system suitable for use in implementing the electronic device of an embodiment of the present disclosure.

It should be noted that the computer system 1200 of the electronic device shown in fig. 12 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present disclosure.

As shown in fig. 12, the computer system 1200 includes a processor 1201, wherein the processor 1201 may include a Graphics Processing Unit (GPU) and a Central Processing Unit (CPU), which can perform various appropriate actions and processes according to a program stored in a Read-Only Memory (ROM) 1202 or a program loaded from a storage section 1208 into a Random Access Memory (RAM) 1203. In the RAM 1203, various programs and data necessary for system operation are also stored. The processor (GPU/CPU) 1201, the ROM 1202, and the RAM 1203 are connected to each other by a bus 1204. An Input/Output (I/O) interface 1205 is also connected to the bus 1204.

The following components are connected to the I/O interface 1205: an input section 1206 including a keyboard, a mouse, and the like; an output section 1207 including a display device such as a Cathode Ray Tube (CRT) or a Liquid Crystal Display (LCD), and a speaker; a storage section 1208 including a hard disk and the like; and a communication section 1209 including a network interface card such as a LAN (Local Area Network) card, a modem, or the like. The communication section 1209 performs communication processing via a network such as the internet. A drive 1210 is also connected to the I/O interface 1205 as needed. A removable medium 1211, such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like, is mounted on the drive 1210 as necessary, so that a computer program read out therefrom is installed into the storage section 1208 as necessary.

In particular, the processes described below with reference to the flowcharts may be implemented as computer software programs, according to embodiments of the present disclosure. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method illustrated in the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication section 1209, and/or installed from the removable medium 1211. The computer program, when executed by the processor (GPU/CPU)1201, performs various functions defined in the system of the present application. In some embodiments, computer system 1200 may also include an AI (Artificial Intelligence) processor for processing computing operations related to machine learning.

It should be noted that the computer readable medium shown in the embodiments of the present disclosure may be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a Read-Only Memory (ROM), an Erasable Programmable Read-Only Memory (EPROM), a flash memory, an optical fiber, a portable Compact Disc Read-Only Memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present disclosure, a computer readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device. In contrast, in the present disclosure, a computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electromagnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wired, etc., or any suitable combination of the foregoing.

The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.

The units described in the embodiments of the present disclosure may be implemented by software, or may be implemented by hardware, and the described units may also be disposed in a processor. In some cases, the names of the units do not constitute a limitation on the units themselves.

As another aspect, the present application also provides a computer-readable medium, which may be contained in the electronic device described in the above embodiments; or may exist separately without being assembled into the electronic device. The computer readable medium carries one or more programs which, when executed by an electronic device, cause the electronic device to implement the method described in the above embodiments.

It should be noted that although several modules or units of the device for action execution are mentioned in the above detailed description, such a division is not mandatory. Indeed, according to embodiments of the present disclosure, the features and functions of two or more modules or units described above may be embodied in one module or unit. Conversely, the features and functions of one module or unit described above may be further divided so as to be embodied by a plurality of modules or units.

Through the above description of the embodiments, those skilled in the art will readily understand that the exemplary embodiments described herein may be implemented by software, or by software in combination with necessary hardware. Therefore, the technical solution according to the embodiments of the present disclosure may be embodied in the form of a software product, which may be stored in a non-volatile storage medium (which may be a CD-ROM, a USB disk, a removable hard disk, etc.) or on a network, and which includes several instructions that enable a computing device (which may be a personal computer, a server, a touch terminal, or a network device, etc.) to execute the method according to the embodiments of the present disclosure.

Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This application is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.

It will be understood that the present disclosure is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.

Claims

1. A human body trajectory processing method is characterized by comprising the following steps:

acquiring a face track and a first human body track obtained according to a video shot by a first shooting device at a field gate, a second human body track obtained according to a video shot by a second shooting device at the field gate, and a third human body track obtained according to a video shot by a third shooting device at a field gate;

binding the face track and a first human body track matched with the face track to obtain a first track mapping relation, and retrieving in a face database according to the face track to obtain a target permanent identity corresponding to the face track;

clustering the first human body track and the second human body track, and determining a temporary identity corresponding to a clustering result to obtain a second track mapping relation between the human body track and the temporary identity;

retrieving in the first human body trajectory and the second human body trajectory according to the third human body trajectory to obtain a third trajectory mapping relationship between the third human body trajectory and the first human body trajectory or the second human body trajectory;

and binding the target permanent identity with the corresponding human body track based on the first track mapping relation, the second track mapping relation, the third track mapping relation and the target permanent identity.
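As an illustration only, the final binding step of claim 1 can be sketched as a propagation over the three mapping relations. The function name `bind_permanent_identities` and the dictionary layouts below are hypothetical and are not taken from the disclosure; the sketch simply assumes that each mapping relation can be represented as a key-value table.

```python
def bind_permanent_identities(face_to_body, face_to_permanent,
                              body_to_temp, third_to_body):
    """Propagate permanent identities to human body tracks.

    All argument layouts are assumptions made for this sketch:
      face_to_body      -- first mapping: face track id -> first body track id
      face_to_permanent -- face track id -> permanent identity from face retrieval
      body_to_temp      -- second mapping: first/second body track id -> temporary identity
      third_to_body     -- third mapping: third body track id -> matched first/second track id
    """
    body_to_permanent = {}
    temp_to_permanent = {}

    # First mapping: a body track bound to a recognised face inherits its identity.
    for face_id, body_id in face_to_body.items():
        permanent = face_to_permanent.get(face_id)
        if permanent is None:
            continue
        body_to_permanent[body_id] = permanent
        temp = body_to_temp.get(body_id)
        if temp is not None:
            temp_to_permanent[temp] = permanent

    # Second mapping: tracks sharing a temporary identity share the permanent one.
    for body_id, temp in body_to_temp.items():
        if temp in temp_to_permanent:
            body_to_permanent.setdefault(body_id, temp_to_permanent[temp])

    # Third mapping: a third track inherits from the track it was matched to.
    for third_id, matched_id in third_to_body.items():
        if matched_id in body_to_permanent:
            body_to_permanent.setdefault(third_id, body_to_permanent[matched_id])

    return body_to_permanent
```

In this sketch a third human body track whose matched track has no permanent identity is simply left unbound.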

2. The human body trajectory processing method according to claim 1, wherein the acquiring a face track and a first human body track obtained according to a video shot by a first shooting device at a field gate, a second human body track obtained according to a video shot by a second shooting device at the field gate, and a third human body track obtained according to a video shot by a third shooting device at a field gate comprises:

acquiring a first video sent by the first shooting device, a second video sent by the second shooting device and a third video sent by the third shooting device;

detecting and tracking the face and the human body contained in each image in the first video respectively to obtain the face track and the first human body track;

detecting and tracking the human body contained in each image in the second video to obtain a second human body track;

and detecting and tracking the human body contained in each image in the third video to obtain the third human body track.

3. The human body trajectory processing method of claim 1, wherein the face database includes a plurality of face features and a permanent identity corresponding to each of the face features;

the retrieving in a face database according to the face track to obtain a target permanent identity corresponding to the face track comprises:

acquiring target face features corresponding to the face track;

and calculating the similarity between the target face features and each face feature, and taking the permanent identity corresponding to the face feature with the highest similarity as the target permanent identity.
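The retrieval of claim 3 can be sketched as a nearest-neighbour lookup. The claim does not name a similarity measure; cosine similarity is an assumption made here, and the names `cosine_similarity` and `retrieve_permanent_identity`, as well as the dictionary layout of the face database, are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length feature vectors (assumed measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve_permanent_identity(target_feature, face_database):
    """Return (permanent identity, similarity) of the most similar face feature.

    face_database is assumed to map permanent identity -> face feature vector.
    """
    best_identity, best_score = None, float("-inf")
    for identity, feature in face_database.items():
        score = cosine_similarity(target_feature, feature)
        if score > best_score:
            best_identity, best_score = identity, score
    return best_identity, best_score
```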

4. The human body trajectory processing method according to claim 1, wherein the clustering the first human body trajectory and the second human body trajectory and determining a temporary identity corresponding to a clustering result to obtain a second trajectory mapping relationship between human body trajectories and the temporary identity includes:

registering the first human body trajectory and the second human body trajectory;

calling a clustering microservice to cluster the registered first human body track and the registered second human body track so as to obtain a track cluster;

and searching the human body identity according to the track cluster to obtain a temporary identity corresponding to the track cluster, and taking the corresponding relation between the track cluster and the temporary identity as the second track mapping relation.

5. The human body trajectory processing method according to claim 4, wherein the calling a clustering microservice to cluster the registered first human body track and the registered second human body track so as to obtain a track cluster comprises:

respectively extracting features of the first human body track and the second human body track to obtain a first track feature corresponding to the first human body track and a second track feature corresponding to the second human body track;

calculating the similarity between the first track features, the similarity between the second track features and the similarity between the first track features and the second track features, and clustering the first human body track and the second human body track according to the similarity to obtain the track cluster.
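One possible reading of the similarity-based clustering in claim 5 is sketched below. The claim does not fix a clustering algorithm; this sketch assumes cosine similarity, a hypothetical similarity threshold, and single-link grouping implemented with a union-find structure, so it should be read as one example rather than as the claimed method itself.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length feature vectors (assumed measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def cluster_tracks(track_features, threshold=0.8):
    """Group tracks whose pairwise similarity reaches the threshold.

    track_features maps track id -> feature vector and holds the first and
    second human body tracks together, so similarities within each set and
    across the two sets are all compared. The threshold value is hypothetical.
    """
    ids = list(track_features)
    parent = {t: t for t in ids}

    def find(t):
        # Follow parent links to the cluster root, compressing the path.
        while parent[t] != t:
            parent[t] = parent[parent[t]]
            t = parent[t]
        return t

    for i, a in enumerate(ids):
        for b in ids[i + 1:]:
            if cosine_similarity(track_features[a], track_features[b]) >= threshold:
                parent[find(a)] = find(b)

    clusters = {}
    for t in ids:
        clusters.setdefault(find(t), []).append(t)
    return sorted(sorted(members) for members in clusters.values())
```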

6. The human body trajectory processing method according to claim 4, wherein the number of the trajectory clusters is plural, and each trajectory cluster includes one or more human body trajectories;

the searching the human body identity according to the track cluster to obtain a temporary identity corresponding to the track cluster comprises:

obtaining the average quality score of each human body track in each track cluster, and determining a target track cluster according to the average quality score;

acquiring cluster average characteristics corresponding to the target track cluster, and searching in a human body identity library according to the cluster average characteristics to acquire a search score corresponding to the target track cluster;

and determining the temporary identity according to the retrieval score.

7. The human body trajectory processing method according to claim 6, wherein the obtaining an average quality score of each human body trajectory in each trajectory cluster, and determining a target trajectory cluster according to the average quality score includes:

acquiring a human body quality score corresponding to each snapshot image contained in each human body track;

determining the average quality score of each human body track according to the human body quality scores and the number of the snapshot images;

comparing the average quality score of each human body track in each track cluster with a preset quality score threshold;

and when the average quality scores of all human body tracks in the track cluster are greater than or equal to the preset quality score threshold, taking the track cluster as the target track cluster.
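The quality screening of claim 7 can be sketched as follows. The nested dictionary layout and the names `average_quality` and `select_target_clusters` are hypothetical; the average is taken as the sum of the per-snapshot scores divided by the number of snapshot images.

```python
def average_quality(snapshot_scores):
    """Average quality score of one track: sum of snapshot scores / snapshot count."""
    if not snapshot_scores:
        return 0.0  # assumption: a track without snapshots counts as lowest quality
    return sum(snapshot_scores) / len(snapshot_scores)


def select_target_clusters(clusters, quality_threshold):
    """Keep only clusters in which every track reaches the quality threshold.

    clusters is assumed to map cluster id -> {track id -> list of snapshot scores}.
    """
    targets = []
    for cluster_id, tracks in clusters.items():
        if tracks and all(average_quality(scores) >= quality_threshold
                          for scores in tracks.values()):
            targets.append(cluster_id)
    return targets
```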

8. The human body trajectory processing method according to claim 6, wherein the obtaining of the cluster average feature corresponding to the target trajectory cluster includes:

sorting the human body tracks in the target track cluster in descending order of average quality score;

and sequentially obtaining a first preset number of human body tracks from the sorted human body tracks, determining a first average track characteristic according to the track characteristics of the first preset number of human body tracks, and taking the first average track characteristic as the cluster average characteristic.
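A minimal sketch of the cluster average characteristic of claim 8 is given below, assuming each track is represented as a pair of its average quality score and a fixed-length feature vector. The function name and the `top_n` parameter (standing in for the first preset number) are hypothetical.

```python
def cluster_average_feature(tracks, top_n):
    """Average the features of the top_n tracks ranked by average quality score.

    tracks is assumed to be a list of (average_quality_score, feature_vector) pairs.
    """
    ranked = sorted(tracks, key=lambda track: track[0], reverse=True)[:top_n]
    if not ranked:
        raise ValueError("the target track cluster contains no tracks")
    dimension = len(ranked[0][1])
    return [sum(feature[i] for _, feature in ranked) / len(ranked)
            for i in range(dimension)]
```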

9. The human body trajectory processing method according to claim 6 or 8, wherein the human body identity library includes a plurality of human body identity numbers and a plurality of registered human body trajectories corresponding to the human body identity numbers;

the retrieving in the human body identity library according to the cluster average characteristics to obtain a retrieval score corresponding to the target track cluster comprises the following steps:

traversing each human body identity number, and calculating the average quality score of each of the plurality of registered human body tracks corresponding to the human body identity number;

sorting the plurality of registered human body tracks corresponding to the human body identity number in descending order of average quality score, and sequentially acquiring a second preset number of target registered human body tracks;

calculating a second average track characteristic corresponding to the human body identity number according to the track characteristic of the target registered human body track;

and calculating the similarity between the cluster average characteristic and the second average track characteristic to obtain a retrieval score corresponding to the target track cluster.
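The retrieval score of claim 9 can be sketched as below. Cosine similarity is again an assumed measure, the library layout is hypothetical, and `top_n` stands in for the second preset number.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length feature vectors (assumed measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def mean_feature(features):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    return [sum(column) / len(features) for column in zip(*features)]


def retrieval_scores(cluster_feature, identity_library, top_n):
    """Score the cluster average characteristic against every identity number.

    identity_library is assumed to map identity number ->
    list of (average_quality_score, feature_vector) for its registered tracks.
    """
    scores = {}
    for identity, registered in identity_library.items():
        ranked = sorted(registered, key=lambda track: track[0], reverse=True)[:top_n]
        if not ranked:
            continue  # an identity without registered tracks cannot be scored
        second_average = mean_feature([feature for _, feature in ranked])
        scores[identity] = cosine_similarity(cluster_feature, second_average)
    return scores
```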

10. The human body trajectory processing method of claim 9, wherein the determining the temporary identity according to the search score comprises:

acquiring a highest retrieval score corresponding to the target track cluster, and comparing the highest retrieval score with a third preset threshold;

when the highest retrieval score is larger than the third preset threshold, taking the identity number corresponding to the highest retrieval score as the temporary identity;

and when the highest retrieval score is smaller than or equal to the third preset threshold, generating the temporary identity according to the human body track with the highest average quality score in the target track cluster.
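The decision rule of claim 10 can be sketched as follows. How a new temporary identity is generated from the best track is not specified in the claim; deriving it from the track id with a `tmp-` prefix is purely an assumption of this sketch, as are the function and parameter names.

```python
def decide_temporary_identity(retrieval_scores, threshold, track_quality):
    """Reuse an existing identity number or generate a new temporary identity.

    retrieval_scores -- identity number -> retrieval score for the target cluster
    threshold        -- stands in for the third preset threshold
    track_quality    -- track id -> average quality score within the target cluster
    """
    if retrieval_scores:
        best_identity = max(retrieval_scores, key=retrieval_scores.get)
        # Strictly greater than the threshold: reuse the retrieved identity.
        if retrieval_scores[best_identity] > threshold:
            return best_identity
    # Otherwise build a new identity from the highest-quality track.
    best_track = max(track_quality, key=track_quality.get)
    return f"tmp-{best_track}"  # hypothetical naming scheme
```

A score exactly equal to the threshold falls into the second branch, matching the "smaller than or equal to" wording of the claim.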

11. The human body trajectory processing method according to claim 1, wherein the retrieving in the first human body trajectory and the second human body trajectory according to the third human body trajectory to obtain a third trajectory mapping relationship between the third human body trajectory and the first human body trajectory or the second human body trajectory includes:

performing feature extraction on the third human body track to obtain a third track feature, and performing feature extraction on the first human body track and the second human body track respectively to obtain a fourth track feature and a fifth track feature;

calculating the similarity between the third and fourth trajectory features and the similarity between the third and fifth trajectory features;

and mapping the third human body track to the first human body track or the second human body track corresponding to the highest similarity, to obtain the third track mapping relation.
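The retrieval of claim 11 can be sketched as a best-match search of each third human body track against the pooled first and second human body tracks. Cosine similarity and all names below are assumptions of this sketch.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length feature vectors (assumed measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def map_third_tracks(third_features, gallery_features):
    """Map every third track to its most similar first or second track.

    third_features   -- third track id -> third track feature
    gallery_features -- first/second track id -> fourth/fifth track feature
    """
    mapping = {}
    for third_id, third_feature in third_features.items():
        best_id = max(
            gallery_features,
            key=lambda g: cosine_similarity(third_feature, gallery_features[g]),
            default=None,
        )
        if best_id is not None:
            mapping[third_id] = best_id
    return mapping
```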

12. The human body trajectory processing method according to claim 1, further comprising:

and when the same human body track corresponds to a plurality of target permanent identities, determining the final target permanent identity corresponding to the human body track according to a multi-identity voting algorithm.

13. The human body trajectory processing method of claim 12, wherein the determining a final target permanent identity corresponding to the human body trajectory according to a multi-identity voting algorithm comprises:

and filtering the plurality of target permanent identities according to the number of each target permanent identity, the retrieval score corresponding to each target permanent identity, the average quality score corresponding to each target permanent identity and the shooting time of the snapshot image corresponding to each target permanent identity to determine the final target permanent identity.
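Claim 13 lists the four criteria used by the multi-identity voting but not how they are combined. The sketch below assumes one possible combination: candidates are ranked by occurrence count, then by mean retrieval score, then by mean quality score, and finally by the most recent shooting time. This ordering, the record layout and the function name are all hypothetical.

```python
from collections import defaultdict


def vote_final_identity(candidates):
    """Pick one permanent identity for a track that has several candidates.

    candidates is assumed to be a list of dicts with the keys
    "identity", "retrieval_score", "quality" and "timestamp".
    """
    if not candidates:
        raise ValueError("no candidate identities to vote on")

    grouped = defaultdict(list)
    for candidate in candidates:
        grouped[candidate["identity"]].append(candidate)

    def rank(item):
        _, hits = item
        count = len(hits)
        return (
            count,                                           # number of votes
            sum(h["retrieval_score"] for h in hits) / count,  # mean retrieval score
            sum(h["quality"] for h in hits) / count,          # mean quality score
            max(h["timestamp"] for h in hits),                # latest shooting time
        )

    return max(grouped.items(), key=rank)[0]
```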

14. A human body trajectory processing device, comprising:

the track acquisition module is used for acquiring a face track and a first human body track which are obtained according to a video shot by a first shooting device at a field gate, a second human body track which is obtained according to a video shot by a second shooting device at the field gate and a third human body track which is obtained according to a video shot by a third shooting device at an intra-field gate;

the face retrieval module is used for binding the face track and a first human body track matched with the face track to obtain a first track mapping relation, and retrieving in a face database according to the face track to obtain a target permanent identity corresponding to the face track;

the track clustering module is used for clustering the first human body track and the second human body track and determining a temporary identity corresponding to a clustering result so as to obtain a second track mapping relation between the human body track and the temporary identity;

a trajectory retrieval module, configured to retrieve from the first human body trajectory and the second human body trajectory according to the third human body trajectory to obtain a third trajectory mapping relationship between the third human body trajectory and the first human body trajectory or the second human body trajectory;

and the identity binding module is used for binding the target permanent identity with the corresponding human body track based on the first track mapping relation, the second track mapping relation, the third track mapping relation and the target permanent identity.

15. An electronic device, comprising:

one or more processors;

a storage device for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the human body trajectory processing method as claimed in any one of claims 1 to 12.