CN112926356B - Target tracking method and device - Google Patents

Target tracking method and device

Info

Publication number
CN112926356B
CN112926356B (application CN201911236052.8A)
Authority
CN
China
Prior art keywords
frame image
target
frame
kth
image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201911236052.8A
Other languages
Chinese (zh)
Other versions
CN112926356A (en)
Inventor
朱兆琪
董玉新
安山
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Jingdong Century Trading Co Ltd
Beijing Wodong Tianjun Information Technology Co Ltd
Original Assignee
Beijing Jingdong Century Trading Co Ltd
Beijing Wodong Tianjun Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Jingdong Century Trading Co Ltd and Beijing Wodong Tianjun Information Technology Co Ltd
Priority claimed from application CN201911236052.8A
Publication of CN112926356A
Application granted
Publication of CN112926356B
Legal status: Active


Classifications

    • G06V40/161 — Human faces: detection; localisation; normalisation
    • G06T7/246 — Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G06V10/25 — Determination of region of interest [ROI] or a volume of interest [VOI]
    • G06V10/464 — Salient features, e.g. scale invariant feature transforms [SIFT], using a plurality of salient features, e.g. bag-of-words [BoW] representations
    • G06V40/168 — Human faces: feature extraction; face representation
    • G06T2207/10004 — Image acquisition modality: still image; photographic image
    • G06T2207/30201 — Subject of image: human being; face
    • G06V2201/07 — Target detection

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a target tracking method and device, relating to the field of computer technology. One embodiment of the method comprises the following steps: performing target detection on the k-th frame image and determining a target detection frame in it, where k is an integer greater than or equal to 1; based on the detection frame in the k-th frame image, determining a plurality of target key points in the k-th frame image and a plurality of target key points in the (k+1)-th frame image; determining the average displacement between the target key points of the two frames; and, when the average displacement is less than or equal to a threshold, correcting the detection frame in the k-th frame image by the average displacement and using the corrected frame as the target detection frame for the (k+2)-th frame image, thereby realizing target tracking. The method avoids running target detection on every frame — which is time-consuming and cannot meet real-time requirements — so it improves detection efficiency and suits application scenarios with strict real-time requirements.

Description

Target tracking method and device
Technical Field
The present invention relates to the field of computer technologies, and in particular, to a target tracking method and apparatus.
Background
Object tracking is an important component of automatic recognition systems, and the technology is increasingly widely used. It generally refers to searching a given image with some strategy to determine whether it contains a target (e.g., a human face) and, if so, returning the target's position and size.
In implementing the present invention, the inventors found at least the following problems in the prior art: existing target tracking algorithms fall mainly into traditional algorithms and deep-learning algorithms. Traditional algorithms, such as KCF (Kernelized Correlation Filter) and other correlation-filter methods, track a given target by finding the position of maximum filter response in the image. Deep-learning algorithms regress the target's position in the image by extracting features of the target. However, both approaches are computationally expensive and demand high performance; because mobile devices have limited performance, they are difficult to deploy and run in real time on mobile terminals.
Disclosure of Invention
In view of the above, embodiments of the invention provide a target tracking method and device that avoid running target detection on every frame of image — which is time-consuming and cannot meet real-time requirements — thereby improving detection efficiency and suiting application scenarios with strict real-time requirements.
To achieve the above object, according to an aspect of an embodiment of the present invention, there is provided a target tracking method including:
performing target detection on the k-th frame image, and determining a target detection frame in the k-th frame image, where k is an integer greater than or equal to 1;
determining, based on the target detection frame in the k-th frame image, a plurality of target key points in the k-th frame image and a plurality of target key points in the (k+1)-th frame image;
determining the average displacement between the target key points in the k-th frame image and those in the (k+1)-th frame image;
when the average displacement is less than or equal to a threshold, correcting the target detection frame in the k-th frame image by the average displacement, and using the corrected detection frame as the target detection frame for the (k+2)-th frame image, thereby realizing target tracking.
Optionally, the method further comprises: when the average displacement is greater than the threshold, performing target detection on the (k+2)-th frame image and determining a target detection frame in it, thereby realizing target tracking.
Optionally, determining the average displacement between the target key points in the k-th frame image and those in the (k+1)-th frame image comprises:
determining the average position of the target key points in the k-th frame image and the average position of the target key points in the (k+1)-th frame image; and
computing the difference between the two average positions, and taking that difference as the average displacement between the key points of the two frames.
Optionally, correcting the target detection frame in the k-th frame image by the average displacement comprises:
translating the target detection frame in the k-th frame image according to the average displacement.
To achieve the above object, according to another aspect of the embodiments of the present invention, there is provided a target tracking device comprising:
a detection frame determining module for performing target detection on the k-th frame image and determining a target detection frame in it, where k is an integer greater than or equal to 1;
a key point determining module for determining, based on the target detection frame in the k-th frame image, a plurality of target key points in the k-th frame image and a plurality of target key points in the (k+1)-th frame image;
a displacement determining module for determining the average displacement between the target key points in the k-th frame image and those in the (k+1)-th frame image; and
a tracking module for correcting the target detection frame in the k-th frame image by the average displacement when the average displacement is less than or equal to a threshold, and using the corrected detection frame as the target detection frame for the (k+2)-th frame image, thereby realizing target tracking.
Optionally, the tracking module is further configured to: when the average displacement is greater than the threshold, perform target detection on the (k+2)-th frame image and determine a target detection frame in it, thereby realizing target tracking.
Optionally, the displacement determining module is further configured to:
determine the average position of the target key points in the k-th frame image and the average position of the target key points in the (k+1)-th frame image; and
compute the difference between the two average positions, taking it as the average displacement between the key points of the two frames.
Optionally, the tracking module is further configured to: translate the target detection frame in the k-th frame image according to the average displacement.
To achieve the above object, according to still another aspect of the embodiments of the present invention, there is provided an electronic device comprising: one or more processors; and a storage device for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the target tracking method of the embodiments of the invention.
To achieve the above object, according to still another aspect of the embodiments of the present invention, there is provided a computer-readable medium having stored thereon a computer program which, when executed by a processor, implements the target tracking method of the embodiments of the invention.
One embodiment of the above invention has the following advantages or beneficial effects: the target key points in the k-th frame image and in the (k+1)-th frame image are both determined via the k-th frame's target detection frame — that is, the k-th frame's detection frame serves as the (k+1)-th frame's detection frame, so no target detection is run on the (k+1)-th frame. This skips one detection pass, speeds up the whole tracking process, saves time, and improves efficiency. Furthermore, when the average displacement between the key points of the k-th and (k+1)-th frames is less than or equal to the threshold, the k-th frame's detection frame is corrected by the average displacement and used as the (k+2)-th frame's detection frame, so no target detection is run on the (k+2)-th frame either, skipping another detection pass. The target tracking method of the embodiments thus avoids detecting the target in every frame, solving the prior-art problems that per-frame detection is time-consuming and cannot meet real-time requirements, improving detection efficiency, and suiting application scenarios with strict real-time requirements.
Further effects of the above optional implementations are described below in connection with the detailed embodiments.
Drawings
The drawings are included to provide a better understanding of the invention and are not to be construed as unduly limiting the invention. Wherein:
FIG. 1 is a schematic diagram of the main flow of a target tracking method according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of the main modules of an object tracking device according to an embodiment of the present invention;
FIG. 3 is an exemplary system architecture diagram in which embodiments of the present invention may be applied;
Fig. 4 is a schematic diagram of a computer system suitable for use in implementing an embodiment of the invention.
Detailed Description
Exemplary embodiments of the present invention are described below with reference to the accompanying drawings. Various details of the embodiments are included to aid understanding and should be regarded as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications can be made to the embodiments described herein without departing from the scope and spirit of the invention. Descriptions of well-known functions and constructions are likewise omitted below for clarity and conciseness.
FIG. 1 is a schematic diagram of the main flow of a target tracking method according to an embodiment of the invention. As shown in FIG. 1, the method includes:
step S101: performing target detection on a kth frame image, and determining a target detection frame in the kth frame image; wherein k is an integer greater than or equal to 1.
In this embodiment, the target may be a face in the image, or a vehicle or other object in the image; this is not limited here.
In this step, the purpose of target detection is to locate where the target appears in the image and obtain its detection frame. As an example, the position of the detection frame may be obtained with the SSD algorithm (Single Shot MultiBox Detector), a regression-based deep convolutional neural network object detector. Let the detection frame be $(x, y, w, h)_{box}$, where $(x, y)_{box}$ is the position of the frame's upper-left corner and $w_{box}$ and $h_{box}$ are its width and height, respectively.
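The $(x, y, w, h)$ box convention can be made concrete with a small sketch. This is an illustrative stand-in, not the patent's implementation — any real detector (e.g. an SSD model) would produce such a box; here we only model the box itself and a helper that crops the target region out of an image:

```python
from dataclasses import dataclass

@dataclass
class Box:
    # (x, y) is the top-left corner of the detection frame; w, h its size
    x: float
    y: float
    w: float
    h: float

    def crop(self, image):
        """Cut the target region out of an image given as a list of rows."""
        x0, y0 = int(self.x), int(self.y)
        return [row[x0:x0 + int(self.w)] for row in image[y0:y0 + int(self.h)]]

# Toy 100x100 "image"; a detection frame at (10, 20) with size 30x40
image = [[0] * 100 for _ in range(100)]
box = Box(x=10, y=20, w=30, h=40)
patch = box.crop(image)   # the cropped target region, 40 rows of 30 pixels
```

The cropped `patch` is what would be fed to the key point model in step S102.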
Step S102: and respectively determining a plurality of target key points in the kth frame image and a plurality of target key points in the k+1th frame image based on the target detection frame in the kth frame image.
The purpose of this step is to obtain the coordinates of points at specific locations of the target in the image — for example, points at specific locations of a face. In this embodiment, the target key points in the k-th frame image correspond one-to-one with those in the (k+1)-th frame image.
Specifically, the target key points may be obtained as follows: the detection frame produced by the target detection algorithm gives the target's position in the image; the detection frame is used to crop out the target region, yielding an image of the target; and the cropped target image is input into a target key point detection model, which outputs the key points. The key point detection model is a deep learning model obtained from training data — target images and their corresponding key points are used to learn a mapping from images to points, denoted f. At inference time, feeding image data into the model directly yields the key point positions. In a specific embodiment, the number of target key points may be set to 106.
Let $I_k$ be the image input at the k-th frame and f the image-to-keypoint mapping model; target key point detection can then be expressed by formula (1):

$f(I_k) = \{(x_1, y_1), (x_2, y_2), \dots, (x_n, y_n)\}_k$ (1)

where the $(x_i, y_i)$ are the n key points output by the target key point detection model.
In this step, the detection frame of the k-th frame image is reused as the detection frame of the (k+1)-th frame image — that is, no target detection is run on the (k+1)-th frame — which skips one detection pass, speeds up the whole tracking process, saves time, and improves efficiency. Although the k-th frame's detection frame may deviate somewhat from the target's actual position in the (k+1)-th frame, so that the target cropped from the (k+1)-th frame with it is somewhat offset, the key point model has a degree of generalization: even if the target image input to it is offset by some pixels, the model can still output the key point positions correctly.
Step S103: and determining the average displacement of the target key point in the kth frame image and the target key point in the k+1 frame image.
Specifically, this comprises the following steps:
determining the average position of the target key points in the k-th frame image and the average position of the target key points in the (k+1)-th frame image; and
computing the difference between the two average positions, and taking that difference as the average displacement between the key points of the two frames.
The average position of the target key points in the k-th frame image is computed by formula (2):

$\bar{P}_k = \frac{1}{n} \sum_{i=1}^{n} p_i^{(k)}$ (2)

where $\bar{P}_k$ is the average position of the key points in the k-th frame image and $p_i^{(k)}$ is the position of the i-th key point in the k-th frame image.

The average position of the target key points in the (k+1)-th frame image is computed by formula (3):

$\bar{P}_{k+1} = \frac{1}{n} \sum_{i=1}^{n} p_i^{(k+1)}$ (3)

where $\bar{P}_{k+1}$ is the average position of the key points in the (k+1)-th frame image and $p_i^{(k+1)}$ is the position of the i-th key point in the (k+1)-th frame image.

The displacement difference between the two average positions is computed by formula (4):

$\Delta P = \bar{P}_{k+1} - \bar{P}_k$ (4)

where $\Delta P$ is the displacement difference between the average key point position in the (k+1)-th frame image and that in the k-th frame image.
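Formulas (2)–(4) amount to a mean over point coordinates followed by one subtraction. A minimal sketch in plain Python (the key point lists here are toy data, not model outputs):

```python
def average_position(points):
    """Mean (x, y) over a list of key points — formulas (2) and (3)."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    return (mx, my)

def average_displacement(points_k, points_k1):
    """Displacement of the mean key point from frame k to k+1 — formula (4)."""
    mx0, my0 = average_position(points_k)
    mx1, my1 = average_position(points_k1)
    return (mx1 - mx0, my1 - my0)

# Toy example: every key point shifts by (+2, -1) between the two frames,
# so the average displacement is (+2, -1) as well.
pts_k = [(10.0, 20.0), (30.0, 40.0), (50.0, 60.0)]
pts_k1 = [(x + 2.0, y - 1.0) for x, y in pts_k]
dx, dy = average_displacement(pts_k, pts_k1)
```

Averaging over all key points (106 for faces, per the description) makes the estimate robust to noise in any single key point.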
Step S104: and judging the magnitude of the average displacement and the threshold value. In this step, the threshold may be flexibly set according to the application scenario, which is not limited in this disclosure.
Step S105: when the average displacement is smaller than or equal to a threshold value, correcting the target detection frame in the kth frame image through the average displacement, and taking the corrected target detection frame as the target detection frame of the (k+2) th frame image to realize target tracking.
In this embodiment, if the average displacement is less than or equal to the threshold, the target has moved little between the k-th and (k+1)-th frame images, and the detection frame of the k-th frame is translated by the average displacement, as in formula (5):

$(x', y', w, h)_{box} = (x + \Delta x,\ y + \Delta y,\ w,\ h)_{box}$ (5)

where $(x, y, w, h)_{box}$ is the position of the detection frame in the k-th frame image, $\Delta P = (\Delta x, \Delta y)$ is the average displacement from formula (4), and $(x', y', w, h)_{box}$ is the position of the corrected detection frame.
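Formula (5) is a pure translation of the box: only the corner moves, the size is unchanged. A minimal sketch, with box tuples following the (x, y, w, h) convention from step S101:

```python
def correct_box(box, displacement):
    """Formula (5): translate the frame-k detection box by the average
    key point displacement; width and height stay unchanged."""
    x, y, w, h = box
    dx, dy = displacement
    return (x + dx, y + dy, w, h)

# A box at (100, 80) of size 64x64, corrected by displacement (+2, -1)
box_k = (100.0, 80.0, 64.0, 64.0)
box_k2 = correct_box(box_k, (2.0, -1.0))   # box for frame k+2
```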
Step S106: and when the average displacement is larger than a threshold value, performing target detection on the k+2 frame image, and determining a target detection frame in the k+2 frame image so as to realize target tracking.
In this embodiment, if the average displacement is greater than the threshold value, it is explained that the target in the k+1th frame image is more moving than the k frame image, and the target detection needs to be performed again for the k+2th frame image to acquire the target detection frame in the k+2th frame image.
According to the target tracking method above, the target key points in the k-th and (k+1)-th frame images are both determined via the k-th frame's detection frame — i.e., the k-th frame's detection frame serves as the (k+1)-th frame's detection frame, so no target detection is run on the (k+1)-th frame, one detection pass is skipped, the whole tracking process is faster, time is saved, and efficiency improves. Likewise, when the average displacement between the key points of the two frames is less than or equal to the threshold, the k-th frame's detection frame, corrected by the average displacement, serves as the (k+2)-th frame's detection frame, so no detection is run on the (k+2)-th frame either. The method thus avoids detecting the target in every frame, solving the prior-art problems that per-frame detection is time-consuming and cannot meet real-time requirements, improving detection efficiency, and suiting application scenarios with strict real-time requirements.
To make the target tracking method of the embodiments clearer, the processing flow is described again using face tracking as an example:
(1) For the k-th frame image: obtain a face frame with the SSD detection algorithm, crop the face image out with that frame, input the cropped face into the face key point detection model to obtain the face key points, and compute their average position;
(2) For the (k+1)-th frame image: crop the face region from the (k+1)-th frame using the face frame from the k-th frame, input the cropped face into the face key point detection model to obtain the face key points, and compute their average position. Note that the face frame used here is the k-th frame's face frame, not one detected on the (k+1)-th frame — the (k+1)-th frame is not run through the SSD detector, so one SSD detection pass is saved on this frame. Although the k-th frame's face frame may deviate somewhat from the face's actual position in the (k+1)-th frame, so that the cropped face is somewhat offset, the face key point model generalizes well enough that it can still output correct key point positions even when the input face is offset by some pixels.
(3) Compute the average displacement of the face key points between the k-th and (k+1)-th frame images. If it is less than or equal to the threshold, the face has moved little between the two frames, and the k-th frame's face frame, corrected by the average displacement, is used as the face frame for the (k+2)-th frame image. If the average displacement is greater than the threshold, the face has moved substantially; merely translating the k-th frame's face frame by the average displacement could yield a frame that no longer crops the face correctly from the (k+2)-th frame, so the face key point model could not be guaranteed a usable input. Therefore, when the average displacement exceeds the threshold, face detection is run again on the (k+2)-th frame image, so that even under fast face motion the face frame corresponds exactly to the image, the cropped face position is correct, and the face key point model's output remains reliable.
Note that the corrected face frame reflects the face's position as of the (k+1)-th frame, so relative to the (k+2)-th frame image it may not be centered exactly on the face. However, owing to the generalization capability of the face key point model, correct key point positions are still output even when the input face is offset by some pixels. By this means, the face tracking method of the embodiments reduces how often face detection must be run.
(4) For the (k+2)-th frame image, the flow is:
a. If the average displacement of the face key points between the (k+1)-th and k-th frames is less than or equal to the threshold, the face has moved little; the k-th frame's face frame is translated by the average displacement to obtain the face frame for the (k+2)-th frame, that frame is used to crop the (k+2)-th frame image, and the cropped face is input into the face key point model;
b. If the average displacement is greater than the threshold, a merely translated frame would be inaccurate: cropping the (k+2)-th frame with it could produce a large offset, and the face key point model could not return a correct result. Therefore, in this case SSD detection is run directly on the (k+2)-th frame image, so that the resulting face frame is definitely the face's position in the (k+2)-th frame.
According to the face tracking example above, the face key points in the k-th and (k+1)-th frame images are both determined via the k-th frame's face frame — i.e., the k-th frame's face frame serves as the (k+1)-th frame's face frame, so no face detection is run on the (k+1)-th frame, one detection pass is skipped, the whole face tracking process is faster, and efficiency improves. When the average key point displacement between the two frames is less than or equal to the threshold, the corrected face frame likewise serves as the (k+2)-th frame's face frame, so no detection is run on that frame either. The method thus avoids face detection on every frame, solving the prior-art problems that per-frame detection is time-consuming and cannot meet real-time requirements, improving detection efficiency, and suiting applications with strict real-time requirements.
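The decision flow just described can be sketched as a loop. This is a schematic, not the patent's implementation: `detect` and `keypoints` are hypothetical placeholders for the SSD detector and the face key point model, the loop processes frames in pairs, the threshold test here is per-axis (a real implementation might compare the displacement magnitude instead), and re-detection happens only when the motion is too large:

```python
def track(frames, detect, keypoints, threshold):
    """Sketch of the tracking loop: detect on frame 0, then reuse/correct the
    box, falling back to detection only on large motion.
    detect(frame) -> (x, y, w, h); keypoints(frame, box) -> [(x, y), ...]."""
    def mean(pts):
        n = len(pts)
        return (sum(p[0] for p in pts) / n, sum(p[1] for p in pts) / n)

    boxes = []                   # (frame index, box used for that frame)
    box = detect(frames[0])      # full target detection on the first frame only
    k = 0
    while k + 1 < len(frames):
        # The frame-k box is reused for frame k+1: no detection pass there.
        pts_k = keypoints(frames[k], box)
        pts_k1 = keypoints(frames[k + 1], box)
        boxes.append((k, box))
        boxes.append((k + 1, box))
        if k + 2 >= len(frames):
            break
        # Average displacement of the key points between the two frames.
        (mx0, my0), (mx1, my1) = mean(pts_k), mean(pts_k1)
        dx, dy = mx1 - mx0, my1 - my0
        if abs(dx) <= threshold and abs(dy) <= threshold:
            # Small motion: translate the box (formula (5)) for frame k+2.
            box = (box[0] + dx, box[1] + dy, box[2], box[3])
        else:
            # Large motion: run target detection again on frame k+2.
            box = detect(frames[k + 2])
        k += 2
    return boxes

# Toy run: the key points drift +1 px per frame, under the threshold,
# so the box is corrected by translation rather than re-detected.
result = track(
    frames=[0, 1, 2, 3],
    detect=lambda f: (0.0, 0.0, 10.0, 10.0),
    keypoints=lambda f, box: [(float(f), 0.0), (float(f) + 2.0, 0.0)],
    threshold=1.5,
)
```

In this sketch the expensive `detect` call runs once for frames 0–3; every other frame costs only a key point pass, which is the efficiency gain the embodiments claim.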
Fig. 2 is a schematic diagram of main modules of an object tracking device 200 according to an embodiment of the present invention, and as shown in fig. 2, the device 200 includes:
a detection frame determining module 201, configured to perform target detection on a kth frame image, and determine a target detection frame in the kth frame image; wherein k is an integer greater than or equal to 1;
a key point determining module 202, configured to determine a plurality of target key points in the kth frame image and a plurality of target key points in the k+1th frame image respectively, based on the target detection frame in the kth frame image;
a displacement determining module 203, configured to determine an average displacement between the target key point in the kth frame image and the target key point in the k+1th frame image;
and a tracking module 204, configured to correct the target detection frame in the kth frame image according to the average displacement when the average displacement is less than or equal to the threshold value, and to take the corrected target detection frame as the target detection frame of the k+2th frame image to realize target tracking.
In an alternative embodiment, the tracking module 204 is further configured to: and when the average displacement is larger than a threshold value, performing target detection on the k+2 frame image, and determining a target detection frame in the k+2 frame image so as to realize target tracking.
In an alternative embodiment, the displacement determination module 203 is further configured to:
respectively determining the average positions of a plurality of target key points in the kth frame image and the average positions of a plurality of target key points in the k+1th frame image;
and calculating the displacement difference between the average position of the target key points in the k+1th frame image and the average position of the target key points in the kth frame image, and taking the displacement difference as the average displacement of the target key points in the kth frame image and the target key points in the k+1th frame image.
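As a sketch of this computation (assuming the key points are given as (x, y) coordinate arrays; `average_displacement` is a hypothetical name, not from the patent):

```python
import numpy as np

def average_displacement(kpts_k: np.ndarray, kpts_k1: np.ndarray) -> np.ndarray:
    """Displacement between the mean key point positions of two frames.

    kpts_k, kpts_k1: (N, 2) arrays of (x, y) key points detected in
    frame k and frame k+1.
    Returns the (dx, dy) difference: mean position in k+1 minus mean in k.
    """
    return kpts_k1.mean(axis=0) - kpts_k.mean(axis=0)
```

Averaging before differencing makes the result robust to jitter in any single key point, since only the centroid of the point set matters.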
In an alternative embodiment, the tracking module 204 is further configured to: and translating the target detection frame in the kth frame of image according to the average displacement.
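Translating the detection frame then amounts to adding the displacement to every box coordinate; a minimal sketch, assuming an (x1, y1, x2, y2) corner convention for the box (the patent does not fix a representation):

```python
def translate_box(box, displacement):
    """Shift an (x1, y1, x2, y2) box by a (dx, dy) displacement."""
    x1, y1, x2, y2 = box
    dx, dy = displacement
    return (x1 + dx, y1 + dy, x2 + dx, y2 + dy)
```

Note the box size is unchanged; only its position moves, which is exactly the correction the tracking module applies.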
According to the target tracking device, a plurality of target key points in the kth frame image and in the k+1th frame image are determined from the target detection frame in the kth frame image; that is, the target detection frame in the kth frame image is reused as the target detection frame of the k+1th frame image, so that no target detection is performed on the k+1th frame image. This skips a detection pass, accelerates the overall target tracking flow, saves time and improves efficiency. When the average displacement between the target key points in the kth frame image and those in the k+1th frame image is less than or equal to the threshold value, the target detection frame in the kth frame image is corrected by the average displacement, and the corrected frame is used as the target detection frame of the k+2th frame image to realize target tracking; that is, no target detection is performed on the k+2th frame image either, again skipping a detection pass and speeding up the whole flow. Therefore, the target tracking device provided by the embodiment of the invention avoids running target detection on every frame, solving the prior-art problems that per-frame target detection is time-consuming and cannot meet real-time requirements, improving detection efficiency and making the device suitable for application scenarios with high real-time requirements.
The device can execute the method provided by the embodiment of the invention, and has the functional modules corresponding to the method and the beneficial effects of executing it. For technical details not described in detail in this embodiment, reference may be made to the methods provided in the embodiments of the present invention.
Fig. 3 illustrates an exemplary system architecture 300 to which the target tracking method or target tracking apparatus of embodiments of the invention may be applied.
As shown in fig. 3, the system architecture 300 may include terminal devices 301, 302, 303, a network 304, and a server 305. The network 304 is used as a medium to provide communication links between the terminal devices 301, 302, 303 and the server 305. The network 304 may include various connection types, such as wired, wireless communication links, or fiber optic cables, among others.
A user may interact with the server 305 via the network 304 using the terminal devices 301, 302, 303 to receive or send messages or the like. Various communication client applications, such as shopping class applications, web browser applications, search class applications, instant messaging tools, mailbox clients, social platform software, etc., may be installed on the terminal devices 301, 302, 303.
The terminal devices 301, 302, 303 may be a variety of electronic devices having a display screen and supporting web browsing, including but not limited to smartphones, tablets, laptop and desktop computers, and the like.
The server 305 may be a server providing various services, such as a background management server providing support for shopping-type websites browsed by the user using the terminal devices 301, 302, 303. The background management server can analyze and otherwise process received data such as a product information query request, and feed back processing results (such as target push information or product information) to the terminal device.
It should be noted that, the object tracking method provided in the embodiment of the present invention is generally executed by the server 305, and accordingly, the object tracking device is generally disposed in the server 305.
It should be understood that the number of terminal devices, networks and servers in fig. 3 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
Referring now to FIG. 4, there is illustrated a schematic diagram of a computer system 400 suitable for use in implementing an embodiment of the present invention. The terminal device shown in fig. 4 is only an example, and should not impose any limitation on the functions and the scope of use of the embodiment of the present invention.
As shown in fig. 4, the computer system 400 includes a Central Processing Unit (CPU) 401, which can perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) 402 or a program loaded from a storage section 408 into a Random Access Memory (RAM) 403. In RAM 403, various programs and data required for the operation of system 400 are also stored. The CPU 401, ROM 402, and RAM 403 are connected to each other by a bus 404. An input/output (I/O) interface 405 is also connected to bus 404.
The following components are connected to the I/O interface 405: an input section 406 including a keyboard, a mouse, and the like; an output portion 407 including a Cathode Ray Tube (CRT), a Liquid Crystal Display (LCD), and the like, and a speaker, and the like; a storage section 408 including a hard disk or the like; and a communication section 409 including a network interface card such as a LAN card, a modem, or the like. The communication section 409 performs communication processing via a network such as the internet. The drive 410 is also connected to the I/O interface 405 as needed. A removable medium 411 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is installed on the drive 410 as needed, so that a computer program read therefrom is installed into the storage section 408 as needed.
In particular, according to embodiments of the present disclosure, the processes described above with reference to flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method shown in the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network via the communication portion 409 and/or installed from the removable medium 411. The above-described functions defined in the system of the present invention are performed when the computer program is executed by a Central Processing Unit (CPU) 401.
The computer readable medium shown in the present invention may be a computer readable signal medium or a computer readable storage medium, or any combination of the two. The computer readable storage medium can be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples of the computer-readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In the present invention, however, the computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, with the computer-readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The modules involved in the embodiments of the present invention may be implemented in software or in hardware. The described modules may also be provided in a processor, for example, described as: a processor including a sending module, an obtaining module, a determining module, and a first processing module. The names of these modules do not, in some cases, constitute a limitation on the modules themselves; for example, the sending module may also be described as "a module that sends a picture acquisition request to a connected server".
As another aspect, the present invention also provides a computer-readable medium, which may be contained in the apparatus described in the above embodiments, or may exist alone without being assembled into the apparatus. The computer readable medium carries one or more programs which, when executed by a device, cause the device to:
Performing target detection on a kth frame image, and determining a target detection frame in the kth frame image; wherein k is an integer greater than or equal to 1;
based on the target detection frame in the kth frame image, respectively determining a plurality of target key points in the kth frame image and a plurality of target key points in the k+1th frame image;
determining the average displacement of a target key point in the kth frame image and a target key point in the k+1th frame image;
When the average displacement is smaller than or equal to a threshold value, correcting the target detection frame in the kth frame image through the average displacement, and taking the corrected target detection frame as the target detection frame of the (k+2) th frame image to realize target tracking.
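The four steps above can be sketched as a tracking loop. The `detect` and `keypoints` callables below stand in for whatever detector (e.g. an SSD model) and key point model an implementation uses, and how the cycle resumes after frame k+2 is not specified in the text, so the restart chosen below is an assumption:

```python
import numpy as np

def track(frames, detect, keypoints, threshold):
    """Sketch of the claimed flow; yields the box used for each frame.

    detect(frame) -> (x1, y1, x2, y2) target box (stand-in for SSD);
    keypoints(frame, box) -> (N, 2) array of target key points.
    The detector runs on frame k, is skipped for frame k+1, and runs
    again for frame k+2 only when the average displacement is large.
    """
    k = 0
    while k < len(frames):
        box = detect(frames[k])                  # full detection on frame k
        yield box
        if k + 1 >= len(frames):
            break
        kpts_k = keypoints(frames[k], box)       # key points in frame k
        kpts_k1 = keypoints(frames[k + 1], box)  # same box reused for k+1
        yield box
        disp = kpts_k1.mean(axis=0) - kpts_k.mean(axis=0)
        if k + 2 < len(frames) and np.linalg.norm(disp) <= threshold:
            # Small motion: correct the box by translation, use it for k+2.
            x1, y1, x2, y2 = box
            yield (x1 + disp[0], y1 + disp[1], x2 + disp[0], y2 + disp[1])
            k += 3  # assumed restart: detect again on the next frame
        else:
            k += 2  # large motion: detect() will run on frame k+2
```

In the small-motion case, three consecutive frames cost only one detector invocation, which is the source of the claimed speed-up.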
According to the technical scheme, a plurality of target key points in the kth frame image and in the k+1th frame image are determined from the target detection frame in the kth frame image; that is, the target detection frame in the kth frame image is reused as the target detection frame of the k+1th frame image, so that no target detection is performed on the k+1th frame image. This skips a detection pass, accelerates the overall target tracking flow, saves time and improves efficiency. When the average displacement between the target key points in the kth frame image and those in the k+1th frame image is less than or equal to the threshold value, the target detection frame in the kth frame image is corrected by the average displacement, and the corrected frame is used as the target detection frame of the k+2th frame image to realize target tracking; that is, no target detection is performed on the k+2th frame image either, again skipping a detection pass and speeding up the whole flow. Therefore, the target tracking method provided by the embodiment of the invention avoids running target detection on every frame, solving the prior-art problems that per-frame target detection is time-consuming and cannot meet real-time requirements, improving detection efficiency and making the method suitable for application scenarios with high real-time requirements.
The above embodiments do not limit the scope of the present invention. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives can occur depending upon design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present invention should be included in the scope of the present invention.

Claims (7)

1. A target tracking method, comprising:
Performing target detection on a kth frame image, and determining a target detection frame in the kth frame image; wherein k is an integer greater than or equal to 1;
based on the target detection frame in the kth frame image, respectively determining a plurality of target key points in the kth frame image and a plurality of target key points in the k+1th frame image;
determining the average displacement of a target key point in the kth frame image and a target key point in the k+1th frame image;
When the average displacement is smaller than or equal to a threshold value, correcting the target detection frame in the kth frame image through the average displacement, and taking the corrected target detection frame as the target detection frame of the (k+2) th frame image to realize target tracking.
2. The method according to claim 1, wherein the method further comprises:
and when the average displacement is larger than a threshold value, performing target detection on the k+2 frame image, and determining a target detection frame in the k+2 frame image so as to realize target tracking.
3. The method of claim 1, wherein determining an average displacement of a target keypoint in the kth frame image from a target keypoint in the k+1 frame image comprises:
respectively determining the average positions of a plurality of target key points in the kth frame image and the average positions of a plurality of target key points in the k+1th frame image;
And calculating displacement differences between the average positions of the target key points in the k+1th frame image and the average positions of the target key points in the k frame image, and taking the displacement differences as the average displacement of the target key points in the k frame image and the target key points in the k+1th frame image.
4. The method of claim 1, wherein correcting the object detection box in the kth frame image by the average displacement comprises:
and translating the target detection frame in the kth frame of image according to the average displacement.
5. An object tracking device, comprising:
the detection frame determining module is used for carrying out target detection on the kth frame image and determining a target detection frame in the kth frame image; wherein k is an integer greater than or equal to 1;
the key point determining module is used for respectively determining a plurality of target key points in the kth frame image and a plurality of target key points in the k+1th frame image based on the target detection frame in the kth frame image;
the displacement determining module is used for determining average displacement between the target key point in the kth frame image and the target key point in the (k+1) th frame image;
And the tracking module is used for correcting the target detection frame in the kth frame image through the average displacement when the average displacement is smaller than or equal to a threshold value, and taking the corrected target detection frame as the target detection frame of the (k+2) th frame image so as to realize target tracking.
6. An electronic device, comprising:
One or more processors;
Storage means for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any of claims 1-4.
7. A computer readable medium, on which a computer program is stored, characterized in that the program, when being executed by a processor, implements the method according to any of claims 1-4.
CN201911236052.8A 2019-12-05 2019-12-05 Target tracking method and device Active CN112926356B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911236052.8A CN112926356B (en) 2019-12-05 2019-12-05 Target tracking method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911236052.8A CN112926356B (en) 2019-12-05 2019-12-05 Target tracking method and device

Publications (2)

Publication Number Publication Date
CN112926356A CN112926356A (en) 2021-06-08
CN112926356B true CN112926356B (en) 2024-06-18

Family

ID=76161900

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911236052.8A Active CN112926356B (en) 2019-12-05 2019-12-05 Target tracking method and device

Country Status (1)

Country Link
CN (1) CN112926356B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103455797A (en) * 2013-09-07 2013-12-18 西安电子科技大学 Detection and tracking method of moving small target in aerial shot video
CN109214245A (en) * 2017-07-03 2019-01-15 株式会社理光 A kind of method for tracking target, device, equipment and computer readable storage medium

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2007088759A1 (en) * 2006-02-01 2007-08-09 National University Corporation The University Of Electro-Communications Displacement detection method, displacement detection device, displacement detection program, characteristic point matching method, and characteristic point matching program
CN103077532A (en) * 2012-12-24 2013-05-01 天津市亚安科技股份有限公司 Real-time video object quick tracking method
CN106846362B (en) * 2016-12-26 2020-07-24 歌尔科技有限公司 Target detection tracking method and device
KR101837407B1 (en) * 2017-11-03 2018-03-12 국방과학연구소 Apparatus and method for image-based target tracking
CN110400332B (en) * 2018-04-25 2021-11-05 杭州海康威视数字技术股份有限公司 Target detection tracking method and device and computer equipment
CN109003245B (en) * 2018-08-21 2021-06-04 厦门美图之家科技有限公司 Coordinate processing method and device and electronic equipment
CN110349190B (en) * 2019-06-10 2023-06-06 广州视源电子科技股份有限公司 Adaptive learning target tracking method, device, equipment and readable storage medium
CN110378264B (en) * 2019-07-08 2023-04-18 Oppo广东移动通信有限公司 Target tracking method and device

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103455797A (en) * 2013-09-07 2013-12-18 西安电子科技大学 Detection and tracking method of moving small target in aerial shot video
CN109214245A (en) * 2017-07-03 2019-01-15 株式会社理光 A kind of method for tracking target, device, equipment and computer readable storage medium

Also Published As

Publication number Publication date
CN112926356A (en) 2021-06-08

Similar Documents

Publication Publication Date Title
US10762387B2 (en) Method and apparatus for processing image
CN109308469B (en) Method and apparatus for generating information
US20190197703A1 (en) Method and apparatus for tracking target profile in video
CN109255337B (en) Face key point detection method and device
CN110188719B (en) Target tracking method and device
US11915447B2 (en) Audio acquisition device positioning method and apparatus, and speaker recognition method and system
US20210200971A1 (en) Image processing method and apparatus
CN110619807B (en) Method and device for generating global thermodynamic diagram
CN110059623B (en) Method and apparatus for generating information
CN111815738B (en) Method and device for constructing map
CN110288625B (en) Method and apparatus for processing image
CN110956131B (en) Single-target tracking method, device and system
CN111192312B (en) Depth image acquisition method, device, equipment and medium based on deep learning
CN111160410B (en) Object detection method and device
CN110110666A (en) Object detection method and device
CN110717405B (en) Face feature point positioning method, device, medium and electronic equipment
CN109919220B (en) Method and apparatus for generating feature vectors of video
CN110321454B (en) Video processing method and device, electronic equipment and computer readable storage medium
CN112926356B (en) Target tracking method and device
CN109034085B (en) Method and apparatus for generating information
CN108446737B (en) Method and device for identifying objects
CN113362090A (en) User behavior data processing method and device
CN113642493B (en) Gesture recognition method, device, equipment and medium
CN115393423A (en) Target detection method and device
CN113033377A (en) Character position correction method, character position correction device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant