CN112926356B - Target tracking method and device - Google Patents

Target tracking method and device

Info

Publication number
CN112926356B
CN112926356B (application CN201911236052.8A)
Authority
CN
China
Prior art keywords
frame image
target
frame
kth
image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201911236052.8A
Other languages
Chinese (zh)
Other versions
CN112926356A (en)
Inventor
朱兆琪
董玉新
安山
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Jingdong Century Trading Co Ltd
Beijing Wodong Tianjun Information Technology Co Ltd
Original Assignee
Beijing Jingdong Century Trading Co Ltd
Beijing Wodong Tianjun Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Jingdong Century Trading Co Ltd and Beijing Wodong Tianjun Information Technology Co Ltd
Priority claimed from application CN201911236052.8A
Publication of CN112926356A
Application granted
Publication of CN112926356B
Legal status: Active


Classifications

    • G06V40/161 — Human faces: detection; localisation; normalisation
    • G06T7/246 — Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G06V10/25 — Determination of region of interest [ROI] or a volume of interest [VOI]
    • G06V10/464 — Salient features, e.g. scale invariant feature transforms [SIFT], using a plurality of salient features, e.g. bag-of-words [BoW] representations
    • G06V40/168 — Human faces: feature extraction; face representation
    • G06T2207/10004 — Image acquisition modality: still image; photographic image
    • G06T2207/30201 — Subject of image: human being; face
    • G06V2201/07 — Target detection

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a target tracking method and device, relating to the field of computer technology. One embodiment of the method comprises the following steps: performing target detection on the k-th frame image and determining a target detection frame in it, where k is an integer greater than or equal to 1; based on the detection frame in the k-th frame image, determining a plurality of target key points in the k-th frame image and a plurality of target key points in the (k+1)-th frame image; determining the average displacement between the target key points of the two frames; and, when the average displacement is less than or equal to a threshold, correcting the detection frame in the k-th frame image by the average displacement and using the corrected frame as the target detection frame for the (k+2)-th frame image, thereby realizing target tracking. The method avoids running target detection on every frame — which is time-consuming and cannot meet real-time requirements — so it improves detection efficiency and suits application scenarios with strict real-time requirements.

Description

Target tracking method and device
Technical Field
The present invention relates to the field of computer technologies, and in particular, to a target tracking method and apparatus.
Background
Object tracking is an important component of automatic recognition systems, and the technology is increasingly widely used. It generally refers to searching a given image with some strategy to determine whether it contains a target (e.g., a human face) and, if so, returning the target's position and size.
In implementing the present invention, the inventors found at least the following problems in the prior art: existing target tracking algorithms fall mainly into traditional algorithms and deep-learning algorithms. Traditional algorithms, such as KCF (Kernelized Correlation Filter) and other correlation-filter methods, track a given target by finding the position of maximum filter response in the image. Deep-learning algorithms regress the target's position in the image by extracting features of the target. However, both approaches are computationally expensive and demand high performance; because mobile devices have limited performance, they are difficult to deploy and run in real time on mobile terminals.
Disclosure of Invention
In view of the above, embodiments of the invention provide a target tracking method and device that avoid running target detection on every frame of image — which is time-consuming and cannot meet real-time requirements — thereby improving detection efficiency and suiting application scenarios with strict real-time requirements.
To achieve the above object, according to an aspect of an embodiment of the present invention, there is provided a target tracking method including:
performing target detection on the k-th frame image, and determining a target detection frame in the k-th frame image, where k is an integer greater than or equal to 1;
determining, based on the target detection frame in the k-th frame image, a plurality of target key points in the k-th frame image and a plurality of target key points in the (k+1)-th frame image;
determining the average displacement between the target key points in the k-th frame image and those in the (k+1)-th frame image;
when the average displacement is less than or equal to a threshold, correcting the target detection frame in the k-th frame image by the average displacement, and using the corrected detection frame as the target detection frame for the (k+2)-th frame image, thereby realizing target tracking.
Optionally, the method further comprises: when the average displacement is greater than the threshold, performing target detection on the (k+2)-th frame image and determining a target detection frame in it, thereby realizing target tracking.
Optionally, determining the average displacement between the target key points in the k-th frame image and those in the (k+1)-th frame image comprises:
determining the average position of the target key points in the k-th frame image and the average position of the target key points in the (k+1)-th frame image; and
computing the difference between the two average positions, and taking that difference as the average displacement between the key points of the two frames.
Optionally, correcting the target detection frame in the k-th frame image by the average displacement comprises:
translating the target detection frame in the k-th frame image according to the average displacement.
To achieve the above object, according to another aspect of the embodiments of the present invention, there is provided a target tracking device comprising:
a detection frame determining module for performing target detection on the k-th frame image and determining a target detection frame in it, where k is an integer greater than or equal to 1;
a key point determining module for determining, based on the target detection frame in the k-th frame image, a plurality of target key points in the k-th frame image and a plurality of target key points in the (k+1)-th frame image;
a displacement determining module for determining the average displacement between the target key points in the k-th frame image and those in the (k+1)-th frame image; and
a tracking module for correcting the target detection frame in the k-th frame image by the average displacement when the average displacement is less than or equal to a threshold, and using the corrected detection frame as the target detection frame for the (k+2)-th frame image, thereby realizing target tracking.
Optionally, the tracking module is further configured to: when the average displacement is greater than the threshold, perform target detection on the (k+2)-th frame image and determine a target detection frame in it, thereby realizing target tracking.
Optionally, the displacement determining module is further configured to:
determine the average position of the target key points in the k-th frame image and the average position of the target key points in the (k+1)-th frame image; and
compute the difference between the two average positions, taking it as the average displacement between the key points of the two frames.
Optionally, the tracking module is further configured to: translate the target detection frame in the k-th frame image according to the average displacement.
To achieve the above object, according to still another aspect of the embodiments of the present invention, there is provided an electronic device comprising: one or more processors; and a storage device for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the target tracking method of the embodiments of the invention.
To achieve the above object, according to still another aspect of the embodiments of the present invention, there is provided a computer-readable medium having stored thereon a computer program which, when executed by a processor, implements the target tracking method of the embodiments of the invention.
One embodiment of the above invention has the following advantages or beneficial effects: the target key points in the k-th frame image and in the (k+1)-th frame image are both determined via the k-th frame's target detection frame — that is, the k-th frame's detection frame serves as the (k+1)-th frame's detection frame, so no target detection is run on the (k+1)-th frame. This skips one detection pass, speeds up the whole tracking process, saves time, and improves efficiency. Furthermore, when the average displacement between the key points of the k-th and (k+1)-th frames is less than or equal to the threshold, the k-th frame's detection frame is corrected by the average displacement and used as the (k+2)-th frame's detection frame, so no target detection is run on the (k+2)-th frame either, skipping another detection pass. The target tracking method of the embodiments thus avoids detecting the target in every frame, solving the prior-art problems that per-frame detection is time-consuming and cannot meet real-time requirements, improving detection efficiency, and suiting application scenarios with strict real-time requirements.
Further effects of the above optional implementations are described below in connection with the detailed embodiments.
Drawings
The drawings are included to provide a better understanding of the invention and are not to be construed as unduly limiting the invention. Wherein:
FIG. 1 is a schematic diagram of the main flow of a target tracking method according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of the main modules of an object tracking device according to an embodiment of the present invention;
FIG. 3 is an exemplary system architecture diagram in which embodiments of the present invention may be applied;
Fig. 4 is a schematic diagram of a computer system suitable for use in implementing an embodiment of the invention.
Detailed Description
Exemplary embodiments of the present invention are described below with reference to the accompanying drawings. Various details of the embodiments are included to aid understanding and should be regarded as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications can be made to the embodiments described herein without departing from the scope and spirit of the invention. Descriptions of well-known functions and constructions are likewise omitted below for clarity and conciseness.
FIG. 1 is a schematic diagram of the main flow of a target tracking method according to an embodiment of the invention. As shown in FIG. 1, the method includes:
step S101: performing target detection on a kth frame image, and determining a target detection frame in the kth frame image; wherein k is an integer greater than or equal to 1.
In this embodiment, the target may be a face in the image, or a vehicle or other object in the image; this is not limited here.
In this step, the purpose of target detection is to locate where the target appears in the image and obtain its detection frame. As an example, the position of the detection frame may be obtained with the SSD algorithm (Single Shot MultiBox Detector), a regression-based deep convolutional neural network object detector. Let the detection frame be $(x, y, w, h)_{box}$, where $(x, y)_{box}$ is the position of the frame's upper-left corner and $w_{box}$ and $h_{box}$ are its width and height, respectively.
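The $(x, y, w, h)$ box convention can be made concrete with a small sketch. This is an illustrative stand-in, not the patent's implementation — any real detector (e.g. an SSD model) would produce such a box; here we only model the box itself and a helper that crops the target region out of an image:

```python
from dataclasses import dataclass

@dataclass
class Box:
    # (x, y) is the top-left corner of the detection frame; w, h its size
    x: float
    y: float
    w: float
    h: float

    def crop(self, image):
        """Cut the target region out of an image given as a list of rows."""
        x0, y0 = int(self.x), int(self.y)
        return [row[x0:x0 + int(self.w)] for row in image[y0:y0 + int(self.h)]]

# Toy 100x100 "image"; a detection frame at (10, 20) with size 30x40
image = [[0] * 100 for _ in range(100)]
box = Box(x=10, y=20, w=30, h=40)
patch = box.crop(image)   # the cropped target region, 40 rows of 30 pixels
```

The cropped `patch` is what would be fed to the key point model in step S102.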
Step S102: and respectively determining a plurality of target key points in the kth frame image and a plurality of target key points in the k+1th frame image based on the target detection frame in the kth frame image.
The purpose of this step is to obtain the coordinates of points at specific locations of the target in the image — for example, points at specific locations of a face. In this embodiment, the target key points in the k-th frame image correspond one-to-one with those in the (k+1)-th frame image.
Specifically, the target key points may be obtained as follows: the detection frame produced by the target detection algorithm gives the target's position in the image; the detection frame is used to crop out the target region, yielding an image of the target; and the cropped target image is input into a target key point detection model, which outputs the key points. The key point detection model is a deep learning model obtained from training data — target images and their corresponding key points are used to learn a mapping from images to points, denoted f. At inference time, feeding image data into the model directly yields the key point positions. In a specific embodiment, the number of target key points may be set to 106.
Let $I_k$ be the image input at the k-th frame and f the image-to-keypoint mapping model; target key point detection can then be expressed by formula (1):

$f(I_k) = \{(x_1, y_1), (x_2, y_2), \dots, (x_n, y_n)\}_k$ (1)

where the $(x_i, y_i)$ are the n key points output by the target key point detection model.
In this step, the detection frame of the k-th frame image is reused as the detection frame of the (k+1)-th frame image — that is, no target detection is run on the (k+1)-th frame — which skips one detection pass, speeds up the whole tracking process, saves time, and improves efficiency. Although the k-th frame's detection frame may deviate somewhat from the target's actual position in the (k+1)-th frame, so that the target cropped from the (k+1)-th frame with it is somewhat offset, the key point model has a degree of generalization: even if the target image input to it is offset by some pixels, the model can still output the key point positions correctly.
Step S103: and determining the average displacement of the target key point in the kth frame image and the target key point in the k+1 frame image.
Specifically, this comprises the following steps:
determining the average position of the target key points in the k-th frame image and the average position of the target key points in the (k+1)-th frame image; and
computing the difference between the two average positions, and taking that difference as the average displacement between the key points of the two frames.
The average position of the target key points in the k-th frame image is computed by formula (2):

$\bar{P}_k = \frac{1}{n} \sum_{i=1}^{n} p_i^{(k)}$ (2)

where $\bar{P}_k$ is the average position of the key points in the k-th frame image and $p_i^{(k)}$ is the position of the i-th key point in the k-th frame image.

The average position of the target key points in the (k+1)-th frame image is computed by formula (3):

$\bar{P}_{k+1} = \frac{1}{n} \sum_{i=1}^{n} p_i^{(k+1)}$ (3)

where $\bar{P}_{k+1}$ is the average position of the key points in the (k+1)-th frame image and $p_i^{(k+1)}$ is the position of the i-th key point in the (k+1)-th frame image.

The displacement difference between the two average positions is computed by formula (4):

$\Delta P = \bar{P}_{k+1} - \bar{P}_k$ (4)

where $\Delta P$ is the displacement difference between the average key point position in the (k+1)-th frame image and that in the k-th frame image.
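Formulas (2)–(4) amount to a mean over point coordinates followed by one subtraction. A minimal sketch in plain Python (the key point lists here are toy data, not model outputs):

```python
def average_position(points):
    """Mean (x, y) over a list of key points — formulas (2) and (3)."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    return (mx, my)

def average_displacement(points_k, points_k1):
    """Displacement of the mean key point from frame k to k+1 — formula (4)."""
    mx0, my0 = average_position(points_k)
    mx1, my1 = average_position(points_k1)
    return (mx1 - mx0, my1 - my0)

# Toy example: every key point shifts by (+2, -1) between the two frames,
# so the average displacement is (+2, -1) as well.
pts_k = [(10.0, 20.0), (30.0, 40.0), (50.0, 60.0)]
pts_k1 = [(x + 2.0, y - 1.0) for x, y in pts_k]
dx, dy = average_displacement(pts_k, pts_k1)
```

Averaging over all key points (106 for faces, per the description) makes the estimate robust to noise in any single key point.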
Step S104: and judging the magnitude of the average displacement and the threshold value. In this step, the threshold may be flexibly set according to the application scenario, which is not limited in this disclosure.
Step S105: when the average displacement is smaller than or equal to a threshold value, correcting the target detection frame in the kth frame image through the average displacement, and taking the corrected target detection frame as the target detection frame of the (k+2) th frame image to realize target tracking.
In this embodiment, if the average displacement is less than or equal to the threshold, the target has moved little between the k-th and (k+1)-th frame images, and the detection frame of the k-th frame is translated by the average displacement, as in formula (5):

$(x', y', w, h)_{box} = (x + \Delta x,\ y + \Delta y,\ w,\ h)_{box}$ (5)

where $(x, y, w, h)_{box}$ is the position of the detection frame in the k-th frame image, $\Delta P = (\Delta x, \Delta y)$ is the average displacement from formula (4), and $(x', y', w, h)_{box}$ is the position of the corrected detection frame.
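Formula (5) is a pure translation of the box: only the corner moves, the size is unchanged. A minimal sketch, with box tuples following the (x, y, w, h) convention from step S101:

```python
def correct_box(box, displacement):
    """Formula (5): translate the frame-k detection box by the average
    key point displacement; width and height stay unchanged."""
    x, y, w, h = box
    dx, dy = displacement
    return (x + dx, y + dy, w, h)

# A box at (100, 80) of size 64x64, corrected by displacement (+2, -1)
box_k = (100.0, 80.0, 64.0, 64.0)
box_k2 = correct_box(box_k, (2.0, -1.0))   # box for frame k+2
```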
Step S106: and when the average displacement is larger than a threshold value, performing target detection on the k+2 frame image, and determining a target detection frame in the k+2 frame image so as to realize target tracking.
In this embodiment, if the average displacement is greater than the threshold value, it is explained that the target in the k+1th frame image is more moving than the k frame image, and the target detection needs to be performed again for the k+2th frame image to acquire the target detection frame in the k+2th frame image.
According to the target tracking method above, the target key points in the k-th and (k+1)-th frame images are both determined via the k-th frame's detection frame — i.e., the k-th frame's detection frame serves as the (k+1)-th frame's detection frame, so no target detection is run on the (k+1)-th frame, one detection pass is skipped, the whole tracking process is faster, time is saved, and efficiency improves. Likewise, when the average displacement between the key points of the two frames is less than or equal to the threshold, the k-th frame's detection frame, corrected by the average displacement, serves as the (k+2)-th frame's detection frame, so no detection is run on the (k+2)-th frame either. The method thus avoids detecting the target in every frame, solving the prior-art problems that per-frame detection is time-consuming and cannot meet real-time requirements, improving detection efficiency, and suiting application scenarios with strict real-time requirements.
To make the target tracking method of the embodiments clearer, the processing flow is described again using face tracking as an example:
(1) For the k-th frame image: obtain a face frame with the SSD detection algorithm, crop the face image out with that frame, input the cropped face into the face key point detection model to obtain the face key points, and compute their average position;
(2) For the (k+1)-th frame image: crop the face region from the (k+1)-th frame using the face frame from the k-th frame, input the cropped face into the face key point detection model to obtain the face key points, and compute their average position. Note that the face frame used here is the k-th frame's face frame, not one detected on the (k+1)-th frame — the (k+1)-th frame is not run through the SSD detector, so one SSD detection pass is saved on this frame. Although the k-th frame's face frame may deviate somewhat from the face's actual position in the (k+1)-th frame, so that the cropped face is somewhat offset, the face key point model generalizes well enough that it can still output correct key point positions even when the input face is offset by some pixels.
(3) Compute the average displacement of the face key points between the k-th and (k+1)-th frame images. If it is less than or equal to the threshold, the face has moved little between the two frames, and the k-th frame's face frame, corrected by the average displacement, is used as the face frame for the (k+2)-th frame image. If the average displacement is greater than the threshold, the face has moved substantially; merely translating the k-th frame's face frame by the average displacement could yield a frame that no longer crops the face correctly from the (k+2)-th frame, so the face key point model could not be guaranteed a usable input. Therefore, when the average displacement exceeds the threshold, face detection is run again on the (k+2)-th frame image, so that even under fast face motion the face frame corresponds exactly to the image, the cropped face position is correct, and the face key point model's output remains reliable.
Note that the corrected face frame reflects the face's position as of the (k+1)-th frame, so relative to the (k+2)-th frame image it may not be centered exactly on the face. However, owing to the generalization capability of the face key point model, correct key point positions are still output even when the input face is offset by some pixels. By this means, the face tracking method of the embodiments reduces how often face detection must be run.
(4) For the (k+2)-th frame image, the flow is:
a. If the average displacement of the face key points between the (k+1)-th and k-th frames is less than or equal to the threshold, the face has moved little; the k-th frame's face frame is translated by the average displacement to obtain the face frame for the (k+2)-th frame, that frame is used to crop the (k+2)-th frame image, and the cropped face is input into the face key point model;
b. If the average displacement is greater than the threshold, a merely translated frame would be inaccurate: cropping the (k+2)-th frame with it could produce a large offset, and the face key point model could not return a correct result. Therefore, in this case SSD detection is run directly on the (k+2)-th frame image, so that the resulting face frame is definitely the face's position in the (k+2)-th frame.
According to the face tracking example above, the face key points in the k-th and (k+1)-th frame images are both determined via the k-th frame's face frame — i.e., the k-th frame's face frame serves as the (k+1)-th frame's face frame, so no face detection is run on the (k+1)-th frame, one detection pass is skipped, the whole face tracking process is faster, and efficiency improves. When the average key point displacement between the two frames is less than or equal to the threshold, the corrected face frame likewise serves as the (k+2)-th frame's face frame, so no detection is run on that frame either. The method thus avoids face detection on every frame, solving the prior-art problems that per-frame detection is time-consuming and cannot meet real-time requirements, improving detection efficiency, and suiting applications with strict real-time requirements.
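The decision flow just described can be sketched as a loop. This is a schematic, not the patent's implementation: `detect` and `keypoints` are hypothetical placeholders for the SSD detector and the face key point model, the loop processes frames in pairs, the threshold test here is per-axis (a real implementation might compare the displacement magnitude instead), and re-detection happens only when the motion is too large:

```python
def track(frames, detect, keypoints, threshold):
    """Sketch of the tracking loop: detect on frame 0, then reuse/correct the
    box, falling back to detection only on large motion.
    detect(frame) -> (x, y, w, h); keypoints(frame, box) -> [(x, y), ...]."""
    def mean(pts):
        n = len(pts)
        return (sum(p[0] for p in pts) / n, sum(p[1] for p in pts) / n)

    boxes = []                   # (frame index, box used for that frame)
    box = detect(frames[0])      # full target detection on the first frame only
    k = 0
    while k + 1 < len(frames):
        # The frame-k box is reused for frame k+1: no detection pass there.
        pts_k = keypoints(frames[k], box)
        pts_k1 = keypoints(frames[k + 1], box)
        boxes.append((k, box))
        boxes.append((k + 1, box))
        if k + 2 >= len(frames):
            break
        # Average displacement of the key points between the two frames.
        (mx0, my0), (mx1, my1) = mean(pts_k), mean(pts_k1)
        dx, dy = mx1 - mx0, my1 - my0
        if abs(dx) <= threshold and abs(dy) <= threshold:
            # Small motion: translate the box (formula (5)) for frame k+2.
            box = (box[0] + dx, box[1] + dy, box[2], box[3])
        else:
            # Large motion: run target detection again on frame k+2.
            box = detect(frames[k + 2])
        k += 2
    return boxes

# Toy run: the key points drift +1 px per frame, under the threshold,
# so the box is corrected by translation rather than re-detected.
result = track(
    frames=[0, 1, 2, 3],
    detect=lambda f: (0.0, 0.0, 10.0, 10.0),
    keypoints=lambda f, box: [(float(f), 0.0), (float(f) + 2.0, 0.0)],
    threshold=1.5,
)
```

In this sketch the expensive `detect` call runs once for frames 0–3; every other frame costs only a key point pass, which is the efficiency gain the embodiments claim.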
Fig. 2 is a schematic diagram of main modules of an object tracking device 200 according to an embodiment of the present invention, and as shown in fig. 2, the device 200 includes:
a detection frame determining module 201, configured to perform target detection on a kth frame image, and determine a target detection frame in the kth frame image; wherein k is an integer greater than or equal to 1;
a key point determining module 202, configured to determine a plurality of target key points in the kth frame image and a plurality of target key points in the k+1th frame image respectively, based on the target detection frame in the kth frame image;
a displacement determining module 203, configured to determine an average displacement between the target key point in the kth frame image and the target key point in the k+1th frame image;
and a tracking module 204, configured to correct the target detection frame in the kth frame image according to the average displacement when the average displacement is less than or equal to the threshold value, and to take the corrected target detection frame as the target detection frame of the k+2th frame image to realize target tracking.
In an alternative embodiment, the tracking module 204 is further configured to: and when the average displacement is larger than a threshold value, performing target detection on the k+2 frame image, and determining a target detection frame in the k+2 frame image so as to realize target tracking.
In an alternative embodiment, the displacement determination module 203 is further configured to:
respectively determining the average positions of a plurality of target key points in the kth frame image and the average positions of a plurality of target key points in the k+1th frame image;
and calculating the displacement difference between the average position of the target key points in the k+1th frame image and the average position of the target key points in the kth frame image, and taking the displacement difference as the average displacement of the target key points in the kth frame image and the target key points in the k+1th frame image.
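As a sketch of this computation (assuming the key points are given as (x, y) coordinate arrays; `average_displacement` is a hypothetical name, not from the patent):

```python
import numpy as np

def average_displacement(kpts_k: np.ndarray, kpts_k1: np.ndarray) -> np.ndarray:
    """Displacement between the mean key point positions of two frames.

    kpts_k, kpts_k1: (N, 2) arrays of (x, y) key points detected in
    frame k and frame k+1.
    Returns the (dx, dy) difference: mean position in k+1 minus mean in k.
    """
    return kpts_k1.mean(axis=0) - kpts_k.mean(axis=0)
```

Averaging before differencing makes the result robust to jitter in any single key point, since only the centroid of the point set matters.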
In an alternative embodiment, the tracking module 204 is further configured to: and translating the target detection frame in the kth frame of image according to the average displacement.
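Translating the detection frame then amounts to adding the displacement to every box coordinate; a minimal sketch, assuming an (x1, y1, x2, y2) corner convention for the box (the patent does not fix a representation):

```python
def translate_box(box, displacement):
    """Shift an (x1, y1, x2, y2) box by a (dx, dy) displacement."""
    x1, y1, x2, y2 = box
    dx, dy = displacement
    return (x1 + dx, y1 + dy, x2 + dx, y2 + dy)
```

Note the box size is unchanged; only its position moves, which is exactly the correction the tracking module applies.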
According to the target tracking device, a plurality of target key points in the kth frame image and in the k+1th frame image are determined from the target detection frame in the kth frame image; that is, the target detection frame in the kth frame image is reused as the target detection frame of the k+1th frame image, so that no target detection is performed on the k+1th frame image. This skips a detection pass, accelerates the overall target tracking flow, saves time and improves efficiency. When the average displacement between the target key points in the kth frame image and those in the k+1th frame image is less than or equal to the threshold value, the target detection frame in the kth frame image is corrected by the average displacement, and the corrected frame is used as the target detection frame of the k+2th frame image to realize target tracking; that is, no target detection is performed on the k+2th frame image either, again skipping a detection pass and speeding up the whole flow. Therefore, the target tracking device provided by the embodiment of the invention avoids running target detection on every frame, solving the prior-art problems that per-frame target detection is time-consuming and cannot meet real-time requirements, improving detection efficiency and making the device suitable for application scenarios with high real-time requirements.
The device can execute the method provided by the embodiment of the invention, and has the functional modules corresponding to the method and the beneficial effects of executing it. For technical details not described in detail in this embodiment, reference may be made to the methods provided in the embodiments of the present invention.
Fig. 3 illustrates an exemplary system architecture 300 to which the target tracking method or target tracking apparatus of embodiments of the invention may be applied.
As shown in fig. 3, the system architecture 300 may include terminal devices 301, 302, 303, a network 304, and a server 305. The network 304 is used as a medium to provide communication links between the terminal devices 301, 302, 303 and the server 305. The network 304 may include various connection types, such as wired, wireless communication links, or fiber optic cables, among others.
A user may interact with the server 305 via the network 304 using the terminal devices 301, 302, 303 to receive or send messages or the like. Various communication client applications, such as shopping class applications, web browser applications, search class applications, instant messaging tools, mailbox clients, social platform software, etc., may be installed on the terminal devices 301, 302, 303.
The terminal devices 301, 302, 303 may be a variety of electronic devices having a display screen and supporting web browsing, including but not limited to smartphones, tablets, laptop and desktop computers, and the like.
The server 305 may be a server providing various services, such as a background management server providing support for shopping-type websites browsed by the user using the terminal devices 301, 302, 303. The background management server can analyze and otherwise process received data such as a product information query request, and feed back processing results (such as target push information or product information) to the terminal device.
It should be noted that, the object tracking method provided in the embodiment of the present invention is generally executed by the server 305, and accordingly, the object tracking device is generally disposed in the server 305.
It should be understood that the number of terminal devices, networks and servers in fig. 3 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
Referring now to FIG. 4, there is illustrated a schematic diagram of a computer system 400 suitable for use in implementing an embodiment of the present invention. The terminal device shown in fig. 4 is only an example, and should not impose any limitation on the functions and the scope of use of the embodiment of the present invention.
As shown in fig. 4, the computer system 400 includes a Central Processing Unit (CPU) 401, which can perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) 402 or a program loaded from a storage section 408 into a Random Access Memory (RAM) 403. In RAM 403, various programs and data required for the operation of system 400 are also stored. The CPU 401, ROM 402, and RAM 403 are connected to each other by a bus 404. An input/output (I/O) interface 405 is also connected to bus 404.
The following components are connected to the I/O interface 405: an input section 406 including a keyboard, a mouse, and the like; an output portion 407 including a Cathode Ray Tube (CRT), a Liquid Crystal Display (LCD), and the like, and a speaker, and the like; a storage section 408 including a hard disk or the like; and a communication section 409 including a network interface card such as a LAN card, a modem, or the like. The communication section 409 performs communication processing via a network such as the internet. The drive 410 is also connected to the I/O interface 405 as needed. A removable medium 411 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is installed on the drive 410 as needed, so that a computer program read therefrom is installed into the storage section 408 as needed.
In particular, according to embodiments of the present disclosure, the processes described above with reference to flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method shown in the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network via the communication portion 409 and/or installed from the removable medium 411. The above-described functions defined in the system of the present invention are performed when the computer program is executed by a Central Processing Unit (CPU) 401.
The computer readable medium shown in the present invention may be a computer readable signal medium or a computer readable storage medium, or any combination of the two. The computer readable storage medium can be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples of the computer-readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In the present invention, however, the computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, with the computer-readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The modules involved in the embodiments of the present invention may be implemented in software or in hardware. The described modules may also be provided in a processor, for example, described as: a processor including a sending module, an obtaining module, a determining module, and a first processing module. The names of these modules do not, in some cases, constitute a limitation on the modules themselves; for example, the sending module may also be described as "a module that sends a picture acquisition request to a connected server".
As another aspect, the present invention also provides a computer-readable medium, which may be contained in the apparatus described in the above embodiments, or may exist alone without being assembled into the apparatus. The computer readable medium carries one or more programs which, when executed by a device, cause the device to:
Performing target detection on a kth frame image, and determining a target detection frame in the kth frame image; wherein k is an integer greater than or equal to 1;
based on the target detection frame in the kth frame image, respectively determining a plurality of target key points in the kth frame image and a plurality of target key points in the k+1th frame image;
determining the average displacement of a target key point in the kth frame image and a target key point in the k+1th frame image;
When the average displacement is smaller than or equal to a threshold value, correcting the target detection frame in the kth frame image through the average displacement, and taking the corrected target detection frame as the target detection frame of the (k+2) th frame image to realize target tracking.
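The four steps above can be sketched as a tracking loop. The `detect` and `keypoints` callables below stand in for whatever detector (e.g. an SSD model) and key point model an implementation uses, and how the cycle resumes after frame k+2 is not specified in the text, so the restart chosen below is an assumption:

```python
import numpy as np

def track(frames, detect, keypoints, threshold):
    """Sketch of the claimed flow; yields the box used for each frame.

    detect(frame) -> (x1, y1, x2, y2) target box (stand-in for SSD);
    keypoints(frame, box) -> (N, 2) array of target key points.
    The detector runs on frame k, is skipped for frame k+1, and runs
    again for frame k+2 only when the average displacement is large.
    """
    k = 0
    while k < len(frames):
        box = detect(frames[k])                  # full detection on frame k
        yield box
        if k + 1 >= len(frames):
            break
        kpts_k = keypoints(frames[k], box)       # key points in frame k
        kpts_k1 = keypoints(frames[k + 1], box)  # same box reused for k+1
        yield box
        disp = kpts_k1.mean(axis=0) - kpts_k.mean(axis=0)
        if k + 2 < len(frames) and np.linalg.norm(disp) <= threshold:
            # Small motion: correct the box by translation, use it for k+2.
            x1, y1, x2, y2 = box
            yield (x1 + disp[0], y1 + disp[1], x2 + disp[0], y2 + disp[1])
            k += 3  # assumed restart: detect again on the next frame
        else:
            k += 2  # large motion: detect() will run on frame k+2
```

In the small-motion case, three consecutive frames cost only one detector invocation, which is the source of the claimed speed-up.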
According to the technical scheme, a plurality of target key points in the kth frame image and in the k+1th frame image are determined from the target detection frame in the kth frame image; that is, the target detection frame in the kth frame image is reused as the target detection frame of the k+1th frame image, so that no target detection is performed on the k+1th frame image. This skips a detection pass, accelerates the overall target tracking flow, saves time and improves efficiency. When the average displacement between the target key points in the kth frame image and those in the k+1th frame image is less than or equal to the threshold value, the target detection frame in the kth frame image is corrected by the average displacement, and the corrected frame is used as the target detection frame of the k+2th frame image to realize target tracking; that is, no target detection is performed on the k+2th frame image either, again skipping a detection pass and speeding up the whole flow. Therefore, the target tracking method provided by the embodiment of the invention avoids running target detection on every frame, solving the prior-art problems that per-frame target detection is time-consuming and cannot meet real-time requirements, improving detection efficiency and making the method suitable for application scenarios with high real-time requirements.
The above embodiments do not limit the scope of the present invention. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives can occur depending upon design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present invention should be included in the scope of the present invention.

Claims (7)

1. A target tracking method, comprising:
Performing target detection on a kth frame image, and determining a target detection frame in the kth frame image; wherein k is an integer greater than or equal to 1;
based on the target detection frame in the kth frame image, respectively determining a plurality of target key points in the kth frame image and a plurality of target key points in the k+1th frame image;
determining the average displacement of a target key point in the kth frame image and a target key point in the k+1th frame image;
When the average displacement is smaller than or equal to a threshold value, correcting the target detection frame in the kth frame image through the average displacement, and taking the corrected target detection frame as the target detection frame of the (k+2) th frame image to realize target tracking.
2. The method according to claim 1, wherein the method further comprises:
and when the average displacement is larger than a threshold value, performing target detection on the k+2 frame image, and determining a target detection frame in the k+2 frame image so as to realize target tracking.
3. The method of claim 1, wherein determining an average displacement of a target keypoint in the kth frame image from a target keypoint in the k+1 frame image comprises:
respectively determining the average positions of a plurality of target key points in the kth frame image and the average positions of a plurality of target key points in the k+1th frame image;
And calculating displacement differences between the average positions of the target key points in the k+1th frame image and the average positions of the target key points in the k frame image, and taking the displacement differences as the average displacement of the target key points in the k frame image and the target key points in the k+1th frame image.
4. The method of claim 1, wherein correcting the object detection box in the kth frame image by the average displacement comprises:
and translating the target detection frame in the kth frame of image according to the average displacement.
5. An object tracking device, comprising:
the detection frame determining module is used for carrying out target detection on the kth frame image and determining a target detection frame in the kth frame image; wherein k is an integer greater than or equal to 1;
the key point determining module is used for respectively determining a plurality of target key points in the kth frame image and a plurality of target key points in the k+1th frame image based on the target detection frame in the kth frame image;
the displacement determining module is used for determining average displacement between the target key point in the kth frame image and the target key point in the (k+1) th frame image;
And the tracking module is used for correcting the target detection frame in the kth frame image through the average displacement when the average displacement is smaller than or equal to a threshold value, and taking the corrected target detection frame as the target detection frame of the (k+2) th frame image so as to realize target tracking.
6. An electronic device, comprising:
One or more processors;
Storage means for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any of claims 1-4.
7. A computer readable medium, on which a computer program is stored, characterized in that the program, when being executed by a processor, implements the method according to any of claims 1-4.
CN201911236052.8A 2019-12-05 2019-12-05 Target tracking method and device Active CN112926356B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911236052.8A CN112926356B (en) 2019-12-05 2019-12-05 Target tracking method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911236052.8A CN112926356B (en) 2019-12-05 2019-12-05 Target tracking method and device

Publications (2)

Publication Number Publication Date
CN112926356A CN112926356A (en) 2021-06-08
CN112926356B true CN112926356B (en) 2024-06-18

Family

ID=76161900

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911236052.8A Active CN112926356B (en) 2019-12-05 2019-12-05 Target tracking method and device

Country Status (1)

Country Link
CN (1) CN112926356B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103455797A (en) * 2013-09-07 2013-12-18 西安电子科技大学 Detection and tracking method of moving small target in aerial shot video
CN109214245A (en) * 2017-07-03 2019-01-15 株式会社理光 A kind of method for tracking target, device, equipment and computer readable storage medium

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2007088759A1 (en) * 2006-02-01 2007-08-09 National University Corporation The University Of Electro-Communications Displacement detection method, displacement detection device, displacement detection program, characteristic point matching method, and characteristic point matching program
CN103077532A (en) * 2012-12-24 2013-05-01 天津市亚安科技股份有限公司 Real-time video object quick tracking method
CN106846362B (en) * 2016-12-26 2020-07-24 歌尔科技有限公司 Target detection tracking method and device
KR101837407B1 (en) * 2017-11-03 2018-03-12 국방과학연구소 Apparatus and method for image-based target tracking
CN110400332B (en) * 2018-04-25 2021-11-05 杭州海康威视数字技术股份有限公司 Target detection tracking method and device and computer equipment
CN109003245B (en) * 2018-08-21 2021-06-04 厦门美图之家科技有限公司 Coordinate processing method and device and electronic equipment
CN110349190B (en) * 2019-06-10 2023-06-06 广州视源电子科技股份有限公司 Adaptive learning target tracking method, device, equipment and readable storage medium
CN110378264B (en) * 2019-07-08 2023-04-18 Oppo广东移动通信有限公司 Target tracking method and device

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103455797A (en) * 2013-09-07 2013-12-18 西安电子科技大学 Detection and tracking method of moving small target in aerial shot video
CN109214245A (en) * 2017-07-03 2019-01-15 株式会社理光 A kind of method for tracking target, device, equipment and computer readable storage medium

Also Published As

Publication number Publication date
CN112926356A (en) 2021-06-08

Similar Documents

Publication Publication Date Title
US10762387B2 (en) Method and apparatus for processing image
CN109308469B (en) Method and apparatus for generating information
US20190197703A1 (en) Method and apparatus for tracking target profile in video
CN109255337B (en) Face key point detection method and device
CN110188719B (en) Target tracking method and device
US11915447B2 (en) Audio acquisition device positioning method and apparatus, and speaker recognition method and system
US20210200971A1 (en) Image processing method and apparatus
CN110619807B (en) Method and device for generating global thermodynamic diagram
CN110059623B (en) Method and apparatus for generating information
CN111815738B (en) Method and device for constructing map
CN110288625B (en) Method and apparatus for processing image
CN110956131B (en) Single-target tracking method, device and system
CN111192312B (en) Depth image acquisition method, device, equipment and medium based on deep learning
CN111160410B (en) Object detection method and device
CN110110666A (en) Object detection method and device
CN110717405B (en) Face feature point positioning method, device, medium and electronic equipment
CN109919220B (en) Method and apparatus for generating feature vectors of video
CN110321454B (en) Video processing method and device, electronic equipment and computer readable storage medium
CN112926356B (en) Target tracking method and device
CN109034085B (en) Method and apparatus for generating information
CN108446737B (en) Method and device for identifying objects
CN113362090A (en) User behavior data processing method and device
CN113642493B (en) Gesture recognition method, device, equipment and medium
CN115393423A (en) Target detection method and device
CN113033377A (en) Character position correction method, character position correction device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant