CN110675428A - Target tracking method and device for human-computer interaction and computer equipment - Google Patents

Target tracking method and device for human-computer interaction and computer equipment

Info

Publication number
CN110675428A
CN110675428A (application CN201910842488.5A; granted as CN110675428B)
Authority
CN
China
Prior art keywords
target
color
video frame
tracking
motion
Prior art date
Legal status
Granted
Application number
CN201910842488.5A
Other languages
Chinese (zh)
Other versions
CN110675428B (en)
Inventor
陈晓春
林博溢
王海林
张坤华
Current Assignee
Shenzhen Research Institute Tsinghua University
Peng Cheng Laboratory
Original Assignee
Shenzhen Research Institute Tsinghua University
Peng Cheng Laboratory
Priority date
Filing date
Publication date
Application filed by Shenzhen Research Institute Tsinghua University and Peng Cheng Laboratory
Priority to CN201910842488.5A
Publication of CN110675428A
Application granted
Publication of CN110675428B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS → G06 COMPUTING; CALCULATING OR COUNTING → G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL → G06T7/00 Image analysis
        • G06T7/20 Analysis of motion → G06T7/246 Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
        • G06T7/10 Segmentation; Edge detection → G06T7/11 Region-based segmentation
        • G06T7/90 Determination of colour characteristics
    • G06T2207/00 Indexing scheme for image analysis or image enhancement → G06T2207/10 Image acquisition modality → G06T2207/10016 Video; Image sequence

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The application relates to a target tracking method, apparatus, computer device and storage medium for human-computer interaction. The method comprises the following steps: reading a video frame image and detecting a corresponding target in the video frame image; extracting color features of the target and establishing a color space model of the target from the color features; partitioning the video frame image to obtain a motion region corresponding to the target; and tracking the motion region corresponding to the target by using the color space model. With this method, targets can be tracked accurately during human-computer interaction.

Description

Target tracking method and device for human-computer interaction and computer equipment
Technical Field
The present application relates to the field of human-computer interaction technologies, and in particular, to a human-computer interaction-oriented target tracking method, apparatus, computer device, and storage medium.
Background
With the rapid development of computer vision, target tracking has come to be regarded as an increasingly important part of the field. Computer-vision-based target tracking, particularly as applied to human-computer interaction, is gradually changing the way people live and is widely used in surveillance, medical imaging, autonomous driving, remote control, interactive games, and other fields. Traditional target tracking algorithms include compressed sensing algorithms, correlation filtering algorithms, and spatio-temporal context tracking algorithms; because they are computationally simple and easy to understand, they remain popular.
However, in the actual tracking process, external conditions such as complex backgrounds and illumination make visual tracking difficult for computer devices, and factors such as the diversity of the human body and the nonlinear, irregular nature of human motion make it difficult for traditional tracking algorithms to track targets accurately.
Disclosure of Invention
In view of the foregoing, it is desirable to provide a human-computer interaction-oriented target tracking method, apparatus, computer device, and storage medium capable of tracking a target accurately.
A human-computer interaction-oriented target tracking method comprises the following steps:
reading a video frame image, and detecting a corresponding target in the video frame image;
extracting color features of the target, and establishing a color space model corresponding to the target by using the color features;
partitioning the video frame image to obtain a motion area corresponding to the target;
and tracking the motion area corresponding to the target by using the color space model.
In one embodiment, the partitioning the video frame image to obtain a motion region corresponding to the target includes:
performing color space conversion on the video frame image to obtain pixel probabilities corresponding to various colors;
generating a color probability distribution map by using pixel probabilities corresponding to a plurality of colors;
and partitioning the color probability distribution map to obtain an interference region and a motion region corresponding to the target.
In one embodiment, the interference region comprises a homochromatic interference region; the partitioning the color probability distribution map includes:
marking the area similar to the target color in the color probability distribution map as a homochromatic interference area;
establishing a mixture model corresponding to a plurality of homochromatic interference regions, and generating a probability density map containing the plurality of homochromatic interference regions;
and partitioning the probability density map to obtain the plurality of homochromatic interference regions and the motion region corresponding to the target.
In one embodiment, the tracking a motion region corresponding to the target by using the color space model includes:
positioning the target through the color probability distribution map;
marking the motion area in the color probability distribution map;
the target is tracked in successive video frame images.
In one embodiment, the method further comprises:
when a target is lost in the tracking process, marking a video frame image of the lost target as an abnormal image;
acquiring a motion track corresponding to a target;
determining the lost position of the target by utilizing the motion track;
and starting from the lost position, continuing to track the target.
In one embodiment, the determining the missing position of the target by using the motion trajectory includes:
searching areas similar to the color characteristics corresponding to the target in the multiple abnormal images;
comparing the area with similar color features in each abnormal image with the motion trail to obtain the probability of the lost position of the target in each abnormal image;
and determining the lost position of the target according to the probability.
An apparatus for human-computer interaction-oriented target tracking, the apparatus comprising:
the acquisition module is used for reading the video frame image and detecting a corresponding target in the video frame image;
the model establishing module is used for establishing a color space model corresponding to the target by utilizing the color characteristics;
the partitioning module is used for partitioning the video frame image to obtain a motion region corresponding to the target; and
the tracking module is used for tracking the motion region corresponding to the target.
In one embodiment, the apparatus further comprises:
the loss re-detection module is used for acquiring a motion track corresponding to the target when the target is lost in the tracking process and determining the lost position of the target by utilizing the motion track; and starting from the lost position, the tracking module is used for continuously tracking the target.
A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the following steps when executing the computer program:
reading a video frame image, and detecting a corresponding target in the video frame image;
extracting color features of the target, and establishing a color space model corresponding to the target by using the color features;
partitioning the video frame image to obtain a motion area corresponding to the target;
and tracking the motion area corresponding to the target by using the color space model.
A computer-readable storage medium, on which a computer program is stored which, when executed by a processor, carries out the steps of:
reading a video frame image, and detecting a corresponding target in the video frame image;
extracting color features of the target, and establishing a color space model corresponding to the target by using the color features;
partitioning the video frame image to obtain a motion area corresponding to the target;
and tracking the motion area corresponding to the target by using the color space model.
According to the target tracking method and device, the computer equipment and the storage medium for human-computer interaction, the corresponding target can be detected by reading the video frame image. A color space model corresponding to the target can be established by utilizing the color characteristics of the target. By partitioning the video frame images and positioning the motion area of the target in the video frame images, the target can be tracked in the multi-frame video frame images by utilizing the color space model, so that the accuracy of target tracking is improved.
Drawings
FIG. 1 is a diagram of an application scenario of a human-computer interaction-oriented target tracking method in an embodiment;
FIG. 2 is a schematic flowchart of a human-computer interaction-oriented target tracking method in one embodiment;
3-a through 3-e are schematic diagrams illustrating segmentation of gesture objects for human-computer interaction in one embodiment;
FIG. 4 is a flowchart illustrating the step of partitioning a video frame image to obtain a motion region corresponding to a target according to an embodiment;
FIG. 5 is a schematic flow chart illustrating the partitioning step for the color probability distribution map in one embodiment;
6-a, 6-b are diagrams illustrating the same color interference area division for gesture targets in one embodiment;
FIG. 7 is a flowchart of the step of re-detection of lost object tracking in one embodiment;
FIG. 8 is a block diagram of an embodiment of a human-computer interaction-oriented target tracking device;
FIG. 9 is a block diagram of another embodiment of a target tracking apparatus for human-computer interaction;
FIG. 10 is a diagram showing an internal structure of a computer device according to an embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
The human-computer interaction-oriented target tracking method can be applied in the environment shown in fig. 1, in which the terminal 102 and the server 104 communicate via a network. The terminal 102 may obtain video from the server 104 by accessing it. A camera and a screen are installed in the terminal 102, so the terminal 102 may also obtain real-time video by starting the camera. The terminal 102 displays the video frame images on the screen and detects the corresponding target in them. The terminal 102 then extracts color features of the target, establishes a color space model of the target from the color features, partitions the video frame image to obtain a motion region corresponding to the target, and tracks that motion region using the color space model. The terminal 102 may be, but is not limited to, a personal computer, notebook computer, smartphone, tablet computer, or portable wearable device; the server 104 may be implemented as an independent server or a cluster of servers. Only one type of application is listed here, but it can be understood that the application type is not limited thereto and may, in other embodiments, be a game application, a social networking application, or the like.
In an embodiment, as shown in fig. 2, a target tracking method facing human-computer interaction is provided, which is described by taking the example that the method is applied to the terminal in fig. 1, and includes the following steps:
step 202, reading the video frame image, and detecting a corresponding target in the video frame image.
The terminal acquires a video, which may be a real-time video or a non-real-time video corresponding to a preset time period. Real-time video is collected in real time by the camera installed in the terminal during human-computer interaction; non-real-time video is obtained by the terminal from the server. For example, by accessing the server, the terminal can acquire surveillance video collected by the cameras of other terminals and perform target tracking on it. Both real-time and non-real-time video consist of successive video frame images: each video frame image is a static image, and a sequence of such images forms the video.
The terminal is provided with a camera, which can capture a moving target in real time to generate a real-time video. The terminal reads the real-time video collected by the camera and displays the successive video frame images on the screen. Using the trained classifier, the terminal detects whether a corresponding target exists in the multi-frame video frame images; when it does, the terminal measures the spatial coincidence degree of the detections across the frames. When the spatial coincidence degree reaches a threshold, it is determined that the corresponding target (namely, the target to be tracked) has been detected. The threshold may be preset in a configuration file, for example at 85%. The target to be tracked includes, but is not limited to, a human gesture, a face, a foot, a leg, or another limb.
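As an illustration of this detection step, the sketch below pairs a trained OpenCV cascade classifier with a frame-to-frame spatial coincidence check. It is a minimal sketch under stated assumptions, not the patented implementation: the cascade file palm_cascade.xml is a hypothetical placeholder, and measuring spatial coincidence as intersection-over-union against the 85% threshold from the example above is an assumption.

```python
import cv2

# Hypothetical trained classifier for the target (e.g., a palm);
# the file name is a placeholder, not part of the patent.
classifier = cv2.CascadeClassifier("palm_cascade.xml")

def iou(a, b):
    """Spatial coincidence of two (x, y, w, h) boxes as intersection-over-union."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    ix = max(0, min(ax + aw, bx + bw) - max(ax, bx))
    iy = max(0, min(ay + ah, by + bh) - max(ay, by))
    inter = ix * iy
    union = aw * ah + bw * bh - inter
    return inter / union if union else 0.0

def detect_target(frames, threshold=0.85):
    """Return a detection confirmed across adjacent frames, else None."""
    prev = None
    for frame in frames:
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        boxes = classifier.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
        if len(boxes) == 0:
            prev = None
            continue
        box = tuple(boxes[0])
        # Confirm the target only when detections in adjacent frames coincide.
        if prev is not None and iou(prev, box) >= threshold:
            return box
        prev = box
    return None
```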
And 204, extracting the color characteristics of the target, and establishing a color space model corresponding to the target by using the color characteristics.
The terminal identifies an initial position of the target to be tracked in the first video frame image in which it appears, positions the target using that initial position, and segments the target by the mean shift method to obtain a central target region corresponding to it. The terminal acquires the color of the central target region and sets the peripheral portion around it to a different color.
Taking a palm as the target to be tracked, the segmentation shown in fig. 3-a to 3-e is described. As shown in fig. 3-a, the terminal detects whether a palm exists in the multi-frame video frame images by using the trained classifier. When a palm is detected, the central target region of the palm is extracted by the mean shift method; as shown in fig. 3-b to 3-e, the terminal sets the color of the central target region of the palm to white and the peripheral portion outside it to black. The terminal takes the palm as the source of color feature values and generates a color histogram of the palm from the extracted central target region. Using this color histogram, the terminal performs color space conversion on the video frame image and quantizes the converted color space, obtaining the color space model corresponding to the target to be tracked. Establishing this color space model makes it possible to track the nonlinear motion of the target more accurately in the subsequent steps.
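A sketch of how the color space model could be built follows. The patent does not fix the quantization level n or any particular library, so the 32-bin H-channel histogram and the OpenCV calls are assumptions; the mask argument stands for the white central target region produced by the mean shift segmentation above.

```python
import cv2

def build_color_model(frame_bgr, mask, n_levels=32):
    # Convert to HSV and histogram the H channel over the segmented
    # central target region only (mask is nonzero inside that region).
    hsv = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2HSV)
    hist = cv2.calcHist([hsv], [0], mask, [n_levels], [0, 180])
    # Scale to [0, 255] so backprojection yields a usable probability map.
    cv2.normalize(hist, hist, 0, 255, cv2.NORM_MINMAX)
    return hist
```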
And step 206, partitioning the video frame image to obtain a motion area corresponding to the target.
And step 208, tracking the motion area corresponding to the target by using the color space model.
The terminal partitions the video frame image using the established color space model, dividing it into a motion region corresponding to the target and an interference region; the interference region may be a single region or a combination of several.
In continuous multi-frame video frame images, the terminal acquires a preset initial tracking window and, using the mean shift method iteratively, solves for the position of the centroid of the color space model corresponding to the target, obtaining the centroid position of the initial tracking window in the next video frame image; this centroid position is the center position of the target in that next frame. The terminal records the center position of the target across the continuous video frame images to obtain the motion trajectory of the target, which can also be used to detect loss of the target.
Thus, in the continuous multi-frame video frame images, the terminal acquires the initial position of the target and, starting from the initial tracking window, tracks the center position of the target in each subsequent video frame image by the mean shift method and iteration.
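The centroid-seeking loop could be sketched with OpenCV's mean shift as below; the termination criteria, bin count, and window handling are assumed details rather than values taken from the patent.

```python
import cv2

def track(frames, hist, init_window):
    """Follow the target's center across consecutive frames by mean shift.
    init_window is the preset (x, y, w, h) initial tracking window."""
    term_crit = (cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 10, 1)
    window, trajectory = init_window, []
    for frame in frames:
        hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
        backproj = cv2.calcBackProject([hsv], [0], hist, [0, 180], 1)
        # Mean shift iterates the window toward the centroid of the
        # color probability mass, giving the center in the next frame.
        _, window = cv2.meanShift(backproj, window, term_crit)
        x, y, w, h = window
        trajectory.append((x + w // 2, y + h // 2))  # recorded center position
    return trajectory
```

Recording the returned centers across frames gives the motion trajectory used later for loss re-detection.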
In one embodiment, tracking a motion region corresponding to a target by using a color space model includes: positioning the target through the color probability distribution map; marking the motion area in the color probability distribution map; the target is tracked in successive video frame images.
The terminal identifies the initial position of the target to be tracked in the first video frame image in which it appears and sets that position as the initial tracking window. To track the target accurately, the terminal marks the target's motion region in the color probability distribution map. Starting from the initial tracking window, the terminal iteratively solves, by the mean shift method, for the centroid position of the color space model corresponding to the target in each successive frame; this centroid position is the target's center position in that frame, and recording these center positions across continuous frames yields the target's motion trajectory. In this way the target is positioned within its motion region and tracked accurately across the multi-frame video frame images.
In this embodiment, a corresponding target can be detected by reading the video frame image, and a color space model of the target can be established from its color features. By partitioning the video frame images and locating the target's motion region within them, the target can be tracked accurately across multiple video frame images using the color space model, enabling accurate target tracking during human-computer interaction.
In one embodiment, the step of partitioning the video frame image to obtain the motion region corresponding to the target, as shown in fig. 4, includes:
step 402, performing color space conversion on the video frame image to obtain pixel probabilities corresponding to multiple colors.
At step 404, a color probability distribution map is generated using the pixel probabilities corresponding to the plurality of colors.
And 406, partitioning the color probability distribution map to obtain an interference region and a motion region corresponding to the target.
In the present embodiment, after performing color space conversion on the video frame image using the color features, the terminal quantizes the converted color space. Specifically, the terminal converts the video frame image from the RGB color space to the HSV color space, which comprises an H component histogram, an S component histogram, and a V component histogram. The H component histogram covers multiple quantization levels of color; for example, the H component may be quantized into n levels. The terminal quantizes the color space of the H component histogram to obtain the probability of each of the n levels appearing in it. Because different levels may contain different numbers of pixels, the H component histogram of a video frame image can be represented as:
q = {q(u)}, u = 1, 2, …, n (1)
where q(u) represents the probability of the u-th level color appearing in the H component histogram.
The terminal acquires the probability value of each of the n levels appearing in the H component image (also called the pixel probability of that level), looks up the level of each pixel in the corresponding video frame image, and replaces each pixel's value with the pixel probability of its level, thereby obtaining the color probability distribution map of the video frame image.
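The per-pixel replacement just described is, in effect, histogram backprojection. A minimal sketch follows, assuming the H channel and the n-level quantization of equation (1); cv2.calcBackProject performs the same lookup:

```python
import cv2
import numpy as np

def color_probability_map(frame_bgr, hist, n_levels=32):
    hsv = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2HSV)
    # Quantize each pixel's H value (0..179) to its level u in 0..n-1.
    levels = (hsv[:, :, 0].astype(np.int32) * n_levels) // 180
    # q(u): probability of level u appearing in the H component histogram.
    q = hist.ravel() / max(float(hist.sum()), 1e-9)
    # Replace each pixel's value with the pixel probability of its level.
    return q[levels]
```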
The terminal partitions the color probability distribution map of each video frame image and uses it to generate the interference region and the motion region corresponding to the target. The terminal marks all areas outside the target's motion region as the interference region and sets it to a distinct color, such as black. In this way, during tracking, the target can be followed accurately within the delimited motion region, and the influence of the interference region on target tracking is effectively eliminated.
In one embodiment, the interference region includes a homochromatic interference region, and the step of partitioning the color probability distribution map specifically includes, as shown in fig. 5:
step 502, marking the region similar to the target color in the color probability distribution map as a homochromatic interference region.
Step 504, a mixture model corresponding to the plurality of homochromatic interference regions is established, and a probability density map containing the plurality of homochromatic interference regions is generated.
Step 506, the probability density map is partitioned to obtain a plurality of homochromatic interference areas and a motion area corresponding to the target.
The terminal partitions the video frame image using the established color space model, dividing it into a motion region corresponding to the target and an interference region, which may be a single region or a combination of several. Specifically, the terminal compares the color probability distribution map with regions of the video frame image whose color is similar to that of the target and marks such regions as same-color interference regions; these may be several mutually non-adjacent regions. Denoting the probability distribution map of a video frame image by F(x_i), it can be represented as:
F(x_i) = F_M(x_i) + F_I(x_i) (2)
where F_M(x_i) is the color probability distribution map of the motion region corresponding to the target, and F_I(x_i) is the color probability distribution map of the same-color interference regions, which may contain a plurality of such regions.
The terminal performs piecewise Gaussian mixture modeling on the color probability distribution maps of the plurality of same-color interference regions to obtain a probability density map containing those regions. The pixel at location x_i in the video frame image can be represented by a mixture model of K Gaussian distributions:
f(F_{i,t}) = Σ_{k=1}^{K} ω_{i,t,k} η_k(x_i, μ_{i,t,k}, Σ_{i,t,k}) (3)
where η_k(x_i, μ_{i,t,k}, Σ_{i,t,k}) is a Gaussian distribution function, μ_{i,t,k} is its mean, Σ_{i,t,k} is its covariance matrix, and ω_{i,t,k} is the weight of the k-th component, satisfying Σ_{k=1}^{K} ω_{i,t,k} = 1; here K takes the value 5. The resulting probability density map containing the plurality of same-color interference regions, denoted F_NewI(x_i), is:
F_NewI(x_i) = F_I(x_i) − f(F_{i,t}) (4)
where F_I(x_i) is the color probability distribution map of the same-color interference regions and f(F_{i,t}) is the mixture model formed by the K Gaussian distributions.
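The piecewise Gaussian mixture modeling of equations (3) and (4) might be sketched with scikit-learn as below. This is a hedged sketch: the patent does not fix the feature vector, so fitting the K = 5 components to the coordinates of interference-region pixels and rescaling the fitted density to the map's range are assumptions.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def interference_density_map(F_I, K=5):
    """F_I: 2-D color probability distribution map of the same-color
    interference regions. Returns F_NewI = F_I - f, per equation (4)."""
    ys, xs = np.nonzero(F_I > 0)
    samples = np.column_stack([xs, ys]).astype(float)  # subsample if large
    gmm = GaussianMixture(n_components=K, covariance_type="full").fit(samples)
    # Evaluate the fitted mixture density f at every pixel location.
    h, w = F_I.shape
    grid = np.column_stack([g.ravel() for g in np.meshgrid(np.arange(w), np.arange(h))])
    f = np.exp(gmm.score_samples(grid.astype(float))).reshape(h, w)
    f = f / f.max() * F_I.max()  # bring density to the map's scale (assumed)
    return F_I - f
```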
The terminal partitions the probability density map containing the plurality of same-color interference regions that corresponds to the video frame image, generates white connected-region images for the same-color interference regions, marks all regions outside them as the motion region corresponding to the target, and displays them on the screen in a distinct color, such as black. Partitioning the probability density map thus yields the motion region M corresponding to the target and the plurality of same-color interference regions I. If the right hand raised by the person in fig. 6-a is taken as the target to be tracked, then, as shown in fig. 6-b, the two white connected regions are the same-color interference regions I of the gesture target, and the black region outside them is the motion region M of the gesture target. In this way, interference from similarly colored regions on the gesture target can be eliminated during tracking.
In one embodiment, the method further comprises a re-detection step for target tracking loss. As shown in fig. 7, the step includes:
step 702, when the target is lost in the tracking process, marking the video frame image of the lost target as an abnormal image.
Step 704, obtaining a motion trajectory corresponding to the target.
And step 706, determining the lost position of the target by using the motion track.
And step 708, starting from the lost position, continuing to track the target.
When the target is lost during tracking, the terminal marks the video frame image in which the target was lost as an abnormal image. In continuous multi-frame video frame images, the terminal acquires the preset initial tracking window and, by the mean shift method and iteration, solves for the centroid position of the color space model corresponding to the target, obtaining the centroid position of the initial tracking window in the next video frame image, that is, the target's center position in that frame. Recording the center position across the continuous frames yields the motion trajectory of the target.
The terminal searches the abnormal images for regions whose color features are similar to those of the target, marks the found regions, and evaluates the marked similar-color region in each abnormal image one by one. When a marked similar-color region lies in a same-color interference region, the terminal assigns it a smaller lost-position probability; when it lies in the target's motion region, the terminal assigns a larger probability; and when it lies on the target's motion trajectory, the terminal likewise assigns a larger probability. The terminal sorts the lost-position probability values over the abnormal images, takes the position with the maximum probability as the lost position of the target, and resumes iterative tracking of the motion region corresponding to the target from that position. If the target is not found in the current abnormal image, the next video frame is read and the tracking and loss re-detection steps are repeated. Starting from the lost position, the terminal continues iterative tracking of the target's motion region, so the target can be re-acquired at any time during tracking.
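The lost-position scoring could be sketched as follows. The concrete probability values and the distance weighting are illustrative assumptions; the patent only fixes their ordering (smaller in same-color interference regions, larger in the motion region and near the motion trajectory).

```python
import numpy as np

def region_contains(region, point):
    """region is an (x, y, w, h) box; point is (x, y)."""
    x, y, w, h = region
    return x <= point[0] < x + w and y <= point[1] < y + h

def lost_position(candidates, trajectory, interference_regions, motion_region):
    """Score candidate (x, y) centers of similar-color regions found in an
    abnormal image and return the most probable lost position."""
    predicted = trajectory[-1]  # last recorded center along the trajectory
    best, best_p = None, -1.0
    for c in candidates:
        if any(region_contains(r, c) for r in interference_regions):
            p = 0.1  # same-color interference region: smaller probability
        elif region_contains(motion_region, c):
            p = 0.6  # target motion region: larger probability
        else:
            p = 0.3
        # Favor candidates close to the recorded motion trajectory.
        dist = np.hypot(c[0] - predicted[0], c[1] - predicted[1])
        p /= 1.0 + dist / 100.0
        if p > best_p:
            best, best_p = c, p
    return best, best_p
```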
In this embodiment, the pixels in the probability density map may be regarded as points with mass proportional to their brightness, so the terminal can solve, by the mean shift method and iteration, for the centroid position of the initial tracking window in the probability distribution map of the next frame, obtaining the center position of the target in the corresponding motion region. In addition, the tracking window may be given different shapes, sizes, and thresholds according to the target to be tracked. For example, if the detected target is a palm, the terminal sets the tracking window to a rectangular frame with side length d and records a center position only when the distance between the centers of two adjacent frames falls within the preset threshold range, that is, greater than d/4 and smaller than d. Recording the center positions of the target's motion region across continuous video frame images yields the series of center points traced by the target during its motion. In other embodiments the tracking window may be circular, square, or another shape.
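The center-recording rule for a tracking window of side length d might be sketched as:

```python
import math

def record_centers(centers, d):
    """Keep center positions whose distance to the previously recorded
    center lies in the preset range (d/4, d); d is the window side length."""
    recorded = [centers[0]]
    for c in centers[1:]:
        dist = math.hypot(c[0] - recorded[-1][0], c[1] - recorded[-1][1])
        if d / 4 < dist < d:
            recorded.append(c)
    return recorded
```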
In one embodiment, as shown in fig. 8, there is provided an apparatus for target tracking oriented to human-computer interaction, including: an obtaining module 802, a model building module 804, a partitioning module 806, and a tracking module 808, wherein:
an obtaining module 802, configured to read a video frame image and detect a corresponding target in the video frame image;
a model building module 804, configured to build a color space model corresponding to the target by using the color features;
a partitioning module 806, configured to partition the video frame image to obtain a motion region corresponding to the target;
and a tracking module 808, configured to track a motion region corresponding to the target.
In one embodiment, the partitioning module 806 is further configured to perform color space conversion on the video frame image to obtain pixel probabilities corresponding to multiple colors, then generate a color probability distribution map by using the pixel probabilities corresponding to the multiple colors, and finally partition the color probability distribution map to obtain an interference region and a motion region corresponding to the target.
In one embodiment, the partitioning module 806 is further configured to mark a region of the color probability distribution map similar to the target color as a homochromatic interference region, and the model establishing module 804 is configured to establish a mixture model corresponding to a plurality of homochromatic interference regions, and generate a probability density map including the plurality of homochromatic interference regions; the partitioning module 806 is configured to partition the probability density map to obtain a plurality of homochromatic interference regions and a motion region corresponding to the target.
In one embodiment, the tracking module 808 is further configured to locate the target through a color probability distribution map, mark a motion region in the color probability distribution map, and finally track the target in the continuous video frame images.
In one embodiment, as shown in fig. 9, the apparatus further comprises: loss re-detection module 910.
When the target is lost during tracking, the loss re-detection module 910 is configured to mark the video frame image in which the target was lost as an abnormal image; the obtaining module 902 is further configured to obtain the motion trajectory corresponding to the target and determine the lost position of the target using the motion trajectory; and, starting from the lost position, the tracking module 908 is further configured to continue tracking the target.
In one embodiment, the loss re-detection module 910 is further configured to search the multiple abnormal images for regions whose color features are similar to those of the target, compare the similar-color region in each abnormal image with the motion trajectory to obtain the probability of the target's lost position in each abnormal image, and determine the lost position of the target according to that probability.
For specific limitations of the device for tracking a target oriented to human-computer interaction, reference may be made to the above limitations of the target tracking method oriented to human-computer interaction, and details thereof are not repeated herein. All or part of the modules in the human-computer interaction-oriented target tracking device can be realized by software, hardware and a combination thereof. The modules can be embedded in a hardware form or independent from a processor in the computer device, and can also be stored in a memory in the computer device in a software form, so that the processor can call and execute operations corresponding to the modules.
In one embodiment, a computer device is provided, which may be a terminal, and its internal structure diagram may be as shown in fig. 10. The computer device includes a processor, a memory, a network interface, a display screen, and an input device connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device comprises a nonvolatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for the operation of an operating system and computer programs in the non-volatile storage medium. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program is executed by a processor to implement a human-computer interaction oriented object tracking method. The display screen of the computer equipment can be a liquid crystal display screen or an electronic ink display screen, and the input device of the computer equipment can be a touch layer covered on the display screen, a key, a track ball or a touch pad arranged on the shell of the computer equipment, an external keyboard, a touch pad or a mouse and the like.
Those skilled in the art will appreciate that the architecture shown in fig. 10 is merely a block diagram of some of the structures associated with the disclosed aspects and is not intended to limit the computing devices to which the disclosed aspects apply, as particular computing devices may include more or less components than those shown, or may combine certain components, or have a different arrangement of components.
In one embodiment, a computer device is provided, comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the steps of the above-described method embodiments being implemented when the computer program is executed by the processor.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by hardware related to instructions of a computer program, which can be stored in a non-volatile computer-readable storage medium and, when executed, can include the processes of the embodiments of the methods described above. Any reference to memory, storage, database, or other medium used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchronous link (Synchlink) DRAM (SLDRAM), Rambus RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
The technical features of the above embodiments can be arbitrarily combined, and for the sake of brevity, all possible combinations of the technical features in the above embodiments are not described, but should be considered as the scope of the present specification as long as there is no contradiction between the combinations of the technical features.
The above examples only express several embodiments of the present application, and the description thereof is more specific and detailed, but not construed as limiting the scope of the invention. It should be noted that, for a person skilled in the art, several variations and modifications can be made without departing from the concept of the present application, which falls within the scope of protection of the present application. Therefore, the protection scope of the present patent shall be subject to the appended claims.

Claims (10)

1. A human-computer interaction-oriented target tracking method, comprising the following steps:
reading a video frame image, and detecting a corresponding target in the video frame image;
extracting color features of the target, and establishing a color space model corresponding to the target by using the color features;
partitioning the video frame image to obtain a motion area corresponding to the target;
and tracking the motion area corresponding to the target by using the color space model.
2. The method of claim 1, wherein said partitioning the video frame image to obtain the motion region corresponding to the target comprises:
performing color space conversion on the video frame image to obtain pixel probabilities corresponding to various colors;
generating a color probability distribution map by using pixel probabilities corresponding to a plurality of colors;
and partitioning the color probability distribution map to obtain an interference region and a motion region corresponding to the target.
3. The method of claim 2, wherein the interference region comprises a homochromatic interference region; and wherein said partitioning the color probability distribution map comprises:
marking the area similar to the target color in the color probability distribution map as a homochromatic interference area;
establishing a mixture model corresponding to a plurality of homochromatic interference regions, and generating a probability density map containing the plurality of homochromatic interference regions;
and partitioning the probability density map to obtain the plurality of homochromatic interference regions and the motion region corresponding to the target.
4. The method of claim 2, wherein tracking the motion region corresponding to the target using the color space model comprises:
positioning the target through the color probability distribution map;
marking the motion area in the color probability distribution map;
the target is tracked in successive video frame images.
5. The method according to any one of claims 1-4, characterized in that the method further comprises:
when a target is lost in the tracking process, marking a video frame image of the lost target as an abnormal image;
acquiring a motion track corresponding to a target;
determining the lost position of the target by utilizing the motion track;
and starting from the lost position, continuing to track the target.
6. The method of claim 5, wherein determining the missing position of the object using the motion trajectory comprises:
searching areas similar to the color characteristics corresponding to the target in the multiple abnormal images;
comparing the area with similar color features in each abnormal image with the motion trail to obtain the probability of the lost position of the target in each abnormal image;
and determining the lost position of the target according to the probability.
7. An apparatus for tracking a target oriented to human-computer interaction, the apparatus comprising:
the acquisition module is used for reading the video frame image and detecting a corresponding target in the video frame image;
the model establishing module is used for establishing a color space model corresponding to the target by utilizing the color characteristics;
the partitioning module is used for partitioning the video frame image to obtain a motion area corresponding to the target;
and the tracking module is used for tracking the motion area corresponding to the target.
8. The human-computer interaction-oriented object tracking device of claim 7, further comprising:
the loss re-detection module is used for acquiring a motion track corresponding to the target when the target is lost in the tracking process and determining the lost position of the target by utilizing the motion track; and starting from the lost position, the tracking module is used for continuously tracking the target.
9. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the steps of the method of any of claims 1 to 6 are implemented by the processor when executing the computer program.
10. A computer-readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, carries out the steps of the method of any one of claims 1 to 6.
Filed as CN201910842488.5A on 2019-09-06 (priority date 2019-09-06): Target tracking method and device for human-computer interaction and computer equipment. Status: Active; granted as CN110675428B.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910842488.5A CN110675428B (en) 2019-09-06 2019-09-06 Target tracking method and device for human-computer interaction and computer equipment


Publications (2)

Publication Number Publication Date
CN110675428A (en) 2020-01-10
CN110675428B CN110675428B (en) 2023-02-28

Family

ID=69076624

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910842488.5A Active CN110675428B (en) 2019-09-06 2019-09-06 Target tracking method and device for human-computer interaction and computer equipment

Country Status (1)

Country Link
CN (1) CN110675428B (en)



Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2006244272A (en) * 2005-03-04 2006-09-14 Nippon Telegr & Teleph Corp <Ntt> Hand position tracking method, device and program
US20120154579A1 (en) * 2010-12-20 2012-06-21 International Business Machines Corporation Detection and Tracking of Moving Objects
CN103077539A (en) * 2013-01-23 2013-05-01 上海交通大学 Moving object tracking method under complicated background and sheltering condition
US20160379055A1 (en) * 2015-06-25 2016-12-29 Kodak Alaris Inc. Graph-based framework for video object segmentation and extraction in feature space
CN105046721A (en) * 2015-08-03 2015-11-11 南昌大学 Camshift algorithm for tracking centroid correction model on the basis of Grabcut and LBP (Local Binary Pattern)
CN109961456A (en) * 2017-12-22 2019-07-02 沈阳灵景智能科技有限公司 A kind of video frequency motion target tracking based on GPU parallel computation
CN109636834A (en) * 2018-11-22 2019-04-16 北京工业大学 Video frequency vehicle target tracking algorism based on TLD innovatory algorithm
CN109816692A (en) * 2019-01-11 2019-05-28 南京理工大学 A kind of motion target tracking method based on Camshift algorithm

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
Nishu Singla: "Motion Detection Based on Frame Difference Method", International Journal of Information & Computation Technology *
Chu et al.: "Target tracking algorithm based on occlusion detection and spatio-temporal context information" (基于遮挡检测和时空上下文信息的目标跟踪算法), Pattern Recognition and Artificial Intelligence (模式识别与人工智能) *
Zhang Yanbin et al.: "Deformable gesture tracking algorithm based on feature-space partition modeling" (基于特征空间切分建模的变形手势跟踪算法), Robot (机器人) *
Zhang Yanbin: "Research on gesture tracking and recognition technology for human-computer interaction" (面向人机交互的手势跟踪和识别技术研究), China Master's Theses Full-text Database, Information Science and Technology series (中国优秀硕士学位论文全文数据库(信息科技专辑)) *
Xu Xiaohang et al.: "Moving target tracking under complex background and occlusion" (复杂背景及遮挡条件下的运动目标跟踪), Opto-Electronic Engineering (光电工程) *

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114937231A (en) * 2022-07-21 2022-08-23 成都西物信安智能***有限公司 Method for improving target identification tracking accuracy

Also Published As

Publication number Publication date
CN110675428B (en) 2023-02-28


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant