
Multi-target tracking method, device, mobile terminal and computer storage medium

Info

Publication number
CN111291598A
Authority
CN
China
Prior art keywords
image
target
target object
window
original image
Prior art date
Legal status
Granted
Application number
CN201811494503.3A
Other languages
Chinese (zh)
Other versions
CN111291598B (en)
Inventor
胡荣东
罗水强
李智勇
肖德贵
Current Assignee
Changsha Intelligent Driving Research Institute Co Ltd
Original Assignee
Changsha Intelligent Driving Research Institute Co Ltd
Priority date
Filing date
Publication date
Application filed by Changsha Intelligent Driving Research Institute Co Ltd filed Critical Changsha Intelligent Driving Research Institute Co Ltd
Priority to CN201811494503.3A
Publication of CN111291598A
Application granted
Publication of CN111291598B
Status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/40 Scenes; Scene-specific elements in video content
    • G06V20/41 Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2411 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/30 Noise filtering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/44 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/46 Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
    • G06V10/462 Salient features, e.g. scale invariant feature transforms [SIFT]
    • G06V10/464 Salient features, e.g. scale invariant feature transforms [SIFT] using a plurality of salient features, e.g. bag-of-words [BoW] representations
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/50 Extraction of image or video features by performing operations within image blocks; by using histograms, e.g. histogram of oriented gradients [HoG]; by summing image-intensity values; Projection analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/56 Extraction of image or video features relating to colour

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a multi-target tracking method, a multi-target tracking apparatus, a mobile terminal and a computer storage medium. The multi-target tracking method comprises the following steps: acquiring an original image corresponding to the current road at a set sampling frequency, and dividing the original image into at least two image regions according to a set image division rule; determining the access frequency corresponding to each of the at least two image regions; performing image processing on the different image regions of the original image at their corresponding access frequencies to obtain the target object corresponding to each image region; and performing target tracking on the target object.

Description

Multi-target tracking method, device, mobile terminal and computer storage medium
Technical Field
The invention relates to the technical field of image processing, and in particular to a multi-target tracking method, a multi-target tracking apparatus, a mobile terminal and a computer storage medium.
Background
Video target tracking draws on computer vision, image sequence processing, pattern recognition, artificial intelligence and related fields, and has extremely wide application: surveillance of commercial venues such as malls, hotels and residential areas; monitoring of public places such as schools, hospitals, airports and stations; and machine-vision-based military guidance and aiming systems.
At present, the main task of target tracking is to determine the position of a tracked object across consecutive video frames so as to obtain the object's complete motion trajectory. Pedestrian tracking in complex traffic scenes is one of the key representative problems in the tracking field because of the pedestrian's highly deformable structure: traditional pedestrian detection methods perform poorly in complex motion environments and cannot guarantee real-time operation, while deep learning methods demand expensive hardware.
Disclosure of Invention
In view of the above, the present invention provides a multi-target tracking method, apparatus, mobile terminal and computer storage medium, which achieve fast, efficient and low-cost target tracking.
To achieve the above objective, the technical solution of the invention is realized as follows:
the embodiment of the invention provides a multi-target tracking method, which comprises the following steps:
acquiring an original image corresponding to a current road according to a set sampling frequency, and performing region division on the original image according to a set image division rule to obtain at least two image regions;
determining access frequencies respectively corresponding to the at least two image areas;
performing image processing on different image areas of the original image according to the corresponding access frequencies respectively to obtain a target object corresponding to each image area;
and carrying out target tracking on the target object.
Wherein performing region division on the original image according to the set image division rule to obtain at least two image regions includes:
when the two sides of the current lane are determined to be an opposite lane and a curb respectively, dividing the original image, according to a first image division rule, into a first image region containing the current lane and the opposite lane, and a second image region containing the curb; or,
when the two sides of the current lane are determined to be same-direction lanes and/or curbs, dividing the original image, according to a second image division rule, into a first image region containing the current lane, and a second image region and a third image region located on either side of the first image region.
Wherein, the acquiring the target object corresponding to each image area includes:
carrying out image scanning on the image area to obtain a plurality of candidate windows and window features corresponding to the candidate windows;
determining a target window from the candidate windows according to the window characteristics, and determining the corresponding position of the target window in the original image;
and acquiring a target object corresponding to each image area according to the corresponding position of the target window in the original image.
Wherein the determining a target window from the candidate windows according to the window features comprises:
determining a background filter response value, a root filter response value and a component filter response value respectively corresponding to the candidate windows based on the window characteristics;
determining a cascade order;
and according to the cascade sequence, sequentially accumulating the background filter response value, the root filter response value and the component filter response value for the candidate window to obtain a matching value corresponding to the candidate window, and determining the candidate window as a target window when the matching value is not less than a preset matching degree threshold value.
Wherein the target tracking for the target object comprises:
adding the target object into a tracking queue, and extracting the characteristics of the target object;
performing online model training based on the target object characteristics to generate an online model;
and scanning the target object based on the online model, and correspondingly adjusting the size of a target window corresponding to the target object according to the position coordinate change of the target object.
Wherein the target tracking of the target object comprises: determining that the position coordinates of the target object have reached the set boundary position, and rejecting the target object.
Wherein the target tracking for the target object further comprises:
judging whether the target object meets a set value or not based on a pre-trained offline classifier, and adding 1 to a counter corresponding to the target object when the target object meets the set value;
and when the value of the counter exceeds a set first threshold value, rejecting the target object corresponding to the counter.
Wherein the target tracking for the target object comprises:
and masking the corresponding position of the target object in the original image, and carrying out target tracking on the target object.
The embodiment of the invention provides a multi-target tracking device, which comprises:
the dividing module is used for acquiring an original image corresponding to the current road according to the set sampling frequency and carrying out region division on the original image according to the set image dividing rule to obtain at least two image regions;
the determining module is used for determining the access frequencies respectively corresponding to the at least two image areas;
the acquisition module is used for respectively carrying out image processing on different image areas of the original image according to the corresponding access frequencies to acquire a target object corresponding to each image area;
and the tracking module is used for tracking the target of the target object.
The embodiment of the invention provides a multi-target tracking device, which comprises: a processor and a memory for storing a computer program capable of running on the processor;
wherein the processor, when running the computer program, implements the multi-target tracking method according to any one of the embodiments of the invention.
The embodiment of the invention provides a mobile terminal which comprises a multi-target tracking device in any embodiment of the invention.
The embodiment of the invention provides a computer storage medium, wherein a computer program is stored in the computer storage medium, and when being executed by a processor, the computer program realizes the multi-target tracking method provided by any embodiment of the invention.
According to the multi-target tracking method, apparatus, mobile terminal and computer storage medium of the invention, the original image corresponding to the current road is acquired at the set sampling frequency, the original image is divided into at least two image regions according to the set image division rule, and the access frequency corresponding to each of the at least two image regions is determined; this reduces the per-frame overhead of processing the original image, while setting different access frequencies keeps a higher degree of attention on the important image regions. Each image region of the original image is then processed at its corresponding access frequency to obtain the target object corresponding to that region, and target tracking is performed on the target object. Obtaining the target object through image processing and then tracking it balances the overhead between image processing and tracking, ensures the continuity of target tracking, and realizes fast, efficient and low-cost target tracking.
Drawings
Fig. 1 is a schematic flowchart of a multi-target tracking method according to an embodiment of the present invention;
Fig. 2 is a diagram illustrating image region division according to an embodiment of the present invention;
Fig. 3 is a diagram illustrating image region division according to another embodiment of the present invention;
Fig. 4 is a schematic structural diagram of a multi-target tracking apparatus according to an embodiment of the present invention;
Fig. 5 is a schematic structural diagram of a multi-target tracking apparatus according to another embodiment of the present invention;
Fig. 6 is a schematic flowchart of a multi-target tracking method according to another embodiment of the present invention.
Detailed Description
The technical solutions of the invention are further elaborated below with reference to the drawings and specific embodiments. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. The terminology used herein is for the purpose of describing particular embodiments only and is not intended to limit the invention. As used herein, the term "and/or" includes any and all combinations of one or more of the associated listed items.
As shown in fig. 1, an embodiment of the present invention provides a multi-target tracking method applied to a mobile terminal, where the mobile terminal may be a driving recorder (dashboard camera), a mobile phone, a tablet, or the like.
The method comprises the following steps:
step 101: acquiring an original image corresponding to a current road according to a set sampling frequency, and performing region division on the original image according to a set image division rule to obtain at least two image regions;
here, a camera is generally employed to acquire an original image. The sampling frequency, which defines the number of samples extracted from a continuous signal per second and forming a discrete signal, is expressed in hertz (Hz), and the magnitude of the sampling frequency is generally related to the height of pixels of the camera, for example, the higher the pixel, the higher the sampling frequency of the camera.
Here, the image division rule is a rule for dividing an image into regions according to position in two-dimensional image coordinates. Dividing the original image into at least two image regions according to the set image division rule means splitting the original image into the configured regions by their corresponding coordinate areas. For example, the original image may be divided by coordinate area into three equal parts, image region a, image region b and image region c, so that the original frame becomes three one-third-sized image regions.
Step 102: determining access frequencies respectively corresponding to the at least two image areas;
here, the access frequency refers to a frequency of scanning the corresponding image area. For example, the original image is divided into three equal parts, i.e., an image area a, an image area b, and an image area c, and the access frequencies f1, f2, and f3 of a, b, and c are set according to scene requirements, where f1+ f2+ f3 is 1, and it is assumed that the access frequency of the image area b can be increased if the image area b is more important among the image area a, the image area b, and the image area c, for example, f1 is set to 0.25, f2 is set to 0.5, and f3 is set to 0.25. Therefore, three image areas of three different pictures can be scanned within the time of scanning one frame of image of the original image.
Step 103: performing image processing on different image areas of the original image according to the corresponding access frequencies respectively to obtain a target object corresponding to each image area;
here, the image processing refers to an image target recognition technology, and may be a scanning image area, and when extracting individual features in an image, a template matching model may be adopted, and the features may be classified into different categories according to different classification methods.
Here, acquiring the target object corresponding to each image area means that the target object corresponding to each image area is obtained by performing image processing on each image area. Generally referred to herein as a pedestrian. For example, by extracting image features, a template matching model is adopted for pedestrian features, non-pedestrian features are filtered one by one through various types of classifiers after training, and finally a target object corresponding to the pedestrian features and a corresponding coordinate area of the target object in an original image are obtained. The classifier here may be an Adaboost classifier, a root filter, a component filter, etc. And determining the target object by adopting different classifiers and different collocation modes.
Step 104: and carrying out target tracking on the target object.
Here, performing target tracking on the target object means tracking according to the coordinate area of the determined target object in the original image: the coordinate area corresponding to the target object is adjusted in real time, ensuring that the adjusted coordinate area keeps corresponding to the target object.
In the above embodiment of the present application, the original image corresponding to the current road is acquired at the set sampling frequency, the original image is divided into at least two image regions according to the set image division rule, and the access frequency corresponding to each of the at least two image regions is determined; this reduces the per-frame overhead of processing the original image, while setting different access frequencies keeps a higher degree of attention on the critical image regions. Each image region of the original image is then processed at its corresponding access frequency to obtain the target object corresponding to that region, and target tracking is performed on the target object. Obtaining the target object through image processing and then tracking it balances the overhead between image processing and tracking, ensures the continuity of target tracking, and realizes fast, efficient and low-cost target tracking.
In an embodiment, the performing region division on the original image according to the set image division rule to obtain at least two image regions includes:
when the two sides of the current lane are determined to be an opposite lane and a curb respectively, dividing the original image, according to a first image division rule, into a first image region containing the current lane and the opposite lane, and a second image region containing the curb; or,
when the two sides of the current lane are determined to be same-direction lanes and/or curbs, dividing the original image, according to a second image division rule, into a first image region containing the current lane, and a second image region and a third image region located on either side of the first image region.
Here, referring to fig. 2, when the lanes on the two sides of the current lane are determined to be an opposite lane and a curb respectively, the original image is divided, according to the coordinates of the original image, into a first image region (a) containing the current lane and the opposite lane, and a second image region (b) containing the curb. For pedestrian-safety detection, a higher access frequency can then be set for the first image region according to this division of important regions, ensuring that both the driving direction and the opposite lane receive a higher access frequency.
Here, referring to fig. 3, when the lanes on the two sides of the current lane are determined to be a same-direction lane and/or a curb, the original image is divided, according to the coordinates of the original image, into a first image region (a) containing the current lane in the driving direction, and a second image region (b) containing the same-direction lane and a third image region (c) containing the curb on either side of the first image region. For pedestrian-safety detection, a higher access frequency can be set for the first image region, and a reduced access frequency for the second and third image regions, ensuring a higher access frequency for the current lane in the driving direction.
In the embodiment of the application above, the original image is divided into two or three image regions according to different image division rules, so that target tracking remains fast, efficient and low-cost while its continuity is ensured.
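As an illustration of the two division rules, the following sketch splits an image into named regions by column ranges; the split proportions (thirds of the image width) are an assumption, since the embodiments only specify that regions are defined by coordinate areas:

```python
import numpy as np

def divide_image(image: np.ndarray, layout: str) -> dict:
    """Split a road image into named regions.

    layout == "opposite": region a covers the current lane plus the opposite
    lane, region b covers the curb side (Fig. 2).
    layout == "same": three vertical strips, with the current lane (a) in the
    middle and regions b and c on either side (Fig. 3).
    """
    w = image.shape[1]
    if layout == "opposite":
        return {"a": image[:, : 2 * w // 3], "b": image[:, 2 * w // 3:]}
    return {
        "b": image[:, : w // 3],
        "a": image[:, w // 3: 2 * w // 3],
        "c": image[:, 2 * w // 3:],
    }
```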
In an embodiment, the acquiring the target object corresponding to each image region includes:
carrying out image scanning on the image area to obtain a plurality of candidate windows and window features corresponding to the candidate windows;
determining a target window from the candidate windows according to the window characteristics, and determining the corresponding position of the target window in the original image;
and acquiring a target object corresponding to each image area according to the corresponding position of the target window in the original image.
Here, a candidate window corresponds to the pixel size of the scan. For example, the classifier determines a scanning size of 32×64, so the corresponding image region is divided into a number of candidate windows of 32×64 pixels, and each candidate window is scanned to obtain its corresponding features, which may be color features, HOG features, FHOG features, and so on.
Here, determining the target window from the candidate windows according to the window features means selecting, as the target window, a candidate window whose features satisfy a condition, for example that the features match a pre-trained template with at least a set matching value. Determining the corresponding position of the target window in the original image means mapping the determined target window back to its image region and obtaining the corresponding position in the original image, the position being a coordinate area in two-dimensional coordinates.
Here, acquiring the target object corresponding to each image region according to the corresponding position of the target window in the original image means obtaining the position coordinates of each target window and determining the target object corresponding to it; the target object comprises the target window and the window's corresponding coordinate position in the original image.
In the embodiment of the application above, each image region is divided into a number of candidate windows, and the target window is determined from the candidate windows and their corresponding features, thereby determining the corresponding target object in the original image and detecting it quickly.
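A minimal sketch of the candidate-window generation described above, assuming a simple sliding-window scan (the stride value is an assumption; the feature descriptor applied to each window is left to the classifier stage):

```python
import numpy as np

def scan_region(region: np.ndarray, win_w: int = 32, win_h: int = 64,
                stride: int = 16):
    """Yield ((x, y), window) pairs; (x, y) is the window's offset inside
    the region, so adding the region's own offset maps a kept window back
    to its position in the original image."""
    h, w = region.shape[:2]
    for y in range(0, h - win_h + 1, stride):
        for x in range(0, w - win_w + 1, stride):
            yield (x, y), region[y:y + win_h, x:x + win_w]
```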
In one embodiment, the determining a target window from the candidate windows according to the window features includes:
determining a background filter response value, a root filter response value and a component filter response value respectively corresponding to the candidate windows based on the window characteristics;
determining a cascade order;
and according to the cascade sequence, sequentially accumulating the background filter response value, the root filter response value and the component filter response value for the candidate window to obtain a matching value corresponding to the candidate window, and determining the candidate window as a target window when the matching value is not less than a preset matching degree threshold value.
Determining the background filter response value, root filter response value and component filter response value corresponding to each candidate window based on the window features means training the background filter, root filter and component filter with the window features and computing the background filter response value, root filter response value and component filter response value; the training of these filters yields samples for the target object. For example, training the root filter/component filter may specifically include optimizing the root filter/component filter with the window features through positive-sample re-labeling, hard-example data mining, stochastic gradient descent optimization, and the like.
Here, the cascade order is the order in which the background filter response value, root filter response value and component filter response value are accumulated in turn during cascade detection; a reasonable cascade order improves the efficiency of computing the target object.
Here, accumulating the background filter response value, root filter response value and component filter response value for a candidate window in cascade order to obtain its matching value means that background content is quickly filtered out by the background filter, the root filter is then applied, and finally the component filter obtains the pedestrian's precise position. For example, the background filter uses HOG features and an Adaboost classifier with a minimum scanning scale of 32×64 to quickly filter out background objects; a surviving candidate window is then resized to 128×256 for the root filter, which screens with 31-dimensional FHOG features (FHOG is a compressed form of the HOG feature that further speeds up scanning); and the component filter finally determines whether the candidate window satisfies the condition, i.e. whether it is the target window. The target window is determined once the matching value confirmed by the three filters in turn satisfies the preset matching degree threshold.
In the above embodiment of the application, a corresponding cascade order is set, and the background filter, root filter and component filter screen the candidate windows in the image region to finally obtain the target window.
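The cascade can be sketched as follows; the filter objects and their score()/stage_threshold members are hypothetical stand-ins for the trained background, root and component filters, and only the accumulate-and-early-reject structure is taken from the text:

```python
def cascade_match(window_feats, filters, match_threshold: float):
    """Accumulate filter responses in cascade order; return the final match
    value if the window survives every stage and reaches the preset matching
    degree threshold, else None (candidate window rejected)."""
    match = 0.0
    for f in filters:                      # e.g. [background, root, component]
        match += f.score(window_feats)     # hypothetical per-stage response
        if match < f.stage_threshold:      # cheap stages reject most windows
            return None
    return match if match >= match_threshold else None
```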
In one embodiment, the target tracking for the target object includes:
adding the target object into a tracking queue, and extracting the characteristics of the target object;
performing online model training based on the target object characteristics to generate an online model;
and scanning the target object based on the online model, and correspondingly adjusting the size of a target window corresponding to the target object according to the position coordinate change of the target object.
Here, the online model training based on the target object features may be the same as the training based on candidate window features in the above embodiments of the application. Further, FHOG features and color features may be used, for example extracting the target's color features and training the online model in combination with the FHOG features. Adjusting the size of the target window corresponding to the target object according to the change of its position coordinates means resizing the window, based on the position coordinates determined in real time while tracking, after the target moves and is displaced in the original image. For example, if the position coordinates of the target window grow, the target object appears larger in the real scene: the closer it is to the camera, the larger the target window is correspondingly adjusted, so that the window size keeps corresponding to the target object.
Further, coarse-to-fine, unequally spaced scales whose weights conform to a Gaussian distribution may be set for the resizing, for example a primary (coarse) scale and a secondary (fine) scale, with the corresponding weighted confidences adjusted to follow the Gaussian distribution. When the target object is far from the camera, the primary scale is used and the adjustment step is large; when it is close to the target position, the more refined secondary scale is used. This makes each adjustment of the target window progressively finer, so that the window can accurately locate the target object as its size is adjusted.
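A sketch of such a coarse-to-fine scale search, with illustrative scale values and an assumed Gaussian sigma:

```python
import numpy as np

# Candidate scale factors are spaced unevenly (coarse far from 1.0, fine
# near 1.0) and each scale's response is weighted by a Gaussian prior
# centred on "no size change"; the particular values are assumptions.
SCALES = np.array([0.80, 0.95, 1.00, 1.05, 1.25])
WEIGHTS = np.exp(-((SCALES - 1.0) ** 2) / (2 * 0.1 ** 2))

def best_scale(responses) -> float:
    """responses[i] is the online model's response for the target window
    resized by SCALES[i]; returns the Gaussian-weighted best scale factor."""
    return float(SCALES[int(np.argmax(np.asarray(responses) * WEIGHTS))])
```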
In one embodiment, the target tracking of the target object includes: determining that the position coordinates of the target object have reached the set boundary position, and rejecting the target object.
The boundary position is an edge position of the original image. Since the target object has a motion trend, when its position coordinates reach the set boundary position during tracking, the target object is removed from the target tracking queue and the original image is rescanned instead of continuing to track it, which reduces unnecessary computation and improves tracking efficiency.
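A minimal sketch of this boundary test (the margin is an assumption; the embodiment only says the position coordinates reach the set boundary position):

```python
def at_boundary(box, img_w: int, img_h: int, margin: int = 5) -> bool:
    """box = (x, y, w, h) in original-image coordinates; True means the
    tracked object has reached the image edge and should be removed."""
    x, y, w, h = box
    return (x <= margin or y <= margin
            or x + w >= img_w - margin or y + h >= img_h - margin)
```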
In one embodiment, the performing target tracking on the target object further includes:
judging whether the target object meets a set value or not based on a pre-trained offline classifier, and adding 1 to a counter corresponding to the target object when the target object meets the set value;
and when the value of the counter exceeds a set first threshold value, rejecting the target object corresponding to the counter.
Here, the offline classifier is a classifier trained in advance for screening target objects; HOG features and an SVM classifier can be adopted to re-confirm each tracked target during the tracking process, preventing target drift and loss.
Judging whether the target object meets a set value based on the pre-trained offline classifier, and adding 1 to the counter corresponding to the target object when it does, means judging the target object again: if the offline classifier judges that it is indeed the target object, its counter is set to 0; otherwise the counter is incremented by 1. For example, the set value may be "not a pedestrian": if the offline classifier judges that the target object is not a pedestrian, 1 is added to the counter corresponding to that target object.
Here, the first threshold can be set as desired. Since the preceding filters inevitably make occasional errors when determining the target object, albeit with small probability, the first threshold is generally set greater than 3. For example, with a first threshold of 5, when the offline classifier has judged 5 times that a target object is not a pedestrian, that object is removed from the tracked targets and is no longer tracked. In this way, each tracked target is judged again during tracking, preventing target drift and target loss.
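A sketch of this counter logic; the target and offline_classifier objects are hypothetical wrappers (e.g. around a HOG + SVM model), and only the reset/increment/reject behaviour is taken from the text:

```python
REJECT_THRESHOLD = 5  # the example first threshold given in the text

def verify_target(target, offline_classifier) -> bool:
    """Re-check one tracked target with the offline classifier each frame.
    `target` (with .window and .counter) and `offline_classifier` (with an
    is_pedestrian() method) are hypothetical names. Returns False once the
    target should be removed from the tracking queue."""
    if offline_classifier.is_pedestrian(target.window):
        target.counter = 0                 # confirmed: reset the counter
    else:
        target.counter += 1                # judged "not a pedestrian"
    return target.counter < REJECT_THRESHOLD
```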
In one embodiment, the target tracking for the target object includes:
and masking the corresponding position of the target object in the original image, and carrying out target tracking on the target object.
Here, masking means that, through image processing, the target object no longer appears in the original image of the current road in the next frame; that is, once an object has been determined to be a target object, it is not put through image processing a second time when the camera acquires the next frame, but is tracked in real time as a tracking object. This greatly shortens image processing time, alleviates the problem of discontinuous image processing, and to a certain extent eases pedestrian detection in congested environments.
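One simple way to realize the masking, sketched with OpenCV; the embodiment does not prescribe how the masking is implemented, and filling the tracked boxes with a constant colour is just one possibility:

```python
import cv2
import numpy as np

def mask_tracked_targets(image: np.ndarray, tracked_boxes) -> np.ndarray:
    """Black out the boxes of targets already in the tracking queue so the
    next detection pass does not re-process them."""
    masked = image.copy()
    for (x, y, w, h) in tracked_boxes:     # (x, y, w, h) in image coordinates
        cv2.rectangle(masked, (x, y), (x + w, y + h), (0, 0, 0), thickness=-1)
    return masked
```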
In another embodiment, as shown in fig. 4, there is also provided a multi-target tracking apparatus, which includes a dividing module 21, a determining module 22, an obtaining module 23 and a tracking module 24, wherein:
the dividing module 21 is configured to obtain an original image corresponding to a current road according to a set sampling frequency, and perform area division on the original image according to a set image dividing rule to obtain at least two image areas;
the determining module 22 is configured to determine access frequencies respectively corresponding to the at least two image areas;
the obtaining module 23 is configured to perform image processing on different image areas of the original image according to the corresponding access frequencies, and obtain a target object corresponding to each image area;
the tracking module 24 is configured to perform target tracking on the target object.
In the above embodiment of the present application, the original image corresponding to the current road is acquired at the set sampling frequency, the original image is divided into at least two image regions according to the set image division rule, and the access frequency corresponding to each of the at least two image regions is determined; this reduces the per-frame overhead of processing the original image, while setting different access frequencies keeps a higher degree of attention on the critical image regions. Each image region of the original image is then processed at its corresponding access frequency to obtain the target object corresponding to that region, and target tracking is performed on the target object. Obtaining the target object through image processing and then tracking it balances the overhead between image processing and tracking, ensures the continuity of target tracking, and realizes fast, efficient and low-cost target tracking.
Optionally, the dividing module 21 is further configured to: when the two sides of the current lane are determined to be an opposite lane and a curb respectively, divide the original image, according to a first image division rule, into a first image region containing the current lane and the opposite lane, and a second image region containing the curb; or,
when the two sides of the current lane are determined to be same-direction lanes and/or curbs, divide the original image, according to a second image division rule, into a first image region containing the current lane, and a second image region and a third image region located on either side of the first image region.
Optionally, the obtaining module 23 is further configured to perform image scanning on the image region, and obtain a plurality of candidate windows and window features corresponding to the candidate windows;
determining a target window from the candidate windows according to the window characteristics, and determining the corresponding position of the target window in the original image;
and acquiring a target object corresponding to each image area according to the corresponding position of the target window in the original image.
Optionally, the obtaining module 23 is further configured to determine, based on the window features, a background filter response value, a root filter response value, and a component filter response value respectively corresponding to the candidate windows;
determining a cascade order;
and according to the cascade sequence, sequentially accumulating the background filter response value, the root filter response value and the component filter response value for the candidate window to obtain a matching value corresponding to the candidate window, and determining the candidate window as a target window when the matching value is not less than a preset matching degree threshold value.
Optionally, the tracking module 24 is further configured to add the target object into a tracking queue, and extract features of the target object;
performing online model training based on the target object characteristics to generate an online model;
and scanning the target object based on the online model, and correspondingly adjusting the size of a target window corresponding to the target object according to the position coordinate change of the target object.
Optionally, the tracking module 24 is further configured to determine that the position coordinate of the target object reaches a set boundary position, and reject the target object.
Optionally, the tracking module 24 is further configured to determine whether the target object meets a set value based on a pre-trained offline classifier, and add 1 to a counter corresponding to the target object when the target object meets the set value;
and when the value of the counter exceeds a set first threshold value, rejecting the target object corresponding to the counter.
Optionally, the tracking module 24 is further configured to perform target tracking on the target object by masking a corresponding position of the target object in the original image.
In another embodiment, as shown in fig. 5, there is also provided a multi-target tracking apparatus including: at least one processor 210 and a memory 211 for storing computer programs capable of running on the processor 210. The single processor 210 illustrated in fig. 5 does not mean that the number of processors is one; it only indicates the processor's positional relationship to the other devices, and in practical applications there may be one or more processors. The same applies to the memory 211 illustrated in fig. 5: in practical applications there may be one or more memories.
Wherein the processor 210, when running the computer program, executes the following steps:
acquiring an original image corresponding to a current road according to a set sampling frequency, and performing region division on the original image according to a set image division rule to obtain at least two image regions;
determining access frequencies respectively corresponding to the at least two image areas;
performing image processing on different image areas of the original image according to the corresponding access frequencies respectively to obtain a target object corresponding to each image area;
and carrying out target tracking on the target object.
In an alternative embodiment, the processor 210 is further configured to execute the following steps when the computer program runs:
when the two sides of the current lane are determined to be an opposite lane and a curb respectively, dividing the original image, according to a first image division rule, into a first image region containing the current lane and the opposite lane, and a second image region containing the curb; or,
when the two sides of the current lane are determined to be same-direction lanes and/or curbs, dividing the original image, according to a second image division rule, into a first image region containing the current lane, and a second image region and a third image region located on either side of the first image region.
In an alternative embodiment, the processor 210 is further configured to execute the following steps when the computer program runs:
carrying out image scanning on the image area to obtain a plurality of candidate windows and window features corresponding to the candidate windows;
determining a target window from the candidate windows according to the window characteristics, and determining the corresponding position of the target window in the original image;
and acquiring a target object corresponding to each image area according to the corresponding position of the target window in the original image.
In an alternative embodiment, the processor 210 is further configured to execute the following steps when the computer program runs:
determining a background filter response value, a root filter response value and a component filter response value respectively corresponding to the candidate windows based on the window characteristics;
determining a cascade order;
and according to the cascade sequence, sequentially accumulating the background filter response value, the root filter response value and the component filter response value for the candidate window to obtain a matching value corresponding to the candidate window, and determining the candidate window as a target window when the matching value is not less than a preset matching degree threshold value.
In an alternative embodiment, the processor 210 is further configured to execute the following steps when the computer program runs:
adding the target object into a tracking queue, and extracting the characteristics of the target object;
performing online model training based on the target object characteristics to generate an online model;
and scanning the target object based on the online model, and correspondingly adjusting the size of a target window corresponding to the target object according to the position coordinate change of the target object.
In an alternative embodiment, the processor 210 is further configured to execute the following steps when the computer program runs:
and determining that the position coordinates of the target object reach the set boundary position, and rejecting the target object.
In an alternative embodiment, the processor 210 is further configured to execute the following steps when the computer program runs:
judging whether the target object meets a set value or not based on a pre-trained offline classifier, and adding 1 to a counter corresponding to the target object when the target object meets the set value;
and when the value of the counter exceeds a set first threshold value, rejecting the target object corresponding to the counter.
In an alternative embodiment, the processor 210 is further configured to execute the following steps when the computer program runs:
and masking the corresponding position of the target object in the original image, and carrying out target tracking on the target object.
The multi-target tracking apparatus further includes at least one network interface 212. The various components of the apparatus are coupled together by a bus system 213. It will be appreciated that the bus system 213 enables communication among these components; in addition to the data bus, it includes a power bus, a control bus and a status signal bus. For clarity of illustration, however, the various buses are all labeled as the bus system 213 in fig. 5.
The memory 211 may be a volatile memory or a nonvolatile memory, or may include both volatile and nonvolatile memories. The nonvolatile memory may be a Read-Only Memory (ROM), a Programmable Read-Only Memory (PROM), an Erasable Programmable Read-Only Memory (EPROM), an Electrically Erasable Programmable Read-Only Memory (EEPROM), a ferromagnetic random access memory (FRAM), a Flash Memory, a magnetic surface memory, an optical disk, or a Compact Disc Read-Only Memory (CD-ROM); the magnetic surface memory may be disk storage or tape storage. The volatile memory may be a Random Access Memory (RAM), which acts as an external cache. By way of illustration and not limitation, many forms of RAM are available, such as Static Random Access Memory (SRAM), Synchronous Static Random Access Memory (SSRAM), Dynamic Random Access Memory (DRAM), Synchronous Dynamic Random Access Memory (SDRAM), Double Data Rate Synchronous Dynamic Random Access Memory (DDR SDRAM), Enhanced Synchronous Dynamic Random Access Memory (ESDRAM), SyncLink Dynamic Random Access Memory (SLDRAM) and Direct Rambus Random Access Memory (DRRAM). The memory 211 described in connection with the embodiments of the invention is intended to comprise, without being limited to, these and any other suitable types of memory.
The memory 211 in the embodiment of the present invention is used to store various types of data to support the operation of the apparatus. Examples of such data include any computer program used to operate on the apparatus, such as an operating system and application programs. The operating system includes various system programs, such as a framework layer, a core library layer and a driver layer, used for implementing various basic services and processing hardware-based tasks. The application programs may include various applications for implementing various application services. The program implementing the method of the embodiments of the invention may be included in an application program.
The embodiment further provides a computer storage medium, for example including the memory 211 storing a computer program, which can be executed by the processor 210 in the apparatus to perform the steps of the foregoing method. The computer storage medium may be a FRAM, ROM, PROM, EPROM, EEPROM, Flash Memory, magnetic surface memory, optical disk or CD-ROM, or various devices including one or any combination of the above memories, such as a smart phone, a tablet computer, a notebook computer, and the like. The computer program stored in the computer storage medium, when executed by a processor, performs the following steps:
acquiring an original image corresponding to a current road according to a set sampling frequency, and performing region division on the original image according to a set image division rule to obtain at least two image regions;
determining access frequencies respectively corresponding to the at least two image areas;
performing image processing on different image areas of the original image according to the corresponding access frequencies respectively to obtain a target object corresponding to each image area;
and carrying out target tracking on the target object.
In an alternative embodiment, the computer program, when executed by the processor, further performs the steps of:
when the two sides of the current lane are determined to be an opposite lane and a curb respectively, dividing the original image, according to a first image division rule, into a first image region containing the current lane and the opposite lane, and a second image region containing the curb; or,
when the two sides of the current lane are determined to be same-direction lanes and/or curbs, dividing the original image, according to a second image division rule, into a first image region containing the current lane, and a second image region and a third image region located on either side of the first image region.
In an alternative embodiment, the computer program, when executed by the processor, further performs the steps of:
carrying out image scanning on the image area to obtain a plurality of candidate windows and window features corresponding to the candidate windows;
determining a target window from the candidate windows according to the window characteristics, and determining the corresponding position of the target window in the original image;
and acquiring a target object corresponding to each image area according to the corresponding position of the target window in the original image.
In an alternative embodiment, the computer program, when executed by the processor, further performs the steps of:
determining a background filter response value, a root filter response value and a component filter response value respectively corresponding to the candidate windows based on the window characteristics;
determining a cascade order;
and according to the cascade sequence, sequentially accumulating the background filter response value, the root filter response value and the component filter response value for the candidate window to obtain a matching value corresponding to the candidate window, and determining the candidate window as a target window when the matching value is not less than a preset matching degree threshold value.
In an alternative embodiment, the computer program, when executed by the processor, further performs the steps of:
adding the target object into a tracking queue, and extracting the characteristics of the target object;
performing online model training based on the target object characteristics to generate an online model;
and scanning the target object based on the online model, and correspondingly adjusting the size of a target window corresponding to the target object according to the position coordinate change of the target object.
In an alternative embodiment, the computer program, when executed by the processor, further performs the steps of:
and determining that the position coordinates of the target object reach the set boundary position, and rejecting the target object.
In an alternative embodiment, the computer program, when executed by the processor, further performs the steps of:
judging whether the target object meets a set value or not based on a pre-trained offline classifier, and adding 1 to a counter corresponding to the target object when the target object meets the set value;
and when the value of the counter exceeds a set first threshold value, rejecting the target object corresponding to the counter.
In an alternative embodiment, the computer program, when executed by the processor, further performs the steps of:
and masking the corresponding position of the target object in the original image, and carrying out target tracking on the target object.
The following describes the operation of the multi-target tracking method in further detail through an alternative embodiment, with reference to fig. 6 and taking a pedestrian as the target object. The multi-target tracking method comprises the following steps:
Step S1: the camera collects an original image in real time; the camera acquires the corresponding original image according to the set acquisition frequency;
Step S2: select regions where targets may appear, or exclude regions of no interest (if there are none, do nothing); the sky portion can be removed directly by manually adjusting the camera's shooting angle, reducing unnecessary image processing;
Step S3: divide the original image into three blocks and set the polling frequency of each block; here, the polling frequency is the access frequency of each image region, and the original image is divided into three image regions;
Step S4: select the image region to be processed currently;
Step S5: mask the target objects already in the tracking queue; this step does nothing during the first pass of image processing and is executed once target objects have been determined;
performing image processing on different image areas of the original image according to the corresponding access frequencies respectively to obtain a target object corresponding to each image area;
specifically, the method comprises the following steps:
Step S61: screening by the background filter; the image is scanned with a 32×64 window, dividing the corresponding image area into multiple candidate windows of 32×64 pixels, and the background filter judges whether each candidate window contains a pedestrian; if not, the candidate window is eliminated; if yes, go to step S62;
Step S62: restoring the candidate window to 128×256 size;
Step S63: judging by the root filter;
Step S64: when the root filter response value does not meet the set threshold, eliminating the candidate window; when the condition is met, executing step S65;
Step S65: performing component filter detection within the root filter's candidate window;
Step S66: combining the weighted response values of the root filter and the component filters; if the combined value does not meet the set threshold, eliminating the candidate window; if it does, determining the candidate window as the target window and executing step S67;
Step S67: mapping back to the position in the original image and adding the target to the tracking queue; that is, the corresponding position of the target window in the original image is determined, and the target object is added to the tracking queue.
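Putting steps S61 to S67 together, a hedged Python sketch of the detection cascade over one image region; the three filters are assumed to be score-returning callables, and the scan stride, thresholds and `cv2` resizing are illustrative choices rather than the patent's implementation:

```python
import cv2

def detect_pedestrians(region, region_offset, bg_filter, root_filter,
                       part_filters, root_thresh, combined_thresh, step=16):
    """Cascade detection over one image region. region_offset = (ox, oy)
    maps region coordinates back to the original image (step S67)."""
    H, W = region.shape[:2]
    ox, oy = region_offset
    targets = []
    for y in range(0, H - 64 + 1, step):
        for x in range(0, W - 32 + 1, step):
            patch = region[y:y + 64, x:x + 32]                  # 32x64 window (S61)
            if bg_filter(patch) <= 0:
                continue                                        # eliminated by background filter
            big = cv2.resize(patch, (128, 256))                 # restore to 128x256 (S62)
            root_score = root_filter(big)                       # root filter (S63)
            if root_score < root_thresh:
                continue                                        # S64
            part_score = sum(f(big) for f in part_filters)      # component filters (S65)
            if root_score + part_score < combined_thresh:
                continue                                        # S66
            targets.append((x + ox, y + oy, 32, 64))            # map back to original image (S67)
    return targets
```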
Step S7: target tracking is performed on the target object;
specifically, the method comprises the following steps:
Step S71: training an online model; online model training is performed for each target object added to the tracking queue, and a corresponding counter is set for each target object;
Step S72: searching in the next frame, and setting the counter of each target object newly added to the tracking queue to 0;
Step S73: setting four scales and adjusting the size of the target window; the target object is scanned based on the online model, and the size of the target window corresponding to the target object is adjusted across the four scales according to the change in the target object's position coordinates;
Step S74: resetting the picture size to 64×128;
Step S75: judging whether the target is a pedestrian using the pre-trained offline classifier; if yes, go to step S77; otherwise, go to step S76;
Step S76: incrementing the counter by 1;
Step S77: resetting the counter to 0;
Step S78: when the counter is greater than or equal to 5, removing the target object from the tracking queue;
Step S79: judging whether the target object has reached the boundary; when the target object reaches the boundary position of the original image, it is removed from the tracking queue; otherwise, step S5 is executed on the next frame and step S71 is re-executed.
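A per-frame tracking update for one queued target, combining steps S71 to S79 into a single sketch; it reuses `update_target_window` and `crop` from the earlier sketch, and the `Target` record, the boundary test and the threshold of 5 are assumptions drawn from the embodiment:

```python
from dataclasses import dataclass

import cv2

@dataclass
class Target:
    window: tuple      # (x, y, w, h) in original-image coordinates
    counter: int = 0   # consecutive offline-classifier misses (S72 starts at 0)

def reaches_boundary(window, frame_size):
    x, y, w, h = window
    W, H = frame_size
    return x <= 0 or y <= 0 or x + w >= W or y + h >= H

def track_one_frame(target, frame, online_model, offline_classifier, max_misses=5):
    """Returns True to keep the target in the tracking queue, False to remove it."""
    target.window = update_target_window(online_model, frame, target.window)  # S73
    patch = cv2.resize(crop(frame, target.window), (64, 128))                 # S74
    if offline_classifier(patch):                                             # S75
        target.counter = 0                                                    # S77
    else:
        target.counter += 1                                                   # S76
    if target.counter >= max_misses:                                          # S78
        return False
    if reaches_boundary(target.window, (frame.shape[1], frame.shape[0])):     # S79
        return False
    return True
```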
Step S8: outputting the position information of all targets (the target objects in the tracking queue), including each target object's target window, the target object corresponding to that window, and the window's position coordinates in the original image.
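Finally, the region polling of step S3 can be made concrete with a small scheduler; the region names and polling periods below are illustrative assumptions (a higher access frequency corresponds to a smaller period in frames):

```python
def regions_due(frame_index, polling_periods):
    """Return the image regions whose turn it is on this frame."""
    return [name for name, period in polling_periods.items()
            if frame_index % period == 0]

# e.g. the centre block polled every frame, the side blocks every 2nd and 3rd frame
polling_periods = {"centre": 1, "left": 2, "right": 3}
for f in range(6):
    print(f, regions_due(f, polling_periods))
```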
According to this embodiment of the application, the original image is divided into image blocks with different access frequencies, and once image processing identifies a target object, tracking of that object is maintained through region polling combined with cascaded classification and tracking fusion, without re-detecting it. Dividing the original image into three blocks reduces the per-frame time overhead of image processing and alleviates heavy frame loss; because the position changes of pedestrians are continuous and moderate, the subsequent tracking works stably, which addresses the stability problem of target detection and eases pedestrian detection in congested environments. Setting different access frequencies for different regions guarantees the degree of attention given to important regions.
The above description is only a preferred embodiment of the present invention, and is not intended to limit the scope of the present invention.

Claims (12)

1. A multi-target tracking method is characterized by comprising the following steps:
acquiring an original image corresponding to a current road according to a set sampling frequency, and performing region division on the original image according to a set image division rule to obtain at least two image regions;
determining access frequencies respectively corresponding to the at least two image areas;
performing image processing on different image areas of the original image according to the corresponding access frequencies respectively to obtain a target object corresponding to each image area;
and carrying out target tracking on the target object.
2. The multi-target tracking method according to claim 1, wherein the area-dividing the original image according to the set image-dividing rule to obtain at least two image areas comprises:
when the two sides of the current lane are determined to be an opposite lane and a curb respectively, dividing the original image, according to a first image division rule, into a first image area containing the current lane and the opposite lane, and a second image area containing the curb; or
when the two sides of the current lane are determined to be same-direction lanes and/or curbs respectively, dividing the original image, according to a second image division rule, into a first image area containing the current lane, and a second image area and a third image area located on the two sides of the first image area respectively.
3. The multi-target tracking method according to claim 1, wherein the acquiring the target object corresponding to each image area comprises:
carrying out image scanning on the image area to obtain a plurality of candidate windows and window features corresponding to the candidate windows;
determining a target window from the candidate windows according to the window characteristics, and determining the corresponding position of the target window in the original image;
and acquiring a target object corresponding to each image area according to the corresponding position of the target window in the original image.
4. The multi-target tracking method of claim 3, wherein the determining a target window from the candidate windows based on the window features comprises:
determining a background filter response value, a root filter response value and a component filter response value respectively corresponding to the candidate windows based on the window characteristics;
determining a cascade order;
and according to the cascade sequence, sequentially accumulating the background filter response value, the root filter response value and the component filter response value for the candidate window to obtain a matching value corresponding to the candidate window, and determining the candidate window as a target window when the matching value is not less than a preset matching degree threshold value.
5. The multi-target tracking method of claim 1, wherein the target tracking for the target object comprises:
adding the target object into a tracking queue, and extracting the characteristics of the target object;
performing online model training based on the target object characteristics to generate an online model;
and scanning the target object based on the online model, and correspondingly adjusting the size of a target window corresponding to the target object according to the position coordinate change of the target object.
6. The multi-target tracking method of claim 1, wherein the target tracking for the target object comprises: rejecting the target object upon determining that the position coordinates of the target object have reached the set boundary position.
7. The multi-target tracking method of claim 1, wherein the target tracking for the target object further comprises:
judging whether the target object meets a set condition based on a pre-trained offline classifier, and adding 1 to a counter corresponding to the target object when the set condition is met;
and when the value of the counter exceeds a set first threshold value, rejecting the target object corresponding to the counter.
8. The multi-target tracking method of claim 1, wherein the target tracking for the target object comprises:
and masking the corresponding position of the target object in the original image, and carrying out target tracking on the target object.
9. A multi-target tracking apparatus, the apparatus comprising:
the dividing module is used for acquiring an original image corresponding to the current road according to the set sampling frequency and carrying out region division on the original image according to the set image dividing rule to obtain at least two image regions;
the determining module is used for determining the access frequencies respectively corresponding to the at least two image areas;
the acquisition module is used for respectively carrying out image processing on different image areas of the original image according to the corresponding access frequencies to acquire a target object corresponding to each image area;
and the tracking module is used for tracking the target of the target object.
10. A multi-target tracking apparatus, comprising: a processor and a memory for storing a computer program capable of running on the processor;
wherein the processor is configured to implement the multi-target tracking method of any one of claims 1 to 8 when running the computer program.
11. A mobile terminal characterized by comprising the multi-target tracking apparatus according to any one of claims 9 to 10.
12. A computer storage medium, in which a computer program is stored, wherein the computer program, when executed by a processor, implements the multi-target tracking method according to any one of claims 1 to 8.
CN201811494503.3A 2018-12-07 2018-12-07 Multi-target tracking method, device, mobile terminal and computer storage medium Active CN111291598B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811494503.3A CN111291598B (en) 2018-12-07 2018-12-07 Multi-target tracking method, device, mobile terminal and computer storage medium

Publications (2)

Publication Number Publication Date
CN111291598A true CN111291598A (en) 2020-06-16
CN111291598B CN111291598B (en) 2023-07-11

Family

ID=71024687

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811494503.3A Active CN111291598B (en) 2018-12-07 2018-12-07 Multi-target tracking method, device, mobile terminal and computer storage medium

Country Status (1)

Country Link
CN (1) CN111291598B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113573153A (en) * 2021-02-02 2021-10-29 腾讯科技(深圳)有限公司 Image processing method, device and equipment

Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20040150551A1 (en) * 2001-06-28 2004-08-05 Robert Artebrant Method for target-tracking of objects
JP2008197733A (en) * 2007-02-08 2008-08-28 Toshiba Corp Tracking method and device
CN101872431A (en) * 2010-02-10 2010-10-27 杭州海康威视软件有限公司 People flow rate statistical method and system applicable to multi-angle application scenes
CN101901354A (en) * 2010-07-09 2010-12-01 浙江大学 Method for detecting and tracking multi targets at real time in monitoring videotape based on characteristic point classification
WO2015026902A1 (en) * 2013-08-22 2015-02-26 Amazon Technologies, Inc. Multi-tracker object tracking
CN104636749A (en) * 2013-11-14 2015-05-20 ***通信集团公司 Target object detection method and device
CN106033550A (en) * 2015-03-16 2016-10-19 北京大学 Target tracking method and apparatus thereof
CN106204649A (en) * 2016-07-05 2016-12-07 西安电子科技大学 A kind of method for tracking target based on TLD algorithm
CN106254836A (en) * 2016-09-19 2016-12-21 南京航空航天大学 Unmanned plane infrared image Target Tracking System and method
CN106991675A (en) * 2017-03-31 2017-07-28 联想(北京)有限公司 A kind of image processing method and electronic equipment
CN107798272A (en) * 2016-08-30 2018-03-13 佳能株式会社 Fast multi-target detects and tracking system
CN107886030A (en) * 2016-09-30 2018-04-06 比亚迪股份有限公司 Vehicle identification method, device and vehicle

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
LI YUANZHUANG; HAN YANFANG; YU SHUPAN: "A multi-scale target tracking method using kernelized correlation filters", Electronic Science and Technology, no. 10, pages 5-9 *

Also Published As

Publication number Publication date
CN111291598B (en) 2023-07-11

Similar Documents

Publication Publication Date Title
US11423695B2 (en) Face location tracking method, apparatus, and electronic device
US10719940B2 (en) Target tracking method and device oriented to airborne-based monitoring scenarios
CN111460968B (en) Unmanned aerial vehicle identification and tracking method and device based on video
EP3690744B1 (en) Method for integrating driving images acquired from vehicles performing cooperative driving and driving image integrating device using same
Gomaa et al. Efficient vehicle detection and tracking strategy in aerial videos by employing morphological operations and feature points motion analysis
CN103208008A (en) Fast adaptation method for traffic video monitoring target detection based on machine vision
CN104378582A (en) Intelligent video analysis system and method based on PTZ video camera cruising
US11430199B2 (en) Feature recognition assisted super-resolution method
CN110659391A (en) Video detection method and device
CN110991385A (en) Method and device for identifying ship driving track and electronic equipment
CN112329616A (en) Target detection method, device, equipment and storage medium
CN116258940A (en) Small target detection method for multi-scale features and self-adaptive weights
CN114169425A (en) Training target tracking model and target tracking method and device
CN112489077A (en) Target tracking method and device and computer system
CN111291598B (en) Multi-target tracking method, device, mobile terminal and computer storage medium
CN110309790B (en) Scene modeling method and device for road target detection
Sutopo et al. Appearance-based passenger counting in cluttered scenes with lateral movement compensation
CN107730535B (en) Visible light infrared cascade video tracking method
Wang et al. Aprus: An airborne altitude-adaptive purpose-related uav system for object detection
CN115861953A (en) Training method of scene coding model, and trajectory planning method and device
Bhusal Object detection and tracking in wide area surveillance using thermal imagery
Kaimkhani et al. UAV with Vision to Recognise Vehicle Number Plates
CN115103105B (en) Shooting control method, electronic device, storage medium and computer program product
WO2022242175A1 (en) Data processing method and apparatus, and terminal
CN109145715B (en) Air-based pedestrian boundary-crossing detection method, device and system for rail transit

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant