CN114937231B - Target identification tracking method - Google Patents

Target identification tracking method

Info

Publication number
CN114937231B
CN114937231B
Authority
CN
China
Prior art keywords
target
target object
characteristic
layer
target detection
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210858864.1A
Other languages
Chinese (zh)
Other versions
CN114937231A (en)
Inventor
寇映 (Kou Ying)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
CHENGDU XIWU SECURITY SYSTEM ALLIANCE CO LTD
Original Assignee
CHENGDU XIWU SECURITY SYSTEM ALLIANCE CO LTD
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by CHENGDU XIWU SECURITY SYSTEM ALLIANCE CO LTD filed Critical CHENGDU XIWU SECURITY SYSTEM ALLIANCE CO LTD
Priority to CN202210858864.1A priority Critical patent/CN114937231B/en
Publication of CN114937231A publication Critical patent/CN114937231A/en
Application granted granted Critical
Publication of CN114937231B publication Critical patent/CN114937231B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00: Scenes; Scene-specific elements
    • G06V 20/40: Scenes; Scene-specific elements in video content
    • G06V 20/41: Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
    • G06V 20/42: Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items of sport video content
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/04: Architecture, e.g. interconnection topology
    • G06N 3/045: Combinations of networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00: Image analysis
    • G06T 7/20: Analysis of motion
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00: Arrangements for image or video recognition or understanding
    • G06V 10/70: Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/77: Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V 10/80: Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V 10/806: Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00: Arrangements for image or video recognition or understanding
    • G06V 10/70: Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/82: Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00: Scenes; Scene-specific elements
    • G06V 20/40: Scenes; Scene-specific elements in video content
    • G06V 20/46: Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00: Indexing scheme for image analysis or image enhancement
    • G06T 2207/10: Image acquisition modality
    • G06T 2207/10016: Video; Image sequence
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00: Indexing scheme for image analysis or image enhancement
    • G06T 2207/20: Special algorithmic details
    • G06T 2207/20084: Artificial neural networks [ANN]
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00: Indexing scheme for image analysis or image enhancement
    • G06T 2207/30: Subject of image; Context of image processing
    • G06T 2207/30241: Trajectory

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Mathematical Physics (AREA)
  • Biomedical Technology (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biophysics (AREA)
  • General Engineering & Computer Science (AREA)
  • Molecular Biology (AREA)
  • Image Analysis (AREA)
  • Studio Devices (AREA)

Abstract

The invention relates to a target identification and tracking method comprising the following steps: a video image is input to a target identification module, which extracts the pixel features of all target detection frames in the video image and determines the target object from the pixel features; a target tracking module locates the target object and determines the direction the target object faces in the two-dimensional image frame, the facing direction being expressed within a 360-degree angular range; and the target tracking module predicts the movement trajectory of the target object in the next frame image from the angle the target object faces. By determining the detection frames of multiple target objects in a video image from their pixel features, the invention allows target objects to be tracked and located quickly.

Description

Target identification tracking method
Technical Field
The invention relates to the technical field of automatic target identification and tracking, in particular to a target identification and tracking method.
Background
Identifying and tracking people in collected video images is a promising direction in modern tracking technology, and deep neural networks can already recognize people well. However, when a video image contains many human targets and those targets undergo large motion deviation, mutual occlusion and the like, for example in scenes with heavy foot traffic such as shopping malls, squares and stations, target tracking performance is limited, the target detection frames jump, and tracking accuracy drops. The technology for identifying and tracking targets in video images therefore needs further improvement.
Disclosure of Invention
The invention provides a target identification and tracking method that aims to determine the detection frames of multiple target objects in a video image from the pixel features of those objects, so that the target objects can be tracked and located quickly.
In order to achieve the above object, the embodiments of the present invention provide the following technical solutions:
A target identification and tracking method, comprising the following steps:
Step S1: a video image is input to a target identification module, which extracts the pixel features of all target detection frames in the video image and determines a target object from the pixel features;
Step S2: a target tracking module locates the target object and determines the direction the target object faces in the two-dimensional image frame, the facing direction being expressed within a 360-degree angular range;
Step S3: the target tracking module predicts the movement trajectory of the target object in the next frame image from the angle the target object faces.
In this scheme, the pixel features of the target detection frames in the image are obtained and used to determine where each detection frame moves in the next frame. This addresses the frequent jumping of target detection frames when many people occlude one another in the video image: once the detection frame of a target object is determined, the target object can be tracked and located quickly even under occlusion or large motion deviation.
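The patent does not spell out how the per-frame pixel features are matched between consecutive frames; as an assumption offered purely for illustration, a simple greedy cosine-similarity association between the detection-frame feature vectors of two consecutive frames could look like the following sketch. The function name, thresholds and greedy matching rule are not taken from the patent.

```python
import numpy as np

def match_detections(prev_feats: np.ndarray, curr_feats: np.ndarray, min_sim: float = 0.5):
    """Greedily match detection frames of consecutive frames by cosine similarity
    of their pixel-feature vectors (an assumed association rule, not the patent's).

    prev_feats: M x D feature vectors of the detection frames in the previous frame.
    curr_feats: N x D feature vectors of the detection frames in the current frame.
    Returns a list of (prev_index, curr_index) pairs.
    """
    prev = prev_feats / np.linalg.norm(prev_feats, axis=1, keepdims=True)
    curr = curr_feats / np.linalg.norm(curr_feats, axis=1, keepdims=True)
    sim = prev @ curr.T                      # M x N cosine similarities
    pairs, used = [], set()
    for m in np.argsort(-sim.max(axis=1)):   # handle the most confident matches first
        n = int(np.argmax(sim[m]))
        if n not in used and sim[m, n] >= min_sim:
            pairs.append((int(m), n))
            used.add(n)
    return pairs
```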
Extracting, by the target identification module, the pixel features of all target detection frames in the video image comprises the following steps:
the target identification module comprises a chroma extraction unit, a deep convolutional neural network and a feature fusion layer;
for each pixel of a target detection frame in the current frame image, the chroma extraction unit extracts the four chroma values of the pixel, namely the red, green, blue and transparency chroma values of the j-th pixel in the i-th target detection frame;
the deep convolutional neural network extracts features from the four chroma values respectively to obtain the chroma feature corresponding to each chroma value;
and the feature fusion layer fuses the four chroma features through softmax to obtain the pixel feature of the pixel.
The number of deep convolutional neural networks is four, and each one extracts features from one chroma value. Each deep convolutional neural network comprises an average pooling layer, a two-dimensional deformable convolution layer, a first linear layer, a second linear layer, a first batch-normalization layer, a second batch-normalization layer, a first nonlinear activation layer, a second nonlinear activation layer and a fully connected layer;
any chroma value is input to the average pooling layer, which encodes an input feature; the input feature passes through the two-dimensional deformable convolution layer to obtain a position feature and a texture feature; the position feature is passed through the first linear layer, the first batch-normalization layer and the first nonlinear activation layer in sequence to obtain the weight vector of the position feature; the texture feature is passed through the second linear layer, the second batch-normalization layer and the second nonlinear activation layer in sequence to obtain the weight vector of the texture feature; finally, the weight vector of the position feature and the weight vector of the texture feature are weighted and combined by the fully connected layer to obtain the chroma feature corresponding to that chroma value.
In this scheme, the structure of the deep convolutional neural network is improved by splitting it into two branches: position-feature extraction and texture-feature extraction. The position feature refers to the position of a pixel within the target detection frame; for example, if the j-th pixel is at position (x, y), then as the target object moves the j-th pixel should stay at (x, y) within the target detection frame, which helps prevent the detection frame from jumping. The texture feature refers to external appearance such as what the person is wearing; for example, when clothes are wrinkled, clothes of the same color under the same light can show different gray levels, so the texture feature compensates for the chroma feature.
The feature fusion layer fusing the four chroma features through softmax to obtain the pixel feature of the pixel comprises the following: the red, green, blue and transparency chroma features of the j-th pixel in the i-th target detection frame are fused with their respective weights (the weight of the red chroma feature, of the green chroma feature, of the blue chroma feature and of the transparency chroma feature), giving the pixel feature O_{i,j} of the j-th pixel in the i-th target detection frame.
The target tracking module predicting the movement trajectory of the target object in the next frame image according to the facing angle of the target object comprises the following step:
the center-point position of the target detection frame is regressed with a loss function L_center, which constrains the distance between the predicted target detection frame and the real target detection frame in the next frame image. In this loss, x_i denotes the pixels of the input i-th target detection frame that belong to the target object and y_i the pixels that belong to the background; the remaining terms are the set of target-object pixels, the set of background pixels, a balance parameter, the center offset of the target detection frame, the direction angle the predicted target object faces, the direction angle the real target object faces, a cosine boundary r, the cosine of the predicted facing-direction angle, the cosine of the real facing-direction angle, a center weight, and a scale parameter.
Compared with the prior art, the invention has the beneficial effects that:
the chromaticity characteristics of each pixel point in the target detection frame are determined through the deep convolution neural network, so that the target detection frame is determined according to the pixel characteristics of the pixel points, even if large motion deviation and mutual shielding conditions occur when the flow of people is large on site, the target object can be quickly tracked and positioned according to the pixel characteristics of the target object, and the accuracy of target identification and tracking is improved.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required in the embodiments will be briefly described below, it should be understood that the following drawings only illustrate some embodiments of the present invention and therefore should not be considered as limiting the scope, and those skilled in the art can also obtain other related drawings based on the drawings without inventive efforts.
FIG. 1 is a flow chart of the method of the present invention;
FIG. 2 is a schematic structural diagram of a target recognition module according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of a deep convolutional neural network according to an embodiment of the present invention;
fig. 4 is a schematic diagram of determining a facing direction of a target object in a two-dimensional image frame according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. The components of embodiments of the present invention generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the present invention, presented in the figures, is not intended to limit the scope of the invention, as claimed, but is merely representative of selected embodiments of the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the present invention without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures. Also, in the description of the present invention, the terms "first", "second", and the like are used for distinguishing between descriptions and not necessarily for describing a relative importance or implying any actual relationship or order between such entities or operations.
The embodiment is as follows:
the invention is realized by the following technical scheme, as shown in fig. 1, a target identification tracking method comprises the following steps:
and step S1, inputting the video image into the target recognition module, extracting the pixel characteristics of all target detection frames in the video image by the target recognition module, and determining the target object according to the pixel characteristics.
Although the input takes the form of a video, a video is ultimately a sequence of image frames, so the scheme is explained for a single frame. Referring to fig. 2, the target recognition module includes a chroma extraction unit, a deep convolutional neural network and a feature fusion layer. Since detecting persons in an image is mature prior art, the details of how the person target detection frames are obtained are not repeated here; reference may be made to the prior art.
An image is composed of pixels, and each pixel is stored in four bytes: the first byte is the red chroma value, the second the green chroma value, the third the blue chroma value, and the fourth the transparency (alpha) chroma value. Red, green and blue are the three primary colors, and the other colors in nature are mixed from different proportions of these three.
For each pixel of a target detection frame in the current frame image, the chroma extraction unit extracts the four chroma values of the pixel, namely the red chroma value, the green chroma value, the blue chroma value and the transparency chroma value of the j-th pixel in the i-th target detection frame.
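As a concrete illustration of the chroma extraction unit, the following minimal Python sketch splits the pixels of one detection frame into the four chroma planes. The function name, the H x W x 4 RGBA array layout and the (x1, y1, x2, y2) frame format are illustrative assumptions, not details taken from the patent.

```python
import numpy as np

def extract_chroma_values(frame_rgba: np.ndarray, box):
    """Split the pixels of one target detection frame into four chroma planes.

    frame_rgba: H x W x 4 uint8 array, one byte each for R, G, B and alpha.
    box: (x1, y1, x2, y2) pixel coordinates of the i-th detection frame (assumed format).
    Returns four float arrays: red, green, blue and transparency chroma values.
    """
    x1, y1, x2, y2 = box
    crop = frame_rgba[y1:y2, x1:x2].astype(np.float32) / 255.0  # normalize the raw bytes
    r, g, b, a = crop[..., 0], crop[..., 1], crop[..., 2], crop[..., 3]
    return r, g, b, a
```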
The deep convolutional neural network then extracts features from the four chroma values respectively to obtain the chroma features corresponding to the four chroma values. Referring to fig. 2, there are four deep convolutional neural networks with identical structures, and each one performs feature extraction on one chroma value.
Referring to fig. 3, each deep convolutional neural network includes an average pooling layer, a two-dimensional deformable convolution layer, a first linear layer, a second linear layer, a first batch-normalization layer, a second batch-normalization layer, a first nonlinear activation layer, a second nonlinear activation layer and a fully connected layer.
Taking the red chroma value as an example: the red chroma value is input to the average pooling layer, which encodes an input feature; the input feature passes through the two-dimensional deformable convolution layer to obtain a position feature and a texture feature; the position feature is passed through the first linear layer, the first batch-normalization layer and the first nonlinear activation layer in sequence to obtain the weight vector of the position feature; the texture feature is passed through the second linear layer, the second batch-normalization layer and the second nonlinear activation layer in sequence to obtain the weight vector of the texture feature; finally, the weight vector of the position feature and the weight vector of the texture feature are weighted and combined by the fully connected layer to obtain the red chroma feature. The other chroma values are input to their own deep convolutional neural networks and processed in the same way to obtain the corresponding chroma features.
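A minimal PyTorch sketch of one such per-chroma network is given below. It assumes torchvision's DeformConv2d for the two-dimensional deformable convolution; the channel widths, the split of the deformable-convolution output into a position map and a texture map, and the concatenation inside the fully connected layer are assumptions made for illustration, not details specified by the patent.

```python
import torch
import torch.nn as nn
from torchvision.ops import DeformConv2d  # two-dimensional deformable convolution

class ChromaBranch(nn.Module):
    """One of the four per-chroma networks: average pooling, a 2-D deformable
    convolution, a position branch and a texture branch (linear + batch norm +
    nonlinear activation each), and a final fully connected fusion layer.
    Detection-frame crops are assumed resized to a fixed size so the linear
    layers see a fixed input dimension."""

    def __init__(self, feat_ch: int = 16, hidden: int = 64):
        super().__init__()
        self.pool = nn.AvgPool2d(2)                          # encodes the input feature
        self.offset = nn.Conv2d(1, 2 * 3 * 3, 3, padding=1)  # offsets for the 3x3 deformable conv
        self.deform = DeformConv2d(1, 2 * feat_ch, 3, padding=1)
        # position branch: linear -> batch normalization -> nonlinear activation
        self.pos = nn.Sequential(nn.LazyLinear(hidden), nn.BatchNorm1d(hidden), nn.ReLU())
        # texture branch: linear -> batch normalization -> nonlinear activation
        self.tex = nn.Sequential(nn.LazyLinear(hidden), nn.BatchNorm1d(hidden), nn.ReLU())
        self.fc = nn.Linear(2 * hidden, hidden)              # weighted combination of both branches

    def forward(self, chroma_plane: torch.Tensor) -> torch.Tensor:
        # chroma_plane: N x 1 x H x W map of one chroma value inside the detection frame
        x = self.pool(chroma_plane)
        feat = self.deform(x, self.offset(x))                # N x 2*feat_ch x h x w
        pos_map, tex_map = feat.chunk(2, dim=1)              # assumed split: position / texture maps
        w_pos = self.pos(pos_map.flatten(1))                 # weight vector of the position feature
        w_tex = self.tex(tex_map.flatten(1))                 # weight vector of the texture feature
        return self.fc(torch.cat([w_pos, w_tex], dim=1))     # chroma feature for this chroma value
```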
Finally, the feature fusion layer fuses the four chroma features through softmax to obtain the pixel feature of the j-th pixel: the red, green, blue and transparency chroma features of the j-th pixel in the i-th target detection frame are fused with their respective weights, giving the pixel feature O_{i,j} of the j-th pixel in the i-th target detection frame.
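The published fusion formula appears only as an image in the source, so the sketch below shows one plausible reading in which the four chroma features are combined with softmax-normalized learnable weights; the class name and the 4-vector weight parameter are assumptions for illustration.

```python
import torch
import torch.nn as nn

class ChromaFusion(nn.Module):
    """Feature fusion layer: combines the four chroma features with learnable
    weights normalized by softmax (an assumed reading of the published formula)."""

    def __init__(self):
        super().__init__()
        self.w = nn.Parameter(torch.zeros(4))  # weights for the R, G, B and alpha chroma features

    def forward(self, f_r, f_g, f_b, f_a):
        w = torch.softmax(self.w, dim=0)                      # normalized weights
        stacked = torch.stack([f_r, f_g, f_b, f_a], dim=0)    # 4 x N x D chroma features
        return (w.view(4, 1, 1) * stacked).sum(dim=0)         # pixel feature O_{i,j}
```

In this reading, O_{i,j} is simply the softmax-weighted sum of the four per-chroma feature vectors of pixel j in detection frame i.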
In step S2, the target tracking module locates the target object and determines the direction the target object faces in the two-dimensional image frame, with the facing direction expressed within a 360° angular range.
The captured scene is three-dimensional but can only be displayed as a two-dimensional image. If the camera looked at the person from directly above or directly below, the facing direction would map proportionally: a rotation of the body by β degrees would appear as a rotation of β degrees in the two-dimensional image. In practice the camera is not directly above or below the person, so a real rotation of β degrees does not appear as β degrees in the image. A fit is therefore needed: a frame of the image is fixed with the head or some other body part as the center of a circle, a coordinate system is set up for the frame (see fig. 4), and, because the distance between the person and the origin of the coordinate system differs at different positions, a linear relation is fitted between the person's actual rotation angle and the rotation angle seen in the two-dimensional image. For example, when the person rotates from point b to point b', the actual rotation angle is β' while the image shows β, and β and β' are related linearly through the fit in the coordinate system shown in fig. 4.
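The fitting step described above can be illustrated with a short NumPy sketch. The calibration samples below are invented purely to show the shape of the fit; the assumption is that pairs of (actual rotation angle, angle measured in the image) are available for the camera pose in question.

```python
import numpy as np

# Hypothetical calibration samples: the person's actual rotation angle (degrees)
# and the rotation angle measured in the two-dimensional image at that position.
actual_deg = np.array([0.0, 45.0, 90.0, 135.0, 180.0, 225.0, 270.0, 315.0])
image_deg = np.array([0.0, 38.0, 77.0, 118.0, 158.0, 199.0, 240.0, 281.0])

# Fit the linear relation image_angle ≈ k * actual_angle + b for this camera pose.
k, b = np.polyfit(actual_deg, image_deg, deg=1)

def image_to_actual(beta_image: float) -> float:
    """Map an angle measured in the image back to the actual facing angle."""
    return (beta_image - b) / k
```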
In step S3, the target tracking module predicts a moving track of the target object in the next frame of image according to the angle that the target object faces.
The center-point position of the target detection frame is regressed with a loss function L_center, which constrains the distance between the predicted target detection frame and the real target detection frame in the next frame image so that the prediction accuracy keeps improving. In this loss, x_i denotes the pixels of the input i-th target detection frame that belong to the target object and y_i the pixels that belong to the background; the remaining terms are the set of target-object pixels, the set of background pixels, a balance parameter, the center offset of the target detection frame, the direction angle the predicted target object faces, the direction angle the real target object faces, a cosine boundary r, the cosine of the predicted facing-direction angle, the cosine of the real facing-direction angle, a center weight, and a scale parameter.
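The published form of L_center is given only as images in the source, so the sketch below is a loose illustration under an assumed form: a balanced center-offset regression over target-object and background pixels plus a cosine-margin penalty on the facing-direction angle, scaled by a center weight and a scale parameter. Every formula detail here is an assumption, not the patent's actual equation.

```python
import torch

def center_loss(pred_offset, true_offset, pred_angle, true_angle,
                obj_mask, alpha: float = 0.5, r: float = 0.2,
                w_center: float = 1.0, s: float = 10.0):
    """Illustrative L_center-style loss (assumed form, not the patent's formula).

    pred_offset, true_offset: N x 2 predicted / real center offsets.
    pred_angle, true_angle:   N predicted / real facing-direction angles (radians).
    obj_mask:                 N mask, 1 for target-object pixels, 0 for background.
    """
    per_pixel = (pred_offset - true_offset).pow(2).sum(dim=1)  # center-offset error per pixel
    obj = obj_mask.bool()
    balanced = alpha * per_pixel[obj].mean() + (1 - alpha) * per_pixel[~obj].mean()
    # cosine-margin term: the predicted cosine should exceed the real cosine minus the boundary r
    angle_term = torch.relu(torch.cos(true_angle) - r - torch.cos(pred_angle)).mean()
    return w_center * balanced + s * angle_term
```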
The above description is only for the specific embodiments of the present invention, but the scope of the present invention is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present invention, and all the changes or substitutions should be covered within the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (1)

1. A target identification and tracking method, characterized by comprising the following steps:
step S1, inputting the video image into a target identification module, extracting the pixel characteristics of all target detection frames in the video image by the target identification module, and determining a target object through the pixel characteristics;
the step of extracting the pixel characteristics of all target detection frames in the video image by the target identification module comprises the following steps:
the target identification module comprises a chroma extraction unit, a deep convolutional neural network and a feature fusion layer;
the chroma extraction unit extracts, for each pixel of a target detection frame in the current frame image, the four chroma values of the pixel, namely the red, green, blue and transparency chroma values of the j-th pixel in the i-th target detection frame;
the deep convolutional neural network extracts features from the four chroma values respectively to obtain the chroma features corresponding to the four chroma values;
the number of deep convolutional neural networks is four, and each one extracts features from one chroma value; each deep convolutional neural network comprises an average pooling layer, a two-dimensional deformable convolution layer, a first linear layer, a second linear layer, a first batch-normalization layer, a second batch-normalization layer, a first nonlinear activation layer, a second nonlinear activation layer and a fully connected layer;
any chroma value is input to the average pooling layer, which encodes an input feature; the input feature passes through the two-dimensional deformable convolution layer to obtain a position feature and a texture feature; the position feature is passed through the first linear layer, the first batch-normalization layer and the first nonlinear activation layer in sequence to obtain the weight vector of the position feature; the texture feature is passed through the second linear layer, the second batch-normalization layer and the second nonlinear activation layer in sequence to obtain the weight vector of the texture feature; and finally, the weight vector of the position feature and the weight vector of the texture feature are weighted and combined by the fully connected layer to obtain the chroma feature corresponding to the chroma value;
the feature fusion layer fuses the four chroma features through softmax to obtain the pixel feature of the pixel: the red, green, blue and transparency chroma features of the j-th pixel in the i-th target detection frame are fused with their respective weights, giving the pixel feature O_{i,j} of the j-th pixel in the i-th target detection frame;
step S2, the target tracking module locates the target object, determines the facing direction of the target object in the two-dimensional image frame, and the facing direction is represented by an angle range of 360 degrees;
step S3, the target tracking module predicts the moving track of the target object in the next frame image according to the angle faced by the target object;
the target tracking module predicts the moving track of the target object in the next frame of image according to the facing angle of the target object, and the method comprises the following steps:
by a loss function L center And regressing the position of the central point of the target detection frame to constrain the distance between the predicted target detection frame and the real target detection frame in the next frame image:
Figure 730195DEST_PATH_IMAGE013
wherein x is i Indicating the pixels belonging to the target object in the input i-th target detection box, y i Pixels representing that the input ith target detection frame belongs to the background;
Figure 970683DEST_PATH_IMAGE014
a set of pixels representing the target object is shown,
Figure 921321DEST_PATH_IMAGE015
representing a set of background pixels;
Figure 803827DEST_PATH_IMAGE016
which is indicative of a parameter of the balance,
Figure 625152DEST_PATH_IMAGE018
Figure 567701DEST_PATH_IMAGE020
Figure 740056DEST_PATH_IMAGE021
representing the center offset of the target detection frame;
Figure 144361DEST_PATH_IMAGE022
indicating the angle of direction the predicted target object is facing,
Figure 351352DEST_PATH_IMAGE023
Figure 933643DEST_PATH_IMAGE024
representing the angle of the direction that the real target object is facing,
Figure 593294DEST_PATH_IMAGE025
) (ii) a r represents a cosine boundary;
Figure 552023DEST_PATH_IMAGE026
a cosine function representing a direction angle in which the prediction target object faces,
Figure DEST_PATH_IMAGE027
a cosine function representing a direction angle that a real target object faces;
Figure DEST_PATH_IMAGE028
represents a center weight;
Figure DEST_PATH_IMAGE029
the scale parameter is indicated.
CN202210858864.1A 2022-07-21 2022-07-21 Target identification tracking method Active CN114937231B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210858864.1A CN114937231B (en) 2022-07-21 2022-07-21 Target identification tracking method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210858864.1A CN114937231B (en) 2022-07-21 2022-07-21 Target identification tracking method

Publications (2)

Publication Number Publication Date
CN114937231A CN114937231A (en) 2022-08-23
CN114937231B true CN114937231B (en) 2022-09-30

Family

ID=82868489

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210858864.1A Active CN114937231B (en) 2022-07-21 2022-07-21 Target identification tracking method

Country Status (1)

Country Link
CN (1) CN114937231B (en)

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7747040B2 (en) * 2005-04-16 2010-06-29 Microsoft Corporation Machine vision system and method for estimating and tracking facial pose
WO2019140609A1 (en) * 2018-01-18 2019-07-25 深圳市道通智能航空技术有限公司 Target detection method and unmanned aerial vehicle
CN110634155A (en) * 2018-06-21 2019-12-31 北京京东尚科信息技术有限公司 Target detection method and device based on deep learning
CN109978045A (en) * 2019-03-20 2019-07-05 深圳市道通智能航空技术有限公司 A kind of method for tracking target, device and unmanned plane
CN110675428B (en) * 2019-09-06 2023-02-28 鹏城实验室 Target tracking method and device for human-computer interaction and computer equipment
CN110991397B (en) * 2019-12-17 2023-08-04 深圳市捷顺科技实业股份有限公司 Travel direction determining method and related equipment
CN111915649A (en) * 2020-07-27 2020-11-10 北京科技大学 Strip steel moving target tracking method under shielding condition
CN112101150B (en) * 2020-09-01 2022-08-12 北京航空航天大学 Multi-feature fusion pedestrian re-identification method based on orientation constraint
CN112989962B (en) * 2021-02-24 2024-01-05 上海商汤智能科技有限公司 Track generation method, track generation device, electronic equipment and storage medium

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110414432A (en) * 2019-07-29 2019-11-05 腾讯科技(深圳)有限公司 Training method, object identifying method and the corresponding device of Object identifying model
CN113538585A (en) * 2021-09-17 2021-10-22 深圳火眼智能有限公司 High-precision multi-target intelligent identification, positioning and tracking method and system based on unmanned aerial vehicle

Also Published As

Publication number Publication date
CN114937231A (en) 2022-08-23

Similar Documents

Publication Publication Date Title
CN109559310B (en) Power transmission and transformation inspection image quality evaluation method and system based on significance detection
CN109934177A (en) Pedestrian recognition methods, system and computer readable storage medium again
CN110650427B (en) Indoor positioning method and system based on fusion of camera image and UWB
CN108694709B (en) Image fusion method and device
KR20160143494A (en) Saliency information acquisition apparatus and saliency information acquisition method
CN108694741A (en) A kind of three-dimensional rebuilding method and device
CN112102409A (en) Target detection method, device, equipment and storage medium
CN113222973B (en) Image processing method and device, processor, electronic equipment and storage medium
CN111491149B (en) Real-time image matting method, device, equipment and storage medium based on high-definition video
CN113052876A (en) Video relay tracking method and system based on deep learning
Bi et al. Haze removal for a single remote sensing image using low-rank and sparse prior
CN111932601A (en) Dense depth reconstruction method based on YCbCr color space light field data
CN113486697A (en) Forest smoke and fire monitoring method based on space-based multi-modal image fusion
CN109903265A (en) A kind of image change area detecting threshold value setting method, system and its electronic device
CN114937231B (en) Target identification tracking method
CN113506275B (en) Urban image processing method based on panorama
CN113298177B (en) Night image coloring method, device, medium and equipment
Du et al. Recognition of mobile robot navigation path based on K-means algorithm
CN111860378A (en) Market fire-fighting equipment inspection method based on gun-ball linkage and video event perception
CN116614715A (en) Near infrared image colorization processing method for scene monitoring
CN115100240A (en) Method and device for tracking object in video, electronic equipment and storage medium
CN112561001A (en) Video target detection method based on space-time feature deformable convolution fusion
CN116711295A (en) Image processing method and apparatus
CN113902733A (en) Spacer defect detection method based on key point detection
CN111325209B (en) License plate recognition method and system

Legal Events

PB01: Publication
SE01: Entry into force of request for substantive examination
GR01: Patent grant
PE01: Entry into force of the registration of the contract for pledge of patent right

Denomination of invention: A Method for Target Recognition and Tracking

Granted publication date: 20220930

Pledgee: Bank of China Limited Chengdu pilot Free Trade Zone Branch

Pledgor: CHENGDU XIWU SECURITY SYSTEM ALLIANCE CO.,LTD.

Registration number: Y2024980020664