CN114332801A - Target detection active sampling method based on time sequence variance threshold - Google Patents

Target detection active sampling method based on time sequence variance threshold

Info

Publication number
CN114332801A
Authority
CN
China
Prior art keywords
sample
model
variance
target detection
time sequence
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210244128.7A
Other languages
Chinese (zh)
Other versions
CN114332801B (en)
Inventor
黄圣君
罗世发
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University of Aeronautics and Astronautics
Original Assignee
Nanjing University of Aeronautics and Astronautics
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University of Aeronautics and Astronautics
Priority to CN202210244128.7A
Publication of CN114332801A
Application granted
Publication of CN114332801B
Legal status: Active
Anticipated expiration

Landscapes

  • Image Analysis (AREA)

Abstract

The invention discloses a target detection active sampling method based on a time-series variance threshold. The method comprises the following steps: first, collect a large amount of unlabeled time-series data and a small amount of labeled data; second, set the number of query samples n and the variance threshold δ; third, initialize the model; fourth, the target detection model outputs predictions for the unlabeled frames; fifth, compute, from the predictions, the model uncertainty across iterations for each unlabeled frame; sixth, take the sample with the largest model uncertainty, and if its time-series variance exceeds the threshold and no adjacent frame has been selected, query its label; seventh, update the labeled image set, the unlabeled image set, and the prediction model; eighth, return to step four, or, once enough samples have been queried, output the target detection model f. The invention designs a dedicated active-learning criterion for the target detection task on time-series data in autonomous-driving scenes, thereby reducing annotation cost.

Description

Target detection active sampling method based on time sequence variance threshold
Technical Field
The invention belongs to the technical field of automatic digital image annotation, and in particular relates to a target detection active sampling method based on a time-series variance threshold.
Background
In industrial practice, data has come to be regarded as a core resource of the industrial internet. The interconnection of people, machines, and objects generates massive data that drives the industry forward; however, this data also contains a large amount of redundancy, making data selection and cleaning urgent. One prominent example is time-series data from autonomous driving: such data is typically captured as multiple driving segments recorded by a camera, each segment containing consecutive pictures of a driving scene, i.e., time-series frames. Selective sampling of this data with an active learning algorithm can reduce redundant information and extract the most useful portion. If frames are selected by uncertainty alone, all frames the model is uncertain about are chosen, but adjacent frames are likely to be too similar; labeling all of them is unnecessary and easily causes data redundancy. Therefore, by combining a time-series variance computed on individual sample frames, the selected uncertain frames also satisfy a degree of temporal dissimilarity. Current industry practice either fully annotates time-series data or samples at fixed intervals; the former is too expensive to label, and the latter samples blindly and may miss key information.
Disclosure of Invention
The purpose of the invention is as follows: the invention aims to provide a target detection active sampling method based on a time-series variance threshold, which makes full use of sample information and the model outputs produced during training, selects the data frames most worth querying for annotation, and reduces the annotation cost required for model training.
The technical scheme is as follows: the invention provides a target detection active sampling method based on a time sequence variance threshold, comprising the following steps:
step 1, collecting a small annotated time-series image set L and a large unannotated time-series image set U;
step 2, setting the planned number of query samples n and the variance threshold δ, and initializing the number of selected samples q to 0;
step 3, initializing the target detection model with the labeled image set;
step 4, the target detection model outputs prediction results for the unlabeled frames;
step 5, calculating, from the prediction results, the model uncertainty across iterations for each unlabeled frame, and sorting in descending order;
step 6, taking the sample with the largest model uncertainty, and, if its time-series variance is greater than the threshold and no adjacent frame has been selected, querying an expert for its label;
step 7, updating the labeled image set L and the unlabeled image set U, and updating the prediction model;
step 8, returning to step 4, or, once enough samples have been queried, outputting the target detection model f.
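For concreteness, the query loop of steps 3 to 8 can be sketched in Python as follows. The helpers `train`, `iteration_uncertainty`, `temporal_variance`, and `oracle`, as well as the frame representation, are hypothetical stand-ins, since the invention does not prescribe a particular implementation:

```python
def active_sampling(L, U, n, delta, train, iteration_uncertainty,
                    temporal_variance, oracle):
    """Sketch of the query loop (steps 3-8).

    L     -- list of (frame, label) pairs; a frame is (segment_id, frame_idx, image)
    U     -- list of unlabeled frames
    n     -- query budget; delta -- time-series variance threshold
    """
    model = train(L)                        # step 3: initialize on labeled data
    q, selected = 0, set()                  # selected: queried (segment_id, frame_idx)
    while q < n and U:
        # steps 4-5: rank unlabeled frames by uncertainty across iterations
        ranked = sorted(U, key=lambda x: iteration_uncertainty(model, x),
                        reverse=True)
        for x in ranked:                    # step 6: most uncertain first
            seg, idx, _ = x
            neighbor_taken = any((seg, idx + d) in selected for d in (-1, 1))
            U.remove(x)
            if temporal_variance(x) > delta and not neighbor_taken:
                L.append((x, oracle(x)))    # query the expert for a label
                selected.add((seg, idx))
                q += 1
                break                       # retrain before the next query
            # otherwise the sample is discarded (step 6.2)
        model = train(L)                    # step 7: update the prediction model
    return model                            # step 8: output the detector f
```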
Further, the specific method for calculating, in step 5, the model uncertainty across iterations for each unlabeled frame is as follows:
step 5.1: a time-series frame sequence F of one segment is input and evenly divided into k sub-segments, i.e. F = {F_1, F_2, ..., F_k}, each sub-segment containing n sample frames, i.e. F_i = {f_i1, f_i2, ..., f_in}; the model outputs a prediction value for each frame sample, i.e. every f_ij has an output y_ij; for each sub-segment F_i, a variance σ_i² is calculated from the n values output for its n frames, giving {σ_1², σ_2², ..., σ_k²}; the calculated variances are taken as the query evaluation index of the k sub-segments, and more sample frames are selected from sub-segments with larger variance;
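As an illustration of step 5.1, a minimal sketch follows. The per-frame prediction scores and the proportional budget split are assumptions, since the text only states that sub-segments with larger variance receive more sample frames:

```python
import numpy as np

def segment_variances(frame_scores, k):
    """Split one segment's per-frame output values into k equal sub-segments
    and return the variance of each (step 5.1)."""
    chunks = np.array_split(np.asarray(frame_scores, dtype=float), k)
    return np.array([chunk.var() for chunk in chunks])

def allocate_queries(frame_scores, k, budget):
    """Assumed allocation rule: distribute the query budget across
    sub-segments in proportion to their variance."""
    var = segment_variances(frame_scores, k)
    weights = var / var.sum() if var.sum() > 0 else np.full(k, 1.0 / k)
    return np.floor(weights * budget).astype(int)
```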
step 5.2: the variance calculation in step 5.1 uses the following model output: the target detection model outputs the position, class, and confidence of each object detected in a picture; if k objects are detected in a certain picture in total, the output can be expressed as:

O = {o_1, o_2, ..., o_k}

where o_j contains the position, class c_j, and confidence of the j-th detected object; letting v_i denote the mean of the class-i objects in the sample, and the target detection task having C object classes in total, the output of one picture is normalized into a class vector:

v = (v_1, v_2, ..., v_C)

such a vector represents the average position and the average confidence of each class of objects in a sample;
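A minimal sketch of the step 5.2 normalization follows; the per-detection layout (x, y, w, h, class, confidence) is an assumption, as the text does not fix the output encoding:

```python
import numpy as np

def class_vector(detections, C):
    """Normalize one picture's detections into a class vector (step 5.2).

    detections -- array of shape (k, 6): rows (x, y, w, h, cls, conf)
    C          -- total number of object classes
    Returns a (C, 5) array holding, per class, the mean box and mean
    confidence (the entries v_1, ..., v_C); zero rows for absent classes.
    """
    det = np.asarray(detections, dtype=float).reshape(-1, 6)
    v = np.zeros((C, 5))
    for i in range(C):
        mask = det[:, 4] == i              # indicator: 1 when c_j = i, else 0
        if mask.any():                     # mean over the class-i objects only
            v[i] = det[mask][:, [0, 1, 2, 3, 5]].mean(axis=0)
    return v
```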
step 5.3: training proceeds over multiple rounds, with the model updated after each round, and after each iteration the model produces outputs for the unlabeled samples. If the model's predicted outputs for the same sample differ greatly across iteration rounds, the model is uncertain about that sample, and such samples are queried for annotation; if the model's judgement of the same sample is stable across iterations, the model has already learned that sample's characteristics, the sample carries little information value, and it need not be queried;

the variance of a single image frame is calculated as follows: after n iterations the model yields n such vectors v^(1), v^(2), ..., v^(n); a variance is computed over each dimension of these vectors, and the resulting variance values are averaged:

u = (1/D) · Σ_{d=1}^{D} Var(v_d^(1), ..., v_d^(n))

where Var(·) denotes taking the variance over each dimension of the vectors and D is the dimensionality of the class vector.
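A minimal sketch of this uncertainty score, assuming the class vectors produced by the n iterations are stacked into one array:

```python
import numpy as np

def iteration_uncertainty(vectors):
    """Uncertainty of one frame (step 5.3): variance over iterations in each
    dimension of the class vector, averaged over the dimensions.

    vectors -- array of shape (n_iterations, D)
    """
    v = np.asarray(vectors, dtype=float)
    return v.var(axis=0).mean()            # Var per dimension, then mean
```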
Further, in step 5.2, the mean v_i of the class-i objects in the sample is calculated as:

v_i = ( Σ_{j=1}^{k} I(c_j = i) · o_j ) / ( Σ_{j=1}^{k} I(c_j = i) )

where the indicator I(c_j = i) is 1 when c_j = i and 0 when c_j ≠ i.
Further, in step 6, the specific method for judging whether a sample needs to be queried and annotated by an expert is as follows:
step 6.1: take the sample with the largest model uncertainty, regard it together with its two adjacent sample frames as a set, and calculate a time-series variance for the set;
step 6.2: if the time-series variance of the sample is greater than the threshold and no adjacent frame has been selected, query an expert for its label; if either condition is not met, discard the sample.
Further, in step 6.1, the specific method for calculating the time-series variance of the set is as follows: the 3 image frames yield 3 class vectors v_{t-1}, v_t, v_{t+1}; a variance is computed over each dimension of the three vectors, and the resulting variance values are averaged:

s = (1/D) · Σ_{d=1}^{D} Var(v_{t-1,d}, v_{t,d}, v_{t+1,d})

where Var(·) denotes taking the variance over each dimension of the vectors.
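The three-frame time-series variance of step 6.1 admits the same kind of sketch, assuming the class vectors of the candidate frame and its two neighbors are available:

```python
import numpy as np

def temporal_variance(v_prev, v_curr, v_next):
    """Time-series variance of step 6.1: variance over the three frames in
    each dimension of the class vector, averaged over the dimensions."""
    stack = np.stack([np.asarray(v, dtype=float)
                      for v in (v_prev, v_curr, v_next)])
    return stack.var(axis=0).mean()

# Step 6.2: a candidate is queried only when temporal_variance(...) exceeds
# the threshold delta and neither adjacent frame has been selected.
```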
Advantageous effects: compared with the prior art, the invention has the following notable advantages. The invention applies active learning to the target detection algorithm and actively selects the most valuable images. Because time-series data in autonomous-driving scenes is prone to high redundancy and costly annotation during training, the invention uses active querying to extract the most informative part of the time-series data, thereby reducing the negative influence of redundant samples, and can train an effective target detection model with as few labels as possible. By processing the model output, a dedicated information measure is established for time-series samples from two aspects, variance and model uncertainty: specifically, while selecting the sample with the largest model uncertainty, the method calculates and judges whether it is too similar to its adjacent samples, i.e., whether its time-series variance exceeds the threshold. A sample is queried only if its adjacent sample frames have not been selected and the three frames differ sufficiently. The method makes full use of sample information and the model outputs produced during training, selects the data frames most worth querying for annotation, and reduces the annotation cost required for model training.
Drawings
FIG. 1 is a flow chart of the mechanism of the present invention;
FIG. 2 is a schematic diagram of calculating model uncertainty for each image frame;
FIG. 3 is a schematic diagram of an active sampling model for target detection based on a time-series variance threshold.
Detailed Description
The technical scheme of the invention is further explained below with reference to the accompanying drawings.
Examples
Fig. 1 shows the flow chart of the mechanism of the invention. Assume that initially there is a data set L consisting of a small number of annotated images and a data set U consisting of a large number of unannotated images. First, the planned number of query samples n and the variance threshold δ are set, and the number of picked samples q is initialized to 0. Then the target detection model is initialized with the small labeled time-series image set L, and it outputs prediction results for the unlabeled frames. Next, the model uncertainty across iterations is calculated for each unlabeled frame from the prediction results, and the frames are sorted in descending order. The sample with the largest model uncertainty is taken, and if its time-series variance is greater than the threshold and no adjacent frame has been selected, an expert is queried for its label. Finally, the labeled image set L and the unlabeled image set U are updated, and the prediction model is updated. The query process loops until the annotation overhead reaches the budget.
FIG. 2 is a schematic diagram of calculating the model uncertainty of each image frame. First, the target detection model outputs the position, class, and confidence of every object detected in a picture. If k objects are detected in a picture in total, the output can be expressed as:

O = {o_1, o_2, ..., o_k}

where v_i denotes the mean of the class-i objects in the sample, calculated as:

v_i = ( Σ_{j=1}^{k} I(c_j = i) · o_j ) / ( Σ_{j=1}^{k} I(c_j = i) )

where the indicator I(c_j = i) is 1 when c_j = i and 0 when c_j ≠ i.

Finally, if there are C object classes in the target detection task in total, the output of one picture can be normalized into a class vector:

v = (v_1, v_2, ..., v_C)

Such a vector represents the average position and the average confidence of each class of objects in a sample. The variance of a single image frame is calculated as follows: after n iterations the model yields n such vectors v^(1), ..., v^(n). A variance is computed over each dimension of the vectors, and the resulting variance values are averaged:

u = (1/D) · Σ_{d=1}^{D} Var(v_d^(1), ..., v_d^(n))

where Var(·) denotes taking the variance over each dimension of the vectors and D is the dimensionality of the class vector.
FIG. 3 is a schematic diagram of the active sampling model for target detection based on a time-series variance threshold. The target detection model records the output of each image frame over multiple iterations, selects the sample with the largest uncertainty according to the method of the invention, and then judges whether the sample meets the adjacent-frame time-series variance requirement, thereby actively choosing the most suitable samples to query the human expert.
In the embodiment, a Faster R-CNN model is used for experimental verification on the Waymo dataset; 10 epochs are trained in total, and the COCO metrics are computed on the test set, with particular attention to the AP at fixed IoU thresholds, AP@IoU=0.5 in particular. The comparison methods are: (1) training with all samples; (2) randomly sampling 20% of the data for annotation; (3) considering model uncertainty only and sampling 20% of the data for annotation; (4) setting a time-series variance threshold and annotating the 20% of the data with the largest model uncertainty, i.e., the method of the invention. The experimental results show that AP@IoU=0.5 is 0.4752 when using all the data, while the embodiment achieves 0.4693, higher than the second and third comparison methods. Using only 20% of the samples of the Waymo time-series data thus approaches the effect of annotating all samples, greatly saving annotation cost.
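For reference, an AP@IoU=0.5 figure of the kind reported above can be read out with pycocotools as sketched below; the two file paths are placeholders for COCO-format ground truth and detector outputs:

```python
from pycocotools.coco import COCO
from pycocotools.cocoeval import COCOeval

coco_gt = COCO("waymo_val_coco.json")                     # hypothetical path
coco_dt = coco_gt.loadRes("faster_rcnn_detections.json")  # hypothetical path

ev = COCOeval(coco_gt, coco_dt, iouType="bbox")
ev.evaluate()
ev.accumulate()
ev.summarize()
print("AP@IoU=0.5:", ev.stats[1])   # stats[1] is AP at IoU=0.50
```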

Claims (5)

1. A target detection active sampling method based on a time sequence variance threshold is characterized by comprising the following steps:
step 1, collecting a small annotated time-series image set L and a large unannotated time-series image set U;
step 2, setting the planned number of query samples n and the variance threshold δ, and initializing the number of selected samples q to 0;
step 3, initializing the target detection model with the labeled image set;
step 4, the target detection model outputs prediction results for the unlabeled frames;
step 5, calculating, from the prediction results, the model uncertainty across iterations for each unlabeled frame, and sorting in descending order;
step 6, taking the sample with the largest model uncertainty, and, if its time-series variance is greater than the threshold and no adjacent frame has been selected, querying an expert for its label;
step 7, updating the labeled image set L and the unlabeled image set U, and updating the prediction model;
step 8, returning to step 4, or, once enough samples have been queried, outputting the target detection model f.
2. The target detection active sampling method based on a time sequence variance threshold according to claim 1, wherein the specific method for calculating, in step 5, the model uncertainty across iterations for each unlabeled frame is as follows:
step 5.1: a time-series frame sequence F of one segment is input and evenly divided into k sub-segments, i.e. F = {F_1, F_2, ..., F_k}, each sub-segment containing n sample frames, i.e. F_i = {f_i1, f_i2, ..., f_in}; the model outputs a prediction value for each frame sample, i.e. every f_ij has an output y_ij; for each sub-segment F_i, a variance σ_i² is calculated from the n values output for its n frames, giving {σ_1², σ_2², ..., σ_k²}; the calculated variances are taken as the query evaluation index of the k sub-segments, and more sample frames are selected from sub-segments with larger variance;
step 5.2: the variance calculation in step 5.1 uses the following model output: the target detection model outputs the position, class, and confidence of each object detected in a picture; if k objects are detected in a certain picture in total, the output can be expressed as:

O = {o_1, o_2, ..., o_k}

where o_j contains the position, class c_j, and confidence of the j-th detected object; letting v_i denote the mean of the class-i objects in the sample, and the target detection task having C object classes in total, the output of one picture is normalized into a class vector:

v = (v_1, v_2, ..., v_C)

such a vector represents the average position and the average confidence of each class of objects in a sample;
step 5.3: training proceeds over multiple rounds, with the model updated after each round, and after each iteration the model produces outputs for the unlabeled samples. If the model's predicted outputs for the same sample differ greatly across iteration rounds, the model is uncertain about that sample, and such samples are queried for annotation; if the model's judgement of the same sample is stable across iterations, the model has already learned that sample's characteristics, the sample carries little information value, and it need not be queried;

the variance of a single image frame is calculated as follows: after n iterations the model yields n such vectors v^(1), v^(2), ..., v^(n); a variance is computed over each dimension of these vectors, and the resulting variance values are averaged:

u = (1/D) · Σ_{d=1}^{D} Var(v_d^(1), ..., v_d^(n))

where Var(·) denotes taking the variance over each dimension of the vectors and D is the dimensionality of the class vector.
3. The target detection active sampling method based on a time sequence variance threshold according to claim 2, wherein in step 5.2 the mean v_i of the class-i objects in the sample is calculated as:

v_i = ( Σ_{j=1}^{k} I(c_j = i) · o_j ) / ( Σ_{j=1}^{k} I(c_j = i) )

where the indicator I(c_j = i) is 1 when c_j = i and 0 when c_j ≠ i.
4. The target detection active sampling method based on a time sequence variance threshold according to claim 1, wherein in step 6 the specific method for judging whether a sample needs to be queried and annotated by an expert is as follows:
step 6.1: take the sample with the largest model uncertainty, regard it together with its two adjacent sample frames as a set, and calculate a time-series variance for the set;
step 6.2: if the time-series variance of the sample is greater than the threshold and no adjacent frame has been selected, query an expert for its label; if either condition is not met, discard the sample.
5. The target detection active sampling method based on a time sequence variance threshold according to claim 4, wherein in step 6.1 the specific method for calculating the time-series variance of the set is: the 3 image frames yield 3 class vectors v_{t-1}, v_t, v_{t+1}; a variance is computed over each dimension of the three vectors, and the resulting variance values are averaged:

s = (1/D) · Σ_{d=1}^{D} Var(v_{t-1,d}, v_{t,d}, v_{t+1,d})

where Var(·) denotes taking the variance over each dimension of the vectors.
CN202210244128.7A 2022-03-14 2022-03-14 Target detection active sampling method based on time sequence variance threshold Active CN114332801B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210244128.7A CN114332801B (en) 2022-03-14 2022-03-14 Target detection active sampling method based on time sequence variance threshold


Publications (2)

Publication Number Publication Date
CN114332801A true CN114332801A (en) 2022-04-12
CN114332801B CN114332801B (en) 2022-06-28

Family

ID=81033831

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210244128.7A Active CN114332801B (en) 2022-03-14 2022-03-14 Target detection active sampling method based on time sequence variance threshold

Country Status (1)

Country Link
CN (1) CN114332801B (en)



Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20210319081A1 (en) * 2020-04-03 2021-10-14 Alibaba Group Holding Limited Change of variance detection in time series data
CN112115182A (en) * 2020-09-15 2020-12-22 招商局金融科技有限公司 Time sequence data processing method, device, equipment and storage medium
CN113537040A (en) * 2021-07-13 2021-10-22 南京理工大学 Time sequence behavior detection method and system based on semi-supervised learning

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116502093A (en) * 2023-06-28 2023-07-28 江苏瑞中数据股份有限公司 Target detection data selection method and device based on active learning
CN116502093B (en) * 2023-06-28 2023-10-13 江苏瑞中数据股份有限公司 Target detection data selection method and device based on active learning
CN117496118A (en) * 2023-10-23 2024-02-02 浙江大学 Method and system for analyzing steal vulnerability of target detection model
CN117496118B (en) * 2023-10-23 2024-06-04 浙江大学 Method and system for analyzing steal vulnerability of target detection model

Also Published As

Publication number Publication date
CN114332801B (en) 2022-06-28


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant