CN114332801B - Target detection active sampling method based on time sequence variance threshold - Google Patents

Target detection active sampling method based on time sequence variance threshold

Info

Publication number
CN114332801B
Authority
CN
China
Prior art keywords
sample
variance
model
target detection
time sequence
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210244128.7A
Other languages
Chinese (zh)
Other versions
CN114332801A (en)
Inventor
黄圣君
罗世发
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University of Aeronautics and Astronautics
Original Assignee
Nanjing University of Aeronautics and Astronautics
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University of Aeronautics and Astronautics filed Critical Nanjing University of Aeronautics and Astronautics
Priority to CN202210244128.7A priority Critical patent/CN114332801B/en
Publication of CN114332801A publication Critical patent/CN114332801A/en
Application granted granted Critical
Publication of CN114332801B publication Critical patent/CN114332801B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Image Analysis (AREA)

Abstract

The invention discloses a target detection active sampling method based on a time sequence variance threshold. The method comprises the following steps: first, collecting a large amount of unlabeled time sequence data and a small amount of labeled data; second, setting the number of query samples n and a variance threshold δ; third, initializing the model; fourth, having the target detection model output prediction results for the unlabeled frames; fifth, calculating the per-iteration model uncertainty of each unlabeled frame from the prediction results; sixth, taking the sample with the largest model uncertainty and, if its time sequence variance is greater than the threshold and no adjacent frame has been selected, querying its label; seventh, updating the labeled image set, the unlabeled image set, and the prediction model; eighth, returning to the fourth step, or, once enough samples have been queried, outputting the target detection model f. The invention sets a dedicated active learning criterion for the target detection task on time sequence data in autonomous driving scenes, thereby reducing labeling cost.

Description

Target detection active sampling method based on time sequence variance threshold
Technical Field
The invention belongs to the technical field of automatic digital image labeling, and particularly relates to a target detection active sampling method based on a time sequence variance threshold.
Background
In actual industrial applications, data has always been regarded as a core resource of the industrial internet. The interconnection of people, machines, and objects generates massive data that drives industry forward; however, this data also contains a large amount of redundancy, making data extraction and cleaning an urgent need. One prominent example is the time sequence data of autonomous driving: such data is typically captured from multiple driving segments recorded by a camera, and each driving segment contains consecutive pictures of a driving scene, i.e., time sequence frames. Applying an active learning algorithm to selectively sample such data reduces redundant information and extracts the most effective part. If frames are selected by uncertainty alone, every selected frame is indeed uncertain, but adjacent frames are likely to be too similar, so labeling all of them easily causes data redundancy. Therefore, by combining a time sequence variance computed for each sample frame, the method selects frames the model is uncertain about while also requiring a degree of temporal dissimilarity. Industry currently either fully labels time sequence data or samples it at fixed intervals: the former incurs excessive labeling cost, while the latter samples blindly and may lose key information.
Disclosure of Invention
Purpose of the invention: the invention aims to provide a target detection active sampling method based on a time sequence variance threshold, which fully uses the information of samples and the output results generated by the model during training, selects the data frames most worth querying for labeling, and reduces the labeling cost required for model training.
Technical scheme: the target detection active sampling method based on a time sequence variance threshold of the invention comprises the following steps:
step 1, collecting a small labeled time sequence data image set L and a large unlabeled time sequence data image set U;
step 2, setting a planned query sample number n and a variance threshold δ, and initializing the selected sample count q to 0;
step 3, initializing a target detection model with the labeled time sequence data image set L;
step 4, the target detection model outputs prediction results for the unlabeled time sequence data image set U;
step 5, calculating, from the prediction results, the per-iteration model uncertainty of each image frame in the unlabeled time sequence data image set U, and sorting from large to small;
step 6, taking the sample with the largest model uncertainty and, if its time sequence variance is greater than the threshold and no adjacent frame has been selected, querying an expert to label the sample;
step 7, updating the labeled time sequence data image set L and the unlabeled time sequence data image set U, and updating the target detection model;
step 8, if q < n, returning to step 4; if q = n, the preset number of samples has been queried, and the target detection model is output.
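For illustration, the query loop of steps 1 to 8 can be sketched as follows. This is a minimal sketch rather than the patented implementation: train_detector, model_uncertainty, temporal_variance, neighbors_selected, and query_expert are hypothetical stand-ins for the components described above and are supplied by the caller.

```python
def active_sampling(L, U, n, delta, train_detector, model_uncertainty,
                    temporal_variance, neighbors_selected, query_expert):
    """Sketch of steps 1-8. L: labeled set, U: ordered unlabeled time
    sequence frames, n: query budget, delta: variance threshold."""
    model = train_detector(L)              # step 3: initialize the detector on L
    q, selected = 0, set()                 # step 2: counter of selected samples
    while q < n:                           # step 8: stop once the budget n is used
        # steps 4-5: predict on U, then rank frames by per-iteration uncertainty
        ranked = sorted(U, key=lambda x: model_uncertainty(model, x), reverse=True)
        queried = False
        for x in ranked:                   # step 6: most uncertain frame first
            if temporal_variance(model, x) > delta and not neighbors_selected(x, selected):
                L.append((x, query_expert(x)))   # the expert provides the label
                U.remove(x)
                selected.add(x)
                q += 1
                model = train_detector(L)  # step 7: update L, U and the model
                queried = True
                break
        if not queried:
            break                          # no frame meets both conditions
    return model                           # step 8: output the detector f
```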
Further, the specific method in step 5 for calculating the per-iteration model uncertainty of each image frame in the unlabeled time sequence data image set is as follows:
step 5.1: a time sequence data clip F is input and divided evenly into k segments, i.e., F = (f_1, f_2, f_3, …, f_k), each segment containing n sample frames, i.e., f_i = (x_1, x_2, …, x_n); the model outputs a prediction y_i for each frame sample x_i; for each segment f_i, a variance δ_i can be calculated from the n frame outputs, giving (δ_1, δ_2, …, δ_k); the calculated variances serve as the query evaluation index of the k segments, and segments with larger variance are given more sample frames to query;
step 5.2: the variance in the above process is calculated as follows: the target detection model outputs the position, class, and confidence of each object detected in a picture; if k objects are detected in a certain picture in total, the output is expressed as:
O = {p, l, c}
where p, l, and c each have length k, i.e., p = (p_1, …, p_k), l = (l_1, …, l_k), c = (c_1, …, c_k);
the average value of the objects of the i-th class in the sample is counted, and if the target detection task involves C object classes in total, the output of one picture is normalized into a class vector:

$$V(x) = (\bar{p}_1, \bar{c}_1, \bar{p}_2, \bar{c}_2, \ldots, \bar{p}_C, \bar{c}_C)$$

this vector represents the average position and average confidence of each object class in a sample;
step 5.3: multiple rounds of training are performed, the model is updated after each round, and after each iteration the model produces outputs for the unlabeled samples; if the model's predicted outputs for the same sample differ greatly across iteration rounds, the model is uncertain about that sample, and samples of this kind should be queried for labels; if the model judges the same sample consistently before and after every iteration, the model has already learned the sample's characteristics, the sample carries little information, and its label need not be queried;
the variance of a single image frame is calculated as follows: after n iterations the model has generated n such V(x) vectors; a variance is computed over the n vectors in each dimension, and the resulting variance values are averaged:

$$\delta(x) = \frac{1}{2C}\sum_{d=1}^{2C}\delta\big(V^{(1)}(x)_d, \ldots, V^{(n)}(x)_d\big)$$

where δ(·) denotes the variance taken over the same dimension d of the V(x) vectors.
Further, in step 5.1, the average value of the objects of the i-th class in the sample is calculated as follows:

$$\bar{p}_i = \frac{\sum_{j=1}^{k}\mathbb{1}[l_j=i]\,p_j}{\sum_{j=1}^{k}\mathbb{1}[l_j=i]}, \qquad \bar{c}_i = \frac{\sum_{j=1}^{k}\mathbb{1}[l_j=i]\,c_j}{\sum_{j=1}^{k}\mathbb{1}[l_j=i]}$$

where the indicator $\mathbb{1}[l_j=i]$ is 1 when $l_j = i$ and 0 when $l_j \neq i$.
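The quantities of steps 5.1 to 5.3 can be illustrated with a short runnable sketch. It assumes, for simplicity, that each detection's position is reduced to one scalar (a real detector outputs a bounding box); the function and variable names are illustrative and not part of the disclosure.

```python
import numpy as np

def class_vector(positions, labels, confidences, C):
    """Step 5.2: normalize one frame's detections into the 2C-dimensional
    vector V(x) of per-class average position and average confidence."""
    p = np.asarray(positions, dtype=float)
    l = np.asarray(labels)
    c = np.asarray(confidences, dtype=float)
    v = np.zeros(2 * C)
    for i in range(C):
        mask = (l == i)                    # indicator 1[l_j == i]
        if mask.any():
            v[2 * i] = p[mask].mean()      # average position of class i
            v[2 * i + 1] = c[mask].mean()  # average confidence of class i
    return v

def iteration_variance(vectors):
    """Step 5.3: per-dimension variance over the n V(x) vectors from n
    model iterations, averaged over the 2C dimensions."""
    V = np.stack(vectors)                  # shape (n, 2C)
    return V.var(axis=0).mean()

def segment_scores(frame_vectors, k):
    """Step 5.1: split a clip's per-frame vectors into k equal segments
    and score each segment by its internal variance; higher-variance
    segments are worth more queries."""
    return [seg.var(axis=0).mean()
            for seg in np.array_split(np.stack(frame_vectors), k)]

# Toy usage: two iterations disagree on the class-0 confidence,
# so the frame's iteration variance is non-zero.
v1 = class_vector([0.2, 0.8], [0, 1], [0.9, 0.7], C=2)
v2 = class_vector([0.2, 0.8], [0, 1], [0.4, 0.7], C=2)
print(iteration_variance([v1, v2]))
```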
Further, in step 6, the specific method for querying an expert to label the sample is:
step 6.1: take the sample with the largest model uncertainty, treat it and its two adjacent sample frames as a set, and calculate the time sequence variance of this set;
step 6.2: if the sample's time sequence variance is greater than the threshold and no adjacent frame has been selected, query an expert to label the sample; if either condition is not met, discard the sample.
Further, in step 6.1, the specific method for calculating the time sequence variance of the set is: the 3 image frames yield 3 V(x) vectors in total; a variance is computed over the three vectors in each dimension, and the resulting variance values are averaged:

$$\delta(x_t) = \frac{1}{2C}\sum_{d=1}^{2C}\delta\big(V(x_{t-1})_d, V(x_t)_d, V(x_{t+1})_d\big)$$

where δ(·) denotes the variance taken over the same dimension d of the V(x) vectors.
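The check of steps 6.1 and 6.2 then amounts to the following minimal sketch, assuming the V(x) vectors of the candidate frame and its two neighbors have already been computed as in step 5.2 (names are illustrative):

```python
import numpy as np

def temporal_variance(v_prev, v_cur, v_next):
    """Step 6.1: per-dimension variance over the V(x) vectors of a frame
    and its two adjacent frames, averaged over the dimensions."""
    V = np.stack([v_prev, v_cur, v_next])  # shape (3, 2C)
    return V.var(axis=0).mean()

def should_query(v_prev, v_cur, v_next, delta, neighbor_already_selected):
    """Step 6.2: query the expert only if the temporal variance exceeds
    the threshold delta and neither adjacent frame was already selected."""
    return (temporal_variance(v_prev, v_cur, v_next) > delta
            and not neighbor_already_selected)
```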
Advantageous effects: compared with the prior art, the invention has the following notable advantages. The invention applies active learning to the target detection algorithm and actively selects the most valuable images. Because time sequence data in autonomous driving scenes easily runs into high redundancy and labeling difficulty during training, the invention uses an active query method to query the most informative samples in the time sequence data, reducing the negative influence of redundant samples, so that an effective target detection model can be trained with as few labels as possible. By processing the model outputs, a dedicated informativeness index is established for time sequence samples from two aspects, variance and model uncertainty: concretely, while selecting the sample with the largest model uncertainty, the method calculates and judges whether that sample is similar to its adjacent samples, i.e., whether its time sequence variance exceeds the threshold. A sample is queried only when its adjacent sample frames have not been selected and the three frames differ sufficiently. The method fully uses the information of samples and the model outputs generated during training, selects the data frames most worth querying for labeling, and reduces the labeling cost required for model training.
Drawings
FIG. 1 is a flow chart of the mechanism of the present invention;
FIG. 2 is a schematic diagram of calculating model uncertainty for each image frame;
FIG. 3 is a schematic diagram of an active sampling model for target detection based on a time sequence variance threshold.
Detailed Description
The technical scheme of the invention is further explained below with reference to the accompanying drawings.
Examples
Fig. 1 shows the flow chart of the method of the invention. Assume that initially there is a data set L consisting of a small number of labeled time sequence data images and a data set U consisting of a large number of unlabeled time sequence data images. First, the planned query sample number n and the variance threshold δ are set, and the selected sample count q is initialized to 0. The target detection model is then initialized with the labeled time sequence data image set L, and it outputs prediction results for the image frames in the unlabeled time sequence data image set U. Next, the per-iteration model uncertainty of each image frame in U is calculated from the prediction results and sorted from large to small. The sample with the largest model uncertainty is taken and, if its time sequence variance is greater than the threshold and no adjacent frame has been selected, an expert is queried to label it. Finally, the labeled time sequence data image set L and the unlabeled time sequence data image set U are updated, and the target detection model is updated. The query process loops until the labeling overhead reaches the budget.
Fig. 2 illustrates the calculation of model uncertainty for each image frame. First, the target detection model outputs the position, class, and confidence of each object detected in one picture. If k objects are detected in a picture in total, the output can be expressed as:
O = {p, l, c}
where p, l, and c each have length k, i.e., p = (p_1, …, p_k), l = (l_1, …, l_k), c = (c_1, …, c_k).
The average value of the objects belonging to the i-th class in the sample is counted as follows:

$$\bar{p}_i = \frac{\sum_{j=1}^{k}\mathbb{1}[l_j=i]\,p_j}{\sum_{j=1}^{k}\mathbb{1}[l_j=i]}, \qquad \bar{c}_i = \frac{\sum_{j=1}^{k}\mathbb{1}[l_j=i]\,c_j}{\sum_{j=1}^{k}\mathbb{1}[l_j=i]}$$

where the indicator $\mathbb{1}[l_j=i]$ is 1 when $l_j = i$ and 0 when $l_j \neq i$.
Finally, if the target detection task involves C object classes in total, the output of one picture can be normalized into a class vector:

$$V(x) = (\bar{p}_1, \bar{c}_1, \bar{p}_2, \bar{c}_2, \ldots, \bar{p}_C, \bar{c}_C)$$

Such a vector represents the average position and average confidence of each object class in a sample. The variance of a single image frame is calculated as follows: after n iterations the model yields n such V(x) vectors; a variance is computed over the n vectors in each dimension, and the resulting variance values are averaged:

$$\delta(x) = \frac{1}{2C}\sum_{d=1}^{2C}\delta\big(V^{(1)}(x)_d, \ldots, V^{(n)}(x)_d\big)$$

where δ(·) denotes the variance taken over the same dimension d of the V(x) vectors.
FIG. 3 is a schematic diagram of the target detection active sampling model based on the time sequence variance threshold. The target detection model records the output of each image frame over multiple iterations, selects the sample with the largest uncertainty according to the method of the invention, and then judges whether the sample satisfies the adjacent-frame time sequence variance criterion, thereby actively choosing the most appropriate samples to query the human expert.
In the embodiment, experimental verification is carried out on the Waymo data set with a Faster-RCNN model, training 10 epochs in total and computing the COCO metrics on the test set, with particular attention to AP and AP_{IoU=0.5}. The comparison methods are: (1) training with all samples; (2) randomly sampling 20% of the data for labeling; (3) sampling 20% of the data for labeling considering model uncertainty only; (4) after setting the time sequence variance threshold, sampling the 20% of the data with the largest model uncertainty, i.e., the method of the invention. The experimental results show that AP@IoU=0.5 using all data is 0.4752, while the embodiment achieves 0.4693, higher than the second and third comparison methods. This shows that the invention, using only 20% of the samples of the Waymo time sequence data, can approach the effect of labeling all samples, greatly saving labeling cost.
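For reference, the COCO metrics used in this embodiment can be computed with the standard pycocotools API; the sketch below assumes the Waymo test-set ground truth and the detector outputs have been exported to COCO-format JSON files (the file names are hypothetical).

```python
from pycocotools.coco import COCO
from pycocotools.cocoeval import COCOeval

coco_gt = COCO("waymo_test_gt.json")          # hypothetical ground-truth file
coco_dt = coco_gt.loadRes("detections.json")  # hypothetical detection results

ev = COCOeval(coco_gt, coco_dt, iouType="bbox")
ev.evaluate()
ev.accumulate()
ev.summarize()
print("AP        :", ev.stats[0])  # AP averaged over IoU 0.50:0.95
print("AP@IoU=0.5:", ev.stats[1])  # the AP_{IoU=0.5} reported above
```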

Claims (5)

1. A target detection active sampling method based on a time sequence variance threshold, characterized by comprising the following steps:
step 1, collecting a small labeled time sequence data image set L and a large unlabeled time sequence data image set U;
step 2, setting a planned query sample number n and a variance threshold δ, and initializing the selected sample count q to 0;
step 3, initializing a target detection model with the labeled time sequence data image set L;
step 4, the target detection model outputs prediction results for the unlabeled time sequence data image set U;
step 5, calculating, from the prediction results, the per-iteration model uncertainty of each image frame in the unlabeled time sequence data image set U, and sorting from large to small;
step 6, taking the sample with the largest model uncertainty and, if its time sequence variance is greater than the threshold and no adjacent frame has been selected, querying an expert to label the sample;
step 7, updating the labeled time sequence data image set L and the unlabeled time sequence data image set U, and updating the target detection model;
step 8, if q < n, returning to step 4; if q = n, the preset number of samples has been queried, and the target detection model is output.
2. The method of claim 1, wherein the specific method in step 5 for calculating the per-iteration model uncertainty of each image frame in the unlabeled time sequence data image set U comprises:
step 5.1: a time sequence data clip F is input and divided evenly into k segments, i.e., F = (f_1, f_2, f_3, …, f_k), each segment containing n sample frames, i.e., f_i = (x_1, x_2, …, x_n); the model outputs a prediction y_i for each frame sample x_i; for each segment f_i, a variance δ_i can be calculated from the n frame outputs, giving (δ_1, δ_2, …, δ_k); the calculated variances serve as the query evaluation index of the k segments, and segments with larger variance select more sample frames;
step 5.2: the variance in step 5.1 is calculated as follows: the target detection model outputs the position, class, and confidence of each object detected in a picture; if k objects are detected in a certain picture in total, the output is expressed as:
O = {p, l, c}
where p, l, and c each have length k, i.e., p = (p_1, …, p_k), l = (l_1, …, l_k), c = (c_1, …, c_k);
the average value of the objects of the i-th class in the sample is counted, and if the target detection task involves C object classes in total, the output of one picture is normalized into a class vector:

$$V(x) = (\bar{p}_1, \bar{c}_1, \bar{p}_2, \bar{c}_2, \ldots, \bar{p}_C, \bar{c}_C)$$

this vector represents the average position and average confidence of each object class in a sample;
step 5.3: multiple rounds of training are performed, the model is updated after each round, and after each iteration the model produces outputs for the unlabeled samples; if the model's predicted outputs for the same sample differ greatly across iteration rounds, the model is uncertain about the sample and it should be queried for a label; if the model judges the same sample consistently before and after every iteration, the model has learned the sample's characteristics with a stable judgment result, the sample carries little information, and its label need not be queried;
the variance of a single image frame is calculated as follows: after n iterations the target detection model has generated n such V(x) vectors; a variance is computed over the n vectors in each dimension, and the resulting variance values are averaged:

$$\delta(x) = \frac{1}{2C}\sum_{d=1}^{2C}\delta\big(V^{(1)}(x)_d, \ldots, V^{(n)}(x)_d\big)$$

where δ(·) denotes the variance taken over the same dimension d of the V(x) vectors.
3. The target detection active sampling method based on a time sequence variance threshold of claim 2, wherein in step 5.1, the average value of the objects of the i-th class in the sample is calculated as follows:

$$\bar{p}_i = \frac{\sum_{j=1}^{k}\mathbb{1}[l_j=i]\,p_j}{\sum_{j=1}^{k}\mathbb{1}[l_j=i]}, \qquad \bar{c}_i = \frac{\sum_{j=1}^{k}\mathbb{1}[l_j=i]\,c_j}{\sum_{j=1}^{k}\mathbb{1}[l_j=i]}$$

where the indicator $\mathbb{1}[l_j=i]$ is 1 when $l_j = i$ and 0 when $l_j \neq i$.
4. The target detection active sampling method based on a time sequence variance threshold of claim 1, wherein in step 6 the specific method for querying an expert to label the sample is:
step 6.1: take the sample with the largest model uncertainty, treat it and its two adjacent sample frames as a set, and calculate the time sequence variance of this set;
step 6.2: if the sample's time sequence variance is greater than the threshold and no adjacent frame has been selected, query an expert to label the sample; if either condition is not met, discard the sample.
5. The target detection active sampling method based on a time sequence variance threshold of claim 4, wherein in step 6.1 the specific method for calculating the time sequence variance of the set is: the 3 image frames yield 3 V(x) vectors; a variance is computed over the three vectors in each dimension, and the resulting variance values are averaged:

$$\delta(x_t) = \frac{1}{2C}\sum_{d=1}^{2C}\delta\big(V(x_{t-1})_d, V(x_t)_d, V(x_{t+1})_d\big)$$

where δ(·) denotes the variance taken over the same dimension d of the V(x) vectors.
CN202210244128.7A 2022-03-14 2022-03-14 Target detection active sampling method based on time sequence variance threshold Active CN114332801B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210244128.7A CN114332801B (en) 2022-03-14 2022-03-14 Target detection active sampling method based on time sequence variance threshold

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210244128.7A CN114332801B (en) 2022-03-14 2022-03-14 Target detection active sampling method based on time sequence variance threshold

Publications (2)

Publication Number Publication Date
CN114332801A (en) 2022-04-12
CN114332801B (en) 2022-06-28

Family

ID=81033831

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210244128.7A Active CN114332801B (en) 2022-03-14 2022-03-14 Target detection active sampling method based on time sequence variance threshold

Country Status (1)

Country Link
CN (1) CN114332801B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116502093B (en) * 2023-06-28 2023-10-13 江苏瑞中数据股份有限公司 Target detection data selection method and device based on active learning
CN117496118B * 2023-10-23 2024-06-04 Zhejiang University Method and system for analyzing the model-stealing vulnerability of target detection models

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11222093B2 (en) * 2020-04-03 2022-01-11 Alibaba Group Holding Limited Change of variance detection in time series data
CN112115182A (en) * 2020-09-15 2020-12-22 招商局金融科技有限公司 Time sequence data processing method, device, equipment and storage medium
CN113537040B (en) * 2021-07-13 2024-07-05 南京理工大学 Time sequence behavior detection method and system based on semi-supervised learning

Also Published As

Publication number Publication date
CN114332801A (en) 2022-04-12

Similar Documents

Publication Publication Date Title
CN114332801B (en) Target detection active sampling method based on time sequence variance threshold
CN111814854B (en) Target re-identification method without supervision domain adaptation
Mayer et al. Learning target candidate association to keep track of what not to track
Wu et al. Exploit the unknown gradually: One-shot video-based person re-identification by stepwise learning
Cohen et al. Learning Bayesian network classifiers for facial expression recognition both labeled and unlabeled data
KR101967086B1 (en) Entity-based temporal segmentation of video streams
US8150170B2 (en) Statistical approach to large-scale image annotation
US9158971B2 (en) Self-learning object detectors for unlabeled videos using multi-task learning
US8917907B2 (en) Continuous linear dynamic systems
Tsintotas et al. Probabilistic appearance-based place recognition through bag of tracked words
CN107818307B (en) Multi-label video event detection method based on LSTM network
CN110458022B (en) Autonomous learning target detection method based on domain adaptation
CN110532911B (en) Covariance measurement driven small sample GIF short video emotion recognition method and system
CN110575663A (en) physical education auxiliary training method based on artificial intelligence
CN111353448A (en) Pedestrian multi-target tracking method based on relevance clustering and space-time constraint
WO2021243947A1 (en) Object re-identification method and apparatus, and terminal and storage medium
Haber et al. A practical approach to real-time neutral feature subtraction for facial expression recognition
CN113642482A (en) Video character relation analysis method based on video space-time context
CN114359791A (en) Group macaque appetite detection method based on Yolo v5 network and SlowFast network
Chen et al. An interactive semantic video mining and retrieval platform--application in transportation surveillance video for incident detection
CN107194322B (en) A kind of behavior analysis method in video monitoring scene
Parkhi et al. Automated video face labelling for films and tv material
Zhao et al. Action recognition based on C3D network and adaptive keyframe extraction
CN107578069B (en) Image multi-scale automatic labeling method
CN116089874A (en) Emotion recognition method and device based on ensemble learning and migration learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant