CN110414336A - Deep complementary classifier pedestrian search method with triplet margin center loss - Google Patents
Deep complementary classifier pedestrian search method with triplet margin center loss Download PDF Info
- Publication number
- CN110414336A CN201910542675.1A
- Authority
- CN
- China
- Prior art keywords
- pedestrian
- feature
- image
- frame
- classifier
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Links
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/22—Matching criteria, e.g. proximity measures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/44—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- General Physics & Mathematics (AREA)
- Physics & Mathematics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- Bioinformatics & Computational Biology (AREA)
- General Engineering & Computer Science (AREA)
- Artificial Intelligence (AREA)
- Life Sciences & Earth Sciences (AREA)
- Multimedia (AREA)
- Human Computer Interaction (AREA)
- Image Analysis (AREA)
Abstract
The invention discloses a deep complementary classifier pedestrian search method with triplet margin center loss, belonging to the field of computer vision processing. In the pedestrian re-identification part, a triplet margin center loss is proposed: on the basis of the center loss, which effectively reduces the feature variation of the same pedestrian, the idea of the triplet loss is introduced to effectively increase the feature difference between different pedestrians. By improving the two search sub-tasks, namely pedestrian detection and re-identification, the overall performance of the pedestrian search model is improved. The invention can simultaneously perform pedestrian detection and re-identification on large-scale real-scene images, and plays a significant role in security fields such as urban surveillance.
Description
Technical field
The invention belongs to the field of computer vision processing, and further relates to a deep complementary classifier pedestrian search method with triplet margin center loss in the technical fields of object detection and object retrieval.
Background technique
The document by Xiao, Tong, et al., "End-to-End Deep Learning for Person Search," arXiv:1604.01850 (2016), for the first time integrated pedestrian detection and pedestrian re-identification into one end-to-end person search framework. Current pedestrian re-identification benchmarks and methods mainly match cropped pedestrian pictures, but real-world scenes are rarely so ideal: cropping pedestrian pictures consumes a large amount of time, and sometimes, due to real-time constraints, cropped photos cannot be provided at all. In pedestrian search, therefore, no cropped pedestrian picture needs to be provided: candidate pedestrian boxes are detected by a pedestrian detection method, and the person with a specific identity is then retrieved by a pedestrian re-identification method.
Another document addresses the big-data, small-sample problem of pedestrian search by using a generative adversarial network to add synthetic unlabeled pedestrians, which is especially effective when pedestrians in scene pictures are rare.
Human/non-human classification in pedestrian detection often suffers from missed and false detections.
Summary of the invention
In view of the above technical problems, this method efficiently uses deep complementary classifiers so that the detector can discover complementary discriminative regions. On the basis of the center loss, which effectively reduces the feature variation of the same pedestrian, the idea of the triplet loss is introduced to effectively increase the feature difference between different pedestrians. By improving the sub-tasks of pedestrian search, namely detection and re-identification, the performance of the whole pedestrian search model is improved.
To achieve the above technical purpose, the technical scheme adopted by the invention is:
A deep complementary classifier pedestrian search method with triplet margin center loss, comprising the following steps:
(1) Before model training, a pedestrian generative adversarial network is trained on original images, and the network is used to synthesize new pedestrians at arbitrary positions of the original images, generating new scene pictures and thereby generating and augmenting the training dataset of the pedestrian search network;
(2) In the training stage, feature information is first extracted from the entire input scene image by a convolutional neural network;
(3) In the pedestrian detection stage, a region proposal network (RPN) is set up and a deep complementary classifier is used to obtain, in each frame of the video image, the candidate regions that are likely to be pedestrian targets; the size and location of the pedestrian candidate regions are continuously refined, and their feature information is extracted;
(4) After the feature information of the detected pedestrian candidate regions is pooled to the same size, it is fed into the pedestrian re-identification network for training; the triplet margin center loss and the online instance matching loss function are jointly optimized to update pedestrian features and improve the re-identification performance of the pedestrian search model;
(5) In the model testing and prediction stage, the trained pedestrian search model performs pedestrian detection on the input scene image; after the pedestrian boxes are detected, feature similarity matching and ranking against the target pedestrian image is carried out for retrieval, and the pedestrian with the highest feature matching degree is the pedestrian to be retrieved.
Step 1 specifically includes:
1.1. Filter the pedestrian boxes of true targets in the pedestrian scene images, retaining only the bounding boxes whose height and width are less than certain fixed values, and crop fixed-size image blocks containing these pedestrians;
1.2. From the filtered pedestrian image blocks, select the pedestrian images that show the complete body, cover the pedestrian box with random pixel noise of value 0 or 255, i.e., random black or white covering, and use the image blocks containing the noise as the training set of the pedestrian generative adversarial network;
1.3. Through training the pedestrian generative adversarial model, make the network learn a mapping from black-and-white noise boxes to specific pedestrian images;
1.4. When generating pedestrian images, crop a fixed-size image block from the scene image in which a pedestrian is to be generated, cover an arbitrary position with a noise box of certain height and width, and use it as the test set of the pedestrian generative adversarial model;
1.5. Use the mapping from black-and-white noise boxes to specific pedestrian images learned by the network to generate a pedestrian image, then restore it into the original scene image, completing the data generation and augmentation of the pedestrian search dataset.
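The noise-box masking of steps 1.1-1.2 can be sketched as follows. This is a minimal illustration in numpy, assuming 256 × 256 crops and the 70 × 25-pixel box limit used in the embodiment; the function name and array layout are illustrative assumptions, not from the patent.

```python
import numpy as np

def mask_pedestrian_box(block, top, left, h, w, rng=None):
    """Cover a pedestrian box inside a cropped scene block with random
    black-or-white (0 or 255) pixel noise, producing a GAN training input."""
    assert h < 70 and w < 25, "box must satisfy the size filter"
    rng = np.random.default_rng() if rng is None else rng
    noisy = block.copy()
    # each covered pixel is independently black (0) or white (255),
    # the same value across all channels
    noise = rng.choice([0, 255], size=(h, w, 1)).repeat(block.shape[2], axis=2)
    noisy[top:top + h, left:left + w] = noise
    return noisy

block = np.full((256, 256, 3), 128, dtype=np.uint8)  # stand-in scene crop
masked = mask_pedestrian_box(block, top=100, left=120, h=60, w=20)
```

The masked block serves as the generator input, with the original block as the reconstruction target, so the network learns the mapping from noise boxes to pedestrian appearance.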
Step 3 is as follows:
3.1. After the feature information of the entire scene image is extracted in step 2, pedestrian candidate regions are detected using Faster R-CNN and the region proposal network (RPN); a cross-entropy loss classifier is set up to determine whether an anchor is a pedestrian, and a smooth L1 (smooth absolute) loss function is set up to regress the position and size of the bounding box;
3.2. Then, non-maximum suppression is used to delete repeated detections, and 128 candidate detection boxes are retained for each image; the pedestrian boxes are fed into the pooling layer to obtain 7 × 7 × 2048 feature maps, which are connected to one fully-connected layer and fed into three branch networks;
3.3. The first branch is the deep complementary classifier, which after training can make human/non-human judgments;
3.4. The second branch further refines the position and size of the pedestrian boxes;
3.5. The third branch is a 256-dimensional fully convolutional layer whose output is an L2-normalized feature;
3.6. In the first branch, two deep complementary classifiers are set up for human/non-human judgment. Classifier A, expressed as f(θ_A), is used first to identify the most discriminative regions and generate the feature map F^A; an erasing operation removes the most discriminative parts of the feature map, and the erased feature map is then supplied to its complementary classifier B, expressed as f(θ_B), to find the complementary discriminative feature regions and generate the feature map F^B;
3.7. The feature maps F^A and F^B are supervised with the cross-entropy loss function, and the model is jointly optimized with the above losses.
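The adversarial-erasing idea of steps 3.6-3.7 can be sketched as follows, using channel-summed activation as a stand-in saliency for classifier A's most discriminative regions; the erase fraction and function names are illustrative assumptions, not the patent's implementation.

```python
import numpy as np

def erase_most_discriminative(feat, frac=0.25):
    """Zero out the most highly activated spatial cells of a feature map,
    so the complementary classifier B must rely on the remaining regions."""
    saliency = feat.sum(axis=-1)                   # per-cell activation (H, W)
    k = max(1, int(frac * saliency.size))
    thresh = np.sort(saliency.ravel())[-k]         # k-th largest activation
    mask = (saliency < thresh).astype(feat.dtype)  # keep only weaker cells
    return feat * mask[..., None], mask

rng = np.random.default_rng(0)
feat_a = rng.random((7, 7, 32))                    # stand-in for F^A
feat_b, keep = erase_most_discriminative(feat_a)   # input used to produce F^B
```

Because classifier B never sees the cells A relied on, it is pushed toward complementary discriminative evidence, which is the stated purpose of the two-classifier design.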
Step 4 specifically includes:
4.1. The feature map obtained by the second branch is jointly trained with the online instance matching loss and the triplet margin center loss. During back-propagation, if the classification label of the target pedestrian is t, column t of the lookup table is updated with the following formula, enabling the lookup table to save the varied features of the same target pedestrian under many poses and angles: V_t ← γV_t + (1 − γ)x,
where
V_t represents the pedestrian feature whose classification label is t;
γ is the update weight, which can take a value in the interval (0, 1); this method takes γ = 0.5.
4.2. The pedestrian box features without identity labels that appear in the scene pictures are used as negative samples, which are also valuable for learning feature representations. These unlabeled features are saved in a circular queue Q, denoted U ∈ R^{D×Q}, a D × Q matrix, where D is the pedestrian box feature dimension after L2 regularization and Q is the size of the circular queue, set according to the actual scene. Meanwhile, the cosine similarity U^T x between the unlabeled features U and the samples x in the mini-batch is computed. After each round of iteration, new feature vectors are pushed into the queue and outdated feature vectors are removed, forming a circular process.
The triplet margin center loss function shown in formula (6) is introduced to constrain the features with identity labels. By reducing intra-class differences and increasing inter-class differences, the loss optimizes model training. The triplet margin center loss function trains only the labeled pedestrian features, making the model minimize the feature variation within the same pedestrian and maximize the feature variation between different pedestrians.
Here, X_i ∈ R^d represents the feature of pedestrian box i, which belongs to the class with identity label y_i;
c_{y_i} represents the center feature of the class with identity label y_i;
c_{y_j} represents the center feature of the class with identity label y_j;
m represents the number of pedestrian identity classes.
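Formula (6) itself is not reproduced in this text, so the following is only one plausible reading of the loss described above, in a triplet-center-loss style: pull each feature toward its own class center and push it away from the nearest other center by a margin. The hinge form and margin value are assumptions.

```python
import numpy as np

def triplet_margin_center_loss(feats, labels, centers, margin=1.0):
    """One plausible form of the triplet margin center loss: feature X_i
    should be closer to its own center c_{y_i} than to the nearest other
    center c_{y_j} by at least `margin` (hinge over the difference)."""
    total = 0.0
    for x, y in zip(feats, labels):
        d = np.linalg.norm(centers - x, axis=1)    # distances to all m centers
        d_pos = d[y]                               # distance to own center
        d_neg = np.min(np.delete(d, y))            # nearest other center
        total += max(d_pos - d_neg + margin, 0.0)  # triplet-style hinge
    return total / len(feats)

centers = np.array([[0.0, 0.0], [10.0, 0.0]])      # m = 2 identity centers
feats = np.array([[0.1, 0.0], [9.8, 0.1]])         # features near own centers
labels = [0, 1]
loss = triplet_margin_center_loss(feats, labels, centers)
```

With features already close to their own centers the hinge is inactive and the loss is zero; a feature nearer a wrong center incurs a positive penalty, which is the "reduce intra-class, increase inter-class" behavior described above.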
Step 5 is as follows:
Construct the test samples of pedestrian search and feed them into the trained deep complementary classifier pedestrian search network with triplet margin center loss. Pedestrian detection is carried out on the input test scene sample images, and the detected candidate pedestrian box positions are fed into the pedestrian re-identification network to obtain their 256-dimensional pedestrian features. The target pedestrian image is then input to likewise obtain its 256-dimensional pedestrian feature; feature similarity matching between it and the pedestrian box features is done with cosine similarity, and the identity label ranked most likely is the result of identity retrieval.
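The retrieval in step 5 reduces to ranking detected pedestrian-box features against the query feature by cosine similarity. A minimal sketch (feature dimension reduced from 256 for brevity; names are illustrative):

```python
import numpy as np

def retrieve(query_feat, gallery_feats):
    """Rank detected pedestrian-box features by cosine similarity to the
    query pedestrian's feature; the first index is the retrieval result."""
    q = query_feat / np.linalg.norm(query_feat)
    g = gallery_feats / np.linalg.norm(gallery_feats, axis=1, keepdims=True)
    sims = g @ q                       # cosine similarity per detected box
    order = np.argsort(-sims)          # best match first
    return order, sims

gallery = np.array([[1.0, 0.0, 0.0],   # detected pedestrian-box features
                    [0.6, 0.8, 0.0],
                    [0.0, 0.0, 1.0]])
order, sims = retrieve(np.array([0.7, 0.7, 0.1]), gallery)
```

Because the branch outputs are already L2-normalized, the dot product alone would suffice at test time; the explicit normalization here just makes the sketch self-contained.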
The beneficial effects of the present invention are:
First, a framework is proposed that uses an adversarial generative network to perform data augmentation on original images, thereby improving the generalization of the pedestrian search model.
Second, in the pedestrian detection stage, the proposed deep complementary classifier for pedestrian detection uses complementary target regions for human/non-human classification, improving the overall performance of the model, making the model converge faster, and improving the efficiency of pedestrian detection.
Third, the combination of the triplet margin center loss and online instance matching addresses the case where pedestrian samples of the same label are few, minimizing the intra-class pedestrian feature gap and maximizing the inter-class pedestrian feature gap. The features learned by the model are therefore more robust and can cope with the greater dataset challenges of real scenes.
Detailed description of the invention
Fig. 1 is the network flowchart of the deep complementary classifier pedestrian search network with triplet margin center loss of the present invention.
Fig. 2 is the network flowchart of the deep complementary classifier of the present invention.
Fig. 3 is the strategy diagram of the triplet margin center loss of the present invention.
Specific embodiment
In order to make the foregoing objectives, features and advantages of the present invention clearer and more comprehensible, the present invention will be further described below through specific embodiments and the accompanying drawings.
The person search task is a typical big-data, small-sample problem, because each pedestrian has very few pictures. For a model, learning discriminative features for pedestrians with few pictures is highly difficult, and a small number of pedestrian pictures easily causes overfitting. In order to suppress the network's tendency to overfit and promote the development of pedestrian search in practical applications, the present invention proposes a deep complementary classifier pedestrian search method with triplet margin center loss. Using a generative adversarial network, new pedestrians can be synthesized at arbitrary positions on the original image data. For cases where pedestrian detection cannot accurately and efficiently detect pedestrian boxes and make foreground/background classifications, the deep complementary classifier is used to mine complementary, salient information for discriminating pedestrians, further improving the accuracy of the pedestrian detection model.
In addition, this method combines the online instance matching and triplet margin center loss functions to better distinguish images of the same class from images of different classes, so that the pedestrian search network learns diversified and discriminative features, effectively relieving the influence of the dataset's few images per class and lack of diversity.
As shown in Figure 1, the deep complementary classifier pedestrian search network with triplet margin center loss of the present invention includes the following steps:
1. Generating and augmenting pedestrian search scene image samples
(a) Screen the original pedestrian search scene pictures, selecting the bounding boxes whose pedestrian box height is less than 70 pixels and width is less than 25 pixels, crop image blocks of 256 × 256 resolution containing the pedestrian boxes, and then screen from them the pedestrian images with complete bodies.
(b) Cover the pedestrian boxes with random pixel noise of value 0 or 255, i.e., random black or white covering, and use the 256 × 256 image blocks as the training set of the pedestrian generative adversarial network;
(c) Select a scene image in which a pedestrian is to be generated, again crop a 256 × 256 image block from it, cover an arbitrary position with a noise box of height less than 70 pixels and width less than 25 pixels, and feed it into the trained pedestrian generative adversarial network. After the new synthetic pedestrian image is generated, it is restored into the original scene image, thus completing the generation and augmentation of pedestrian search scene image samples.
2. Pedestrian detection and feature learning
(d) Apply a transfer learning strategy to an existing deep convolutional network, ResNet-50 or VGG-16, importing network parameters pre-trained on the ImageNet dataset as the initial training parameters of the deep network. After a series of convolutions, 1024-channel feature maps are output at 1/16 the resolution of the input image; the obtained feature maps are shared by the pedestrian detection and re-identification parts.
(e) Faster R-CNN is set up on top of the feature maps; the region proposal network (RPN) is responsible for detecting pedestrian boxes, and the detected pedestrian box features enter the pooling layer.
(f) The re-identification network after the pooling layer has three branches. The first branch is the deep complementary classifier, which after training can make pedestrian/non-pedestrian judgments.
(g) The second branch further refines the position and size of the pedestrian boxes.
(h) The third branch is responsible, during training, for saving, optimizing and updating the labeled pedestrian features, and during model testing, for retrieving the target pedestrian.
3. Pedestrian detection with the deep complementary classifier
(i) In the first branch, two deep complementary classifiers are set up (the network flowchart is shown in Fig. 2) for human/non-human judgment. Classifier A, expressed as f(θ_A), is used first to identify the most discriminative regions and generate the feature map F^A; an erasing operation removes the most discriminative parts of the feature map. The erased feature map is then supplied to its complementary classifier B, expressed as f(θ_B), to find the complementary discriminative feature regions and generate the feature map F^B.
(j) The feature maps F^A and F^B are supervised with the cross-entropy loss function, and the model is jointly optimized with the above losses;
4. Joint training of pedestrian re-identification with the online instance matching loss and the triplet margin center loss
(k) The feature map obtained by the second branch is jointly trained with the online instance matching loss and the triplet margin center loss, whose strategy flowchart is shown in Fig. 3. During back-propagation, if the classification label of the target pedestrian is t, column t of the lookup table is updated with the formula V_t ← γV_t + (1 − γ)x, enabling the lookup table to save the varied features of the same target pedestrian under many poses and angles, where V_t represents the pedestrian feature whose classification label is t, and γ is the update weight, which can take a value in the interval (0, 1); this method takes γ = 0.5.
The pedestrian box features without identity labels that appear in the scene pictures are used as negative samples, which are also valuable for learning feature representations. These unlabeled features are saved in a circular queue Q, denoted U ∈ R^{D×Q}, a D × Q matrix, where D is the pedestrian box feature dimension after L2 regularization and Q is the size of the circular queue, set according to the actual scene. Meanwhile, the cosine similarity U^T x between U and the samples x in the mini-batch is computed. After each round of iteration, new feature vectors are pushed into the queue and outdated feature vectors are removed, forming a circular process.
The triplet margin center loss function shown in formula (6) is introduced to constrain the features with identity labels. By reducing intra-class differences and increasing inter-class differences, the loss optimizes model training. The triplet margin center loss function trains only the labeled pedestrian features, making the model minimize the feature variation within the same pedestrian and maximize the feature variation between different pedestrians.
Here, X_i ∈ R^d represents the feature of pedestrian box i, which belongs to the class with identity label y_i; c_{y_i} represents the center feature of the class with identity label y_i; c_{y_j} represents the center feature of the class with identity label y_j; m represents the number of pedestrian identity classes;
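The lookup-table update V_t ← γV_t + (1 − γ)x and the circular queue of unlabeled features in step (k) can be sketched as follows (numpy; γ = 0.5 as in the method, while the queue size and feature dimension here are illustrative, not from the patent):

```python
import numpy as np
from collections import deque

D, NUM_IDS, Q_SIZE = 4, 3, 5
lookup = np.zeros((NUM_IDS, D))          # V: one row per labeled identity
queue = deque(maxlen=Q_SIZE)             # circular queue of unlabeled features

def update_labeled(t, x, gamma=0.5):
    """Running update of identity t (row t here plays the role of
    column t of V in the text): V_t <- gamma*V_t + (1-gamma)*x."""
    lookup[t] = gamma * lookup[t] + (1.0 - gamma) * x

def push_unlabeled(x):
    """Push a new unlabeled feature; the oldest is dropped once full."""
    queue.append(x / np.linalg.norm(x))  # store L2-normalized features

x = np.array([1.0, 0.0, 0.0, 0.0])
update_labeled(0, x)                     # V_0 becomes 0.5 * x
for i in range(7):                       # overfill to exercise the rotation
    push_unlabeled(np.eye(4)[i % 4] + 1e-3)
U = np.stack(queue)                      # U^T x gives cosine sims vs a batch
```

With γ = 0.5 each update halves the weight of older observations, so the table tracks the varied poses and angles of an identity while staying a single vector per label; `deque(maxlen=...)` reproduces the push-and-evict behavior of the circular queue.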
5. Testing the pedestrian search model
Construct the test samples of pedestrian search and feed them into the trained deep complementary classifier pedestrian search network with triplet margin center loss. Pedestrian detection is carried out on the input test scene sample images, and the detected candidate pedestrian box positions are fed into the pedestrian re-identification network to obtain their 256-dimensional pedestrian features. The target pedestrian image is then input to likewise obtain its 256-dimensional pedestrian feature; feature similarity matching between it and the pedestrian box features is done with cosine similarity, and the identity label ranked most likely is the result of identity retrieval.
Specific embodiment:
S1. Generate and augment pedestrian search scene image samples;
S2. Pedestrian detection and feature learning, with a deep complementary classifier set up for pedestrian detection;
S3. Set up the joint training of the online instance matching loss and the triplet margin center loss for pedestrian re-identification;
S4. Test and predict with the pedestrian search model.
Step S1 specifically includes the following sub-steps:
(a) Filter the pedestrian boxes of true targets in the pedestrian scene images, retaining only the bounding boxes whose height is less than 70 pixels and width is less than 25 pixels, and crop 256 × 256 image blocks containing these pedestrians;
(b) From the filtered pedestrian image blocks, select the pedestrian images showing the complete body, cover the pedestrian box with random pixel noise of value 0 or 255, i.e., random black or white covering, and use the 256 × 256 image blocks containing the noise as the training set of the pedestrian generative adversarial network;
(c) Train the pedestrian generative adversarial network; in a 256 × 256 image block cropped from a pedestrian scene image, cover an arbitrary position with a noise box of height less than 70 pixels and width less than 25 pixels as the test set. After using the trained network to generate pedestrian images of different postures, restore them into the original scene images, completing the data generation and augmentation of the pedestrian search dataset.
Step S2 is as follows:
Using VGG-16 or ResNet-50, 1024-channel feature maps are output at 1/16 the resolution of the input image. Faster R-CNN is applied on the feature maps; the region proposal network (RPN) detects pedestrian boxes, a binary softmax classifier is set up to determine whether an anchor is a pedestrian, and a smooth L1 (smooth absolute) loss function is set up to regress the position and size of the bounding box;
Then, non-maximum suppression is used to delete repeated detections, and 128 candidate detection boxes are retained for each image; the pedestrian boxes are fed into the pooling layer to obtain 7 × 7 × 2048 feature maps, which are connected to one fully-connected layer and fed into three branch networks;
The first branch is the deep complementary classifier, which after training can make human/non-human judgments;
The second branch further refines the position and size of the pedestrian boxes;
The third branch is a 256-dimensional fully convolutional layer whose output is an L2-normalized feature;
(d) In the first branch, two deep complementary classifiers are set up for human/non-human judgment. Classifier A, expressed as f(θ_A), is used first to identify the most discriminative regions and generate the feature map F^A; an erasing operation removes the most discriminative parts of the feature map, and the erased feature map is then supplied to its complementary classifier B, expressed as f(θ_B), to find the complementary discriminative feature regions and generate the feature map F^B;
(e) The feature maps F^A and F^B are supervised with the cross-entropy loss function, and the model is jointly optimized with the above losses.
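The non-maximum suppression that prunes repeated detections before the pooling layer can be sketched as standard greedy IoU-based NMS; the 0.5 IoU threshold and (x1, y1, x2, y2) box format are illustrative assumptions, while the 128-box cap matches the step above.

```python
import numpy as np

def nms(boxes, scores, iou_thresh=0.5, keep_top=128):
    """Greedy NMS: keep the highest-scoring box, drop boxes overlapping it
    above `iou_thresh`, repeat; at most `keep_top` candidates survive."""
    x1, y1, x2, y2 = boxes.T
    areas = (x2 - x1) * (y2 - y1)
    order = np.argsort(-scores)
    keep = []
    while order.size and len(keep) < keep_top:
        i = order[0]
        keep.append(i)
        rest = order[1:]
        ix1 = np.maximum(x1[i], x1[rest]); iy1 = np.maximum(y1[i], y1[rest])
        ix2 = np.minimum(x2[i], x2[rest]); iy2 = np.minimum(y2[i], y2[rest])
        inter = np.clip(ix2 - ix1, 0, None) * np.clip(iy2 - iy1, 0, None)
        iou = inter / (areas[i] + areas[rest] - inter)
        order = rest[iou <= iou_thresh]      # discard near-duplicates
    return keep

boxes = np.array([[0, 0, 10, 20], [1, 1, 11, 21], [30, 0, 40, 20]], float)
kept = nms(boxes, np.array([0.9, 0.8, 0.7]))
```

Here the second box heavily overlaps the first and is suppressed, while the distant third box survives, mirroring how repeated detections of one pedestrian are collapsed to a single candidate box.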
Step S3 specifically includes:
(f) The feature map obtained by the second branch is jointly trained with the online instance matching loss and the triplet margin center loss. During back-propagation, if the classification label of the target pedestrian is t, column t of the lookup table is updated with the formula V_t ← γV_t + (1 − γ)x, enabling the lookup table to save the varied features of the same target pedestrian under many poses and angles, where V_t represents the pedestrian feature whose classification label is t, and γ is the update weight, which can take a value in the interval (0, 1); this method takes γ = 0.5.
The pedestrian box features without identity labels that appear in the scene pictures are used as negative samples, which are also valuable for learning feature representations. These unlabeled features are saved in a circular queue Q, denoted U ∈ R^{D×Q}, a D × Q matrix, where D is the pedestrian box feature dimension after L2 regularization and Q is the size of the circular queue, set according to the actual scene. Meanwhile, the cosine similarity U^T x between U and the samples x in the mini-batch is computed. After each round of iteration, new feature vectors are pushed into the queue and outdated feature vectors are removed, forming a circular process.
The triplet margin center loss function shown in formula (6) is introduced to constrain the features with identity labels. By reducing intra-class differences and increasing inter-class differences, the loss optimizes model training. The triplet margin center loss function trains only the labeled pedestrian features, making the model minimize the feature variation within the same pedestrian and maximize the feature variation between different pedestrians.
Here, X_i ∈ R^d represents the feature of pedestrian box i, which belongs to the class with identity label y_i; c_{y_i} represents the center feature of the class with identity label y_i; c_{y_j} represents the center feature of the class with identity label y_j; m represents the number of pedestrian identity classes.
Step S4 includes:
Construct the test samples of pedestrian search and feed them into the trained deep complementary classifier pedestrian search network with triplet margin center loss. Pedestrian detection is carried out on the input test scene sample images, and the detected candidate pedestrian box positions are fed into the pedestrian re-identification network to obtain their 256-dimensional pedestrian features. The target pedestrian image is then input to likewise obtain its 256-dimensional pedestrian feature; feature similarity matching between it and the pedestrian box features is done with cosine similarity, and the identity label ranked most likely is the result of identity retrieval.
Claims (5)
1. A deep complementary classifier pedestrian search method with triplet margin center loss, characterized by comprising the following steps:
(1) Before model training, a pedestrian generative adversarial network is trained on original images, and the network is used to synthesize new pedestrians at arbitrary positions of the original images, generating new scene pictures and thereby generating and augmenting the training dataset of the pedestrian search network;
(2) In the training stage, feature information is first extracted from the entire input scene image by a convolutional neural network;
(3) In the pedestrian detection stage, a region proposal network (RPN) is set up and a deep complementary classifier is used to obtain, in each frame of the video image, the candidate regions that are likely to be pedestrian targets; the size and location of the pedestrian candidate regions are continuously refined, and their feature information is extracted;
(4) After the feature information of the detected pedestrian candidate regions is pooled to the same size, it is fed into the pedestrian re-identification network for training; the triplet margin center loss and the online instance matching loss function are jointly optimized to update pedestrian features and improve the re-identification performance of the pedestrian search model;
(5) In the model testing and prediction stage, the trained pedestrian search model performs pedestrian detection on the input scene image; after the pedestrian boxes are detected, feature similarity matching and ranking against the target pedestrian image is carried out for retrieval, and the pedestrian with the highest feature matching degree is the pedestrian to be retrieved.
2. The deep complementary classifier pedestrian search method with triplet margin center loss according to claim 1, characterized in that step 1 specifically comprises:
1.1, filtering the pedestrian boxes of true targets in the pedestrian scene images, retaining only the bounding boxes whose height and width are below a certain fixed value, and cropping a fixed-size image block containing each such pedestrian;
1.2, from the filtered pedestrian image blocks, selecting the pedestrian images that show the complete body shape, and covering the pedestrian box at a random position with pixel-value 0 or 255 noise, i.e., a random black or white mask; the image blocks containing the noise serve as the training set of the pedestrian adversarial generation network;
1.3, training the pedestrian adversarial generation model so that the network learns a mapping from black-and-white noise boxes to specific pedestrian images;
1.4, when generating a pedestrian image, cropping a fixed-size image block from the scene image in which a pedestrian is to be generated, selecting an arbitrary position in it and covering that position with a noise box of a certain height and width, which serves as the test set of the pedestrian adversarial generation model;
1.5, using the mapping from black-and-white noise boxes to specific pedestrian images learned by the network; after the pedestrian image is generated, restoring it into the original scene image, completing the data generation and augmentation of the pedestrian search dataset.
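The noise masking of steps 1.2 and 1.4 can be sketched as follows; a minimal NumPy sketch in which the function name `add_noise_box`, the block size, and the mask geometry are illustrative choices, not values from the patent:

```python
import numpy as np

def add_noise_box(block, top, left, height, width, rng=None):
    """Cover a region of an image block with random black (0) or white (255)
    pixel noise, as in the patent's data-augmentation steps 1.2 / 1.4."""
    rng = np.random.default_rng() if rng is None else rng
    noisy = block.copy()
    # Each covered pixel is independently set to 0 or 255.
    noise = rng.choice([0, 255], size=(height, width) + block.shape[2:])
    noisy[top:top + height, left:left + width] = noise
    return noisy

# Example: a 64x32 single-channel block with a 16x8 noise box.
block = np.full((64, 32), 128, dtype=np.uint8)
masked = add_noise_box(block, top=10, left=5, height=16, width=8)
```

The masked blocks form the generator's training input; the original (unmasked) block is the target the network learns to reconstruct.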
3. The deep complementary classifier pedestrian search method with triplet margin center loss according to claim 1, characterized in that step 3 is specifically as follows:
3.1, after the feature information of the entire scene image has been extracted in step 2, the Faster R-CNN region proposal network (RPN) detects pedestrian candidate regions; a cross-entropy loss classifier is set up to decide whether an anchor is a pedestrian, and a smooth L1 loss function regresses the position and size of the bounding boxes;
3.2, non-maximum suppression is then applied to remove duplicate detections, retaining 128 candidate detection boxes per image; the pedestrian boxes are fed into a pooling layer to obtain 7 × 7 × 2048 feature maps, which pass through one fully connected layer and are then fed into three branch networks;
3.3, the first branch is the deep complementary classifier, which is trained to make the human/non-human decision;
3.4, the second branch further refines the position and size of the pedestrian boxes;
3.5, the third branch is a 256-dimensional fully connected layer whose output is the L2-normalized feature;
3.6, in the first branch, two deep complementary classifiers are set up for the human/non-human decision; the first classifier A, denoted f(θ_A), identifies the most discriminative region and generates feature map F_A; an erasing operation then removes the most discriminative region from the feature map, and the erased feature map is supplied to the complementary classifier B, denoted f(θ_B), which discovers the complementary discriminative feature regions and generates feature map F_B;
3.7, the feature maps F_A and F_B are supervised with the cross-entropy loss function, and the model is jointly optimized together with the other losses.
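The erasing operation of step 3.6 can be illustrated with a simplified single-channel sketch in the spirit of the cited Adversarial Complementary Learning work; the threshold rule and all names here are assumptions, since the claim does not specify how the most discriminative region is selected:

```python
import numpy as np

def erase_discriminative(feature_map, attention, threshold=0.8):
    """Zero out the spatial positions that classifier A found most
    discriminative, so complementary classifier B must rely on the
    remaining regions. `attention` is A's activation map, same H x W
    as the feature map; positions above threshold * max are erased."""
    mask = attention < threshold * attention.max()
    return feature_map * mask  # input to classifier B: F_A with salient region erased

# Toy 4x4 feature map whose top-left position is most discriminative.
fmap = np.ones((4, 4))
attn = np.zeros((4, 4))
attn[0, 0] = 1.0
erased = erase_discriminative(fmap, attn)
```

Classifier B is then trained on `erased`, which forces it to find complementary evidence for the human/non-human decision.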
4. The deep complementary classifier pedestrian search method with triplet margin center loss according to claim 1, characterized in that step 4 specifically comprises:
4.1, the feature obtained by the third branch is trained jointly with the online instance matching loss and the triplet margin center loss; during back-propagation, if the classification label of the target pedestrian is t, the t-th column of the lookup table is updated with the following formula, so that the lookup table can store the features of the same target pedestrian under many poses and viewing angles:
V_t ← γV_t + (1 − γ)x
where:
V_t represents the pedestrian feature whose classification label is t;
γ is the update weight, which takes a value in the interval (0, 1); this method takes γ = 0.5;
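The lookup-table update of step 4.1 can be sketched as follows; re-normalizing the updated column is an assumption carried over from the L2-normalized features of step 3.5, and the function name is illustrative:

```python
import numpy as np

def update_lookup_table(V, t, x, gamma=0.5):
    """Lookup-table update from step 4.1: V_t <- gamma * V_t + (1 - gamma) * x.
    The updated column is then L2-renormalized (assumption: stored features
    are kept on the unit sphere, matching the L2-normalized feature branch)."""
    V = V.copy()
    V[:, t] = gamma * V[:, t] + (1.0 - gamma) * np.asarray(x)
    V[:, t] /= np.linalg.norm(V[:, t])
    return V

# Toy table with two stored unit features (columns); update column 0.
V = np.eye(2)
V2 = update_lookup_table(V, t=0, x=np.array([0.0, 1.0]))
```

With γ = 0.5 the stored feature drifts halfway toward each new observation of the same identity, which is how the table accumulates pose and viewpoint variation.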
4.2, the pedestrian box features without identity labels that appear in the scene images serve as negative samples; since they are also valuable for learning the feature representation, these unlabeled features are stored in a circular queue, denoted U ∈ R^{D×Q}, a D × Q matrix, where D is the dimension of the L2-normalized pedestrian box features and Q is the size of the circular queue, set according to the actual scene; meanwhile, the cosine similarity U^T x between the unlabeled features U and each sample x in the mini-batch is computed; after each iteration, new feature vectors are pushed into the queue and outdated feature vectors are evicted, forming a circular process;
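The circular queue of step 4.2 can be sketched with a fixed-capacity deque; class and method names, and the toy feature dimension, are illustrative:

```python
from collections import deque

import numpy as np

class UnlabeledQueue:
    """Circular queue of unlabeled pedestrian-box features (step 4.2).
    Holds at most Q features of dimension D; a new push evicts the oldest."""
    def __init__(self, dim_d, size_q):
        self.buf = deque(maxlen=size_q)  # deque drops the oldest entry itself
        self.dim_d = dim_d

    def push(self, feature):
        self.buf.append(np.asarray(feature, dtype=float))

    def similarities(self, x):
        """Cosine similarity U^T x; features and x assumed L2-normalized."""
        if not self.buf:
            return np.empty(0)
        U = np.stack(list(self.buf), axis=1)  # D x Q' matrix U
        return U.T @ np.asarray(x)

q = UnlabeledQueue(dim_d=2, size_q=3)
for f in ([1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]):
    q.push(f)  # the fourth push evicts the first feature
sims = q.similarities([1.0, 0.0])
```

Because the features are L2-normalized, the dot products in `similarities` are exactly the cosine similarities used as negative-sample scores.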
The triplet margin center loss function shown in formula (6) is introduced to constrain the features that carry identity labels; it optimizes model training by reducing intra-class variation and enlarging inter-class differences. The triplet margin center loss function is trained only on labeled pedestrian features, so that the model minimizes the feature variation within the same pedestrian and maximizes the feature variation between different pedestrians,
where:
X_i ∈ R^d represents the feature of pedestrian box i, belonging to the class with identity label y_i;
c_{y_i} represents the center feature of the class with identity label y_i;
c_{y_j} represents the center feature of the class with identity label y_j;
m represents the number of pedestrian classes.
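Formula (6) itself is not reproduced in this text (it appeared as an image in the original). A standard triplet-center hinge form consistent with the symbol definitions above reads L = (1/m) Σ_i max(0, ‖X_i − c_{y_i}‖² + ρ − min_{j≠y_i} ‖X_i − c_{y_j}‖²) with margin ρ; the sketch below implements this hedged stand-in, not necessarily the patent's exact formula:

```python
import numpy as np

def triplet_center_loss(X, y, centers, margin=1.0):
    """Standard triplet-center hinge loss, used here as a stand-in for
    formula (6): pull each X_i toward its own class center c_{y_i} and
    push it at least `margin` beyond the nearest other class center."""
    loss = 0.0
    for x_i, y_i in zip(X, y):
        d = np.sum((centers - x_i) ** 2, axis=1)  # squared distance to every center
        d_pos = d[y_i]                            # distance to own center
        d_neg = np.min(np.delete(d, y_i))         # nearest other-class center
        loss += max(0.0, d_pos + margin - d_neg)  # hinge: intra small, inter large
    return loss / len(X)

# Two toy class centers far apart in a 2-d feature space.
centers = np.array([[0.0, 0.0], [10.0, 0.0]])
```

A feature sitting on its own center incurs zero loss; a feature sitting on another class's center incurs the full hinge penalty.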
5. The deep complementary classifier pedestrian search method with triplet margin center loss according to claim 1, characterized in that step 5 is specifically as follows:
construct the test samples for pedestrian search, and feed them into the trained deep complementary classifier pedestrian search network with triplet margin center loss; perform pedestrian detection on the input test scene sample images to obtain the candidate pedestrian box positions; feed these into the pedestrian re-identification network to obtain their 256-dimensional pedestrian features; then input the target pedestrian image and likewise obtain its 256-dimensional pedestrian feature; match it against the pedestrian box features by cosine similarity, rank them, and sort out the most likely identity label as the result of identity retrieval.
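For L2-normalized features, the retrieval of step 5 reduces to ranking dot products; a minimal sketch with an illustrative 2-d toy feature space (the patent uses 256-d features), where the function name is an illustrative choice:

```python
import numpy as np

def retrieve_identity(target_feat, box_feats):
    """Step 5 retrieval sketch: rank detected pedestrian-box features by
    cosine similarity to the target feature; the top match is the
    retrieved identity. Features are assumed L2-normalized, so cosine
    similarity reduces to a dot product."""
    sims = box_feats @ np.asarray(target_feat)  # one similarity per detected box
    order = np.argsort(-sims)                   # indices in descending similarity
    return order[0], sims

# Toy example: 3 detected boxes in a 2-d feature space.
boxes = np.array([[1.0, 0.0], [0.0, 1.0], [0.8, 0.6]])
best, sims = retrieve_identity([0.0, 1.0], boxes)
```

`order` gives the full similarity ranking, so the same call also supports top-k retrieval rather than only the single best match.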
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910542675.1A CN110414336A (en) | 2019-06-21 | 2019-06-21 | A kind of depth complementation classifier pedestrian's searching method of triple edge center loss |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110414336A true CN110414336A (en) | 2019-11-05 |
Family
ID=68359564
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910542675.1A Pending CN110414336A (en) | 2019-06-21 | 2019-06-21 | A kind of depth complementation classifier pedestrian's searching method of triple edge center loss |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110414336A (en) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109063776A (en) * | 2018-08-07 | 2018-12-21 | Beijing Megvii Technology Co., Ltd. | Image re-identification network training method and device, and image re-identification method and device |
CN109146921A (en) * | 2018-07-02 | 2019-01-04 | Huazhong University of Science and Technology | A pedestrian target tracking method based on deep learning |
Non-Patent Citations (4)
Title |
---|
TONG XIAO et al.: "End-to-End Deep Learning for Person Search", arXiv * |
XIAOLIN ZHANG et al.: "Adversarial Complementary Learning for Weakly Supervised Object Localization", Computer Vision Foundation * |
HE Qionghua: "Research and Implementation of a Face Recognition Algorithm Based on Triplet-aware Center Loss", China Master's Theses Full-text Database * |
ZHAO Wenxuan: "Person Re-identification under Intelligent Surveillance", China Master's Theses Full-text Database * |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111027397A (en) * | 2019-11-14 | 2020-04-17 | 上海交通大学 | Method, system, medium and device for detecting comprehensive characteristic target in intelligent monitoring network |
CN111027397B (en) * | 2019-11-14 | 2023-05-12 | 上海交通大学 | Comprehensive feature target detection method, system, medium and equipment suitable for intelligent monitoring network |
CN111062479A (en) * | 2019-12-19 | 2020-04-24 | 北京迈格威科技有限公司 | Model rapid upgrading method and device based on neural network |
CN111062479B (en) * | 2019-12-19 | 2024-01-23 | 北京迈格威科技有限公司 | Neural network-based rapid model upgrading method and device |
CN111340700A (en) * | 2020-02-21 | 2020-06-26 | 北京中科虹霸科技有限公司 | Model generation method, resolution improvement method, image identification method and device |
CN111340700B (en) * | 2020-02-21 | 2023-04-25 | 北京中科虹霸科技有限公司 | Model generation method, resolution improvement method, image recognition method and device |
CN113723188A (en) * | 2021-07-28 | 2021-11-30 | 国网浙江省电力有限公司电力科学研究院 | Dress uniform person identity verification method combining face and gait features |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110956185B (en) | Method for detecting image salient object | |
Jia et al. | Detection and segmentation of overlapped fruits based on optimized mask R-CNN application in apple harvesting robot | |
Zheng et al. | Cross-regional oil palm tree counting and detection via a multi-level attention domain adaptation network | |
Zhang et al. | Multi-task cascaded convolutional networks based intelligent fruit detection for designing automated robot | |
CN110414336A (en) | A kind of depth complementation classifier pedestrian's searching method of triple edge center loss | |
CN105512640B (en) | A kind of people flow rate statistical method based on video sequence | |
CN105844292B (en) | A kind of image scene mask method based on condition random field and secondary dictionary learning | |
CN108830188A (en) | Vehicle checking method based on deep learning | |
CN109948425A (en) | A kind of perception of structure is from paying attention to and online example polymerize matched pedestrian's searching method and device | |
CN109614985A (en) | A kind of object detection method based on intensive connection features pyramid network | |
CN105389562B (en) | A kind of double optimization method of the monitor video pedestrian weight recognition result of space-time restriction | |
CN107346420A (en) | Text detection localization method under a kind of natural scene based on deep learning | |
CN109871875B (en) | Building change detection method based on deep learning | |
Liu et al. | Super-pixel cloud detection using hierarchical fusion CNN | |
CN112488229B (en) | Domain self-adaptive unsupervised target detection method based on feature separation and alignment | |
CN110246141A (en) | It is a kind of based on joint angle point pond vehicles in complex traffic scene under vehicle image partition method | |
CN109344842A (en) | A kind of pedestrian's recognition methods again based on semantic region expression | |
Song et al. | A hierarchical object detection method in large-scale optical remote sensing satellite imagery using saliency detection and CNN | |
Zhang et al. | Guided attention in cnns for occluded pedestrian detection and re-identification | |
CN110956158A (en) | Pedestrian shielding re-identification method based on teacher and student learning frame | |
CN109033944A (en) | A kind of all-sky aurora image classification and crucial partial structurtes localization method and system | |
CN106874825A (en) | The training method of Face datection, detection method and device | |
CN107463954A (en) | A kind of template matches recognition methods for obscuring different spectrogram picture | |
CN110533100A (en) | A method of CME detection and tracking is carried out based on machine learning | |
Ge et al. | Coarse-to-fine foraminifera image segmentation through 3D and deep features |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication |
Application publication date: 2019-11-05 |