CN110414336A - Deep complementary classifier pedestrian search method with triplet margin center loss - Google Patents


Info

Publication number
CN110414336A
Authority
CN
China
Prior art keywords
pedestrian
feature
image
frame
classifier
Prior art date
Legal status
Pending
Application number
CN201910542675.1A
Other languages
Chinese (zh)
Inventor
姚睿
高存远
夏士雄
赵佳琦
周勇
牛强
袁冠
张凤荣
陈朋朋
王重秋
Current Assignee
China University of Mining and Technology CUMT
Original Assignee
China University of Mining and Technology CUMT
Priority date
Filing date
Publication date
Application filed by China University of Mining and Technology (CUMT)
Priority to CN201910542675.1A
Publication of CN110414336A
Legal status: Pending


Classifications

    • G06F18/22: Pattern recognition; Analysing; Matching criteria, e.g. proximity measures
    • G06F18/241: Pattern recognition; Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06V10/44: Extraction of image or video features; Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G06V40/10: Recognition of human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Multimedia (AREA)
  • Human Computer Interaction (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a deep complementary classifier pedestrian search method with triplet margin center loss, belonging to the field of computer vision processing. In the person re-identification part, a triplet margin center loss is proposed: building on the center loss, which effectively reduces the feature variation of the same pedestrian, it introduces the idea of the triplet loss to effectively increase the feature differences between different pedestrians. By improving the two subtasks of pedestrian search, namely detection and re-identification, the overall performance of the pedestrian search model is raised. The invention can simultaneously perform pedestrian detection and re-identification on large-scale real-scene images and plays a significant role in security applications such as urban surveillance.

Description

Deep complementary classifier pedestrian search method with triplet margin center loss
Technical field
The invention belongs to the field of computer vision processing, and more particularly relates to a deep complementary classifier pedestrian search method with triplet margin center loss in the fields of object detection and object retrieval.
Background technique
The document Xiao, Tong, et al., "End-to-End Deep Learning for Person Search," arXiv:1604.01850 (2016), integrated pedestrian detection and person re-identification into an end-to-end person search framework for the first time. Current re-identification benchmarks and methods mainly match cropped pedestrian images, but real-world scenes are rarely so ideal: cropping pedestrian images consumes a large amount of time, and real-time constraints sometimes make cropped photos unavailable. Pedestrian search therefore does not require cropped pedestrian images: candidate pedestrian boxes are first detected with a pedestrian detection method, and the person with a specific identity is then retrieved with a re-identification method.
Other work addresses the big-data, small-sample problem of pedestrian search by using a generative adversarial network to add synthetic unlabeled pedestrians, which is especially effective when pedestrians in the scene pictures are sparse.
The human/non-human decision in pedestrian detection often suffers from missed and false detections.
Summary of the invention
In view of the above technical problems, this method makes efficient use of deep complementary classifiers so that the detector can discover complementary discriminative regions. Building on the center loss, which effectively reduces the feature variation of the same pedestrian, the idea of the triplet loss is introduced to effectively increase the feature differences between different pedestrians. The performance of the whole pedestrian search model is improved through its two subtasks, detection and re-identification.
In order to achieve the above technical purposes, the invention adopts the following technical scheme:
A deep complementary classifier pedestrian search method with triplet margin center loss, comprising the following steps:
(1) Before model training, a pedestrian generative adversarial network is trained on the original images, and the network is used to synthesize new pedestrians at arbitrary positions in the original images, generating new scene pictures and thereby generating and augmenting the training dataset of the pedestrian search network;
(2) In the training stage, feature information is first extracted from the entire input scene image by a convolutional neural network;
(3) In the pedestrian detection stage, a region proposal network (RPN) is set up and deep complementary classifiers are used to obtain the candidate regions in each video frame that are likely to contain pedestrian targets; the size and position of the pedestrian candidate regions are continually refined, and their feature information is extracted;
(4) After the feature maps of the detected pedestrian candidate regions are pooled to the same size, they are fed into the re-identification network for training; the triplet margin center loss and the online instance matching loss are jointly optimized to update the pedestrian features and improve the re-identification performance of the pedestrian search model;
(5) In the model testing and prediction stage, the trained pedestrian search model performs pedestrian detection on the input scene image; after pedestrian boxes are detected, feature-similarity matching and ranking against the target pedestrian image is carried out, and the candidate with the highest matching score is the pedestrian to be retrieved.
Step 1 specifically includes:
1.1. Filter the ground-truth pedestrian boxes in the pedestrian scene images, keeping only bounding boxes whose height and width are below certain fixed values, and crop fixed-size image blocks containing those pedestrians;
1.2. From the filtered pedestrian image blocks, select the pedestrian images showing a complete body, and cover the pedestrian box with random pixel noise of value 0 or 255, i.e. random black or white; the image blocks containing the noise region serve as the training set of the pedestrian generative adversarial network;
1.3. Train the pedestrian generative adversarial model so that the network learns a mapping from a black-and-white noise box to a concrete pedestrian image;
1.4. When generating pedestrian images, crop a fixed-size image block from the scene image in which a pedestrian is to be generated, cover an arbitrary position in it with a noise box of a certain height and width, and use it as the test set of the pedestrian generative adversarial model;
1.5. Using the mapping from noise box to concrete pedestrian image learned by the network, generate the pedestrian image, paste the block back into the original scene image, and thereby complete the data generation and augmentation of the pedestrian search dataset.
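The noise-box masking of sub-steps 1.1 to 1.4 can be sketched as follows. This is a minimal illustration with plain Python lists standing in for image arrays; the function name and the toy 8 x 8 block are assumptions for illustration, not the patent's implementation.

```python
import random

def mask_with_noise(block, top, left, height, width, seed=None):
    """Cover a region of an image block with random 0/255 (black/white) pixel
    noise, as used to build the training/test sets of the pedestrian GAN."""
    rng = random.Random(seed)
    out = [row[:] for row in block]  # copy so the original block is untouched
    for r in range(top, min(top + height, len(out))):
        for c in range(left, min(left + width, len(out[0]))):
            out[r][c] = rng.choice((0, 255))  # random black or white pixel
    return out

# A toy 8 x 8 grayscale "block" with constant value 128.
block = [[128] * 8 for _ in range(8)]
masked = mask_with_noise(block, top=1, left=2, height=4, width=3, seed=0)
```

In the patent the blocks are 256 x 256 and the masked region is an actual pedestrian box; the mechanism is the same.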
Step 3 is specifically as follows:
3.1. After the features of the entire scene image have been extracted in step 2, pedestrian candidate regions are detected with Faster R-CNN and the region proposal network (RPN); a classifier with a cross-entropy loss determines whether each anchor is a pedestrian, and a smooth L1 loss regresses the position and size of the bounding boxes;
3.2. Non-maximum suppression is then used to remove duplicate detections, and 128 candidate detection boxes are kept per image; the pedestrian box features are fed into a pooling layer to obtain 7 × 7 × 2048 feature maps, followed by one fully connected layer whose output is sent to three branch networks;
3.3. The first branch is the deep complementary classifier, which learns to make the human/non-human decision;
3.4. The second branch further refines the position and size of the pedestrian boxes;
3.5. The third branch is a 256-dimensional layer whose output is the L2-normalized feature;
3.6. In the first branch, two deep complementary classifiers are set up for the human/non-human decision. Classifier A, denoted f(θA), is used first to identify the most discriminative region and produce feature map FA; an erasing operation then removes this most discriminative part of the feature map. The erased feature map is fed to the complementary classifier B, denoted f(θB), which finds the complementary discriminative feature regions and produces feature map FB;
3.7. Feature maps FA and FB are supervised with the cross-entropy loss, and the model is jointly optimized together with the other losses.
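The erasing operation between the two complementary classifiers can be illustrated schematically: given classifier A's per-location discriminativeness scores, the strongest cell is erased before the feature map is handed to classifier B. In practice the erased region would be larger than a single cell; all names here are illustrative assumptions.

```python
def erase_most_discriminative(feature_map, scores):
    """Zero out the spatial cell with the highest discriminative score so a
    complementary classifier is forced to rely on the remaining regions.
    feature_map: H x W list of feature values; scores: H x W scores from A."""
    h = max(range(len(scores)), key=lambda r: max(scores[r]))
    w = max(range(len(scores[h])), key=lambda c: scores[h][c])
    erased = [row[:] for row in feature_map]
    erased[h][w] = 0.0  # erase the most discriminative cell
    return erased, (h, w)

scores = [[0.1, 0.9, 0.2],
          [0.3, 0.4, 0.8],
          [0.2, 0.1, 0.5]]
fmap = [[1.0] * 3 for _ in range(3)]
erased, loc = erase_most_discriminative(fmap, scores)  # loc is (0, 1)
```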
Step 4 specifically includes:
4.1. The features output by the re-identification feature branch are jointly trained with the online instance matching loss and the triplet margin center loss. During back-propagation, if the classification label of the target pedestrian is t, column t of the lookup table is updated with the formula Vt ← γVt + (1 - γ)x, so that the lookup table retains the varied features of the same target pedestrian across many poses and angles, where
Vt is the stored feature of the pedestrian whose classification label is t;
γ is the update weight, which can take any value in the interval (0, 1); this method uses γ = 0.5;
4.2. Pedestrian box features appearing in the scene pictures without an identity label are used as negative samples; they are also valuable for learning the feature representation. These unlabeled features are stored in a circular queue of size Q, represented by U ∈ R^{D×Q}, a D × Q matrix, where D is the pedestrian box feature dimension after L2 normalization and Q is the queue size, set according to the actual scene. Meanwhile, the cosine similarities U^T x between the unlabeled features U and each sample x in the mini-batch are computed; after each round of iteration, new feature vectors are pushed into the queue and outdated feature vectors are evicted, forming a circular process;
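The lookup-table update and the circular queue of unlabeled features can be sketched as follows. Class and function names are assumptions, and toy 2-dimensional vectors stand in for the L2-normalized D-dimensional features.

```python
import math
from collections import deque

def update_lookup(V_t, x, gamma=0.5):
    """Running update of lookup-table column t: V_t <- gamma*V_t + (1-gamma)*x."""
    return [gamma * v + (1.0 - gamma) * xi for v, xi in zip(V_t, x)]

class UnlabeledFeatureQueue:
    """Fixed-size circular queue U of L2-normalized unlabeled pedestrian
    features. New features evict the oldest; similarities U^T x reduce to
    dot products because all stored features are unit length."""
    def __init__(self, size):
        self.buf = deque(maxlen=size)  # oldest entries are evicted automatically

    @staticmethod
    def _normalize(v):
        n = math.sqrt(sum(x * x for x in v))
        return [x / n for x in v]

    def push(self, feature):
        self.buf.append(self._normalize(feature))

    def similarities(self, x):
        x = self._normalize(x)
        return [sum(u_i * x_i for u_i, x_i in zip(u, x)) for u in self.buf]

q = UnlabeledFeatureQueue(size=2)
q.push([1.0, 0.0])
q.push([0.0, 1.0])
q.push([1.0, 1.0])  # queue is full, so the oldest feature [1, 0] is evicted
sims = q.similarities([0.0, 1.0])
```

`deque(maxlen=Q)` gives the evict-on-push behavior of the circular queue directly; a real implementation would store the features as matrix columns for batched similarity computation.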
The triplet margin center loss function shown in formula (6) is introduced to impose a constraint on the features that carry identity labels: model training is optimized by reducing intra-class variation and increasing inter-class variation. The triplet margin center loss is trained only on labeled pedestrian features, making the model minimize the feature variation within the same pedestrian and maximize the feature variation between different pedestrians,
where Xi ∈ R^d is the feature of pedestrian box i, belonging to the class with identity label yi;
c_yi is the center feature of the class with identity label yi;
c_yj is the center feature of the class with identity label yj;
m is the number of pedestrian classes.
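The text of formula (6) does not survive in this extraction. Using the symbols defined above, a triplet-style margin center loss that pulls each feature toward its own class center while pushing the nearest other class center at least a margin δ away is commonly written as below; this is a hedged reconstruction consistent with the surrounding description, not necessarily the patent's exact formula:

```latex
L_{tmc} = \sum_{i=1}^{N} \max\!\left( \lVert X_i - c_{y_i} \rVert_2^2 + \delta
          - \min_{\substack{1 \le j \le m \\ y_j \ne y_i}} \lVert X_i - c_{y_j} \rVert_2^2,\; 0 \right)
```

where N is the number of labeled pedestrian boxes in the batch and δ is the margin.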
Step 5 is specifically as follows:
Construct the test samples for pedestrian search and feed them into the trained deep complementary classifier pedestrian search network with triplet margin center loss. Pedestrian detection is performed on the input test scene sample images; the detected candidate pedestrian box positions are fed into the re-identification network to obtain their 256-dimensional pedestrian features. The target pedestrian image is then input and its 256-dimensional pedestrian feature is obtained in the same way. Feature-similarity matching between the target feature and the pedestrian box features is done with cosine similarity, and the identity label ranked as most likely is the identity retrieval result.
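The identity retrieval described above reduces to ranking detected boxes by cosine similarity against the target feature. A minimal sketch with 3-dimensional toy vectors standing in for the 256-dimensional features; names are illustrative:

```python
import math

def cosine(a, b):
    """Cosine similarity between two feature vectors."""
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return sum(x * y for x, y in zip(a, b)) / (na * nb)

def retrieve(target_feature, box_features):
    """Rank detected pedestrian-box features by cosine similarity to the
    target pedestrian's feature; the top-ranked box is the retrieval result."""
    return sorted(range(len(box_features)),
                  key=lambda i: cosine(target_feature, box_features[i]),
                  reverse=True)

target = [1.0, 0.0, 0.0]
boxes = [[0.0, 1.0, 0.0], [0.9, 0.1, 0.0], [0.5, 0.5, 0.0]]
order = retrieve(target, boxes)  # box 1 is most similar and ranks first
```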
The beneficial effects of the present invention are:
First, a framework is proposed that uses a generative adversarial network to perform data augmentation on the original images, thereby improving the generalization of the pedestrian search model.
Second, in the pedestrian detection stage, a deep complementary classifier for pedestrian detection is proposed that performs human/non-human classification using complementary target regions, improving the overall performance of the model, making the model converge faster, and increasing the efficiency of pedestrian detection.
Third, the triplet margin center loss is combined with online instance matching so that, even when samples of a given labeled pedestrian are few, the intra-class feature gap is minimized and the inter-class feature gap is maximized. The features learned by the model are therefore more robust and can cope with the greater challenges of real-scene datasets.
Detailed description of the invention
Fig. 1 is the network flowchart of the deep complementary classifier pedestrian search network with triplet margin center loss of the present invention.
Fig. 2 is the network flowchart of the deep complementary classifier of the present invention.
Fig. 3 is the strategy diagram of the triplet margin center loss of the present invention.
Specific embodiment
In order to make the above objectives, features and advantages of the present invention clearer and easier to understand, the present invention is further described below through specific embodiments and the accompanying drawings.
The person search task is a typical big-data, small-sample problem, because each pedestrian appears in very few pictures. It is very difficult for a model to learn discriminative features for pedestrians with so few samples, and the model easily overfits the small number of pedestrian pictures. To suppress this overfitting and promote the development of pedestrian search in practical applications, the present invention proposes a deep complementary classifier pedestrian search method with triplet margin center loss. A generative adversarial network is used to synthesize new pedestrians at arbitrary positions in the original image data. Where pedestrian detection cannot accurately and efficiently detect pedestrian boxes and make the foreground/background classification, deep complementary classifiers are used to mine complementary salient information for discriminating pedestrians, further improving the accuracy of the pedestrian detection model.
In addition, this method combines the online instance matching loss with the triplet margin center loss function to better distinguish images of the same class from images of different classes, so that the pedestrian search network learns diverse and discriminative features, effectively mitigating the impact of datasets with few, poorly diverse images per class.
As shown in Fig. 1, the deep complementary classifier pedestrian search network with triplet margin center loss of the present invention includes the following steps:
1. Generating and augmenting pedestrian search scene image samples
(a) Screen the original pedestrian search scene pictures, select the bounding boxes whose pedestrian box height is less than 70 pixels and whose width is less than 25 pixels, crop 256 × 256 image blocks containing those pedestrian boxes, and then select from them the pedestrian images with a complete body.
(b) Cover the pedestrian box with random pixel noise of value 0 or 255, i.e. random black or white, and use the 256 × 256 image blocks as the training set of the pedestrian generative adversarial network;
(c) Select the scene images in which pedestrians are to be generated, again crop 256 × 256 image blocks, cover an arbitrary position with a noise box less than 70 pixels high and less than 25 pixels wide, and feed them into the trained pedestrian generative adversarial network; after the new synthetic pedestrian images are generated, paste the blocks back into the original scene images, completing the generation and augmentation of the pedestrian search scene image samples.
2. Pedestrian detection and feature learning
(d) An existing deep convolutional network, ResNet-50 or VGG-16, is used with a transfer learning strategy: network parameters pre-trained on the ImageNet dataset are imported as the initial training parameters of the deep network. After a series of convolutions, a 1024-channel feature map is output whose size is 1/16 of the input image resolution; this feature map is shared by the pedestrian detection and re-identification parts.
(e) Faster R-CNN is set on top of the feature map, with the region proposal network (RPN) responsible for detecting pedestrian boxes; the detected pedestrian box features enter a pooling layer.
(f) The re-identification network set after the pooling layer has three branches. The first branch is the deep complementary classifier, which learns to make the pedestrian/non-pedestrian decision.
(g) The second branch further refines the position and size of the pedestrian boxes.
(h) The third branch is responsible for saving, optimizing and updating the labeled pedestrian features during training, and for searching for the target pedestrian during model testing.
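The 1/16 downsampling of the shared backbone described in sub-step (d) can be sanity-checked with a tiny helper; floor division for non-divisible resolutions is an assumption, as in typical stride-16 backbones.

```python
def feature_map_shape(height, width, stride=16, channels=1024):
    """Shape of the shared backbone feature map: 1024 channels at 1/16 of
    the input resolution, as described for the ResNet-50/VGG-16 backbone."""
    return (channels, height // stride, width // stride)

shape = feature_map_shape(600, 800)  # e.g. a 600 x 800 scene image
```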
3. Pedestrian detection with deep complementary classifiers
(i) In the first branch, two deep complementary classifiers are set up for the human/non-human decision, with the network flow shown in Fig. 2. Classifier A, denoted f(θA), is used first to identify the most discriminative region and produce feature map FA; an erasing operation then removes this most discriminative part of the feature map. The erased feature map is fed to the complementary classifier B, denoted f(θB), which finds the complementary discriminative feature regions and produces feature map FB.
(j) Feature maps FA and FB are supervised with the cross-entropy loss, and the model is jointly optimized together with the other losses;
4. Joint training of person re-identification with the online instance matching loss and the triplet margin center loss
(k) The features output by the feature branch are jointly trained with the online instance matching loss and the triplet margin center loss, whose strategy is shown in Fig. 3. During back-propagation, if the classification label of the target pedestrian is t, column t of the lookup table is updated with the formula Vt ← γVt + (1 - γ)x, so that the lookup table retains the varied features of the same target pedestrian across many poses and angles, where Vt is the stored feature of the pedestrian with classification label t, and γ is the update weight, which can take any value in the interval (0, 1); this method uses γ = 0.5;
Pedestrian box features appearing in the scene pictures without an identity label are used as negative samples; they are also valuable for learning the feature representation. These unlabeled features are stored in a circular queue of size Q, represented by U ∈ R^{D×Q}, a D × Q matrix, where D is the pedestrian box feature dimension after L2 normalization and Q is the queue size, set according to the actual scene. Meanwhile, the cosine similarities U^T x between U and each sample x in the mini-batch are computed; after each round of iteration, new feature vectors are pushed into the queue and outdated feature vectors are evicted, forming a circular process;
The triplet margin center loss function shown in formula (6) is introduced to impose a constraint on the features that carry identity labels: model training is optimized by reducing intra-class variation and increasing inter-class variation. The triplet margin center loss is trained only on labeled pedestrian features, making the model minimize the feature variation within the same pedestrian and maximize the feature variation between different pedestrians,
where Xi ∈ R^d is the feature of pedestrian box i, belonging to the class with identity label yi; c_yi is the center feature of the class with identity label yi; c_yj is the center feature of the class with identity label yj; and m is the number of pedestrian classes;
5. Testing the pedestrian search model
Construct the test samples for pedestrian search and feed them into the trained deep complementary classifier pedestrian search network with triplet margin center loss. Pedestrian detection is performed on the input test scene sample images; the detected candidate pedestrian box positions are fed into the re-identification network to obtain their 256-dimensional pedestrian features. The target pedestrian image is then input and its 256-dimensional pedestrian feature is obtained in the same way. Feature-similarity matching between it and the pedestrian box features is done with cosine similarity, and the identity label ranked as most likely is the identity retrieval result.
Specific embodiment:
S1. Generate and augment the pedestrian search scene image samples;
S2. Pedestrian detection and feature learning, with deep complementary classifiers set up for pedestrian detection;
S3. Set up the joint training of the online instance matching loss and the triplet margin center loss for person re-identification;
S4. Test the pedestrian search model and make predictions.
Step S1 specifically includes the following sub-steps:
(a) Filter the ground-truth pedestrian boxes in the pedestrian scene images, keeping only the bounding boxes whose height is less than 70 pixels and whose width is less than 25 pixels, and crop 256 × 256 image blocks containing those pedestrians;
(b) From the filtered pedestrian image blocks, select the pedestrian images showing a complete body and cover the pedestrian box with random pixel noise of value 0 or 255, i.e. random black or white; the 256 × 256 image blocks containing the noise region serve as the training set of the pedestrian generative adversarial network;
(c) Train the pedestrian generative adversarial network; in 256 × 256 image blocks cropped from the pedestrian scene images, cover an arbitrary position with a noise box less than 70 pixels high and less than 25 pixels wide as the test set; use the trained network to generate pedestrian images of different postures, paste the blocks back into the original scene images, and complete the data generation and augmentation of the pedestrian search dataset.
Step S2 is specifically as follows:
Using VGG-16 or ResNet-50, a 1024-channel feature map is output whose size is 1/16 of the input image resolution. Faster R-CNN is applied on the feature map: the region proposal network (RPN) detects pedestrian boxes, a binary softmax classifier determines whether each anchor is a pedestrian, and a smooth L1 loss regresses the position and size of the bounding boxes;
Non-maximum suppression is then used to remove duplicate detections, and 128 candidate detection boxes are kept per image; the pedestrian box features are fed into a pooling layer to obtain 7 × 7 × 2048 feature maps, followed by one fully connected layer whose output is sent to three branch networks;
The first branch is the deep complementary classifier, which learns to make the human/non-human decision;
The second branch further refines the position and size of the pedestrian boxes;
The third branch is a 256-dimensional layer whose output is the L2-normalized feature;
(d) In the first branch, two deep complementary classifiers are set up for the human/non-human decision. Classifier A, denoted f(θA), is used first to identify the most discriminative region and produce feature map FA; an erasing operation then removes this most discriminative part of the feature map. The erased feature map is fed to the complementary classifier B, denoted f(θB), which finds the complementary discriminative feature regions and produces feature map FB;
(e) Feature maps FA and FB are supervised with the cross-entropy loss, and the model is jointly optimized together with the other losses.
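The duplicate-removal step above can be illustrated with a greedy non-maximum suppression sketch. Only the cap of 128 retained boxes comes from the text; the 0.5 IoU threshold and the (x1, y1, x2, y2) box format are assumptions.

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def nms(boxes, scores, iou_thresh=0.5, keep_max=128):
    """Greedy non-maximum suppression: keep the highest-scoring boxes, drop
    overlapping duplicates, and retain at most keep_max candidates, matching
    the 128 candidate boxes kept per image in the text."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    kept = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) < iou_thresh for j in kept):
            kept.append(i)
        if len(kept) == keep_max:
            break
    return kept

boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores)  # box 1 overlaps box 0 heavily and is suppressed
```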
Step S3 is specifically included:
(f) characteristic pattern that second branch obtains utilizes the loss of online example match and triple edge center loss connection Training is closed, in back-propagating, if the tag along sort of target pedestrian is t, is just updated in inquiry table using following formula T column, enable inquiry table to save many attitude of same target pedestrian and the various feature V under anglet←γVt+ (1- γ) x, Wherein,
VtTag along sort is represented as pedestrian's feature of t, the weight that γ setting updates can be in section (0,1) interior value, this γ=0.5 is taken in method;
To pedestrian's frame feature of the not tag identity occurred in scene picture as negative sample, for the table of learning characteristic Up to be also it is of great value, these features without tag identity are saved by setting round-robin queue Q, with U ∈ RD×QIt indicates, D × Q ties up matrix, and D is pedestrian's frame characteristic dimension after L2 regularization, and Q is the size of round-robin queue, is arranged according to actual scene big It is small, while calculating the cosine similarity U in U and minimum batch between sample xTX, after each round iteration, by new feature Vector is pressed into queue, and rejects those out-of-date feature vectors, and the process of a circulation is presented;
It introduces triple edge center loss function shown in formula (6) and constraint is realized to the feature with tag identity, By reducing difference in class, increase class inherited loss Optimized model training, triple edge center loss function only trains tool There is pedestrian's feature of label, changes model minimization with the internal feature of a group traveling together, maximize the internal feature of different pedestrians Variation
Wherein, Xi∈RdThe feature of pedestrian's frame i is represented, it is to belong to people's identity label yiClass,Represent the person Part label yiThe central feature of class,Representative's identity label yjThe central feature of class, m indicate the number of pedestrian pedestrian's classification Amount.
Step S4 includes:
Construct the test samples for pedestrian search and feed them into the trained deep complementary classifier pedestrian search network with triplet margin center loss. Pedestrian detection is performed on the input test scene sample images; the detected candidate pedestrian box positions are fed into the re-identification network to obtain their 256-dimensional pedestrian features. The target pedestrian image is then input and its 256-dimensional pedestrian feature is obtained in the same way. Feature-similarity matching between it and the pedestrian box features is done with cosine similarity, and the identity label ranked as most likely is the identity retrieval result.

Claims (5)

1. A deep complementary classifier pedestrian search method with triplet margin center loss, characterized by comprising the following steps:
(1) Before model training, a pedestrian generative adversarial network is trained on the original images, and the network is used to synthesize new pedestrians at arbitrary positions in the original images, generating new scene pictures and thereby generating and augmenting the training dataset of the pedestrian search network;
(2) In the training stage, feature information is first extracted from the entire input scene image by a convolutional neural network;
(3) In the pedestrian detection stage, a region proposal network (RPN) is set up and deep complementary classifiers are used to obtain the candidate regions in each video frame that are likely to contain pedestrian targets; the size and position of the pedestrian candidate regions are continually refined, and their feature information is extracted;
(4) After the feature maps of the detected pedestrian candidate regions are pooled to the same size, they are fed into the re-identification network for training; the triplet margin center loss and the online instance matching loss are jointly optimized to update the pedestrian features and improve the re-identification performance of the pedestrian search model;
(5) In the model testing and prediction stage, the trained pedestrian search model performs pedestrian detection on the input scene image; after pedestrian boxes are detected, feature-similarity matching and ranking against the target pedestrian image is carried out, and the candidate with the highest matching score is the pedestrian to be retrieved.
2. The deep complementary-classifier pedestrian search method with triplet margin center loss according to claim 1, characterized in that step 1 specifically comprises:
1.1. filter the ground-truth pedestrian boxes in the pedestrian scene images, retaining only bounding boxes whose height and width are below a fixed threshold, and crop a fixed-size image block containing each such pedestrian;
1.2. from the filtered pedestrian image blocks, select the pedestrian images showing the complete body, and cover the pedestrian box with random pixel noise of value 0 or 255, i.e. random black or white; the image blocks containing this noise serve as the training set of the pedestrian generative adversarial network;
1.3. by training the pedestrian generative adversarial model, make the network learn a mapping from a black-and-white noise box to a specific pedestrian image;
1.4. when generating pedestrian images, crop a fixed-size image block from the scene image in which a pedestrian is to be generated, choose an arbitrary position in it, and cover that position with a noise box of a given height and width; these blocks serve as the test set of the pedestrian generative adversarial model;
1.5. apply the noise-box-to-pedestrian mapping learned by the network to generate the pedestrian image, then paste the block back into the original scene image, completing the data generation and augmentation of the pedestrian search dataset.
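The training-pair construction in steps 1.1 and 1.2 can be sketched as follows: a fixed-size crop around a pedestrian is paired with a copy whose box region is corrupted by random black-or-white noise, giving the GAN an (input, ground-truth) pair. The crop-centering rule, sizes, and names are assumptions for illustration.

```python
import numpy as np

def make_gan_pair(scene, box, crop_size=256):
    """scene: HxWx3 uint8 image; box: (x1, y1, x2, y2) pedestrian frame.
    Returns (corrupted_block, clean_block) as a GAN training pair."""
    x1, y1, x2, y2 = box
    h, w = scene.shape[:2]
    # centre a fixed-size crop on the pedestrian, clamped to the image border
    cx, cy = (x1 + x2) // 2, (y1 + y2) // 2
    left = int(np.clip(cx - crop_size // 2, 0, max(w - crop_size, 0)))
    top = int(np.clip(cy - crop_size // 2, 0, max(h - crop_size, 0)))
    target = scene[top:top + crop_size, left:left + crop_size].copy()
    # corrupt the pedestrian region with random black (0) or white (255) pixels
    corrupted = target.copy()
    bx1, by1 = max(x1 - left, 0), max(y1 - top, 0)
    bx2, by2 = min(x2 - left, crop_size), min(y2 - top, crop_size)
    noise = np.random.choice([0, 255], size=(by2 - by1, bx2 - bx1, 1)).astype(np.uint8)
    corrupted[by1:by2, bx1:bx2] = noise        # broadcasts over the 3 channels
    return corrupted, target
```

Step 1.4 is the same masking applied at an arbitrary position of a scene crop, with no clean target, so the trained generator can fill the noise box with a synthesized pedestrian.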
3. The deep complementary-classifier pedestrian search method with triplet margin center loss according to claim 1, characterized in that step 3 is specifically as follows:
3.1. from the feature information of the entire scene image extracted in step 2, detect pedestrian candidate regions with Faster R-CNN and the region proposal network RPN; a classifier with a cross-entropy loss function decides whether each anchor is a pedestrian, and a smooth L1 loss function regresses the position and size of the bounding box;
3.2. then use non-maximum suppression to delete duplicate detections and retain 128 candidate detection boxes per image; feed the pedestrian boxes into a pooling layer to obtain 7 × 7 × 2048 feature maps, followed by one fully connected layer feeding three branch networks;
3.3. the first branch is the deep complementary classifier, which is trained to make the person/non-person decision;
3.4. the second branch further refines the position and size of the pedestrian box;
3.5. the third branch is a 256-dimensional fully connected layer whose output is the L2-normalized feature;
3.6. in the first branch, two deep complementary classifiers are set up for the person/non-person decision: the first classifier A, denoted f(θ_A), identifies the most discriminative regions and produces feature map F^A; an erasing operation removes the most discriminative regions from the feature map, and the erased feature map is then supplied to the complementary classifier B, denoted f(θ_B), to discover the complementary discriminative regions, producing feature map F^B;
3.7. the feature maps F^A and F^B are supervised with the cross-entropy loss function, which is combined with the other losses to jointly optimize the model.
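The erasing operation in step 3.6 can be sketched as follows: the most discriminative activations found through classifier A are zeroed out before the map is handed to complementary classifier B. The thresholding rule (erasing the top fraction of peak activations) is an assumption for illustration, not the patent's exact rule.

```python
import numpy as np

def erase_discriminative(feature_map, frac=0.2):
    """feature_map: C x H x W activations; returns the erased copy fed to classifier B."""
    saliency = feature_map.max(axis=0)             # H x W map of peak activation per position
    thresh = np.quantile(saliency, 1.0 - frac)     # cut-off for the most discriminative part
    mask = saliency < thresh                       # False where classifier A is most confident
    return feature_map * mask[None, :, :]          # zero those positions in every channel
```

Classifier B, trained on the erased maps, is thereby forced to respond to the remaining (complementary) evidence of the pedestrian.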
4. The deep complementary-classifier pedestrian search method with triplet margin center loss according to claim 1, characterized in that step 4 specifically comprises:
4.1. the feature map obtained from the second branch is trained jointly with the online instance matching loss and the triplet margin center loss; during back-propagation, if the classification label of the target pedestrian is t, the t-th column of the lookup table is updated with the following formula, enabling the lookup table to preserve the features of the same target pedestrian under many poses and viewing angles: V_t ← γV_t + (1 − γ)x, where
V_t denotes the pedestrian feature whose classification label is t;
γ is the update weight, taking values in the interval (0, 1); this method takes γ = 0.5;
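A minimal numpy sketch of the lookup-table update V_t ← γV_t + (1 − γ)x in step 4.1. The table layout (one column per labeled identity) and the re-normalization of the column after the update are assumptions for illustration, following common online-instance-matching practice.

```python
import numpy as np

def update_lookup(V, t, x, gamma=0.5):
    """V: D x N lookup table of labeled pedestrian features (one column per identity);
    x: D-dimensional feature of a sample whose classification label is t."""
    V[:, t] = gamma * V[:, t] + (1.0 - gamma) * x   # exponential moving average of class t
    V[:, t] /= np.linalg.norm(V[:, t])              # keep the column L2-normalized (assumed)
    return V
```

With γ = 0.5 the stored feature blends the old entry and the new observation equally, so the column gradually accumulates the identity's appearance across poses and angles.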
4.2. the pedestrian-box features of unlabeled identities appearing in the scene pictures serve as negative samples; since they are also valuable for feature learning, these unlabeled features are stored in a circular queue Q, denoted U ∈ R^{D×Q}, a D × Q matrix, where D is the dimension of the L2-regularized pedestrian-box features and Q is the size of the circular queue, set according to the actual scene; meanwhile the cosine similarities U^T x between the unlabeled features U and the samples x in the mini-batch are computed; after each round of iteration, new feature vectors are pushed into the queue and outdated ones are discarded, forming a cyclic process;
the triplet margin center loss function shown in formula (6) is introduced to constrain the labeled features, optimizing the model training by reducing intra-class differences and enlarging inter-class differences; the triplet margin center loss is trained only on labeled pedestrian features, making the model minimize the internal feature variation of the same pedestrian and maximize the internal feature variation between different pedestrians:
L_tmc = Σ_{i=1}^{n} max( ||X_i − c_{y_i}||₂² + α − min_{1≤j≤m, j≠y_i} ||X_i − c_{y_j}||₂², 0 )   (6)
where,
X_i ∈ R^d denotes the feature of pedestrian box i, belonging to the class with identity label y_i;
c_{y_i} denotes the center feature of the class with identity label y_i;
c_{y_j} denotes the center feature of the class with identity label y_j;
m denotes the number of pedestrian identity classes;
n denotes the number of labeled samples and α the margin.
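The loss in formula (6) can be sketched in numpy as below. Treat it as an illustrative reading of the claim (a hinge with margin α between the squared distance to a sample's own class center and the nearest other class center), not the patent's exact formulation.

```python
import numpy as np

def triplet_margin_center_loss(X, labels, centers, alpha=1.0):
    """X: n x d pedestrian-box features; labels: n identity labels in [0, m);
    centers: m x d class center features c_y."""
    total = 0.0
    for x, y in zip(X, labels):
        d_pos = np.sum((x - centers[y]) ** 2)        # pull towards own center (intra-class)
        d_neg = min(np.sum((x - c) ** 2)             # nearest wrong center (inter-class)
                    for j, c in enumerate(centers) if j != y)
        total += max(d_pos + alpha - d_neg, 0.0)     # hinge with margin alpha
    return total / len(X)
```

The hinge is zero whenever a feature is at least α closer (in squared distance) to its own center than to any other center, which is exactly the reduced-intra-class, enlarged-inter-class behavior the claim describes.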
5. The deep complementary-classifier pedestrian search method with triplet margin center loss according to claim 1, characterized in that step 5 is specifically as follows:
construct the test samples for pedestrian search, and feed them into the trained deep complementary-classifier pedestrian search network with triplet margin center loss; perform pedestrian detection on the input test scene sample image; feed the detected candidate pedestrian box positions into the pedestrian re-identification network to obtain their 256-dimensional pedestrian features; then input the target pedestrian image to likewise obtain its 256-dimensional pedestrian feature; use cosine similarity to match it against the pedestrian-box features and rank out the most probable identity label, which is the result of the identity retrieval.
CN201910542675.1A 2019-06-21 2019-06-21 A kind of depth complementation classifier pedestrian's searching method of triple edge center loss Pending CN110414336A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910542675.1A CN110414336A (en) 2019-06-21 2019-06-21 A kind of depth complementation classifier pedestrian's searching method of triple edge center loss


Publications (1)

Publication Number Publication Date
CN110414336A true CN110414336A (en) 2019-11-05

Family

ID=68359564

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910542675.1A Pending CN110414336A (en) 2019-06-21 2019-06-21 A kind of depth complementation classifier pedestrian's searching method of triple edge center loss

Country Status (1)

Country Link
CN (1) CN110414336A (en)


Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109063776A (en) * 2018-08-07 2018-12-21 北京旷视科技有限公司 Image identifies network training method, device and image recognition methods and device again again
CN109146921A (en) * 2018-07-02 2019-01-04 华中科技大学 A kind of pedestrian target tracking based on deep learning


Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
TONG XIAO等: "End-to-End Deep Learning for Person Search", 《ARXIV》 *
XIAOLIN ZHANG et al.: "Adversarial Complementary Learning for Weakly Supervised Object Localization", 《COMPUTER VISION FOUNDATION》 *
HE QIONGHUA: "Research and Implementation of a Face Recognition Algorithm Based on Triplet-aware Center Loss", 《China Master's Theses Full-text Database》 *
ZHAO WENXUAN: "Person Re-identification under Intelligent Surveillance", 《China Master's Theses Full-text Database》 *

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111027397A (en) * 2019-11-14 2020-04-17 上海交通大学 Method, system, medium and device for detecting comprehensive characteristic target in intelligent monitoring network
CN111027397B (en) * 2019-11-14 2023-05-12 上海交通大学 Comprehensive feature target detection method, system, medium and equipment suitable for intelligent monitoring network
CN111062479A (en) * 2019-12-19 2020-04-24 北京迈格威科技有限公司 Model rapid upgrading method and device based on neural network
CN111062479B (en) * 2019-12-19 2024-01-23 北京迈格威科技有限公司 Neural network-based rapid model upgrading method and device
CN111340700A (en) * 2020-02-21 2020-06-26 北京中科虹霸科技有限公司 Model generation method, resolution improvement method, image identification method and device
CN111340700B (en) * 2020-02-21 2023-04-25 北京中科虹霸科技有限公司 Model generation method, resolution improvement method, image recognition method and device
CN113723188A (en) * 2021-07-28 2021-11-30 国网浙江省电力有限公司电力科学研究院 Dress uniform person identity verification method combining face and gait features

Similar Documents

Publication Publication Date Title
CN110956185B (en) Method for detecting image salient object
Jia et al. Detection and segmentation of overlapped fruits based on optimized mask R-CNN application in apple harvesting robot
Zheng et al. Cross-regional oil palm tree counting and detection via a multi-level attention domain adaptation network
Zhang et al. Multi-task cascaded convolutional networks based intelligent fruit detection for designing automated robot
CN110414336A (en) A kind of depth complementation classifier pedestrian's searching method of triple edge center loss
CN105512640B (en) A kind of people flow rate statistical method based on video sequence
CN105844292B (en) A kind of image scene mask method based on condition random field and secondary dictionary learning
CN108830188A (en) Vehicle checking method based on deep learning
CN109948425A (en) A kind of perception of structure is from paying attention to and online example polymerize matched pedestrian's searching method and device
CN109614985A (en) A kind of object detection method based on intensive connection features pyramid network
CN105389562B (en) A kind of double optimization method of the monitor video pedestrian weight recognition result of space-time restriction
CN107346420A (en) Text detection localization method under a kind of natural scene based on deep learning
CN109871875B (en) Building change detection method based on deep learning
Liu et al. Super-pixel cloud detection using hierarchical fusion CNN
CN112488229B (en) Domain self-adaptive unsupervised target detection method based on feature separation and alignment
CN110246141A (en) It is a kind of based on joint angle point pond vehicles in complex traffic scene under vehicle image partition method
CN109344842A (en) A kind of pedestrian's recognition methods again based on semantic region expression
Song et al. A hierarchical object detection method in large-scale optical remote sensing satellite imagery using saliency detection and CNN
Zhang et al. Guided attention in cnns for occluded pedestrian detection and re-identification
CN110956158A (en) Pedestrian shielding re-identification method based on teacher and student learning frame
CN109033944A (en) A kind of all-sky aurora image classification and crucial partial structurtes localization method and system
CN106874825A (en) The training method of Face datection, detection method and device
CN107463954A (en) A kind of template matches recognition methods for obscuring different spectrogram picture
CN110533100A (en) A method of CME detection and tracking is carried out based on machine learning
Ge et al. Coarse-to-fine foraminifera image segmentation through 3D and deep features

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication
Application publication date: 20191105