CN108875455A - Unsupervised intelligent and precise face recognition method and system

Info

Publication number
CN108875455A
Authority
CN
China
Prior art keywords
feature vector
label
euclidean distance
face
picture
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201710332276.3A
Other languages
Chinese (zh)
Other versions
CN108875455B (en)
Inventor
蒋佳
朱林楠
占宏锋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
TCL Corp
Original Assignee
TCL Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by TCL Corp
Priority to CN201710332276.3A
Publication of CN108875455A
Application granted
Publication of CN108875455B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/172 Classification, e.g. identification
    • G06V40/173 Classification, e.g. identification; face re-identification, e.g. recognising unknown faces across different face tracks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/232 Non-hierarchical techniques
    • G06F18/2321 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The present invention discloses an unsupervised intelligent and precise face recognition method and system. The method comprises the steps of: performing preliminary classification on extracted face picture feature vectors using a density clustering algorithm; performing model parameter learning on the training set using a logistic regression algorithm to obtain initial logistic regression parameters; performing logistic regression prediction on the face picture feature vectors in the test set according to the initial logistic regression parameters, and applying probability normalization to the resulting predicted values to obtain the corresponding probability value; and, when the probability value is greater than a preset confidence threshold, assigning the face picture feature vector currently undergoing logistic regression prediction to the corresponding label and forming a new training set. The present invention ensures the accuracy of face picture recognition while strengthening the clustering capability. Because it is based on an unsupervised classification algorithm, it also avoids manual labeling, saves manpower and material resources, and improves the processing speed of face picture recognition.

Description

An unsupervised intelligent and precise face recognition method and system
Technical field
The present invention relates to the field of face recognition, and more particularly to an unsupervised intelligent and precise face recognition method and system.
Background
With the rapid development in recent years of fields such as video storage and big data analysis, face recognition is required not only to detect faces in photos, but also to accurately pick out the photos of the same person from multiple pictures, and to be applicable to various smart devices.
The face recognition method commonly used today is to label faces manually and store them in a database. When a camera device captures a face again, the newly taken photo is compared with the photos stored in the database, the photo with the highest similarity is found in the database, and the label of that photo is used as the label of the new photo.
In practical applications, one often encounters the unsupervised situation in which the test samples (i.e. face images) have only feature vectors and no specified labels. Classification algorithms for this unsupervised setting are gradually attracting the attention of the industry. However, current unsupervised classification algorithms are still immature: their accuracy is not high enough, or their recall is not high, or they are very slow, or they cannot adapt to feature vectors of arbitrary distribution. Therefore, how to cluster feature vectors quickly and accurately and assign correct labels to them has become a problem that the prior art urgently needs to solve.
Therefore, the prior art still needs to be improved and developed.
Summary of the invention
In view of the above deficiencies of the prior art, the object of the present invention is to provide an unsupervised intelligent and precise face recognition method and system, intended to solve the problems of poor recognition accuracy and slow processing speed in existing unsupervised face recognition methods.
The technical solution of the present invention is as follows:
An unsupervised intelligent and precise face recognition method, comprising the steps of:
A. performing preliminary classification on the extracted face picture feature vectors using a density clustering algorithm, forming the classified face picture feature vectors into a training set and assigning them corresponding labels, and forming the unclassified face picture feature vectors into a test set;
B. performing model parameter learning on the training set using a logistic regression algorithm to obtain initial logistic regression parameters;
C. performing logistic regression prediction on any face picture feature vector in the test set according to the initial logistic regression parameters, and applying probability normalization to the resulting predicted values to obtain the corresponding probability value;
D. when the probability value is greater than a preset confidence threshold, assigning the face picture feature vector currently undergoing logistic regression prediction to the corresponding label, and forming a new training set together with the original training set;
E. judging whether all the extracted face picture feature vectors have been assigned to corresponding labels; if so, ending the assignment; if not, returning to step B for iteration until all the extracted face picture feature vectors have been assigned corresponding labels.
Preferably, in the unsupervised intelligent and precise face recognition method, step D further comprises:
D1. when the probability value is less than the preset confidence threshold, judging the face picture feature vector currently undergoing logistic regression prediction to be an outlier.
Preferably, in the unsupervised intelligent and precise face recognition method, after step E the face pictures that have been assigned labels form a database, and when a new face picture is added to the database, the method further comprises the following steps after step E:
F1. taking out in advance, in turn, N feature vectors of the face pictures under each label in the database;
F2. calculating the Euclidean distances between the feature vector of the new face picture and the N feature vectors under the current label, and counting the number of Euclidean distances that are less than the Euclidean distance threshold;
F3. if the number of feature vectors whose Euclidean distance is less than the Euclidean distance threshold is greater than or equal to the minimum neighbor count threshold, assigning the new picture to the current label.
Preferably, in the unsupervised intelligent and precise face recognition method, the following is further included after step F2:
F4. if the number of feature vectors whose Euclidean distance is less than the Euclidean distance threshold is greater than the second-level minimum neighbor count threshold and less than the minimum neighbor count threshold, recording the current label and proceeding to step F41;
F41. performing logistic regression on the feature vectors of the labeled face pictures in the database to obtain the current logistic regression parameters, performing logistic regression prediction on the feature vector of the new picture according to the current logistic regression parameters, and applying probability normalization to the resulting predicted values to obtain the corresponding probability value; when the probability value is greater than the preset confidence threshold, assigning the feature vector of the new picture to the current label.
Preferably, in the unsupervised intelligent and precise face recognition method, the following is further included after step F2:
F5. if the number of feature vectors whose Euclidean distance is less than the Euclidean distance threshold is less than the second-level minimum neighbor count threshold, discarding the current label and proceeding to step F51;
F51. traversing the Euclidean distances between the feature vector of the new face picture and the subsets of N feature vectors of each of the remaining labels, and counting the number of Euclidean distances that are less than the Euclidean distance threshold; if every such count is less than the second-level minimum neighbor count threshold, discarding all labels;
F52. performing cluster analysis, using the density clustering algorithm, on the feature vector of the new picture together with the feature vectors of all outliers in the database, and if the clustering requirement is met, assigning a new label to the new picture.
An unsupervised intelligent and precise face recognition system, comprising:
a preliminary classification module, configured to perform preliminary classification on the extracted face picture feature vectors using a density clustering algorithm, assign the classified face picture feature vectors to corresponding labels and form them into a training set, and form the unclassified face picture feature vectors into a test set;
a model building module, configured to perform model parameter learning on the training set using a logistic regression algorithm to obtain initial logistic regression parameters;
a prediction processing module, configured to perform logistic regression prediction on any face picture feature vector in the test set according to the initial logistic regression parameters, and to apply probability normalization to the resulting predicted values to obtain the corresponding probability value;
a label assignment module, configured to, when the probability value is greater than a preset confidence threshold, assign the face picture feature vector currently undergoing logistic regression prediction to the corresponding label and form a new training set together with the original training set;
a judgment module, configured to judge whether all the extracted face picture feature vectors have been assigned to corresponding labels; if so, the assignment ends; if not, control returns to the model building module for iteration until all the extracted face picture feature vectors have been assigned corresponding labels.
Preferably, the unsupervised intelligent and precise face recognition system further comprises:
a determination unit, configured to, when the probability value is less than the preset confidence threshold, judge the face picture feature vector currently undergoing logistic regression prediction to be an outlier.
Preferably, the unsupervised intelligent and precise face recognition system further comprises:
a feature vector selection module, configured to take out in advance, in turn, N feature vectors of the face pictures under each label in the database;
a statistics calculation module, configured to calculate the Euclidean distances between the feature vector of the new face picture and the N feature vectors under the current label, and to count the number of Euclidean distances that are less than the Euclidean distance threshold;
a new picture assignment module, configured to assign the new picture to the current label when the number of feature vectors whose Euclidean distance is less than the Euclidean distance threshold is greater than or equal to the minimum neighbor count threshold.
Preferably, the unsupervised intelligent and precise face recognition system further comprises:
a recording module, configured to record the current label and hand over to the secondary assignment module when the number of feature vectors whose Euclidean distance is less than the Euclidean distance threshold is greater than the second-level minimum neighbor count threshold and less than the minimum neighbor count threshold;
a secondary assignment module, configured to perform logistic regression on the feature vectors of the labeled face pictures in the database to obtain the current logistic regression parameters, perform logistic regression prediction on the feature vector of the new picture according to the current logistic regression parameters, and apply probability normalization to the resulting predicted values to obtain the corresponding probability value; when the probability value is greater than the preset confidence threshold, the feature vector of the new picture is assigned to the current label.
Preferably, the unsupervised intelligent and precise face recognition system further comprises:
a discarding module, configured to discard the current label and hand over to the traversal module when the number of feature vectors whose Euclidean distance is less than the Euclidean distance threshold is less than the second-level minimum neighbor count threshold;
a traversal module, configured to traverse the Euclidean distances between the feature vector of the new face picture and the subsets of N feature vectors of each of the remaining labels, and to count the number of Euclidean distances that are less than the Euclidean distance threshold; if every such count is less than the second-level minimum neighbor count threshold, all labels are discarded;
a clustering module, configured to perform cluster analysis, using the density clustering algorithm, on the feature vector of the new picture together with the feature vectors of all outliers in the database, and to assign a new label to the new picture if the clustering requirement is met.
Beneficial effects: The present invention first uses density clustering to find face pictures with strong similarity, then uses logistic regression to learn the features of these highly similar pictures, and then applies the learned parameters to the other pictures of lower similarity, verifying the credibility of each match by its confidence. This ensures the accuracy of face picture recognition while strengthening the clustering capability. Because the present invention is based on an unsupervised classification algorithm, it avoids manual labeling and saves manpower and material resources, thereby improving the processing speed of face picture recognition.
Brief description of the drawings
Fig. 1 is a flow chart of a first preferred embodiment of the unsupervised intelligent and precise face recognition method of the present invention.
Fig. 2 is a flow chart of a second preferred embodiment of the unsupervised intelligent and precise face recognition method of the present invention.
Fig. 3 is a structural block diagram of a preferred embodiment of the unsupervised intelligent and precise face recognition system of the present invention.
Detailed description of the embodiments
The present invention provides an unsupervised intelligent and precise face recognition method and system. To make the purpose, technical solution and effects of the present invention clearer and more definite, the present invention is described in more detail below. It should be understood that the specific embodiments described here are only used to explain the present invention and are not intended to limit it.
Referring to Fig. 1, Fig. 1 is a flow chart of a preferred embodiment of the unsupervised intelligent and precise face recognition method provided by the present invention. As shown in the figure, it comprises the steps of:
S100. performing preliminary classification on the extracted face picture feature vectors using a density clustering algorithm, assigning corresponding labels to the classified face picture feature vectors and forming them into a training set, and forming the unclassified face picture feature vectors into a test set;
S200. performing model parameter learning on the training set using a logistic regression algorithm to obtain initial logistic regression parameters;
S300. performing logistic regression prediction on any face picture feature vector in the test set according to the initial logistic regression parameters, and applying probability normalization to the resulting predicted values to obtain the corresponding probability value;
S400. when the probability value is greater than a preset confidence threshold, assigning the face picture feature vector currently undergoing logistic regression prediction to the corresponding label, and forming a new training set together with the original training set;
S500. judging whether all the extracted face picture feature vectors have been assigned to corresponding labels; if so, ending the assignment; if not, returning to step S200 for iteration until all the extracted face picture feature vectors have been assigned corresponding labels. The face pictures that have been assigned labels form a database.
Specifically, for step S100, the present invention first uses a density clustering algorithm to perform preliminary classification on the picture feature vectors extracted by a deep neural network. The density clustering algorithm may be DBSCAN or HDBSCAN; the present invention prefers DBSCAN. The DBSCAN algorithm needs two global parameters: the Euclidean distance threshold (Eps) and the minimum neighbor count threshold (Mpoint). According to these two parameters, the successfully classified picture feature vectors form the training set and are assigned corresponding labels, while the picture feature vectors that could not be classified form the test set, and each picture feature vector in the test set is provisionally treated as an outlier.
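The split performed in step S100 can be illustrated with a minimal sketch. It is a plain textbook DBSCAN written with the Python standard library only, not the patented implementation; the function name dbscan_split and the arguments eps and min_pts (standing in for Eps and Mpoint) are assumptions made for this example, as is the choice to count a point as its own neighbor.

```python
import math

def dbscan_split(vectors, eps, min_pts):
    """Cluster with a plain DBSCAN; return (labels, train_idx, test_idx).

    labels[i] is a cluster id >= 0, or -1 for a vector left unclassified.
    eps and min_pts are hypothetical names for Eps and Mpoint.
    """
    n = len(vectors)
    # Assumption: a point counts as its own neighbour, as in the usual definition.
    neighbours = [
        [j for j in range(n) if math.dist(vectors[i], vectors[j]) <= eps]
        for i in range(n)
    ]
    labels = [None] * n  # None means "not visited yet"
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbours[i]) < min_pts:
            labels[i] = -1  # provisional outlier; a cluster may still claim it
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbours[i])
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbours[j]) >= min_pts:
                queue.extend(neighbours[j])  # j is a core point: keep expanding
    train_idx = [i for i, lab in enumerate(labels) if lab >= 0]  # classified
    test_idx = [i for i, lab in enumerate(labels) if lab == -1]  # provisional outliers
    return labels, train_idx, test_idx
```

With two tight groups of vectors and one distant vector, the two groups receive labels and form the training set, while the distant vector goes to the test set.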
For step S200, the present invention applies a logistic regression algorithm with a regularization constraint to the training set to obtain the initial logistic regression parameters {W, b}. The specific method for obtaining {W, b} is as follows:
Loop for i = 1:D
    For t = 1:K
        correct(t) = -log(CRate(t, Y(t)))
    End
    (where t is the number of the sample and Y(t) is the actual label number of the sample)
    regloss = 0.5 * reg * W^T * W
    loss = dataloss + regloss
    dscore(t, Y(t)) = CRate(t, Y(t)) - 1
    dscore = dscore / M
    dW = X^T * dscore
    db = sum(dscore, 1)
    dW = dW + reg * W
    W = W - step_size * dW
    b = b - step_size * db
    If i = D, end loop
After the D iterations above, {W, b} is obtained, and logistic regression prediction is performed on each picture feature vector in the test set according to this {W, b}. In the iterative process, M is the number of samples, N is the feature dimension, K is the number of clusters, X is the input picture samples, D is the number of iterations, reg is the regularization parameter, W is a matrix, b is a scalar, and step_size is the iteration step coefficient, which defaults to 1.
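The training loop above can be sketched in runnable form as follows. This is an illustration under stated assumptions rather than the patented code: the score computation X*W + b, the softmax definition of CRate, and the definition of dataloss as the mean of correct(t) do not appear in the listing and are filled in here with the standard softmax regression forms. The function names and the default value of reg are hypothetical, and b is kept as one bias per label rather than a single scalar.

```python
import math

def softmax(scores):
    """Normalise one row of scores into probabilities that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def train_softmax(X, Y, K, D=200, reg=1e-3, step_size=1.0):
    """Gradient descent on regularised softmax regression; returns (W, b, loss).

    loss is the value computed at the start of the last iteration.
    """
    M, N = len(X), len(X[0])
    W = [[0.0] * K for _ in range(N)]  # N x K weight matrix
    b = [0.0] * K                      # assumption: one bias per label
    loss = None
    for _ in range(D):
        # Forward pass: CRate[t][k] is the confidence of sample t for label k.
        CRate = [
            softmax([sum(X[t][n] * W[n][k] for n in range(N)) + b[k]
                     for k in range(K)])
            for t in range(M)
        ]
        dataloss = sum(-math.log(CRate[t][Y[t]]) for t in range(M)) / M
        regloss = 0.5 * reg * sum(w * w for row in W for w in row)
        loss = dataloss + regloss
        # Backward pass: dscore = (CRate - one_hot(Y)) / M.
        dscore = [[(CRate[t][k] - (1.0 if k == Y[t] else 0.0)) / M
                   for k in range(K)] for t in range(M)]
        dW = [[sum(X[t][n] * dscore[t][k] for t in range(M)) + reg * W[n][k]
               for k in range(K)] for n in range(N)]
        db = [sum(dscore[t][k] for t in range(M)) for k in range(K)]
        W = [[W[n][k] - step_size * dW[n][k] for k in range(K)] for n in range(N)]
        b = [b[k] - step_size * db[k] for k in range(K)]
    return W, b, loss

def confidence(x, W, b):
    """Confidence of one feature vector for each label under {W, b}."""
    return softmax([sum(x[n] * W[n][k] for n in range(len(x))) + b[k]
                    for k in range(len(b))])
```

The default step size of 1 follows the text. It suits small-magnitude features such as normalized embeddings, and larger feature values would likely need a smaller step.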
Further, logistic regression prediction is performed on the face picture feature vectors in the test set according to the initial logistic regression parameters {W, b}, and the softmax algorithm is used to apply probability normalization to the resulting predicted values, producing a probability value between 0 and 1. This probability value is in fact the confidence of the label currently being assigned: if the confidence of the label is greater than the confidence threshold, the label assigned to the picture feature vector in the current test set is considered trustworthy. Conversely, when the probability value is less than the preset confidence threshold, the face picture feature vector currently undergoing logistic regression prediction is judged to be an outlier.
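The confidence check just described can be sketched as a small helper. The name assign_label and the argument cthr (the confidence threshold, CThr) are assumptions for this example; returning None stands for "treat as an outlier".

```python
import math

def assign_label(scores, cthr):
    """Return (label, confidence); label is None when confidence is not above cthr."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]  # shift by the max for numerical stability
    total = sum(exps)
    probs = [e / total for e in exps]         # softmax: each value lies between 0 and 1
    conf = max(probs)
    label = probs.index(conf)
    return (label, conf) if conf > cthr else (None, conf)
```

A score vector that clearly favors one label passes the threshold, while nearly equal scores leave the sample unassigned.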
A picture feature vector and label that are considered trustworthy are added to the logistic regression training set in the next iteration and used to update the logistic regression coefficients {W, b}. The updated coefficients are then used to perform logistic regression prediction on the remaining test set again, and the iteration stops once no new outliers are detected.
The specific iterative process is as follows:
While 1
    If Max{CRate_NEW(i)}, i ∈ {1..K}, > CThr
        put the sample into the training set
    Else
        keep the sample in the test set
    End
    If no sample was put into the training set in this iteration
        Break
    End
End while
Here NEW denotes the samples to be predicted. NEW is a Q×N matrix, where Q is the number of picture feature vectors in the test set, i.e. the number of pictures, and N is the dimension of the feature vector.
From the above iterative process it can be seen that if a NEW sample satisfies Max{CRate_NEW(i)} > CThr for i ∈ {1..K}, the NEW sample (the sample to be predicted) is put into the logistic regression training set and {W, b} is recalculated. The updated {W, b} is used to recalculate the confidence CRate_NEW(i) on the test set, and the condition Max{CRate_NEW(i)} > CThr is checked again; if it is satisfied, the sample is put into the training set and {W, b} continues to be updated; if not, the program iteration ends. In other words, logistic regression is performed on the new training set to obtain the updated logistic regression parameters for the next iteration, logistic regression prediction is performed on the remaining face picture feature vectors in the test set according to the updated parameters, and the remaining face picture feature vectors are assigned to corresponding labels. It is then judged whether all the extracted face picture feature vectors have been assigned to corresponding labels; if so, the assignment ends; if not, the method returns to step S200 and loops through steps S200-S500 until every face picture feature vector in the test set has undergone logistic regression prediction, thereby achieving precise classification of all the extracted face picture feature vectors. The face pictures that have been assigned corresponding labels form a database.
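The iteration described above amounts to a self-training loop, which can be sketched independently of the classifier. In this sketch the function self_train and its parameters fit, predict_proba and cthr are hypothetical names: fit stands for any routine that recomputes {W, b} from the current training set, and predict_proba for any routine that returns the normalized confidences of one vector.

```python
def self_train(train, test, fit, predict_proba, cthr):
    """Iteratively move confidently predicted samples from test to train.

    train: list of (vector, label) pairs; test: list of vectors.
    fit(train) -> model; predict_proba(model, vector) -> list of probabilities.
    Returns (train, outliers), where outliers are the vectors never accepted.
    """
    train, test = list(train), list(test)
    while test:
        model = fit(train)            # recompute the parameters on the current training set
        remaining, moved = [], 0
        for x in test:
            probs = predict_proba(model, x)
            conf = max(probs)
            if conf > cthr:
                train.append((x, probs.index(conf)))  # accept the most confident label
                moved += 1
            else:
                remaining.append(x)   # stays in the test set for now
        test = remaining
        if moved == 0:                # nothing accepted in this round: stop iterating
            break
    return train, test
```

Passing the classifier in as two callables keeps the loop itself short and lets the same loop be reused with any model that outputs normalized confidences. Vectors still in the returned test list when the loop stops correspond to the outliers of the text.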
By combining the density clustering algorithm with the logistic regression algorithm, the present invention overcomes the drawback that DBSCAN clustering produces a large test set of samples without labels, while accommodating the fact that logistic regression requires labeled training data. In addition, introducing a regularization term into the logistic regression prevents overfitting, and the confidence is calculated by the softmax algorithm, so that adjusting the confidence threshold reduces the probability of misclassification.
Further, the following step is included before step S100: calculating the feature vectors of the face pictures to be recognized.
Referring also to Fig. 2, further, after the labeling of the face pictures has been completed by the above unsupervised intelligent and precise face recognition method, and when a new face picture to be recognized is added to the labeled face picture database, the unsupervised intelligent and precise face recognition method of the present invention further comprises the following steps:
S510. taking out, in turn, N feature vectors of the face pictures under each label in the database;
S520. calculating the Euclidean distances between the feature vector of the new face picture and the subset of N feature vectors under the current label, and counting the number of Euclidean distances that are less than the Euclidean distance threshold;
S530. if the number of Euclidean distances that are less than the Euclidean distance threshold is greater than or equal to the minimum neighbor count threshold, assigning the new picture to the current label.
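Steps S510 to S530 can be illustrated with a short sketch. The names count_close, match_label and samples_by_label (a mapping from each label to its N sampled feature vectors) are assumptions for this example, and returning the first qualifying label is one possible reading of the label-by-label procedure.

```python
import math

def count_close(new_vec, label_vecs, eps):
    """Number of stored vectors whose Euclidean distance to new_vec is below eps."""
    return sum(1 for v in label_vecs if math.dist(new_vec, v) < eps)

def match_label(new_vec, samples_by_label, eps, min_pts):
    """Return the first label with at least min_pts close samples, else None."""
    for label, vecs in samples_by_label.items():
        if count_close(new_vec, vecs, eps) >= min_pts:
            return label
    return None
```

Comparing against only N sampled vectors per label, rather than every stored picture, bounds the cost of adding one new picture by N times the number of labels.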
Further, in the present invention, the following is further included after step S520:
S540. if the number of Euclidean distances that are less than the Euclidean distance threshold is greater than the second-level minimum neighbor count threshold and less than the minimum neighbor count threshold, recording the current label and proceeding to step S541, wherein the second-level minimum neighbor count threshold is less than the minimum neighbor count threshold;
S541. performing logistic regression on the feature vectors of the labeled face pictures in the database to obtain the current logistic regression parameters, performing logistic regression prediction on the feature vector of the new picture according to the current logistic regression parameters, and applying probability normalization to the resulting predicted values to obtain the corresponding probability value; when the probability value is greater than the preset confidence threshold, assigning the feature vector of the new picture to the current label.
Further, in the present invention, the following is further included after step S520:
S550. if the number of Euclidean distances that are less than the Euclidean distance threshold is less than the second-level minimum neighbor count threshold, discarding the current label and proceeding to step S551;
S551. traversing the Euclidean distances between the feature vector of the new face picture and the subsets of N feature vectors of each of the remaining labels, and counting the number of Euclidean distances that are less than the Euclidean distance threshold; if every such count is less than the second-level minimum neighbor count threshold, discarding all labels;
S552. performing cluster analysis, using the density clustering algorithm, on the feature vector of the new picture together with the feature vectors of all outliers in the database, and if the threshold requirement is met, assigning a new label to the new picture.
Specifically, there are three possible relationships between a new picture and the existing pictures in the database:
1. The new picture is very similar to existing pictures in the database. In this case the label of the existing pictures can be assigned directly to the new picture.
2. The new picture has a certain similarity to existing pictures in the database. In this case the logistic regression parameters obtained by training on the existing pictures are used to check the new picture. If the confidence is higher than the threshold CThr, the parameters of the logistic regression model are updated and the label with the highest confidence is assigned to the new picture; if no confidence exceeds the threshold CThr, the new picture is placed in the outlier subset.
3. The new picture has very low similarity to the existing pictures in the database. In this case the outliers are looked up in the database of existing pictures, and the new picture and the outlier subset are clustered together using DBSCAN (or HDBSCAN). If the threshold requirement is reached, a new label is considered to have been found: a new label is generated and the newly generated cluster subset is marked with it. Finally, the new subset is added to the logistic regression training set and the logistic regression parameters are updated. If the threshold requirement of the clustering is not reached, the new picture is put into the outlier subset.
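The three cases above can be sketched as a routing decision over the per-label neighbor counts. The function route_new_picture and its arguments are hypothetical names. The text does not say what happens when a count exactly equals the second-level threshold, so treating that case as a discard is an assumption of this sketch, as is returning on the first label that meets the main threshold.

```python
def route_new_picture(close_counts, min_pts, min_pts2):
    """Decide how to handle a new picture from its per-label close-neighbour counts.

    close_counts maps label -> number of sampled vectors within the distance threshold.
    Returns ("assign", label), ("verify", candidate_labels) or ("recluster", None).
    """
    if min_pts2 >= min_pts:
        raise ValueError("the second-level threshold must be below min_pts")
    candidates = []
    for label, count in close_counts.items():
        if count >= min_pts:
            return ("assign", label)   # case 1: very similar to an existing label
        if count > min_pts2:
            candidates.append(label)   # case 2: recorded for the logistic regression check
        # Otherwise the label is discarded (assumption: this includes count == min_pts2).
    if candidates:
        return ("verify", candidates)
    return ("recluster", None)         # case 3: cluster together with the stored outliers
```

The "verify" result corresponds to the confidence check against CThr, and the "recluster" result to running the density clustering on the new picture together with the outlier subset.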
Based on the above method, the present invention also provides an unsupervised intelligent accurate face recognition system which, as shown in Fig. 3, comprises:
A preliminary classification module 100, configured to perform preliminary classification of the extracted face picture feature vectors using a density clustering algorithm, assign the classified face picture feature vectors to corresponding labels to form a training set, and form a test set from the unclassified face picture feature vectors;
A model building module 200, configured to learn model parameters from the training set using a logistic regression algorithm, obtaining initial logistic regression parameters;
A prediction processing module 300, configured to perform logistic regression prediction on any face picture feature vector in the test set according to the initial logistic regression parameters, and to apply probability normalization to the resulting predicted value to calculate the corresponding probability value;
A label distribution module 400, configured to, when the probability value is greater than a preset confidence threshold, assign the face picture feature vector currently undergoing logistic regression prediction to the corresponding label, and combine it with the original training set to form a new training set;
A judgment module 500, configured to judge whether all extracted face picture feature vectors have been assigned to corresponding labels; if so, the assignment ends; if not, control returns to the model building module for iteration until all extracted face picture feature vectors have been assigned corresponding labels.
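The interplay of modules 200 to 500 can be sketched as an iterative pseudo-labeling loop. The sketch below is illustrative only and rests on several assumptions: the classifier is passed in as two hypothetical callables (`fit`, `predict_proba`), the demonstration classifier is a nearest-centroid softmax stand-in rather than logistic regression, and the loop stops when a pass assigns nothing new, treating the remaining vectors as outliers. The source does not spell out this stopping rule.

```python
import math

def self_training_loop(train_set, test_set, fit, predict_proba, conf_threshold):
    """train_set: list of (vector, label); test_set: list of vectors.
    fit(train_set) -> model; predict_proba(model, vec) -> {label: prob}.
    Returns (final training set, vectors left unassigned)."""
    train_set = list(train_set)
    pending = list(test_set)
    while pending:
        model = fit(train_set)                    # module 200: (re)learn parameters
        still_pending = []
        for vec in pending:                       # module 300: predict + normalize
            probs = predict_proba(model, vec)
            label = max(probs, key=probs.get)
            if probs[label] > conf_threshold:     # module 400: confident, so assign
                train_set.append((vec, label))
            else:
                still_pending.append(vec)
        if len(still_pending) == len(pending):    # assumed stop rule: no progress
            break
        pending = still_pending                   # module 500: iterate again
    return train_set, pending

# Stand-in classifier (NOT logistic regression): softmax over negative
# distances to per-label centroids, used only to make the sketch runnable.
def fit_centroids(train_set):
    sums, counts = {}, {}
    for vec, label in train_set:
        acc = sums.setdefault(label, [0.0] * len(vec))
        for i, x in enumerate(vec):
            acc[i] += x
        counts[label] = counts.get(label, 0) + 1
    return {lab: [s / counts[lab] for s in acc] for lab, acc in sums.items()}

def centroid_proba(model, vec):
    scores = {lab: -math.dist(vec, c) for lab, c in model.items()}
    top = max(scores.values())
    exps = {lab: math.exp(s - top) for lab, s in scores.items()}
    total = sum(exps.values())
    return {lab: e / total for lab, e in exps.items()}
```

In practice the two callables would wrap a real logistic regression model; keeping them as parameters lets the loop itself stay independent of the classifier.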
Further, in the present invention, the label distribution module 400 further comprises:
A judging unit, configured to determine that the face picture feature vector currently undergoing logistic regression prediction is an outlier when the probability value is less than the preset confidence threshold.
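The confidence check performed by the label distribution module and its judging unit might look like the following sketch. The source only speaks of "probability normalization"; softmax is assumed here, and the names `softmax` and `label_or_outlier` are hypothetical. A probability exactly equal to the threshold is treated as an outlier, which the source leaves unspecified.

```python
import math

def softmax(scores):
    """Normalize raw per-label scores into probabilities (assumed normalization)."""
    top = max(scores)                      # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def label_or_outlier(scores, labels, conf_threshold):
    """Return (label, probability) when the best probability exceeds the
    confidence threshold, otherwise (None, probability) to mark an outlier."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    if probs[best] > conf_threshold:
        return labels[best], probs[best]
    return None, probs[best]
```

For example, with three labels and a threshold of 0.9, scores of `[4.0, 0.0, 0.0]` give the first label a probability of roughly 0.96 and it is assigned, whereas three equal scores give each label one third and the vector is marked as an outlier.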
Further, in the present invention, the system further comprises:
A feature vector selection module, configured to take out in advance, label by label, N feature vectors of the face pictures under each label in the database;
A statistics calculation module, configured to, when a face picture without an assigned label is newly added, calculate the Euclidean distances between the feature vector of the new face picture and the N feature vectors under the current label, and count the number of Euclidean distances that are less than the Euclidean distance threshold;
A new picture distribution module, configured to assign the new picture to the current label when the number of Euclidean distances less than the Euclidean distance threshold is greater than or equal to the minimum neighbor count threshold.
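A minimal sketch of the neighbor counting described above, assuming feature vectors are plain numeric tuples and that N sample vectors per label have already been taken out. The names `count_close_neighbors`, `assign_if_enough_neighbors`, `dist_threshold`, and `min_neighbors` are illustrative and do not come from the source.

```python
import math

def count_close_neighbors(new_vec, label_vecs, dist_threshold):
    """Count the label's sampled vectors whose Euclidean distance to
    new_vec is less than dist_threshold."""
    return sum(1 for v in label_vecs if math.dist(new_vec, v) < dist_threshold)

def assign_if_enough_neighbors(new_vec, samples_by_label, dist_threshold, min_neighbors):
    """Return the first label whose close-neighbor count reaches
    min_neighbors, or None when no label qualifies."""
    for label, vecs in samples_by_label.items():
        if count_close_neighbors(new_vec, vecs, dist_threshold) >= min_neighbors:
            return label
    return None
```

Because only N sampled vectors per label are compared, the cost per new picture grows with the number of labels rather than with the total number of stored pictures.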
Further, in the present invention, the system further comprises:
A recording module, configured to record the current label when the number of Euclidean distances less than the Euclidean distance threshold is greater than the second-level minimum neighbor count threshold and less than the minimum neighbor count threshold, and then to hand over to the secondary distribution module;
A secondary distribution module, configured to perform logistic regression on the feature vectors of the labeled face pictures in the database to obtain the current logistic regression parameters, to perform logistic regression prediction on the feature vector of the new picture according to the current logistic regression parameters, and to apply probability normalization to the resulting predicted value to calculate the corresponding probability value; when the probability value is greater than the preset confidence threshold, the feature vector of the new picture is assigned to the current label.
Further, in the present invention, the system further comprises:
A discarding module, configured to discard the current label when the number of Euclidean distances less than the Euclidean distance threshold is less than the second-level minimum neighbor count threshold, and then to hand over to the traversal module;
A traversal module, configured to traverse the remaining labels, calculating the Euclidean distances between the feature vector of the new face picture and the subset of N feature vectors of each remaining label, and counting the number of Euclidean distances less than the Euclidean distance threshold; if every such count is less than the second-level minimum neighbor count threshold, all labels are discarded;
A clustering module, configured to perform cluster analysis, using a density clustering algorithm, on the feature vector of the new picture together with the feature vectors of all outliers in the database, and to assign a new label to the new picture if the threshold requirement is met.
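The two neighbor-count thresholds effectively split new pictures into three routes: direct assignment, verification by logistic regression, or clustering with the outliers. The following sketch shows that routing under stated assumptions: the function names and route strings are hypothetical, the per-label counts are assumed to be computed beforehand, and a count exactly equal to the second-level threshold is treated as a discard, which the source does not specify.

```python
def route_by_neighbor_count(count, min_neighbors, second_min_neighbors):
    """Map one label's close-neighbor count to an action."""
    if count >= min_neighbors:
        return "assign"                      # new picture distribution module
    if count > second_min_neighbors:
        return "verify"                      # recording + secondary distribution
    return "discard"                         # discarding module (assumed for equality)

def triage_new_picture(counts_by_label, min_neighbors, second_min_neighbors):
    """Decide what to do with a new picture given its neighbor count per label.
    Returns (route, labels involved)."""
    candidates = []
    for label, count in counts_by_label.items():
        action = route_by_neighbor_count(count, min_neighbors, second_min_neighbors)
        if action == "assign":
            return "assign", [label]
        if action == "verify":
            candidates.append(label)         # remember for the logistic check
    if candidates:
        return "verify", candidates
    return "cluster_with_outliers", []       # every label was discarded
```

Labels returned with the `"verify"` route would then go through the confidence check of the secondary distribution module, and the `"cluster_with_outliers"` route corresponds to the clustering module.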
In conclusion the present invention first uses Density Clustering method to find the stronger face picture of some similarities, then with patrolling The strong picture feature of these similarities of recurrence learning is collected, then the parameter acquired is applied on the lower picture of other similarities, And similar credibility is verified by confidence rate, ensure that face picture is known while enhancing Hierarchical Clustering ability to realize Other accuracy;Further, for increasing picture newly, the present invention is carried out using the disaggregated model parameter that label picture calculates Identifying processing effectively prevents the case where newly-increased picture needs are compared with all picture feature vectors in legacy data library, real Show quickly and accurately newly-increased picture has been added in classified face label, or has established new mark for newly-increased picture Label.
It should be understood that the application of the present invention is not limited to the above for those of ordinary skills can With improvement or transformation based on the above description, all these modifications and variations all should belong to the guarantor of appended claims of the present invention Protect range.

Claims (10)

1. An unsupervised intelligent accurate face recognition method, characterized by comprising the steps of:
A. performing preliminary classification of the extracted face picture feature vectors using a density clustering algorithm, assigning the classified face picture feature vectors to corresponding labels to form a training set, and forming a test set from the unclassified face picture feature vectors;
B. learning model parameters from the training set using a logistic regression algorithm to obtain initial logistic regression parameters;
C. performing logistic regression prediction on any face picture feature vector in the test set according to the initial logistic regression parameters, and applying probability normalization to the resulting predicted value to calculate the corresponding probability value;
D. when the probability value is greater than a preset confidence threshold, assigning the face picture feature vector currently undergoing logistic regression prediction to the corresponding label, and forming a new training set;
E. judging whether all of the extracted face picture feature vectors have been assigned to corresponding labels; if so, ending the assignment; if not, returning to step B for iteration until all extracted face picture feature vectors have been assigned corresponding labels.
2. The unsupervised intelligent accurate face recognition method according to claim 1, characterized in that step D further comprises:
D1. when the probability value is less than the preset confidence threshold, determining that the face picture feature vector currently undergoing logistic regression prediction is an outlier.
3. The unsupervised intelligent accurate face recognition method according to claim 1, characterized in that after step E the face pictures that have been assigned labels form a database, and when a new face picture is added to the database, the method further comprises, after step E, the following steps:
F1. taking out in advance, label by label, N feature vectors of the face pictures under each label in the database;
F2. calculating the Euclidean distances between the feature vector of the new face picture and the subset of N feature vectors under the current label, and counting the number of Euclidean distances that are less than the Euclidean distance threshold;
F3. if the number of Euclidean distances less than the Euclidean distance threshold is greater than or equal to the minimum neighbor count threshold, assigning the new picture to the current label.
4. The unsupervised intelligent accurate face recognition method according to claim 3, characterized in that after step F2 the method further comprises:
F4. if the number of Euclidean distances less than the Euclidean distance threshold is greater than the second-level minimum neighbor count threshold and less than the minimum neighbor count threshold, recording the current label and proceeding to step F41;
F41. performing logistic regression on the feature vectors of the labeled face pictures in the database to obtain the current logistic regression parameters, performing logistic regression prediction on the feature vector of the new picture according to the current logistic regression parameters, and applying probability normalization to the resulting predicted value to calculate the corresponding probability value; when the probability value is greater than the preset confidence threshold, assigning the feature vector of the new picture to the current label.
5. The unsupervised intelligent accurate face recognition method according to claim 3, characterized in that after step F2 the method further comprises:
F5. if the number of Euclidean distances less than the Euclidean distance threshold is less than the second-level minimum neighbor count threshold, discarding the current label and proceeding to step F51;
F51. traversing the remaining labels, calculating the Euclidean distances between the feature vector of the new face picture and the subset of N feature vectors of each remaining label, and counting the number of Euclidean distances less than the Euclidean distance threshold; if every such count is less than the second-level minimum neighbor count threshold, discarding all labels;
F52. performing cluster analysis, using a density clustering algorithm, on the feature vector of the new picture together with the feature vectors of all outliers in the database, and assigning a new label to the new picture if the threshold requirement is met.
6. An unsupervised intelligent accurate face recognition system, characterized by comprising:
a preliminary classification module, configured to perform preliminary classification of the extracted face picture feature vectors using a density clustering algorithm, assign the classified face picture feature vectors to corresponding labels to form a training set, and form a test set from the unclassified face picture feature vectors;
a model building module, configured to learn model parameters from the training set using a logistic regression algorithm, obtaining initial logistic regression parameters;
a prediction processing module, configured to perform logistic regression prediction on any face picture feature vector in the test set according to the initial logistic regression parameters, and to apply probability normalization to the resulting predicted value to calculate the corresponding probability value;
a label distribution module, configured to, when the probability value is greater than a preset confidence threshold, assign the face picture feature vector currently undergoing logistic regression prediction to the corresponding label, and form a new training set;
a judgment module, configured to judge whether all of the extracted face picture feature vectors have been assigned to corresponding labels; if so, the assignment ends; if not, control returns to the model building module for iteration until all extracted face picture feature vectors have been assigned corresponding labels.
7. The unsupervised intelligent accurate face recognition system according to claim 6, characterized in that the label distribution module further comprises:
a judging unit, configured to determine that the face picture feature vector currently undergoing logistic regression prediction is an outlier when the probability value is less than the preset confidence threshold.
8. The unsupervised intelligent accurate face recognition system according to claim 6, characterized in that the system further comprises:
a feature vector selection module, configured to take out in advance, label by label, N feature vectors of the face pictures under each label in the database;
a statistics calculation module, configured to, when a face picture without an assigned label is newly added, calculate the Euclidean distances between the feature vector of the new face picture and the subset of N feature vectors under the current label, and count the number of Euclidean distances that are less than the Euclidean distance threshold;
a new picture distribution module, configured to assign the new picture to the current label when the number of Euclidean distances less than the Euclidean distance threshold is greater than or equal to the minimum neighbor count threshold.
9. The unsupervised intelligent accurate face recognition system according to claim 8, characterized in that the system further comprises:
a recording module, configured to record the current label when the number of Euclidean distances less than the Euclidean distance threshold is greater than the second-level minimum neighbor count threshold and less than the minimum neighbor count threshold, and then to hand over to the secondary distribution module;
a secondary distribution module, configured to perform logistic regression on the feature vectors of the labeled face pictures in the database to obtain the current logistic regression parameters, to perform logistic regression prediction on the feature vector of the new picture according to the current logistic regression parameters, and to apply probability normalization to the resulting predicted value to calculate the corresponding probability value; when the probability value is greater than the preset confidence threshold, the feature vector of the new picture is assigned to the current label.
10. The unsupervised intelligent accurate face recognition system according to claim 8, characterized in that the system further comprises:
a discarding module, configured to discard the current label when the number of Euclidean distances less than the Euclidean distance threshold is less than the second-level minimum neighbor count threshold, and then to hand over to the traversal module;
a traversal module, configured to traverse the remaining labels, calculating the Euclidean distances between the feature vector of the new face picture and the subset of N feature vectors of each remaining label, and counting the number of Euclidean distances less than the Euclidean distance threshold; if every such count is less than the second-level minimum neighbor count threshold, all labels are discarded;
a clustering module, configured to perform cluster analysis, using a density clustering algorithm, on the feature vector of the new picture together with the feature vectors of all outliers in the database, and to assign a new label to the new picture if the threshold requirement is met.
CN201710332276.3A 2017-05-11 2017-05-11 Unsupervised intelligent face accurate identification method and system Active CN108875455B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710332276.3A CN108875455B (en) 2017-05-11 2017-05-11 Unsupervised intelligent face accurate identification method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710332276.3A CN108875455B (en) 2017-05-11 2017-05-11 Unsupervised intelligent face accurate identification method and system

Publications (2)

Publication Number Publication Date
CN108875455A true CN108875455A (en) 2018-11-23
CN108875455B CN108875455B (en) 2022-01-18

Family

ID=64319732

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710332276.3A Active CN108875455B (en) 2017-05-11 2017-05-11 Unsupervised intelligent face accurate identification method and system

Country Status (1)

Country Link
CN (1) CN108875455B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111079849A (en) * 2019-12-23 2020-04-28 西南交通大学 Method for constructing new target network model for voice-assisted audio-visual collaborative learning
CN113033444A (en) * 2021-03-31 2021-06-25 北京金山云网络技术有限公司 Age estimation method and device and electronic equipment

Patent Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102567391A (en) * 2010-12-20 2012-07-11 ***通信集团广东有限公司 Method and device for building classification forecasting mixed model
CN105389583A (en) * 2014-09-05 2016-03-09 华为技术有限公司 Image classifier generation method, and image classification method and device
US20170061322A1 (en) * 2015-08-31 2017-03-02 International Business Machines Corporation Automatic generation of training data for anomaly detection using other user's data samples
US10147049B2 (en) * 2015-08-31 2018-12-04 International Business Machines Corporation Automatic generation of training data for anomaly detection using other user's data samples
CN105426878A (en) * 2015-12-22 2016-03-23 小米科技有限责任公司 Method and device for face clustering
CN105740842A (en) * 2016-03-01 2016-07-06 浙江工业大学 Unsupervised face recognition method based on fast density clustering algorithm
CN105787770A (en) * 2016-04-27 2016-07-20 上海遥薇(集团)有限公司 Non-negative matrix factorization (NMF) algorithm-based big data commodity and service recommending method and system
CN106503656A (en) * 2016-10-24 2017-03-15 厦门美图之家科技有限公司 A kind of image classification method, device and computing device
CN106355170A (en) * 2016-11-22 2017-01-25 Tcl集团股份有限公司 Photo classifying method and device
CN106650804A (en) * 2016-12-13 2017-05-10 深圳云天励飞技术有限公司 Facial sample cleaning method and system based on deep learning features
US20200302225A1 (en) * 2019-03-21 2020-09-24 Illumina, Inc. Training Data Generation for Artificial Intelligence-Based Sequencing

Also Published As

Publication number Publication date
CN108875455B (en) 2022-01-18

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 516006 TCL science and technology building, No. 17, Huifeng Third Road, Zhongkai high tech Zone, Huizhou City, Guangdong Province

Applicant after: TCL Technology Group Co.,Ltd.

Address before: 516006 Guangdong province Huizhou Zhongkai hi tech Development Zone No. nineteen District

Applicant before: TCL Corp.

GR01 Patent grant