CN107644235A - Image automatic annotation method based on semi-supervised learning - Google Patents

Image automatic annotation method based on semi-supervised learning

Info

Publication number
CN107644235A
CN107644235A (Application CN201711002595.4A)
Authority
CN
China
Prior art keywords
image
training
mark
sample
lda
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201711002595.4A
Other languages
Chinese (zh)
Inventor
李志欣
林兰
张灿龙
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangxi Normal University
Original Assignee
Guangxi Normal University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangxi Normal University filed Critical Guangxi Normal University
Priority to CN201711002595.4A priority Critical patent/CN107644235A/en
Publication of CN107644235A publication Critical patent/CN107644235A/en
Pending legal-status Critical Current

Landscapes

  • Image Analysis (AREA)

Abstract

The present invention discloses an automatic image annotation method based on semi-supervised learning. First, the data set is divided into a training data set, an unlabeled data set and a test set. Then the SIFT and HOG features of the training samples are extracted to train an LDA_SVM classifier, and color and texture features are extracted to train a neural network. Next, the unlabeled data are used: the two classifiers predict labels for the same unlabeled sample simultaneously, and, according to each classifier's contribution to the classification accuracy on unlabeled samples, the classification results of the two classifiers are fused with an adaptive weighted fusion strategy to obtain the final predicted label probability vector of the sample. Finally, the two classifiers are updated with the high-confidence samples and their predicted labels until a preset maximum number of iterations is reached. The present invention makes full use of unlabeled samples to mine the intrinsic regularities of image features, effectively reduces the number of labeled samples required for classifier training, and obtains a better annotation result.

Description

Image automatic annotation method based on semi-supervised learning
Technical field
The present invention relates to the field of image retrieval technology, and in particular to an automatic image annotation method based on semi-supervised learning.
Background technology
With the popularization of networks and digital devices, image data of all kinds of media grow explosively, and how to organize and manage them effectively so that users can browse and retrieve them efficiently has become a widely studied problem.
Image retrieval has been a very active research field since the 1970s. The most widely applied image retrieval technologies at present are text-based image retrieval (Text-based Image Retrieval, TBIR) and content-based image retrieval (Content-based Image Retrieval, CBIR). TBIR has obvious defects: in particular, when the number of images is very large, the workload of manual annotation is enormous, and the subjectivity and inaccuracy of manual annotation may lead to mismatches during retrieval. CBIR, in turn, suffers from the prominent "semantic gap" between low-level features and high-level semantics. Both approaches are therefore difficult to apply to the management of today's large-scale image databases.
Automatic image annotation lets a computer automatically learn, from already annotated images, the latent relation between the semantic/concept space and the visual feature space, and add semantic keywords that reflect the content of unannotated images. Automatic image annotation can effectively improve the predicament of current image retrieval: it keeps retrieval by basic text keywords while greatly reducing the huge workload of manual annotation, and to some extent it also narrows the "semantic gap"; for these reasons it has long received the attention of researchers.
Although researchers have made great progress in automatic image annotation, traditional automatic annotation methods usually require a large number of training samples to train the classifier, whereas in practical applications labeled samples are relatively difficult to obtain while unlabeled samples are readily available. How to make full use of the connection between labeled samples and unlabeled samples to build the annotation model and improve the accuracy and performance of the classifier is therefore a challenging problem.
Summary of the invention
Aiming at the problem that traditional automatic image annotation still needs a large number of manually labeled training samples and gives an unsatisfactory annotation result when the labeled samples are few, the present invention provides an automatic image annotation method based on semi-supervised learning, which can make full use of unlabeled samples to mine the intrinsic regularities of image features, effectively reduce the number of labeled samples required for classifier training, and obtain a better annotation result.
The principle of the present invention is as follows. To make full use of unlabeled samples to mine the intrinsic regularities of image features when the training data are few, and thereby obtain a good automatic annotation result, the invention proposes an automatic image annotation method based on semi-supervised learning. First, the data set is divided into a training data set, an unlabeled data set and a test set. Then the SIFT features and HOG features of the training samples are extracted as feature set A to train an LDA_SVM classifier, and the color and texture features are extracted as feature set B to train a neural network. Because the training data are few at this point, the obtained classifiers are weak; through the co-training of the two classifiers, a large amount of unlabeled data can be used to improve their classification performance. Next, the unlabeled data are used: the two classifiers predict labels for the same unlabeled sample simultaneously, and, according to each classifier's contribution to the classification accuracy on the unlabeled samples, the classification results of the two classifiers are fused with an adaptive weighted fusion strategy to obtain the final predicted label probability vector of the sample. Finally, the two classifiers are updated with the high-confidence samples and their predicted labels, and the algorithm exits when the preset maximum number of iterations is reached.
The image automatic annotation method based on semi-supervised learning includes the following steps:
Step 1, divide the given data set into 3 sub-data sets, namely a training data set, an unlabeled data set and a test data set;
Step 2, LDA_SVM classifier training stage;
Step 2.1, extract the SIFT features and HOG features of the training images in the training data set as the first feature set, quantize the visual features with the bag-of-words method, and obtain the bag-of-words representation of every training image;
Step 2.2, model the visual features of the training images with LDA to obtain the topic distribution of each visual word and the visual topic distribution of every training image;
Step 2.3, construct an SVM multi-class classifier with the visual topic distributions obtained in step 2.2 and their original labels, giving the currently trained LDA_SVM classifier;
Step 3, neural network classifier training stage;
Step 3.1, extract the color features and texture features of the training images in the training data set as the second feature set;
Step 3.2, input the second feature set together with the corresponding label information into a neural network for training, giving the currently trained neural network classifier;
Step 4, co-training stage;
Step 4.1, extract the SIFT features and HOG features of the unlabeled images in the unlabeled data set, quantize the visual features with the bag-of-words method, and obtain the bag-of-words representation of every unlabeled image;
Step 4.2, learn the visual topic distribution of the unlabeled images with the visual word topic distribution obtained in step 2.2;
Step 4.3, input the learned image visual topic distributions into the currently trained LDA_SVM classifier to obtain the first label prediction probability vector of each unlabeled image;
Step 4.4, perform label prediction on the unlabeled images in the unlabeled data set with the currently trained neural network classifier to obtain the second label prediction probability vector of each unlabeled image;
Step 4.5, fuse the first label prediction probability vector and the second label prediction probability vector of each unlabeled image according to a given adaptive weighted fusion strategy to obtain the final label prediction probability vector of the unlabeled image;
Step 4.6, select the high-confidence unlabeled images and their predicted labels, add them to the training data set, and retrain (update) the LDA_SVM classifier and the neural network classifier, i.e. return to step 2, until the preset maximum number of iterations is reached, giving the finally trained LDA_SVM classifier and neural network classifier;
Step 5, annotation stage of the test images;
Step 5.1, extract the first feature set and the second feature set of the test images in the test data set;
Step 5.2, perform label prediction on the first feature set of a test image with the finally trained LDA_SVM classifier to obtain the first label prediction probability vector of the test image;
Step 5.3, perform label prediction on the second feature set of the test image with the finally trained neural network classifier to obtain the second label prediction probability vector of the test image;
Step 5.4, fuse the first label prediction probability vector and the second label prediction probability vector of the test image according to the given adaptive weighted fusion strategy to obtain the final label prediction probability vector of the test image;
Step 5.5, choose the n labels with the highest confidence as the label set of the test image, where n is a manually set value.
Although the number of images in each of the 3 sub-data sets can be set as needed, the numbers of images preferably satisfy: unlabeled data set > test data set > training data set.
In steps 4.5 and 5.4 above, the adaptive weighted fusion strategy is determined according to the contributions of the LDA_SVM classifier and the neural network classifier to the prediction accuracy on the same unlabeled data.
Compared with the prior art, the present invention has the following features:
(1) In the feature extraction phase, two different feature sets A and B are extracted from each image, where feature set A consists of the SIFT and HOG features and feature set B of the color and texture features; extracting different feature sets describes the image from different perspectives.
(2) LDA converts the feature-set-A representation of an image into a K-dimensional topic vector, and this topic vector also implies the semantic information of the image; it effectively reduces the dimensionality of the high-dimensional vector and represents the image better.
(3) Two different classifiers, an LDA_SVM classifier and a neural network, are co-trained, so the images are learned from different perspectives; finally the prediction results of the two classifiers are fused, giving a better annotation result.
(4) The semi-supervised method of co-training makes full use of unlabeled samples to mine the intrinsic regularities of image features, which greatly reduces the workload of manual labeling while improving the annotation accuracy.
Brief description of the drawings
Fig. 1 is the overall framework of automatic image annotation based on semi-supervised learning.
Fig. 2 is the flow chart of the LDA_SVM classifier training and annotation algorithms.
Fig. 3 is the LDA graphical model.
Embodiment
To make the objects, technical solutions and advantages of the present invention clearer, the invention is described in more detail below with reference to a concrete example and the accompanying drawings.
The overall framework of the image automatic annotation method based on semi-supervised learning is shown in Fig. 1; it specifically includes the following steps:
Step (1): divide the data set into three sub-data sets, namely a training data set, an unlabeled data set and a test data set. The ratio of the three sub-data sets can be set manually; the guiding principle is unlabeled data set > test data set > training data set.
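A minimal sketch of this split, assuming scikit-learn's train_test_split and arrays of images and labels; the 10% / 30% / 60% proportions and the function name split_dataset are illustrative, the text only fixes the ordering unlabeled > test > training:

```python
from sklearn.model_selection import train_test_split

def split_dataset(images, labels, train_frac=0.1, test_frac=0.3, seed=42):
    # Carve out the small labeled training set first.
    x_train, x_rest, y_train, y_rest = train_test_split(
        images, labels, train_size=train_frac, random_state=seed)
    # Split the remainder into the test set and the (nominally) unlabeled set,
    # so that |unlabeled| > |test| > |train|.
    x_test, x_unlabeled, y_test, _ = train_test_split(
        x_rest, y_rest, train_size=test_frac / (1.0 - train_frac),
        random_state=seed)
    return (x_train, y_train), x_unlabeled, (x_test, y_test)
```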
Step (2): the training process for the training images is divided into several stages, namely the LDA_SVM classifier training stage, the neural network training stage and the co-training stage.
Step (2.1): LDA_SVM classifier training stage.
Step (2.1.1): extract the SIFT features and HOG features of the training sample images as feature set A, quantize the visual features with the "bag-of-words" method, and obtain the "bag-of-words" representation of each image.
Step (2.1.2): model the visual features of the training images with LDA to obtain the topic distribution φ of each visual word of the training images and the visual topic distribution θ_d of each image. When the LDA model is used, the number of topics is set to 60 and the initial hyperparameter values are α = 0.1 and β = 0.01.
Step (2.1.3): construct the SVM multi-class classification model with the obtained visual topic distributions θ_d and their original labels.
Step (2.2): neural network training stage. When the neural network is trained, the learning rate is η = 0.01 and the number of hidden neurons is 9.
Step (2.2.1): extract the color features and texture features of the training sample images as feature set B.
Step (2.2.2): input feature set B together with the corresponding label information into the neural network for training.
Step (2.3): co-training stage.
Step (2.3.1): for the images in the unlabeled sample set, perform step (2.1.1) to obtain their "bag-of-words" representations, and learn the visual topic distribution θ_d of the unlabeled samples with the visual word topic distribution φ obtained in step (2.1.2).
Step (2.3.2): input the learned image visual topic distributions θ_d into the trained SVM multi-class classifier to obtain the label prediction probability vector C_L of the unlabeled samples.
Step (2.3.3): perform label prediction on the images in the unlabeled sample set with the trained neural network to obtain the label prediction probability vector C_N.
Step (2.3.4): according to the contributions of the two classifiers to the prediction accuracy on the same unlabeled data, fuse C_L and C_N with an adaptive weighted fusion strategy to obtain the final label prediction probability vector; select the high-confidence predicted labels with their samples and give them to the two classifiers for retraining, until the preset maximum number of iterations is reached, then exit the algorithm.
Step (3): annotation stage for the images to be annotated (test images).
Step (3.1): extract feature set A and feature set B of the test sample images.
Step (3.2): perform label prediction on feature set A of a test image with the trained LDA_SVM classifier to obtain the label prediction probability vector C_L.
Step (3.3): perform label prediction on feature set B of the test image with the trained neural network to obtain the label prediction probability vector C_N.
Step (3.4): fuse C_L and C_N with the adaptive weighted fusion strategy to obtain the final label prediction probability vector, and choose the n labels with the highest confidence as the label set of the test sample, where n is a manually set value. Here n is set to 5, i.e. the 5 labels with the highest confidence form the label set of the test image.
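Choosing the n highest-confidence labels in step (3.4) is a simple top-n lookup over the fused probability vector; a minimal sketch with numpy, where vocabulary (the list of candidate labels) is an assumed variable and n = 5 follows the embodiment:

```python
import numpy as np

def top_n_labels(prob_vector, vocabulary, n=5):
    # Indices of the n highest-confidence labels, best first.
    idx = np.argsort(prob_vector)[::-1][:n]
    return [vocabulary[i] for i in idx]
```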
The training process for the training sample images is divided into three stages: the LDA_SVM classifier training stage, the neural network training stage and the co-training stage. (1) LDA_SVM training stage: first extract the SIFT and HOG features of the training images as feature set A and quantize the visual features with the "bag-of-words" method; then model the visual features of the training images with the LDA model to obtain the topic distribution φ of each visual word and the visual topic distribution θ_d of every training image, which is taken as the intermediate representation vector of each image; finally construct the SVM multi-class classifier with the visual topic distributions θ_d and the label information. (2) Neural network training stage: first extract the color and texture features of the training images as feature set B, then construct the neural network with feature set B. (3) Co-training stage: likewise extract feature set A of the unlabeled images and quantize the visual features with the "bag-of-words" method; take the visual word topic distribution φ obtained in the LDA_SVM training stage as the visual word topic distribution of the unlabeled images, and learn the topic distribution θ_d of every unlabeled image from its visual features and φ; take θ_d as the intermediate vector of each image and classify it with the trained SVM multi-class classifier to obtain the label prediction probability vector C_L of the unlabeled samples. Extract feature set B of the unlabeled images and perform label prediction with the trained neural network to obtain the label prediction probability vector C_N. According to the contributions of the two classifiers to the prediction accuracy on the same unlabeled data, fuse C_L and C_N with an adaptive weighted fusion strategy to obtain the final label prediction probability vector; select the high-confidence predicted labels with their samples and give them to the two classifiers for retraining, until the preset maximum number of iterations is reached, then exit the algorithm.
The annotation process of a test image is divided into four stages: (1) extract feature set A and feature set B of the test sample image; (2) perform label prediction on feature set A of the test image with the trained LDA_SVM classifier to obtain the label prediction probability vector C_L; (3) perform label prediction on feature set B of the test image with the trained neural network to obtain the label prediction probability vector C_N; (4) fuse C_L and C_N with the adaptive weighted fusion strategy to obtain the final label prediction probability vector, and select the several labels with the highest confidence as the label set of the test sample.
For feature set A, the extraction method of the present invention first divides each image in the data set into regular squares with the dense block sampling method: the squares are 16 × 16 pixels and the whole image is traversed with a step of 10 pixels, each window-covered region being taken as one feature region from which the SIFT features and HOG features of the image are extracted. The image is then represented with the "bag-of-words" method, as follows:
Step 1) Construct the visual dictionary. Part of the images of every class of training data are taken at random, and the SIFT features and HOG features of the images are clustered separately with the k-means algorithm. Suppose clustering yields N_S visual words for the SIFT features and N_H visual words for the HOG features; the size of the final visual dictionary is then the sum of the two, N_S + N_H.
Step 2) Quantize the visual features. The visual features of each image are mapped onto the visual dictionary and a histogram over the visual words is computed for each image; an image can then be represented by the (N_S + N_H)-dimensional visual histogram shown in formula (1):
v(d_i) = { n(d_i, v_1), n(d_i, v_2), ..., n(d_i, v_{N_S}), n(d_i, v_{N_S+1}), ..., n(d_i, v_{N_S+N_H}) }    (1)
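A minimal sketch of steps 1) and 2), assuming scikit-learn's KMeans and per-image SIFT/HOG descriptor matrices as input; the dictionary sizes of 500 SIFT and 500 HOG words are illustrative assumptions, since N_S and N_H are not fixed by the text:

```python
import numpy as np
from sklearn.cluster import KMeans

def build_visual_dictionary(sift_descs, hog_descs, n_sift_words=500, n_hog_words=500, seed=0):
    # Step 1): cluster the SIFT and the HOG descriptors of the sampled training
    # images separately; the final dictionary holds N_S + N_H visual words.
    km_sift = KMeans(n_clusters=n_sift_words, random_state=seed).fit(np.vstack(sift_descs))
    km_hog = KMeans(n_clusters=n_hog_words, random_state=seed).fit(np.vstack(hog_descs))
    return km_sift, km_hog

def bow_histogram(img_sift, img_hog, km_sift, km_hog):
    # Step 2) / formula (1): map every descriptor of one image to its nearest
    # visual word and count occurrences, giving the (N_S + N_H)-dim histogram v(d_i).
    n_s, n_h = km_sift.n_clusters, km_hog.n_clusters
    hist = np.zeros(n_s + n_h)
    for w in km_sift.predict(img_sift):
        hist[w] += 1
    for w in km_hog.predict(img_hog):
        hist[n_s + w] += 1
    return hist
```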
For feature set B, the extraction method of the present invention first divides each image in the data set into regular squares of size 16 × 16 and then extracts an 18-dimensional feature vector for each square, comprising 9 color dimensions and 9 texture dimensions. The color features are described with a color histogram: the HSV color space of the image is quantized into 9 bins, and the color histogram of each image is obtained by counting the number of pixels whose color falls in each bin. The texture features are computed with a Gabor filter bank of 3 scales in 3 orientations (0°, 60° and 120°, respectively).
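A hedged sketch of the 18-dimensional block feature; the text does not say exactly how the HSV space is quantized into 9 bins or which Gabor frequencies are used, so the hue-only histogram and the frequency values below are assumptions:

```python
import numpy as np
from skimage.color import rgb2hsv
from skimage.filters import gabor

def block_feature(block_rgb):
    # 9 color dimensions: a 9-bin histogram over the HSV representation of the
    # 16x16 block (here simplified to the hue channel), normalized by pixel count.
    hsv = rgb2hsv(block_rgb)
    color_hist, _ = np.histogram(hsv[..., 0], bins=9, range=(0.0, 1.0))
    color_hist = color_hist / color_hist.sum()
    # 9 texture dimensions: mean Gabor magnitude for 3 scales x 3 orientations
    # (0, 60 and 120 degrees); the frequencies 0.1/0.2/0.4 are assumed values.
    gray = block_rgb.mean(axis=2)
    texture = []
    for freq in (0.1, 0.2, 0.4):
        for theta in (0.0, np.pi / 3, 2 * np.pi / 3):
            real, imag = gabor(gray, frequency=freq, theta=theta)
            texture.append(np.sqrt(real ** 2 + imag ** 2).mean())
    return np.concatenate([color_hist, np.asarray(texture)])
```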
Semi-supervised learning means that, with only a small number of labeled samples, the classifier automatically improves its performance using unlabeled samples, on the basis of the knowledge obtained from the training samples. Co-training is a semi-supervised learning method that uses two or more classifiers trained independently on different feature sets of the data; the accuracy of the classifiers is improved by combining the classification decisions of all classifiers. The unlabeled data are progressively predicted and labeled by the classifiers, the data with higher confidence are then added to the training set, and the iteration continues until all unlabeled data have been labeled.
The present invention uses two independent feature sets to construct two different classifiers, an LDA_SVM classifier and a neural network; through the co-training of the two classifiers, a large amount of unlabeled data is used to improve the performance of automatic image annotation.
The flow charts of the LDA_SVM classifier training and annotation algorithms are shown in Fig. 2.
For the extracted feature set A, the visual features are quantized with the "bag-of-words" method to obtain the "bag-of-words" representation of each image. All training samples are then modeled with LDA, and the obtained image visual topic distribution θ is taken as the feature of each image and used to train the SVM multi-class classifier.
LDA (Latent Dirichlet Allocation) is a topic model that can model both text and images. When images are modeled, an image is regarded as a document and the visual words as the words of the document; the LDA model then mines the latent topic distribution of the images and yields an intermediate representation vector for each image, so that the feature dimensionality of the image is greatly reduced while the image is represented better.
Suppose D = {d_1, d_2, ..., d_M} denotes an image data set and w = {w_11, w_12, ..., w_mn}, where w_mn is the n-th visual word of the m-th image. The model assumes that each image is generated by a mixture of K latent topic variables Z = {z_1, z_2, ..., z_K}, and that each topic z_k is a probability distribution over the visual dictionary generated by the parameter φ. The parameters θ and φ obey Dirichlet distributions with parameters α and β, respectively; θ denotes the mixing proportions of the image topic distribution, φ denotes the distribution over visual words conditioned on a given topic z_k, and w denotes the visual words of the image. The model is determined by these 6 major parameters. The LDA graphical model is shown in Fig. 3; except for w, which is an observable variable, the others are unobservable hidden variables. It follows that the key step of LDA is to find the optimal hyperparameters α and β; the optimal solution of these two parameters is obtained from the observable variable w by the variational EM algorithm.
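The per-image topic distributions θ_d can be obtained, for instance, with scikit-learn's LatentDirichletAllocation (variational inference); the topic number 60 and priors α = 0.1, β = 0.01 follow the embodiment, while the wrapper function and the count matrix bow_matrix from the bag-of-words stage are assumptions:

```python
from sklearn.decomposition import LatentDirichletAllocation

def fit_lda(bow_matrix, n_topics=60, alpha=0.1, beta=0.01):
    # bow_matrix: (n_images x dictionary size) matrix of visual-word counts.
    lda = LatentDirichletAllocation(n_components=n_topics,
                                    doc_topic_prior=alpha,
                                    topic_word_prior=beta,
                                    learning_method="batch",
                                    random_state=0)
    theta = lda.fit_transform(bow_matrix)   # per-image topic distribution theta_d
    return lda, theta                       # lda.components_ holds the word-topic weights (phi)
```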
The support vector machine (SVM) is widely used because it handles high-dimensional data efficiently and achieves good results even when the training samples are few; its core idea is to separate different data samples by finding the optimal separating hyperplane in the feature space. Automatic image annotation can be regarded as a multi-class classification problem, while the traditional SVM is a binary classifier. To let the SVM solve multi-class problems, the most common strategies are "one vs. all" ("OVA", comparing a given class with all other classes) and "one vs. one" (pairwise comparison). The present invention adopts the OVA strategy: when a classifier is trained for each semantic concept, the training samples belonging to that concept are regarded as positive samples and all other samples as negative samples. Thus, if the data set contains n classes of images, n SVM classifiers are produced. In the test phase, each classifier produces a prediction probability for each unlabeled sample, and the class with the largest prediction probability is regarded as the most probable class of the sample.
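A minimal sketch of the OVA strategy with scikit-learn, where one binary SVM is trained per semantic concept; the RBF kernel is an assumption, and probability=True yields the per-class prediction probabilities used later for fusion:

```python
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import SVC

def train_ova_svm(theta_train, class_labels):
    # One binary SVM per semantic concept: images of that concept are positives,
    # all other images negatives.
    clf = OneVsRestClassifier(SVC(kernel="rbf", probability=True))
    clf.fit(theta_train, class_labels)
    return clf

# clf.predict_proba(theta_new) then gives the label prediction probability vector
# for a new image's topic distribution; its argmax is the most probable class.
```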
The LDA_SVM training algorithm is as follows:
(1) For the training image set, divide each image into regular 16 × 16 squares with the dense block sampling method, with a sampling interval of 10 pixels.
(2) Extract the SIFT and HOG features of each square, quantize the visual features with the "bag-of-words" method, and obtain the "bag-of-words" representation of every image.
(3) Model the visual features of the training images with LDA to obtain the topic distribution φ of each visual word of the training images and the visual topic distribution θ_d of each image.
(4) Construct the SVM multi-class classification model with the obtained visual topic distributions θ_d and their original labels.
The LDA_SVM annotation algorithm is as follows:
(1) For every new image d_new, perform steps (1) and (2) of the training algorithm.
(2) Learn the visual topic distribution θ_new of the new image from the visual word topic distribution φ obtained by the training algorithm.
(3) Input the learned visual topic distribution θ_new of the new image into the trained SVM multi-class classifier to obtain the label prediction probability vector of the new image.
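Putting the annotation algorithm together, assuming the bow_histogram, fit_lda and train_ova_svm sketches given above; all names are illustrative:

```python
def lda_svm_annotate(img_sift, img_hog, km_sift, km_hog, lda, svm):
    # (1)-(2): build the bag-of-words histogram of the new image and infer its
    # topic distribution theta_new with the fitted LDA (phi stays fixed).
    hist = bow_histogram(img_sift, img_hog, km_sift, km_hog)
    theta_new = lda.transform(hist.reshape(1, -1))
    # (3): feed theta_new to the one-vs-all SVM to get the prediction vector C_L.
    return svm.predict_proba(theta_new)[0]
```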
The extracted feature set B is processed with a neural network. The artificial neural network (ANN), or neural network (NN) for short, has a powerful ability to solve multi-class classification problems. The present invention uses a multilayer feed-forward neural network with a three-layer architecture for training and prediction: the first layer receives the input signal from the sample and has the same number of neurons as the sample feature dimensionality; the middle layer is the hidden layer (how to choose the optimal number of hidden neurons is still an open problem, and the number is usually determined empirically); the last layer is the output layer and contains the same number of neurons as the number of sample classes. Neurons in different layers are connected by weighted edges, and the sigmoid function is generally used as the activation function to produce the outputs between layers. Training the neural network consists of adjusting the "connection weights" and thresholds between the different neurons according to the training samples.
Suppose there is a data set D = {(x_1, y_1), (x_2, y_2), ..., (x_i, y_i), ...}, i.e. each sample is described by an n-dimensional feature vector and the output is an m-dimensional real-valued vector. For each input sample (x_i, y_i), the corresponding network output is ŷ_i = (ŷ_i1, ŷ_i2, ..., ŷ_im), i.e.
ŷ_ik = f(α_k − θ_k)    (1)
where α_k = Σ_l w_lk · v_l is the input received by the k-th neuron of the output layer, w_lk is the connection weight between hidden unit l and output-layer unit k for sample x_i, v_l is the output of the l-th neuron of the hidden layer, and θ_k is the threshold of the k-th output-layer neuron. The error between the actual output of the network on sample x_i and the target output is then
E_i = (1/2) Σ_k (ŷ_ik − y_ik)²    (2)
where y_ik takes the value 1 when k ∈ y_i, and −1 otherwise.
According to the gradient descent strategy, given the learning rate η, the weight update formula from each hidden unit to the output layer is
w_lk ← w_lk + Δw_lk    (3)
where Δw_lk = η · g_k · v_l; combining formulas (1) and (2), the output-layer error term can be derived as g_k = ŷ_ik (1 − ŷ_ik)(y_ik − ŷ_ik).
The threshold of the output-layer neurons is then updated as θ_k ← θ_k − η · g_k.
Similarly, the update formulas of the weights between the input layer and the hidden layer and of the hidden-layer thresholds can be derived: each such weight is increased by η · e_l · x_j (where x_j is the j-th component of the input) and each hidden-layer threshold γ_l is decreased by η · e_l, with the hidden-layer error term e_l = v_l (1 − v_l) Σ_k w_lk · g_k.
The training process of the neural network is mainly divided into two stages, forward propagation (computing the error) and error back-propagation (modifying the weights); the detailed process is as follows:
(1) First build an input layer with the same number of units as the sample feature dimensionality n, l hidden units and m output units.
(2) Randomly initialize all network weights in the range (0, 1).
(3) Forward propagation: input the sample into the network and compute the output of every output unit k by formula (1), where α_k is the total input received by the k-th output-layer neuron, θ_k is the threshold of the k-th output-layer neuron, and f is the sigmoid activation function. The network error on sample x_i is then computed by formula (2), where ŷ_ik is the actual output of the sample and y_ik the target output.
(4) Error back-propagation: for each output unit k of the network, compute its error term g_k; for each hidden unit l of the network, compute its error term e_l.
Finally update the weights of the network; the iteration stops when the neural network reaches the preset number of iterations or the preset training accuracy.
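The forward and backward passes described above can be sketched in numpy as follows; the hidden size 9 and learning rate η = 0.01 follow the embodiment, the (0, 1) initialization follows step (2), and the class structure and update rules follow the standard back-propagation derivation assumed here:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class ThreeLayerNet:
    def __init__(self, n_in, n_hidden=9, n_out=1, eta=0.01, seed=0):
        rng = np.random.default_rng(seed)
        self.W_in = rng.uniform(0, 1, (n_in, n_hidden))   # input -> hidden weights
        self.gamma = rng.uniform(0, 1, n_hidden)          # hidden-layer thresholds
        self.W_out = rng.uniform(0, 1, (n_hidden, n_out)) # hidden -> output weights w_lk
        self.theta = rng.uniform(0, 1, n_out)             # output-layer thresholds theta_k
        self.eta = eta

    def forward(self, x):
        v = sigmoid(x @ self.W_in - self.gamma)           # hidden outputs v_l
        y = sigmoid(v @ self.W_out - self.theta)          # outputs y_hat_k = f(alpha_k - theta_k)
        return v, y

    def train_step(self, x, y_true):
        v, y = self.forward(x)
        g = y * (1 - y) * (y_true - y)                    # output-layer error term g_k
        e = v * (1 - v) * (self.W_out @ g)                # hidden-layer error term e_l
        self.W_out += self.eta * np.outer(v, g)           # w_lk <- w_lk + eta * g_k * v_l
        self.theta -= self.eta * g
        self.W_in += self.eta * np.outer(x, e)
        self.gamma -= self.eta * e
        return 0.5 * np.sum((y - y_true) ** 2)            # squared error E_i of formula (2)
```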
Because the classification accuracy of each classifier trained with few labeled training samples is rather low, handing over only the samples (and labels) that one weak classifier considers high-confidence to update the other classifier, as traditional co-training methods do, easily introduces large errors. The present invention therefore considers the influence of both classifiers on the labeling confidence of the same training data: the label probability prediction vectors of the two classifiers are fused with an adaptive weighted fusion method, and the fusion weights are determined by the contribution of each classifier to the image classification accuracy.
The adaptive weighted fusion (Adaptive Weighted Fusion, AWF) formula is as follows:
Ĉ = W * C_L + (1 − W) * C_N
where Ĉ is the final label prediction probability vector, C_L and C_N are respectively the label prediction probability vectors of the LDA_SVM classifier and the neural network for the same sample, * is the element-wise (inner) product operator, and W is the fusion weight vector of the LDA_SVM classifier, whose magnitude is determined by the contribution of the LDA_SVM classifier to the image classification accuracy. W is computed with a likelihood normalization method, as follows:
(1) First construct two likelihood matrices L_l and L_g, which represent the output likelihoods of the LDA_SVM classifier and of the neural network respectively; the matrices have size N × M, where N is the number of samples to be labeled and M the number of prediction classes.
(2) Compute the weight vectors w_l and w_g of the LDA_SVM classifier and the neural network, whose components w_{l,m} and w_{g,m}, m = 1, 2, ..., M, are the normalized output likelihoods of the two classifiers on class m: with L_l(n, c) and L_g(n, c) denoting the probability that the n-th image is predicted as class c, the denominator of each component is the average likelihood of class m and the numerator is the total average likelihood over all M classes. After w_l and w_g are obtained, the final weight vector W is computed by normalizing them, for example element-wise as W_m = w_{l,m} / (w_{l,m} + w_{g,m}).
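A hedged sketch of the AWF step; because the exact weight formulas are only described in prose, the "overall average likelihood over per-class average likelihood" reading and the final normalization below are assumptions, not the patent's literal formulas:

```python
import numpy as np

def awf_fuse(C_l, C_n, eps=1e-12):
    # C_l, C_n: N x M matrices of prediction probabilities (output likelihoods)
    # from the LDA_SVM classifier and the neural network on the same N samples.
    def class_weights(C):
        return C.mean() / (C.mean(axis=0) + eps)   # overall avg / per-class avg (assumed reading)
    w_l, w_g = class_weights(C_l), class_weights(C_n)
    W_l = w_l / (w_l + w_g)                        # normalized fusion weight of LDA_SVM per class
    W_g = 1.0 - W_l                                # complementary weight of the neural network
    return W_l * C_l + W_g * C_n                   # fused prediction probability vectors
```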
The co-training algorithm assumes that the data set has two different "views": when the training data are sufficient, each feature subset can train a strong classifier, and, given the labels, each feature subset is conditionally independent of the other. The present invention therefore divides the image data into two independent feature subsets, constructs two different classifiers, an LDA_SVM classifier and a neural network, and then, through the co-training of the two classifiers, uses a large amount of unlabeled data to improve the performance of automatic image annotation. Suppose a data set contains, apart from the test set, D = m + n image data, where m is the number of labeled data and n the number of unlabeled data. (x, Y) denotes a labeled training sample, where x = (x_A, x_B) is the feature vector of the sample, x_A is its feature vector on feature set A, x_B its feature vector on feature set B, and Y ⊆ L is the label set of the sample; L = (l_1, l_2, ..., l_I) is the label set of all images and I is the number of classes of the data set. C = {c_i | i = 1, 2, ..., I} denotes the probabilities that an image is labeled as class i, and (x) denotes an unlabeled sample. The training process of the co-training of the two classifiers is then as shown in Table 1:
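Since Table 1 itself is not reproduced in the text, the following is only a structural sketch of the co-training loop under the assumptions above; train_lda_svm, train_nn, the number of rounds and the per-round selection size are illustrative placeholders, not values fixed by the patent:

```python
import numpy as np

def co_train(X_A, X_B, Y, U_A, U_B, n_rounds=10, n_select=50):
    # X_A / X_B: labeled training data in view A (SIFT+HOG BoW) and view B
    # (color+texture); U_A / U_B: the same two views of the unlabeled pool.
    for _ in range(n_rounds):
        clf_l = train_lda_svm(X_A, Y)              # assumed helper: LDA_SVM stage
        clf_g = train_nn(X_B, Y)                   # assumed helper: neural network stage
        if len(U_A) == 0:
            break
        fused = awf_fuse(clf_l.predict_proba(U_A), clf_g.predict_proba(U_B))
        conf = fused.max(axis=1)                   # confidence of each predicted label
        picked = np.argsort(conf)[::-1][:n_select]
        # Move the most confident unlabeled samples, with their predicted labels,
        # into the training set and drop them from the unlabeled pool.
        X_A = np.vstack([X_A, U_A[picked]])
        X_B = np.vstack([X_B, U_B[picked]])
        Y = np.concatenate([Y, fused[picked].argmax(axis=1)])
        keep = np.setdiff1d(np.arange(len(U_A)), picked)
        U_A, U_B = U_A[keep], U_B[keep]
    return clf_l, clf_g
```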
Traditional automatic image annotation still needs a large number of manually labeled training samples. When the labeled sample data are few, the trained classifiers are weak, and, following traditional co-training methods, handing over only the samples (and labels) that one weak classifier considers high-confidence to update the other classifier easily introduces large errors. The present invention comprehensively considers the influence of the two classifiers, LDA_SVM and the neural network, on the labeling confidence of the same training data, fuses the label probability prediction vectors of the two classifiers with the adaptive weighted fusion method, and then updates the two classifiers with the high-confidence samples and their predicted labels. This both effectively reduces the number of labeled samples required for classifier training and achieves a better annotation result.
It should be noted that although the embodiments of the present invention are described above, this does not limit the invention, and the invention is not restricted to the above embodiments. Any other embodiment obtained by those skilled in the art under the enlightenment of the present invention without departing from its principles shall be regarded as falling within the protection of the present invention.

Claims (3)

1. An image automatic annotation method based on semi-supervised learning, characterized by comprising the following steps:
Step 1, dividing a given data set into 3 sub-data sets, namely a training data set, an unlabeled data set and a test data set;
Step 2, LDA_SVM classifier training stage:
Step 2.1, extracting the SIFT features and HOG features of the training images in the training data set as a first feature set, quantizing the visual features with the bag-of-words method, and obtaining the bag-of-words representation of every training image;
Step 2.2, modeling the visual features of the training images with LDA to obtain the topic distribution of each visual word and the visual topic distribution of every training image;
Step 2.3, constructing an SVM multi-class classifier with the visual topic distributions obtained in step 2.2 and their original labels, to obtain the currently trained LDA_SVM classifier;
Step 3, neural network classifier training stage:
Step 3.1, extracting the color features and texture features of the training images in the training data set as a second feature set;
Step 3.2, inputting the second feature set together with the corresponding label information into a neural network for training, to obtain the currently trained neural network classifier;
Step 4, co-training stage:
Step 4.1, extracting the SIFT features and HOG features of the unlabeled images in the unlabeled data set, quantizing the visual features with the bag-of-words method, and obtaining the bag-of-words representation of every unlabeled image;
Step 4.2, learning the visual topic distribution of the unlabeled images with the visual word topic distribution obtained in step 2.2;
Step 4.3, inputting the learned image visual topic distributions into the currently trained LDA_SVM classifier to obtain the first label prediction probability vector of each unlabeled image;
Step 4.4, performing label prediction on the unlabeled images in the unlabeled data set with the currently trained neural network classifier to obtain the second label prediction probability vector of each unlabeled image;
Step 4.5, fusing the first label prediction probability vector and the second label prediction probability vector of each unlabeled image according to a given adaptive weighted fusion strategy to obtain the final label prediction probability vector of the unlabeled image;
Step 4.6, selecting the high-confidence unlabeled images and their predicted labels, adding them to the training data set, and returning to step 2, until a preset maximum number of iterations is reached, to obtain the finally trained LDA_SVM classifier and neural network classifier;
Step 5, annotation stage of the test images:
Step 5.1, extracting the first feature set and the second feature set of the test images in the test data set;
Step 5.2, performing label prediction on the first feature set of a test image with the finally trained LDA_SVM classifier to obtain the first label prediction probability vector of the test image;
Step 5.3, performing label prediction on the second feature set of the test image with the finally trained neural network classifier to obtain the second label prediction probability vector of the test image;
Step 5.4, fusing the first label prediction probability vector and the second label prediction probability vector of the test image according to the given adaptive weighted fusion strategy to obtain the final label prediction probability vector of the test image;
Step 5.5, choosing the n labels with the highest confidence as the label set of the test image, where n is a manually set value.
2. The image automatic annotation method based on semi-supervised learning according to claim 1, characterized in that, in step 1, the numbers of images in the 3 sub-data sets satisfy: unlabeled data set > test data set > training data set.
3. The image automatic annotation method based on semi-supervised learning according to claim 1, characterized in that, in step 4.5 and step 5.4, the adaptive weighted fusion strategy is determined according to the contributions of the LDA_SVM classifier and the neural network classifier to the prediction accuracy on the same unlabeled data.
CN201711002595.4A 2017-10-24 2017-10-24 Image automatic annotation method based on semi-supervised learning Pending CN107644235A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711002595.4A CN107644235A (en) 2017-10-24 2017-10-24 Image automatic annotation method based on semi-supervised learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201711002595.4A CN107644235A (en) 2017-10-24 2017-10-24 Image automatic annotation method based on semi-supervised learning

Publications (1)

Publication Number Publication Date
CN107644235A true CN107644235A (en) 2018-01-30

Family

ID=61123785

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711002595.4A Pending CN107644235A (en) 2017-10-24 2017-10-24 Image automatic annotation method based on semi-supervised learning

Country Status (1)

Country Link
CN (1) CN107644235A (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102096825A (en) * 2011-03-23 2011-06-15 西安电子科技大学 Graph-based semi-supervised high-spectral remote sensing image classification method
CN104036021A (en) * 2014-06-26 2014-09-10 广西师范大学 Method for semantically annotating images on basis of hybrid generative and discriminative learning models
CN105279519A (en) * 2015-09-24 2016-01-27 四川航天***工程研究所 Remote sensing image water body extraction method and system based on cooperative training semi-supervised learning
CN106778832A (en) * 2016-11-28 2017-05-31 华南理工大学 The semi-supervised Ensemble classifier method of high dimensional data based on multiple-objection optimization

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
张辰: "Research on Moving Object Detection and Tracking in Complex Environments", 31 August 2014, China University of Mining and Technology Press *
徐美香: "Research on Semi-supervised Multi-label Image Classification Technology", China Master's Theses Full-text Database, Information Science and Technology *
蔡晰 et al.: "Research on Multi-classifier Fusion Strategies Based on Semi-supervised Techniques", Computer Engineering and Applications *

Cited By (56)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108416382B (en) * 2018-03-01 2022-04-19 南开大学 Web image training convolutional neural network method based on iterative sampling and one-to-many label correction
CN108416382A (en) * 2018-03-01 2018-08-17 南开大学 One kind is based on iteration sampling and a pair of of modified Web graph of multi-tag as training convolutional neural networks method
CN110427542A (en) * 2018-04-26 2019-11-08 北京市商汤科技开发有限公司 Sorter network training and data mask method and device, equipment, medium
CN108647264A (en) * 2018-04-28 2018-10-12 北京邮电大学 A kind of image automatic annotation method and device based on support vector machines
CN108647264B (en) * 2018-04-28 2020-10-13 北京邮电大学 Automatic image annotation method and device based on support vector machine
CN108830466A (en) * 2018-05-31 2018-11-16 长春博立电子科技有限公司 A kind of image content semanteme marking system and method based on cloud platform
CN108959431A (en) * 2018-06-11 2018-12-07 中国科学院上海高等研究院 Label automatic generation method, system, computer readable storage medium and equipment
CN108960409A (en) * 2018-06-13 2018-12-07 南昌黑鲨科技有限公司 Labeled data generation method, equipment and computer readable storage medium
CN108960409B (en) * 2018-06-13 2021-08-03 南昌黑鲨科技有限公司 Method and device for generating annotation data and computer-readable storage medium
CN110858327A (en) * 2018-08-24 2020-03-03 宏达国际电子股份有限公司 Method of validating training data, training system and computer program product
CN109325434A (en) * 2018-09-15 2019-02-12 天津大学 A kind of image scene classification method of the probability topic model of multiple features
CN109214463A (en) * 2018-09-25 2019-01-15 合肥优控科技有限公司 A kind of classification of landform method based on coorinated training
CN109389180A (en) * 2018-10-30 2019-02-26 国网四川省电力公司广元供电公司 A power equipment image-recognizing method and inspection robot based on deep learning
CN109359697A (en) * 2018-10-30 2019-02-19 国网四川省电力公司广元供电公司 Graph image recognition methods and inspection system used in a kind of power equipment inspection
CN111126592A (en) * 2018-10-30 2020-05-08 三星电子株式会社 Method and apparatus for outputting prediction result, method and apparatus for generating neural network, and storage medium
CN109460914A (en) * 2018-11-05 2019-03-12 云南大学 Method is determined based on the bridge health grade of semi-supervised error correction study
CN109657087A (en) * 2018-11-30 2019-04-19 平安科技(深圳)有限公司 A kind of batch data mask method, device and computer readable storage medium
CN111340261A (en) * 2018-12-03 2020-06-26 北京嘀嘀无限科技发展有限公司 Method, system, computer device and storage medium for judging order violation behavior
CN111340261B (en) * 2018-12-03 2023-07-18 北京嘀嘀无限科技发展有限公司 Method, system, computer equipment and storage medium for judging order violation
CN111382758B (en) * 2018-12-28 2023-12-26 杭州海康威视数字技术股份有限公司 Training image classification model, image classification method, device, equipment and medium
CN111382758A (en) * 2018-12-28 2020-07-07 杭州海康威视数字技术股份有限公司 Training image classification model, image classification method, device, equipment and medium
CN109784392A (en) * 2019-01-07 2019-05-21 华南理工大学 A kind of high spectrum image semisupervised classification method based on comprehensive confidence
CN110084289B (en) * 2019-04-11 2021-07-27 北京百度网讯科技有限公司 Image annotation method and device, electronic equipment and storage medium
CN110084289A (en) * 2019-04-11 2019-08-02 北京百度网讯科技有限公司 Image labeling method, device, electronic equipment and storage medium
CN110008924A (en) * 2019-04-15 2019-07-12 中国石油大学(华东) A kind of semi-supervised automark method and device towards atural object in Hyperspectral imaging
CN110059217B (en) * 2019-04-29 2022-11-04 广西师范大学 Image text cross-media retrieval method for two-stage network
CN110059217A (en) * 2019-04-29 2019-07-26 广西师范大学 A kind of image text cross-media retrieval method of two-level network
CN110222171A (en) * 2019-05-08 2019-09-10 新华三大数据技术有限公司 A kind of application of disaggregated model, disaggregated model training method and device
CN110110795B (en) * 2019-05-10 2021-04-20 厦门美图之家科技有限公司 Image classification method and device
CN110110795A (en) * 2019-05-10 2019-08-09 厦门美图之家科技有限公司 Image classification method and device
CN110674854B (en) * 2019-09-09 2022-05-17 东软集团股份有限公司 Image classification model training method, image classification method, device and equipment
CN110674854A (en) * 2019-09-09 2020-01-10 东软集团股份有限公司 Image classification model training method, image classification method, device and equipment
CN110765855B (en) * 2019-09-12 2023-04-18 杭州迪英加科技有限公司 Pathological image processing method and system
CN110765855A (en) * 2019-09-12 2020-02-07 杭州迪英加科技有限公司 Pathological image processing method and system
CN110542819B (en) * 2019-09-25 2022-03-22 贵州电网有限责任公司 Transformer fault type diagnosis method based on semi-supervised DBNC
CN110542819A (en) * 2019-09-25 2019-12-06 贵州电网有限责任公司 transformer fault type diagnosis method based on semi-supervised DBNC
CN112580673A (en) * 2019-09-27 2021-03-30 中国石油化工股份有限公司 Seismic reservoir sample expansion method and device based on spatial probability distribution
CN112580673B (en) * 2019-09-27 2024-04-12 中国石油化工股份有限公司 Seismic reservoir sample expansion method and device based on space probability distribution
CN110909803A (en) * 2019-11-26 2020-03-24 腾讯科技(深圳)有限公司 Image recognition model training method and device and computer readable storage medium
CN110909803B (en) * 2019-11-26 2023-04-18 腾讯科技(深圳)有限公司 Image recognition model training method and device and computer readable storage medium
CN111160373A (en) * 2019-12-30 2020-05-15 重庆邮电大学 Method for extracting, detecting and classifying defect image features of variable speed drum parts
CN111506757A (en) * 2020-04-10 2020-08-07 复旦大学 Voice marking device and method based on incremental iteration
CN111489792A (en) * 2020-04-14 2020-08-04 西安交通大学 T cell receptor sequence classification method based on semi-supervised learning framework
CN111563590A (en) * 2020-04-30 2020-08-21 华南理工大学 Active learning method based on generation countermeasure model
CN111861103A (en) * 2020-06-05 2020-10-30 中南民族大学 Fresh tea leaf classification method based on multiple features and multiple classifiers
CN111861103B (en) * 2020-06-05 2024-01-12 中南民族大学 Fresh tea classification method based on multiple features and multiple classifiers
CN111768007B (en) * 2020-06-28 2023-08-08 北京百度网讯科技有限公司 Method and device for mining data
CN111768007A (en) * 2020-06-28 2020-10-13 北京百度网讯科技有限公司 Method and apparatus for mining data
CN113407713A (en) * 2020-10-22 2021-09-17 腾讯科技(深圳)有限公司 Corpus mining method and apparatus based on active learning and electronic device
CN113407713B (en) * 2020-10-22 2024-04-05 腾讯科技(深圳)有限公司 Corpus mining method and device based on active learning and electronic equipment
CN112418304A (en) * 2020-11-19 2021-02-26 北京云从科技有限公司 OCR (optical character recognition) model training method, system and device
CN112668657A (en) * 2020-12-30 2021-04-16 中山大学 Method for detecting out-of-distribution image of attention enhancement based on classifier prediction uncertainty
CN112668657B (en) * 2020-12-30 2023-08-29 中山大学 Attention-enhanced out-of-distribution image detection method based on uncertainty prediction of classifier
CN113554627B (en) * 2021-07-27 2022-04-29 广西师范大学 Wheat head detection method based on computer vision semi-supervised pseudo label learning
CN113554627A (en) * 2021-07-27 2021-10-26 广西师范大学 Wheat head detection method based on computer vision semi-supervised pseudo label learning
CN114155412A (en) * 2022-02-09 2022-03-08 北京阿丘科技有限公司 Deep learning model iteration method, device, equipment and storage medium

Similar Documents

Publication Publication Date Title
CN107644235A (en) Image automatic annotation method based on semi-supervised learning
CN110334705B (en) Language identification method of scene text image combining global and local information
Zheng et al. Topic modeling of multimodal data: an autoregressive approach
Eigen et al. Nonparametric image parsing using adaptive neighbor sets
Farabet et al. Scene parsing with multiscale feature learning, purity trees, and optimal covers
Sun et al. Scene image classification method based on Alex-Net model
CN106126581A (en) Cartographical sketching image search method based on degree of depth study
CN108427740B (en) Image emotion classification and retrieval algorithm based on depth metric learning
CN108154156B (en) Image set classification method and device based on neural topic model
CN109886161A (en) A kind of road traffic index identification method based on possibility cluster and convolutional neural networks
Li et al. Multiple VLAD encoding of CNNs for image classification
Liang et al. Environmental microorganism classification using optimized deep learning model
CN110263174A (en) - subject categories the analysis method based on focus
Li et al. Latent semantic representation learning for scene classification
CN113688894A (en) Fine-grained image classification method fusing multi-grained features
Nguyen et al. Adaptive nonparametric image parsing
Xin et al. Hybrid dilated multilayer faster RCNN for object detection
CN103440332B (en) A kind of image search method strengthening expression based on relational matrix regularization
Gao et al. An improved XGBoost based on weighted column subsampling for object classification
Foumani et al. A probabilistic topic model using deep visual word representation for simultaneous image classification and annotation
Hu et al. Learning salient features for flower classification using convolutional neural network
CN111768214A (en) Product attribute prediction method, system, device and storage medium
Guo Deep learning for visual understanding
Wu et al. Supervised Contrastive Representation Embedding Based on Transformer for Few-Shot Classification
Zhou et al. An improved convolutional neural network model with adversarial net for multi-label image classification

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20180130