CN107169530A - Picture annotation method, apparatus and electronic device - Google Patents
Picture annotation method, apparatus and electronic device
- Publication number
- CN107169530A CN107169530A CN201710431451.4A CN201710431451A CN107169530A CN 107169530 A CN107169530 A CN 107169530A CN 201710431451 A CN201710431451 A CN 201710431451A CN 107169530 A CN107169530 A CN 107169530A
- Authority
- CN
- China
- Prior art keywords
- picture
- marked
- matrix
- default
- feature vector
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
- G06F18/2155—Generating training patterns; Bootstrap methods, e.g. bagging or boosting characterised by the incorporation of unlabelled data, e.g. multiple instance learning [MIL], semi-supervised techniques using expectation-maximisation [EM] or naïve labelling
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- General Physics & Mathematics (AREA)
- Artificial Intelligence (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- Bioinformatics & Computational Biology (AREA)
- General Engineering & Computer Science (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Multimedia (AREA)
- Image Analysis (AREA)
Abstract
An embodiment of the present invention provides a picture annotation method, apparatus and electronic device, relating to the technical field of picture labelling. The method includes: performing feature extraction on an acquired picture to be annotated to obtain the feature vector corresponding to that picture; and then, based on this feature vector and a preset multi-view semi-supervised picture annotation model, obtaining the annotation result for the picture to be annotated. Annotating pictures with a multi-view semi-supervised annotation model in this way gives good noise tolerance, high efficiency and strong stability.
Description
Technical field
The present invention relates to the technical field of picture labelling, and in particular to a picture annotation method, apparatus and electronic device.
Background technology
In the era of multimedia big data, ever more pictures are produced, and most current content-based picture search methods cannot deliver a good experience. Automatic image annotation has therefore become one of the most important research directions in the multimedia field, since it can well support semantic picture retrieval and other picture-management tasks. By combining tags with pictures, automatic annotation converts content-based picture retrieval into text-based picture retrieval. Once picture features and the related semantic labels are obtained, a variety of machine-learning algorithms can be used to fit the labels.
Nowadays, thanks to the development of smartphones and wireless communication networks, pictures are ever easier to acquire and can be shared to the Internet anytime and anywhere. This creates a pressing demand for multimedia applications, including semantic indexing, search, retrieval and other picture-management tasks. Although much work has been done on multimedia content analysis, mainstream search-engine products are still based on text-indexing technology. Against the background of picture big data, the efficiency and stability of picture annotation algorithms therefore remain insufficient.
Summary of the invention
In view of this, the purpose of the embodiments of the present invention is to provide a picture annotation method, apparatus and electronic device that improve on the above problems. To achieve these goals, the present invention adopts the following technical scheme.
In a first aspect, an embodiment of the invention provides a picture annotation method. The method includes: performing feature extraction on an acquired picture to be annotated, obtaining the feature vector corresponding to the picture to be annotated; and, based on that feature vector and a preset multi-view semi-supervised picture annotation model, obtaining the annotation result of the picture to be annotated.
In a second aspect, an embodiment of the invention provides a picture annotation apparatus that includes a feature extraction unit and an annotation unit. The feature extraction unit performs feature extraction on the acquired picture to be annotated, obtaining the feature vector corresponding to the picture to be annotated. The annotation unit, based on the feature vector obtained by the feature extraction unit and the preset multi-view semi-supervised picture annotation model, obtains the annotation result of the picture to be annotated.
In a third aspect, an embodiment of the invention provides an electronic device that includes a processor and a memory, electrically connected through a bus. The memory is used to store a program. The processor calls the program stored in the memory through the bus and executes: performing feature extraction on the acquired picture to be annotated to obtain the corresponding feature vector; and, based on that feature vector and the preset multi-view semi-supervised picture annotation model, obtaining the annotation result of the picture to be annotated.
The embodiments of the invention thus provide a picture annotation method, apparatus and electronic device that perform feature extraction on the acquired picture to be annotated to obtain its feature vector and then, based on that feature vector and a preset multi-view semi-supervised picture annotation model, obtain the annotation result of the picture to be annotated. Annotating pictures with a multi-view semi-supervised annotation model in this way gives good noise tolerance, high efficiency and strong stability.
Other features and advantages of the present invention will be set out in the subsequent description, and in part will become clear from the description or be understood by implementing the embodiments of the invention. The purpose and other advantages of the invention can be realised and obtained through the structure specifically indicated in the written description, the claims and the accompanying drawings.
Brief description of the drawings
In order to illustrate the technical solution of the embodiments of the present invention more clearly, the drawings required in the embodiments are briefly described below. It will be appreciated that the following drawings illustrate only certain embodiments of the present invention and are therefore not to be construed as limiting the scope; those of ordinary skill in the art can derive other related drawings from these drawings without creative work.
Fig. 1 is a structural block diagram of an electronic device provided by an embodiment of the present invention;
Fig. 2 is a flow chart of the picture annotation method provided by the first embodiment of the present invention;
Fig. 3 is a flow chart of obtaining the preset multi-view semi-supervised picture annotation model in the picture annotation method provided by the first embodiment of the present invention;
Fig. 4 is a schematic comparison of the effect of the picture annotation method of the first embodiment, extracting LLC and FK features with NUS-WIDE as training data, against existing algorithms;
Fig. 5 is a schematic comparison of the effect of the method, extracting FC6 and FC7 features with NUS-WIDE as training data, against existing algorithms;
Fig. 6 is a schematic comparison of the effect of the method, extracting LLC and FK features with MIRFLICKR-25000 as training data, against existing algorithms;
Fig. 7 is a schematic comparison of the effect of the method, extracting FC6 and FC7 features with MIRFLICKR-25000 as training data, against existing algorithms;
Fig. 8 is a schematic comparison of the effect of the method, extracting LLC and FK features with IAPRTC-12 as training data, against existing algorithms;
Fig. 9 is a schematic comparison of the effect of the method, extracting FC6 and FC7 features with IAPRTC-12 as training data, against existing algorithms;
Fig. 10 is a structural block diagram of the picture annotation apparatus provided by the second embodiment of the present invention.
Embodiment
The technical scheme in the embodiments of the present invention will now be described clearly and completely in conjunction with the accompanying drawings. Obviously, the described embodiments are only a part of the embodiments of the invention, not all of them. The components of the embodiments, as generally described and illustrated in the drawings herein, can be arranged and designed in a variety of configurations. The following detailed description of the embodiments provided in the drawings is therefore not intended to limit the scope of the claimed invention, but merely represents selected embodiments of the invention. All other embodiments obtained by those skilled in the art on the basis of these embodiments without creative work belong to the scope of protection of the invention.
It should be noted that similar labels and letters represent similar items in the following drawings; once an item has been defined in one drawing, it need not be further defined and explained in subsequent drawings. In the description of the present invention, the terms "first", "second" and so on are used only to distinguish descriptions and are not to be understood as indicating or implying relative importance.
Referring to Fig. 1, Fig. 1 shows a structural block diagram of an electronic device 100 provided by an embodiment of the present invention. The electronic device 100 can serve as a user terminal or as a server. The user terminal can be a PC (personal computer), tablet computer, mobile phone, e-reader, notebook computer, smart television, set-top box, vehicle-mounted terminal or similar terminal device. As shown in Fig. 1, the electronic device 100 can include a memory 110, a storage controller 111, a processor 112, a peripheral interface 113, an input/output unit 115, an audio unit 116 and a display unit 117.
The memory 110, storage controller 111, processor 112, peripheral interface 113, input/output unit 115, audio unit 116 and display unit 117 are electrically connected to one another, directly or indirectly, to realise the transmission or interaction of data. For example, these elements can be electrically connected through one or more communication buses or signal buses. The picture annotation method includes at least one software function module that can be stored in the memory 110 in the form of software or firmware, such as the software function modules or computer programs included in the picture annotation apparatus.
The memory 110 can store various software programs and modules, such as the program instructions/modules corresponding to the picture annotation method and apparatus provided by the embodiments of the present application. The processor 112 executes various function applications and data processing by running the software programs and modules stored in the memory 110, thereby realising the picture annotation method in the embodiments of the present application. The memory 110 can include, but is not limited to, random access memory (RAM), read-only memory (ROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), and the like.
The processor 112 can be an integrated circuit chip with signal processing capability. The processor can be a general-purpose processor, including a central processing unit (CPU), a network processor (NP) and the like; it can also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or discrete hardware components. It can realise or execute each method, step and logic diagram disclosed in the embodiments of the present application. The general-purpose processor can be a microprocessor, or the processor can be any conventional processor.
The peripheral interface 113 couples various input/output devices to the processor 112 and the memory 110. In some embodiments, the peripheral interface 113, processor 112 and storage controller 111 can be realised in a single chip; in other examples they can each be realised by an independent chip.
The input/output unit 115 is used to supply user input data to realise the interaction between the user and the server (or local terminal). The input/output unit 115 may be, but is not limited to, a mouse, a keyboard and the like.
The audio unit 116 provides an audio interface to the user, and may include one or more microphones, one or more loudspeakers and audio circuitry.
The display unit 117 provides an interactive interface (for example a user operation interface) between the server (or local terminal) and the user, or displays image data for the user's reference. In this embodiment, the display unit 117 can be a liquid crystal display or a touch display. A touch display can be a capacitive or resistive touch screen supporting single-point and multi-point touch operation, meaning that the touch display can sense touch operations produced simultaneously at one or more positions on the display and hand the sensed touch operations to the processor for calculation and processing.
It will be appreciated that the structure shown in Fig. 1 is only illustrative; the electronic device 100 may also include more or fewer components than shown in Fig. 1, or have a configuration different from that shown in Fig. 1. Each component shown in Fig. 1 can be realised in hardware, software or a combination thereof.
First embodiment
Referring to Fig. 2, an embodiment of the present invention provides a picture annotation method that includes step S200 and step S210.
Step S200: perform feature extraction on the acquired picture to be annotated, obtaining the feature vector corresponding to the picture to be annotated.
Step S210: based on the feature vector corresponding to the picture to be annotated and the preset multi-view semi-supervised picture annotation model, obtain the annotation result of the picture to be annotated.
Based on step S210, further, the prediction label value of the picture to be annotated is obtained from expression (1), and from it the annotation result of the picture to be annotated; here x_t, t = 1, 2, ..., m are the feature vectors of the picture to be annotated for each view, W_t, t = 1, 2, ..., m are the preset mapping matrices, b_t, t = 1, 2, ..., m are the preset bias terms, and the output of expression (1) is the prediction label value of the picture to be annotated.
Expression (1) is the preset multi-view semi-supervised picture annotation model. Substituting the feature vector corresponding to the picture to be annotated into expression (1) gives the prediction label value of the picture to be annotated, and from the prediction label value the corresponding annotation result is obtained.
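The formula image for expression (1) is not reproduced in this text, so the following is only a minimal sketch of the prediction step it describes: each view feature x_t is mapped through its learned W_t and b_t, and the per-view scores are fused. Averaging the views is an assumption of this sketch; the excerpt does not show how the m views are combined.

```python
import numpy as np

def predict_labels(x_views, W_views, b_views):
    """Predict label scores for one picture from its m view features.

    x_views: list of m feature vectors, x_t with shape (d_t,)
    W_views: list of m learned mapping matrices, W_t with shape (d_t, c)
    b_views: list of m learned bias terms, b_t with shape (c,)
    Returns an averaged label-score vector of shape (c,). Averaging the
    per-view scores is an assumption, not taken from the patent.
    """
    scores = [x @ W + b for x, W, b in zip(x_views, W_views, b_views)]
    return np.mean(scores, axis=0)

# toy example: m = 2 views, c = 3 labels
rng = np.random.default_rng(0)
x_views = [rng.standard_normal(5), rng.standard_normal(4)]
W_views = [rng.standard_normal((5, 3)), rng.standard_normal((4, 3))]
b_views = [np.zeros(3), np.zeros(3)]
y_hat = predict_labels(x_views, W_views, b_views)
```

The labels finally reported for the picture would be derived from `y_hat`, e.g. by thresholding or taking the top-ranked entries.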
Based on step S200, the method may also include: reducing the dimensionality of the feature vector corresponding to the picture to be annotated by principal component analysis, obtaining the reduced feature vector. The reduced feature vector is then substituted into the preset multi-view semi-supervised picture annotation model to obtain the annotation result of the picture to be annotated.
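The dimensionality-reduction step above can be sketched with a plain NumPy principal component analysis; the function below is illustrative only and is not taken from the patent.

```python
import numpy as np

def pca_reduce(X, n_components):
    """Reduce feature vectors X (n_samples, d) to n_components dimensions
    via principal component analysis: centre the data, take the leading
    right singular vectors, and project onto them."""
    Xc = X - X.mean(axis=0)
    # SVD of the centred data gives the principal directions in Vt
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

# e.g. reduce 50-dimensional vectors to 10 dimensions
X = np.random.default_rng(1).standard_normal((100, 50))
Z = pca_reduce(X, 10)
```

In the experiments described later, the same kind of projection reduces SIFT descriptors from 128 to 50 dimensions and FK vectors from 25,600 to 4096.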
Referring to Fig. 3, in order to obtain the preset multi-view semi-supervised picture annotation model, the method may also include steps S300, S310, S320, S330 and S340.
Step S300: perform feature extraction on the n acquired pictures, obtaining the multi-view features and label matrices corresponding to the n pictures.
Taking the n acquired pictures as training data, feature extraction is performed on each, obtaining the m view features corresponding to each picture and hence n × m multi-view features for the n pictures.
Let X^t denote the t-th view feature over the n pictures, where x_i^t is the t-th view feature of the i-th picture and d_t is the dimension of the t-th view feature.
The first l pictures in the training data are labelled; the remaining n − l pictures are unlabelled. The label matrix related to the t-th view feature is Y^t ∈ {0, 1}^{n×c}, where c is the number of labels; for 1 ≤ i ≤ l, row y_i^t holds the labels of a labelled picture, and for l + 1 ≤ i ≤ n, y_i^t is the all-zero vector, standing for an unlabelled picture. Let y_ij^t denote the j-th class of the i-th picture related to the t-th view feature: y_ij^t = 1 when the i-th picture belongs to the j-th class, and y_ij^t = 0 otherwise; if the i-th picture has no label, y_ij^t = 0. In this way both labelled and unlabelled pictures are used to learn the multi-view semi-supervised picture annotation model, improving stability (robustness).
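A minimal sketch of the label-matrix construction described above, with the first l pictures labelled and the remaining rows all-zero. The function name `build_label_matrix` is illustrative, not from the patent.

```python
import numpy as np

def build_label_matrix(labels, n, c):
    """Build Y in {0,1}^{n x c}: rows 0..l-1 carry the (possibly multiple)
    labels of the l labelled pictures; rows l..n-1, for the unlabelled
    pictures, stay all-zero.

    labels: list of label-index lists for the first l labelled pictures.
    """
    Y = np.zeros((n, c))
    for i, label_set in enumerate(labels):   # only the l labelled pictures
        for j in label_set:
            Y[i, j] = 1.0                    # picture i belongs to class j
    return Y

# 5 pictures, 3 labels; pictures 4 and 5 are unlabelled
Y = build_label_matrix([[0], [2], [0, 1]], n=5, c=3)
```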
Step S310: based on the multi-view features corresponding to the n pictures and the preset similarity calculation rule, obtain the similarity matrix corresponding to the n pictures.
Based on step S310, further, the similarity matrix S = [S_ij], 1 ≤ i, j ≤ n, corresponding to the n pictures is obtained, where x_i and x_j (1 ≤ i, j ≤ n) are the multi-view features of the i-th and j-th pictures, N_k(x_i) is the set of the k nearest neighbours of x_i, and N_q(x_j) is the set of the q nearest neighbours of x_j.
The similarity matrix is built from the multi-view features. The definition of S_ij is the preset similarity calculation rule and reflects the feature similarity between the multi-view features x_i and x_j of two pictures. To reduce the number of parameters, the embodiment of the present invention defines the similarity matrix as above.
Step S320: obtain the diagonal matrix derived from the similarity matrix corresponding to the n pictures.
Step S330: subtract the similarity matrix corresponding to the n pictures from the diagonal matrix, obtaining the Laplacian matrix corresponding to the n pictures.
Based on step S310, the diagonal matrix D is obtained, whose i-th diagonal element is the sum of the i-th row of S. Then L = D − S is calculated, giving the Laplacian matrix L corresponding to the n pictures.
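Steps S310 to S330 can be sketched as follows. The patent's exact S_ij formula is an image not reproduced in this excerpt, so a Gaussian kernel restricted to nearest neighbours is assumed here; the Laplacian L = D − S follows the text directly.

```python
import numpy as np

def knn_similarity_laplacian(X, k=5, sigma=1.0):
    """Build a nearest-neighbour similarity matrix S and the graph
    Laplacian L = D - S, where D is the diagonal degree matrix.

    The patent defines S_ij through neighbour sets N_k(x_i) and N_q(x_j);
    the exact kernel is not shown in this excerpt, so a Gaussian kernel on
    the k nearest neighbours is an assumption of this sketch.
    """
    n = X.shape[0]
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)  # pairwise sq. distances
    S = np.zeros((n, n))
    for i in range(n):
        nbrs = np.argsort(d2[i])[1:k + 1]                # k nearest (skip self)
        S[i, nbrs] = np.exp(-d2[i, nbrs] / (2 * sigma ** 2))
    S = np.maximum(S, S.T)        # symmetrise: x_j near x_i OR x_i near x_j
    D = np.diag(S.sum(axis=1))    # diagonal matrix of row sums (step S320)
    return S, D - S               # L = D - S (step S330)

X = np.random.default_rng(2).standard_normal((20, 4))
S, L = knn_similarity_laplacian(X)
```

A quick sanity check on any graph Laplacian of this form: every row sums to zero, since D_ii is exactly the i-th row sum of S.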
Step S340: bring the label matrices and Laplacian matrix corresponding to the n pictures into the preset objective function and iterate, obtaining the preset mapping matrices and the preset bias terms.
Further, in order to use labelled and unlabelled pictures simultaneously, the embodiment of the present invention defines F ∈ R^{n×c} as the label-prediction matrix of all training data, where the i-th row of F is the predicted label of the i-th picture. As one embodiment, following semi-supervised learning methods, F can be obtained by solving the minimisation problem of objective function (2).
In expression (2), U is a diagonal matrix called the decision-rule matrix. If the i-th picture is labelled, its diagonal element U_ii is set to a very large number (10^10); otherwise U_ii = 1. The decision-rule matrix is set in this way so that the solved label-prediction matrix F stays consistent with Y.
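The decision-rule matrix U described above can be sketched directly:

```python
import numpy as np

def build_decision_matrix(n, l, big=1e10):
    """Diagonal decision-rule matrix U: U_ii is a very large number (10^10
    in the text) for the first l labelled pictures, forcing the solved F
    to agree with Y on those rows, and 1 for the unlabelled pictures."""
    u = np.ones(n)
    u[:l] = big
    return np.diag(u)

# 5 training pictures, of which the first 2 are labelled
U = build_decision_matrix(n=5, l=2)
```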
In order to further improve the noise tolerance of the obtained preset multi-view semi-supervised picture annotation model, the embodiment of the present invention proposes a robust loss function that adapts to different levels of noise. Selecting the l_{2,p} loss function, expression (2) can be rewritten in the form of expression (3).
Expression (3) is the preset objective function, where ||·||_{2,p} is the l_{2,p} norm of a matrix, ||·||_F is the Frobenius norm of a matrix, the symbol (·)^T denotes the transpose of a matrix, Tr(·) denotes the trace of a matrix, μ and γ are preset balance parameters, W_t is the preset mapping matrix, b_t is the preset bias term, the norm of W_t is the regularisation term, and 1_n is the all-ones vector. The l_{2,p} norm of a matrix M is defined in expression (4), in which M_i is the i-th row of M.
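The l_{2,p} norm of expression (4) can be computed as a direct transcription of the definition: the l_p norm of the vector of row-wise l_2 norms.

```python
import numpy as np

def l2p_norm(M, p):
    """l_{2,p} norm of a matrix:
    ||M||_{2,p} = (sum_i ||M_i||_2^p)^(1/p),
    where M_i is the i-th row of M. For p = 2 this is the Frobenius norm;
    for p < 2 large row residuals are penalised less, which is the source
    of the robustness to noisy rows discussed in the text."""
    row_norms = np.linalg.norm(M, axis=1)
    return (row_norms ** p).sum() ** (1.0 / p)

M = np.array([[3.0, 4.0], [0.0, 0.0], [5.0, 12.0]])
# row l2 norms are 5, 0 and 13, so the l_{2,1} norm is 18
val = l2p_norm(M, 1.0)
```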
Further, for the t-th view feature of the n pictures, the embodiment of the present invention can compute the associated Laplacian matrix L_t from the view feature X^t and then, correspondingly, use expression (3) to obtain the view-dependent prediction label matrix F_t. Accordingly, the present invention incorporates multi-view feature learning into expression (3) to reconcile the related and complementary information in the different view features and so obtain a better result. The proposed objective function jointly minimises over all view features while constraining, as far as possible, the F_t of each view feature, i.e. expression (5).
In expression (5), λ is a preset balance parameter; the coupling term makes the output results of each pair of view features more consistent, reaching a better effect. The model combines the advantages of multi-view learning and graph-based semi-supervised learning, effectively exploiting the side information in a large amount of unlabelled data and in the different views.
Because of the non-convexity of the l_{2,p} loss function and the l_{2,p} regularisation term, directly solving expression (5) is not easy. To simplify the computation, the embodiment of the present invention proposes an efficient iterative algorithm: expression (5) is first rewritten as expression (6).
In expression (6), a diagonal matrix appears whose i-th diagonal element can be computed by formula (7); in formula (7), the subscript i denotes the i-th row of the corresponding matrix. Similarly, a second diagonal matrix appears whose diagonal elements can be obtained by formula (8); in formula (8), the subscript i denotes the i-th row of the matrix F_t − F_s.
Because these diagonal matrices all depend on F_t, W_t and b_t, formula (6) remains difficult to solve directly. Accordingly, the embodiment of the present invention designs an iterative method: the diagonal matrices are fixed at their values from the previous iteration to break through the obstacle, after which F_t, W_t and b_t can be solved from formula (6).
Setting the derivative of formula (6) with respect to b_t to zero yields formula (9).
Substituting formula (9) into expression (6) and setting the derivative of expression (6) with respect to W_t to zero yields:
Wt = AtFt (10)
where A_t is given by formula (11). Substituting formulas (9) and (10) back into expression (6) gives expression (13); setting the derivative of its objective function with respect to F_t to zero yields:
Ft = MtQt (14)
where M_t and Q_t are given by formulas (15) and (16), with the stated convention applied when t = s, for t = 1, 2, ..., m. Accordingly, the optimal solution for F_t, W_t and b_t can be obtained by solving the objective function.
Specifically, the label matrices and Laplacian matrix corresponding to the n pictures are brought into expression (5), and expression (5) is solved iteratively: F_t, W_t, b_t (t = 1, 2, ..., m) are randomly initialised and then iteratively updated towards the optimal solution until the iteration converges. A preferred convergence condition is that the change between the two most recent solutions does not exceed a preset threshold. In this way the preset mapping matrices and the preset bias terms are obtained.
In each iteration, the two diagonal reweighting matrices are first computed according to formulas (7) and (8). Then, for each view (t = 1, 2, ..., m) in turn, H_t is computed according to formula (12), A_t according to formula (11), M_t according to formula (15) and Q_t according to formula (16), and F_t, W_t and b_t are updated according to formulas (14), (10) and (9) respectively. The optimal F_t, W_t, b_t so obtained constitute the multi-view semi-supervised picture annotation model, i.e.:
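The full solver alternates the reweighting and closed-form updates of formulas (7)-(16), whose images are not reproduced in this excerpt. The sketch below therefore solves only a simplified special case: a single view, Frobenius loss (p = 2) and no cross-view coupling (λ = 0), for which both the F-update and the (W, b) regression have closed forms. It illustrates the structure of one outer step, not the patent's exact algorithm; all names and default parameters are assumptions.

```python
import numpy as np

def fit_single_view(X, Y, L, l, mu=1.0, gamma=1.0, big=1e10):
    """Simplified closed-form special case of the patent's objective:
    min_F  Tr(F^T L F) + mu * Tr((F - Y)^T U (F - Y)),
    followed by a gamma-regularised regression of F onto X to recover
    (W, b). The full method iterates reweighted versions of these
    updates for the l_{2,p} loss and m coupled views."""
    n, d = X.shape
    # decision-rule matrix: huge weight on the first l labelled rows
    U = np.diag(np.concatenate([np.full(l, big), np.ones(n - l)]))
    # closed-form label propagation: (L + mu U) F = mu U Y
    F = np.linalg.solve(L + mu * U, mu * (U @ Y))
    # centred ridge regression of F on X gives W, and b from the means
    Xm, Fm = X.mean(axis=0), F.mean(axis=0)
    Xc, Fc = X - Xm, F - Fm
    W = np.linalg.solve(Xc.T @ Xc + gamma * np.eye(d), Xc.T @ Fc)
    b = Fm - Xm @ W
    return F, W, b

# toy run: 6 pictures, first l = 3 labelled, fully connected graph
rng = np.random.default_rng(3)
X = rng.standard_normal((6, 4))
Y = np.zeros((6, 2)); Y[0, 0] = Y[1, 0] = Y[2, 1] = 1.0
S = np.ones((6, 6)) - np.eye(6)
L_graph = np.diag(S.sum(axis=1)) - S
F, W, b = fit_single_view(X, Y, L_graph, l=3)
```

Because U_ii is huge on the labelled rows, the propagated F reproduces Y there almost exactly, which is the behaviour the decision-rule matrix is designed to enforce.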
In addition, in order to further illustrate the beneficial effect of the picture annotation method provided by the embodiment of the present invention, three web image data sets are used: NUS-WIDE (269,648 real-scene pictures, annotated with 81 labels), MIRFLICKR-25000 (25,000 pictures and 24 labels) and IAPRTC-12 (20,000 representative static natural-scene pictures from around the world). Each data set is divided into two parts, one used for training and the other for testing.
For feature extraction, this embodiment first uses SIFT local descriptors and extracts two kinds of visual features of a picture, based on two encoding schemes: LLC (locality-constrained linear coding) and FK (improved Fisher kernel encoding). The final dimension of the LLC feature vector is k (the vocabulary size); this embodiment sets k = 4096. For FK features, the final feature-vector dimension equals 2d·k, where d is the dimension of the SIFT descriptor and k is the vocabulary size. In this embodiment, the dimension of the SIFT descriptor is reduced from 128 to 50 by principal component analysis (PCA). The FK feature dimension is then 25,600, which is reduced to 4096 by PCA to save computation. The final LLC and FK feature vectors are thus both 4096-dimensional. The embodiment also extracts two new deep-learning features, FC6 and FC7 (each 4096-dimensional), obtained using Caffe from the outputs of the sixth and seventh fully connected layers of the network.
Through experiments, this embodiment sets different parameter values for different data sets to reach the best performance.
For data set NUS-WIDE, the parameters are set as follows:
μ = 10^6, γ = 10^4, λ = 10^4, p = 0.8, q = 1.9
For data set MIRFLICKR-25000, the parameters are set as follows:
μ = 10^6, γ = 10^2, λ = 10^2, p = 0.4, q = 1.5
For data set IAPR TC-12, the parameters are set as follows:
μ = 10^6, γ = 10^4, λ = 10^4, p = 1.0, q = 1.6
Output results are obtained with the picture annotation method provided by the embodiment of the present invention, and this embodiment uses mean average precision (MAP) to measure picture annotation performance. As shown in Fig. 4 and Fig. 5, with NUS-WIDE (269,648 real-scene pictures, annotated with 81 labels) as training data, the results drawn by the picture annotation method provided by the embodiment and by existing algorithms are compared using the extracted LLC, FK, FC6 and FC7 feature vectors respectively. The existing algorithms include the typical multi-view learning algorithm CCA (combined with least-squares regression (LS) or a support vector machine (SVM), defined as CCA-LS and CCA-SVM respectively), a new LS-based multi-view semi-supervised dimensionality-reduction method (MVSSDR-LS), two semi-supervised algorithms, namely Structural Feature Selection with Sparsity (SFSS) and Flexible Manifold Embedding (FME), and a new algorithm of the boosting family, TaylorBoost. In Fig. 4, the abscissa indicates the number of labelled pictures (1xc means one per class) and the ordinate the mean average precision; A1 is the effect of the embodiment's method with the extracted LLC feature vectors, A2 with the extracted FK feature vectors, A3 is the effect of the SFSS algorithm, A4 of MVSSDR-LS, A5 of CCA-LS, A6 of TaylorBoost, A7 of FME and A8 of CCA-SVM. In Fig. 5, D1 is the effect of the embodiment's method with the extracted FC6 feature vectors, D2 with the extracted FC7 feature vectors, D3 is the effect of SFSS, D4 of MVSSDR-LS, D5 of CCA-LS, D6 of TaylorBoost, D7 of FME and D8 of CCA-SVM. The picture annotation method provided by the embodiment clearly brings a large improvement in effect.
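Mean average precision, the metric used above, can be sketched as follows: for each label, pictures are ranked by predicted score and the precision at each relevant picture's rank is averaged; the per-label values are then averaged over labels. This is an illustrative implementation, not taken from the patent.

```python
import numpy as np

def average_precision(scores, relevant):
    """Average precision for one label: rank pictures by descending
    predicted score, then average the precision at each rank where a
    relevant picture appears."""
    order = np.argsort(-scores)
    rel = relevant[order]
    if not rel.any():
        return 0.0
    hits = np.cumsum(rel)                       # relevant items seen so far
    return (hits[rel] / (np.flatnonzero(rel) + 1)).mean()

def mean_average_precision(score_mat, rel_mat):
    """MAP: mean of the per-label average precisions over all c labels."""
    return float(np.mean([average_precision(score_mat[:, j], rel_mat[:, j])
                          for j in range(score_mat.shape[1])]))

# a perfect ranking for one label gives AP = 1.0
ap = average_precision(np.array([0.9, 0.8, 0.1]),
                       np.array([True, True, False]))
```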
Similarly, as shown in Fig. 6 and Fig. 7, with MIRFLICKR-25000 (25,000 pictures and 24 labels) as training data, the results drawn by the picture annotation method provided by the embodiment and by the existing algorithms are compared using the extracted LLC, FK, FC6 and FC7 feature vectors respectively. In Fig. 6, the abscissa indicates the number of labelled pictures (1xc means one per class) and the ordinate the mean average precision; B1 is the effect of the embodiment's method with the extracted LLC feature vectors, B2 with the extracted FK feature vectors, B3 is the effect of the SFSS algorithm, B4 of MVSSDR-LS, B5 of CCA-LS, B6 of TaylorBoost, B7 of FME and B8 of CCA-SVM. In Fig. 7, E1 is the effect of the embodiment's method with the extracted FC6 feature vectors, E2 with the extracted FC7 feature vectors, E3 is the effect of SFSS, E4 of MVSSDR-LS, E5 of CCA-LS, E6 of TaylorBoost, E7 of FME and E8 of CCA-SVM. The picture annotation method provided by the embodiment again clearly brings a large improvement in effect.
Similarly, as shown in Fig. 8 and Fig. 9, with IAPRTC-12 as the training data, the results of the picture annotation method provided by the embodiment of the present invention are compared with those of the existing algorithms using the extracted LLC, FK, FC6, and FC7 feature vectors respectively. In Fig. 8, the abscissa indicates the number of labeled pictures (1xc denoting one per class) and the ordinate the average accuracy; C1 is the effect of the picture annotation method provided by the embodiment of the present invention with the extracted LLC feature vectors, C2 the effect of the method with the extracted FK feature vectors, C3 the effect of the SFSS algorithm, C4 the effect of the MVSSDR-LS algorithm, C5 the effect of the CCA-LS algorithm, C6 the effect of the TaylorBoost algorithm, C7 the effect of the FME algorithm, and C8 the effect of the CCA-SVM algorithm. In Fig. 9, F1 is the effect of the picture annotation method provided by the embodiment of the present invention with the extracted FC6 feature vectors, F2 the effect of the method with the extracted FC7 feature vectors, F3 the effect of the SFSS algorithm, F4 the effect of the MVSSDR-LS algorithm, F5 the effect of the CCA-LS algorithm, F6 the effect of the TaylorBoost algorithm, F7 the effect of the FME algorithm, and F8 the effect of the CCA-SVM algorithm. Clearly, the picture annotation method provided by the embodiment of the present invention brings a substantial improvement in effect.
Using the three different image data sets above as training data, the comparison under multi-view conditions between the picture annotation method provided by the present invention and the known algorithms shows that the method is suitable for picture annotation under multi-view conditions with only a small number of labels, has good noise tolerance, and converges within a relatively small number of iterations.
In the picture annotation method provided by the embodiment of the present invention, feature extraction is performed on an acquired picture to be annotated to obtain the feature vector corresponding to that picture; the annotation result of the picture is then obtained based on that feature vector and a preset multi-view semi-supervised picture annotation model. Annotating pictures with a multi-view semi-supervised annotation model in this way gives good noise tolerance, high efficiency, and strong stability.
Second embodiment
Referring to Fig. 10, an embodiment of the present invention provides a picture annotation apparatus 400. The apparatus 400 includes an extraction unit 410, a similarity matrix obtaining unit 420, a diagonal matrix obtaining unit 430, a Laplacian matrix obtaining unit 440, a computing unit 450, a feature extraction unit 470, and an annotation unit 480.
The extraction unit 410 performs feature extraction on n acquired pictures to obtain the multi-view features and the label matrix corresponding to the n pictures.
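As a non-limiting illustration of the label matrix paired with the multi-view features, the sketch below (Python/NumPy) builds an n x c matrix with one row per picture and one column per label class; the function name and the 0/1 encoding are assumptions, since this text does not specify an encoding.

```python
import numpy as np

def label_matrix(picture_labels, classes):
    """Build the n x c label matrix paired with the multi-view
    features: row i holds a 1 in column j when picture i carries
    label classes[j], and 0 otherwise."""
    Y = np.zeros((len(picture_labels), len(classes)))
    for i, labels in enumerate(picture_labels):
        for lab in labels:
            Y[i, classes.index(lab)] = 1.0
    return Y
```

In this encoding an unlabeled picture simply keeps an all-zero row, which is how the semi-supervised training data would mix labeled and unlabeled pictures.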
The similarity matrix obtaining unit 420 obtains the similarity matrix corresponding to the n pictures based on the multi-view features of the n pictures obtained by the extraction unit 410 and a preset similarity calculation rule.
As one embodiment, the similarity matrix obtaining unit 420 may include a similarity matrix obtaining subunit 421.

The similarity matrix obtaining subunit 421 obtains the similarity matrix corresponding to the n pictures based on a rule of the form

S_ij = exp(−‖x_i − x_j‖² / σ²) if x_j ∈ N_k(x_i) or x_i ∈ N_q(x_j), and S_ij = 0 otherwise,

where S = [S_ij] (1 ≤ i, j ≤ n) is the similarity matrix corresponding to the n pictures, x_i and x_j (1 ≤ i, j ≤ n) are the multi-view features corresponding to the i-th and j-th of the n pictures, N_k(x_i) is the set of the k nearest neighbors of x_i, N_q(x_j) is the set of the q nearest neighbors of x_j, and σ is a bandwidth parameter.
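A minimal sketch of this nearest-neighbor similarity construction (Python/NumPy). The Gaussian weight and the bandwidth `sigma` are assumptions, as the exact preset rule appears in the patent only as a formula image; only the k/q nearest-neighbor sets are taken from the text.

```python
import numpy as np

def similarity_matrix(X, k=5, q=5, sigma=1.0):
    """Build the n x n similarity matrix S over feature vectors X
    (one row per picture): connect i and j when one is among the
    other's k (resp. q) nearest neighbors, weighted by a Gaussian
    of their distance."""
    n = X.shape[0]
    # pairwise squared Euclidean distances
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    order = np.argsort(d2, axis=1)
    nn_k = order[:, 1:k + 1]   # k nearest neighbors, self excluded
    nn_q = order[:, 1:q + 1]   # q nearest neighbors, self excluded
    S = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if j in nn_k[i] or i in nn_q[j]:
                S[i, j] = np.exp(-d2[i, j] / (sigma ** 2))
    return S
```

With k equal to q the resulting matrix is symmetric, which the Laplacian construction below expects.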
The diagonal matrix obtaining unit 430 obtains a diagonal matrix whose diagonal element values are the row sums of the similarity matrix corresponding to the n pictures.

The Laplacian matrix obtaining unit 440 subtracts the similarity matrix corresponding to the n pictures from that diagonal matrix to obtain the Laplacian matrix corresponding to the n pictures.
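The operations of units 430 and 440 can be sketched together: form the diagonal (degree) matrix from the row sums of S, then subtract S to obtain the unnormalized graph Laplacian.

```python
import numpy as np

def graph_laplacian(S):
    """Degree matrix D from the row sums of the similarity matrix S,
    then the unnormalized graph Laplacian L = D - S."""
    D = np.diag(S.sum(axis=1))
    return D - S
```

Each row of L sums to zero, the defining property of a graph Laplacian.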
The computing unit 450 substitutes the label matrix and the Laplacian matrix corresponding to the n pictures into a preset objective function and performs iterative calculation to obtain the preset mapping matrices and the preset bias terms.
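The preset objective function itself is not reproduced in this text. As a hedged stand-in, the sketch below performs the iterative calculation by gradient descent on a generic Laplacian-regularized least-squares objective of the kind such semi-supervised models use, shown for a single view; everything except X (features), Y (label matrix), and L (Laplacian matrix) is an assumption, not the patent's actual objective.

```python
import numpy as np

def fit_view(X, Y, L, alpha=0.1, beta=0.01, lr=1e-3, iters=300):
    """Iteratively compute a mapping matrix W and bias b for one view
    by gradient descent on a stand-in objective
        J(W, b) = ||F - Y||_F^2 + alpha * tr(F^T L F) + beta * ||W||_F^2,
    with F = X W + 1 b^T (X: n x d features, Y: n x c label matrix,
    L: n x n graph Laplacian)."""
    n, d = X.shape
    c = Y.shape[1]
    W = np.zeros((d, c))
    b = np.zeros(c)
    for _ in range(iters):
        F = X @ W + b                                # current predictions
        G = 2.0 * (F - Y) + 2.0 * alpha * (L @ F)    # dJ/dF
        W -= lr * (X.T @ G + 2.0 * beta * W)         # dJ/dW
        b -= lr * G.sum(axis=0)                      # dJ/db
    return W, b
```

The Laplacian term penalizes predictions that differ across similar pictures, which is how the unlabeled pictures influence W and b.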
The feature extraction unit 470 performs feature extraction on an acquired picture to be annotated to obtain the feature vector corresponding to the picture to be annotated.

The annotation unit 480 obtains the annotation result of the picture to be annotated based on the feature vector obtained by the feature extraction unit and a preset multi-view semi-supervised picture annotation model.
As one embodiment, the annotation unit 480 may include an annotation subunit 481.

The annotation subunit 481 obtains the prediction label value of the picture to be annotated, and thereby its annotation result, based on a rule of the form

ŷ = (1/m) Σ_{t=1}^{m} (W_tᵀ X_t + b_t),

where X_t (t = 1, 2, …, m) are the feature vectors corresponding to the picture to be annotated in the m views, W_t (t = 1, 2, …, m) are the preset mapping matrices, b_t (t = 1, 2, …, m) are the preset bias terms, and ŷ is the prediction label value of the picture to be annotated.
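A sketch of the prediction step. The averaging combination over the m views and the 0.5 threshold are assumptions, since the combination rule appears in the patent only as a formula image.

```python
import numpy as np

def predict(view_features, mappings, biases):
    """Combine the m per-view linear predictions W_t^T x_t + b_t into
    one predicted label-score vector by averaging over the views."""
    m = len(view_features)
    total = sum(W.T @ x + b
                for x, W, b in zip(view_features, mappings, biases))
    return total / m

def annotate(scores, label_names, threshold=0.5):
    """Turn predicted scores into an annotation result: keep the
    labels whose score clears the threshold."""
    return [name for name, s in zip(label_names, scores) if s >= threshold]
```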
The apparatus 400 may further include a dimensionality reduction unit 460.

The dimensionality reduction unit 460 reduces the dimensionality of the feature vector corresponding to the picture to be annotated by principal component analysis, obtaining the reduced-dimension feature vector.
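A minimal from-scratch sketch of the principal component analysis reduction performed by unit 460; in practice a library routine such as scikit-learn's PCA would typically be used.

```python
import numpy as np

def pca_reduce(X, n_components):
    """Reduce the rows of X to n_components dimensions with principal
    component analysis: center the data, take the top right singular
    vectors of the centered matrix, and project onto them."""
    Xc = X - X.mean(axis=0)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T
```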
Each of the above units may be implemented in software code, in which case the units may be stored in the memory 110; they may equally be implemented in hardware such as an integrated circuit chip.
The technical effect of the picture annotation apparatus 400 provided by the embodiment of the present invention, and its realization principle, are the same as those of the foregoing method embodiment. For brevity, where the apparatus embodiment does not mention a point, refer to the corresponding content of the foregoing method embodiment.
In the several embodiments provided in the present application, it should be understood that the disclosed apparatus and method may also be realized in other ways. The apparatus embodiments described above are merely schematic; for example, the flowcharts and block diagrams in the accompanying drawings show the possible architectures, functions, and operations of the apparatuses, methods, and computer program products according to multiple embodiments of the present invention. Each block in a flowchart or block diagram may represent a module, program segment, or part of code that contains one or more executable instructions for realizing the specified logical function. It should also be noted that in some alternative implementations, the functions marked in the blocks may occur in an order different from that marked in the drawings; for example, two consecutive blocks may in fact be executed substantially in parallel, or sometimes in the opposite order, depending on the functions involved. It should further be noted that each block of the block diagrams and/or flowcharts, and combinations of such blocks, may be realized by a dedicated hardware-based system that performs the specified function or action, or by a combination of dedicated hardware and computer instructions.
In addition, the functional modules in the embodiments of the present invention may be integrated to form an independent part, each module may exist separately, or two or more modules may be integrated to form an independent part.
If the functions are realized in the form of software function modules and sold or used as an independent product, they may be stored in a computer-readable storage medium. Based on this understanding, the technical scheme of the present invention, in essence, or the part contributing to the prior art, or a part of the technical scheme, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes a number of instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the steps of the methods described in the embodiments of the present invention. The foregoing storage medium includes various media capable of storing program code, such as a USB flash disk, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc. It should be noted that, herein, relational terms such as first and second are used merely to distinguish one entity or operation from another, and do not necessarily require or imply any such actual relation or order between those entities or operations. Moreover, the terms "comprise", "include", and any variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or device including a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or device. In the absence of further limitations, an element defined by the phrase "including a ..." does not exclude the existence of other identical elements in the process, method, article, or device that includes that element.
The foregoing are merely preferred embodiments of the present invention and are not intended to limit the invention; for those skilled in the art, the present invention may have various modifications and variations. Any modification, equivalent substitution, improvement, and the like made within the spirit and principles of the present invention shall be included within the scope of protection of the present invention. It should be noted that similar labels and letters represent similar items in the following drawings; therefore, once an item is defined in one drawing, it need not be further defined or explained in subsequent drawings. The foregoing is merely a specific embodiment of the present invention, but the scope of protection of the present invention is not limited thereto; any person familiar with the technical field can readily conceive of changes or substitutions within the technical scope disclosed by the present invention, and these shall all be covered within the scope of protection of the present invention. Therefore, the scope of protection of the present invention shall be defined by the scope of the claims.
Claims (10)
1. A picture annotation method, characterized in that the method comprises:
performing feature extraction on an acquired picture to be annotated to obtain a feature vector corresponding to the picture to be annotated; and
obtaining an annotation result of the picture to be annotated based on the feature vector corresponding to the picture to be annotated and a preset multi-view semi-supervised picture annotation model.
2. The method according to claim 1, characterized in that obtaining the annotation result of the picture to be annotated based on the feature vector corresponding to the picture to be annotated and the preset multi-view semi-supervised picture annotation model comprises:
obtaining the prediction label value of the picture to be annotated, and thereby the annotation result of the picture to be annotated, based on a rule of the form
ŷ = (1/m) Σ_{t=1}^{m} (W_tᵀ X_t + b_t),
wherein X_t (t = 1, 2, …, m) are the feature vectors corresponding to the picture to be annotated in the m views, W_t (t = 1, 2, …, m) are preset mapping matrices, b_t (t = 1, 2, …, m) are preset bias terms, and ŷ is the prediction label value of the picture to be annotated.
3. The method according to claim 2, characterized in that the method further comprises:
performing feature extraction on n acquired pictures to obtain multi-view features and a label matrix corresponding to the n pictures;
obtaining a similarity matrix corresponding to the n pictures based on the multi-view features corresponding to the n pictures and a preset similarity calculation rule;
obtaining a diagonal matrix whose diagonal element values are the row sums of the similarity matrix corresponding to the n pictures;
subtracting the similarity matrix corresponding to the n pictures from the diagonal matrix to obtain a Laplacian matrix corresponding to the n pictures; and
substituting the label matrix and the Laplacian matrix corresponding to the n pictures into a preset objective function and performing iterative calculation to obtain the preset mapping matrices and the preset bias terms.
4. The method according to claim 3, characterized in that obtaining the similarity matrix corresponding to the n pictures based on the multi-view features corresponding to the n pictures and the preset similarity calculation rule comprises:
obtaining the similarity matrix corresponding to the n pictures based on a rule of the form
S_ij = exp(−‖x_i − x_j‖² / σ²) if x_j ∈ N_k(x_i) or x_i ∈ N_q(x_j), and S_ij = 0 otherwise,
wherein S = [S_ij] (1 ≤ i, j ≤ n) is the similarity matrix corresponding to the n pictures, x_i and x_j (1 ≤ i, j ≤ n) are the multi-view features corresponding to the i-th and j-th of the n pictures, N_k(x_i) is the set of the k nearest neighbors of x_i, N_q(x_j) is the set of the q nearest neighbors of x_j, and σ is a bandwidth parameter.
5. The method according to claim 1, characterized in that the method further comprises:
reducing the dimensionality of the feature vector corresponding to the picture to be annotated by principal component analysis to obtain a reduced-dimension feature vector.
6. A picture annotation apparatus, characterized in that the apparatus comprises:
a feature extraction unit configured to perform feature extraction on an acquired picture to be annotated to obtain a feature vector corresponding to the picture to be annotated; and
an annotation unit configured to obtain an annotation result of the picture to be annotated based on the feature vector obtained by the feature extraction unit and a preset multi-view semi-supervised picture annotation model.
7. The apparatus according to claim 6, characterized in that the annotation unit comprises:
an annotation subunit configured to obtain the prediction label value of the picture to be annotated, and thereby the annotation result of the picture to be annotated, based on a rule of the form
ŷ = (1/m) Σ_{t=1}^{m} (W_tᵀ X_t + b_t),
wherein X_t (t = 1, 2, …, m) are the feature vectors corresponding to the picture to be annotated in the m views, W_t (t = 1, 2, …, m) are preset mapping matrices, b_t (t = 1, 2, …, m) are preset bias terms, and ŷ is the prediction label value of the picture to be annotated.
8. The apparatus according to claim 6, characterized in that the apparatus further comprises:
an extraction unit configured to perform feature extraction on n acquired pictures to obtain multi-view features and a label matrix corresponding to the n pictures;
a similarity matrix obtaining unit configured to obtain a similarity matrix corresponding to the n pictures based on the multi-view features corresponding to the n pictures obtained by the extraction unit and a preset similarity calculation rule;
a diagonal matrix obtaining unit configured to obtain a diagonal matrix whose diagonal element values are the row sums of the similarity matrix corresponding to the n pictures;
a Laplacian matrix obtaining unit configured to subtract the similarity matrix corresponding to the n pictures from the diagonal matrix to obtain a Laplacian matrix corresponding to the n pictures; and
a computing unit configured to substitute the label matrix and the Laplacian matrix corresponding to the n pictures into a preset objective function and perform iterative calculation to obtain the preset mapping matrices and the preset bias terms.
9. The apparatus according to claim 8, characterized in that the similarity matrix obtaining unit comprises:
a similarity matrix obtaining subunit configured to obtain the similarity matrix corresponding to the n pictures based on a rule of the form
S_ij = exp(−‖x_i − x_j‖² / σ²) if x_j ∈ N_k(x_i) or x_i ∈ N_q(x_j), and S_ij = 0 otherwise,
wherein S = [S_ij] (1 ≤ i, j ≤ n) is the similarity matrix corresponding to the n pictures, x_i and x_j (1 ≤ i, j ≤ n) are the multi-view features corresponding to the i-th and j-th of the n pictures, N_k(x_i) is the set of the k nearest neighbors of x_i, and N_q(x_j) is the set of the q nearest neighbors of x_j.
10. An electronic device, characterized in that the electronic device comprises a processor and a memory, the processor and the memory being electrically connected by a bus; the memory is configured to store a program; and the processor is configured to call, via the bus, the program stored in the memory and to perform:
performing feature extraction on an acquired picture to be annotated to obtain a feature vector corresponding to the picture to be annotated; and
obtaining an annotation result of the picture to be annotated based on the feature vector corresponding to the picture to be annotated and a preset multi-view semi-supervised picture annotation model.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710431451.4A CN107169530A (en) | 2017-06-09 | 2017-06-09 | Mask method, device and the electronic equipment of picture |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710431451.4A CN107169530A (en) | 2017-06-09 | 2017-06-09 | Mask method, device and the electronic equipment of picture |
Publications (1)
Publication Number | Publication Date |
---|---|
CN107169530A true CN107169530A (en) | 2017-09-15 |
Family
ID=59825571
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710431451.4A Pending CN107169530A (en) | 2017-06-09 | 2017-06-09 | Mask method, device and the electronic equipment of picture |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107169530A (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107976992A (en) * | 2017-11-29 | 2018-05-01 | 东北大学 | Industrial process big data fault monitoring method based on graph semi-supervised support vector machine
CN108509959A (en) * | 2018-04-13 | 2018-09-07 | 广州优视网络科技有限公司 | Pornographic application identification method, apparatus, computer-readable storage medium and server
CN110032914A (en) * | 2018-01-12 | 2019-07-19 | 北京京东尚科信息技术有限公司 | Method and apparatus for annotating a picture
US11494595B2 | 2018-06-15 | 2022-11-08 | Tencent Technology (Shenzhen) Company Limited | Method, apparatus, and storage medium for annotating image
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20040264769A1 (en) * | 2003-06-30 | 2004-12-30 | Xerox Corporation | Systems and methods for associating color profiles with a scanned input image using spatial attributes |
CN103593357A (en) * | 2012-08-15 | 2014-02-19 | 富士通株式会社 | Semi-supervised feature transformation method and device |
CN103955462A (en) * | 2014-03-21 | 2014-07-30 | 南京邮电大学 | Image marking method based on multi-view and semi-supervised learning mechanism |
- 2017-06-09: Application CN201710431451.4A (CN) filed; published as CN107169530A; legal status: Pending
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20040264769A1 (en) * | 2003-06-30 | 2004-12-30 | Xerox Corporation | Systems and methods for associating color profiles with a scanned input image using spatial attributes |
CN103593357A (en) * | 2012-08-15 | 2014-02-19 | 富士通株式会社 | Semi-supervised feature transformation method and device |
CN103955462A (en) * | 2014-03-21 | 2014-07-30 | 南京邮电大学 | Image marking method based on multi-view and semi-supervised learning mechanism |
Non-Patent Citations (2)
Title |
---|
MENGQIU HU, YANG YANG et al.: "Multi-view Semi-supervised Learning for Web Image", ACM International Conference on Multimedia *
SHI CAIJUAN (史彩娟): "Research on Semi-supervised Sparse Feature Selection Algorithms for Image Annotation in Cyberspace", China Doctoral Dissertations Full-text Database, Information Science and Technology Series *
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107976992A (en) * | 2017-11-29 | 2018-05-01 | 东北大学 | Industrial process big data fault monitoring method based on graph semi-supervised support vector machine |
CN107976992B (en) * | 2017-11-29 | 2020-01-21 | 东北大学 | Industrial process big data fault monitoring method based on graph semi-supervised support vector machine |
CN110032914A (en) * | 2018-01-12 | 2019-07-19 | 北京京东尚科信息技术有限公司 | Method and apparatus for annotating a picture |
CN108509959A (en) * | 2018-04-13 | 2018-09-07 | 广州优视网络科技有限公司 | Pornographic application identification method, apparatus, computer-readable storage medium and server |
US11494595B2 | 2018-06-15 | 2022-11-08 | Tencent Technology (Shenzhen) Company Limited | Method, apparatus, and storage medium for annotating image |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110321477B (en) | Information recommendation method and device, terminal and storage medium | |
CN109871446A (en) | Rejection method for identifying, electronic device and storage medium in intention assessment | |
CN107169530A (en) | Mask method, device and the electronic equipment of picture | |
CN107679447A (en) | Facial characteristics point detecting method, device and storage medium | |
CN104572735B (en) | A kind of picture mark words recommending method and device | |
Liao et al. | An image retrieval method for binary images based on DBN and softmax classifier | |
CN108170755A (en) | Cross-module state Hash search method based on triple depth network | |
CN109902672A (en) | Image labeling method and device, storage medium, computer equipment | |
CN110827112B (en) | Deep learning commodity recommendation method and device, computer equipment and storage medium | |
CN110427480B (en) | Intelligent personalized text recommendation method and device and computer readable storage medium | |
US20230030419A1 (en) | Machine Learning Model Training Method and Device and Electronic Equipment | |
CN108319888A (en) | The recognition methods of video type and device, terminal | |
CN110503459A (en) | User credit degree appraisal procedure, device and storage medium based on big data | |
Wang et al. | CLARE: A joint approach to label classification and tag recommendation | |
CN110363206A (en) | Cluster, data processing and the data identification method of data object | |
CN104199838B (en) | A kind of user model constructing method based on label disambiguation | |
CN110866042A (en) | Intelligent table query method and device and computer readable storage medium | |
CN112948575A (en) | Text data processing method, text data processing device and computer-readable storage medium | |
CN110347789A (en) | Text is intended to intelligent method for classifying, device and computer readable storage medium | |
CN111784372A (en) | Store commodity recommendation method and device | |
Xu et al. | Multi‐pyramid image spatial structure based on coarse‐to‐fine pyramid and scale space | |
CN109299887A (en) | A kind of data processing method, device and electronic equipment | |
CN108519986A (en) | A kind of webpage generating method, device and equipment | |
CN111797622B (en) | Method and device for generating attribute information | |
Tang et al. | An efficient concept detection system via sparse ensemble learning |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
RJ01 | Rejection of invention patent application after publication |
Application publication date: 20170915 |