CN110533184A - Training method and device for a network model - Google Patents

Training method and device for a network model

Info

Publication number
CN110533184A
Authority
CN
China
Prior art keywords
network
trained
feature
feature vector
sample image
Prior art date
Legal status
Granted
Application number
CN201910819512.3A
Other languages
Chinese (zh)
Other versions
CN110533184B (en)
Inventor
张慧中
朱亚旋
郭少博
Current Assignee
Nanjing Institute Of Artificial Intelligence Co Ltd
Original Assignee
Nanjing Institute Of Artificial Intelligence Co Ltd
Priority date
Filing date
Publication date
Application filed by Nanjing Institute Of Artificial Intelligence Co Ltd
Priority to CN201910819512.3A
Publication of CN110533184A
Application granted
Publication of CN110533184B
Legal status: Active
Anticipated expiration

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 - Pattern recognition
    • G06F 18/20 - Analysing
    • G06F 18/24 - Classification techniques
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/04 - Architecture, e.g. interconnection topology
    • G06N 3/045 - Combinations of networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/08 - Learning methods
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 - Arrangements for image or video recognition or understanding
    • G06V 10/40 - Extraction of image or video features
    • G06V 10/44 - Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 40/00 - Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/10 - Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Human Computer Interaction (AREA)
  • Image Analysis (AREA)

Abstract

Disclosed are a training method and device for a network model, comprising: determining, by a network to be trained, a first feature vector corresponding to a sample image; dividing the sample image into two or more region images; determining, by a supervision network, two or more second feature vectors from the two or more region images; determining a first loss value of the network to be trained based on the first feature vector and the two or more second feature vectors; and, if the first loss value satisfies a first preset condition, adjusting the weight parameters of the network to be trained. Through mutual-learning training between the first feature vector and the second feature vectors, the feature vectors produced by the network to be trained come to correspond more closely to specific local regions of the target object, so the trained network can accurately perform recognition based on local regions of the target object.

Description

Training method and device for a network model
Technical field
The present disclosure relates to the field of image analysis technology, and in particular to a training method and device for a network model.
Background
In some image recognition scenarios, the target object in an image may be partially occluded, so a complete feature map and feature vector of the object cannot be obtained by analyzing the image with a network model; analysis and recognition must instead be completed using the feature maps and feature vectors of whatever local regions of the object remain visible.
However, when a network model parses an image into a feature map, information from different parts of the image becomes mixed to some degree, so the local regions of the object cannot be cleanly located on the feature map. As a result, existing network models cannot accurately extract the feature map and feature vector corresponding to a local region of the object in the image.
Summary
To address the above technical problem, the present disclosure is proposed. Embodiments of the disclosure provide a training method and device for a network model. During training, mutual learning is carried out between a first feature vector obtained from the whole sample image and second feature vectors obtained from region images, so that the trained network can accurately extract feature vectors corresponding to specific local parts of the object in an image.
According to a first aspect of the disclosure, a training method for a network model is provided, comprising:
determining, by a network to be trained, a first feature vector corresponding to a sample image;
dividing the sample image into two or more region images;
determining, by a supervision network, two or more second feature vectors from the two or more region images;
determining a first loss value of the network to be trained based on the first feature vector and the two or more second feature vectors; and
if the first loss value satisfies a first preset condition, adjusting the weight parameters of the network to be trained.
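The steps of the first aspect can be sketched end to end. The following is a minimal NumPy sketch under stated assumptions: the two networks are stand-in callables mapping an image to a fixed-length vector, the L2 loss is one of the two functions the patent names, and the function names, shapes, and threshold convention are hypothetical illustrations, not the patent's implementation.

```python
import numpy as np

def l2_loss(u, v):
    # squared L2 distance between two feature vectors
    return float(np.sum((u - v) ** 2))

def training_step(sample_image, student, teacher, split_fn, threshold):
    # student  : network to be trained  -> first feature vector
    # teacher  : supervision network    -> one vector per region image
    # split_fn : divides the sample image into two or more region images
    first_vec = student(sample_image)                   # step 1: first feature vector
    regions = split_fn(sample_image)                    # step 2: region images
    second_vecs = [teacher(r) for r in regions]         # step 3: second feature vectors
    # step 4: first loss value = mean distance to the region vectors
    first_loss = float(np.mean([l2_loss(first_vec, v) for v in second_vecs]))
    # step 5: 'first preset condition' modeled here as a simple threshold
    needs_update = first_loss > threshold
    return first_loss, needs_update
```

When `needs_update` is true, a real implementation would backpropagate the loss to adjust the weight parameters of the network to be trained.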
According to a second aspect of the disclosure, a training device for a network model is provided, comprising:
a first-feature-vector determining module, configured to determine, by a network to be trained, a first feature vector corresponding to a sample image;
an image division module, configured to divide the sample image into two or more region images;
a second-feature-vector determining module, configured to determine, by a supervision network, two or more second feature vectors from the two or more region images;
a first-loss module, configured to determine a first loss value of the network to be trained based on the first feature vector and the two or more second feature vectors; and
a parameter adjustment module, configured to adjust the weight parameters of the network to be trained when the first loss value satisfies a first preset condition.
According to a third aspect of the disclosure, a computer-readable storage medium is provided. The storage medium stores a computer program for executing the network model training method of the first aspect.
According to a fourth aspect of the disclosure, an electronic device is provided, comprising a processor and a memory storing instructions executable by the processor;
the processor reads the executable instructions from the memory and executes them to implement the network model training method of the first aspect.
In the training method and device provided by the disclosure, the sample image is divided into at least two region images and the second feature vectors are determined from the region images, so that each second feature vector corresponds closely to a specific local region of the target object. Through mutual-learning training between the first feature vector and the second feature vectors, the feature vectors produced by the network to be trained likewise come to correspond to specific local regions of the object, so the trained network can accurately perform recognition based on local regions of the target object.
Brief description of the drawings
The above and other objects, features and advantages of the disclosure will become apparent from the following detailed description of embodiments with reference to the accompanying drawings. The drawings provide further understanding of the embodiments, form part of the specification, and serve to explain the disclosure together with the embodiments; they do not limit the disclosure. In the drawings, identical reference labels generally denote identical components or steps.
Fig. 1 is a structural diagram of the network model training system provided by an exemplary embodiment of the disclosure;
Fig. 2 is a flow diagram of the network model training method provided by an exemplary embodiment of the disclosure;
Fig. 3 is a flow diagram of the network model training method provided by an exemplary embodiment of the disclosure;
Fig. 4 is a diagram of the model structure involved in the network model training method provided by an exemplary embodiment of the disclosure;
Fig. 5 is a diagram of dividing a sample image into region images in the network model training method provided by an exemplary embodiment of the disclosure;
Fig. 6 is a flow diagram of the network model training method provided by an exemplary embodiment of the disclosure;
Fig. 7 is a diagram of the object recognition flow involved in the network model training method provided by an exemplary embodiment of the disclosure;
Fig. 8 is a flow diagram of the network model training method provided by an exemplary embodiment of the disclosure;
Fig. 9 is a flow diagram of the network model training method provided by an exemplary embodiment of the disclosure;
Fig. 10 is a structural diagram of the network model training device provided by an exemplary embodiment of the disclosure;
Fig. 11 is a structural diagram of the first-loss module in the network model training device provided by an exemplary embodiment of the disclosure;
Fig. 12 is a structural diagram of the second-feature-vector determining module in the network model training device provided by an exemplary embodiment of the disclosure;
Fig. 13 is a structural diagram of the first-feature-vector determining module in the network model training device provided by an exemplary embodiment of the disclosure;
Fig. 14 is a structural diagram of the network model training device provided by an exemplary embodiment of the disclosure;
Fig. 15 is a structural diagram of the initial to-be-trained-network training module in the network model training device provided by an exemplary embodiment of the disclosure;
Fig. 16 is a structural diagram of the initial supervision-network training module in the network model training device provided by an exemplary embodiment of the disclosure;
Fig. 17 is a structural diagram of the electronic device provided by an exemplary embodiment of the disclosure.
Detailed description
Exemplary embodiments of the disclosure are described in detail below with reference to the drawings. Obviously, the described embodiments are only a subset of the embodiments of the disclosure, not all of them; it should be understood that the disclosure is not limited by the exemplary embodiments described here.
Application overview
The object to be recognized in this disclosure is usually a person, though in other cases it may be an animal, a plant, a vehicle, or any of various other objects. In some cases the object in an image may be partially occluded, so a complete feature map and feature vector of the object cannot be obtained by analyzing the image with a network model; analysis and recognition must instead be completed using the feature maps and feature vectors of whatever local regions of the object are available.
However, when a network model parses an image into a feature map, information from different parts of the image becomes mixed to some degree, so the specific local regions of the object cannot be cleanly located on the feature map. This mixing is especially evident in feature maps produced by convolution operations in a convolutional neural network (Convolutional Neural Networks, CNN).
Take the case where the object is a person and the specific local region is the head. In the image itself, all pixels depicting the head are concentrated within one clear range. But because of information mixing, the features of the head in the feature map are not confined to a single range; and within the range where head features are relatively concentrated, features of other nearby local regions (such as the shoulders) are mixed in as well.
In this situation, existing network models cannot accurately extract the feature map corresponding to a specific local region of the object, and the feature vector further determined from that feature map likewise cannot support accurate analysis and recognition of that local part of the object.
In the training method and device provided by the disclosure, mutual learning is carried out during training between the first feature vector obtained from the whole sample image and the second feature vectors obtained from the region images, so that the trained network can accurately extract feature vectors corresponding to specific local parts of the object in an image.
Exemplary system
Fig. 1 is a structural diagram of the network model training system of this disclosure. In this system, a network to be trained and a supervision network are trained jointly. After training is complete, the network to be trained can be applied in an object recognition process.
The network model is a computational model built on convolutional neural networks (Convolutional Neural Networks, CNN) and includes the network to be trained and the supervision network. The system receives a large number of input sample images for training. The network to be trained applies a series of operations such as convolution, pooling and dimensionality reduction to each whole sample image, obtaining a first feature vector corresponding to the whole target object in the sample image. In parallel, the supervision network applies the same kinds of operations to the region images divided from the sample image, obtaining second feature vectors corresponding to local regions of the target object in the sample image.
It should be noted that the sample image contains a specific target object, while each region image divided from it contains only a specific local region of that object. Consequently, the second feature vector of a region image contains only the feature information of that local region; feature information of other local regions cannot be mixed in. This means each second feature vector corresponds cleanly to one specific local region of the object, avoiding the mixing of feature information across local regions.
Thus, in this system, mutual-learning training between the first feature vector and the second feature vectors lets the network to be trained acquire from the supervision network the property that feature vectors correspond closely to specific local regions of the object, so that the network to be trained can accurately perform recognition based on local regions of the object.
Exemplary methods
Fig. 2 is a flow diagram of the network model training method provided by an exemplary embodiment of the disclosure. This embodiment can be applied on an electronic device, and comprises the following steps.
Step 201: determine, by the network to be trained, the first feature vector corresponding to a sample image.
As described above, the network to be trained is one part of the network model and may be a computational network built on convolutional neural networks. Preferably, its network structure is a resnet50 backbone model. The purpose of this embodiment's technical scheme is to train the network to be trained so that, after training is complete, it can more accurately perform recognition based on local regions of the target object.
The sample images are the large set of images used in the training process, each containing a specific target object.
The network to be trained takes a sample image as input, parses it, and obtains the first feature vector corresponding to the whole target object in the sample image. Determining the first feature vector may involve a series of operations such as convolution, pooling and dimensionality reduction; these are conventional parsing procedures in the CNN field and are not detailed here.
Step 202: divide the sample image into two or more region images.
The object to be recognized is usually a person, though in other cases it may be an animal, plant, vehicle or other object; this step takes a person as the example. The division is based on the person: specifically, human-body keypoint information can be used to divide the sample image into two or more region images, each containing one local region of the object. Dividing a sample image using body keypoint information is existing art and is not detailed here.
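As one illustration of the kind of division step 202 describes, the sketch below crops an image into horizontal bands at rows derived from body keypoints. The patent does not specify the cropping scheme; the function name, the horizontal-band assumption, and the example bounds are hypothetical.

```python
import numpy as np

def split_by_keypoints(image, row_bounds):
    # image: H x W (or H x W x C) array; row_bounds: y-coordinates taken
    # from body keypoints (e.g. the neck and hip rows). Each crop then
    # contains exactly one local region of the person.
    bounds = [0] + sorted(row_bounds) + [image.shape[0]]
    return [image[a:b] for a, b in zip(bounds, bounds[1:]) if b > a]
```

With neck and hip rows as bounds, this yields head, torso and legs crops, each feeding the supervision network separately.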
Step 203: determine, by the supervision network, two or more second feature vectors from the two or more region images.
As described above, the supervision network is also part of the network model and may be a computational network built on convolutional neural networks. Preferably, its network structure is a resnet101 backbone model.
The supervision network takes the region images obtained above as input and parses each one, obtaining the second feature vector corresponding to the local region of the object in that region image. That is, this step produces one second feature vector for each region image, i.e. for each local region of the object.
Determining a second feature vector may likewise involve a series of operations such as convolution, pooling and dimensionality reduction; these are conventional parsing procedures in the CNN field and are not detailed here.
It will be understood that, because each region image contains only a specific local region of the target object, the corresponding second feature vector contains only the feature information of that local region; feature information of other local regions cannot be mixed in. This means the second feature vectors determined by the supervision network correspond cleanly to specific local regions of the object, avoiding the mutual mixing of feature information across local regions.
Step 204: determine the first loss value of the network to be trained based on the first feature vector and the two or more second feature vectors.
After the first and second feature vectors are determined, a loss value between them can be computed according to a specific loss function and taken as the first loss value of the network to be trained. The first loss value reflects how far the network to be trained deviates from the supervision network; it can also be regarded as a measure of the convergence and consistency between the two networks.
The loss function in this embodiment may specifically be an l2 distance loss function or a KL divergence loss function, or any other function achieving a similar effect. The concrete form of the loss function is not limited; any function achieving the same or a similar effect can be incorporated into the overall technical scheme of this embodiment.
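The two loss functions step 204 names can be sketched directly. This is a minimal NumPy sketch; applying softmax before the KL divergence is an added assumption (KL is defined on probability distributions, and the patent does not say how the vectors are normalized), and the function names are illustrative.

```python
import numpy as np

def l2_distance_loss(u, v):
    # plain Euclidean (l2) distance between the two feature vectors
    return float(np.linalg.norm(u - v))

def kl_divergence_loss(u, v, eps=1e-12):
    # KL divergence between softmax distributions of the two vectors;
    # eps guards the log against zero probabilities
    def softmax(x):
        e = np.exp(x - np.max(x))
        return e / e.sum()
    p, q = softmax(u), softmax(v)
    return float(np.sum(p * np.log((p + eps) / (q + eps))))
```

Both losses are zero when the two vectors agree and grow as they diverge, which is the property the first loss value needs in order to measure convergence between the two networks.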
Step 205: if the first loss value satisfies a first preset condition, adjust the weight parameters of the network to be trained.
As explained above, the first feature vector determined by the network to be trained does not initially correspond well to specific local regions of the object, whereas the second feature vectors determined by the supervision network from the region images do, avoiding the mutual mixing of feature information across local regions. Mutual-learning training between the first and second feature vectors therefore lets the network to be trained acquire the supervision network's property that feature vectors correspond closely to specific local regions of the object, remedying its shortcoming so that it can accurately perform recognition based on local regions of the object.
The first loss value between the first and second feature vectors indicates whether the network to be trained and the supervision network have sufficient convergence and consistency, and the first preset condition in this step is the criterion for that judgment. The first preset condition can be regarded as a numeric threshold range set for the first loss value. If the first loss value satisfies the first preset condition, i.e. falls within that threshold range, the two networks do not yet have sufficient convergence and consistency, and the weight parameters of the network to be trained are adjusted.
The training process of the network to be trained can thus also be regarded as many repeated adjustments of the weight parameters. The specific adjustment method belongs to the basic principles of CNN machine learning and is not detailed here.
When the first loss value satisfies the first preset condition, the weight parameters of the supervision network may also be adjusted at the same time, so that mutual learning is realized between the network to be trained and the supervision network, leading to sufficient convergence and consistency.
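The simultaneous adjustment of both networks can be illustrated in toy form. In this sketch each feature vector is nudged toward the other; a real implementation would instead backpropagate the first loss value through the weight parameters of both networks. The function name and learning-rate convention are hypothetical.

```python
import numpy as np

def mutual_learning_step(first_vec, second_vec, lr=0.1):
    # While the first loss value still satisfies the first preset
    # condition, pull each vector a fraction lr toward the other,
    # mimicking the simultaneous weight adjustment of both networks.
    delta = second_vec - first_vec
    return first_vec + lr * delta, second_vec - lr * delta
```

Repeated application shrinks the gap between the two vectors, which is the convergence and consistency the embodiment aims for.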
As can be seen from the above technical scheme, the benefit of this embodiment is: by dividing the sample image into at least two region images and determining the second feature vectors from the region images, each second feature vector corresponds closely to a specific local region of the target object; through mutual-learning training between the first and second feature vectors, the feature vectors produced by the network to be trained likewise come to correspond to specific local regions of the object, so the trained network can accurately perform recognition based on local regions of the target object.
Fig. 2 shows only the basic embodiment of the disclosed method; further embodiments can be obtained by optimizing and extending it.
Fig. 3 is a flow diagram of the network model training method provided by another exemplary embodiment of the disclosure. This embodiment can be applied on an electronic device, and explains the method more concretely with a practical application scenario.
In this embodiment, the structure of the network model is shown in Fig. 4; the components in Fig. 4 and the interactions between them are described concretely within the method flow of this embodiment. As shown in Fig. 3, this embodiment comprises the following steps.
Step 301: determine, based on the backbone layer of the network to be trained, the first feature map corresponding to the sample image.
Take the case where the object in the sample image is a person. In this step the sample image is first fed into the backbone layer of the network to be trained for parsing. In this embodiment the network to be trained is specifically a computational network built on convolutional neural networks, so the backbone layer may consist of at least one convolutional layer of the network; preferably, it is a resnet50 backbone. The backbone layer's parsing of the sample image is a convolution process. Parsing an image into a feature map with a convolutional neural network is well known in the art and is not detailed here, nor does this embodiment limit the specific structure of the backbone layer.
After the backbone parsing, a first global feature map corresponding to the whole sample image is obtained. In this step the network to be trained may optionally further divide the first global feature map using human-body keypoint information; this division is existing art and is not detailed here.
Preferably in this embodiment, body keypoint information can be used to divide the first global feature map into 7 first local feature maps: "upper body", "lower body", "head", "chest", "abdomen", "thighs", and "calves and feet". That is, the first feature map in this embodiment may comprise 1 first global feature map and 7 first local feature maps.
However, the first global feature map in this step is determined entirely from the sample image. So, as with the problem in the prior art, a degree of information mixing occurs while the first global feature map is being determined: on the first global feature map, and on the first local feature maps divided from it, the feature information does not correspond cleanly to specific local regions of the object.
Step 302: apply global max pooling to the first feature map, based on the pooling layer of the network to be trained, to obtain the initial first feature vector.
Step 303: reduce the dimensionality of the initial first feature vector, based on the embedding layer of the network to be trained, to obtain the first feature vector.
In the CNN field, the process of determining the first feature vector from the first feature map can be roughly described as follows: apply Global Max Pooling (GMP) to the first feature map to obtain the initial first feature vector, then reduce its dimensionality through an Embedding Layer (EM) to obtain the first feature vector. This is a conventional parsing procedure in the CNN field and is not detailed here.
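The GMP-then-embedding pipeline of steps 302-303 can be sketched as two small functions. This is a minimal NumPy sketch: the (C, H, W) layout and the modeling of the embedding layer as a single linear projection are assumptions, and the function names are illustrative.

```python
import numpy as np

def global_max_pool(feature_map):
    # GMP: collapse a (C, H, W) feature map to a length-C vector by
    # keeping the maximum activation of each channel
    return feature_map.max(axis=(1, 2))

def embed(pooled_vec, projection):
    # the embedding layer modeled as one linear projection that reduces
    # the pooled vector to the final feature dimension
    return projection @ pooled_vec
```

In a trained network the projection matrix would be a learned weight of the embedding layer rather than a fixed constant.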
In steps 302-303, the above procedure can specifically yield 1 first global feature vector from the first global feature map, and 7 first local feature vectors from the 7 first local feature maps respectively. Combining the first global feature vector with the first local feature vectors then determines the first feature vector.
It should also be noted that, when combining the first global feature vector with each first local feature vector, weighting or a Ghost-VLAD algorithm may preferably be used, so that the whole and each local part carry different emphasis within the first feature vector.
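The simple weighting variant of that combination can be sketched as follows. This is an illustrative NumPy sketch of the weighting option only (the Ghost-VLAD alternative is not shown); the equal treatment of the local vectors and the function name are assumptions.

```python
import numpy as np

def combine_features(global_vec, local_vecs, global_weight=0.5):
    # blend the global vector with the mean of the per-part local
    # vectors, so the whole and the parts carry different emphasis
    local_mean = np.mean(local_vecs, axis=0)
    return global_weight * global_vec + (1.0 - global_weight) * local_mean
```

Raising `global_weight` emphasizes the whole-image appearance; lowering it emphasizes the part-level features, which matters when the object is partially occluded.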
Step 304: divide the sample image into two or more region images.
Body keypoint information can be used to divide the sample image in this step. Specifically, based on the structure of the target object (a person), the sample image can be divided into 7 region images: "upper body", "lower body", "head", "chest", "abdomen", "thighs", and "calves and feet", as shown in Fig. 5. Obviously, each region image contains one local region of the object, and this division of the sample image corresponds one-to-one with the division of the first global feature map described above.
Step 305: based on the backbone layer of the supervision network, determine the second feature maps corresponding to the two or more region images, obtaining two or more second feature maps.

The seven region images above are first fed into the backbone layer of the supervision network for parsing. In this embodiment, the supervision network is a computation network built on a convolutional neural network, so the backbone layer can consist of at least one convolutional layer of the convolutional neural network. Preferably, the backbone layer can be a ResNet-101 backbone.
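A backbone producing one feature map per region image can be reduced to its essence — at least one convolutional layer with a nonlinearity. The single-layer numpy convolution below is a stand-in for a real ResNet-101; the channel counts and kernel size are assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)

def conv2d(x, w, stride=2):
    # minimal valid convolution: x (C_in, H, W), w (C_out, C_in, k, k)
    c_out, c_in, k, _ = w.shape
    out_h = (x.shape[1] - k) // stride + 1
    out_w = (x.shape[2] - k) // stride + 1
    out = np.zeros((c_out, out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            patch = x[:, i * stride:i * stride + k, j * stride:j * stride + k]
            out[:, i, j] = np.tensordot(w, patch, axes=3)
    return np.maximum(out, 0)  # ReLU

region = rng.standard_normal((3, 32, 16))    # one region image (C, H, W)
w = rng.standard_normal((8, 3, 3, 3)) * 0.1  # one assumed conv layer
feature_map = conv2d(region, w)              # one second feature map for this region
```

In practice one would load a pretrained ResNet-101 from a deep-learning library rather than hand-rolling the convolution.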
Each region image can be parsed to obtain a corresponding second feature map. The process of obtaining a feature map by parsing an image with the backbone layer is similar to that described in step 301 and is not repeated here.

Clearly, parsing the seven region images shown in Fig. 5 determines seven corresponding second feature maps. In some other cases, the sample image itself can additionally be input to the supervision network, so that the supervision network parses the sample image as a whole and determines a corresponding second feature map.

In this embodiment it is therefore preferred that the supervision network parses both the whole sample image and the seven region images shown in Fig. 5, determining eight corresponding second feature maps in total. It can be understood that when the supervision network parses a region image, since the region image contains only a specific local region of the target object, the corresponding second feature map likewise contains only the "feature information" of that specific local region; "feature information" of other parts cannot be mixed into it.

For example, when the supervision network parses the region image of the "head", the second feature map of the "head" can be determined. Because of the division of the sample image, the region image of the "head" contains only information representing the "head" of the target object, and no information representing other parts of the object (such as the chest). Therefore, when the second feature map of the "head" is determined, information about other parts of the object (such as the chest) cannot be mixed into it.

This also means that the second feature maps that the supervision network determines for the region images can correspond well to the specific local regions of the target object, avoiding the problem of "feature information" from different local regions being mixed together.
Step 306: based on the pooling layer of the supervision network, apply global max pooling to the two or more second feature maps respectively, obtaining two or more initial second feature vectors.

Step 307: based on the embedding layer of the supervision network, reduce the dimensionality of the two or more initial second feature vectors, obtaining two or more second feature vectors.

In steps 306–307, the process of determining a feature vector from a feature map is similar to that described in steps 302–303 and is not repeated here.

It should be noted that, since this embodiment involves eight second feature maps in total, global max pooling and dimensionality reduction can be applied to each second feature map respectively to obtain a corresponding second reference vector, yielding eight second reference vectors. The second feature vector can then be determined by combining all of these second reference vectors. It should also be noted that when the second reference vectors are combined, weighting or the GhostVLAD algorithm may preferably be used, so that the whole and each local part carry different emphases within the resulting second feature vector.
Step 308: using preset loss functions, determine the distance loss value and divergence loss value between the first feature vector and the second feature vector.

In this step, the loss functions can specifically be a distance loss function and a divergence loss function, where the distance loss function can be the L2 distance loss and the divergence loss function can be the KL divergence loss. In other cases, other loss functions may of course be selected, or further combined with these.

Using the distance loss function, the distance loss value between the first feature vector and the second feature vector can be computed. Likewise, using the divergence loss function, the divergence loss value between the first feature vector and the second feature vector can be computed.
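The two loss values can be sketched as below. Treating each feature vector as a distribution via softmax before the KL term is an assumption made for the example; the patent does not fix how the vectors are normalized.

```python
import numpy as np

def l2_distance_loss(f1, f2):
    # squared L2 distance between the two feature vectors
    return float(np.sum((f1 - f2) ** 2))

def kl_divergence_loss(f1, f2, eps=1e-8):
    # assumed normalization: softmax each vector into a distribution, then KL(p || q)
    p = np.exp(f1 - f1.max()); p /= p.sum()
    q = np.exp(f2 - f2.max()); q /= q.sum()
    return float(np.sum(p * np.log((p + eps) / (q + eps))))

f1 = np.array([0.2, 1.5, -0.3])  # stand-in first feature vector
f2 = np.array([0.1, 1.4, -0.2])  # stand-in second feature vector
first_loss = l2_distance_loss(f1, f2) + kl_divergence_loss(f1, f2)  # e.g. summed
```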
Step 309: determine the first loss value according to the distance loss value and the divergence loss value.

Based on the distance loss value and the divergence loss value (or, in other cases, further combined with loss values computed from other loss functions), the first loss value may be determined by summation, weighted summation, or other related calculations.

Step 310: if the first loss value meets the first preset condition, adjust the weight parameters of the network to be trained.

As can be seen from the above technical solution, this embodiment describes the training method of the network model in more detail in combination with a practical application scenario.
As shown in Fig. 6, which is a flow diagram of the training method of the network model provided by another exemplary embodiment of the present disclosure, this embodiment can be applied to an electronic device. In this embodiment, the method includes the following steps:

Step 601: determine the first feature vector corresponding to the sample image by means of the network to be trained.

Step 602: divide the sample image into two or more region images.

Step 603: determine two or more second feature vectors from the two or more region images by means of the supervision network.

Step 604: determine the first loss value of the network to be trained based on the first feature vector and the two or more second feature vectors.

Step 605: if the first loss value meets the first preset condition, adjust the weight parameters of the network to be trained.

Steps 601–605 above have the same meaning as the corresponding steps in the embodiment shown in Fig. 2 and are not repeated here; the related description of the embodiment shown in Fig. 2 applies equally to this embodiment. On the basis of the embodiment shown in Fig. 2, this embodiment further includes:
Step 606: if the first loss value does not meet the first preset condition, perform object recognition using the network to be trained.

If the first loss value does not meet the first preset condition, it means that the network to be trained and the supervision network have sufficient convergence and consistency; in other words, the network to be trained has learned from the supervision network the property that "the feature vector can correspond well to specific local regions of the target object". The training of the network to be trained is then considered complete, and the network to be trained obtained at the end of training is used for object recognition.

As shown in Fig. 7, the above object recognition process in this embodiment can be described in detail as follows.

Suppose the object contained in a known query image is person A; the network to be trained is then used to perform object recognition on a probe image, to judge whether the probe image contains the same object as the query image, i.e., person A. It can be understood that this scenario does not limit the method of this embodiment; the object recognition process involved here can also be applied in other similar scenarios.

The object recognition process may then include the following steps:
Step 701: determine the query feature vector corresponding to the query image by means of the network to be trained.

Because the network to be trained has, during training, learned from the supervision network the property that "the feature vector can correspond well to specific local regions of the target object", it can accurately extract the feature vectors corresponding to specific local regions of the object in an image. A characteristic of the network to be trained in this embodiment is that it accurately performs recognition on local regions, so the query feature vector likewise needs to represent both the whole target object and its local regions.

In this step, the network to be trained first determines, from the query image as a whole, the corresponding query global feature map, and then applies pooling, dimensionality reduction, and similar processing to the query global feature map to obtain the query global feature vector.

The query global feature vector contains all the "feature information" of the object in the query image (i.e., person A). It is the feature vector used to recognize the object as a whole (in the scenario of this embodiment, the whole body of person A).

To enable recognition when the target is partially occluded, query local feature vectors for recognizing local regions of the object also need to be determined. Specifically, human-body keypoint information can be used to divide the query global feature map, obtaining two or more query local feature maps; pooling, dimensionality reduction, and similar processing are then applied to the query local feature maps to obtain the query local feature vectors.

With reference to the scenario in the above embodiment, it can be understood that, for example, the query local feature vector corresponding to the "head" contains exactly the "feature information" of person A's head, so that person A can be recognized by the head when other local regions are occluded. The same applies by analogy to the other regions.

Then, combining the query global feature vector with the query local feature vectors determines the query feature vector, enabling it to satisfy both whole-object recognition and local-region recognition. When the query global feature vector is combined with each query local feature vector, weighting or the GhostVLAD algorithm may preferably be used, so that the whole and each local region carry different emphases within the query feature vector, further improving recognition accuracy.

The process of parsing an image to obtain feature maps, and of obtaining feature vectors from feature maps, in this step is similar to that in the foregoing embodiments and is not repeated here.
Step 702: determine the probe feature vector corresponding to the probe image by means of the network to be trained.

The process of determining the probe feature vector from the probe image is analogous to the process of determining the query feature vector from the query image in step 701 and is not repeated here.

Step 703: determine the similarity between the query feature vector and the probe feature vector, and perform image recognition according to the similarity.

The purpose of object recognition in this embodiment is to judge whether the probe image contains person A; specifically, to judge the similarity between the probe feature vector and the query feature vector that represents person A. When the similarity of the two meets a specified condition, the actual content represented by the probe feature vector is considered to also be person A, meaning that the probe image contains person A. Image recognition is thereby achieved.
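The similarity test can be sketched as below. Cosine similarity and the 0.7 threshold are assumptions for illustration; the patent only requires that the similarity meet a specified condition.

```python
import numpy as np

def cosine_similarity(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def same_object(query_vec, probe_vec, threshold=0.7):
    # threshold stands in for the "specified condition" on similarity
    return cosine_similarity(query_vec, probe_vec) >= threshold

query = np.array([1.0, 0.0, 1.0])         # stand-in query feature vector (person A)
probe_close = np.array([0.9, 0.1, 1.1])   # probe vector resembling person A
probe_far = np.array([-1.0, 1.0, 0.0])    # probe vector of a different object
print(same_object(query, probe_close), same_object(query, probe_far))  # True False
```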
Because both the query feature vector and the probe feature vector contain a vector representing the whole as well as vectors representing each local region, the method of this embodiment can still complete image recognition even if some local regions of person A are occluded in the query image or the probe image. Meanwhile, since the training method of the network to be trained enables it to accurately extract the feature vectors corresponding to specific local regions, the accuracy of image recognition based on local regions is ensured.

Determining the similarity between two feature vectors can be regarded as a conventional technical means in the art, and this embodiment places no limitation on it; any technical means that can achieve the same or a similar effect may be incorporated into the overall technical solution of this embodiment.

In addition, extending this application scenario, searching for an object in an image gallery can also be realized. For example, one surveillance frame from a certain area is taken as the query image, and person A is determined from it. Further, a gallery can be built from all surveillance frames of the area, with all frames other than the query image serving as probe images. Frame-by-frame image recognition is then performed to judge which probe images contain person A, thereby achieving a search for person A in the gallery. It can be understood that frame-by-frame recognition over multiple probe images is in fact a repeated execution of steps 701–703 above.
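The gallery search is steps 701–703 run once per probe frame; a minimal sketch under the same assumed cosine-similarity condition:

```python
import numpy as np

rng = np.random.default_rng(4)

query_vec = rng.standard_normal(8)  # feature vector of person A from the query frame
gallery = {f"frame_{i:03d}": rng.standard_normal(8) for i in range(5)}
gallery["frame_hit"] = query_vec + 0.01 * rng.standard_normal(8)  # a frame containing person A

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# repeat the recognition of steps 701-703 for every probe frame in the gallery
matches = [name for name, vec in gallery.items()
           if cosine(query_vec, vec) >= 0.9]
```

`matches` then lists every frame judged to contain person A.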
As shown in Fig. 8, which is a flow diagram of the training method of the network model provided by an exemplary embodiment of the present disclosure, this embodiment can be applied to an electronic device. To improve the training efficiency of the method in the embodiment shown in Fig. 2, the network to be trained can preferably be given some preliminary training. On the basis of the embodiment shown in Fig. 2, this embodiment includes the following steps:

Step 801: determine the third feature map corresponding to the sample image by means of the initial network to be trained, and obtain the third feature vector of the sample image from the third feature map.

The initial network to be trained can be regarded as the original state of the convolutional neural network that becomes the network to be trained, prior to the "pre-training" involved in this embodiment. It is generally accepted that, in the process of obtaining feature maps by image parsing and determining feature vectors from the feature maps, the performance of the initial network to be trained is relatively low and its accuracy insufficient. The "pre-training" involved in this embodiment allows the network to be trained to first reach a certain degree of accuracy before the training of the embodiments shown in Figs. 2–3.

In this step, the sample image can be a labeled sample image, i.e., one containing a specific target object together with a known "recognition result" for that object, which can be the object's ID. In this embodiment, sample images with known recognition results enable supervised learning of the initial network to be trained, so that the network to be trained obtained after training converges on the sample images. The process of determining the third feature vector corresponding to the sample image by means of the initial network to be trained is similar to the related steps in the foregoing embodiments and is not repeated here.
Step 802: based on the fully connected layer of the initial network to be trained, obtain the second loss value corresponding to the third feature vector.

Supervised learning of the initial network to be trained with sample images of known recognition results can be understood as recognizing the target objects in the sample images with the initial network to be trained, determining the second loss value according to the accuracy of the recognition, and then adjusting the initial network to be trained based on the second loss value.

For example, suppose the sample images contain target object X with ID 001 and target object Y with ID 002. After the initial network to be trained recognizes an object, the corresponding ID can be computed and output. If the initial network to be trained outputs ID 001 after recognizing object X, that recognition is considered correct; if it outputs ID 002 after recognizing object X, that recognition is considered incorrect. Thus, after the initial network to be trained has recognized a large number of target objects contained in the sample images, a target-object ID loss value can be determined according to a specific loss function and used as the second loss value in this step.

It should be noted that, in the field of convolutional neural networks, the process involved in this step of determining the target-object ID loss value according to a specific loss function can be computed by feeding the third feature vector into the fully connected layer. This can be regarded as a conventional calculation in the field of convolutional neural networks; it is not elaborated here, and no specific limitation is placed on the particular loss function.
Step 803: if the second loss value meets the second preset condition, adjust the weight parameters of the initial network to be trained.

If the second loss value meets the second preset condition, the accuracy of the initial network to be trained is considered low at that point, so its weight parameters are adjusted to optimize its training. Likewise, this training process is executed in a loop until the second loss no longer meets the second preset condition or the number of iterations reaches a preset number; the initial network to be trained is then considered to have converged, and the pre-training process is complete.
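The loop-until-converged-or-out-of-iterations control flow reads as follows. The threshold, iteration cap, and the toy loss-halving update are all assumptions standing in for real gradient updates.

```python
# a minimal sketch of the pre-training loop: keep adjusting weights while the
# second loss still meets the preset condition, up to a preset iteration count
def pretrain(compute_loss, adjust_weights, loss_threshold=0.1, max_iters=100):
    loss = compute_loss()
    for _ in range(max_iters):
        if loss <= loss_threshold:  # no longer meets the "still inaccurate" condition
            break                   # the initial network is considered converged
        adjust_weights(loss)
        loss = compute_loss()
    return loss

# toy stand-ins: the loss halves on every weight adjustment
state = {"loss": 3.2}
final = pretrain(lambda: state["loss"],
                 lambda l: state.update(loss=l / 2))
```

The same skeleton governs the pre-training of the supervision network in Fig. 9, with the third loss and third preset condition in place of the second.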
Step 804: determine the initial network to be trained, after weight-parameter adjustment, as the network to be trained.

After pre-training is complete, the initial network to be trained after the last weight-parameter adjustment serves as the network to be trained, for the training involved in the subsequent embodiments shown in Figs. 2–3.

As can be seen from the above technical solution, on the basis of the embodiment shown in Fig. 2, this embodiment has the further beneficial effect that the initial network to be trained is pre-trained to obtain the network to be trained, thereby improving the training efficiency of the network to be trained.
As shown in Fig. 9, which is a flow diagram of the training method of the network model provided by an exemplary embodiment of the present disclosure, this embodiment can be applied to an electronic device. To improve the training efficiency of the method in the embodiment shown in Fig. 2, the supervision network can preferably be given some preliminary training. On the basis of the embodiment shown in Fig. 2, this embodiment includes the following steps:

Step 901: determine the fourth feature map corresponding to the sample image by means of the initial supervision network, and obtain the fourth feature vector of the sample image from the fourth feature map. The initial supervision network can be regarded as the original state of the convolutional neural network that becomes the supervision network, prior to the "pre-training" involved in this embodiment. It is generally accepted that, in the process of obtaining feature maps by image parsing and determining feature vectors from the feature maps, the performance of the initial supervision network is relatively low and its accuracy insufficient. The "pre-training" involved in this embodiment allows the supervision network to first reach a certain degree of accuracy before the training of the embodiments shown in Figs. 2–3.

In this step, the sample image can be a labeled sample image, i.e., one containing a specific target object together with a known "recognition result" for that object, which can be the object's ID. Moreover, to reflect the supervision network's emphasis on local-region recognition, the sample image further includes "local recognition results" for the specific target object, which can be local IDs of the object. In this embodiment, sample images with known recognition results enable supervised learning of the initial supervision network, so that the supervision network converges on the sample images. The process of determining the fourth feature vector corresponding to the sample image by means of the initial supervision network is similar to the related steps in the foregoing embodiments and is not repeated here.

Additionally, an "attention mechanism" from the field of convolutional neural networks can preferably be added to the process of determining the fourth feature vector, thereby assigning corresponding weights to the different "feature information" of the different local regions represented in the fourth feature vector. The so-called "attention mechanism" is an existing training method in the field of convolutional neural networks and is not elaborated here.
Step 902: based on the fully connected layer of the initial supervision network, obtain the third loss value corresponding to the fourth feature vector.

Supervised learning of the initial supervision network with sample images of known recognition results can be understood as recognizing the target objects in the sample images with the initial supervision network, determining the third loss value according to the accuracy of the recognition, and then adjusting the initial supervision network based on the third loss value.

For example, suppose a sample image contains target object (portrait) M with ID 003, where object M includes two local regions, "upper body" and "lower body", with local IDs 011 and 012 respectively; and target object (portrait) N with ID 004, where object N likewise includes the two local regions "upper body" and "lower body", with local IDs 021 and 022 respectively.

After the initial supervision network recognizes an object, the corresponding ID or local ID can be computed and output. If the initial supervision network outputs ID 003 after recognizing object M, that recognition is considered correct; if it outputs ID 004 after recognizing object M, that recognition is considered incorrect. Likewise, if the initial supervision network outputs ID 011 after recognizing the "upper body" of object M, that recognition is considered correct; if it outputs an ID other than 011, that recognition is considered incorrect. Thus, after the initial supervision network has recognized a large number of target objects contained in the sample images, a target-object ID loss value and a target-object local-ID loss value can be determined according to specific loss functions and combined as the third loss value in this step.

It should be noted that, in the field of convolutional neural networks, the process involved in this step of determining the target-object ID loss value and the target-object local-ID loss value according to specific loss functions can be computed by feeding the fourth feature vector into the fully connected layer. This can be regarded as a conventional calculation in the field of convolutional neural networks; it is not elaborated here, and no specific limitation is placed on the particular loss functions.
Step 903: if the third loss value meets the third preset condition, adjust the weight parameters of the initial supervision network.

If the third loss value meets the third preset condition, the accuracy of the initial supervision network is considered low at that point, so its weight parameters are adjusted to optimize its training. Likewise, this training process can be executed in a loop until the third loss no longer meets the third preset condition or the number of iterations reaches a preset number; the initial supervision network is then considered to have converged, and the pre-training process is complete.

Step 904: determine the initial supervision network, after weight-parameter adjustment, as the supervision network.

After pre-training is complete, the initial supervision network after the last weight-parameter adjustment serves as the supervision network, for the training involved in the subsequent embodiments shown in Figs. 2–3.

As can be seen from the above technical solution, on the basis of the embodiment shown in Fig. 2, this embodiment has the further beneficial effect that the initial supervision network is pre-trained to obtain the supervision network, thereby improving the training efficiency of the network to be trained.
Exemplary Apparatus
Fig. 10 is a structural diagram of the training apparatus for the network model provided by an exemplary embodiment of the present disclosure. The apparatus of this embodiment is the physical apparatus for executing the methods of Figs. 1–9. Its technical solution is substantially consistent with the above embodiments, and the corresponding descriptions in those embodiments apply equally to this embodiment. The apparatus in this embodiment includes:

A first feature vector determining module 1001, configured to determine the first feature vector corresponding to the sample image by means of the network to be trained.

An image dividing module 1002, configured to divide the sample image into two or more region images.

A second feature vector determining module 1003, configured to determine two or more second feature vectors from the two or more region images by means of the supervision network.

A first loss module 1004, configured to determine the first loss value of the network to be trained based on the first feature vector and the two or more second feature vectors.

A parameter adjusting module 1005, configured to adjust the weight parameters of the network to be trained when the first loss value meets the first preset condition.
Fig. 11 is a structural diagram of the first loss module 1004 in the training apparatus for the network model provided by another exemplary embodiment of the present disclosure. As shown in Fig. 11, in an exemplary embodiment, the first loss module 1004 includes:

A distance and divergence loss determining unit 1111, configured to determine, using preset loss functions, the distance loss value and divergence loss value between the first feature vector and the second feature vector.

A first loss determining unit 1112, configured to determine the first loss value according to the distance loss value and the divergence loss value.
Fig. 12 is a structural diagram of the second feature vector determining module 1003 in the training apparatus for the network model provided by another exemplary embodiment of the present disclosure. As shown in Fig. 12, in an exemplary embodiment, the second feature vector determining module 1003 includes:

A second feature map determining unit 1211, configured to determine, based on the backbone layer of the supervision network, the second feature maps corresponding to the two or more region images, obtaining two or more second feature maps.

An initial second feature vector determining unit 1212, configured to apply, based on the pooling layer of the supervision network, global max pooling to the two or more second feature maps respectively, obtaining two or more initial second feature vectors.

A second feature vector determining unit 1213, configured to reduce, based on the embedding layer of the supervision network, the dimensionality of the two or more initial second feature vectors, obtaining two or more second feature vectors.
Fig. 13 is a structural diagram of the first feature vector determining module 1001 in the training apparatus for the network model provided by another exemplary embodiment of the present disclosure. As shown in Fig. 13, in an exemplary embodiment, the first feature vector determining module 1001 includes:

A first feature map determining unit 1311, configured to determine, based on the backbone layer of the network to be trained, the first feature map corresponding to the sample image.

A first initial feature vector determining unit 1312, configured to apply, based on the pooling layer of the network to be trained, global max pooling to the first feature map, obtaining an initial first feature vector.

A first feature vector determining unit 1313, configured to reduce, based on the embedding layer of the network to be trained, the dimensionality of the initial first feature vector, obtaining the first feature vector.
Fig. 14 is a structural diagram of the training apparatus for the network model provided by an exemplary embodiment of the present disclosure. On the basis of the embodiment shown in Fig. 10, this embodiment further includes:

An object recognition module 1401, configured to perform object recognition using the network to be trained when the first loss value does not meet the first preset condition.
The training apparatus for the network model provided by an exemplary embodiment of the present disclosure further includes an initial to-be-trained network training module 1501. Fig. 15 is a structural diagram of the initial to-be-trained network training module 1501. As shown in Fig. 15, in an exemplary embodiment, the initial to-be-trained network training module 1501 includes:

A third feature vector determining unit 1511, configured to determine the third feature map corresponding to the sample image by means of the initial network to be trained, and obtain the third feature vector of the sample image from the third feature map.

A second loss value determining unit 1512, configured to obtain, based on the fully connected layer of the initial network to be trained, the second loss value corresponding to the third feature vector.

A first weight parameter adjusting unit 1513, configured to adjust the weight parameters of the initial network to be trained when the second loss value meets the second preset condition.

A to-be-trained network determining unit 1514, configured to determine the initial network to be trained, after weight-parameter adjustment, as the network to be trained.
The training apparatus for the network model provided by an exemplary embodiment of the present disclosure further includes an initial supervision network training module 1601. Fig. 16 is a structural diagram of the initial supervision network training module 1601. As shown in Fig. 16, in an exemplary embodiment, the initial supervision network training module 1601 includes:

A fourth feature vector determining unit 1611, configured to determine the fourth feature map corresponding to the sample image by means of the initial supervision network, and obtain the fourth feature vector of the sample image from the fourth feature map.

A third loss value determining unit 1612, configured to obtain, based on the fully connected layer of the initial supervision network, the third loss value corresponding to the fourth feature vector.

A second weight parameter adjusting unit 1613, configured to adjust the weight parameters of the initial supervision network when the third loss value meets the third preset condition.

A supervision network determining unit 1614, configured to determine the initial supervision network, after weight-parameter adjustment, as the supervision network.
Example electronic device
In the following, an electronic device according to an embodiment of the present disclosure is described with reference to Figure 17. The electronic device may be either or both of the first device 100 and the second device 200, or a stand-alone device independent of them; such a stand-alone device may communicate with the first device and the second device to receive the collected input signals from them.
Figure 17 illustrates a block diagram of an electronic device according to an embodiment of the present disclosure.
As shown in Figure 17, the electronic device 10 includes one or more processors 11 and a memory 12.
The processor 11 may be a central processing unit (CPU) or another form of processing unit having data-processing capability and/or instruction-execution capability, and may control other components in the electronic device 10 to perform desired functions.
The memory 12 may include one or more computer program products, and the computer program products may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory. The volatile memory may include, for example, random access memory (RAM) and/or cache memory. The non-volatile memory may include, for example, read-only memory (ROM), a hard disk, flash memory, and the like. One or more computer program instructions may be stored on the computer-readable storage medium, and the processor 11 may run the program instructions to implement the network model training methods of the various embodiments of the present disclosure described above and/or other desired functions. Various contents such as input signals, signal components, and noise components may also be stored in the computer-readable storage medium.
In one example, the electronic device 10 may further include an input device 13 and an output device 14, which are interconnected by a bus system and/or another form of connection mechanism (not shown).
For example, when the electronic device is the first device 100 or the second device 200, the input device 13 may be the above-mentioned microphone or microphone array for capturing an input signal from a sound source. When the electronic device is a stand-alone device, the input device 13 may be a communication network connector for receiving the collected input signals from the first device 100 and the second device 200.
In addition, the input device 13 may also include, for example, a keyboard, a mouse, and the like.
The output device 14 may output various information to the outside, including the determined distance information, direction information, and the like. The output device 14 may include, for example, a display, a speaker, a printer, a communication network, and remote output devices connected thereto.
Of course, for simplicity, only some of the components in the electronic device 10 that are related to the present disclosure are shown in Figure 17; components such as buses and input/output interfaces are omitted. In addition, depending on the specific application, the electronic device 10 may also include any other appropriate components.
Illustrative computer program product and computer-readable storage medium
In addition to the above methods and devices, an embodiment of the present disclosure may also be a computer program product comprising computer program instructions that, when run by a processor, cause the processor to perform the steps of the network model training method according to the various embodiments of the present disclosure described in the "Illustrative Methods" section of this specification.
The computer program product may be written in any combination of one or more programming languages to produce program code for carrying out the operations of embodiments of the present disclosure. The programming languages include object-oriented programming languages, such as Java and C++, as well as conventional procedural programming languages, such as the "C" language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server.
In addition, an embodiment of the present disclosure may also be a computer-readable storage medium on which computer program instructions are stored; when run by a processor, the computer program instructions cause the processor to perform the steps of the network model training method according to the various embodiments of the present disclosure described in the "Illustrative Methods" section of this specification.
The computer-readable storage medium may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. The readable storage medium may include, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples (a non-exhaustive list) of the readable storage medium include: an electrical connection with one or more wires, a portable disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above.
The basic principles of the present disclosure have been described above in conjunction with specific embodiments. However, it should be noted that the merits, advantages, effects, and the like mentioned in the present disclosure are merely examples and not limitations; these merits, advantages, and effects must not be regarded as prerequisites for each embodiment of the present disclosure. In addition, the specific details disclosed above are provided only for the purpose of illustration and ease of understanding, not limitation; they do not restrict the present disclosure to being implemented using those specific details.
The block diagrams of devices, apparatuses, equipment, and systems involved in the present disclosure are merely illustrative examples and are not intended to require or imply that connections, arrangements, and configurations must be made in the manner shown in the block diagrams. As those skilled in the art will appreciate, these devices, apparatuses, equipment, and systems may be connected, arranged, and configured in any manner. Words such as "include", "comprise", and "have" are open-ended terms that mean "including but not limited to" and may be used interchangeably therewith. The words "or" and "and" as used herein refer to "and/or" and may be used interchangeably therewith, unless the context clearly indicates otherwise. The phrase "such as" used herein refers to "such as, but not limited to" and may be used interchangeably therewith.
It should also be noted that, in the devices, apparatuses, and methods of the present disclosure, each component or each step can be decomposed and/or recombined. Such decompositions and/or recombinations should be regarded as equivalent schemes of the present disclosure.
The above description of the disclosed aspects is provided to enable any person skilled in the art to make or use the present disclosure. Various modifications to these aspects will be readily apparent to those skilled in the art, and the general principles defined herein may be applied to other aspects without departing from the scope of the present disclosure. Therefore, the present disclosure is not intended to be limited to the aspects shown herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
The above description has been presented for the purposes of illustration and description. Furthermore, this description is not intended to restrict the embodiments of the present disclosure to the forms disclosed herein. Although a number of exemplary aspects and embodiments have been discussed above, those skilled in the art will recognize certain variations, modifications, changes, additions, and sub-combinations thereof.

Claims (10)

1. A training method for a network model, comprising:
determining, through a to-be-trained network, a first feature vector corresponding to a sample image;
dividing the sample image into two or more region images;
determining, through a supervision network, two or more second feature vectors from the two or more region images;
determining a first loss value of the to-be-trained network based on the first feature vector and the two or more second feature vectors; and
if the first loss value meets a first preset condition, adjusting weight parameters of the to-be-trained network.
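As a rough illustration only (not part of the claims), one iteration of the claimed method can be sketched with toy stand-ins for the two networks. The horizontal-strip division, the squared-distance loss, and the threshold below are assumptions, since claim 1 fixes none of them:

```python
import numpy as np

def split_regions(image, n_regions=2):
    """Divide the sample image into horizontal strips ("two or more region images")."""
    return np.array_split(image, n_regions, axis=0)

def train_step(image, student, teacher, threshold=0.5):
    """One iteration of the claimed method, with caller-supplied toy networks.

    student : to-be-trained network, maps an image to a feature vector
    teacher : supervision network, maps a region image to a feature vector
    """
    f1 = student(image)                               # first feature vector
    f2s = [teacher(r) for r in split_regions(image)]  # second feature vectors
    # first loss value: here simply the mean squared distance (an assumption;
    # claim 2 combines a distance loss and a divergence loss)
    loss = np.mean([np.sum((f1 - f2) ** 2) for f2 in f2s])
    update = loss > threshold                         # "first preset condition" (assumed form)
    return loss, update
```

When `update` is true, the weight parameters of the student would be adjusted; the optimizer itself is outside the scope of the claim.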
2. The method according to claim 1, wherein determining the first loss value of the to-be-trained network based on the first feature vector and the two or more second feature vectors comprises:
determining, using a preset loss function, a distance loss value and a divergence loss value between the first feature vector and the second feature vectors; and
determining the first loss value from the distance loss value and the divergence loss value.
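For illustration only: assuming a squared Euclidean distance for the distance loss and a KL divergence over softmax-normalised feature vectors for the divergence loss (the claim fixes neither choice, nor the weighting factor `alpha`), the combination might look like:

```python
import numpy as np

def _softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def first_loss(first_vec, second_vecs, alpha=1.0):
    """Combine a distance loss and a divergence loss into the first loss value."""
    # distance loss: mean squared Euclidean distance to each region vector
    dist = np.mean([np.sum((first_vec - v) ** 2) for v in second_vecs])
    # divergence loss: mean KL divergence between softmax-normalised vectors
    p = _softmax(first_vec)
    div = np.mean([np.sum(p * np.log(p / _softmax(v))) for v in second_vecs])
    return dist + alpha * div   # weighting factor alpha is an assumption
```

Both terms vanish when the first feature vector coincides with every second feature vector, which is the behaviour one would expect of such a combined loss.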
3. The method according to claim 1, wherein determining, through the supervision network, the two or more second feature vectors from the two or more region images comprises:
determining, based on a backbone layer of the supervision network, the second feature maps corresponding to the two or more region images, to obtain two or more second feature maps;
performing, based on a pooling layer of the supervision network, global max pooling on each of the two or more second feature maps, to obtain two or more initial second feature vectors; and
performing, based on a feature layer of the supervision network, dimensionality reduction on the two or more initial second feature vectors, to obtain the two or more second feature vectors.
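A minimal sketch of this three-stage pipeline (backbone → global max pooling → dimension-reducing feature layer) follows. The caller-supplied `backbone` and the linear projection `proj` stand in for the learned layers, and are assumptions; the claim does not specify layer shapes:

```python
import numpy as np

def region_to_vector(region_image, backbone, proj):
    """Backbone -> second feature map -> global max pool -> reduced second feature vector."""
    fmap = backbone(region_image)     # (C, H, W) second feature map
    pooled = fmap.max(axis=(1, 2))    # initial second feature vector, length C
    return proj @ pooled              # feature layer: linear dimensionality reduction

def second_feature_vectors(region_images, backbone, proj):
    """Apply the pipeline to each region image in turn."""
    return [region_to_vector(r, backbone, proj) for r in region_images]
```

Claim 4 describes the same pipeline applied once to the whole sample image by the to-be-trained network, so the same sketch covers it with a different backbone and projection.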
4. The method according to claim 1, wherein determining, through the to-be-trained network, the first feature vector corresponding to the sample image comprises:
determining, based on a backbone layer of the to-be-trained network, a first feature map corresponding to the sample image;
performing, based on a pooling layer of the to-be-trained network, global max pooling on the first feature map, to obtain an initial first feature vector; and
performing, based on a feature layer of the to-be-trained network, dimensionality reduction on the initial first feature vector, to obtain the first feature vector.
5. The method according to claim 1, further comprising:
if the first loss value does not meet the first preset condition, performing object recognition using the to-be-trained network.
6. The method according to claim 1, further comprising:
determining, through an initial to-be-trained network, a third feature map corresponding to the sample image, and obtaining a third feature vector of the sample image from the third feature map;
obtaining, based on a fully connected layer of the initial to-be-trained network, a second loss value corresponding to the third feature vector;
if the second loss value meets a second preset condition, adjusting weight parameters of the initial to-be-trained network; and
determining the initial to-be-trained network after the weight-parameter adjustment as the to-be-trained network.
7. The method according to claim 1, further comprising:
determining, through an initial supervision network, a fourth feature map corresponding to the sample image, and obtaining a fourth feature vector of the sample image from the fourth feature map;
obtaining, based on a fully connected layer of the initial supervision network, a third loss value corresponding to the fourth feature vector;
if the third loss value meets a third preset condition, adjusting weight parameters of the initial supervision network; and
determining the initial supervision network after the weight-parameter adjustment as the supervision network.
8. A training device for a network model, comprising:
a first feature vector determination module, configured to determine, through a to-be-trained network, a first feature vector corresponding to a sample image;
an image division module, configured to divide the sample image into two or more region images;
a second feature vector determination module, configured to determine, through a supervision network, two or more second feature vectors from the two or more region images;
a first loss module, configured to determine a first loss value of the to-be-trained network based on the first feature vector and the two or more second feature vectors; and
a parameter adjustment module, configured to adjust weight parameters of the to-be-trained network when the first loss value meets a first preset condition.
9. A computer-readable storage medium, wherein the storage medium stores a computer program, and the computer program is used to execute the training method for a network model according to any one of claims 1 to 7.
10. An electronic device, comprising:
a processor; and
a memory for storing instructions executable by the processor;
wherein the processor is configured to read the executable instructions from the memory and execute the instructions to implement the training method for a network model according to any one of claims 1 to 7.
CN201910819512.3A 2019-08-31 2019-08-31 Network model training method and device Active CN110533184B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910819512.3A CN110533184B (en) 2019-08-31 2019-08-31 Network model training method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910819512.3A CN110533184B (en) 2019-08-31 2019-08-31 Network model training method and device

Publications (2)

Publication Number Publication Date
CN110533184A true CN110533184A (en) 2019-12-03
CN110533184B CN110533184B (en) 2023-01-06

Family

ID=68665879

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910819512.3A Active CN110533184B (en) 2019-08-31 2019-08-31 Network model training method and device

Country Status (1)

Country Link
CN (1) CN110533184B (en)



Patent Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107247989A (en) * 2017-06-15 2017-10-13 北京图森未来科技有限公司 A kind of neural network training method and device
WO2019091464A1 (en) * 2017-11-12 2019-05-16 北京市商汤科技开发有限公司 Target detection method and apparatus, training method, electronic device and medium
US20190147298A1 (en) * 2017-11-14 2019-05-16 Magic Leap, Inc. Meta-learning for multi-task learning for neural networks
CN108229444A (en) * 2018-02-09 2018-06-29 天津师范大学 A kind of pedestrian's recognition methods again based on whole and local depth characteristic fusion
CN108416288A (en) * 2018-03-04 2018-08-17 南京理工大学 The first visual angle interactive action recognition methods based on overall situation and partial situation's network integration
CN109508663A (en) * 2018-10-31 2019-03-22 上海交通大学 A kind of pedestrian's recognition methods again based on multi-level supervision network
CN109784182A (en) * 2018-12-17 2019-05-21 北京飞搜科技有限公司 Pedestrian recognition methods and device again
CN110119803A (en) * 2019-03-01 2019-08-13 西安电子科技大学 A kind of convolutional neural networks loss metric method based on provincial characteristics
CN109934197A (en) * 2019-03-21 2019-06-25 深圳力维智联技术有限公司 Training method, device and the computer readable storage medium of human face recognition model
CN110163110A (en) * 2019-04-23 2019-08-23 中电科大数据研究院有限公司 A kind of pedestrian's recognition methods again merged based on transfer learning and depth characteristic

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
JIWEI YANG ET AL: "Local Convolutional Neural Networks for Person Re-Identification", 《PROCEEDINGS OF THE 26TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA》 *
LIU WEI ET AL.: "Research on Training Methods of Mutual-Learning Neural Networks", 《CHINESE JOURNAL OF COMPUTERS》 *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111414888A (en) * 2020-03-31 2020-07-14 杭州博雅鸿图视频技术有限公司 Low-resolution face recognition method, system, device and storage medium
CN111680701A (en) * 2020-05-07 2020-09-18 北京三快在线科技有限公司 Training method and device of image recognition model and image recognition method and device
CN111680701B (en) * 2020-05-07 2023-04-07 北京三快在线科技有限公司 Training method and device of image recognition model and image recognition method and device
CN113688840A (en) * 2020-05-19 2021-11-23 武汉Tcl集团工业研究院有限公司 Image processing model generation method, image processing method, storage medium and terminal

Also Published As

Publication number Publication date
CN110533184B (en) 2023-01-06

Similar Documents

Publication Publication Date Title
CN108596882B (en) The recognition methods of pathological picture and device
Toğaçar et al. COVID-19 detection using deep learning models to exploit Social Mimic Optimization and structured chest X-ray images using fuzzy color and stacking approaches
Zhu et al. Soft anchor-point object detection
Huang et al. Instance-aware image and sentence matching with selective multimodal lstm
CN112419327B (en) Image segmentation method, system and device based on generation countermeasure network
CN109493308A (en) The medical image synthesis and classification method for generating confrontation network are differentiated based on condition more
CN105389589B (en) A kind of chest X ray piece rib cage detection method returned based on random forest
CN109872333A (en) Medical image dividing method, device, computer equipment and storage medium
Gu et al. A new deep learning method based on AlexNet model and SSD model for tennis ball recognition
CN110533184A (en) A kind of training method and device of network model
CN108121995A (en) For identifying the method and apparatus of object
CN111242852A (en) Boundary aware object removal and content filling
CN109214366A (en) Localized target recognition methods, apparatus and system again
CN107590515A (en) The hyperspectral image classification method of self-encoding encoder based on entropy rate super-pixel segmentation
CN107430678A (en) Use the inexpensive face recognition of Gauss received field feature
CN113822254B (en) Model training method and related device
CN109508787A (en) Neural network model training method and system for ultrasound displacement estimation
CN115345938B (en) Global-to-local-based head shadow mark point positioning method, equipment and medium
WO2022152009A1 (en) Target detection method and apparatus, and device and storage medium
CN109508740A (en) Object hardness identification method based on Gaussian mixed noise production confrontation network
CN106529563A (en) High-spectral band selection method based on double-graph sparse non-negative matrix factorization
CN114663426A (en) Bone age assessment method based on key bone area positioning
CN103745473B (en) A kind of brain tissue extraction method
CN108875500A (en) Pedestrian recognition methods, device, system and storage medium again
CN110189318A (en) Pulmonary nodule detection method and system with semantic feature score

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant