CN111291639A

CN111291639A - Cross-source ship feature fusion learning and identification method based on hierarchical variation self-coding

Info

Publication number: CN111291639A
Application number: CN202010063845.0A
Authority: CN
Inventors: 文载道; 刘泽超; 刘准钆; 潘泉
Original assignee: Northwestern Polytechnical University
Current assignee: Northwestern Polytechnical University
Priority date: 2020-01-20
Filing date: 2020-01-20
Publication date: 2020-06-16
Anticipated expiration: 2040-01-20
Also published as: CN111291639B

Abstract

The invention discloses a cross-source ship feature fusion learning and identification method based on hierarchical variation self-coding, which is used for acquiring an image to be identified of an optical or synthetic aperture radar of a target ship; extracting ship inter-class difference characteristics and data source inter-difference characteristics in an image to be identified by using a first encoder in a trained hierarchical variational self-encoding network; analyzing the difference characteristics among ship classes and the difference characteristics among data sources by using a second encoder in the trained hierarchical variational self-coding network, and determining the class of a target ship in the image to be recognized and the data source class of the image to be recognized; the invention utilizes the layered variation self-coding network to automatically extract structural characteristics with both representation/interpretability and discriminability from a large number of different source ship target images which can not be registered, thereby realizing cross-source ship characteristic fusion learning and accurate identification of ship targets.

Description

Cross-source ship feature fusion learning and identification method based on hierarchical variation self-coding

[ technical field ] A method for producing a semiconductor device

The invention belongs to the technical field of remote sensing information fusion and target identification, and particularly relates to a cross-source ship feature fusion learning and identification method based on hierarchical variation self-coding.

[ background of the invention ]

The reliable ocean monitoring capability can effectively guarantee the ocean rights and interests of a country, and is beneficial to the development of tasks such as marine rescue, fishery management, marine traffic control and the like. The key technology is the accurate ship identification technology. In actual ocean monitoring, a ship target and a satellite carrying an imaging radar or an optical remote sensing camera are in relative motion, and when a certain ship target exits from a visual area of a current satellite-borne radar or camera and enters the visual field of another satellite-borne or airborne remote sensing device, a multi-source ship target collaborative identification task needs to be completed.

Multi-source image fusion methods can be broadly divided into three categories: pixel-level fusion, feature-level fusion and decision set fusion. Pixel-level fusion requires first registering a pair of heterogeneous images and then fusing the images based on pixels; for feature level fusion, extracting respective features from heterogeneous images, and then fusing the heterogeneous features; for the decision set fusion, data of each source needs to be classified separately, and then classification results are fused to obtain a final classification result. In the aspect of multi-source image fusion, the fusion at the pixel level needs different registered images about the same target, which is difficult to obtain in the marine target identification problem, and the feature fusion-based method does not necessarily need the registered different images, but needs the targets in the different images to be the same object, and the acquisition cost of the images in the marine target identification problem is huge.

In the traditional target identification method, the two parts of feature extraction and classification are optimized independently, so that the extracted features are not necessarily beneficial to subsequent target identification, namely, the feature has insufficient discriminability, and the target identification accuracy is not ideal.

[ summary of the invention ]

The invention aims to provide a cross-source ship feature fusion learning and identification method based on hierarchical variation self-coding, and meanwhile, feature extraction and feature classification of a target image are optimized, so that the identification accuracy of the target image is improved.

The invention adopts the following technical scheme: a cross-source ship feature fusion learning and identification method based on hierarchical variation self-coding comprises the following steps:

acquiring an image to be identified of an optical or synthetic aperture radar of a target ship;

extracting ship inter-class difference characteristics and data source inter-difference characteristics in an image to be identified by using a first encoder in a trained hierarchical variational self-encoding network;

analyzing the difference characteristics among ship classes and the difference characteristics among data sources by using a second encoder in the trained hierarchical variational self-coding network, and determining the class of a target ship in the image to be recognized and the data source class of the image to be recognized;

the hierarchical variational self-coding network is obtained by using a plurality of groups of data through machine learning training, and each group of data in the plurality of groups of data comprises a ship image, a category label representing the ship image and a label of a ship image data source.

Furthermore, the layered variation self-coding network is a two-layer variation self-coding network;

extracting different hidden features in an image to be identified from a first encoder in an encoding network by using a first layer variation;

splicing and combining different hidden features to obtain combined hidden features;

and analyzing the combined hidden variable by using a second encoder in the second-layer variation self-coding network to obtain the category of the target ship and the category of the data source in the image to be identified.

Further, the first layer variation self-coding network consists of a first encoder and a first decoder;

the first encoder comprises two layers of convolution networks, the first layer of convolution network is 1 convolution network, the second layer of convolution network is 4 convolution networks which are arranged in parallel, and the convolution network of the first layer is respectively connected with each convolution network of the second layer;

the first decoder comprises 1 deconvolution network, and the deconvolution network is used for reconstructing an image to be identified after splicing and combining different hidden features obtained by the encoder to generate a reconstructed image;

the reconstructed image is used for comparing with the image to be identified so as to optimize the network parameters of the first-layer variation self-coding network.

Further, the second layer variation self-coding network consists of a second encoder and a second decoder;

the second encoder comprises 2 full-connection networks which are arranged in parallel; the 2 full-connection networks are used for generating the category of a target ship in the image to be identified and the source category of the image to be identified according to the combined hidden features;

the second decoder comprises 3 fully connected networks, the 3 fully connected networks being respectively for:

reconstructing the corresponding hidden features according to the class of the target ship obtained by the second encoder to obtain reconstructed hidden features;

reconstructing the corresponding hidden features according to the data source type of the image to be identified obtained by the second encoder to obtain reconstructed hidden features; and

reconstructing the corresponding hidden features according to the type of the spliced and combined target ship and the type of the data source of the image to be identified to obtain reconstructed hidden features;

the reconstructed hidden features are used for comparing with the hidden features before reconstruction so as to optimize the network parameters of the second-layer variation self-coding network.

Further, a loss function of the hierarchical variational self-coding network

Comprises the following steps:

wherein ,

as a function of the reconstruction error of the input image, X being the image in the training image set X, c₁₂₃₄Is denoted by "c₁,c₂,c₃,c₄”，q(c₁₂₃₄|x)、p(x|c₁₂₃₄)、q(c₁|x)、p(c₁)、q(c₂|x)、p(c₂|l)、q(c₃|x)、p(c₃|d)、q(c₄|x)、p(c₄|d,l)、q(c₂₃₄| x) obey a Gaussian distribution, α and β are weights, c₁、c₂、c₃、c₄Respectively different hidden features, c₂₃₄Is denoted by "c₂,c₃,c₄", d is the image data source class label, l is the class label of the target ship in the image, γ is the sum θ is the weight, q (d | c)₃₄)、p(d)、q(l|c₂₄)、p(l)、q(d,l|c₂₃₄) All obey the Concrete distribution, c₃₄Is denoted by "c₃,c₄”，c₂₄Is denoted by "c₂,c₄”。

The invention has the beneficial effects that: the invention utilizes the layered variation self-coding network to automatically extract structural characteristics with both representation/interpretability and discriminability from a large number of different source ship target images which can not be registered, thereby realizing cross-source ship characteristic fusion learning and accurate identification of ship targets.

[ description of the drawings ]

FIG. 1 is a flow chart of an embodiment of the present invention;

FIG. 2 is a schematic diagram of a hierarchical variational self-coding network structure in an embodiment of the present invention;

FIG. 3 shows a hidden feature c in an embodiment of the present invention₁Generating an effect graph;

FIG. 4 is a diagram illustrating the effect of generating hidden feature d according to an embodiment of the present invention;

FIG. 5 is a diagram illustrating the effect of generating hidden features l according to an embodiment of the present invention;

FIG. 6 is a comparison chart of classification results of the verification embodiment of the present invention.

[ detailed description ] embodiments

The present invention will be described in detail below with reference to the accompanying drawings and specific embodiments.

The invention discloses a cross-source ship feature fusion learning and identification method based on hierarchical variation self-coding. In practical application, Synthetic Aperture Radars (SAR) have the advantages of being free from interference of weather, illumination and other conditions, being capable of continuously monitoring targets in marine environment and the like, but images of the SAR are not suitable for being distinguished by human eyes; the optical remote sensing camera adopting passive imaging is easily interfered by conditions such as cloud layers, illumination and the like, but has rich imaging content and intuitive target characteristics, so that the two heterogeneous images are widely used in ship target identification tasks. The optical image can provide abundant frequency spectrum information about the aspect of target appearance (such as color and the like), and the SAR image can well reflect target structure (such as edge angle and the like), so that the information of the two data sources has complementarity, and how to perform fusion learning and identification on the ship target in the optical image and the SAR image is a difficult problem to be solved urgently.

In terms of object recognition, the conventional object recognition method is composed of two parts: the system comprises a feature extractor and a classifier, wherein the feature extractor is usually designed manually and is used for extracting target features from raw sensor data, and the classifier is used for classifying the features to obtain a predicted target class. In the implementation process of the methods, the two parts of feature extraction and classification are optimized independently.

With the development of deep learning, discriminant networks such as deep convolutional neural networks and the like are widely applied to target identification, and the target identification accuracy is greatly improved by performing joint optimization on a feature extractor and a classifier by using an end-to-end method. However, the method is characterized in that the network automatic extraction is adopted, and compared with the traditional manual extraction method, the method has the defects of poor interpretability and the like, and the identification process is difficult to understand by people.

In the aspect of feature fusion learning, editable hidden features are introduced into a Variational Auto Encoder (VAE) serving as a generating model, so that the feature interpretability is enhanced, and the join variational self encoder (jointVAE) further realizes learning of discrete hidden features such as category labels and the like.

The single-layer variational self-coding network assumes that dimensions of the hidden features are independent from each other, and in a practical situation, the hidden features can be possibly divided into a plurality of groups, wherein the groups are independent from each other, but the groups are not independent from each other, namely, a certain dependency relationship exists among different features related to a target. Therefore, the adoption of a single-layer self-coding network cannot sufficiently represent the non-independent phenomenon, namely the representation capability of the network is insufficient, and meanwhile, the improvement of the target identification accuracy rate is limited.

Therefore, the invention discloses a cross-source ship feature fusion learning and identification method based on hierarchical variational self-coding, which belongs to the technical field of deep learning and target identification, wherein cross-source multi-dimensional heterogeneous target observation data are distributed in different spaces, so that a current mainstream deep feature learning model for single-modal data cannot find corresponding feature distribution manifold for heterogeneous targets.

The method of the invention is shown in figure 1, and specifically comprises the following steps:

acquiring an image to be identified of an optical or synthetic aperture radar of a target ship; extracting ship inter-class difference characteristics and data source inter-difference characteristics in an image to be identified by using a first encoder in a trained hierarchical variational self-encoding network; analyzing the difference characteristics among ship classes and the difference characteristics among data sources by using a second encoder in the trained hierarchical variational self-coding network, and determining the class of a target ship in the image to be recognized and the data source class of the image to be recognized; the hierarchical variational self-coding network is obtained by using a plurality of groups of data through machine learning training, and each group of data in the plurality of groups of data comprises a ship image, a category label representing the ship image and a label of a ship image data source.

Through the layered variational self-coding network, structural features with both representation/interpretability and discriminability are automatically extracted from a large number of different source ship target images which cannot be registered, and cross-source ship feature fusion learning and accurate ship target identification are realized.

In addition, the second decoder and the first decoder in the trained hierarchical variational self-coding network are used for disambiguating the ship target class and the changed image data source class, and cross-source generation of the ship image can be realized.

In this embodiment, the hierarchical variational self-coding network is a two-layer variational self-coding network; extracting different hidden features in an image to be identified from a first encoder in an encoding network by using a first layer variation; splicing and combining different hidden features to obtain combined hidden features; and analyzing the combined dependent variable by using a second encoder in the second-layer variation self-coding network to obtain the category of the target ship and the category of the data source in the image to be identified.

Specifically, as shown in fig. 2, the diversity self-encoding network in this embodiment is composed of an encoder and a decoder, the encoder is responsible for learning hidden features from an input image, the hidden features are high-level semantics describing an image, and the generation process of the image is considered to be controlled by the hidden features, for example, some/some hidden features control the category of a target ship, and the category of the target ship in the image can be changed by changing the hidden features. The decoder is responsible for reconstructing the input image by using the hidden features obtained by coding or generating a new image based on the input image, and then comparing the reconstructed or new image with the original image to further optimize the network parameters of the variational self-coding network so as to further improve the classification precision of the network.

As a possible implementation, the first layer variation self-coding network is composed of a first encoder and a first decoder;

the first encoder comprises two layers of convolution networks, the first layer of convolution network is 1 convolution network, the second layer of convolution network is 4 convolution networks which are arranged in parallel, and the convolution network of the first layer is respectively connected with each convolution network of the second layer. The first decoder comprises 1 deconvolution network, and the deconvolution network is used for reconstructing an image to be identified after splicing and combining different hidden features obtained by the encoder to generate a reconstructed image; the reconstructed image is used for comparing with the image to be identified so as to optimize the network parameters of the first-layer variation self-coding network.

The second layer variation self-coding network consists of a second encoder and a second decoder; the second encoder comprises 2 full-connection networks which are arranged in parallel; the 2 full-connection networks are used for generating the category of a target ship in the image to be identified and the source category of the image to be identified according to the combined hidden features; the second decoder comprises 3 fully connected networks, the 3 fully connected networks being respectively for:

reconstructing the corresponding hidden features according to the class of the target ship obtained by the second encoder to obtain reconstructed hidden features; reconstructing the corresponding hidden features according to the source type of the image to be identified obtained by the second encoder to obtain reconstructed hidden features; and reconstructing the corresponding hidden features according to the type of the spliced and combined target ship and the source type of the image to be identified to obtain the reconstructed hidden features. The reconstructed hidden features are used for comparing with the hidden features before reconstruction so as to optimize the network parameters of the second-layer variation self-coding network.

In this embodiment, in order to test the technical effect of the invention, as shown in fig. 1, four types of features are learned from a heterogeneous ship image X (optical remote sensing image and SAR image) by using a hierarchical variational self-coding network:

1) similarity characteristics (similarity characteristics between data sources) between different image data sources (optics, SAR);

2) the difference characteristics between different image data sources (namely the difference characteristics between the data sources);

3) similarity features between different classes of ships (i.e., inter-class similarity features);

4) features of inter-ship variability from one ship to another (i.e., inter-class variability features).

Then, the data source types of the ship images are identified by using the difference characteristics among the data sources, and the ship types are identified by using the difference characteristics among the ship types.

Therefore, in order to model the four types of characteristics, the first layer variational self-coding network is utilized to extract four hidden characteristics c from a heterogeneous ship image X (optical and SAR)₁、c₂、c₃、c₄. wherein ,c₁Controlling hidden features independent of both image data source and ship type, e.g. two ship mapsSheet: one SAR picture of a cargo ship and one optical picture of a fishing boat, but the azimuth angles of the ships are the same in the two pictures, and c₁May be a hidden feature with respect to the angle of orientation. Similarly, c₂The hidden features are shown independent of the image data source but dependent on the ship category, for example, fishing nets are all provided in fishing ships, and runways are all provided on aircraft carriers. c. C₃The hidden features are shown regardless of the ship type but related to the image data source, for example, SAR images all have coherent spots, optical images have cloud and the like. c. C₄Representing hidden features relating to both the image data source and the ship category, e.g., features that have both speckle and runway.

So that the above-mentioned feature 1) can be obtained by hiding the feature c₁、c₂Common characterization; feature 2) may consist of hidden feature c₃、c₄Common characterization; feature 3) may consist of hidden feature c₁、c₃Common characterization; feature 4) may consist of hidden features c₂、c₄And (5) jointly characterizing.

For the second layer variation self-coding network, it is responsible for completing the slave implicit characteristic c₁、c₂、c₃、c₄And (5) learning and identifying ship class labels l and image data source labels d. Wherein the ship class l is hidden from the feature 4), namely the hidden feature c₂、c₄Is deduced, and the image data source d is derived from the feature 2), i.e. the hidden feature c₃、c₄Is deduced.

Specifically, in the hierarchical variational self-coding network training, after the SAR and the optical ship image X are input into the network, the SAR and the optical ship image X firstly pass through the convolution network shared in the first layer, and then pass through 4 parallel convolution networks to respectively obtain the hidden characteristics c₁、c₂、c₃、c₄Respective parameters (μ, log (σ)) and respective Gaussian distributions using a re-parameterization technique

Respectively obtaining hidden characteristics c after sampling₁、c₂、c₃、c₄Where μ and σ are the mean and standard deviation of the Gaussian distribution, respectively。

The five convolutional networks form a coder of a first-layer variational self-coding network, and four hidden variables c are obtained by sampling₁、c₂、c₃、c₄And splicing the images together by using concat operation, inputting the spliced images into a decoder of a first-layer self-coding network consisting of a deconvolution network, reconstructing the input SAR and the optical ship image to obtain X ', and performing comparison calculation on the X' and the X to optimize parameters.

Meanwhile, in the second layer variation self-coding network, c obtained by sampling₂ and c₄Will be pieced together and input into the full-connection network to obtain the parameter a of the ship class label l_lAnd c is₁ and c₃The parameters a of the image data source label d are also spliced together and input into another fully-connected network to obtain_d. The two fully connected networks constitute the encoder of the second layer variation self-coding network. Using reparameterization technique respectively consisting of_lAnd a_dTwo distributions, Concrete (a), are obtained_l) And Concrete (a)_d) Sampling from two probability distributions to obtain l and d, and then respectively inputting into two fully-connected networks to reconstruct the hidden features to obtain c₂'、c₃', to c₄The reconstruction is carried out by first piecing together l and d and then inputting into the fully-connected network to obtain c₄' the three fully connected networks constitute the decoder of the second layer variation self-coding network.

After the network is built, the network is required to be optimized in parameters.

The Lower Evidence BOund (ELBO) of the whole network is:

wherein, for the sake of simplicity, use c₁₂₃₄Is denoted by "c₁,c₂,c₃,c₄", the rest are similar, p (x | c)₁₂₃₄)、q(c₁|x)、p(c₁)、q(c₂|x)、p(c₂|l)、q(c₃|x)、p(c₃|d)、q(c₄|x)、p(c₄All | d, l) obey a gaussian distribution. Let z obey a Gaussian distribution

The probability density function is then:

and q (d | c)₃₄)、p(d)、q(l|c₂₄) And p (l) are subject to a Concret distribution, which is a continuous relaxation approximation of a simplex discrete variable. n-dimensional Concrete random variable y ═ y₁,y₂,...,y_n)^TSatisfies the following conditions:

and y is_k≥0，

y can be obtained by sampling as follows:

wherein a ═ a₁,a₂,...,a_n)^TIs a probability vector, satisfies a_iIs not less than 0 and

λ>0 is used to control the sag, when it goes to 0, the Concrete variable will go to the simplex variable, g_iFor Gumbel distributed sampling, i.e. g_i～Gumbel(0,1)。

If y is equal to Concrete (a, λ), the probability density function is:

therefore, the loss function of the hierarchical variational self-coding network in the embodiment

Comprises the following steps:

wherein ,

as a function of the reconstruction error of the input image, X being the image in the training image set X, c₁₂₃₄Is denoted by "c₁,c₂,c₃,c₄”，q(c₁₂₃₄|x)、p(x|c₁₂₃₄)、q(c₁|x)、p(c₁)、q(c₂|x)、p(c₂|l)、q(c₃|x)、p(c₃|d)、q(c₄| x) all follow a Gaussian distribution, α, β, γ, and θ are weights, c₁、c₂、c₃、c₄Respectively different hidden features, c₂₃₄Is denoted by "c₂,c₃,c₄", d is the image source category label, l is the category label of the target ship in the image, q (d | c)₃₄)、p(d)、q(l|c₂₄)、p(l)、p(c₄All | d, l) obey the Concrete distribution, c₃₄Is denoted by "c₃,c₄”，c₂₄Is denoted by "c₂,c₄". The network is optimized by adopting an Adam optimizer, and the parameters are set as defaults.

The invention adopts a cross-source ship feature fusion learning and identification method based on hierarchical variational self-coding to realize the fusion of optical and SAR ship features and the identification of ship targets, introduces the idea of layering, extracts the similarity and difference features among different data sources and the similarity and difference features among different classes of ships from the original heterogeneous ship image, realizes the extraction of interpretable structural features, and uses the features in the identification of the ship targets.

In addition, by utilizing cross-source data, the embodiment can extract unique ship features with discriminant in each data source and can also extract ship features with discriminant shared by all data sources, so that the feature extraction is more sufficient, and the accuracy rate of ship identification in the image can be improved.

Verification of the examples:

for verification, optical remote sensing and SAR ship images need to be collected first. In this embodiment, the SAR ship image is taken from a SAR ship identification data set OpenSARShip disclosed by shanghai university of transportation, a modulo operation is performed on images of VV and VH polarization modes of all single-view multiplex images (SLC) to obtain a module value image, then a picture with a width and a height both not exceeding 32 pixels is selected from the module value image, and 0 is filled around the selected picture to obtain a picture with a size of 32 × 32 pixels. Since some classes of ship targets have only 2 pictures, the following sample expansion is performed to expand to 8 times the original: and rotating each picture by 90 degrees, 180 degrees and 270 degrees and horizontally turning over to finally obtain 5840 SAR images containing 11 ship targets. Specific information is shown in table 1.

The optical remote sensing ship image is taken from a remote sensing target ship data set DOTA published by Wuhan university. Firstly cutting out all target images of 'ship', then selecting pictures with width and height not more than 32 pixels, filling 0 in the periphery to obtain pictures with size of 32 x 32 pixels, and finally obtaining 6087 optical ship images without class labels. Specific information is shown in table 1.

Table 1 basic information of data set used in verification process

In order to verify the effectiveness of the method, the hidden features and the hidden features l and d are subjected to generation verification, namely, the value of one dimension of one hidden feature is changed at a time, and other dimensions are kept unchanged, so that the image effect generated by the hidden variables is visually displayed.

FIG. 3 shows a hidden feature c₁Wherein each row corresponds to a hidden feature c₁Is used as the one-dimensional element. For example, first oneThe image of the optical ship is input into the network, and the hidden feature c with the length of 5 is obtained by coding₁And other hidden features c₂₃₄. Hold c₁The last 4-dimensional value of (a) is not changed, and only the value of the 1-dimensional element is changed (e.g., in the interval [ -5,5 ] of]11 values are taken at the medium interval), so that some hidden features c with different values of only the 1 st-dimension element can be obtained₁C are to be measured₁In turn together with the hidden feature c₂₃₄Inputting into a decoding network, and generating the respective ship images, i.e. corresponding to the first row of images in fig. 3(a), and the rest rows are similar to the above. From the figure, for optical and SAR images, by changing c₁All the characteristics of direction angle, length and the like of the generated ship image can be successfully changed, so that the network can indeed successfully utilize c₁And extracting features which are irrelevant to both the data source and the ship category.

Fig. 4 shows the generation effect of d, which is generated as follows: firstly, inputting the ship image into a network, and obtaining a hidden characteristic c through coding₁L and d, wherein d is a vector of length 2. D is set to be [0,1 respectively]And [1,0 ]]So as to obtain two different d, respectively inputting two d and coded l into decoder formed from fully-connected layer of second layer variational self-coding network to obtain two groups of hidden characteristics c₂₃₄Then, two groups of c are respectively put into₂₃₄Together with the coded hidden variable c₁Inputting the image into a decoder consisting of a first layer of variational self-coding network and an deconvolution layer to obtain two ship images.

As shown in fig. 4, the 1 st line of the generation process of the optical ship image, the 2 nd line of the generation process of the SAR ship image, and the 1 st to 3 rd lines display part of the feature map of the deconvolution layer by layer. From the figure, from left to right, the bottom layer features (column 1) of the optical and SAR ship images are similar, and ship images of different data sources can be obtained by performing different combinations on the bottom layer features, namely the cross-source conversion of the ship images can be really realized by changing d, so that the network can really successfully extract the difference features among different data sources.

FIG. 5 shows the generation effect of l, which is generated as follows: firstly, inputting the ship image into a network, and obtaining a hidden characteristic c through coding₁L and d, wherein l is a vector of length 11. Then, setting a value of one element in l as 1 and setting the other elements as 0 in sequence, thus obtaining 11 different l, and setting 11 l together with the hidden feature c in sequence₁And d, inputting the coded network to generate 11 ship images. In fig. 5, the 1 st behavior is an example of the SAR images of 11 different types of ships, and the 2 nd behavior is generated by changing l different types of SAR ship images, and it can be seen through comparison that the generated various types of ship SAR ship images are indeed similar to the real various types of ship SAR images. Therefore, the cross-class conversion of the ship images can be really realized by changing the l, so that the network can really successfully extract the difference characteristics among ships of different classes.

For the present embodiment, in order to determine the parameters α, β, γ, and θ, through grid search and cross validation, it is determined that when α is 0.1, β is 0.001, γ is 1.0, and θ is 1.0, the ship classification accuracy of the network on the test set is 5 times average to 94.74%.

As shown in fig. 6, tests are performed on the same ship data by using K-Nearest Neighbors (KNNs), a Principal Component Analysis-based Support Vector Machine classifier (PCA-SVM), a Sparse representation-based classifier (SRC), a deep Convolutional Neural Network (CNN), a VGG-16 Network, and a joint variable self-coding (joint vae), and the results show that the average Classification accuracy of the method of this embodiment can be at least 8.5% higher than that of all other methods on a test set, which fully illustrates the high efficiency of the method of this embodiment in ship target identification.

Through the verification, the invention adopts a cross-source ship feature fusion learning and identification method based on hierarchical variational self-coding, and automatically extracts the interpretable four-part structural features by introducing the hierarchical idea: the similarity and difference characteristics among different image data sources and the similarity and difference characteristics among different classes of ships enhance the representation capability of the model and the interpretability of the characteristics, realize the characteristic fusion of the optical and SAR ships, ensure that the discriminability of the characteristics is complete as much as possible, and realize the accurate identification of the ship target. By utilizing cross-source data, the method can extract unique ship features with discriminability in each data source, can also mine ship features with discriminability shared by all data sources, and enables the discriminability of the ship features to be more sufficient through fusion.

Meanwhile, the method separates the difference characteristics among the data sources by fusion learning of the characteristics of the heterogeneous ships, and can realize cross-source generation of the images of the heterogeneous ships. In addition, the fusion of the invention to the heterogeneous ship images belongs to the fusion of feature level, and the ship targets in the heterogeneous images are not required to be the same object, so that the application range of the invention is greatly expanded.

The invention adopts a layered (two-layer) variational self-coding network and utilizes four implicit characteristics c₁、c₂、c₃、c₄The similarity and difference characteristics among different image data sources and the similarity and difference characteristics among different classes of ships are modeled, so that the characteristic representation/interpretation of network extraction is stronger. Wherein, c₁Some hidden features irrelevant to both an image data source and a ship type are controlled, for example, two ship pictures are adopted: one SAR picture of cargo ship and one visible light picture of fishing ship, but the ship direction angle is the same in the two pictures, and c₁May be a hidden feature with respect to the orientation angle; similarly, c₂Representing hidden features that are unrelated to the image data source but related to the ship category; c. C₃Representing hidden features that are independent of ship category but related to the image data source; c. C₄The hidden features are represented in relation to both the image data source and the ship category. Therefore, the similarity between different image data sources can be determined by the implicit feature c₁、c₂The common characterization can be realized by latent characterization c₃、c₄The common characteristics are represented, and the similarity characteristics among different ship targets can be represented by hidden characteristics c₁、c₃Common characterization between different classes of vessel targetsThe difference feature may be represented by a hidden feature c₂、c₄And (5) jointly characterizing.

The invention can realize the integration of cross-source generation and target identification of the ship image by extracting the ship characteristics with both representation and discriminability. By separating the difference characteristics among different data sources, for a given optical/SAR ship image, the SAR/optical image corresponding to the given optical/SAR ship image can be estimated by using the method, and cross-source generation is realized. Meanwhile, the method can extract unique ship features with discriminability from each data source, can also mine ship features with discriminability shared by all data sources, and enables the discriminability of the ship features to be more sufficient through fusion, so that the ship image can be accurately identified.

Claims

1. A cross-source ship feature fusion learning and identification method based on hierarchical variation self-coding is characterized by comprising the following steps:

extracting ship inter-class difference characteristics and data source inter-difference characteristics in the image to be identified by using a first encoder in a trained hierarchical variational self-encoding network;

analyzing the difference characteristics among the ship classes and the difference characteristics among the data sources by using a second encoder in the trained hierarchical variational self-coding network, and determining the class of a target ship in the image to be recognized and the data source class of the image to be recognized;

the hierarchical variational self-coding network is obtained by using a plurality of groups of data through machine learning training, wherein each group of data in the plurality of groups of data comprises a ship image, a category label representing the ship image and a label of a ship image data source.

2. The cross-source ship feature fusion learning and identification method based on hierarchical variation self-coding as claimed in claim 1, characterized in that the hierarchical variation self-coding network is a two-layer variation self-coding network;

extracting different hidden features in the image to be identified from a first encoder in an encoding network by using a first layer variation;

splicing and combining the different hidden features to obtain combined hidden features;

and analyzing the combined hidden variable by using a second encoder in a second-layer variation self-coding network to obtain the category and the data source category of the target ship in the image to be identified.

3. The cross-source ship feature fusion learning and identification method based on hierarchical variation self-coding as claimed in claim 1, characterized in that, the first layer variation self-coding network is composed of a first encoder and a first decoder;

the first decoder comprises 1 deconvolution network, and the deconvolution network is used for reconstructing the image to be identified after splicing and combining different hidden features obtained by the encoder to generate a reconstructed image;

and the reconstructed image is used for comparing with an image to be identified so as to optimize the network parameters of the first-layer variation self-coding network.

4. The cross-source ship feature fusion learning and identification method based on hierarchical variational self-coding according to claim 3, characterized in that the second layer variational self-coding network is composed of a second encoder and a second decoder;

the second encoder comprises 2 fully-connected networks arranged in parallel; the 2 full-connection networks are used for generating the category of a target ship in the image to be identified and the data source category of the image to be identified according to the combined hidden features;

and the reconstructed hidden features are used for comparing with the hidden features before reconstruction so as to optimize the network parameters of the second-layer variation self-coding network.

5. The cross-source ship feature fusion learning and identification method based on hierarchical variation self-coding according to claim 2, 3 or 4, wherein the loss function l of the hierarchical variation self-coding network is:

wherein ,

as a function of the reconstruction error of the input image, X being the image in the training image set X, c₁₂₃₄Is denoted by "c₁,c₂,c₃,c₄”，q(c₁₂₃₄|x)、p(x|c₁₂₃₄)、q(c₁|x)、p(c₁)、q(c₂|x)、p(c₂|l)、q(c₃|x)、p(c₃|d)、q(c₄|x)、p(c₄|d,l)、q(c₂₃₄| x) obey a Gaussian distribution, α and β are weights, c₁、c₂、c₃、c₄Respectively different hidden features, c₂₃₄Is denoted by "c₂,c₃,c₄", d is the image source category label, l is the category label of the target ship in the image, γ is the weight, θ is the weight, q (d | c)₃₄)、p(d)、q(l|c₂₄)、p(l)、q(d,l|c₂₃₄) All obey the Concrete distribution, c₃₄Is denoted by "c₃,c₄”，c₂₄Is denoted by "c₂,c₄”。