WO2019238126A1 - Image segmentation and segmentation network training method and apparatus, device, medium, and product - Google Patents

Image segmentation and segmentation network training method and apparatus, device, medium, and product

Info

Publication number
WO2019238126A1
Authority
WO
WIPO (PCT)
Prior art keywords
processing
image
feature information
sample image
land
Prior art date
Application number
PCT/CN2019/091328
Other languages
English (en)
French (fr)
Inventor
田超
李聪
石建萍
Original Assignee
北京市商汤科技开发有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京市商汤科技开发有限公司 filed Critical 北京市商汤科技开发有限公司
Priority to JP2020569112A priority Critical patent/JP7045490B2/ja
Priority to SG11202012531TA priority patent/SG11202012531TA/en
Publication of WO2019238126A1 publication Critical patent/WO2019238126A1/zh
Priority to US17/121,670 priority patent/US20210097325A1/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/10 Terrestrial scenes
    • G06V20/182 Network patterns, e.g. roads or rivers
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/10 Segmentation; Edge detection
    • G06T7/11 Region-based segmentation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G06F18/253 Fusion techniques of extracted features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/26 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V10/267 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774 Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/80 Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V10/806 Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20084 Artificial neural networks [ANN]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20112 Image segmentation details
    • G06T2207/20132 Image cropping

Definitions

  • The present application relates to image processing technologies, and in particular, to an image segmentation and segmentation network training method and apparatus, device, medium, and product.
  • Remote sensing images have begun to be applied in various fields. Because satellite remote sensing images cover large scenes, lack clear boundaries, and lack accurate structural information, they differ from traditional image segmentation scenarios; segmentation with traditional neural networks is therefore difficult, and the results are poor and hard to generalize.
  • The embodiments of the present application are intended to provide an image segmentation and segmentation network training method and apparatus, device, medium, and product.
  • An image segmentation method is provided, including: performing feature extraction processing on an image through multiple processing blocks to obtain image feature information output by each processing block in the multiple processing blocks; performing at least two levels of fusion processing on the image feature information output by at least two pairs of adjacent processing blocks of the multiple processing blocks to obtain target image feature information; and determining a target object segmentation result of the image based on the target image feature information.
  • Optionally, performing at least two levels of fusion processing on the image feature information output by at least two pairs of adjacent processing blocks of the multiple processing blocks to obtain target image feature information includes: performing first-level fusion processing on the image feature information output by each pair of adjacent processing blocks to obtain first fusion feature information; performing second-level fusion processing on at least one pair of adjacent first fusion feature information to obtain at least one piece of second fusion feature information; and determining the target image feature information based on the at least one piece of second fusion feature information.
  • Optionally, determining the target image feature information based on the at least one piece of second fusion feature information includes: performing subsequent feature fusion processing on the at least one piece of second fusion feature information until the number of pieces of fusion feature information obtained by the subsequent fusion processing is one, and taking that single piece of fusion feature information as the target image feature information.
  • Optionally, during the fusion processing, the image feature information output by each pair of adjacent processing blocks is added element by element.
  • Optionally, the multiple processing blocks are sequentially connected; and/or, the image feature information output by each pair of adjacent processing blocks has the same size and the same number of channels.
  • Optionally, the processing block includes at least one processing unit, and each processing unit includes at least one feature extraction layer and a feature adjustment layer; performing feature extraction processing on an image through multiple processing blocks to obtain the image feature information output by each processing block in the multiple processing blocks includes: performing feature extraction processing on the input information of the processing unit through the at least one feature extraction layer in the processing unit to obtain first feature information; and performing adjustment processing on the first feature information through the feature adjustment layer in the processing unit to obtain the image feature information output by the processing unit.
  • Optionally, the method further includes: performing feature reduction processing on the image feature information output by a processing block M1 of the multiple processing blocks, and performing feature expansion processing on the image feature information output by a processing block M2 of the multiple processing blocks, where the input end of the processing block M2 is directly or indirectly connected to the output end of the processing block M1.
  • Optionally, performing feature extraction processing on the image based on the multiple processing blocks to obtain the image feature information output by each processing block in the multiple processing blocks includes: performing, by a processing block N1 of the multiple processing blocks, feature extraction processing on the input information of the processing block N1 to obtain first image feature information corresponding to the processing block N1.
  • Optionally, the input information of the processing block N1 includes the image and/or the image feature information output by at least one processing block located before the processing block N1.
  • Optionally, inputting the first image feature information to a next processing block of the processing block N1 to obtain second image feature information output by the next processing block includes: inputting the image and/or the image feature information output by at least one processing block N2, together with the first image feature information, to the next processing block of the processing block N1 for feature extraction processing to obtain the second image feature information output by the next processing block, where the input end of the processing block N1 is directly or indirectly connected to the output end of the processing block N2.
  • the method further includes: performing fusion processing on the image feature information output by the at least one processing block N2, and inputting the image feature information obtained by the fusion processing into a next processing block of the processing block N1.
  • Optionally, before performing feature extraction processing on the image based on the multiple processing blocks to obtain the image feature information output by each processing block in the multiple processing blocks, the method further includes: performing feature extraction processing on the image to obtain initial feature information of the image; and performing feature extraction processing on the image through multiple processing blocks includes: inputting the initial feature information of the image into the multiple processing blocks for feature extraction processing.
  • Optionally, the image is a remote sensing image, and the target object is land.
  • Optionally, the method is implemented using a segmentation neural network, and the image is a land sample image; the method further includes: processing a road sample image using the segmentation neural network to obtain a segmentation result of the road sample image, and adjusting parameters of the segmentation neural network based on a target object prediction result of the land sample image and the segmentation result of the road sample image.
  • Optionally, the target image feature information is obtained based on mixed feature information, where the mixed feature information is obtained by the segmentation neural network performing batch processing on the land sample image and the road sample image.
  • Optionally, adjusting the parameters of the segmentation neural network based on the target object prediction result of the land sample image and the road sample image segmentation result includes: obtaining a first loss based on the target object prediction result of the land sample image and the label information of the land sample image; obtaining a second loss based on the segmentation result of the road sample image and the label information of the road sample image; and adjusting the parameters of the segmentation neural network based on the first loss and the second loss.
  • Optionally, adjusting the parameters of the segmentation neural network based on the first loss and the second loss includes: weighting and summing the first loss and the second loss to obtain a total loss; and adjusting the parameters of the segmentation neural network based on the total loss.
  • Optionally, before performing the feature extraction processing on the image based on the multiple processing blocks to obtain the image feature information output by each processing block in the multiple processing blocks, the method further includes: performing, according to set parameters, at least one of the following enhancement processes on the sample image: adjusting the size of the sample image, rotating the angle of the sample image, or changing the brightness of the sample image; and performing feature extraction processing on the image based on the multiple processing blocks to obtain the image feature information output by each processing block includes: performing feature extraction processing on the at least one enhanced image based on the multiple processing blocks to obtain the image feature information output by each processing block in the multiple processing blocks.
  • Optionally, before performing feature extraction processing on the image based on the multiple processing blocks to obtain the image feature information output by each processing block in the multiple processing blocks, the method further includes: cropping the image to obtain at least one cropped image; and performing feature extraction processing on the image based on the multiple processing blocks includes: performing feature extraction processing on the at least one cropped image to obtain the image feature information output by each of the multiple processing blocks.
  • An image segmentation apparatus is provided, including: an image processing module configured to perform feature extraction processing on an image through multiple processing blocks to obtain image feature information output by each processing block of the multiple processing blocks; a fusion module configured to perform at least two levels of fusion processing on the image feature information output by at least two pairs of adjacent processing blocks of the multiple processing blocks to obtain target image feature information; and a segmentation module configured to determine a target object segmentation result of the image based on the target image feature information.
  • A training method for a land segmentation neural network is provided, including: inputting at least one land sample image and at least one road sample image into the land segmentation neural network to obtain a predicted segmentation result of the at least one land sample image and a predicted segmentation result of the at least one road sample image; and adjusting the parameters of the land segmentation neural network based on the predicted segmentation result of the at least one land sample image and the predicted segmentation result of the at least one road sample image.
  • Optionally, the land segmentation neural network includes a plurality of processing blocks, a fusion network, and a segmentation network that are sequentially connected; inputting at least one land sample image and at least one road sample image into the land segmentation neural network to obtain the predicted segmentation result of the at least one land sample image and the predicted segmentation result of the at least one road sample image includes: performing feature extraction processing on the at least one land sample image and the at least one road sample image based on the plurality of processing blocks to obtain sample image feature information output by each processing block in the plurality of processing blocks; performing, through the fusion network, at least two levels of fusion processing on the sample image feature information output by at least two pairs of adjacent processing blocks in the plurality of processing blocks to obtain target sample image feature information; and obtaining, through the segmentation network, the predicted segmentation result of the at least one land sample image and the predicted segmentation result of the at least one road sample image based on the target sample image feature information.
  • Optionally, obtaining the sample image feature information includes: processing each land sample image and each road sample image based on the plurality of processing blocks to obtain at least two sets of sample image feature information corresponding to each land sample image and at least two sets of sample image feature information corresponding to each road sample image.
  • Optionally, performing at least two levels of fusion processing on the sample image feature information output by at least two pairs of adjacent processing blocks of the plurality of processing blocks to obtain target sample image feature information includes: performing at least two levels of fusion on the at least two sets of sample image feature information corresponding to each land sample image to obtain land sample image feature information of each land sample image; and performing at least two levels of fusion on the at least two sets of sample image feature information corresponding to each road sample image to obtain road sample image feature information of each road sample image, where the target sample image feature information includes the land sample image feature information corresponding to the at least one land sample image and the road sample image feature information corresponding to the at least one road sample image.
  • Optionally, the land segmentation neural network further includes a slice layer; before determining the predicted segmentation result of the at least one land sample image and the predicted segmentation result of the at least one road sample image based on the target sample image feature information, the method further includes: separating, through the slice layer, the land sample image feature information and the road sample image feature information included in the target sample image feature information; inputting the land sample image feature information to the segmentation network for processing to obtain the predicted segmentation result of the land sample image; and inputting the road sample image feature information to the segmentation network for processing to obtain the predicted segmentation result of the road sample image. A minimal sketch of this batch-then-slice flow follows.
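  • As a minimal sketch (assuming PyTorch; the tensor shapes and variable names are illustrative, not from the patent), land and road samples can be concatenated along dimension 0 into one mixed batch and the mixed features separated again by the known input order:

```python
import torch

land = torch.randn(4, 3, 513, 513)   # 4 land sample images
road = torch.randn(2, 3, 513, 513)   # 2 road sample images

# Concat layer: one mixed batch, land samples first.
batch = torch.cat([land, road], dim=0)

# ... shared processing blocks and fusion network would run on `batch`;
# a placeholder stands in for the mixed target sample features here.
features = torch.randn(6, 48, 64, 64)

# Slice layer: split by the known input order (land first, then road).
land_feats = features[:land.size(0)]
road_feats = features[land.size(0):]
print(land_feats.shape, road_feats.shape)
```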
  • Optionally, the land sample image and the road sample image each have label information; adjusting the parameters of the land segmentation neural network based on the predicted segmentation result of the at least one land sample image and the predicted segmentation result of the at least one road sample image includes: obtaining a first loss based on the predicted segmentation result corresponding to the land sample image and the label information corresponding to the land sample image; obtaining a second loss based on the predicted segmentation result corresponding to the road sample image and the label information corresponding to the road sample image; and adjusting the parameters of the land segmentation neural network based on the first loss and the second loss.
  • Optionally, adjusting the parameters of the land segmentation neural network based on the first loss and the second loss includes: weighting and summing the first loss and the second loss to obtain a total loss; and adjusting the parameters of the land segmentation neural network based on the total loss.
  • A training apparatus for a land segmentation neural network is provided, including: a result prediction module configured to input at least one land sample image and at least one road sample image into the land segmentation neural network to obtain a predicted segmentation result of the at least one land sample image and a predicted segmentation result of the at least one road sample image; and a parameter adjustment module configured to adjust the parameters of the land segmentation neural network based on the predicted segmentation result of the at least one land sample image and the predicted segmentation result of the at least one road sample image.
  • An electronic device is provided, including: a memory configured to store executable instructions; and a processor configured to communicate with the memory to execute the executable instructions so as to complete the image segmentation method or the training method for a land segmentation neural network described in any one of the above.
  • A computer-readable storage medium is provided, configured to store computer-readable instructions; when the instructions are executed, the image segmentation method or the training method for a land segmentation neural network described in any one of the foregoing is performed.
  • A computer program product is provided, including computer-readable code; when the computer-readable code runs on a device, a processor in the device executes instructions configured to implement the image segmentation method or the training method for a land segmentation neural network according to any one of the above.
  • Another computer program product is provided, configured to store computer-readable instructions that, when executed, cause a computer to perform the operations of the image segmentation method described in any one of the foregoing possible implementations, or the operations of the training method for a land segmentation neural network described in any one of the foregoing possible implementations.
  • the computer program product is specifically a computer storage medium.
  • the computer program product is specifically a software product, such as an SDK or the like.
  • Another image segmentation and land segmentation neural network training method and apparatus, electronic device, computer storage medium, and computer program product are also provided, wherein the image is subjected to feature extraction processing through multiple processing blocks to obtain the image feature information output by each processing block in the multiple processing blocks; at least two levels of fusion are performed on the image feature information output by at least two pairs of adjacent processing blocks in the multiple processing blocks to obtain target image feature information; and the target object segmentation result of the image is determined based on the target image feature information.
  • In the embodiments, feature extraction processing is performed on the image through multiple processing blocks, and the target object segmentation result is obtained through at least two levels of fusion of adjacent image feature information; more information is thereby obtained, which is conducive to more accurate segmentation of the target object in the image.
  • FIG. 1 is a schematic flowchart of an image segmentation method according to an embodiment of the present application
  • FIG. 2 is an exemplary structural diagram of a processing block in an image segmentation method according to an embodiment of the present application
  • FIG. 3 is an exemplary schematic structural diagram of a segmentation neural network in a training process in an image segmentation method according to an embodiment of the present application
  • FIG. 4 is an example diagram comparing the segmentation effect of the embodiment of the present application with FC-DenseNet;
  • FIG. 5 is an example diagram comparing the structural segmentation effect of the embodiment of the present application with FC-DenseNet and ClassmateNet;
  • FIG. 6 is a schematic structural diagram of an image segmentation apparatus according to an embodiment of the present application.
  • FIG. 7 is an exemplary flowchart of a training method for a land segmentation neural network according to an embodiment of the present application.
  • FIG. 8 is a schematic structural diagram of a training device for a land segmentation neural network according to an embodiment of the present application.
  • FIG. 9 is a schematic structural diagram of an example of an electronic device suitable for implementing the embodiments of the present application.
  • FIG. 1 is a schematic flowchart of an image segmentation method according to an embodiment of the present application. As shown in FIG. 1, the method includes:
  • Step 110 Perform feature extraction processing on the image through multiple processing blocks to obtain image feature information output by each processing block in the multiple processing blocks.
  • the processing block includes at least one processing unit.
  • multiple processing blocks may be connected sequentially, and multiple processing blocks may be located at different depths.
  • the output end of any processing block in the multiple processing blocks may be connected to the input end of its next processing block.
  • Feature extraction processing can be performed on the images in sequence using multiple processing blocks.
  • the first processing block of a plurality of processing blocks may perform feature extraction processing on an input image to obtain image feature information output by the first processing block.
  • The second processing block may perform feature extraction processing on the input image feature information to obtain image feature information output by the second processing block, where the image feature information input to the second processing block may include the image feature information output by the first processing block and may further include the image itself, and so on; in this way, the image feature information output by each processing block in the multiple processing blocks is obtained.
  • the processing block N1 of the multiple processing blocks is used to perform feature extraction processing on the input information of the processing block N1 to obtain first image feature information corresponding to the processing block N1.
  • N1 is an integer greater than or equal to 1.
  • the first image feature information is input to the next processing block of the processing block N1 for feature extraction processing to obtain the second image feature information output by the next processing block.
  • the processing block N1 may be the first processing block among multiple processing blocks.
  • In this case, the input information of the processing block N1 may be the above-mentioned image or the initial feature information of the image. Alternatively, the processing block N1 may be the second or a later processing block among the multiple processing blocks; the input information of the processing block N1 may then include the image feature information output by the previous processing block, and may further include the image feature information output by any processing block located before the previous processing block.
  • In some optional embodiments, the input information of the processing block N1 may include the image and/or the image feature information output by one or more processing blocks located before the processing block N1. Because the input information of the processing block includes image feature information of different depths, the image feature information output by the processing block may contain more image information.
  • The image feature information obtained by an earlier processing block contains more shallow-layer information; combined with the image feature information output by a later processing block, both the shallow and the deep information in the image can be obtained.
  • Optionally, inputting the first image feature information to the next processing block of the processing block N1 for processing to obtain the second image feature information output by the next processing block includes: inputting the image and/or the image feature information output by at least one processing block N2, together with the first image feature information, to the next processing block of the processing block N1 for feature extraction processing to obtain the second image feature information output by the next processing block, where the input end of the processing block N1 is directly or indirectly connected to the output end of the processing block N2; that is, in the network structure, the processing block N1 is located after the processing block N2.
  • Optionally, the input of the next processing block of the processing block N1 may be only the image feature information output by the processing block N1. For example, if the processing block N1 is the third processing block and its next processing block is the fourth processing block, the input of the fourth processing block is the image feature information output by the third processing block.
  • Optionally, the input of the next processing block of the processing block N1 includes the image feature information output by the processing block N1 and the image feature information output by at least one processing block N2. For example, if the processing block N1 is the third processing block, its next processing block is the fourth processing block, and the at least one processing block N2 includes the first processing block and/or the second processing block, then the input of the fourth processing block includes the image feature information output by the third processing block together with the image feature information output by the first processing block and/or the second processing block.
  • Optionally, the input of the next processing block of the processing block N1 includes the image feature information output by the processing block N1 and the image; or it includes the image feature information output by the processing block N1, the image, and the image feature information output by at least one processing block N2.
  • When the input of the next processing block of the processing block N1 includes the image feature information output by the processing block N1 and by at least one processing block N2, fusion processing may first be performed on the image feature information output by the at least one processing block N2 and some or all of the image feature information output by the processing block N1, and the image feature information obtained by the fusion processing is then input to the next processing block of the processing block N1.
  • In other words, when the image feature information output by at least two processing blocks needs to be input into one processing block, the feature information can first be fused and the fusion result fed to that processing block. The specific fusion method may be element-wise addition, channel-wise concatenation, or another approach.
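  • As a minimal sketch of the two fusion options just named (assuming PyTorch; the function names are illustrative, not from the patent):

```python
import torch

def fuse_add(a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
    # Element-wise addition: requires identical size and channel count.
    return a + b

def fuse_concat(a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
    # Channel-wise concatenation: requires identical spatial size only;
    # the result's channel count is the sum of the inputs' channel counts.
    return torch.cat([a, b], dim=1)

a = torch.randn(1, 16, 64, 64)
b = torch.randn(1, 16, 64, 64)
print(fuse_add(a, b).shape)     # torch.Size([1, 16, 64, 64])
print(fuse_concat(a, b).shape)  # torch.Size([1, 32, 64, 64])
```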
  • Before the multiple processing blocks, one or more convolutional layers may also be used to perform feature extraction processing on the image to obtain initial feature information of the image; accordingly, the initial feature information of the image may be input into the multiple processing blocks to sequentially perform feature extraction processing, which is not limited in this embodiment of the present application.
  • the input of the next processing block of the processing block N1 may further include initial feature information of the image.
  • For example, the initial feature information of the image may be fused with the image feature information output by the processing block N1; or, assuming that the input of the next processing block of the processing block N1 includes the image feature information output by the processing block N1, the image, and the image feature information output by at least one processing block N2, the initial feature information of the image may be fused with the image feature information output by the processing block N1, the image, and the image feature information output by the at least one processing block N2, and so on, which is not limited in the embodiment of the present application.
  • Step 120 Perform at least two-level fusion processing on the image feature information output by at least two pairs of neighboring processing blocks of the plurality of processing blocks to obtain target image feature information.
  • Optionally, first-level fusion processing is performed on the image feature information output by each pair of adjacent processing blocks to obtain first fusion feature information; second-level fusion processing is performed on at least one pair of adjacent first fusion feature information to obtain at least one piece of second fusion feature information; and the target image feature information is determined based on the at least one piece of second fusion feature information.
  • In the embodiments, a plurality of processing blocks may be divided into multiple pairs of adjacent processing blocks, and each pair includes two adjacent processing blocks (that is, two directly connected processing blocks); optionally, different pairs contain different processing blocks, that is, different pairs may not include the same processing block.
  • For example, the first processing block and the second processing block form the first pair of adjacent processing blocks, the third processing block and the fourth processing block form the second pair of adjacent processing blocks, and so on.
  • In the embodiments, the image feature information output by each pair of adjacent processing blocks can be fused (for example, added element by element) to achieve pairwise fusion; one piece of first fusion feature information is obtained after fusing the image feature information of each pair of adjacent processing blocks.
  • When there are two or more pieces of first fusion feature information, the first fusion feature information (for example, two pieces, or some or all of more than two pieces) may be subjected to second-level fusion processing to obtain one piece of second fusion feature information, which is used as the target image feature information; or, the multiple pieces of first fusion feature information are fused pairwise between neighbors to obtain multiple pieces of second fusion feature information, and subsequent feature fusion processing is then performed until the number of pieces of subsequent fusion feature information is one, with that single piece used as the target image feature information. The subsequent fusion processing may be a pairwise fusion of the second fusion feature information (for example, adding two pieces element by element); while the number of pieces obtained is still more than one, the pairwise fusion continues. For example, with 8 processing blocks, four pieces of first fusion feature information are obtained through first-level fusion, two pieces of second fusion feature information through second-level fusion, and one piece of subsequent fusion feature information after third-level fusion; that piece is used as the target image feature information.
  • A dense fusion structure is thus proposed in the embodiment of the present application: layers of different depths are fused in pairs by element-wise summation and merged recursively down to the last layer. The dense fusion structure better enables the network to obtain both deep and shallow information, which is conducive to accurate segmentation of details.
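  • A hedged sketch of this recursive pairwise fusion (assuming PyTorch, an even number of inputs as in the 8-block example, and feature maps already adjusted to a common size and channel count; `dense_fuse` is an illustrative name):

```python
import torch

def dense_fuse(features: list) -> torch.Tensor:
    # Fuse adjacent pairs by element-wise summation, recursively,
    # until a single target feature map remains (8 -> 4 -> 2 -> 1).
    while len(features) > 1:
        features = [features[i] + features[i + 1]
                    for i in range(0, len(features) - 1, 2)]
    return features[0]

outputs = [torch.randn(1, 16, 32, 32) for _ in range(8)]
target = dense_fuse(outputs)
print(target.shape)  # torch.Size([1, 16, 32, 32])
```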
  • In other optional embodiments, each level of fusion may also be performed by taking three or more adjacent processing blocks as a unit; this is not limited here.
  • Step 130 Determine a target object segmentation result of the image based on the target image feature information.
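  • The text does not fix a particular form for this step; as one hedged sketch (assuming PyTorch, a (N, C, H, W) target feature tensor, and an illustrative `num_classes`), a 1×1 convolution can map the fused features to per-pixel class scores, followed by upsampling and argmax:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SegmentationHead(nn.Module):
    def __init__(self, in_channels: int, num_classes: int):
        super().__init__()
        # 1x1 convolution maps fused features to per-class scores.
        self.classifier = nn.Conv2d(in_channels, num_classes, kernel_size=1)

    def forward(self, target_features, out_size):
        scores = self.classifier(target_features)
        # Restore the input resolution before the per-pixel decision.
        scores = F.interpolate(scores, size=out_size,
                               mode='bilinear', align_corners=False)
        return scores.argmax(dim=1)  # per-pixel class labels

head = SegmentationHead(in_channels=16, num_classes=5)
labels = head(torch.randn(1, 16, 32, 32), out_size=(256, 256))
print(labels.shape)  # torch.Size([1, 256, 256])
```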
  • In the embodiments, feature extraction processing is performed on an image through multiple processing blocks to obtain the image feature information output by each processing block; the image feature information output by at least two pairs of adjacent processing blocks is subjected to at least two levels of fusion processing to obtain target image feature information; and the target object segmentation result of the image is determined based on the target image feature information. Through at least two levels of fusion of adjacent feature information, more information is obtained, which is conducive to more accurate segmentation of the target object in the image.
  • In the embodiments, the feature information may be a third-order tensor, for example, comprising multiple two-dimensional matrices, or a feature map having at least one channel, where each channel corresponds to a two-dimensional matrix; this is not limited in this embodiment of the present application.
  • In some optional embodiments, the processing block may include one or more processing units, and each processing unit may perform feature extraction processing on the input information of the processing block. Each processing unit may include one or more convolutional layers, or may further include other layers, such as one or any combination of a Batch Normalization (BN) layer, an activation layer, and the like. The processing block may further include other units located after the processing unit, such as any one or a combination of a resolution reduction layer, a feature scaling layer, a BN layer, and an activation layer.
  • Optionally, the processing unit includes at least one feature extraction layer and a feature adjustment layer; correspondingly, Step 110 may include: performing feature extraction processing on the input information of the processing unit through the at least one feature extraction layer in the processing unit to obtain first feature information; and performing adjustment processing on the first feature information through the feature adjustment layer in the processing unit to obtain the image feature information output by the processing unit.
  • Optionally, the image feature information output by each pair of adjacent processing blocks has the same size and the same number of channels. This can be implemented by adding to the processing unit a feature adjustment layer configured to adjust the size and the number of channels of the feature information; the feature adjustment layer may be provided inside the processing unit or separately, and the embodiment of the present application does not limit its position.
  • In the embodiments, each processing unit may include at least one feature extraction layer (such as a convolution layer, a normalization layer BN, and an activation layer ReLU) and a feature adjustment layer (such as a convolution layer, a normalization layer BN, and an activation layer ReLU).
  • FIG. 2 is an exemplary structural diagram of a processing block in an image segmentation method according to an embodiment of the present application.
  • As shown in FIG. 2, the processing block includes multiple processing units (Layer); each processing unit includes three convolutional layers, and each convolutional layer is followed by a batch normalization (BN) layer and an activation layer (ReLU). The feature maps output by the first two convolutional layers are input to the next processing unit, while the convolutional layer branching to the side serves as the feature adjustment layer: it adjusts the size and channel count of the feature map output by the second convolutional layer so that the output feature information (such as a feature map) matches the size and number of channels of the feature information output by the other processing units, preparing it for feature fusion.
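  • A sketch of one such processing unit (assuming PyTorch and 3×3 kernels for the two main convolutions, which the text does not state; the 64/16 channel counts follow the FIG. 3 description further below, and all class and variable names are illustrative):

```python
import torch
import torch.nn as nn

def conv_bn_relu(cin, cout, k=3):
    # Convolution followed by BN and ReLU, as FIG. 2 describes.
    return nn.Sequential(
        nn.Conv2d(cin, cout, k, padding=k // 2),
        nn.BatchNorm2d(cout),
        nn.ReLU(inplace=True),
    )

class LayerUnit(nn.Module):
    """One processing unit: conv_x1 -> conv_x2 feed the next unit,
    while the side branch conv_f adjusts size/channels for fusion."""
    def __init__(self, cin, fuse_channels):
        super().__init__()
        self.conv_x1 = conv_bn_relu(cin, 64)
        self.conv_x2 = conv_bn_relu(64, 16)
        # Feature adjustment layer: 1x1 conv to the common fusion width.
        self.conv_f = conv_bn_relu(16, fuse_channels, k=1)

    def forward(self, x):
        x = self.conv_x2(self.conv_x1(x))
        return x, self.conv_f(x)  # (to next unit, to feature fusion)

unit = LayerUnit(cin=16, fuse_channels=48)
nxt, fuse = unit(torch.randn(1, 16, 64, 64))
print(nxt.shape, fuse.shape)
```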
  • In some optional embodiments, the method may further include: performing feature reduction processing on the image feature information output by a processing block M1 of the plurality of processing blocks, and performing feature expansion processing on the image feature information output by a processing block M2 of the plurality of processing blocks, where the input end of the processing block M2 is directly or indirectly connected to the output end of the processing block M1, or the image feature information output by the processing block M2 is obtained at least in part based on the image feature information output by the processing block M1.
  • The image feature information obtained by an earlier (shallower) processing block passes through fewer processing layers and therefore carries less processed image information, while the image feature information obtained by a later (deeper) processing block passes through more processing layers and carries more. Optionally, when pairwise fusion is used: if the image feature information corresponding to a pair of adjacent processing blocks is shallow features, feature reduction processing (for example, downsampling) is performed on the image feature information output by the lower (later) of the two processing blocks; if the image feature information corresponding to a pair of adjacent processing blocks is deep features, feature expansion processing (for example, interpolation processing, which may be bilinear interpolation) is performed on the image feature information output by the higher (earlier) of the two processing blocks.
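  • A minimal sketch of the two alignment operations (assuming PyTorch; the 2×2 average pooling and 2× bilinear upsampling match the FIG. 3 description further below, while the function names are illustrative):

```python
import torch
import torch.nn.functional as F

def reduce_features(x):
    # Feature reduction: 2x2 average pooling halves the spatial size.
    return F.avg_pool2d(x, kernel_size=2, stride=2)

def expand_features(x):
    # Feature expansion: bilinear interpolation doubles the spatial size.
    return F.interpolate(x, scale_factor=2,
                         mode='bilinear', align_corners=False)

shallow = torch.randn(1, 16, 64, 64)  # earlier block, larger feature map
deep = torch.randn(1, 16, 32, 32)     # later block, smaller feature map
print((reduce_features(shallow) + deep).shape)  # fuse at the small size
print((shallow + expand_features(deep)).shape)  # fuse at the large size
```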
  • The image processed in this embodiment of the present application may be a remote sensing image, in which case the target object is land; that is, the method of the above embodiment of the present application is used to implement land segmentation from remote sensing images, for example, dividing the land in remote sensing images into forest, grassland, city, and cultivated land.
  • Scenarios in which the image segmentation method provided in the foregoing embodiments of the present application can be applied include, but are not limited to, land planning, land use monitoring, and land status survey.
  • the image segmentation method in the embodiment of the present application is implemented by using a segmentation neural network, and the image is a land sample image.
  • the image segmentation method in the embodiment of the present application further includes: training a segmentation neural network based on the target object segmentation result of the sample image and the label information of the sample image.
  • Optionally, the sample image is a land sample image; the method of the embodiment of the present disclosure further includes: processing the road sample image using the segmentation neural network to obtain a segmentation result of the road sample image, and adjusting the parameters of the segmentation neural network based on the target object prediction result of the land sample image and the segmentation result of the road sample image.
  • When segmenting land images such as remote sensing images, structural information at the intermediate level is missing, yet such structural information is important for assisting image segmentation and classification. For example, in land cover type classification, remote sensing image analysis covers large scenes, and the scenes are constrained and affected by resolution; at the same time, annotation noise greatly affects image segmentation. Therefore, how to effectively and accurately obtain the structural information of land images is key to solving the segmentation problem.
  • The segmentation neural network proposed in the embodiment of the present application introduces road data for training, which makes up for the missing structural information of land images and improves detail.
  • The segmentation neural network may be, for example, a Dense Fusion Classmate Network (DFCNet).
  • Optionally, the target image feature information is obtained based on mixed feature information, where the mixed feature information is obtained by the segmentation neural network performing batch processing on the land sample image and the road sample image.
  • The embodiments of the present disclosure use a slice operation to distinguish the target sample image feature information corresponding to the land sample image from the target sample image feature information corresponding to the road image; the distinction can be made according to the order in which the land sample images and road images were input.
  • Optionally, adjusting the parameters of the segmentation neural network based on the target object prediction result of the land sample image and the segmentation result of the road sample image includes: obtaining a first loss based on the target object prediction result of the land sample image and the label information of the land sample image; obtaining a second loss based on the segmentation result of the road sample image and the label information of the road sample image; and adjusting the parameters of the segmentation neural network based on the first loss and the second loss.
  • the first loss and the second loss are weighted and summed to obtain the total loss; based on the total loss, the parameters of the segmentation neural network are adjusted.
  • the parameters of the segmentation neural network are adjusted by weighted summing of the first loss and the second loss, and the weight value of the weighted summation can be set in advance or obtained through experiments or multiple trainings.
  • Optionally, the weight value of the first loss is greater than the weight value of the second loss; for example, the ratio of the weight value of the first loss to the weight value of the second loss is 8:7. The specific weight values are not limited in the embodiment of the present application.
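  • A sketch of the weighted total loss with the 8:7 example above (assuming PyTorch and per-pixel cross-entropy for both branches; the text does not name the loss function, so cross-entropy is an assumption, and all shapes are illustrative):

```python
import torch
import torch.nn as nn

criterion = nn.CrossEntropyLoss()  # assumed; the text does not fix a loss

def total_loss(land_logits, land_labels, road_logits, road_labels,
               w_land=8.0, w_road=7.0):
    loss_land = criterion(land_logits, land_labels)  # first loss
    loss_road = criterion(road_logits, road_labels)  # second loss
    # Weighted summation with the example 8:7 weights.
    return w_land * loss_land + w_road * loss_road

land_logits = torch.randn(2, 5, 64, 64)           # 5 land classes
land_labels = torch.randint(0, 5, (2, 64, 64))
road_logits = torch.randn(2, 2, 64, 64)           # road vs. background
road_labels = torch.randint(0, 2, (2, 64, 64))
print(total_loss(land_logits, land_labels, road_logits, road_labels))
```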
  • the road data is used to make up for the missing information of the structure of land classification, which improves the accuracy of the segmentation neural network for the land segmentation task.
  • Road data is easily available and easy to standardize; after adding road data for segmentation, the efficiency and accuracy of land cover classification can be improved, and the handling of details is more refined.
  • In some optional embodiments, the method may further include: performing, according to set parameters, at least one of the following enhancement processes on the sample image: adjusting the size of the sample image, rotating the angle of the sample image, or changing the brightness of the sample image.
  • Step 110 may include: performing feature extraction processing on at least one enhanced processing image based on a plurality of processing blocks to obtain image feature information output by each processing block in the plurality of processing blocks.
  • the embodiments of the present disclosure implement data enhancement processing.
  • more sample images can be obtained, or the display effect of the sample images can be improved to achieve better training effects.
  • the crop size of the network training data is 513x513
  • the random resize value range for road data images is [0.5,1.5]
  • the random resize value range for land classification images is [0.8,1.25].
  • the random rotation range is [-180,180]
  • the brightness jitter parameter is 0.3.
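  • A sketch of these enhancement settings (assuming PyTorch tensors and torchvision's functional transforms; the resize ranges and their pairing with road vs. land data follow the text, while the helper itself and the reading of the 0.3 brightness jitter as a factor in [0.7, 1.3] are assumptions):

```python
import random
import torchvision.transforms.functional as TF

def augment(img, is_road: bool):
    # img: float tensor of shape (C, H, W); the paired label mask would be
    # transformed identically, using nearest-neighbour interpolation.
    lo, hi = (0.5, 1.5) if is_road else (0.8, 1.25)  # random resize range
    scale = random.uniform(lo, hi)
    h, w = img.shape[-2:]
    img = TF.resize(img, [int(h * scale), int(w * scale)])

    img = TF.rotate(img, random.uniform(-180.0, 180.0))  # random rotation

    # Brightness jitter parameter 0.3, read here as a factor in [0.7, 1.3].
    img = TF.adjust_brightness(img, random.uniform(0.7, 1.3))

    # Random 513x513 crop (assumes the resized image is large enough).
    top = random.randint(0, img.shape[-2] - 513)
    left = random.randint(0, img.shape[-1] - 513)
    return TF.crop(img, top, left, 513, 513)
```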
  • the method may further include: cropping the image based on a cropping frame of a set size to obtain at least one cropped image;
  • Step 110 may include: performing feature extraction processing on the cropped image based on a plurality of processing blocks to obtain image feature information output by each processing block in the plurality of processing blocks.
  • the embodiments of the present disclosure implement data preprocessing.
  • A land data sample is obtained by cropping. Increasing the cropping size of the training data helps the network extract more scene information, thereby improving the segmentation effect.
  • A dense fusion structure is proposed in the embodiment of the present disclosure: layers of different depths are fused in pairs by element-wise summation, recursively down to the last layer. The dense fusion structure better enables the network to obtain both deep and shallow information, which is conducive to accurate segmentation of details; at the same time, the fusion allows backpropagation to reach the shallower layers better and faster, which is conducive to better supervision of the network.
  • FIG. 3 is an exemplary schematic structural diagram of a segmentation neural network in a training process in an image segmentation method according to an embodiment of the present application.
  • In training, the road data and the land sample data are combined through the concat layer in the 0th dimension (the batch dimension).
  • the structure of the entire segmented neural network (DFCNet) is shown in Figure 3.
  • Conv_TD indicates a downsampling operation; (128, 1×1, 0, 1) indicates that the number of convolution channels is 128, the convolution kernel size is 1×1, the padding is 0, and the stride is 1.
  • Pooling1, 2, 3, and 4 are pooling layers; the strategy is average pooling with a 2×2 pooling window.
  • Interp5, 6, 7, and 8 are upsampling operations; the features are doubled in resolution by bilinear interpolation.
  • Each Dense Block contains several processing units (Layer Unit); each processing unit contains two convolutional layers, conv_x1/conv_x2 (as shown in FIG. 2), each followed by BN and ReLU layers.
  • The number of conv_x1 convolution kernels is 64, and the number of conv_x2 convolution kernels is 16.
  • conv_x2 is followed by a conv_f convolution layer to unify the features for feature fusion.
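  • Pulling the FIG. 3 pieces together, a hedged sketch of the overall wiring: simplified dense blocks separated by 2×2 average pooling (Pooling1-4), bilinear upsampling (Interp5-8) to bring the block outputs back to a common size, recursive pairwise element-wise fusion, and a small classification head. The block depth, channel counts, and head are illustrative except where the text states them:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DenseBlock(nn.Module):
    # Simplified stand-in for a Dense Block of Layer units; the side
    # conv_f branch of FIG. 2 is folded into a single conv here.
    def __init__(self, cin, cout):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(cin, cout, 3, padding=1),
            nn.BatchNorm2d(cout), nn.ReLU(inplace=True))

    def forward(self, x):
        return self.body(x)

class MiniDFC(nn.Module):
    def __init__(self, c=48, num_classes=5):
        super().__init__()
        self.blocks = nn.ModuleList(
            [DenseBlock(3 if i == 0 else c, c) for i in range(4)])
        self.head = nn.Conv2d(c, num_classes, 1)  # illustrative 1x1 head

    def forward(self, x):
        outs, size = [], x.shape[-2:]
        for block in self.blocks:
            x = block(x)
            outs.append(x)
            x = F.avg_pool2d(x, 2)      # Pooling1-4: 2x2 average pooling
        # Interp: bilinear upsampling so all outputs share one size.
        outs = [F.interpolate(o, size=outs[0].shape[-2:], mode='bilinear',
                              align_corners=False) for o in outs]
        while len(outs) > 1:            # recursive pairwise dense fusion
            outs = [outs[i] + outs[i + 1]
                    for i in range(0, len(outs) - 1, 2)]
        return F.interpolate(self.head(outs[0]), size=size,
                             mode='bilinear', align_corners=False)

net = MiniDFC()
print(net(torch.randn(1, 3, 128, 128)).shape)  # torch.Size([1, 5, 128, 128])
```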
  • FIG. 4 is an example diagram comparing the segmentation effect of the embodiment of the present application with FC-DenseNet. As shown in FIG. 4, (a) shows the result of traditional FC-DenseNet segmentation, and (b) shows the result of segmentation in the embodiment of the present application. With road data added, DFCNet captures better structural information in its features, which better assists segmentation of cities, arable land, and grassland.
  • FIG. 5 is an example diagram comparing the segmentation effect of the FC-DenseNet and ClassmateNet structures in the embodiment of the present application.
  • (a) shows the result of FC-DenseNet structure segmentation
  • (b) shows the result of ClassmateNet structure segmentation
  • (c) shows the result of DFCNet structure segmentation implemented in this application.
  • ClassmateNet, which does not use the dense fusion structure, already has better segmentation performance than the classic FC-DenseNet. Compared with ClassmateNet without the dense fusion structure, DFCNet, which adds the Dense Fusion structure, can further improve the details.
  • The foregoing program may be stored in a computer-readable storage medium; when the program is executed, the steps of the foregoing method embodiments are performed. The foregoing storage medium includes media that can store program code, such as a ROM, a RAM, a magnetic disk, or an optical disk.
  • FIG. 6 is a schematic structural diagram of an image segmentation apparatus according to an embodiment of the present application.
  • the device may be used to implement the foregoing method embodiments of the present application. As shown in Figure 6, the device includes:
  • the image processing module 61 is configured to perform feature extraction processing on an image through a plurality of processing blocks to obtain image feature information output by each processing block in the plurality of processing blocks.
  • the fusion module 62 is configured to perform at least two-level fusion processing on image feature information output by at least two pairs of neighboring processing blocks of the plurality of processing blocks to obtain target image feature information.
  • the segmentation module 63 is configured to determine a target object segmentation result of the image based on the target image feature information.
  • the processing block includes at least one processing unit.
  • multiple processing blocks may be connected sequentially.
  • multiple processing blocks may be located at different depths respectively.
  • The output end of any processing block in the multiple processing blocks may be connected to the input end of its next processing block.
  • The fusion module 62 is specifically configured to: perform first-level fusion processing on the image feature information output by each pair of adjacent processing blocks to obtain first fusion feature information; perform second-level fusion processing on at least one pair of adjacent first fusion feature information to obtain at least one piece of second fusion feature information; and determine the target image feature information based on the at least one piece of second fusion feature information.
  • In the embodiments, a plurality of processing blocks are divided into multiple pairs of adjacent processing blocks, and each pair includes two adjacent processing blocks (that is, two directly connected processing blocks); optionally, different pairs contain different processing blocks, that is, different pairs may not include the same processing block.
  • For example, the first processing block and the second processing block constitute the first pair of adjacent processing blocks, the third processing block and the fourth processing block form the second pair of adjacent processing blocks, and so on.
  • Optionally, the fusion module 62 is specifically configured to perform subsequent feature fusion processing on the at least one piece of second fusion feature information until the number of pieces of fusion feature information obtained by the subsequent fusion processing is one, and to use that single piece of subsequent fusion feature information as the target image feature information.
  • the fusion module 62 is specifically configured to add the image feature information output by each pair of adjacent processing blocks element by element during the fusion processing of the image feature information output by each pair of adjacent processing blocks.
  • A dense fusion structure is proposed in the embodiment of the present application: layers of different depths are fused in pairs by element-wise summation and recursively fused down to the last layer. The dense fusion structure better enables the network to obtain both deep and shallow information, which is conducive to accurate segmentation of details.
  • In the apparatus embodiments, feature extraction processing is performed on an image through multiple processing blocks to obtain the image feature information output by each processing block; the image feature information output by at least two pairs of adjacent processing blocks is subjected to at least two levels of fusion processing to obtain target image feature information; and the target object segmentation result of the image is determined based on the target image feature information. Through at least two levels of fusion of adjacent image feature information, more information is obtained, which facilitates more accurate segmentation of the target object in the image.
  • multiple processing blocks are sequentially connected; and / or, the image feature information output by each pair of adjacent processing blocks has the same size and the same number of channels.
  • to achieve pairwise fusion between image feature information, the image feature information output by each pair of adjacent processing blocks must have the same size and the same number of channels; the embodiments of the present disclosure achieve this by adding to the processing unit a feature adjustment layer configured to adjust the size and number of channels of the feature information.
  • the feature adjustment layer may be provided in the processing unit or separately. The embodiment of the present application does not limit the position of the feature adjustment layer.
  • in an optional example, each processing unit may include at least one feature extraction layer (such as a convolution layer, a batch normalization layer BN, and an activation layer ReLU) and a feature adjustment layer (such as a convolution layer, a batch normalization layer BN, and an activation layer ReLU).
  • the processing block may include one or more processing units, and each processing unit may perform feature extraction processing on the input information.
  • each processing unit may include one or more convolution layers, and optionally other layers, such as one or any combination of a Batch Normalization (BN) layer and an activation layer.
  • the processing block may further include other units after the processing unit(s), such as any one or combination of a resolution-reduction layer, a feature scaling layer, a BN layer, and an activation layer.
  • the processing unit includes at least one feature extraction layer and a feature adjustment layer;
  • the image processing module 61 is specifically configured to perform feature extraction processing on the input information of the processing unit through the at least one feature extraction layer in the processing unit to obtain first feature information, and to adjust the first feature information through the feature adjustment layer in the processing unit to obtain the image feature information output by the processing unit.
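A minimal sketch of such a processing unit follows (PyTorch assumed; the layer counts and channel sizes are illustrative placeholders, not values specified by the patent):

```python
import torch.nn as nn

class ProcessingUnit(nn.Module):
    # Feature-extraction layers (conv-BN-ReLU) produce the first feature
    # information; a 1x1 conv-BN-ReLU feature-adjustment layer then fixes
    # the channel count so outputs of different units can be fused.
    def __init__(self, in_ch, mid_ch=64, out_ch=16):
        super().__init__()
        self.extract = nn.Sequential(
            nn.Conv2d(in_ch, mid_ch, 3, padding=1),
            nn.BatchNorm2d(mid_ch), nn.ReLU(inplace=True),
            nn.Conv2d(mid_ch, mid_ch, 3, padding=1),
            nn.BatchNorm2d(mid_ch), nn.ReLU(inplace=True),
        )
        self.adjust = nn.Sequential(
            nn.Conv2d(mid_ch, out_ch, 1),
            nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True),
        )

    def forward(self, x):
        first = self.extract(x)   # first feature information
        return self.adjust(first) # unit output with unified size/channels
```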
  • in one or more optional embodiments, the apparatus further includes: a feature image processing module configured to, before the image feature information output by the at least two pairs of adjacent processing blocks is subjected to the at least two levels of fusion processing, perform feature reduction processing on the image feature information output by a processing block M1 of the multiple processing blocks and perform feature expansion processing on the image feature information output by a processing block M2, where the input end of the processing block M2 is directly or indirectly connected to the output end of the processing block M1.
  • in a typical neural network, the image feature information obtained by an upper (earlier) processing block passes through fewer processing layers and therefore carries less image information, while the image feature information obtained by a lower (later) processing block passes through more processing layers and carries more image information. Therefore, optionally, during pairwise fusion, when the image feature information of a pair of adjacent processing blocks is a shallow feature, feature reduction processing (for example, downsampling) is performed on the image feature information output by the lower processing block of the pair; when it is a deep feature, feature expansion processing (for example, interpolation, which may be bilinear interpolation) is performed on the image feature information output by the upper processing block of the pair.
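The following sketch shows one way such reduction and expansion could be implemented before the element-wise addition (PyTorch assumed; average pooling and bilinear interpolation are the examples named in the text, but the exact scheme is not specified there):

```python
import torch.nn.functional as F

def align_and_sum(upper, lower, expand=True):
    # Align two adjacent blocks' feature maps before element-wise addition:
    # either expand the lower (smaller) map by bilinear interpolation, or
    # reduce the upper (larger) map by average pooling (downsampling).
    if expand:
        lower = F.interpolate(lower, size=upper.shape[-2:],
                              mode='bilinear', align_corners=False)
        return upper + lower
    upper = F.adaptive_avg_pool2d(upper, lower.shape[-2:])
    return upper + lower
```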
  • the image processing module 61 is specifically configured to use a processing block N1 of the multiple processing blocks to perform feature extraction processing on the input information of the processing block N1 to obtain first image feature information corresponding to the processing block N1, and to input the first image feature information to the next processing block of the processing block N1 for feature extraction processing to obtain second image feature information output by that next processing block.
  • the input information of the processing block N1 includes image and / or image feature information output by at least one processing block located before the processing block N1, and N1 is an integer greater than or equal to 1.
  • optionally, the processing block N1 may be the first processing block of the multiple processing blocks, in which case its input information may be the image or initial feature information of the image; or the processing block N1 may be the second or a later processing block of the multiple processing blocks, in which case its input information may include the image feature information output by the previous processing block, and may further include the image feature information output by any one or more processing blocks before that previous processing block, and/or the image itself. That is, the input information of the processing block N1 may include the image and/or the image feature information output by one or more processing blocks located before the processing block N1. Because the input information of a processing block includes image feature information of different depths, the image feature information output by the processing block can contain more image information.
  • the image feature information obtained by earlier processing blocks contains more shallow-layer information; combined with the image feature information output by later processing blocks, both the shallow and deep information in the image can be obtained.
  • the image processing module 61 is specifically configured to input the image and/or the image feature information output by at least one processing block N2 together with the first image feature information to the next processing block of the processing block N1 for feature extraction processing to obtain the second image feature information output by that next processing block, where the input end of the processing block N1 is directly or indirectly connected to the output end of the processing block N2.
  • the image processing module 61 is further configured to, before the image and/or the image feature information output by the at least one processing block N2 and the first image feature information are input to the next processing block of the processing block N1, perform fusion processing on the image feature information output by the at least one processing block N2 and input the fused image feature information to that next processing block, as sketched below.
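A hypothetical reading of this input-side fusion is sketched here (assuming all block outputs share one shape so that element-wise summation is valid; the class name `DenseChain` is mine):

```python
import torch.nn as nn

class DenseChain(nn.Module):
    # Each block receives the element-wise sum of all earlier block outputs,
    # so its input mixes image feature information of different depths.
    def __init__(self, blocks):
        super().__init__()
        self.blocks = nn.ModuleList(blocks)

    def forward(self, x):
        outputs = []
        for block in self.blocks:
            inp = x if not outputs else sum(outputs)  # fuse earlier outputs
            outputs.append(block(inp))
        return outputs  # image feature information per processing block
```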
  • the target object segmentation device for the above image further includes: a feature extraction module configured to, before the feature extraction processing is performed on the image based on the multiple processing blocks, perform feature extraction processing on the image through a convolution layer to obtain initial feature information of the image, and to input the initial feature information of the image into the multiple processing blocks for feature extraction processing.
  • the image processed by the embodiments of the present application may be a remote sensing image, in which case the target object is land; that is, the method of the above embodiments is used to segment land in the remote sensing image, for example into forest, grassland, city, arable land, and the like.
  • the image segmentation device provided by the foregoing embodiments of the present application can be applied to, but not limited to, land planning, land use monitoring, land status survey, and the like.
  • the image segmentation apparatus in the embodiment of the present application is implemented by using a segmentation neural network, and the image is a land sample image;
  • the apparatus for segmenting a target object of an image in the embodiments of the present application further includes: a training module configured to process a road sample image using the segmentation neural network to obtain a segmentation result of the road sample image, and to adjust the parameters of the segmentation neural network based on the target object prediction result of the land sample image and the segmentation result of the road sample image.
  • the segmentation neural network proposed in the embodiments of the present application introduces road data for training, which makes up for the missing structural information of land images and improves detail information.
  • the target image feature information is obtained based on the mixed feature information, and the mixed feature information is obtained by batch processing of the land sample image and the road sample image by the segmentation neural network.
  • the training module is specifically configured to obtain the first loss based on the target object prediction result of the land sample image and the label information of the land sample image; obtain the second loss based on the segmentation result of the road sample image and the label information of the road sample image; Adjust the parameters of the segmentation neural network based on the first loss and the second loss.
  • the training module is specifically configured to perform a weighted summation of the first loss and the second loss to obtain the total loss, and to adjust the parameters of the segmentation neural network based on the total loss.
  • in one or more optional embodiments, the device may further include: an enhanced image processing module configured to, before the feature extraction processing is performed on the image based on the multiple processing blocks, perform at least one of the following enhancement processing on the sample image according to set parameters: adjusting the size of the sample image, rotating the sample image by an angle, and changing the brightness of the sample image;
  • the image processing module 61 is specifically configured to perform feature extraction processing on at least one enhanced processing image based on a plurality of processing blocks to obtain image feature information output by each of the plurality of processing blocks.
  • the embodiments of the present disclosure implement data enhancement processing.
  • more sample images can be obtained, or the display effect of the sample images can be improved to achieve better training effects.
  • for example, the crop size of the network training data is 513x513; the random resize range is [0.5, 1.5] for road data images and [0.8, 1.25] for land classification images; the random rotation range for road and land data is [-180, 180]; and the brightness adjustment (color jitter) parameter is 0.3.
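A hedged sketch of these enhancement settings (torchvision is assumed purely for illustration, since the patent names no framework; for segmentation, the same geometric transform would also have to be applied to the label masks):

```python
from torchvision import transforms

# Road data: random resize in [0.5, 1.5], rotation in [-180, 180],
# brightness jitter 0.3, then 513x513 training crops.
road_aug = transforms.Compose([
    transforms.RandomAffine(degrees=(-180, 180), scale=(0.5, 1.5)),
    transforms.ColorJitter(brightness=0.3),
    transforms.RandomCrop(513, pad_if_needed=True),
])

# Land classification data: same pipeline with resize range [0.8, 1.25].
land_aug = transforms.Compose([
    transforms.RandomAffine(degrees=(-180, 180), scale=(0.8, 1.25)),
    transforms.ColorJitter(brightness=0.3),
    transforms.RandomCrop(513, pad_if_needed=True),
])
```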
  • the above-mentioned target object segmentation device for an image may further include: a preprocessing module configured to, before the feature extraction processing is performed on the image based on the multiple processing blocks, crop the image based on a crop frame of a set size to obtain at least one cropped image;
  • the image processing module 61 is specifically configured to perform feature extraction processing on the cropped image based on a plurality of processing blocks to obtain image feature information output by each of the plurality of processing blocks.
  • the embodiments of the present disclosure implement data preprocessing: to obtain more information, enlarge the receptive field of the network, and speed up the whole training process, the sample images can be cropped to a smaller size, for example cropping 2448x2448 land data into 1024x1024 crops, so that one land image yields multiple training samples after cropping.
  • during network training, the crop size of the training data is increased, which helps the network extract information from many scenes, thereby improving the segmentation effect.
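As a sketch of this cropping step, the following splits a large scene into fixed-size tiles (assuming an HxWxC numpy array; the 2448 and 1024 figures come from the example above, and edge remainders are simply dropped in this version):

```python
import numpy as np

def crop_tiles(img, tile=1024):
    # Cut one large land image into several fixed-size training samples.
    h, w = img.shape[:2]
    return [img[y:y + tile, x:x + tile]
            for y in range(0, h - tile + 1, tile)
            for x in range(0, w - tile + 1, tile)]

scene = np.zeros((2448, 2448, 3), dtype=np.uint8)
samples = crop_tiles(scene)  # 4 tiles of 1024x1024 from one 2448x2448 scene
```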
  • FIG. 7 is an exemplary flowchart of a training method for a land segmentation neural network according to an embodiment of the present application. As shown in Figure 7, the method includes:
  • Step 710 Input at least one land sample image and at least one road sample image into a land segmentation neural network to obtain a predicted segmentation result of at least one land sample image and a predicted segmentation result of at least one road sample image.
  • Step 720 Adjust the parameters of the land segmentation neural network based on the predicted segmentation results of at least one land sample image and the predicted segmentation results of at least one road sample image.
  • in the embodiments of the present disclosure, road data with labeling information is used as auxiliary data to help train the land segmentation neural network (for example, a Dense Fusion Classmate Network, DFCNet).
  • compared with land cover data, road data is easier to obtain and simpler to label; therefore, in practical applications, a smaller amount of hard-to-label land cover data, plus some easily labeled road data, can be used to assist the classification of land cover types.
  • the land segmentation neural network obtained after the training of the embodiments of the present disclosure can be applied to the target object segmentation method for an image shown in FIG. 1 above, to segment the land in remote sensing images and obtain land segmentation results.
  • the land segmentation neural network includes a plurality of processing blocks, a fusion network, and a segmentation network that are sequentially connected;
  • Step 710 may include: performing feature extraction processing on the at least one land sample image and the at least one road sample image based on the multiple processing blocks to obtain sample image feature information output by each of the multiple processing blocks; performing, through the fusion network, at least two levels of fusion on the sample image feature information output by at least two pairs of adjacent processing blocks among the multiple processing blocks to obtain target sample image feature information; and obtaining, through the segmentation network, the predicted segmentation result of the at least one land sample image and the predicted segmentation result of the at least one road sample image based on the target sample image feature information.
  • a dense fusion structure is proposed in the embodiments of the present disclosure: layers of different depths are fused in pairs by element-wise sum, recursively down to the last layer.
  • the dense fusion structure better enables the network to obtain more deep and shallow information, which is conducive to accurate segmentation of details. At the same time, fusion allows the network's back propagation to reach the shallower layers better and faster, which is conducive to better supervision of the network.
  • each land sample image and each road sample image are processed based on the multiple processing blocks to obtain at least two sets of sample image feature information corresponding to each land sample image and at least two sets corresponding to each road sample image.
  • Each land sample image may be processed through multiple processing blocks to obtain at least two sets of sample image feature information, where the at least two sets of sample image feature information may correspond to at least two processing blocks, for example, including multiple The sample image feature information output by each processing block in the processing block, or contains the sample image feature information output by some processing blocks in a plurality of processing blocks, which is not limited in this embodiment of the present disclosure.
  • the land segmentation neural network processes the input land sample images and road sample images separately to prevent the image feature information between different sample images from being confused during batch processing, resulting in inaccurate training results.
  • performing at least two levels of fusion on the sample image feature information output by the at least two pairs of adjacent processing blocks to obtain the target sample image feature information includes: performing at least two levels of fusion on the at least two sets of sample image feature information corresponding to each land sample image to obtain the land sample image feature information of that land sample image; and performing at least two levels of fusion on the at least two sets of sample image feature information corresponding to each road sample image to obtain the road sample image feature information of that road sample image, where the target sample image feature information includes the land sample image feature information corresponding to the at least one land sample image and the road sample image feature information corresponding to the at least one road sample image.
  • each land sample image and each road sample image has its own image feature information; fusing the image feature information of different sample images would make the training results inaccurate. The land segmentation neural network of the embodiments of the present disclosure therefore fuses the sets of sample image feature information corresponding to each sample image (land sample image or road sample image) separately, preventing fusion between the sample image feature information corresponding to multiple sample images.
  • optionally, the land segmentation neural network further includes a slice layer; before the predicted segmentation result of the at least one land sample image and the predicted segmentation result of the at least one road sample image are determined based on the target sample image feature information, the method further includes: segmenting, through the slice layer, the land sample image feature information contained in the target sample image feature information from the road sample image feature information; inputting the land sample image feature information to the segmentation network for processing to obtain the predicted segmentation result of the land sample image; and inputting the road sample image feature information to the segmentation network for processing to obtain the predicted segmentation result of the road sample image.
  • after the at least one land sample image and the at least one road sample image are processed by the sequentially connected processing blocks of the land segmentation neural network and the corresponding set of target sample image feature information is obtained, the land sample images need to be distinguished from the road sample images so that the road image information can be used to train the land segmentation neural network; the slice layer distinguishes the target sample image feature information corresponding to the land sample images from that corresponding to the road sample images, which can be done according to the order in which the land sample images and road sample images were input.
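A minimal sketch of this slice step, assuming the land samples were placed first in the batch (the split-by-input-order behaviour described above; PyTorch tensors assumed, and `slice_features` is an illustrative name):

```python
def slice_features(batch_feats, num_land):
    # Split the batched fused features back into land and road parts
    # according to the order in which the samples were input.
    return batch_feats[:num_land], batch_feats[num_land:]
```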
  • the land sample image and the road sample image have labeling information, respectively;
  • adjusting the parameters of the land segmentation neural network based on the predicted segmentation results of the at least one land sample image and the at least one road sample image includes: obtaining a first loss based on the predicted segmentation result corresponding to the land sample image and the label information corresponding to the land sample image; obtaining a second loss based on the predicted segmentation result corresponding to the road sample image and the label information corresponding to the road sample image; and adjusting the parameters of the land segmentation neural network based on the first loss and the second loss.
  • the first loss and the second loss are weighted and summed to obtain the total loss; based on the total loss, parameters of the land segmentation neural network are adjusted.
  • the parameters of the land segmentation neural network are adjusted through the weighted summation of the first loss and the second loss. The weight values of the weighted summation may be set in advance or obtained through experiments or multiple rounds of training; typically, the weight of the first loss is greater than that of the second loss, for example a ratio of first-loss weight to second-loss weight of 8:7. The specific weight values are not limited in the embodiments of the present application.
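A sketch of this weighted total loss (cross-entropy is my assumption; the patent only specifies a weighted summation, with 8:7 given as an example ratio):

```python
import torch.nn.functional as F

def total_loss(land_logits, land_labels, road_logits, road_labels,
               w_first=8.0, w_second=7.0):
    first = F.cross_entropy(land_logits, land_labels)   # land (first) loss
    second = F.cross_entropy(road_logits, road_labels)  # road (second) loss
    return w_first * first + w_second * second          # weighted summation
```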
  • the road data is used to make up for the missing information of the structure of land classification, which improves the accuracy of the land segmentation neural network for the land segmentation task.
  • by using road data that is easy to obtain and easy to label, adding the road data to the segmentation improves the efficiency and accuracy of land cover classification, and the handling of details becomes more refined.
  • an example of the training process of the land segmentation neural network of the present application is shown in FIG. 3; a comparison between its segmentation effect and that of FC-DenseNet is shown in FIG. 4, and a comparison with the segmentation effects of FC-DenseNet and ClassmateNet is shown in FIG. 5.
  • because road data is relatively simple, it is easier to obtain and label than land cover images; therefore, introducing simple road data can greatly improve the classification of land cover images that are difficult to obtain and label, and can save labeling manpower. Adding the dense fusion network structure further helps the classification of land cover in detail.
  • those of ordinary skill in the art will understand that all or part of the steps of the above method embodiments may be implemented by hardware related to program instructions; the foregoing program may be stored in a computer-readable storage medium and, when executed, performs the steps of the foregoing method embodiments; the foregoing storage medium includes various media that can store program code, such as a ROM, a RAM, a magnetic disk, or an optical disk.
  • FIG. 8 is a schematic structural diagram of a training apparatus for a land segmentation neural network according to an embodiment of the present application.
  • the device may be used to implement the foregoing method embodiments of the present application. As shown in Figure 8, the device includes:
  • the result prediction module 81 is configured to input at least one land sample image and at least one road sample image into the land segmentation neural network to obtain a predicted segmentation result of the at least one land sample image and a predicted segmentation result of the at least one road sample image.
  • the parameter adjusting module 82 is configured to adjust the parameters of the land segmentation neural network based on the predicted segmentation results of at least one land sample image and the predicted segmentation results of at least one road sample image.
  • in the embodiments of the present disclosure, road data with labeling information is used as auxiliary data to help train the land segmentation neural network (for example, a Dense Fusion Classmate Network, DFCNet).
  • compared with land cover data, road data is easier to obtain and simpler to label; therefore, in practical applications, a smaller amount of hard-to-label land cover data, plus some easily labeled road data, can be used to assist the classification of land cover types.
  • the land segmentation neural network includes a plurality of processing blocks, a fusion network, and a segmentation network that are sequentially connected;
  • the result prediction module 81 is specifically configured to perform feature extraction processing on the at least one land sample image and the at least one road sample image based on the multiple processing blocks to obtain sample image feature information output by each of the multiple processing blocks; to perform, through the fusion network, at least two levels of fusion on the sample image feature information output by at least two pairs of adjacent processing blocks among the multiple processing blocks to obtain target sample image feature information; and to obtain, through the segmentation network, the predicted segmentation result of the at least one land sample image and the predicted segmentation result of the at least one road sample image based on the target sample image feature information.
  • a dense fusion structure is proposed in the embodiments of the present disclosure: layers of different depths are fused in pairs by element-wise sum, recursively down to the last layer.
  • the dense fusion structure better enables the network to obtain more deep and shallow information, which is conducive to accurate segmentation of details. At the same time, fusion allows the network's back propagation to reach the shallower layers better and faster, which is conducive to better supervision of the network.
  • the result prediction module 81 is specifically configured to process each land sample image and each road sample image based on the multiple processing blocks to obtain at least two sets of sample image feature information corresponding to each land sample image and at least two sets corresponding to each road sample image.
  • the result prediction module 81 is specifically configured to perform at least two levels of fusion on the at least two sets of sample image feature information corresponding to each land sample image to obtain the land sample image feature information of that land sample image, and to perform at least two levels of fusion on the at least two sets of sample image feature information corresponding to each road sample image to obtain the road sample image feature information of that road sample image, where the target sample image feature information includes the land sample image feature information corresponding to the at least one land sample image and the road sample image feature information corresponding to the at least one road sample image.
  • the land segmentation neural network further includes a slice layer
  • the result prediction module 81 is further configured to determine, based on the target sample image feature information, the predicted segmentation result of the at least one land sample image and the predicted segmentation result of the at least one road sample image by the slice layer.
  • the slice layer separates the land sample image feature information contained in the target sample image feature information from the road sample image feature information; the land sample image feature information is input to the segmentation network for processing to obtain the predicted segmentation result of the land sample image, and the road sample image feature information is input to the segmentation network for processing to obtain the predicted segmentation result of the road sample image.
  • the land sample image and the road sample image respectively have label information;
  • the parameter adjustment module 82 is specifically configured to obtain the first loss based on the predicted segmentation result corresponding to the land sample image and the label information corresponding to the land sample image; to obtain the second loss based on the predicted segmentation result corresponding to the road sample image and the label information corresponding to the road sample image; and to adjust the parameters of the land segmentation neural network based on the first loss and the second loss.
  • the parameter adjustment module 82 is specifically configured to weight-sum the first loss and the second loss to obtain the total loss; and adjust the parameters of the land segmentation neural network based on the total loss.
  • an electronic device including a processor, where the processor includes the image segmentation device according to any one of the above or the training device of a land segmentation neural network according to any one of the above.
  • an electronic device including: a memory configured to store executable instructions;
  • a processor configured to communicate with the memory to execute the executable instructions to complete the operations of the image segmentation method according to any one of the above, or configured to communicate with the memory to execute the executable instructions to complete the operations of the training method for a land segmentation neural network according to any one of the above.
  • a computer-readable storage medium configured to store computer-readable instructions which, when executed, perform the image segmentation method described in any one of the foregoing or the training method of a land segmentation neural network described in any one of the foregoing.
  • a computer program product including computer-readable code, where, when the computer-readable code runs on a device, a processor in the device executes instructions configured to implement the image segmentation method according to any one of the above or the training method of the land segmentation neural network according to any one of the above.
  • an embodiment of the present application further provides another computer program product configured to store computer-readable instructions which, when executed, cause a computer to perform the image segmentation method or the training method of the land segmentation neural network described in any one of the foregoing possible implementations.
  • the computer program product may be specifically implemented by hardware, software, or a combination thereof.
  • the computer program product is embodied as a computer storage medium.
  • in another optional embodiment, the computer program product is embodied as a software product, such as a Software Development Kit (SDK), and so on.
  • the embodiments of the present application further provide another image segmentation and land segmentation neural network training method and apparatus, an electronic device, a computer storage medium, and a computer program product, in which feature extraction processing is performed on an image through multiple processing blocks to obtain the image feature information output by each of the multiple processing blocks; at least two levels of fusion processing are performed on the image feature information output by at least two pairs of adjacent processing blocks to obtain target image feature information; and a target object segmentation result of the image is determined based on the target image feature information.
  • a plurality may refer to two or more, and “at least one” may refer to one, two, or more.
  • An embodiment of the present application further provides an electronic device, such as a mobile terminal, a personal computer (PC), a tablet computer, a server, and the like.
  • FIG. 9 illustrates a schematic structural diagram of an example of an electronic device 900 suitable for implementing the embodiments of the present application.
  • the electronic device 900 includes one or more processors, a communication unit, and the like.
  • the one or more processors are, for example, one or more central processing units (CPUs) 901 and/or one or more graphics processors (GPUs) 913, and the processors may perform various appropriate actions and processes according to executable instructions stored in a read-only memory (ROM) 902 or executable instructions loaded from a storage portion 908 into a random access memory (RAM) 903.
  • the communication unit 912 may include, but is not limited to, a network card, and the network card may include, but is not limited to, an IB (Infiniband) network card.
  • the processor may communicate with the read-only memory 902 and/or the random access memory 903 to execute executable instructions, connect to the communication unit 912 through the bus 904, and communicate with other target devices through the communication unit 912, thereby completing operations corresponding to any of the methods provided in the embodiments of this application, for example: performing feature extraction processing on an image through multiple processing blocks to obtain the image feature information output by each of the multiple processing blocks; performing at least two levels of fusion processing on the image feature information output by at least two pairs of adjacent processing blocks among the multiple processing blocks to obtain target image feature information; and determining a target object segmentation result of the image based on the target image feature information.
  • the RAM 903 can also store various programs and data required for the operation of the device.
  • the CPU 901, the ROM 902, and the RAM 903 are connected to each other through a bus 904.
  • the ROM 902 is an optional module; the RAM 903 stores executable instructions, or writes executable instructions into the ROM 902 at runtime, and the executable instructions cause the central processing unit 901 to perform the operations corresponding to the foregoing communication method.
  • An input / output (I / O) interface 905 is also connected to the bus 904.
  • the communication unit 912 may be provided in an integrated manner, or may be provided with a plurality of sub-modules (for example, a plurality of IB network cards) and connected on a bus link.
  • the following components are connected to the I/O interface 905: an input portion 906 including a keyboard, a mouse, and the like; an output portion 907 including a cathode ray tube (CRT), a liquid crystal display (LCD), a speaker, and the like; a storage portion 908 including a hard disk and the like; and a communication section 909 including a network interface card such as a LAN card or a modem.
  • the communication section 909 performs communication processing via a network such as the Internet.
  • the driver 910 is also connected to the I / O interface 905 as necessary.
  • a removable medium 911 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, etc., is installed on the drive 910 as needed, so that a computer program read therefrom is installed into the storage section 908 as needed.
  • FIG. 9 is only an optional implementation manner.
  • the number and types of the components in FIG. 9 may be selected, deleted, added or replaced according to actual needs.
  • Different functional component settings can also be implemented by separate settings or integrated settings.
  • the GPU 913 and the CPU 901 may be provided separately, or the GPU 913 may be integrated on the CPU 901; the communication unit may be provided separately, or may be integrated on the CPU 901 or the GPU 913, and so on.
  • embodiments of the present application include a computer program product including a computer program tangibly embodied on a machine-readable medium; the computer program includes program code for performing the method shown in the flowchart, and the program code may include instructions corresponding to the method steps provided in the embodiments of the present application, for example: performing feature extraction processing on an image through multiple processing blocks to obtain the image feature information output by each of the multiple processing blocks; performing at least two levels of fusion processing on the image feature information output by at least two pairs of adjacent processing blocks to obtain target image feature information; and determining a target object segmentation result of the image based on the target image feature information.
  • the computer program may be downloaded and installed from a network through the communication portion 909, and / or installed from a removable medium 911.
  • when the computer program is executed by the central processing unit (CPU) 901, the operations of the above functions defined in the method of the present application are performed.
  • the methods and apparatus of the present application may be implemented in many ways.
  • the methods and devices of the present application may be implemented by software, hardware, firmware, or any combination of software, hardware, and firmware.
  • the above-mentioned order of the steps of the method is for illustration only, and the steps of the method of the present application are not limited to the order specifically described above, unless otherwise specifically stated.
  • the present application can also be implemented as programs recorded in a recording medium, and these programs include machine-readable instructions for implementing the method according to the present application.
  • the present application also covers a recording medium storing a program for executing the method according to the present application.


Abstract

The embodiments of the present application disclose a target object segmentation and training method and apparatus, device, medium, and product for an image. The target object segmentation method for an image includes: performing feature extraction processing on an image through multiple processing blocks to obtain image feature information output by each of the multiple processing blocks; performing at least two levels of fusion processing on the image feature information output by at least two pairs of adjacent processing blocks among the multiple processing blocks to obtain target image feature information; and determining a target object segmentation result of the image based on the target image feature information.

Description

Image segmentation and segmentation network training method and apparatus, device, medium, and product
Cross-Reference to Related Applications
This application claims priority to Chinese Patent Application No. 201810623306.0, filed with the Chinese Patent Office on June 15, 2018 and entitled "Target object segmentation and training method and apparatus, device, medium, and product for an image", the entire contents of which are incorporated herein by reference.
Technical Field
The present application relates to image processing technology, and in particular to an image segmentation and segmentation network training method and apparatus, device, medium, and product.
Background
With the rapid development of remote sensing satellites, remote sensing images are being applied in various fields. Because satellite remote sensing images cover large scenes and have no clear boundaries or accurate structural information, remote sensing imagery differs from traditional image segmentation scenarios, which makes segmentation with traditional neural networks difficult, with poor results that are hard to improve.
Summary
The embodiments of the present application are intended to provide an image segmentation and training method and apparatus, device, medium, and product.
According to one aspect of the embodiments of the present application, an image segmentation method is provided, including: performing feature extraction processing on an image through multiple processing blocks to obtain image feature information output by each of the multiple processing blocks; performing at least two levels of fusion processing on the image feature information output by at least two pairs of adjacent processing blocks among the multiple processing blocks to obtain target image feature information; and determining a target object segmentation result of the image based on the target image feature information.
Optionally, performing at least two levels of fusion processing on the image feature information output by the at least two pairs of adjacent processing blocks among the multiple processing blocks to obtain the target image feature information includes: performing first-level fusion processing on the image feature information output by each pair of adjacent processing blocks to obtain first fused feature information; performing second-level fusion processing on at least one pair of adjacent first fused feature information to obtain at least one piece of second fused feature information; and determining the target image feature information based on the at least one piece of second fused feature information.
Optionally, determining the target image feature information based on the at least one piece of second fused feature information includes: performing subsequent feature fusion processing on the at least one piece of second fused feature information until the number of subsequent fused features obtained by the subsequent fusion processing is one; and taking the single subsequent fused feature as the target image feature information.
Optionally, in the process of fusing the image feature information output by each pair of adjacent processing blocks, the image feature information output by each pair of adjacent processing blocks is added element by element.
Optionally, the multiple processing blocks are connected sequentially; and/or the image feature information output by each pair of adjacent processing blocks has the same size and the same number of channels.
Optionally, each processing block includes at least one processing unit, and each processing unit includes at least one feature extraction layer and a feature adjustment layer; performing feature extraction processing on the image through the multiple processing blocks to obtain the image feature information output by each of the multiple processing blocks includes: performing feature extraction processing on the input information of the processing unit through the at least one feature extraction layer in the processing unit to obtain first feature information; and adjusting the first feature information through the feature adjustment layer in the processing unit to obtain the image feature information output by the processing unit.
Optionally, before performing at least two levels of fusion processing on the image feature information output by the at least two pairs of adjacent processing blocks to obtain the target image feature information, the method further includes: performing feature reduction processing on the image feature information output by a processing block M1 of the multiple processing blocks; and performing feature expansion processing on the image feature information output by a processing block M2 of the multiple processing blocks, where the input end of the processing block M2 is directly or indirectly connected to the output end of the processing block M1.
Optionally, performing feature extraction processing on the image based on the multiple processing blocks to obtain the image feature information output by each of the multiple processing blocks includes: using a processing block N1 of the multiple processing blocks to perform feature extraction processing on the input information of the processing block N1 to obtain first image feature information corresponding to the processing block N1, where the input information of the processing block N1 includes the image and/or image feature information output by at least one processing block located before the processing block N1, and N1 is an integer greater than or equal to 1; and inputting the first image feature information to the next processing block of the processing block N1 for feature extraction processing to obtain second image feature information output by the next processing block.
Optionally, inputting the first image feature information to the next processing block of the processing block N1 for processing to obtain the second image feature information output by the next processing block includes: inputting the image and/or image feature information output by at least one processing block N2 together with the first image feature information to the next processing block of the processing block N1 for feature extraction processing to obtain the second image feature information output by the next processing block, where the input end of the processing block N1 is directly or indirectly connected to the output end of the processing block N2.
Optionally, before inputting the image and/or the image feature information output by the at least one processing block N2 and the first image feature information to the next processing block of the processing block N1 for feature extraction processing, the method further includes: fusing the image feature information output by the at least one processing block N2, and inputting the image feature information obtained by the fusion into the next processing block of the processing block N1.
Optionally, before performing feature extraction processing on the image based on the multiple processing blocks to obtain the image feature information output by each of the multiple processing blocks, the method further includes: performing feature extraction processing on the image through a convolution layer to obtain initial feature information of the image; and performing feature extraction processing on the image through the multiple processing blocks includes: inputting the initial feature information of the image into the multiple processing blocks for feature extraction processing.
Optionally, the image is a remote sensing image, and the target object is land.
Optionally, the method is implemented by a segmentation neural network, and the image is a land sample image; the method further includes: processing a road sample image by the segmentation neural network to obtain a segmentation result of the road sample image; and adjusting the parameters of the segmentation neural network based on the target object prediction result of the land sample image and the segmentation result of the road sample image.
Optionally, the target image feature information is obtained based on mixed feature information, and the mixed feature information is obtained by batch processing of the land sample image and the road sample image by the segmentation neural network.
Optionally, adjusting the parameters of the segmentation neural network based on the target object prediction result of the land sample image and the segmentation result of the road sample image includes: obtaining a first loss based on the target object prediction result of the land sample image and the labeling information of the land sample image; obtaining a second loss based on the segmentation result of the road sample image and the labeling information of the road sample image; and adjusting the parameters of the segmentation neural network based on the first loss and the second loss.
Optionally, adjusting the parameters of the segmentation neural network based on the first loss and the second loss includes: performing a weighted summation of the first loss and the second loss to obtain a total loss; and adjusting the parameters of the segmentation neural network based on the total loss.
Optionally, before performing feature extraction processing on the image based on the multiple processing blocks to obtain the image feature information output by each of the multiple processing blocks, the method further includes: performing at least one of the following enhancement processing on the sample image according to set parameters: adjusting the size of the sample image, rotating the sample image by an angle, and changing the brightness of the sample image; and performing feature extraction processing on the image based on the multiple processing blocks includes: performing feature extraction processing on the at least one enhanced image based on the multiple processing blocks to obtain the image feature information output by each of the multiple processing blocks.
Optionally, before performing feature extraction processing on the image based on the multiple processing blocks to obtain the image feature information output by each of the multiple processing blocks, the method further includes: cropping the image based on a crop frame of a set size to obtain at least one cropped image; and performing feature extraction processing on the image based on the multiple processing blocks includes: performing feature extraction processing on the cropped image based on the multiple processing blocks to obtain the image feature information output by each of the multiple processing blocks.
According to another aspect of the embodiments of the present application, an image segmentation apparatus is provided, including: an image processing module configured to perform feature extraction processing on an image through multiple processing blocks to obtain image feature information output by each of the multiple processing blocks; a fusion module configured to perform at least two levels of fusion processing on the image feature information output by at least two pairs of adjacent processing blocks among the multiple processing blocks to obtain target image feature information; and a segmentation module configured to determine a target object segmentation result of the image based on the target image feature information.
According to another aspect of the embodiments of the present application, a training method for a land segmentation neural network is provided, including: inputting at least one land sample image and at least one road sample image into the land segmentation neural network to obtain a predicted segmentation result of the at least one land sample image and a predicted segmentation result of the at least one road sample image; and adjusting the parameters of the land segmentation neural network based on the predicted segmentation result of the at least one land sample image and the predicted segmentation result of the at least one road sample image.
Optionally, the land segmentation neural network includes multiple processing blocks, a fusion network, and a segmentation network that are sequentially connected; inputting the at least one land sample image and the at least one road sample image into the land segmentation neural network to obtain the predicted segmentation results includes: performing feature extraction processing on the at least one land sample image and the at least one road sample image based on the multiple processing blocks to obtain sample image feature information output by each of the multiple processing blocks; performing, through the fusion network, at least two levels of fusion processing on the sample image feature information output by at least two pairs of adjacent processing blocks among the multiple processing blocks to obtain target sample image feature information; and obtaining, through the segmentation network, the predicted segmentation result of the at least one land sample image and the predicted segmentation result of the at least one road sample image based on the target sample image feature information.
Optionally, performing feature extraction processing on the at least one land sample image and the at least one road sample image based on the multiple processing blocks to obtain the sample image feature information output by each of the multiple processing blocks includes: processing each land sample image and each road sample image based on the multiple processing blocks to obtain at least two sets of sample image feature information corresponding to each land sample image and at least two sets of sample image feature information corresponding to each road sample image.
Optionally, performing at least two levels of fusion processing on the sample image feature information output by the at least two pairs of adjacent processing blocks to obtain the target sample image feature information includes: performing at least two levels of fusion on the at least two sets of sample image feature information corresponding to each land sample image to obtain the land sample image feature information corresponding to that land sample image; and performing at least two levels of fusion on the at least two sets of sample image feature information corresponding to each road sample image to obtain the road sample image feature information of that road sample image, where the target sample image feature information includes the land sample image feature information corresponding to the at least one land sample image and the road sample image feature information corresponding to the at least one road sample image.
Optionally, the land segmentation neural network further includes a slice layer; before determining the predicted segmentation result of the at least one land sample image and the predicted segmentation result of the at least one road sample image based on the target sample image feature information, the method further includes: segmenting, through the slice layer, the land sample image feature information contained in the target sample image feature information from the road sample image feature information; inputting the land sample image feature information to the segmentation network for processing to obtain the predicted segmentation result of the land sample image; and inputting the road sample image feature information to the segmentation network for processing to obtain the predicted segmentation result of the road sample image.
Optionally, the land sample image and the road sample image each have labeling information; adjusting the parameters of the land segmentation neural network based on the predicted segmentation results of the at least one land sample image and the at least one road sample image includes: obtaining a first loss based on the predicted segmentation result corresponding to the land sample image and the labeling information corresponding to the land sample image; obtaining a second loss based on the predicted segmentation result corresponding to the road sample image and the labeling information corresponding to the road sample image; and adjusting the parameters of the land segmentation neural network based on the first loss and the second loss.
Optionally, adjusting the parameters of the land segmentation neural network based on the first loss and the second loss includes: performing a weighted summation of the first loss and the second loss to obtain a total loss; and adjusting the parameters of the land segmentation neural network based on the total loss.
According to another aspect of the embodiments of the present application, a training apparatus for a land segmentation neural network is provided, including: a result prediction module configured to input at least one land sample image and at least one road sample image into the land segmentation neural network to obtain a predicted segmentation result of the at least one land sample image and a predicted segmentation result of the at least one road sample image; and a parameter adjustment module configured to adjust the parameters of the land segmentation neural network based on the predicted segmentation result of the at least one land sample image and the predicted segmentation result of the at least one road sample image.
According to another aspect of the embodiments of the present application, an electronic device is provided, including: a memory configured to store executable instructions; and a processor configured to communicate with the memory to execute the executable instructions so as to complete the operations of the image segmentation method of any one of the above, or configured to communicate with the memory to execute the executable instructions so as to complete the operations of the training method for a land segmentation neural network of any one of the above.
According to another aspect of the embodiments of the present application, a computer-readable storage medium is provided, configured to store computer-readable instructions which, when executed, perform the operations of the image segmentation method of any one of the above or of the training method for a land segmentation neural network of any one of the above.
According to another aspect of the embodiments of the present application, a computer program product is provided, including computer-readable code which, when run on a device, causes a processor in the device to execute instructions configured to implement the image segmentation method of any one of the above or the training method for a land segmentation neural network of any one of the above.
According to yet another aspect of the embodiments of the present application, another computer program product is provided, configured to store computer-readable instructions which, when executed, cause a computer to perform the image segmentation method described in any one of the above possible implementations, or the operations of the training method for a land segmentation neural network described in any one of the possible implementations.
In an optional implementation, the computer program product is specifically a computer storage medium; in another optional implementation, the computer program product is specifically a software product, such as an SDK.
The embodiments of the present application further provide another image segmentation and land segmentation neural network training method and apparatus, electronic device, computer storage medium, and computer program product, in which feature extraction processing is performed on an image through multiple processing blocks to obtain image feature information output by each of the multiple processing blocks; at least two levels of fusion processing are performed on the image feature information output by at least two pairs of adjacent processing blocks among the multiple processing blocks to obtain target image feature information; and a target object segmentation result of the image is determined based on the target image feature information.
Based on the image target object segmentation and land segmentation neural network training method and apparatus, device, medium, and product provided by the above embodiments of the present application, feature extraction processing is performed on an image through multiple processing blocks to obtain image feature information output by each of the multiple processing blocks; at least two levels of fusion processing are performed on the image feature information output by at least two pairs of adjacent processing blocks to obtain target image feature information; and a target object segmentation result of the image is determined based on the target image feature information. Through at least two levels of fusion of adjacent image feature information, more information is obtained, which facilitates more accurate segmentation of the target object in the image.
The technical solutions of the present application are described in further detail below with reference to the accompanying drawings and embodiments.
Brief Description of the Drawings
The accompanying drawings, which form a part of the specification, describe embodiments of the present application and, together with the description, serve to explain the principles of the present application.
The present application can be understood more clearly from the following detailed description with reference to the accompanying drawings, in which:
FIG. 1 is a schematic flowchart of an image segmentation method provided by an embodiment of the present application;
FIG. 2 is an exemplary structural diagram of a processing block in the image segmentation method provided by an embodiment of the present application;
FIG. 3 is an exemplary schematic structural diagram of the segmentation neural network during training in the image segmentation method provided by an embodiment of the present application;
FIG. 4 is an example diagram comparing the segmentation effect of an embodiment of the present application with that of FC-DenseNet;
FIG. 5 is an example diagram comparing the segmentation effect of an embodiment of the present application with those of the FC-DenseNet and ClassmateNet structures;
FIG. 6 is a schematic structural diagram of an image segmentation apparatus provided by an embodiment of the present application;
FIG. 7 is an exemplary flowchart of a training method for a land segmentation neural network provided by an embodiment of the present application;
FIG. 8 is a schematic structural diagram of a training apparatus for a land segmentation neural network provided by an embodiment of the present application;
FIG. 9 is a schematic structural diagram of an example of an electronic device suitable for implementing the embodiments of the present application.
Detailed Description
Various exemplary embodiments of the present application will now be described in detail with reference to the accompanying drawings. It should be noted that, unless otherwise specified, the relative arrangement of components and steps, numerical expressions, and numerical values set forth in these embodiments do not limit the scope of the present application.
Meanwhile, it should be understood that, for ease of description, the sizes of the various parts shown in the drawings are not drawn to actual scale.
The following description of at least one exemplary embodiment is merely illustrative and is in no way intended to limit the present application or its application or use.
Techniques, methods, and devices known to those of ordinary skill in the relevant art may not be discussed in detail, but where appropriate, such techniques, methods, and devices should be regarded as part of the specification.
It should be noted that similar reference numerals and letters denote similar items in the following drawings; therefore, once an item is defined in one drawing, it need not be further discussed in subsequent drawings.
It should be understood that the embodiments of the present disclosure were developed for land segmentation based on remote sensing images, but they may also be applied to other fields, which is not limited by the embodiments of the present disclosure.
FIG. 1 is a schematic flowchart of an image segmentation method provided by an embodiment of the present application. As shown in FIG. 1, the method includes:
Step 110: Perform feature extraction processing on an image through multiple processing blocks to obtain image feature information output by each of the multiple processing blocks.
Each processing block includes at least one processing unit. Optionally, the multiple processing blocks may be connected sequentially and located at different depths respectively; for example, the output end of any processing block among the multiple processing blocks may be connected to the input end of its next processing block.
The multiple processing blocks may perform feature extraction processing on the image in turn. For example, the first processing block may perform feature extraction processing on the input image to obtain the image feature information output by the first processing block; the second processing block may perform feature extraction processing on its input image feature information to obtain the image feature information output by the second processing block, where the image feature information input to the second processing block may include the image feature information output by the first processing block and may further include the image; and so on, so that the image feature information output by each of the multiple processing blocks is obtained.
In one or more optional embodiments, a processing block N1 of the multiple processing blocks performs feature extraction processing on the input information of the processing block N1 to obtain first image feature information corresponding to the processing block N1, where N1 is an integer greater than or equal to 1.
The first image feature information is input to the next processing block of the processing block N1 for feature extraction processing to obtain second image feature information output by the next processing block.
Optionally, the processing block N1 may be the first processing block of the multiple processing blocks, in which case its input information may be the above image or initial image feature information of the image; or the processing block N1 may be the second or a later processing block of the multiple processing blocks, in which case its input information may include the image feature information output by the previous processing block, may further include the image feature information output by any one or more processing blocks before that previous processing block, and may also include the image. That is, the input information of the processing block N1 may include the image and/or the image feature information output by one or more processing blocks located before the processing block N1. Because the input information of a processing block includes image feature information of different depths, the image feature information output by the processing block can contain more image information.
The image feature information obtained by earlier processing blocks contains more shallow-layer information; combined with the image feature information output by later processing blocks, both the shallow and deep information in the image can be obtained.
Optionally, inputting the first image feature information to the next processing block of the processing block N1 for processing to obtain the second image feature information output by the next processing block includes: inputting the image and/or the image feature information output by at least one processing block N2 together with the first image feature information to the next processing block of the processing block N1 for feature extraction processing to obtain the second image feature information output by the next processing block, where the input end of the processing block N1 is directly or indirectly connected to the output end of the processing block N2. In the embodiments of the present disclosure, the processing block N1 is located after the processing block N2 in the network structure.
Optionally, the input of the next processing block of the processing block N1 may be only the image feature information output by the processing block N1; for example, if the processing block N1 is the third processing block and its next processing block is the fourth processing block, the input of the fourth processing block is the image feature information output by the third processing block.
Optionally, the input of the next processing block of the processing block N1 includes the image processing information output by the processing block N1 and the image processing information output by at least one processing block N2. For example, the processing block N1 is the third processing block, its next processing block is the fourth processing block, and the at least one processing block N2 includes the first processing block and/or the second processing block; in this case, the input of the fourth processing block is the image feature information output by the third processing block plus the image feature information output by the first processing block, or plus the image feature information output by the second processing block, or plus the image feature information output by both the first and second processing blocks.
Optionally, the input of the next processing block of the processing block N1 includes the image processing information output by the processing block N1 and the image; or it includes the image processing information output by the processing block N1, the image, and the image processing information output by at least one processing block N2.
Optionally, when the input of the next processing block of the processing block N1 includes the image feature information output by the processing block N1 and at least one processing block N2, before this image feature information is input to the next processing block, the image feature information output by some or all of the at least one processing block N2 and the processing block N1 may also be fused, and the image feature information obtained by the fusion is input to the next processing block of the processing block N1.
When the image feature information output by at least two processing blocks needs to be input into one processing block, the feature information may be fused to facilitate the block's processing; the specific fusion manner may be bit-wise (element-by-element) addition, channel-wise concatenation, or another manner.
In one or more optional embodiments, before the image is input to the multiple processing blocks, one or more convolution layers may also be used to perform feature extraction processing on the image to obtain initial feature information of the image; accordingly, the initial feature information of the image may be input to the multiple processing blocks for sequential feature extraction processing, which is not limited in the embodiments of the present application.
In this case, the input of the next processing block of the processing block N1 may further include the initial feature information of the image. Optionally, assuming that the input of the next processing block of the processing block N1 includes the image processing information output by the processing block N1 and the image, the initial feature information of the image may be fused with the image feature information output by the processing block N1; or, assuming that the input of the next processing block includes the image processing information output by the processing block N1, the image, and the image processing information output by at least one processing block N2, the initial feature information of the image may be fused with the image processing information output by the processing block N1, the image, and the image processing information output by the at least one processing block N2, and so on; this is not limited in the embodiments of the present application.
Step 120: Perform at least two levels of fusion processing on the image feature information output by at least two pairs of adjacent processing blocks among the multiple processing blocks to obtain target image feature information.
In one or more optional embodiments, first-level fusion processing is performed on the image feature information output by each pair of adjacent processing blocks to obtain first fused feature information; second-level fusion processing is performed on at least one pair of adjacent first fused feature information to obtain at least one piece of second fused feature information; and the target image feature information is determined based on the at least one piece of second fused feature information.
In the embodiments of the present disclosure, the multiple processing blocks may be divided into multiple pairs of adjacent processing blocks, each pair consisting of two adjacent processing blocks (that is, two directly connected processing blocks); optionally, different pairs contain different processing blocks, or different pairs may contain no common processing block. For example, the first and second processing blocks form the first pair of adjacent processing blocks, the third and fourth processing blocks form the second pair, and so on.
In some embodiments, pairwise fusion of image feature information can be achieved by fusing the image feature information output by each pair of adjacent processing blocks (for example, adding the image feature information output by each pair of adjacent processing blocks element by element).
Since there are multiple pairs of adjacent processing blocks, multiple pieces of first fused feature information are obtained after the image feature information of each pair is fused. At this point, some or all of the first fused feature information (for example, two or more pieces) may be subjected to second-level fusion processing to obtain one piece of second fused feature information, which is taken as the target image feature information; or the multiple pieces of first fused feature information may be fused pairwise between adjacent pieces to obtain multiple pieces of second fused feature information, in which case, optionally, subsequent feature fusion processing may be performed on the multiple pieces of second fused feature information until the number of subsequent fused features obtained is one, and the single subsequent fused feature is taken as the target image feature information.
The subsequent fusion processing here may fuse the second fused feature information in pairs (for example, adding two pieces of second fused feature information element by element). The subsequent fused feature information obtained after pairwise fusion includes one or more pieces: when there is one piece, it is taken as the target image feature information; when there are multiple pieces, pairwise fusion continues (for example, adding two pieces of subsequent fused feature information element by element) until the number of subsequent fused features obtained by the subsequent fusion processing is one, and that piece is taken as the target image feature information. For example, with 8 processing blocks, 4 pieces of first fused feature information are obtained after the first-level fusion, 2 pieces of second fused feature information after the second-level fusion, and one subsequent fused feature after the third-level fusion, which is taken as the target image feature information.
To further process detail information, a Dense Fusion structure is proposed in the embodiments of the present application: layers of different depths are fused in pairs by element-wise sum, recursively down to the last layer. The dense fusion structure better enables the network to obtain more deep and shallow information, which is conducive to accurate segmentation of details.
It should be understood that the above description takes pairwise, level-by-level fusion of processing blocks as an example; in the embodiments of the present application, level-by-level fusion may also be performed in units of three or more adjacent processing blocks, which is not limited by the embodiments of the present application.
Step 130: Determine a target object segmentation result of the image based on the target image feature information.
Based on the target object segmentation method for an image provided by the above embodiments of the present application, feature extraction processing is performed on an image through multiple processing blocks to obtain image feature information output by each of the multiple processing blocks; at least two levels of fusion processing are performed on the image feature information output by at least two pairs of adjacent processing blocks to obtain target image feature information; and a target object segmentation result of the image is determined based on the target image feature information. Through at least two levels of fusion of adjacent image feature information, more information is obtained, which facilitates more accurate segmentation of the target object in the image.
Optionally, the feature information may be a third-order vector, for example including multiple two-dimensional matrices, or including a feature map having at least one channel, where each feature map corresponds to a two-dimensional vector; this is not limited by the embodiments of the present application.
In one or more optional embodiments, a processing block may include one or more processing units, each of which may perform feature extraction processing on the input information of the processing block; for example, each processing unit may include one or more convolution layers, and optionally other layers, such as one or any combination of a Batch Normalization (BN) layer and an activation layer. Alternatively, the processing block may further include other units after the processing units, such as any one or combination of a resolution-reduction layer, a feature scaling layer, a BN layer, and an activation layer.
In one or more optional embodiments, each processing unit includes at least one feature extraction layer and a feature adjustment layer;
Step 110 may include: performing feature extraction processing on the input information of the processing unit through the at least one feature extraction layer in the processing unit to obtain first feature information; and adjusting the first feature information through the feature adjustment layer in the processing unit to obtain the image feature information output by the processing unit.
Optionally, the image feature information output by each pair of adjacent processing blocks has the same size and the same number of channels. To achieve pairwise fusion between image feature information, the image feature information output by each pair of adjacent processing blocks must have the same size and number of channels; the embodiments of the present disclosure achieve this by adding to the processing unit a feature adjustment layer configured to adjust the size and channel number of the feature information. The feature adjustment layer may be provided within the processing unit or separately; the embodiments of the present application do not limit its position. In an optional example, each processing unit may include at least one feature extraction layer (such as a convolution layer, a normalization layer BN, and an activation layer ReLU) and a feature adjustment layer (such as a convolution layer, a normalization layer BN, and an activation layer ReLU). FIG. 2 is an exemplary structural diagram of a processing block in the image segmentation method provided by an embodiment of the present application. As shown in FIG. 2, the processing block (Dense Block) includes multiple processing units (Layer Units), and each processing unit includes three convolution layers, each followed by a batch normalization layer (BN) and an activation layer (ReLU). The feature maps output by the first two convolution layers are input to the next processing unit, while the convolution layer whose output branches sideways serves as the feature adjustment layer and is configured to adjust the size and channels of the feature map output by the second convolution layer, so that the output feature information (e.g., feature maps) has the same size and number of channels as the feature information output by the other processing units, in preparation for feature fusion.
In one or more optional embodiments, before step 120, the method may further include: performing feature reduction processing on the image feature information output by a processing block M1 of the multiple processing blocks; and performing feature expansion processing on the image feature information output by a processing block M2 of the multiple processing blocks, where the input end of the processing block M2 is directly or indirectly connected to the output end of the processing block M1, or the image feature information output by the processing block M2 is obtained at least partially based on the image feature information output by the processing block M1.
In a typical neural network, the image feature information obtained by an upper (earlier) processing block passes through fewer processing layers and therefore carries less image information, while the image feature information obtained by a lower (later) processing block passes through more processing layers and carries more image information. Therefore, optionally, during pairwise fusion, when the image feature information of a pair of adjacent processing blocks is a shallow feature, feature reduction processing (for example, downsampling) is performed on the image feature information output by the lower processing block of the pair; when it is a deep feature, feature expansion processing (for example, interpolation, which may be bilinear interpolation) is performed on the image feature information output by the upper processing block of the pair.
In one or more optional embodiments, the image processed by the embodiments of the present application may be a remote sensing image, in which case the target object is land; that is, the method of the above embodiments is used to segment land from the remote sensing image, for example segmenting the land in the remote sensing image into forest, grassland, city, arable land, and the like.
Application scenarios of the image segmentation method provided by the above embodiments of the present application include, but are not limited to, land planning, land use monitoring, land status surveys, and the like.
In one or more optional embodiments, the image segmentation method of the embodiments of the present application is implemented by a segmentation neural network, and the image is a land sample image.
The image segmentation method of the embodiments of the present application further includes: training the segmentation neural network based on the target object segmentation result of the sample image and the labeling information of the sample image.
To obtain more accurate image segmentation results, the segmentation neural network implementing the image segmentation needs to be trained; training improves the accuracy of the network on the segmentation task for a specific target object (for example, land).
Optionally, the sample image is a land sample image; the method of the embodiments of the present disclosure further includes: processing a road sample image by the segmentation neural network to obtain a segmentation result of the road sample image;
and adjusting the parameters of the segmentation neural network based on the target object prediction result of the land sample image and the segmentation result of the road sample image.
When a land image (for example, a remote sensing image) is segmented by a traditional CNN, middle-level structural information is missing, yet structural information plays an important role in assisting image segmentation and classification. For example, for land cover type classification, the scene covered by a remote sensing image is large, the scene is constrained and affected by resolution, and the noise introduced by labeling greatly affects the segmentation of the image. Therefore, how to effectively and accurately obtain the structural information of the land image becomes the key to solving the segmentation problem. The segmentation neural network proposed in the embodiments of the present application introduces road data for training, which makes up for the missing structural information of land images and improves detail information.
For remote sensing images of land cover, the scale of the image is large, it contains many disordered scenes without smooth boundary lines, and, since land cover itself has no clearly quantified dividing lines, the labeling is ambiguous. It is difficult for a traditional CNN to obtain structural information from remote sensing images with large scenes, resulting in poor segmentation. The embodiments of the present disclosure propose using already acquired road data as auxiliary data to help train the network: road data has obvious structural features, some road data exists within land cover, and roads are distributed differently in different land types. Based on this idea, a segmentation neural network (for example, the Dense Fusion Classmate Network, DFCNet) is used to obtain land and road information simultaneously, so that roads assist the classification of land. Since road data is easier to obtain than land cover data and simpler to label, in practical applications a smaller amount of hard-to-label land cover data, plus some easily labeled road data, can be used to assist the classification of land cover types.
Optionally, the target image feature information is obtained based on mixed feature information, and the mixed feature information is obtained by batch processing of the land sample image and the road sample image by the segmentation neural network.
After the obtained sample image set is processed by the segmentation neural network into the corresponding set of target sample image feature information, in order to distinguish the land sample images from the road images, the embodiments of the present disclosure use a slice layer to distinguish the target sample image feature information corresponding to the land sample images from that corresponding to the road images; the distinction can be made according to the order in which the land sample images and road images were input.
Optionally, adjusting the parameters of the segmentation neural network based on the target object prediction result of the land sample image and the segmentation result of the road sample image includes: obtaining a first loss based on the target object prediction result of the land sample image and the labeling information of the land sample image; obtaining a second loss based on the segmentation result of the road sample image and the labeling information of the road sample image; and adjusting the parameters of the segmentation neural network based on the first loss and the second loss.
Optionally, a weighted summation of the first loss and the second loss is performed to obtain a total loss, and the parameters of the segmentation neural network are adjusted based on the total loss. The weight values of the weighted summation may be preset or obtained through experiments or multiple rounds of training; typically the weight of the first loss is greater than that of the second loss, for example a ratio of first-loss weight to second-loss weight of 8:7; the specific weight values are not limited by the embodiments of the present application.
In the embodiments of the present disclosure, road data is used to compensate for the missing structural information of land classification, improving the accuracy of the segmentation neural network on the land segmentation task. By using road data that is easy to obtain and easy to label, adding road data to the segmentation improves the efficiency and accuracy of land cover classification, and the handling of details becomes more refined.
In one or more optional embodiments, before step 110, the method may further include: performing at least one of the following enhancement processing on the sample image according to set parameters: adjusting the size of the sample image, rotating the sample image by an angle, and changing the brightness of the sample image;
Step 110 may include: performing feature extraction processing on the at least one enhanced image based on the multiple processing blocks to obtain the image feature information output by each of the multiple processing blocks.
The embodiments of the present disclosure implement data enhancement processing; by adjusting at least one of the above parameters, more sample images can be obtained, or the display quality of the sample images can be improved, achieving a better training effect. For example: the crop size of the network training data is 513x513; the random resize range is [0.5, 1.5] for road data images and [0.8, 1.25] for land classification images; the random rotation range for road and land data is [-180, 180]; and the brightness adjustment (color jitter) parameter is 0.3.
In one or more optional embodiments, before step 110, the method may further include: cropping the image based on a crop frame of a set size to obtain at least one cropped image;
Step 110 may include: performing feature extraction processing on the cropped image based on the multiple processing blocks to obtain the image feature information output by each of the multiple processing blocks.
The embodiments of the present disclosure implement data preprocessing: to obtain more information, enlarge the receptive field of the network, and speed up the whole training process, the sample images can be cropped to a smaller size, for example cropping 2448x2448 land data into 1024x1024 crops, so that one land image yields multiple sample data after cropping. During network training, the crop size of the training data is increased, which helps the network extract information from many scenes, thereby improving the segmentation effect.
To further process detail information, the embodiments of the present disclosure propose a dense fusion structure: layers of different depths are fused in pairs by element-wise sum, recursively down to the last layer. The dense fusion structure better enables the network to obtain more deep and shallow information, which is conducive to accurate segmentation of details. At the same time, fusion allows the network's back propagation to reach the shallower layers better and faster, which is conducive to better supervision of the network.
FIG. 3 is an exemplary schematic structural diagram of the segmentation neural network during training in the image segmentation method provided by an embodiment of the present application. As shown in FIG. 3, the road data and sample land data are combined through a concat layer along dimension 0. The structure of the whole segmentation neural network (DFCNet) is shown in FIG. 3: conv1 is a convolution layer, and Dense Block2 to Dense Block9 are processing blocks containing different numbers of processing units, with parameters noted in the figure. Taking Dense Block2 as an example, l=6 indicates that Dense Block2 contains 6 processing units. Conv_TD denotes a downsampling operation, and (128, 1*1, 0, 1) indicates 128 convolution channels, a 1*1 kernel, a padding value of 0, and a stride of 1.
Pooling1, 2, 3, and 4 are pooling layers using average pooling over 2x2 regions; Interp5, 6, 7, and 8 are upsampling steps that enlarge the features by a factor of two through bilinear interpolation.
Each Dense Block contains several processing units (Layer Units), and each processing unit contains two convolution layers conv_x1/conv_x2 (as shown in FIG. 2), each followed by BN and ReLU layers. conv_x1 has 64 convolution kernels and conv_x2 has 16. conv_2x is followed by a conv_f convolution layer used to unify the features during the feature fusion process.
The right part of FIG. 3 shows the feature fusion process between different Dense Blocks: a Dense Block with lower resolution is followed by an interpolation layer (Interp) and is then element-wise summed with a Dense Block of higher resolution, fused down to the last layer; a slice layer is added on the last feature fusion layer to separate the road and land data for separate prediction.
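To make the figure description concrete, here is a hedged sketch of one Layer Unit and the Conv_TD/pooling transition as read from the text (PyTorch assumed; the exact wiring of conv_f and the transition is not fully specified in the description, so this is an illustrative reconstruction, not the verified DFCNet implementation):

```python
import torch.nn as nn

class LayerUnit(nn.Module):
    # conv_x1 (64 kernels) and conv_x2 (16 kernels), each followed by BN and
    # ReLU; conv_f then unifies the features for the fusion process.
    def __init__(self, in_ch):
        super().__init__()
        self.conv_x1 = nn.Sequential(nn.Conv2d(in_ch, 64, 3, padding=1),
                                     nn.BatchNorm2d(64), nn.ReLU(inplace=True))
        self.conv_x2 = nn.Sequential(nn.Conv2d(64, 16, 3, padding=1),
                                     nn.BatchNorm2d(16), nn.ReLU(inplace=True))
        self.conv_f = nn.Conv2d(16, 16, 1)

    def forward(self, x):
        return self.conv_f(self.conv_x2(self.conv_x1(x)))

# Conv_TD (128, 1*1, padding 0, stride 1) followed by 2x2 average pooling,
# per the parameters quoted from the figure.
transition_down = nn.Sequential(
    nn.Conv2d(16, 128, kernel_size=1, padding=0, stride=1),
    nn.AvgPool2d(2),
)
```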
The land classification task is compared with the classic FC-DenseNet network structure, storing the feature maps at the deepest layer of the convolutional network. FIG. 4 is an example diagram comparing the segmentation effect of an embodiment of the present application with that of FC-DenseNet. As shown in FIG. 4, (a) shows the result of traditional FC-DenseNet segmentation, and (b) shows the result of segmentation by the embodiment of the present application; DFCNet with road data added has better structural information in its features and better assists segmentation of cities, arable land, and grassland.
Regarding segmentation effects, FIG. 5 is an example diagram comparing the embodiment of the present application with the FC-DenseNet and ClassmateNet structures. As shown in FIG. 5, (a) shows the segmentation result of the FC-DenseNet structure, (b) that of the ClassmateNet structure, and (c) that of the DFCNet structure implemented by the present application. ClassmateNet without the dense fusion structure segments better than the classic FC-DenseNet, and DFCNet further improves on details compared with ClassmateNet without the Dense Fusion structure.
Those of ordinary skill in the art will understand that all or part of the steps of the above method embodiments may be implemented by hardware related to program instructions; the foregoing program may be stored in a computer-readable storage medium and, when executed, performs the steps of the foregoing method embodiments; the foregoing storage medium includes various media that can store program code, such as a ROM, a RAM, a magnetic disk, or an optical disk.
FIG. 6 is a schematic structural diagram of an image segmentation apparatus provided by an embodiment of the present application. The apparatus may be used to implement the above method embodiments of the present application. As shown in FIG. 6, the apparatus includes:
an image processing module 61 configured to perform feature extraction processing on an image through multiple processing blocks to obtain image feature information output by each of the multiple processing blocks;
a fusion module 62 configured to perform at least two levels of fusion processing on the image feature information output by at least two pairs of adjacent processing blocks among the multiple processing blocks to obtain target image feature information; and
a segmentation module 63 configured to determine a target object segmentation result of the image based on the target image feature information.
Each processing block includes at least one processing unit. Optionally, the multiple processing blocks may be connected sequentially; here, the multiple processing blocks may be located at different depths respectively, and, for example, the output end of any one of the multiple processing blocks may be connected to the input end of its next processing block.
The fusion module 62 is specifically configured to perform first-level fusion processing on the image feature information output by each pair of adjacent processing blocks to obtain first fused feature information; perform second-level fusion processing on at least one pair of adjacent first fused feature information to obtain at least one piece of second fused feature information; and determine the target image feature information based on the at least one piece of second fused feature information.
In the embodiments of the present disclosure, the multiple processing blocks are divided into multiple pairs of adjacent processing blocks, each pair consisting of two adjacent processing blocks (that is, two directly connected processing blocks); optionally, different pairs contain different processing blocks, or different pairs may contain no common processing block; for example, the first and second processing blocks form the first pair of adjacent processing blocks, the third and fourth processing blocks form the second pair, and so on.
The fusion module 62 is specifically configured to perform subsequent feature fusion processing on the at least one piece of second fused feature information until the number of subsequent fused features obtained by the subsequent fusion processing is one, and to take the single subsequent fused feature as the target image feature information.
The fusion module 62 is specifically configured to add the image feature information output by each pair of adjacent processing blocks element by element during the fusion processing of the image feature information output by each pair of adjacent processing blocks.
To further process detail information, a dense fusion structure is proposed in the embodiments of the present application: layers of different depths are fused in pairs by element-wise summation, recursively down to the last layer. The dense fusion structure better enables the network to obtain more deep and shallow information, which is conducive to accurate segmentation of details.
Based on the image segmentation apparatus provided by the above embodiments of the present application, feature extraction processing is performed on an image through multiple processing blocks to obtain image feature information output by each of the multiple processing blocks; at least two levels of fusion processing are performed on the image feature information output by at least two pairs of adjacent processing blocks to obtain target image feature information; and a target object segmentation result of the image is determined based on the target image feature information. Through at least two levels of fusion of adjacent image feature information, more information is obtained, which facilitates more accurate segmentation of the target object in the image.
In one or more optional embodiments, the multiple processing blocks are connected sequentially; and/or the image feature information output by each pair of adjacent processing blocks has the same size and the same number of channels. To achieve pairwise fusion between image feature information, the image feature information output by each pair of adjacent processing blocks must have the same size and number of channels; the embodiments of the present disclosure achieve this by adding to the processing unit a feature adjustment layer configured to adjust the size and channel number of the feature information. The feature adjustment layer may be provided within the processing unit or separately; the embodiments of the present application do not limit its position. In an optional example, each processing unit may include at least one feature extraction layer (such as a convolution layer, a normalization layer BN, and an activation layer ReLU) and a feature adjustment layer (such as a convolution layer, a normalization layer BN, and an activation layer ReLU).
In one or more optional embodiments, a processing block may include one or more processing units, each of which may perform feature extraction processing on the input information; for example, each processing unit may include one or more convolution layers, and optionally other layers, such as one or any combination of a Batch Normalization (BN) layer and an activation layer. Alternatively, the processing block may further include other units after the processing units, such as any one or combination of a resolution-reduction layer, a feature scaling layer, a BN layer, and an activation layer.
In one or more optional embodiments, each processing unit includes at least one feature extraction layer and a feature adjustment layer;
the image processing module 61 is specifically configured to perform feature extraction processing on the input information of the processing unit through the at least one feature extraction layer in the processing unit to obtain first feature information, and to adjust the first feature information through the feature adjustment layer in the processing unit to obtain the image feature information output by the processing unit.
In one or more optional embodiments, the apparatus further includes: a feature image processing module configured to, before the image feature information output by the at least two pairs of adjacent processing blocks is subjected to the at least two levels of fusion processing to obtain the target image feature information, perform feature reduction processing on the image feature information output by a processing block M1 of the multiple processing blocks, and perform feature expansion processing on the image feature information output by a processing block M2 of the multiple processing blocks, where the input end of the processing block M2 is directly or indirectly connected to the output end of the processing block M1, or the image feature information output by the processing block M2 is obtained at least partially based on the image feature information output by the processing block M1.
In a typical neural network, the image feature information obtained by an upper (earlier) processing block passes through fewer processing layers and therefore carries less image information, while the image feature information obtained by a lower (later) processing block passes through more processing layers and carries more image information. Therefore, optionally, during pairwise fusion, when the image feature information of a pair of adjacent processing blocks is a shallow feature, feature reduction processing (for example, downsampling) is performed on the image feature information output by the lower processing block of the pair; when it is a deep feature, feature expansion processing (for example, interpolation, which may be bilinear interpolation) is performed on the image feature information output by the upper processing block of the pair.
In one or more optional embodiments, the image processing module 61 is specifically configured to use a processing block N1 of the multiple processing blocks to perform feature extraction processing on the input information of the processing block N1 to obtain first image feature information corresponding to the processing block N1, and to input the first image feature information to the next processing block of the processing block N1 for feature extraction processing to obtain second image feature information output by the next processing block, where the input information of the processing block N1 includes the image and/or image feature information output by at least one processing block located before the processing block N1, and N1 is an integer greater than or equal to 1.
Optionally, the processing block N1 may be the first processing block of the multiple processing blocks, in which case its input information may be the image or initial feature information of the image; or the processing block N1 may be the second or a later processing block of the multiple processing blocks, in which case its input information may include the image feature information output by the previous processing block, may further include the image feature information output by any one or more processing blocks before that previous processing block, and may also include the image. That is, the input information of the processing block N1 may include the image and/or the image feature information output by one or more processing blocks located before the processing block N1. Because the input information of a processing block includes image feature information of different depths, the image feature information output by the processing block can contain more image information.
The image feature information obtained by earlier processing blocks contains more shallow-layer information; combined with the image feature information output by later processing blocks, both the shallow and deep information in the image can be obtained.
Optionally, the image processing module 61 is specifically configured to input the image and/or the image feature information output by at least one processing block N2 together with the first image feature information to the next processing block of the processing block N1 for feature extraction processing to obtain the second image feature information output by the next processing block, where the input end of the processing block N1 is directly or indirectly connected to the output end of the processing block N2.
Optionally, the image processing module 61 is further configured to, before the image and/or the image feature information output by the at least one processing block N2 and the first image feature information are input to the next processing block of the processing block N1 for feature extraction processing, fuse the image feature information output by the at least one processing block N2 and input the fused image feature information to the next processing block of the processing block N1.
Optionally, the above target object segmentation apparatus for an image further includes: a feature extraction module configured to, before the feature extraction processing is performed on the image based on the multiple processing blocks to obtain the image feature information output by each of the multiple processing blocks, perform feature extraction processing on the image through a convolution layer to obtain initial feature information of the image, and input the initial feature information of the image into the multiple processing blocks for feature extraction processing.
The image processed by the embodiments of the present application may be a remote sensing image, in which case the target object is land; that is, the method of the above embodiments is used to segment land from the remote sensing image, for example segmenting the land in the remote sensing image into forest, grassland, city, arable land, and the like.
The image segmentation apparatus provided by the above embodiments of the present application may be applied to, but is not limited to, land planning, land use monitoring, land status surveys, and the like.
In one or more optional embodiments, the image segmentation apparatus of the embodiments of the present application is implemented by a segmentation neural network, and the image is a land sample image;
the target object segmentation apparatus for an image of the embodiments of the present application further includes: a training module configured to process a road sample image by the segmentation neural network to obtain a segmentation result of the road sample image, and to adjust the parameters of the segmentation neural network based on the target object prediction result of the land sample image and the segmentation result of the road sample image.
为了获得更准确的目标对象分割结果,需要对实现图像分割的分割神经网络进行训练,通过训练提高该网络对特定目标对象(例如:土地)的分割任务的准确性。
通过传统的CNN对土地图像(例如:遥感图像)进行分割时会缺失中间层次的结构信息,而结构信息对于辅助图像分割和分类是有着重要作用的,因此如何有效且准确的获取土地图像的结构信息成为了解决分割问题的关键。本申请实施例提出的分割神经网络,引入道路数据进行训练,弥补了土地图像的结构缺失问题,并且改善了细节信息。
对于土地覆盖的遥感影像,由于影像的尺度较大,包含的场景多而且杂乱无章没有光滑的 边界线,并且由于土地覆盖本身没有明确量化的分界线,标注会存在歧义。传统的CNN很难针对场景较大的遥感影像获取结构信息,从而导致分割效果较差。在本公开实施例中提出利用已经获取的道路数据来作为辅助数据帮助网络的训练。由于道路数据存在明显的结构特征,而且在土地覆盖中,会存在一些道路数据。且在不同的土地类型中,道路的分布呈现不同的状态。因此基于这个想法,通过分割神经网络(例如:密集融合同学网络)用以同时获取土地和道路的信息,使得道路辅助土地的分类。由于道路数据相对土地覆盖,更加容易获得,而且在标注上也会简单,因此这个在实际应用中,能够利用较少较难标注的土地覆盖数据,加上部分容易标注的道路数据,辅助土地覆盖类型的分类。
可选地,目标图像特征信息是基于混合特征信息得到的,混合特征信息是由分割神经网络对土地样本图像和道路样本图像进行批量处理得到的。
可选地,训练模块,具体配置为基于土地样本图像的目标对象预测结果和土地样本图像的标注信息获得第一损失;基于道路样本图像的分割结果和道路样本图像的标注信息获得第二损失;基于第一损失和第二损失调整分割神经网络的参数。
可选地,训练模块,具体配置为将第一损失和第二损失加权求和,得到总损失;基于总损失,调整分割神经网络的参数。
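下面给出第一损失、第二损失加权求和的一个最小示意(基于PyTorch;逐像素交叉熵损失和8:7的权重取值均为按文中示例所作的假设):

    import torch.nn as nn

    criterion = nn.CrossEntropyLoss()  # 逐像素分类损失(示意)

    def total_loss(land_pred, land_gt, road_pred, road_gt, w1=8.0, w2=7.0):
        loss1 = criterion(land_pred, land_gt)  # 第一损失:基于土地样本图像
        loss2 = criterion(road_pred, road_gt)  # 第二损失:基于道路样本图像
        return w1 * loss1 + w2 * loss2         # 加权求和得到总损失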
在一个或多个可选的实施例中,还可以包括:增强图像处理模块,配置为基于多个处理块对图像进行特征提取处理,得到所述多个处理块中每个处理块输出的图像特征信息之前,通过设定参数对样本图像进行以下至少一种增强处理:调整样本图像的大小、旋转样本图像的角度、改变样本图像的亮度;
图像处理模块61,具体配置为基于多个处理块对至少一种增强处理后的图像进行特征提取处理,得到多个处理块中每个处理块输出的图像特征信息。
本公开实施例实现了数据增强处理,通过调整上述至少一个参数,可以获得更多样本图像,或提升样本图像的显示效果,以达到更好的训练效果。例如:网络训练数据的裁剪大小为513x513;对于道路数据图像,随机调整大小的取值范围为[0.5,1.5];对于土地分类图像,随机调整大小的取值范围为[0.8,1.25];对于道路和土地数据,随机旋转范围为[-180,180],亮度调整参数为0.3。
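下面按文中给出的参数给出数据增强的一个示意(基于torchvision的假设性草图,输入为PIL图像;实际训练时标注图需同步进行相同的几何变换,此处从略):

    import random
    import torchvision.transforms.functional as TF

    def augment(img, scale_range):
        # 随机调整大小:道路数据取[0.5,1.5],土地分类数据取[0.8,1.25]
        s = random.uniform(*scale_range)
        w, h = img.size
        img = TF.resize(img, [int(h * s), int(w * s)])
        # 随机旋转角度,范围[-180,180]
        img = TF.rotate(img, random.uniform(-180.0, 180.0))
        # 改变亮度,扰动参数0.3
        return TF.adjust_brightness(img, 1.0 + random.uniform(-0.3, 0.3))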
在一个或多个可选的实施例中,上述图像的目标对象分割装置还可以包括:预处理模块,配置为基于多个处理块对图像进行特征提取处理,得到所述多个处理块中每个处理块输出的图像特征信息之前,基于设定大小的剪裁框对图像进行剪裁,获得至少一个剪裁图像;
图像处理模块61,具体配置为基于多个处理块对裁剪图像进行特征提取处理,得到多个处理块中每个处理块输出的图像特征信息。
本公开实施例实现数据预处理。为了获取更多的信息,增大网络的感受野,并加速整个训练过程,可以通过剪裁减小样本图像的大小,例如:将2448x2448的土地数据裁剪成1024x1024大小,此时一个土地数据经过剪裁将获得多个样本数据。在网络的训练过程中加大训练数据的裁剪尺寸,有助于网络提取更多场景信息,从而提升分割的效果。
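下面给出基于设定大小的剪裁框对大幅影像进行剪裁的一个示意(基于PIL的假设性草图;边缘不足一个剪裁框的部分此处简单丢弃,step小于tile时可产生重叠剪裁):

    def crop_tiles(img, tile=1024, step=1024):
        # 例如将2448x2448的土地数据裁剪成若干1024x1024的子图
        w, h = img.size
        tiles = []
        for top in range(0, h - tile + 1, step):
            for left in range(0, w - tile + 1, step):
                tiles.append(img.crop((left, top, left + tile, top + tile)))
        return tiles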
图7为本申请实施例提供的土地分割神经网络的训练方法的示例性流程图。如图7所示,该方法包括:
步骤710:将至少一个土地样本图像和至少一个道路样本图像输入土地分割神经网络,获得至少一个土地样本图像的预测分割结果以及至少一个道路样本图像的预测分割结果。
步骤720:基于至少一个土地样本图像的预测分割结果以及至少一个道路样本图像的预测分割结果,调整土地分割神经网络的参数。
对于土地图像,通常尺度较大,包含的场景多而且杂乱无章、没有光滑的边界线,并且由于土地覆盖本身没有明确量化的分界线,标注会存在歧义。传统的CNN很难针对场景较大的土地图像获取结构信息,从而导致分割效果较差。
在本公开实施例中提出利用具有标注信息的道路数据作为辅助数据帮助土地分割神经网络的训练。由于道路数据存在明显的结构特征,而且在土地图像中会存在一些道路数据,且在不同的土地类型中,道路的分布呈现不同的状态,因此基于这个想法,通过土地分割神经网络(例如:密集融合同学网络)同时获取土地和道路的信息,使道路信息辅助土地的分类。由于道路数据相对土地覆盖更容易获得,而且标注也更简单,因此在实际应用中,能够利用较少的、较难标注的土地覆盖数据,加上部分容易标注的道路数据,辅助土地覆盖类型的分类。
当图1所示的图像的目标对象分割方法中的图像为遥感图像,目标对象为土地时,经过本公开实施例训练获得的土地分割神经网络可以应用于上述图1所示的图像的目标对象分割方法,以实现对遥感图像中的土地进行分割,以获得土地分割结果。
在一个或多个可选的实施例中,土地分割神经网络包括顺序连接的多个处理块、融合网络和分割网络;
步骤710可以包括:基于多个处理块对至少一个土地样本图像和至少一个道路样本图像进行特征提取处理,得到多个处理块中每个处理块输出的样本图像特征信息;通过融合网络将多个处理块中的至少两对相邻处理块输出的样本图像特征信息进行至少两级融合处理,得到目标样本图像特征信息;基于目标样本图像特征信息,通过分割网络得到至少一个土地样本图像的预测分割结果和至少一个道路样本图像的预测分割结果。
为了进一步处理细节信息,在本公开实施例中提出了密集融合结构:将不同深度的Layer进行两两融合,通过逐元素求和(Element-wise Sum)进行融合,一直递归融合到最后一层。通过密集融合结构,可以使网络获取更多深层和浅层的信息,有利于在细节上进行准确的分割;同时,融合可以使网络的反向传播更好更快地回传到较浅层的Layer,有利于对网络进行更好的监督。
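下面给出“顺序连接的多个处理块 + 融合网络 + 分割网络”整体结构的一个极简示意(基于PyTorch的假设性草图,处理块结构、块数与类别数均为示例,并非本申请的正式网络):

    import torch
    import torch.nn as nn

    class LandSegNet(nn.Module):
        def __init__(self, in_ch=3, ch=64, num_blocks=4, num_classes=7):
            super().__init__()
            self.blocks = nn.ModuleList()
            for i in range(num_blocks):
                # 处理块:此处简化为卷积+BN+ReLU,保持大小和通道数一致以便两两融合
                self.blocks.append(nn.Sequential(
                    nn.Conv2d(in_ch if i == 0 else ch, ch, 3, padding=1),
                    nn.BatchNorm2d(ch),
                    nn.ReLU(inplace=True)))
            self.seg_head = nn.Conv2d(ch, num_classes, 1)  # 分割网络:逐像素分类

        def forward(self, x):
            feats = []
            for blk in self.blocks:  # 顺序连接的多个处理块
                x = blk(x)
                feats.append(x)      # 每个处理块输出的样本图像特征信息
            while len(feats) > 1:    # 融合网络:相邻两两逐元素相加,递归融合到一个
                feats = [feats[i] + feats[i + 1]
                         for i in range(0, len(feats) - 1, 2)]
            return self.seg_head(feats[0])  # 基于目标样本图像特征信息得到预测分割结果

    net = LandSegNet()
    pred = net(torch.randn(2, 3, 64, 64))  # 土地样本与道路样本可在批量维上混合输入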
可选地,基于所述多个处理块对各所述土地样本图像和各所述道路样本图像进行处理,得到每个所述土地样本图像对应的至少两组样本图像特征信息和每个所述道路样本图像对应的至少两组样本图像特征信息。
其中,可以通过多个处理块对每个土地样本图像进行处理,得到至少两组样本图像特征信息,其中,该至少两组样本图像特征信息可以对应于至少两个处理块,例如,包含多个处理块中每个处理块输出的样本图像特征信息,或者包含多个处理块中部分处理块输出的样本图像特征信息,本公开实施例对此不做限定。
本公开实施例的土地分割神经网络对输入的各土地样本图像和各道路样本图像分别进行处理,防止批量处理时不同样本图像之间的图像特征信息出现混淆,导致训练结果不准确。
可选地,将多个处理块中的至少两对相邻处理块输出的样本图像特征信息进行至少两级融合处理,得到目标样本图像特征信息,包括:对每个土地样本图像对应的至少两组样本图像特征信息进行至少两级融合,得到所述每个土地样本图像对应的土地样本图像特征信息;对每个道路样本图像对应的至少两组样本图像特征信息进行至少两级融合,得到所述每个道路样本图像的道路样本图像特征信息,其中,所述目标样本图像特征信息包括所述至少一个土地样本图像对应的土地样本图像特征信息和所述至少一个道路样本图像对应的道路样本图像特征信息。
每个土地样本图像和每个道路样本图像都分别具有不同的图像特征信息,如果不同样本图像的图像特征信息相互融合,将会导致训练结果不准确。本公开实施例的土地分割神经网络将每个样本图像(土地样本图像或道路样本图像)对应的至少两组样本图像特征信息分别进行融合,防止多个样本图像对应的样本图像特征信息之间相互融合。
可选地,土地分割神经网络还包括切片层;基于目标样本图像特征信息,确定至少一个土地样本图像的预测分割结果和至少一个道路样本图像的预测分割结果之前,还包括:
通过所述切片层对所述目标样本图像特征信息中包含的所述土地样本图像特征信息与所述道路样本图像特征信息进行分割;将所述土地样本图像特征信息输入到所述分割网络进行处理,得到土地样本图像的预测分割结果,并将所述道路样本图像特征信息输入所述分割网络进行处理,获得所述道路样本图像的预测分割结果。
当至少一个土地样本图像和至少一个道路样本图像经过土地分割神经网络中多个顺序连接的处理块处理,得到对应的目标样本图像特征信息后,为了区分土地样本图像和道路样本图像,以实现利用道路图像的信息对土地分割神经网络进行训练,本公开实施例通过切片层(slice)对土地样本图像对应的目标样本图像特征信息与道路样本图像对应的目标样本图像特征信息进行区分,具体可根据输入土地样本图像和道路样本图像的顺序进行区分。
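下面给出切片层按输入顺序区分两类样本特征的一个最小示意(假设批量维上前num_land个样本来自土地样本图像,其余来自道路样本图像):

    import torch

    def slice_features(mixed, num_land):
        # mixed:形如[N, C, H, W]的目标样本图像特征信息,按输入样本的顺序排列
        land_feats = mixed[:num_land]   # 后续输入分割网络,得到土地样本图像的预测分割结果
        road_feats = mixed[num_land:]   # 后续输入分割网络,得到道路样本图像的预测分割结果
        return land_feats, road_feats

    land_f, road_f = slice_features(torch.randn(8, 64, 32, 32), num_land=4)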
可选地,土地样本图像和道路样本图像分别具有标注信息;
基于至少一个土地样本图像的预测分割结果以及至少一个道路样本图像的预测分割结果,调整土地分割神经网络的参数,包括:基于土地样本图像对应的预测分割结果和土地样本图像对应的标注信息获得第一损失;基于道路样本图像对应的预测分割结果和道路图像对应的标注信息获得第二损失;基于第一损失和第二损失调整土地分割神经网络的参数。
可选地,将第一损失和第二损失加权求和,得到总损失;基于总损失,调整土地分割神经网络的参数。该加权求和的权重值可以预先设定,或通过实验或多次训练获得,通常第一损失的权重值大于第二损失的权重值,例如:第一损失与第二损失的权重值之比为8:7,具体的权重值大小本申请实施例不作限定。
本公开实施例中,利用道路数据弥补土地分类的结构缺失信息,提高了土地分割神经网络对土地分割任务的准确性。利用容易获得且容易标注的道路数据,在加入道路数据进行分割之后,能够提升土地覆盖分类的效率和准确率,并且在细节上的处理更加完善。
本申请土地分割神经网络的训练过程的一个示例可以如图3所示,其实现的分割效果与FC-DenseNet分割效果的对比可以如图4所示,其实现的分割效果与FC-DenseNet和ClassmateNet结构分割效果的对比可以如图5所示。
在实际应用中,由于道路数据是相对简单的,在标注和获取途径上都比土地覆盖的图像更容易获得。因此引入简单的道路数据,可以在很大程度上提升较难获取和较难标注的土地覆盖图像的分类效果,并节约标注的人力;加入密集融合网络结构后,还可以在细节上有助于土地覆盖的分类。
本领域普通技术人员可以理解:实现上述方法实施例的全部或部分步骤可以通过程序指令相关的硬件来完成,前述的程序可以存储于一计算机可读取存储介质中,该程序在执行时,执行包括上述方法实施例的步骤;而前述的存储介质包括:ROM、RAM、磁碟或者光盘等各种可以存储程序代码的介质。
图8为本申请实施例提供的土地分割神经网络的训练装置的结构示意图。该装置可用于实现本申请上述各方法实施例。如图8所示,该装置包括:
结果预测模块81,配置为将至少一个土地样本图像和至少一个道路样本图像输入所述土地分割神经网络,获得至少一个土地样本图像的预测分割结果以及至少一个道路样本图像的预测分割结果。
参数调整模块82,配置为基于至少一个土地样本图像的预测分割结果以及至少一个道路样本图像的预测分割结果,调整土地分割神经网络的参数。
对于土地图像,通常尺度较大,包含的场景多而且杂乱无章、没有光滑的边界线,并且由于土地覆盖本身没有明确量化的分界线,标注会存在歧义。传统的CNN很难针对场景较大的土地图像获取结构信息,从而导致分割效果较差。
在本公开实施例中提出利用具有标注信息的道路数据作为辅助数据帮助土地分割神经网络的训练。由于道路数据存在明显的结构特征,而且在土地图像中会存在一些道路数据,且在不同的土地类型中,道路的分布呈现不同的状态,因此基于这个想法,通过土地分割神经网络(例如:密集融合同学网络)同时获取土地和道路的信息,使道路信息辅助土地的分类。由于道路数据相对土地覆盖更容易获得,而且标注也更简单,因此在实际应用中,能够利用较少的、较难标注的土地覆盖数据,加上部分容易标注的道路数据,辅助土地覆盖类型的分类。
在一个或多个可选的实施例中,土地分割神经网络包括顺序连接的多个处理块、融合网络和分割网络;
结果预测模块81,具体配置为基于多个处理块对至少一个土地样本图像和至少一个道路样本图像进行特征提取处理,得到多个处理块中每个处理块输出的样本图像特征信息;通过融合网络将多个处理块中的至少两对相邻处理块输出的样本图像特征信息进行至少两级融合处理,得到目标样本图像特征信息;基于目标样本图像特征信息,通过分割网络得到至少一个土地样本图像的预测分割结果和至少一个道路样本图像的预测分割结果。
为了进一步处理细节信息,在本公开实施例中提出了密集融合结构:将不同深度的Layer进行两两融合,通过逐元素求和(Element-wise Sum)进行融合,一直递归融合到最后一层。通过密集融合结构,可以使网络获取更多深层和浅层的信息,有利于在细节上进行准确的分割;同时,融合可以使网络的反向传播更好更快地回传到较浅层的Layer,有利于对网络进行更好的监督。
可选地,结果预测模块81,具体配置为基于所述多个处理块对各所述土地样本图像和各所述道路样本图像进行处理,得到每个所述土地样本图像对应的至少两组样本图像特征信息和每个所述道路样本图像对应的至少两组样本图像特征信息。
可选地,结果预测模块81,具体配置为对每个土地样本图像对应的至少两组样本图像特征信息进行至少两级融合,得到所述每个土地样本图像对应的土地样本图像特征信息;对每个道路样本图像对应的至少两组样本图像特征信息进行至少两级融合,得到所述每个道路样本图像的道路样本图像特征信息,其中,所述目标样本图像特征信息包括所述至少一个土地样本图像对应的土地样本图像特征信息和所述至少一个道路样本图像对应的道路样本图像特征信息。
可选地,土地分割神经网络还包括切片层;
结果预测模块81还配置为基于所述目标样本图像特征信息,确定所述至少一个土地样本图像的预测分割结果和所述至少一个道路样本图像的预测分割结果之前,通过所述切片层对所述目标样本图像特征信息中包含的所述土地样本图像特征信息与所述道路样本图像特征信息进行分割;将所述土地样本图像特征信息输入到所述分割网络进行处理,得到土地样本图像的预测分割结果,并将所述道路样本图像特征信息输入所述分割网络进行处理,获得所述道路样本图像的预测分割结果。
可选地,土地样本图像和道路样本图像分别具有标注信息;参数调整模块82,具体配置为基于土地样本图像对应的预测分割结果和土地样本图像对应的标注信息获得第一损失;基于道路样本图像对应的预测分割结果和道路样本图像对应的标注信息获得第二损失;基于第一损失和第二损失调整土地分割神经网络的参数。
可选地,参数调整模块82,具体配置为将第一损失和第二损失加权求和,得到总损失;基于总损失,调整土地分割神经网络的参数。
根据本申请实施例的另一个方面,提供了一种电子设备,包括处理器,所述处理器包括如上任意一项所述的图像分割装置或如上任意一项所述的土地分割神经网络的训练装置。
根据本申请实施例的另一个方面,提供了一种电子设备,包括:存储器,配置为存储可执行指令;
以及处理器,配置为与所述存储器通信以执行所述可执行指令从而完成如上任意一项所述图像分割方法的操作,或者,配置为与所述存储器通信以执行所述可执行指令从而完成如上任意一项所述土地分割神经网络的训练方法的操作。
根据本申请实施例的另一个方面,提供了一种计算机可读存储介质,配置为存储计算机可读取的指令,所述指令被执行时执行如上任意一项所述图像分割方法或如上任意一项所述土地分割神经网络的训练方法的操作。
根据本申请实施例的另一个方面,提供了一种计算机程序产品,包括计算机可读代码,当所述计算机可读代码在设备上运行时,所述设备中的处理器执行配置为实现如上任意一项所述图像分割方法或如上任意一项所述土地分割神经网络的训练方法的指令。
在一个或多个可选实施方式中,本申请实施例还提供了一种计算机程序产品,配置为存储计算机可读指令,所述指令被执行时使得计算机执行上述任一可能的实现方式中所述的图像分割方法,或执行上述任一可能的实现方式中所述的土地分割神经网络的训练方法的操作。
该计算机程序产品可以具体通过硬件、软件或其结合的方式实现。在一个可选例子中,所述计算机程序产品具体体现为计算机存储介质,在另一个可选例子中,所述计算机程序产品具体体现为软件产品,例如软件开发包(Software Development Kit,SDK)等等。
根据本申请实施例还提供了图像分割及土地分割神经网络的训练方法和装置、电子设备、计算机存储介质、计算机程序产品,其中,通过多个处理块对图像进行特征提取处理,得到多个处理块中每个处理块输出的图像特征信息;将多个处理块中的至少两对相邻处理块输出的图像特征信息进行至少两级融合处理,得到目标图像特征信息;基于目标图像特征信息,确定图像的目标对象分割结果。
应理解,本申请实施例中的“第一”、“第二”等术语仅仅是为了区分,而不应理解成对本申请实施例的限定。
还应理解,在本申请中,“多个”可以指两个或两个以上,“至少一个”可以指一个、两个或两个以上。
还应理解,对于本申请中提及的任一部件、数据或结构,在没有明确限定或者在前后文给出相反启示的情况下,一般可以理解为一个或多个。
还应理解,本申请对各个实施例的描述着重强调各个实施例之间的不同之处,其相同或相似之处可以相互参考,为了简洁,不再一一赘述。
本申请实施例还提供了一种电子设备,例如可以是移动终端、个人计算机(PC)、平板电脑、服务器等。下面参考图9,其示出了适于用来实现本申请实施例的电子设备900的一个示例的结构示意图:如图9所示,电子设备900包括一个或多个处理器、通信部等,所述一个或多个处理器例如:一个或多个中央处理单元(CPU)901,和/或一个或多个图像处理器(GPU)913等,处理器可以根据存储在只读存储器(ROM)902中的可执行指令或者从存储部分908加载到随机访问存储器(RAM)903中的可执行指令而执行各种适当的动作和处理。通信部912可包括但不限于网卡,所述网卡可包括但不限于IB(Infiniband)网卡。
处理器可与只读存储器902和/或随机访问存储器903通信以执行可执行指令,通过总线904与通信部912相连、并经通信部912与其他目标设备通信,从而完成本申请实施例提供的任一项方法对应的操作,例如,通过多个处理块对图像进行特征提取处理,得到多个处理块中每个处理块输出的图像特征信息;将多个处理块中的至少两对相邻处理块输出的图像特征信息进行至少两级融合处理,得到目标图像特征信息;基于目标图像特征信息,确定图像的目标对象分割结果。
此外,在RAM903中,还可存储有装置操作所需的各种程序和数据。CPU901、ROM902以及RAM903通过总线904彼此相连。在有RAM903的情况下,ROM902为可选模块。RAM903存储可执行指令,或在运行时向ROM902中写入可执行指令,可执行指令使中央处理单元901执行上述方法对应的操作。输入/输出(I/O)接口905也连接至总线904。通信部912可以集成设置,也可以设置为具有多个子模块(例如多个IB网卡),并链接在总线上。
以下部件连接至I/O接口905:包括键盘、鼠标等的输入部分906;包括诸如阴极射线管(CRT)、液晶显示器(LCD)等以及扬声器等的输出部分907;包括硬盘等的存储部分908;以及包括诸如LAN卡、调制解调器等的网络接口卡的通信部分909。通信部分909经由诸如因特网的网络执行通信处理。驱动器910也根据需要连接至I/O接口905。可拆卸介质911,诸如磁盘、光盘、磁光盘、半导体存储器等等,根据需要安装在驱动器910上,以便于从其上读出的计算机程序根据需要被安装入存储部分908。
需要说明的,如图9所示的架构仅为一种可选实现方式,在具体实践过程中,可根据实际需要对上述图9的部件数量和类型进行选择、删减、增加或替换;在不同功能部件设置上,也可采用分离设置或集成设置等实现方式,例如GPU913和CPU901可分离设置或者可将GPU913集成在CPU901上,通信部可分离设置,也可集成设置在CPU901或GPU913上,等等。这些可替换的实施方式均落入本申请公开的保护范围。
特别地,根据本申请的实施例,上文参考流程图描述的过程可以被实现为计算机软件程序。例如,本申请的实施例包括一种计算机程序产品,其包括有形地包含在机器可读介质上的计算机程序,计算机程序包含用于执行流程图所示的方法的程序代码,程序代码可包括对应执行本申请实施例提供的方法步骤对应的指令,例如,通过多个处理块对图像进行特征提取处理,得到多个处理块中每个处理块输出的图像特征信息;将多个处理块中的至少两对相邻处理块输出的图像特征信息进行至少两级融合处理,得到目标图像特征信息;基于目标图像特征信息,确定图像的目标对象分割结果。在这样的实施例中,该计算机程序可以通过通信部分909从网络上被下载和安装,和/或从可拆卸介质911被安装。在该计算机程序被中央处理单元(CPU)901执行时,执行本申请的方法中限定的上述功能的操作。
可能以许多方式来实现本申请的方法和装置。例如,可通过软件、硬件、固件或者软件、硬件、固件的任何组合来实现本申请的方法和装置。用于所述方法的步骤的上述顺序仅是为了进行说明,本申请的方法的步骤不限于以上具体描述的顺序,除非以其它方式特别说明。此外,在一些实施例中,还可将本申请实施为记录在记录介质中的程序,这些程序包括用于实现根据本申请的方法的机器可读指令。因而,本申请还覆盖存储用于执行根据本申请的方法的程序的记录介质。
本申请的描述是为了示例和描述起见而给出的,而并不是无遗漏的或者将本申请限于所公开的形式。很多修改和变化对于本领域的普通技术人员而言是显然的。选择和描述实施例是为了更好说明本申请的原理和实际应用,并且使本领域的普通技术人员能够理解本申请从而设计适于特定用途的带有各种修改的各种实施例。

Claims (54)

  1. 一种图像分割方法,其中,包括:
    通过多个处理块对图像进行特征提取处理,得到所述多个处理块中每个处理块输出的图像特征信息;
    将所述多个处理块中的至少两对相邻处理块输出的图像特征信息进行至少两级融合处理,得到目标图像特征信息;
    基于所述目标图像特征信息,确定所述图像的目标对象分割结果。
  2. 根据权利要求1所述的方法,其中,所述将所述多个处理块中的至少两对相邻处理块输出的图像特征信息进行至少两级融合处理,得到目标图像特征信息,包括:
    将每对所述相邻处理块输出的图像特征信息进行第一级融合处理,得到第一融合特征信息;
    将至少一对相邻的所述第一融合特征信息进行第二级融合处理,得到至少一个第二融合特征信息;
    基于所述至少一个第二融合特征信息,确定所述目标图像特征信息。
  3. 根据权利要求2所述的方法,其中,所述基于所述至少一个第二融合特征信息,确定所述目标图像特征信息,包括:
    对所述至少一个第二融合特征信息进行后续特征融合处理,直到所述后续融合处理得到的后续融合特征信息的数量为一个;
    将所述数量为一个的后续融合特征信息作为所述目标图像特征信息。
  4. 根据权利要求2或3所述的方法,其中,在将每对所述相邻处理块输出的图像特征信息进行融合处理的过程中,将每对所述相邻处理块输出的图像特征信息逐元素相加。
  5. 根据权利要求1-4任一项所述的方法,其中,所述多个处理块之间顺序连接;和/或,每对所述相邻处理块输出的图像特征信息具有相同的大小和相同的通道数。
  6. 根据权利要求1-5任一项所述的方法,其中,所述处理块包括至少一个处理单元,每个所述处理单元包括至少一个特征提取层和特征调整层;
    所述通过多个处理块对图像进行特征提取处理,得到所述多个处理块中每个处理块输出的图像特征信息,包括:
    通过所述处理单元中的所述至少一个特征提取层对所述处理单元的输入信息进行特征提取处理,得到第一特征信息;
    通过所述处理单元中的所述特征调整层对所述第一特征信息进行调整处理,得到所述处理单元输出的图像特征信息。
  7. 根据权利要求1-6任一项所述的方法,其中,所述将所述多个处理块中的至少两对相邻处理块输出的图像特征信息进行至少两级融合处理,得到目标图像特征信息之前,所述方法还包括:
    对所述多个处理块中的处理块M1输出的图像特征信息进行特征缩减处理;
    对所述多个处理块中的处理块M2输出的图像特征信息进行特征扩展处理;所述处理块M2的输入端与所述处理块M1的输出端直接或间接连接。
  8. 根据权利要求1-7任一项所述的方法,其中,所述基于多个处理块对所述图像进行特征提取处理,得到所述多个处理块中每个处理块输出的图像特征信息,包括:
    利用所述多个处理块中的处理块N1对所述处理块N1的输入信息进行特征提取处理,得到所述处理块N1对应的第一图像特征信息,所述处理块N1的输入信息包括所述图像和/或位于所述处理块N1之前的至少一个处理块输出的图像特征信息,N1为大于或等于1的整数;
    将所述第一图像特征信息输入到所述处理块N1的下一处理块进行特征提取处理,得到所述下一处理块输出的第二图像特征信息。
  9. 根据权利要求8所述的方法,其中,所述将所述第一图像特征信息输入到所述处理块N1的下一处理块进行处理,得到所述下一处理块输出的第二图像特征信息,包括:
    将所述图像和/或至少一个处理块N2输出的图像特征信息以及所述第一图像特征信息输入到所述处理块N1的下一处理块进行特征提取处理,得到所述下一处理块输出的第二图像特征信息,所述处理块N1的输入端与所述处理块N2的输出端直接或间接连接。
  10. 根据权利要求9所述的方法,其中,所述将所述图像和/或至少一个处理块N2输出的图像特征信息以及所述第一图像特征信息输入到所述处理块N1的下一处理块进行特征提取处理之前,所述方法还包括:
    将所述至少一个处理块N2输出的图像特征信息进行融合处理,并将融合处理得到的图像特征信息输入所述处理块N1的下一处理块。
  11. 根据权利要求1-10任一项所述的方法,其中,所述基于多个处理块对图像进行特征提取处理,得到所述多个处理块中每个处理块输出的图像特征信息之前,所述方法还包括:
    通过卷积层对所述图像进行特征提取处理,得到所述图像的初始特征信息;
    所述通过多个处理块对图像进行特征提取处理,包括:
    将所述图像的初始特征信息输入所述多个处理块进行特征提取处理。
  12. 根据权利要求1-11任一项所述的方法,其中,所述图像为遥感图像,所述目标对象为土地。
  13. 根据权利要求1-12任一项所述的方法,其中,所述方法利用分割神经网络实现,所述图像为土地样本图像;
    所述方法还包括:
    利用所述分割神经网络对道路样本图像进行处理,得到所述道路样本图像的分割结果;
    基于所述土地样本图像的目标对象预测结果以及所述道路样本图像的分割结果,调整所述分割神经网络的参数。
  14. 根据权利要求13所述的方法,其中,所述目标图像特征信息是基于混合特征信息得到的,所述混合特征信息是由所述分割神经网络对所述土地样本图像和所述道路样本图像进行批量处理得到的。
  15. 根据权利要求13或14所述的方法,其中,所述基于所述土地样本图像的目标对象预测结果以及所述道路样本图像的分割结果,调整所述分割神经网络的参数,包括:
    基于所述土地样本图像的目标对象预测结果和所述土地样本图像的标注信息获得第一损失;
    基于所述道路样本图像的分割结果和所述道路样本图像的标注信息获得第二损失;
    基于所述第一损失和所述第二损失调整所述分割神经网络的参数。
  16. 根据权利要求15所述的方法,其中,所述基于所述第一损失和所述第二损失调整所述分割神经网络的参数,包括:
    将所述第一损失和所述第二损失加权求和,得到总损失;
    基于所述总损失,调整所述分割神经网络的参数。
  17. 根据权利要求13-16任一项所述的方法,其中,所述基于多个处理块对图像进行特征提取处理,得到所述多个处理块中每个处理块输出的图像特征信息之前,还包括:
    通过设定参数对所述样本图像进行以下至少一种增强处理:调整所述样本图像的大小、旋转所述样本图像的角度、改变所述样本图像的亮度;
    所述基于多个处理块对图像进行特征提取处理,得到所述多个处理块中每个处理块输出的图像特征信息,包括:
    基于多个处理块对所述至少一种增强处理后的图像进行特征提取处理,得到所述多个处理块中每个处理块输出的图像特征信息。
  18. 根据权利要求1-17任一项所述的方法,其中,所述基于多个处理块对图像进行特征提取处理,得到所述多个处理块中每个处理块输出的图像特征信息之前,还包括:基于设定大小的剪裁框对所述图像进行剪裁,获得至少一个剪裁图像;
    所述基于多个处理块对图像进行特征提取处理,得到所述多个处理块中每个处理块输出的图像特征信息,包括:基于多个处理块对所述裁剪图像进行特征提取处理,得到所述多个处理块中每个处理块输出的图像特征信息。
  19. 一种图像分割装置,其中,包括:
    图像处理模块,配置为通过多个处理块对图像进行特征提取处理,得到所述多个处理块中每个处理块输出的图像特征信息;
    融合模块,配置为将所述多个处理块中的至少两对相邻处理块输出的图像特征信息进行至少两级融合处理,得到目标图像特征信息;
    分割模块,配置为基于所述目标图像特征信息,确定所述图像的目标对象分割结果。
  20. 根据权利要求19所述的装置,其中,所述融合模块,具体配置为将每对所述相邻处理块输出的图像特征信息进行第一级融合处理,得到第一融合特征信息;将至少一对相邻的所述第一融合特征信息进行第二级融合处理,得到至少一个第二融合特征信息;基于所述至少一个第二融合特征信息,确定所述目标图像特征信息。
  21. 根据权利要求20所述的装置,其中,所述融合模块,具体配置为对所述至少一个第二融合特征信息进行后续特征融合处理,直到所述后续融合处理得到的后续融合特征信息的数量为一个;将所述数量为一个的后续融合特征信息作为所述目标图像特征信息。
  22. 根据权利要求20或21所述的装置,其中,所述融合模块,具体配置为在将每对所述相邻处理块输出的图像特征信息进行融合处理的过程中,将每对所述相邻处理块输出的图像特征信息逐元素相加。
  23. 根据权利要求19-22任一项所述的装置,其中,所述多个处理块之间顺序连接;和/或,每对所述相邻处理块输出的图像特征信息具有相同的大小和相同的通道数。
  24. 根据权利要求19-23任一项所述的装置,其中,所述处理块包括至少一个处理单元,每个所述处理单元包括至少一个特征提取层和特征调整层;
    所述图像处理模块,具体配置为通过所述处理单元中的所述至少一个特征提取层对所述处理单元的输入信息进行特征提取处理,得到第一特征信息;通过所述处理单元中的所述特征调整层对所述第一特征信息进行调整处理,得到所述处理单元输出的图像特征信息。
  25. 根据权利要求19-24任一项所述的装置,其中,还包括:特征图像处理模块,配置为在将所述多个处理块中的至少两对相邻处理块输出的图像特征信息进行至少两级融合处理,得到目标图像特征信息之前,对所述多个处理块中的处理块M1输出的图像特征信息进行特征缩减处理;对所述多个处理块中的处理块M2输出的图像特征信息进行特征扩展处理;所述处理块M2的输入端与所述处理块M1的输出端直接或间接连接。
  26. 根据权利要求19-25任一项所述的装置,其中,所述图像处理模块,具体配置为利用所述多个处理块中的处理块N1对所述处理块N1的输入信息进行特征提取处理,得到所述处理块N1对应的第一图像特征信息,将所述第一图像特征信息输入到所述处理块N1的下一处理块进行特征提取处理,得到所述下一处理块输出的第二图像特征信息;所述处理块N1的输入信息包括所述图像和/或位于所述处理块N1之前的至少一个处理块输出的图像特征信息,N1为大于或等于1的整数。
  27. 根据权利要求26所述的装置,其中,所述图像处理模块,具体配置为将所述图像和/或至少一个处理块N2输出的图像特征信息以及所述第一图像特征信息输入到所述处理块N1的下一处理块进行特征提取处理,得到所述下一处理块输出的第二图像特征信息,所述处理块N1的输入端与所述处理块N2的输出端直接或间接连接。
  28. 根据权利要求27所述的装置,其中,所述图像处理模块,还配置为将所述图像和/或至少一个处理块N2输出的图像特征信息以及所述第一图像特征信息输入到所述处理块N1的下一处理块进行特征提取处理之前,将所述至少一个处理块N2输出的图像特征信息进行融合处理,并将融合处理得到的图像特征信息输入所述处理块N1的下一处理块。
  29. 根据权利要求19-28任一项所述的装置,其中,还包括:特征提取模块,配置为基于多个处理块对图像进行特征提取处理,得到所述多个处理块中每个处理块输出的图像特征信息之前,通过卷积层对所述图像进行特征提取处理,得到所述图像的初始特征信息,将所述图像的初始特征信息输入所述多个处理块进行特征提取处理。
  30. 根据权利要求19-29任一项所述的装置,其中,所述图像为遥感图像,所述目标对象为土地。
  31. 根据权利要求19-30任一项所述的装置,其中,所述装置利用分割神经网络实现,所述图像为土地样本图像;所述装置还包括:训练模块,配置为利用所述分割神经网络对道路样本图像进行处理,得到所述道路样本图像的分割结果;基于所述土地样本图像的目标对象预测结果以及所述道路样本图像的分割结果,调整所述分割神经网络的参数。
  32. 根据权利要求31所述的装置,其中,所述目标图像特征信息是基于混合特征信息得到的,所述混合特征信息是由所述分割神经网络对所述土地样本图像和所述道路样本图像进行批量处理得到的。
  33. 根据权利要求31或32所述的装置,其中,所述训练模块,具体配置为基于所述土地样本图像的目标对象预测结果和所述土地样本图像的标注信息获得第一损失;基于所述道路样本图像的分割结果和所述道路样本图像的标注信息获得第二损失;基于所述第一损失和所述第二损失调整所述分割神经网络的参数。
  34. 根据权利要求33所述的装置,其中,所述训练模块,具体配置为将所述第一损失和所述第二损失加权求和,得到总损失;基于所述总损失,调整所述分割神经网络的参数。
  35. 根据权利要求30-33任一项所述的装置,其中,还包括:增强图像处理模块,配置为基于多个处理块对图像进行特征提取处理,得到所述多个处理块中每个处理块输出的图像特征信息之前,通过设定参数对所述样本图像进行以下至少一种增强处理:调整所述样本图像的大小、旋转所述样本图像的角度、改变所述样本图像的亮度;
    所述图像处理模块,具体配置为基于多个处理块对所述至少一种增强处理后的图像进行特征提取处理,得到所述多个处理块中每个处理块输出的图像特征信息。
  36. 根据权利要求19-35任一项所述的装置,其中,还包括:预处理模块,配置为基于多个处理块对图像进行特征提取处理,得到所述多个处理块中每个处理块输出的图像特征信息之前,基于设定大小的剪裁框对所述图像进行剪裁,获得至少一个剪裁图像;
    所述图像处理模块,具体配置为基于多个处理块对所述裁剪图像进行特征提取处理,得到所述多个处理块中每个处理块输出的图像特征信息。
  37. 一种土地分割神经网络的训练方法,其中,包括:
    将至少一个土地样本图像和至少一个道路样本图像输入所述土地分割神经网络,获得所述至少一个土地样本图像的预测分割结果以及所述至少一个道路样本图像的预测分割结果;
    基于所述至少一个土地样本图像的预测分割结果以及所述至少一个道路样本图像的预测分割结果,调整所述土地分割神经网络的参数。
  38. 根据权利要求37所述的方法,其中,所述土地分割神经网络包括顺序连接的多个处理块、融合网络和分割网络;
    所述将至少一个土地样本图像和至少一个道路样本图像输入所述土地分割神经网络,获得所述至少一个土地样本图像的预测分割结果以及所述至少一个道路样本图像的预测分割结果,包括:
    通过多个处理块对所述至少一个土地样本图像和所述至少一个道路样本图像进行特征提取处理,得到所述多个处理块中每个处理块输出的样本图像特征信息;
    通过所述融合网络将所述多个处理块中的至少两对相邻处理块输出的样本图像特征信息进行至少两级融合处理,得到目标样本图像特征信息;
    基于所述目标样本图像特征信息,通过所述分割网络得到所述至少一个土地样本图像的预测分割结果和所述至少一个道路样本图像的预测分割结果。
  39. 根据权利要求38所述的方法,其中,所述基于所述多个处理块对所述至少一个土地样本图像和所述至少一个道路样本图像进行特征提取处理,得到所述多个处理块中每个处理块输出的样本图像特征信息,包括:
    基于所述多个处理块对各所述土地样本图像和各所述道路样本图像进行处理,得到每个所述土地样本图像对应的至少两组样本图像特征信息和每个所述道路样本图像对应的至少两组样本图像特征信息。
  40. 根据权利要求38或39所述的方法,其中,所述将所述多个处理块中的至少两对相邻处理块输出的样本图像特征信息进行至少两级融合处理,得到目标样本图像特征信息,包括:
    对每个土地样本图像对应的至少两组样本图像特征信息进行至少两级融合,得到所述每个土地样本图像对应的土地样本图像特征信息;
    对每个道路样本图像对应的至少两组样本图像特征信息进行至少两级融合,得到所述每个道路样本图像的道路样本图像特征信息,其中,
    所述目标样本图像特征信息包括所述至少一个土地样本图像对应的土地样本图像特征信息和所述至少一个道路样本图像对应的道路样本图像特征信息。
  41. 根据权利要求38-40任一项所述的方法,其中,所述土地分割神经网络还包括切片层;
    所述基于所述目标样本图像特征信息,确定所述至少一个土地样本图像的预测分割结果和所述至少一个道路样本图像的预测分割结果之前,还包括:
    通过所述切片层对所述目标样本图像特征信息中包含的所述土地样本图像特征信息与所述道路样本图像特征信息进行分割;
    将所述土地样本图像特征信息输入到所述分割网络进行处理,得到土地样本图像的预测分割结果,并将所述道路样本图像特征信息输入所述分割网络进行处理,获得所述道路样本图像的预测分割结果。
  42. 根据权利要求41所述的方法,其中,所述土地样本图像和所述道路样本图像分别具有标注信息;
    所述基于所述至少一个土地样本图像的预测分割结果以及所述至少一个道路样本图像的预测分割结果,调整所述土地分割神经网络的参数,包括:
    基于所述土地样本图像对应的预测分割结果和所述土地样本图像对应的标注信息获得第一损失;
    基于所述道路样本图像对应的预测分割结果和所述道路样本图像对应的标注信息获得第二损失;
    基于所述第一损失和所述第二损失调整所述土地分割神经网络的参数。
  43. 根据权利要求42所述的方法,其中,所述基于所述第一损失和所述第二损失调整所述土地分割神经网络的参数,包括:
    将所述第一损失和所述第二损失加权求和,得到总损失;
    基于所述总损失,调整所述土地分割神经网络的参数。
  44. 一种土地分割神经网络的训练装置,其中,包括:
    结果预测模块,配置为将至少一个土地样本图像和至少一个道路样本图像输入所述土地分割神经网络,获得所述至少一个土地样本图像的预测分割结果以及所述至少一个道路样本图像的预测分割结果;
    参数调整模块,配置为基于所述至少一个土地样本图像的预测分割结果以及所述至少一个道路样本图像的预测分割结果,调整所述土地分割神经网络的参数。
  45. 根据权利要求44所述的装置,其中,所述土地分割神经网络包括顺序连接的多个处理块、融合网络和分割网络;
    所述结果预测模块,具体配置为基于所述多个处理块对所述至少一个土地样本图像和所述至少一个道路样本图像进行特征提取处理,得到所述多个处理块中每个处理块输出的样本图像特征信息;通过所述融合网络将多个处理块中的至少两对相邻处理块输出的样本图像特征信息进行至少两级融合处理,得到目标样本图像特征信息;基于所述目标样本图像特征信息,通过所述分割网络得到所述至少一个土地样本图像的预测分割结果和所述至少一个道路样本图像的预测分割结果。
  46. 根据权利要求45所述的装置,其中,所述结果预测模块,具体配置为基于所述多个处理块对各所述土地样本图像和各所述道路样本图像进行处理,得到每个所述土地样本图像对应的至少两组样本图像特征信息和每个所述道路样本图像对应的至少两组样本图像特征信息。
  47. 根据权利要求45或46所述的装置,其中,所述结果预测模块,具体配置为对每个土地样本图像对应的至少两组样本图像特征信息进行至少两级融合,得到所述每个土地样本图像对应的土地样本图像特征信息;对每个道路样本图像对应的至少两组样本图像特征信息进行至少两级融合,得到所述每个道路样本图像的道路样本图像特征信息,其中,所述目标样本图像特征信息包括所述至少一个土地样本图像对应的土地样本图像特征信息和所述至少一个道路样本图像对应的道路样本图像特征信息。
  48. 根据权利要求45-47任一项所述的装置,其中,所述土地分割神经网络还包括切片层;所述结果预测模块还配置为基于所述目标样本图像特征信息,确定所述至少一个土地样本图像的预测分割结果和所述至少一个道路样本图像的预测分割结果之前,通过所述切片层对所述目标样本图像特征信息中包含的所述土地样本图像特征信息与所述道路样本图像特征信息进行分割;将所述土地样本图像特征信息输入到所述分割网络进行处理,得到土地样本图像的预测分割结果,并将所述道路样本图像特征信息输入所述分割网络进行处理,获得所述道路样本图像的预测分割结果。
  49. 根据权利要求48所述的装置,其中,所述土地样本图像和所述道路样本图像分别具有标注信息;所述参数调整模块,具体配置为基于所述土地样本图像对应的预测分割结果和所述土地样本图像对应的标注信息获得第一损失;基于所述道路样本图像对应的预测分割结果和所述道路样本图像对应的标注信息获得第二损失;基于所述第一损失和所述第二损失调整所述土地分割神经网络的参数。
  50. 根据权利要求49所述的装置,其中,所述参数调整模块,具体配置为将所述第一损失和所述第二损失加权求和,得到总损失;基于所述总损失,调整所述土地分割神经网络的参数。
  51. 一种电子设备,其中,包括处理器,所述处理器包括权利要求19至36任一项所述的图像分割装置或权利要求44至50任一项所述的土地分割神经网络的训练装置。
  52. 一种电子设备,其中,包括:存储器,配置为存储可执行指令;
    以及处理器,配置为与所述存储器通信以执行所述可执行指令从而完成权利要求1至18任意一项所述图像分割方法的操作,或者,配置为与所述存储器通信以执行所述可执行指令从而完成权利要求37至43任意一项所述土地分割神经网络的训练方法的操作。
  53. 一种计算机可读存储介质,配置为存储计算机可读取的指令,其中,所述指令被执行时执行权利要求1至18任意一项所述图像分割方法或权利要求37至43任意一项所述土地分割神经网络的训练方法的操作。
  54. 一种计算机程序产品,包括计算机可读代码,其中,当所述计算机可读代码在设备上运行时,所述设备中的处理器执行配置为实现权利要求1至18任意一项所述图像分割方法或权利要求37至43任意一项所述土地分割神经网络的训练方法的指令。
PCT/CN2019/091328 2018-06-15 2019-06-14 图像分割及分割网络训练方法和装置、设备、介质、产品 WO2019238126A1 (zh)

Priority Applications (3)

Application Number Priority Date Filing Date Title
JP2020569112A JP7045490B2 (ja) 2018-06-15 2019-06-14 画像分割と分割ネットワークトレーニング方法および装置、機器、媒体、並びに製品
SG11202012531TA SG11202012531TA (en) 2018-06-15 2019-06-14 Method, apparatus, device, medium and product for segmenting image, and method, apparatus, device, medium and product for training segmentation network
US17/121,670 US20210097325A1 (en) 2018-06-15 2020-12-14 Method and apparatus for segmenting image, and method and apparatus for training segmentation network

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810623306.0A CN108830221A (zh) 2018-06-15 2018-06-15 图像的目标对象分割及训练方法和装置、设备、介质、产品
CN201810623306.0 2018-06-15

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US17/121,670 Continuation US20210097325A1 (en) 2018-06-15 2020-12-14 Method and apparatus for segmenting image, and method and apparatus for training segmentation network

Publications (1)

Publication Number Publication Date
WO2019238126A1 true WO2019238126A1 (zh) 2019-12-19

Family

ID=64142272

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/091328 WO2019238126A1 (zh) 2018-06-15 2019-06-14 图像分割及分割网络训练方法和装置、设备、介质、产品

Country Status (5)

Country Link
US (1) US20210097325A1 (zh)
JP (1) JP7045490B2 (zh)
CN (1) CN108830221A (zh)
SG (1) SG11202012531TA (zh)
WO (1) WO2019238126A1 (zh)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111667440A (zh) * 2020-05-14 2020-09-15 涡阳县幸福门业有限公司 一种金属门烤漆温度分布图像的融合方法
CN112529863A (zh) * 2020-12-04 2021-03-19 推想医疗科技股份有限公司 测量骨密度的方法及装置

Families Citing this family (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108830221A (zh) * 2018-06-15 2018-11-16 北京市商汤科技开发有限公司 图像的目标对象分割及训练方法和装置、设备、介质、产品
CN111260564A (zh) * 2018-11-30 2020-06-09 北京市商汤科技开发有限公司 一种图像处理方法和装置及计算机存储介质
CN109635813B (zh) * 2018-12-13 2020-12-25 银河水滴科技(北京)有限公司 一种钢轨区域图像分割方法及装置
CN111626313B (zh) * 2019-02-28 2023-06-02 银河水滴科技(北京)有限公司 一种特征提取模型训练方法、图像处理方法及装置
CN109934223B (zh) * 2019-03-01 2022-04-26 北京地平线机器人技术研发有限公司 实例分割结果评价参数确定方法及装置
CN110062164B (zh) * 2019-04-22 2021-10-26 深圳市商汤科技有限公司 视频图像处理方法及装置
CN110245710B (zh) * 2019-06-18 2022-11-29 腾讯科技(深圳)有限公司 语义分割模型的训练方法、语义分割方法及装置
CN110363780A (zh) * 2019-07-23 2019-10-22 腾讯科技(深圳)有限公司 图像分割方法、装置、计算机可读存储介质和计算机设备
CN111178445A (zh) * 2019-12-31 2020-05-19 上海商汤智能科技有限公司 图像处理方法及装置
CN112132832B (zh) * 2020-08-21 2021-09-28 苏州浪潮智能科技有限公司 一种增强图像实例分割的方法、***、设备及介质
CN112686274B (zh) * 2020-12-31 2023-04-18 上海智臻智能网络科技股份有限公司 目标对象的检测方法及设备
CN113705718B (zh) * 2021-09-06 2024-04-02 齐齐哈尔大学 基于多层次特征密集融合的遥感场景图像分类方法
CN114596620B (zh) * 2022-05-10 2022-08-05 深圳市海清视讯科技有限公司 人脸识别设备补光控制方法、装置、设备及存储介质
CN115222638B (zh) * 2022-08-15 2023-03-07 深圳市眼科医院 一种基于神经网络模型的视网膜血管图像分割方法及***

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101216890A (zh) * 2008-01-09 2008-07-09 北京中星微电子有限公司 一种彩色图像分割方法
CN101425184A (zh) * 2008-10-30 2009-05-06 西安电子科技大学 基于第二代Bandelet域隐马尔科夫树模型的图像分割方法
CN105488534A (zh) * 2015-12-04 2016-04-13 中国科学院深圳先进技术研究院 交通场景深度解析方法、装置及***
CN108830221A (zh) * 2018-06-15 2018-11-16 北京市商汤科技开发有限公司 图像的目标对象分割及训练方法和装置、设备、介质、产品

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101216890A (zh) * 2008-01-09 2008-07-09 北京中星微电子有限公司 一种彩色图像分割方法
CN101425184A (zh) * 2008-10-30 2009-05-06 西安电子科技大学 基于第二代Bandelet域隐马尔科夫树模型的图像分割方法
CN105488534A (zh) * 2015-12-04 2016-04-13 中国科学院深圳先进技术研究院 交通场景深度解析方法、装置及***
CN108830221A (zh) * 2018-06-15 2018-11-16 北京市商汤科技开发有限公司 图像的目标对象分割及训练方法和装置、设备、介质、产品

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111667440A (zh) * 2020-05-14 2020-09-15 涡阳县幸福门业有限公司 一种金属门烤漆温度分布图像的融合方法
CN111667440B (zh) * 2020-05-14 2024-02-13 涡阳县幸福门业有限公司 一种金属门烤漆温度分布图像的融合方法
CN112529863A (zh) * 2020-12-04 2021-03-19 推想医疗科技股份有限公司 测量骨密度的方法及装置
CN112529863B (zh) * 2020-12-04 2024-01-23 推想医疗科技股份有限公司 测量骨密度的方法及装置

Also Published As

Publication number Publication date
CN108830221A (zh) 2018-11-16
JP2021526276A (ja) 2021-09-30
SG11202012531TA (en) 2021-01-28
JP7045490B2 (ja) 2022-03-31
US20210097325A1 (en) 2021-04-01

Similar Documents

Publication Publication Date Title
WO2019238126A1 (zh) 图像分割及分割网络训练方法和装置、设备、介质、产品
CN111986099B (zh) 基于融合残差修正的卷积神经网络的耕地监测方法及***
US11551333B2 (en) Image reconstruction method and device
TWI762860B (zh) 目標檢測及目標檢測網路的訓練方法、裝置、設備及儲存媒體
CN109508681B (zh) 生成人体关键点检测模型的方法和装置
CN110428432B (zh) 结肠腺体图像自动分割的深度神经网络算法
CN110622177B (zh) 实例分割
US10891476B2 (en) Method, system, and neural network for identifying direction of a document
US20230079886A1 (en) Labeling techniques for a modified panoptic labeling neural network
Wahab et al. Multifaceted fused-CNN based scoring of breast cancer whole-slide histopathology images
WO2019071976A1 (zh) 基于区域增长和眼动模型的全景图像显著性检测方法
KR102624027B1 (ko) 영상 처리 장치 및 방법
WO2020101777A1 (en) Segmenting objects by refining shape priors
CN112016682B (zh) 视频表征学习、预训练方法及装置、电子设备、存储介质
CN111325750B (zh) 一种基于多尺度融合u型链神经网络的医学图像分割方法
WO2023109709A1 (zh) 一种基于注意力机制的图像拼接定位检测方法
CN110399826B (zh) 一种端到端人脸检测和识别方法
CN111507183A (zh) 一种基于多尺度密度图融合空洞卷积的人群计数方法
CN108388901B (zh) 基于空间-语义通道的协同显著目标检测方法
CN112668532A (zh) 基于多阶段混合注意网络的人群计数方法
Vayssade et al. Pixelwise instance segmentation of leaves in dense foliage
Hao et al. Weakly supervised instance segmentation using multi-prior fusion
CN113269752A (zh) 一种图像检测方法、装置终端设备及存储介质
CN116310899A (zh) 基于YOLOv5改进的目标检测方法及装置、训练方法
CN114359739B (zh) 目标识别方法及装置

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19819094

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2020569112

Country of ref document: JP

Kind code of ref document: A

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19819094

Country of ref document: EP

Kind code of ref document: A1