WO2020186563A1 - Object segmentation method and apparatus, computer readable storage medium, and computer device - Google Patents

Object segmentation method and apparatus, computer readable storage medium, and computer device

Info

Publication number
WO2020186563A1
Authority
WO
WIPO (PCT)
Prior art keywords
feature map
network
stage
feature
feature maps
Application number
PCT/CN2019/081484
Other languages
French (fr)
Chinese (zh)
Inventor
林迪 (Lin Di)
黄惠 (Huang Hui)
Original Assignee
深圳大学 (Shenzhen University)
Application filed by Shenzhen University (深圳大学)
Publication of WO2020186563A1 publication Critical patent/WO2020186563A1/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/10 Segmentation; Edge detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/26 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V10/267 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20084 Artificial neural networks [ANN]

Definitions

  • This application relates to the field of computer technology, in particular to an object segmentation method, device, computer-readable storage medium, and computer equipment.
  • Object segmentation has always been one of the long-standing technical problems in computer vision.
  • Object segmentation includes semantic segmentation and instance segmentation.
  • Semantic segmentation refers to labeling each pixel in the image with an object category; different instances of the same object need not be segmented separately.
  • Instance segmentation combines object detection and semantic segmentation. Compared with object detection, which only produces the bounding box of an object, instance segmentation is accurate to the object's edges; compared with semantic segmentation, instance segmentation distinguishes different individuals of the same object category in the image.
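The distinction above can be made concrete with a toy label map (an illustrative sketch; the ids are arbitrary): semantic segmentation assigns every pixel a class id, while instance segmentation additionally distinguishes individuals of the same class.

```python
import numpy as np

# One row of a toy image containing two separate objects of the same class.
# Semantic segmentation: both objects receive the same class id (1).
semantic = np.array([0, 1, 1, 0, 1, 1])
# Instance segmentation: each individual receives its own instance id.
instance = np.array([0, 1, 1, 0, 2, 2])

# The two objects are indistinguishable in the semantic map but not in the instance map.
same_class = semantic[1] == semantic[4]      # True
same_instance = instance[1] == instance[4]   # False
```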
  • an object segmentation method is provided.
  • An object segmentation method including:
  • An object segmentation device includes:
  • the initial feature map acquisition module is used to acquire the initial feature map of the image to be processed
  • the feature map generation module is configured to input the initial feature map into a continuous top-down network and a bottom-up network to obtain a first-stage feature map, and to update the first-stage feature map according to the regional context encoding RCE in the top-down network and the bottom-up network respectively, obtaining an updated first-stage feature map;
  • An iterative processing module configured to use the updated first-stage feature map as input to perform iterative fusion and update calculations until the preset number of iterations is reached to obtain the target feature map;
  • the object segmentation module is configured to perform object segmentation on the image to be processed according to the target feature map.
  • a computer-readable storage medium that stores a computer program.
  • the processor executes the operations of the foregoing method.
  • a computer device includes a memory and a processor, and the memory stores a computer program.
  • the processor executes the operations of the foregoing method.
  • In the above object segmentation method, device, computer-readable storage medium, and computer equipment, the initial feature map of the image to be processed is obtained and input into the continuous top-down network and bottom-up network to compute the first-stage feature map;
  • the first-stage feature map is updated according to the regional context encoding RCE in the top-down network and the bottom-up network respectively to obtain the updated first-stage feature map;
  • the updated first-stage feature map is used as input for iterative fusion and update calculations until the preset number of iterations is reached, yielding the target feature map, and the image to be processed is segmented according to the target feature map;
  • the target feature map is the feature map obtained from the bottom-up network.
  • FIG. 1A is a segmentation result diagram of an object segmentation method in an embodiment;
  • FIG. 1B is a schematic flowchart of an object segmentation method in an embodiment;
  • FIG. 2 is a schematic diagram of a network structure and feature maps in an embodiment;
  • FIG. 3 is a schematic flowchart of the method for generating the first set of feature maps and the second set of feature maps in FIG. 1B;
  • FIG. 4 is a schematic flowchart of the method for generating the updated first set of feature maps and second set of feature maps in FIG. 1B;
  • FIG. 5 is a schematic flowchart of a method for generating a context feature map based on the sub-regions of each layer's feature map according to the regional context encoding RCE in an embodiment;
  • FIG. 6 is a schematic diagram of a method for generating a context feature map based on the sub-regions of each layer's feature map according to the regional context encoding RCE in an embodiment;
  • FIG. 7 is a schematic flowchart of the iterative fusion and update process in FIG. 1B;
  • FIG. 8 is a schematic diagram of the iterative fusion and update process in FIG. 1B;
  • FIG. 9 is a structural block diagram of an object segmentation device in an embodiment;
  • FIG. 10 is a structural block diagram of the feature map generation module in FIG. 9;
  • FIG. 11 is a structural block diagram of a computer device in an embodiment.
  • a segmentation result map obtained by an object segmentation method is provided.
  • the object segmentation method is used to segment different objects in each image.
  • As shown in FIG. 1B, in one embodiment, an object segmentation method is provided. The object segmentation method specifically includes the following operations:
  • S102 Acquire an initial feature map of the image to be processed.
  • the image to be processed is an image for object segmentation.
  • the image to be processed can be an image collected in real time, or an image from any other device.
  • the image to be processed may be video data or picture data.
  • the feature map refers to the feature map obtained by calculating the image to be processed through the convolutional neural network.
  • the initial feature map is specifically obtained by inputting the image to be processed into the backbone network for convolution calculation.
  • the backbone network may be a bottom-up network.
  • the bottom-up network is a convolutional neural network.
  • the backbone network can also be other types of convolutional neural networks.
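As a hedged sketch of this step (not the patent's actual backbone), a bottom-up convolutional backbone can be modeled as alternating convolution and downsampling, collecting one feature map per level as the initial feature pyramid. The layer count, kernel, and random weights below are illustrative assumptions.

```python
import numpy as np

def conv3x3(x, w):
    """Valid 3x3 convolution with stride 1 on a single-channel map (illustrative)."""
    h, wd = x.shape
    out = np.zeros((h - 2, wd - 2))
    for i in range(h - 2):
        for j in range(wd - 2):
            out[i, j] = np.sum(x[i:i+3, j:j+3] * w)
    return out

def backbone(image, num_levels=3):
    """Hypothetical bottom-up backbone: alternate convolution + ReLU and 2x
    downsampling, collecting one feature map per level as the initial pyramid."""
    rng = np.random.default_rng(0)
    x, maps = image, []
    for _ in range(num_levels):
        x = np.maximum(conv3x3(x, rng.standard_normal((3, 3)) * 0.1), 0.0)  # conv + ReLU
        maps.append(x)
        x = x[::2, ::2]  # 2x downsampling moves to the next, deeper level
    return maps
```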
  • S104 Input the initial feature map into the continuous top-down network and bottom-up network to compute the first-stage feature map, and update the first-stage feature map according to the regional context encoding RCE in the top-down network and the bottom-up network respectively to obtain the updated first-stage feature map.
  • Top-down networks and bottom-up networks refer to different types of convolutional neural networks. The top-down network propagates high-level, large-scale semantic information down to the shallower network layers, while the bottom-up network encodes smaller-scale visual details into the deeper network layers.
  • A continuous top-down network and bottom-up network refers to connecting a bottom-up network after the top-down network to form a new network.
  • The continuous top-down and bottom-up networks construct a shortcut for information propagation between the top-most and bottom-most feature maps. Therefore, continuous top-down and bottom-up networks are used to learn more powerful feature maps at different levels.
  • Region Context Encoding refers to the Regional Context Encoding (RCE) mechanism, which connects all sub-regions of the input feature map so that each sub-region can flexibly spread its information.
  • the Regional Context Encoding (RCE) mechanism is implemented using multiple parallel branches, and the feature map generated by the top-down/bottom-up network is input to each RCE branch.
  • In each RCE branch, the feature map is first divided into regular sub-regions of a given scale; the sub-regions of the same scale within a layer's feature map are then weighted and aggregated into a global representation; finally, the global representation is re-allocated to the sub-regions of that scale, and the results of all branches are added to the input feature map to generate the context feature map of that layer's feature map. This allows each sub-region to transfer information to all sub-regions of the new feature map.
  • Each branch performs different subdivisions on the feature map to generate sub-regions of different scales.
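A minimal sketch of the RCE branch steps just described, assuming average pooling for the sub-region features and uniform stand-in weights for the learned aggregation (both assumptions, not fixed by the text):

```python
import numpy as np

def rce_branch(f, k):
    """One hypothetical RCE branch: split the map into a k x k grid of sub-regions,
    pool each, aggregate all sub-regions into a global representation, then
    redistribute that global representation back to every sub-region."""
    h, w = f.shape
    assert h % k == 0 and w % k == 0, "illustrative: map divides evenly"
    sh, sw = h // k, w // k
    # (1) per-sub-region features by average pooling over each grid cell
    sub = f.reshape(k, sh, k, sw).mean(axis=(1, 3))   # k x k grid of features
    # (2) weighted sum of all same-scale sub-regions -> global representation
    weights = np.full((k, k), 1.0 / (k * k))          # stand-in for learned weights
    g = np.sum(weights * sub)
    # (3) redistribute the global representation to every position
    return np.full((h, w), g)

def rce(f, scales=(3, 5, 7)):
    """Context feature map: the input plus the outputs of all parallel branches."""
    return f + sum(rce_branch(f, k) for k in scales)
```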
  • The initial feature map is first input to the top-down network for processing to obtain the feature map corresponding to the top-down network; the processing result is then input to the bottom-up network connected after the top-down network to obtain the feature map corresponding to the bottom-up network. These two feature maps constitute the first-stage feature map.
  • The context feature maps based on the sub-regions of each layer's feature map are generated according to the regional context encoding RCE, and are propagated through the top-down network to the feature maps of the other layers, thereby updating the feature maps of the other layers in the top-down network.
  • Similarly, the context feature maps based on the sub-regions of each layer's feature map are generated according to the regional context encoding RCE and propagated through the bottom-up network to the feature maps of the other layers, thereby updating the feature maps of the other layers in the bottom-up network.
  • the above two updated feature maps constitute the updated first-stage feature map.
  • S106 Use the updated first-stage feature map as an input to perform iterative fusion and update processing until the preset number of iterations is reached to obtain the target feature map.
  • the first-stage feature map updated according to the regional context code RCE is used as input to perform iterative fusion processing to obtain the second-stage feature map.
  • the second-stage feature map is updated according to the regional context code RCE to obtain the updated second-stage feature map.
  • the updated second-stage feature map is used as input for iterative fusion processing to obtain the third-stage feature map.
  • The third-stage feature map is updated according to the regional context encoding RCE to obtain the updated third-stage feature map. The iteration continues until the preset number of iterations is reached; the finally obtained feature map is the target feature map.
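The iteration described in S106 can be sketched as a loop, with the network and update functions passed in as stand-ins; the element-wise-sum fusion below is an illustrative assumption, not the patent's operator.

```python
import numpy as np

def iterate_stages(first_stage, top_down, bottom_up, rce_update, num_iters=3):
    """Sketch of S106: starting from the updated first-stage feature maps,
    repeatedly fuse the two groups of maps and re-run the RCE update until
    the preset number of iterations is reached."""
    td_maps, bu_maps = first_stage
    for _ in range(num_iters):
        # fuse the top-down and bottom-up maps of the current stage (stand-in: sum)
        fused = [t + b for t, b in zip(td_maps, bu_maps)]
        td_maps = rce_update(top_down(fused))     # next-stage top-down maps, RCE-updated
        bu_maps = rce_update(bottom_up(td_maps))  # next-stage bottom-up maps, RCE-updated
    return bu_maps  # the target feature map comes from the bottom-up network
```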
  • S108 Perform object segmentation on the image to be processed according to the target feature map, where the target feature map is a feature map obtained by a bottom-up network.
  • The image to be processed can be segmented according to the target feature map. Specifically, performing object segmentation on the image to be processed means that different objects in the image, and different individuals of the same object, can be segmented.
  • the target feature map is the feature map obtained from the bottom-up network.
  • The initial feature map of the image to be processed is obtained and input into the continuous top-down network and bottom-up network to compute the first-stage feature map.
  • The first-stage feature map is updated according to the regional context encoding RCE in the top-down network and the bottom-up network respectively to obtain the updated first-stage feature map.
  • The updated first-stage feature map is used as input for iterative fusion and update calculations until the preset number of iterations is reached, yielding the target feature map, and the image to be processed is segmented according to the target feature map.
  • the regional context coding RCE can use different sub-regions of the feature map to calculate the context features to update the feature map at each stage, and can capture the context information of the multi-scale sub-regions in the feature map. Iterative fusion can aggregate all levels of contextual information, avoid the loss of information, and improve the accuracy of object segmentation.
  • the continuous top-down network and the bottom-up network are connected by dense paths.
  • FIG. 2 is a schematic diagram of a network structure and characteristic diagram in an embodiment.
  • the arrows in Figure 2 are all network structures, and the rectangular blocks are all feature maps. When the arrow points from the top to the bottom, the arrow shows a top-down network, and when the arrow points from the bottom to the top, the arrow shows a bottom-up network.
  • the arrow on the left shows the backbone network
  • the arrow on the right shows the top-down network
  • the arrow on the left shows the backbone network
  • the arrow on the right shows the top-down network connected by dense paths.
  • the arrow on the left shows the backbone network
  • the feature map obtained from the backbone network is input into the continuous top-down network and bottom-up network shown by the arrow on the right.
  • the feature map obtained by the convolution calculation of the top-down network is input into the bottom-up network for convolution calculation to obtain the feature map.
  • the arrow on the left shows the backbone network
  • the feature map obtained from the backbone network is input into the continuous top-down network and bottom-up network shown by the arrow on the right.
  • The continuous top-down and bottom-up networks are both connected by dense paths.
  • the backbone network is a bottom-up network
  • the network structure shown by the arrow in FIG. 2(d) is mainly used to implement object segmentation.
  • Both the continuous top-down network and the bottom-up network can only directly propagate context information between adjacent feature maps; propagating context information beyond adjacent feature maps requires indirect propagation through multiple stages, which inevitably leads to attenuation of important information.
  • the continuous top-down network and the bottom-up network are connected by dense paths.
  • The dense paths directly connect all feature maps at each stage of context-information propagation, thereby directly and effectively enhancing the feature maps at all levels and avoiding the continuous attenuation of important information during indirect propagation, which would otherwise reduce the accuracy of the final object segmentation result.
  • Inputting the initial feature map into the continuous top-down network and bottom-up network to compute the first-stage feature map includes the following, where the first-stage feature map includes two sets of feature maps:
  • S1042 Input the initial feature map into the top-down network to calculate the first group of feature maps in the first stage feature map;
  • S1044 Input the first set of feature maps in the first stage feature map into the bottom-up network to calculate the second set of feature maps in the first stage feature map.
  • The backbone network (shown by the arrow in the left part of Figure 2(d)) performs the convolution calculation to obtain the initial feature map, which is input to the continuous top-down network and bottom-up network (indicated by the remaining arrows); the continuous top-down network and bottom-up network are connected by dense paths.
  • The initial feature map is first input to the dense top-down network for convolution calculation to obtain the first set of feature maps in the first-stage feature map; the first set is then input to the dense bottom-up network for convolution calculation to obtain the second set of feature maps in the first-stage feature map.
  • The continuous top-down and bottom-up networks construct a shortcut for information propagation between the top-most and bottom-most feature maps, and are therefore used to learn more powerful feature maps at different levels. Moreover, the continuous top-down and bottom-up networks are connected by dense paths, which directly connect all feature maps at each stage of context-information propagation, thereby directly and effectively enhancing the feature maps at all levels and avoiding the continuous attenuation of important information during indirect propagation. Therefore, in the embodiment of the present application, the accuracy of the object segmentation result is improved along both dimensions: continuity and density.
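A sketch of what dense paths buy over neighbour-only propagation, assuming nearest-neighbour resizing and additive aggregation (both illustrative choices): each level aggregates the maps of all deeper levels directly, so no information has to survive stage-by-stage relaying.

```python
import numpy as np

def upsample_to(f, shape):
    """Nearest-neighbour resize used to align maps from different levels."""
    h, w = shape
    ih = np.arange(h) * f.shape[0] // h
    iw = np.arange(w) * f.shape[1] // w
    return f[np.ix_(ih, iw)]

def dense_top_down(maps):
    """Densely connected top-down pass: each level receives the (resized) maps of
    ALL deeper levels, not just its immediate neighbour, so context reaches every
    level directly instead of attenuating through multi-stage propagation."""
    out = []
    for i, f in enumerate(maps):        # maps[0] = shallowest / largest level
        agg = f.copy()
        for g in maps[i + 1:]:          # every deeper (coarser) level
            agg += upsample_to(g, f.shape)
        out.append(agg)
    return out
```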
  • Updating the first-stage feature map according to the regional context encoding RCE in the top-down network and the bottom-up network respectively, to obtain the updated first-stage feature map, includes:
  • S1046 Generate a context feature map based on the sub-regions of each layer's feature map in the top-down network according to the regional context encoding RCE, and propagate the context feature maps to the feature maps of the other layers in the top-down network; the feature maps of the other layers in the top-down network are thereby updated to obtain the first set of feature maps in the updated first-stage feature maps.
  • The context feature map based on the sub-regions of each layer's feature map is generated according to the regional context encoding RCE, and is then propagated to the feature maps of the other layers in the top-down network through all the connections shown by the arrows in the middle part of Figure 2(d). In this way, the feature maps of the other layers in the top-down network are updated to obtain the first set of feature maps in the updated first-stage feature maps.
  • S1048 Generate a context feature map based on the sub-regions of each layer's feature map in the bottom-up network according to the regional context encoding RCE, and propagate the context feature maps to the feature maps of the other layers in the bottom-up network; the feature maps of the other layers are thereby updated to obtain the second set of feature maps in the updated first-stage feature maps.
  • The context feature map based on the sub-regions of each layer's feature map is generated according to the regional context encoding RCE, and is then propagated to the feature maps of the other layers in the bottom-up network through all the connections in the right part of Figure 2(d). In this way, the feature maps of the other layers in the bottom-up network are updated to obtain the second set of feature maps in the updated first-stage feature maps.
  • According to the regional context encoding RCE mechanism, the first-stage feature map is updated through densely connected paths.
  • This mechanism connects all sub-regions of the input feature map, so that each sub-region can flexibly spread its information, thereby avoiding the loss of information contained in the sub-regions of each layer of the feature map.
  • the context feature maps based on the sub-regions of each layer of feature maps obtained according to the regional context coding RCE mechanism are propagated to other layer feature maps in the same network.
  • generating a context feature map based on the sub-regions of each layer of feature maps according to the region context coding RCE includes:
  • S506 Reallocate the global representation to the sub-regions of the scale, and aggregate the sub-regions of the scale to generate a context feature map based on the sub-regions of each layer of feature maps.
  • the regional context coding RCE mechanism includes multiple parallel branches for each layer of feature maps, and the input feature maps are divided into different scales in different branches.
  • The input feature map (a) is divided at three different scales, into 3×3, 5×5, and 7×7 sub-regions, and the scale of the sub-regions within each branch is the same.
  • The regional context encoding RCE mechanism first uses a separate convolutional layer to process the input feature map (a), then uses the result to compute the features of the sub-regions (b); in each branch, the sub-regions of the same scale are weighted and summed to compute the global representation of the sub-regions at that scale.
  • the global representation is re-allocated to sub-regions of this scale, and the sub-regions of this scale are aggregated to generate a contextual feature map based on the sub-regions of each layer of feature maps.
  • Since the global representation (c) connects all the sub-regions in the corresponding branch, it can propagate information to all the sub-regions (d) to obtain the processing result of each branch.
  • the processing results of all branches are added to the input feature map (a) to generate a context feature map (e) based on the input feature map (a), and the context feature map (e) can be represented by R.
  • S(x, y) denotes the sub-region at position (x, y), which contains a group of neurons, and f(S(x, y)) denotes the feature of the sub-region S(x, y). By weighting the importance of all sub-regions, the features of all sub-regions are summed to obtain a global representation that connects all sub-regions. For this, a learnable K×K convolution with ReLU activation and no padding can simply be applied, producing a c-dimensional feature vector (as shown in Figure 6(c)).
  • Equation 2 uses three different partitions of F_i (3×3, 5×5, and 7×7 sub-regions) to compute the context feature map. Dividing the feature map into more sub-regions would significantly increase the number of parameters that need to be learned.
  • the regional context coding RCE mechanism uses multiple branches.
  • The feature map is divided into sub-regions of different scales, and a global representation is then computed from the sub-regions of the same scale; each branch therefore corresponds to one global representation.
  • the contextual feature map of the feature map contains more comprehensive information.
  • The context feature maps of the sub-regions of each layer's feature map obtained through the RCE mechanism are propagated to the feature maps of the other layers through dense connections, and the feature maps of the other layers likewise use the RCE mechanism to obtain the context feature maps of their sub-regions. Therefore, owing to the RCE mechanism, sub-regions of different scales in each layer's feature map can affect any position of the feature maps of the other layers. Combined with the dense paths used for context feature map propagation, all feature maps communicate directly, which directly and effectively enhances the feature maps at all levels and avoids the continuous attenuation of important information during indirect propagation.
  • the RCE mechanism further enables the sub-regions of different scales in each layer of feature maps to affect any position of the feature maps of other layers. Therefore, the accuracy of the object segmentation result is further improved.
  • the regional context coding RCE includes multiple parallel branches for each layer of feature maps, and each parallel branch processes sub-regions of the same scale respectively.
  • The regional context encoding RCE mechanism for each layer's feature map includes multiple parallel branches (three branches are shown in Figure 6), and the input feature map is divided at a different scale in each branch. For example, in Figure 6, the input feature map (a) is divided at three different scales, into 3×3, 5×5, and 7×7 sub-regions, and the scale of the sub-regions within each branch is the same.
  • Each parallel branch separately processes the sub-regions of the same scale.
  • The three different scale divisions used in the above embodiments are only examples; one, two, four, or more divisions with different scales may also be used.
  • The input feature map can also be divided irregularly, that is, the sub-regions within a branch may have different scales, with each branch following its own specific division rule.
  • The regional context encoding RCE mechanism first uses a separate convolutional layer to process the input feature map (a), then uses the result to compute the features of the sub-regions (b); in each branch, the sub-regions of the same scale are weighted and summed to compute the global representation of the sub-regions at that scale. After the global representation is obtained, it is re-allocated to the sub-regions of that scale, and the sub-regions of that scale are aggregated to generate a context feature map based on the sub-regions of each layer's feature map.
  • Because the RCE mechanism divides the input feature map at different scales in different branches, the global representations computed by different branches differ; each global representation is re-allocated to the sub-regions of its own scale, and the sub-regions of that scale are aggregated to generate a context feature map based on the sub-regions of each layer's feature map.
  • The final context feature map generated from the input feature map therefore includes the information contained in all the different global representations, so it contains more comprehensive information and avoids the information loss caused by a single branch.
  • the process of iterative fusion and update processing includes:
  • S702 Fusion of the first set of feature maps and the second set of feature maps in the first stage feature map to obtain the first set of feature maps in the second stage feature map.
  • FIG. 8 is a schematic diagram of the iterative fusion and update process; let t denote the stage (0 < t ≤ T).
  • Figure 8 includes two stages. Data related to the first stage is denoted by letters with subscript t, and data related to the second stage by letters with subscript t+1. T_i^t and B_i^t are the i-th layer feature maps of the top-down and bottom-up networks at stage t. Assuming t = 1, T_i^1 is the i-th layer map in the first set of feature maps (top-down network) of the first stage, and B_i^1 is the i-th layer map in the second set of feature maps (bottom-up network) of the first stage.
  • the first set of feature maps and the second set of feature maps in the first stage feature map are merged to obtain the first set of feature maps in the second stage feature map.
  • the first set of feature maps and the second set of feature maps of a certain layer in the first stage feature map are merged to obtain the first set of feature maps of the layer in the second stage feature map.
  • The i-th layer top-down feature map and bottom-up feature map of stage t are merged into the i-th layer top-down feature map of stage t+1. In the same way, for the feature maps of the other layers in the first stage, the first set and second set of feature maps of each layer are merged to obtain the first set of feature maps of that layer in the second-stage feature map.
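The per-layer merge in S702 can be sketched as follows; the weighted element-wise sum is a minimal stand-in, since the text does not fix the fusion operator.

```python
import numpy as np

def fuse(t_map, b_map, w_t=0.5, w_b=0.5):
    """Hypothetical per-layer fusion: combine the i-th top-down and bottom-up maps
    of stage t into the i-th top-down input of stage t+1. A weighted element-wise
    sum is one minimal choice of operator."""
    assert t_map.shape == b_map.shape
    return w_t * t_map + w_b * b_map

def fuse_stage(td_maps, bu_maps):
    """Fuse every layer of the stage, producing the next stage's first set of maps."""
    return [fuse(t, b) for t, b in zip(td_maps, bu_maps)]
```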
  • The context information is propagated in a zigzag manner between the two networks. Therefore, the network in the embodiment of the present application is also called a zigzag network (ZigZagNet).
  • the zigzag network includes densely connected and continuous top-down networks and bottom-up networks, and the top-down network and the bottom-up network are iteratively fused and updated in a zigzag manner to spread context information.
  • S706 Update the first group of feature maps in the second-stage feature maps in the top-down network according to the regional context coding RCE, to obtain the first group of feature maps in the updated second-stage feature maps.
  • S708 Update the second set of feature maps in the second-stage feature maps in the bottom-up network according to the regional context coding RCE, to obtain the updated second set of feature maps in the second-stage feature maps.
  • the second-stage feature map is updated in the respective networks according to the regional context coding RCE to obtain the updated second-stage feature map.
  • the second-stage feature maps include a first set of feature maps and a second set of feature maps, wherein the first set of feature maps include multi-layer feature maps, and the second set of feature maps include the same number of multi-layer feature maps.
  • The specific update process is as follows: generate context feature maps based on the sub-regions of each layer's feature map in the top-down network according to the regional context encoding RCE, and propagate them through the top-down network to the feature maps of the other layers, thereby updating the feature maps of the other layers in the top-down network.
  • The context feature maps based on the sub-regions of each layer's feature map are likewise generated according to the regional context encoding RCE and propagated through the bottom-up network to the feature maps of the other layers, thereby updating the feature maps of the other layers in the bottom-up network.
  • The two updated sets of feature maps above constitute the updated second-stage feature map.
  • S710 Perform iterative fusion and update calculation using the updated second-stage feature map as the input of the next iterative fusion and update calculation, until the preset number of iterations is reached to obtain the target feature map.
  • The updated second-stage feature map is used as input for iterative fusion to obtain the third-stage feature map; the third-stage feature map is then updated in the respective networks according to the regional context encoding RCE to obtain the updated third-stage feature map.
  • the number of iterations may be three. Of course, in other embodiments, the number of iterations may be any other value.
  • the context information contained in the feature map is propagated in a zigzag manner between the top-down network and the bottom-up network through iterative fusion. That is, the iterative exchange of context information between top-down and bottom-up networks is realized, the direct information exchange between the two networks is strengthened, the information loss caused by a single network is avoided, and the accuracy of object segmentation is improved.
  • the maximum value of T is 3.
  • the maximum value of T may be set to other values.
  • the top-down network receives the top-down and bottom-up context information of the previous iteration to refine the new feature maps.
  • the regional context encoding RCE is used to generate context feature maps of sub-regions; RCE encodes the relationships between the partitions into the context feature map. By using different sub-region scales, it provides richer contextual information.
  • obtaining the initial feature map of the image to be processed includes:
  • the bottom-up network is shown in the left image of FIG. 2(d), and the initial feature map is obtained by inputting the image to be processed into the bottom-up network for convolution calculation.
  • the target feature map is a feature map obtained from a bottom-up network.
  • the feature map of the final stage is obtained.
  • the feature maps of the final stage include the first set of feature maps and the second set of feature maps.
  • the first set of feature maps are feature maps calculated by the top-down network;
  • the second set of feature maps are feature maps calculated by the bottom-up network.
  • the object segmentation is based on the second set of feature maps in the final stage, so the target feature map is the feature map obtained by the bottom-up network in the final stage.
  • the feature maps obtained by the bottom-up network in the final stage are the most recent and contain the richest context information; therefore, using the feature maps obtained by the bottom-up network in the final stage for object segmentation improves the accuracy of the segmentation result.
  • an object segmentation device 900 includes: an initial feature map acquisition module 920, a feature map generation module 940, an iterative processing module 960, and an object segmentation module 980, wherein:
  • the initial feature map acquisition module 920 is used to acquire the initial feature map of the image to be processed;
  • the feature map generation module 940 is used to input the initial feature map into the continuous top-down network and bottom-up network to calculate the first-stage feature maps, and to update the first-stage feature maps in the top-down network and the bottom-up network respectively according to the regional context encoding RCE, to obtain the updated first-stage feature maps;
  • the iterative processing module 960 is configured to use the updated first-stage feature maps as input to perform iterative fusion and update calculations until the preset number of iterations is reached to obtain a target feature map, which is a feature map obtained from the bottom-up network;
  • the object segmentation module 980 is configured to segment the image to be processed according to the target feature map.
  • the feature map generating module 940 includes: a first set of feature map generating module 942 and a second set of feature map generating module 944, wherein:
  • the first set of feature map generation module 942 is configured to input the initial feature map into the top-down network to calculate the first set of feature maps in the first stage feature map;
  • the second set of feature map generating module 944 is configured to input the first set of feature maps in the first stage feature map into the bottom-up network to calculate the second set of feature maps in the first stage feature map.
  • the feature map generation module 940 further includes a feature map update module 946, wherein:
  • the feature map update module 946 is used to generate context feature maps based on the sub-regions of each layer of feature maps in the top-down network according to the regional context encoding RCE, propagate the context feature maps to the feature maps of the other layers in the top-down network, and update the feature maps of those other layers to obtain the first set of feature maps in the updated first-stage feature maps;
  • the feature map update module 946 is also used to generate context feature maps based on the sub-regions of each layer of feature maps in the bottom-up network according to the regional context encoding RCE, propagate the context feature maps to the feature maps of the other layers in the bottom-up network, and update the feature maps of those other layers to obtain the second set of feature maps in the updated first-stage feature maps.
  • the feature map update module 946 includes a regional context encoding RCE module 946a, which is used to: divide each layer of feature maps into sub-regions of different scales; perform a weighted-sum calculation over the sub-regions of each scale to obtain a global representation for that scale; redistribute the global representation to the sub-regions of that scale; and aggregate the sub-regions of all scales to generate a context feature map based on the sub-regions of each layer of feature maps.
  • the iterative processing module 960 is further configured to fuse the first set of feature maps and the second set of feature maps in the first-stage feature maps to obtain the first set of feature maps in the second-stage feature maps;
  • the updated second-stage feature maps are used as the input of the next iterative fusion and update calculation, until the preset number of iterations is reached, to obtain the target feature map.
  • the initial feature map acquisition module 920 is further configured to input the image to be processed into the bottom-up network calculation to obtain the initial feature map.
  • Fig. 11 shows an internal structure diagram of a computer device in an embodiment.
  • the computer device may specifically be a terminal or a server.
  • the computer device includes a processor, a memory, a network interface, an input device, a display screen, a camera, a sound collection device, and a speaker connected through a system bus.
  • the memory includes a non-volatile storage medium and an internal memory.
  • the non-volatile storage medium of the computer device stores an operating system, and may also store a computer program.
  • when the computer program is executed by the processor, the processor can implement the object segmentation method.
  • a computer program may also be stored in the internal memory, and when the computer program is executed by the processor, the processor can execute the object segmentation method.
  • the display screen of the computer equipment can be a liquid crystal display screen or an electronic ink display screen.
  • the input device of the computer device can be a touch layer covering the display screen; a button, trackball, or touchpad set on the housing of the computer device; or an external keyboard, touchpad, or mouse.
  • FIG. 11 is only a block diagram of a part of the structure related to the solution of the present application, and does not constitute a limitation on the computer device to which the solution of the present application is applied.
  • the specific computer device may include more or fewer components than shown in the figure, combine some components, or have a different arrangement of components.
  • the object segmentation apparatus provided in the present application may be implemented in the form of a computer program, and the computer program may run on the computer device as shown in FIG. 11.
  • the memory of the computer device can store the various program modules that make up the object segmentation apparatus, such as the initial feature map acquisition module 920, the feature map generation module 940, the iterative processing module 960, and the object segmentation module 980 shown in FIG. 9.
  • the computer program composed of each program module causes the processor to execute the operations in the object segmentation method of each embodiment of the present application described in this specification.
  • the computer device shown in FIG. 11 may perform operation S102 through the initial feature map acquisition module 920 in the object segmentation apparatus 900 shown in FIG. 9.
  • the computer device may perform operation S104 through the feature map generation module 940.
  • the computer device may perform operation S106 through the iterative processing module 960.
  • the computer device may perform operation S108 through the object segmentation module 980.
  • Each module in the above-mentioned object segmentation device can be implemented in whole or in part by software, hardware and a combination thereof.
  • the network interface can be an Ethernet card or a wireless network card, etc.
  • the above modules can be embedded in hardware form in, or be independent of, the processor in the server, or can be stored in software form in the memory of the server, so that the processor can call and execute the operations corresponding to the above modules.
  • a computer device including a memory and a processor, the memory storing a computer program which, when executed by the processor, causes the processor to perform the operations of the object segmentation method.
  • the operation of the object segmentation method here may be the operation in the object segmentation method of each of the foregoing embodiments.
  • a computer-readable storage medium storing a computer program which, when executed by a processor, causes the processor to execute the operations of the object segmentation method described above.
  • the operation of the object segmentation method here may be the operation in the object segmentation method of each of the foregoing embodiments.
  • the storage medium may be a magnetic disk, an optical disc, a read-only memory (Read-Only Memory, ROM), etc.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

An object segmentation method and apparatus, a computer readable storage medium, and a computer device, comprising: acquiring an initial feature map of an image to be processed (S102); inputting the initial feature map into a continuous top-down network and bottom-up network to calculate a first stage feature map, and respectively updating the first stage feature map in the top-down network and the bottom-up network on the basis of regional context encoding RCE to obtain an updated first stage feature map (S104); using the updated first stage feature map as an input to implement iterative fusion and update calculation until a preset number of iterations is reached to obtain a target feature map (S106); and, on the basis of the target feature map, implementing object segmentation on the image to be processed, the target feature map being a feature map obtained from the bottom-up network (S108).

Description

Object segmentation method, apparatus, computer-readable storage medium, and computer device

Cross-reference to related applications

This application claims priority to a Chinese patent application filed with the Chinese Patent Office on March 21, 2019, with application number 201910217342.1 and invention title "Object segmentation method, device, computer readable storage medium and computer equipment", the entire content of which is incorporated herein by reference.

Technical field

This application relates to the field of computer technology, and in particular to an object segmentation method, apparatus, computer-readable storage medium, and computer device.
Background

With the continuous development of artificial intelligence and driverless technology, ever higher requirements have been placed on computer vision. Object segmentation has long been one of the persistent technical challenges in computer vision. Object segmentation includes semantic segmentation and instance segmentation. Semantic segmentation refers to labeling each pixel in an image with an object category, without separating different instances of the same object. Instance segmentation is a combination of object detection and semantic segmentation: compared with object detection, which only yields an object's bounding box, instance segmentation is accurate to the object's edges; compared with semantic segmentation, instance segmentation can distinguish different individuals of the same object in the image.

However, when images are currently segmented with traditional methods, the accuracy of the segmentation results is not high enough to meet increasingly refined requirements.
Summary of the invention

According to various embodiments of the present application, an object segmentation method, apparatus, computer-readable storage medium, and computer device are provided.

An object segmentation method, including:

obtaining an initial feature map of an image to be processed;

inputting the initial feature map into a continuous top-down network and bottom-up network to compute first-stage feature maps, and updating the first-stage feature maps in the top-down network and the bottom-up network respectively according to the regional context encoding RCE, to obtain updated first-stage feature maps;

performing iterative fusion and update processing with the updated first-stage feature maps as input, until a preset number of iterations is reached to obtain a target feature map;

performing object segmentation on the image to be processed according to the target feature map.

An object segmentation apparatus, the apparatus including:

an initial feature map acquisition module, configured to acquire an initial feature map of an image to be processed;

a feature map generation module, configured to input the initial feature map into a continuous top-down network and bottom-up network to compute first-stage feature maps, and to update the first-stage feature maps in the top-down network and the bottom-up network respectively according to the regional context encoding RCE, to obtain updated first-stage feature maps;

an iterative processing module, configured to perform iterative fusion and update calculations with the updated first-stage feature maps as input, until a preset number of iterations is reached to obtain a target feature map;

an object segmentation module, configured to perform object segmentation on the image to be processed according to the target feature map.

A computer-readable storage medium storing a computer program which, when executed by a processor, causes the processor to perform the operations of the above method.

A computer device, including a memory and a processor, the memory storing a computer program which, when executed by the processor, causes the processor to perform the operations of the above method.

With the above object segmentation method, apparatus, computer-readable storage medium, and computer device, an initial feature map of an image to be processed is obtained, and the initial feature map is input into a continuous top-down network and bottom-up network to compute first-stage feature maps; the first-stage feature maps are updated in the top-down network and the bottom-up network respectively according to the regional context encoding RCE, to obtain updated first-stage feature maps. The updated first-stage feature maps are used as input for iterative fusion and update calculations until a preset number of iterations is reached, yielding a target feature map, and object segmentation is performed on the image to be processed according to the target feature map, the target feature map being a feature map obtained by the bottom-up network.

The details of one or more embodiments of the present application are set forth in the following drawings and description; other features, objects, and advantages of the present application will become apparent from the description, drawings, and claims.
Description of the drawings

In order to describe the technical solutions in the embodiments of the present application or in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description are only some embodiments of the present application; for those of ordinary skill in the art, other drawings can be obtained from these drawings without creative work.

Figure 1A is a segmentation result diagram of an object segmentation method in an embodiment;

Figure 1B is a schematic flowchart of an object segmentation method in an embodiment;

Figure 2 is a schematic diagram of network structures and feature maps in an embodiment;

Figure 3 is a schematic flowchart of the method for generating the first and second sets of feature maps in Figure 1B;

Figure 4 is a schematic flowchart of the method for generating the updated first and second sets of feature maps in Figure 1B;

Figure 5 is a schematic flowchart of a method for generating context feature maps based on the sub-regions of each layer of feature maps according to the regional context encoding RCE in an embodiment;

Figure 6 is a schematic diagram of a method for generating context feature maps based on the sub-regions of each layer of feature maps according to the regional context encoding RCE in an embodiment;

Figure 7 is a schematic flowchart of the iterative fusion and update process in Figure 1B;

Figure 8 is a schematic diagram of the iterative fusion and update process in Figure 1B;

Figure 9 is a structural block diagram of an object segmentation apparatus in an embodiment;

Figure 10 is a structural block diagram of the feature map generation module in Figure 9;

Figure 11 is a structural block diagram of a computer device in an embodiment.
Detailed description

In order to make the purpose, technical solutions, and advantages of this application clearer, the application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the application and are not intended to limit it.

As shown in Figure 1A, in one embodiment, a segmentation result diagram obtained by an object segmentation method is provided. In Figure 1A, the object segmentation method is used to segment the different objects in each image.

As shown in Figure 1B, in one embodiment, an object segmentation method is provided. Referring to Figure 1B, the object segmentation method specifically includes the following operations:
S102: Acquire an initial feature map of the image to be processed.

Here, the image to be processed is an image on which object segmentation is to be performed. It may be an image collected in real time or an image from any other device, and it may be video data or picture data. A feature map refers to a feature map obtained by computing the image to be processed through a convolutional neural network. The initial feature map is specifically obtained by inputting the image to be processed into a backbone network for convolution calculation; here, the backbone network may be a bottom-up network, which is one kind of convolutional neural network. Of course, the backbone network may also be another type of convolutional neural network.
S104: Input the initial feature map into the continuous top-down network and bottom-up network to compute the first-stage feature maps, and update the first-stage feature maps in the top-down network and the bottom-up network respectively according to the regional context encoding RCE, to obtain the updated first-stage feature maps.

Top-down networks and bottom-up networks are different types of convolutional neural networks. A top-down network propagates high-level, large-scale semantic information down to shallower network layers, while a bottom-up network encodes smaller-scale visual details into deeper network layers. A continuous top-down and bottom-up network refers to a bottom-up network connected after a top-down network, forming a new network; this continuous structure builds a shortcut for information propagation between the topmost and bottommost feature maps. Therefore, continuous top-down and bottom-up networks are used to learn more powerful feature maps at different levels.

Regional Context Encoding (RCE) refers to a mechanism that connects all sub-regions of the input feature map so that each sub-region can flexibly propagate its information. The RCE mechanism is implemented with multiple parallel branches, and the feature map generated by the top-down/bottom-up network is input into each RCE branch. In each RCE branch, the feature map is first divided into regular sub-regions of a given scale; then a weighted sum over the same-scale sub-regions of a layer's feature map aggregates all sub-regions into one global representation; finally, the global representation is redistributed to the sub-regions of that scale, and the results of all branches for that layer's feature map are added to the input feature map to generate the context feature map of that layer. This allows each sub-region to pass information to all sub-regions of the new feature map. Each branch performs a different subdivision of the feature map, generating sub-regions of a different scale.
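As a rough illustration only, one RCE branch and the aggregation across branches can be sketched in NumPy. This is a sketch under stated assumptions, not the patented implementation: the learned per-sub-region weights are replaced by uniform weights, average pooling stands in for whatever pooling the trained network uses, and the branch scales (1, 2, 4) are illustrative.

```python
import numpy as np

def rce_branch(x, s):
    """One hypothetical RCE branch: split the C x H x W map into an s x s grid
    of sub-regions, pool each to a C-dim descriptor, aggregate all descriptors
    into one global representation by a weighted sum, then redistribute that
    representation back to every sub-region."""
    c, h, w = x.shape
    hs, ws = h // s, w // s                      # assumes h and w divisible by s
    desc = np.zeros((s, s, c))
    for i in range(s):
        for j in range(s):
            # Average pooling stands in for the pooling used by the real network.
            desc[i, j] = x[:, i*hs:(i+1)*hs, j*ws:(j+1)*ws].mean(axis=(1, 2))
    # Learned sub-region weights are replaced by uniform weights here.
    weights = np.full((s, s, 1), 1.0 / (s * s))
    global_rep = (weights * desc).sum(axis=(0, 1))        # shape (C,)
    # Redistribute the global representation to every sub-region.
    out = np.zeros_like(x)
    for i in range(s):
        for j in range(s):
            out[:, i*hs:(i+1)*hs, j*ws:(j+1)*ws] = global_rep[:, None, None]
    return out

def rce(x, scales=(1, 2, 4)):
    """Sum all branch outputs and add the input map (residually) to form
    the context feature map; the scales are illustrative."""
    return x + sum(rce_branch(x, s) for s in scales)
```

With learned, non-uniform weights each branch would contribute a different mixture of sub-region statistics; here the branches only illustrate the divide / aggregate / redistribute data flow.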
Specifically, the initial feature map is first input into the top-down network for processing to obtain the feature maps corresponding to the top-down network, and that result is then input into the bottom-up network connected to the top-down network to obtain the feature maps corresponding to the bottom-up network; these two sets of feature maps constitute the first-stage feature maps. Then, in the top-down network, context feature maps based on the sub-regions of each layer of feature maps are generated according to the regional context encoding RCE and propagated through the top-down network to the feature maps of the other layers, thereby updating the feature maps of the other layers in the top-down network. Similarly, in the bottom-up network, context feature maps based on the sub-regions of each layer of feature maps are generated according to the regional context encoding RCE and propagated through the bottom-up network to the feature maps of the other layers, thereby updating the feature maps of the other layers in the bottom-up network. Together, these two sets of updated feature maps constitute the updated first-stage feature maps.
S106: Perform iterative fusion and update processing with the updated first-stage feature maps as input, until the preset number of iterations is reached to obtain the target feature map.

The first-stage feature maps updated according to the regional context encoding RCE are used as input for iterative fusion processing to obtain the second-stage feature maps. The second-stage feature maps are updated according to the RCE to obtain updated second-stage feature maps, which are in turn used as input for iterative fusion processing to obtain the third-stage feature maps. The third-stage feature maps are updated according to the RCE to obtain updated third-stage feature maps, and so on, until the preset number of iterations is reached; at that point, the feature maps obtained last are the target feature maps.
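The stage-by-stage schedule above can be made concrete with a small control-flow sketch. All sub-networks here are toy stand-ins (plain array arithmetic): in the actual method they are the dense top-down network, the dense bottom-up network, and the RCE-based update, and the element-wise fusion rule used below is likewise an assumption.

```python
import numpy as np

# Toy stand-ins for the real sub-networks; only the control flow mirrors the method.
def top_down_pass(maps):
    out = list(maps)
    for i in range(len(out) - 2, -1, -1):        # deeper -> shallower
        out[i] = out[i] + 0.5 * out[i + 1]
    return out

def bottom_up_pass(maps):
    out = list(maps)
    for i in range(1, len(out)):                 # shallower -> deeper
        out[i] = out[i] + 0.5 * out[i - 1]
    return out

def rce_update(maps):
    # Stand-in for the RCE-based update of one set of feature maps.
    return [m + m.mean() for m in maps]

def iterate(init_maps, T=3):
    """S104/S106: compute the stage-1 maps, update with RCE, then fuse and
    recompute until T stages have been produced."""
    first = top_down_pass(init_maps)             # first set (top-down)
    second = bottom_up_pass(first)               # second set (bottom-up)
    first, second = rce_update(first), rce_update(second)
    for _ in range(T - 1):
        fused = [a + b for a, b in zip(first, second)]   # fusion rule assumed
        first = top_down_pass(fused)
        second = bottom_up_pass(first)
        first, second = rce_update(first), rce_update(second)
    return second   # final-stage bottom-up maps drive segmentation (S108)
```

The returned value is the second set of final-stage maps, matching the statement that the target feature map is the one obtained by the bottom-up network in the final stage.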
S108: Perform object segmentation on the image to be processed according to the target feature map, where the target feature map is a feature map obtained by the bottom-up network.

After the target feature map is obtained, object segmentation can be performed on the image to be processed according to it. Specifically, object segmentation on the image to be processed separates the different objects in the image, as well as different individuals of the same object. Here, the target feature map is the feature map obtained by the bottom-up network.

With the above object segmentation method, an initial feature map of the image to be processed is obtained and input into a continuous top-down network and bottom-up network to compute the first-stage feature maps, which are updated in the top-down network and the bottom-up network respectively according to the regional context encoding RCE to obtain the updated first-stage feature maps. The updated first-stage feature maps are used as input for iterative fusion and update calculations until the preset number of iterations is reached, yielding the target feature map, and object segmentation is performed on the image to be processed according to it. The regional context encoding RCE can use the different sub-regions of a feature map to compute contextual features and thereby update the feature maps at each stage, capturing the context information of multi-scale sub-regions in the feature maps. Iterative fusion can aggregate contextual information at all levels, avoiding information loss and thereby improving the accuracy of object segmentation.
In one embodiment, the continuous top-down network and bottom-up network are connected by dense paths.

Specifically, Figure 2 is a schematic diagram of network structures and feature maps in an embodiment. In Figure 2, arrows denote network structures and rectangular blocks denote feature maps. An arrow pointing from top to bottom denotes a top-down network, and an arrow pointing from bottom to top denotes a bottom-up network.
在图2(a)中,左图箭头所示为主干网络,右图箭头所示为自顶向下网络。在图2(b)中,左图箭头所示为主干网络,右图箭头所示为采用密集路径连接的自顶向下网络。在图2(c)中,左图箭头所示为主干网络,将主干网络所得的特征图输入至右图箭头所示连续的自顶向下网络和自底向上网络。其中,在连续的自顶向下网络和自底向上网络中,将由自顶向下网络进行卷积计算所得的特征图输入至自底向上网络中进行卷积计算得到特征图。在图 2(d)中,左图箭头所示为主干网络,将主干网络所得的特征图输入至右图箭头所示连续的自顶向下网络和自底向上网络,该连续的自顶向下网络和自底向上网络均采用密集路径连接。在本申请实施例中,主干网络为自底向上网络,且在本申请实施例中,主要采用图2(d)箭头所示的网络结构实现物体分割。In Figure 2(a), the arrow on the left shows the backbone network, and the arrow on the right shows the top-down network. In Figure 2(b), the arrow on the left shows the backbone network, and the arrow on the right shows the top-down network connected by dense paths. In Figure 2(c), the arrow on the left shows the backbone network, and the feature map obtained from the backbone network is input into the continuous top-down network and bottom-up network shown by the arrow on the right. Among them, in the continuous top-down network and bottom-up network, the feature map obtained by the convolution calculation of the top-down network is input into the bottom-up network for convolution calculation to obtain the feature map. In Figure 2(d), the arrow on the left shows the backbone network, and the feature map obtained from the backbone network is input into the continuous top-down network and bottom-up network shown by the arrow on the right. The continuous top Both the bottom-up network and the bottom-up network are connected by dense paths. In the embodiment of the present application, the backbone network is a bottom-up network, and in the embodiment of the present application, the network structure shown by the arrow in FIG. 2(d) is mainly used to implement object segmentation.
In FIG. 2(c), the continuous top-down and bottom-up networks can only propagate context information directly between adjacent feature maps; propagating context information beyond adjacent feature maps requires indirect propagation through multiple stages, which inevitably attenuates important information.
In the embodiments of this application, the continuous top-down and bottom-up networks are connected by dense paths. The dense paths let every feature map communicate directly with all other feature maps at every stage of context propagation, so the feature maps are enhanced directly and effectively at all levels. This avoids the progressive attenuation of important information during indirect propagation, which would otherwise reduce the accuracy of the final object segmentation result.
In one embodiment, as shown in FIG. 3, inputting the initial feature maps into the continuous top-down and bottom-up networks to compute the first-stage feature maps includes the following, where the first-stage feature maps comprise two groups of feature maps:
S1042: Input the initial feature maps into the top-down network to compute the first group of feature maps of the first-stage feature maps.
S1044: Input the first group of feature maps of the first-stage feature maps into the bottom-up network to compute the second group of feature maps of the first-stage feature maps.
Specifically, as shown in FIG. 2(d), the backbone network (the arrows in the left part of FIG. 2(d)) performs convolution to obtain the initial feature maps, which are fed into the continuous top-down and bottom-up networks (the arrows in the right part of FIG. 2(d)) connected by dense paths. The initial feature maps are first fed into the densely connected top-down network, whose convolutions produce the first group of feature maps of the first stage. That first group is then fed into the densely connected bottom-up network, whose convolutions produce the second group of feature maps of the first stage.
In the embodiments of this application, the continuous top-down and bottom-up networks build a shortcut for information propagation between the topmost and bottommost feature maps; they are therefore used to learn stronger feature maps at all levels. Because the two networks are connected by dense paths, every feature map communicates directly with all other feature maps at each stage of context propagation, so feature maps are enhanced directly and effectively at all levels and the progressive attenuation of important information during indirect propagation is avoided. The accuracy of the object segmentation result is thus improved along two dimensions: continuity and density.
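The two-pass computation of steps S1042/S1044 can be sketched as follows. This is an illustrative sketch only, not the patent's implementation: the fusion operator (element-wise addition), the nearest-neighbour resampling, and the three-level pyramid sizes are all assumptions made for brevity.

```python
import numpy as np

def upsample2x(x):
    # Nearest-neighbour upsampling of an (H, W, C) feature map.
    return x.repeat(2, axis=0).repeat(2, axis=1)

def downsample2x(x):
    # 2x2 average pooling of an (H, W, C) feature map.
    h, w, c = x.shape
    return x.reshape(h // 2, 2, w // 2, 2, c).mean(axis=(1, 3))

def top_down_pass(backbone):
    # backbone: list of feature maps, finest (largest) level first.
    # Each level is refined with context upsampled from the coarser level above.
    out = [None] * len(backbone)
    out[-1] = backbone[-1]
    for i in range(len(backbone) - 2, -1, -1):
        out[i] = backbone[i] + upsample2x(out[i + 1])
    return out

def bottom_up_pass(td):
    # Each level is refined with context downsampled from the finer level below.
    out = [None] * len(td)
    out[0] = td[0]
    for i in range(1, len(td)):
        out[i] = td[i] + downsample2x(out[i - 1])
    return out

# Three backbone levels: 16x16, 8x8, 4x4, each with 8 channels (assumed sizes).
rng = np.random.default_rng(0)
backbone = [rng.standard_normal((16 // 2**i, 16 // 2**i, 8)) for i in range(3)]
first_group = top_down_pass(backbone)       # first group of stage-1 feature maps
second_group = bottom_up_pass(first_group)  # second group of stage-1 feature maps
print([f.shape for f in second_group])
```

The bottom-up pass consumes the top-down pass's output rather than the raw backbone maps, mirroring the order in which S1042 and S1044 chain the two networks.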
In one embodiment, as shown in FIG. 4, updating the first-stage feature maps according to the regional context encoding (RCE) in the top-down network and the bottom-up network respectively, to obtain the updated first-stage feature maps, includes:
S1046: In the top-down network, generate a context feature map based on the subregions of each layer's feature map according to the RCE, and propagate the context feature map to the feature maps of the other layers of the top-down network to update them, obtaining the updated first group of feature maps of the first stage.
Specifically, in the densely connected top-down network denoted by the arrows in the middle part of FIG. 2(d), a context feature map based on the subregions of each layer's feature map is generated according to the RCE, and is then propagated through all the connections shown in the middle part of FIG. 2(d) to the feature maps of the other layers of the top-down network. The feature maps of those layers are thereby updated, yielding the updated first group of feature maps of the first stage.
S1048: In the bottom-up network, generate a context feature map based on the subregions of each layer's feature map according to the RCE, and propagate the context feature map to the feature maps of the other layers of the bottom-up network to update them, obtaining the updated second group of feature maps of the first stage.
Specifically, in the densely connected bottom-up network denoted by the arrows in the right part of FIG. 2(d), a context feature map based on the subregions of each layer's feature map is generated according to the RCE, and is then propagated through all the connections in the right part of FIG. 2(d) to the feature maps of the other layers of the bottom-up network. The feature maps of those layers are thereby updated, yielding the updated second group of feature maps of the first stage.
In the embodiments of this application, the first-stage feature maps are updated according to the RCE mechanism through densely connected paths. The mechanism connects all subregions of an input feature map, so that each subregion can flexibly propagate its information, which avoids losing the information contained in the subregions of each layer's feature map. Through the densely connected paths, the context feature maps generated by the RCE from the subregions of each layer's feature map are propagated to the feature maps of the other layers of the same network. With dense paths, all feature maps communicate directly whenever a context feature map is propagated, so feature maps are enhanced directly and effectively at all levels, and the progressive attenuation of important information during indirect propagation, which would reduce the accuracy of the final segmentation result, is avoided.
In one embodiment, as shown in FIG. 5, generating a context feature map based on the subregions of each layer's feature map according to the RCE includes:
S502: Divide each layer's feature map into subregions of different scales.
S504: Compute a weighted sum of the subregions of the same scale to obtain a global representation of the subregions at that scale.
S506: Redistribute the global representation to the subregions at that scale, and aggregate the subregions at that scale to generate a context feature map based on the subregions of each layer's feature map.
Specifically, as shown in FIG. 6, the RCE mechanism processes each layer's feature map with multiple parallel branches, and different branches divide the input feature map at different scales. In FIG. 6, for example, the input feature map (a) is divided at three different scales, into 3×3, 5×5, and 7×7 subregions, with all subregions in one branch sharing the same scale. The RCE mechanism first processes the input feature map (a) with a separate convolutional layer, then computes the subregion features (b) from the result, and, within each branch, computes a weighted sum of the subregions of the same scale to obtain the global representation of the subregions at that scale.
After the global representation is obtained, it is redistributed to the subregions at that scale, and those subregions are aggregated to generate a context feature map based on the subregions of each layer's feature map. Specifically, because the global representation (c) connects all subregions within its branch, it can propagate information to all subregions (d), giving each branch's result. Finally, the results of all branches are added to the input feature map (a) to generate a context feature map (e) based on the input feature map (a); this context feature map (e) is denoted by R.
More specifically, given a feature map F_i ∈ R^(H×W×C) generated by the top-down/bottom-up network, F_i is convolved and divided into K×K subregions. By accumulating the neurons within each subregion, a feature map F̄_i ∈ R^(K×K×C) is obtained:

    F̄_i(x, y) = Σ_{(h,w)∈S(x,y)} F_i(h, w)    (1)

where (x, y) denotes a position in F̄_i, S(x, y) denotes the subregion of F_i containing a group of neurons, and F̄_i(x, y) represents the feature of subregion S(x, y). The features of all subregions are summed, with the importance of each subregion adjusted, to obtain a global representation that connects all subregions. For this, a learnable K×K convolution with ReLU activation and no padding can simply be applied to F̄_i, producing a C-dimensional feature vector (as shown in FIG. 6(c)). Another learnable K×K convolution kernel is used to deconvolve this C-dimensional feature vector, again without padding, yielding a new feature map R̃_i ∈ R^(K×K×C). The R̃_i maps of all branches are then added to the input feature map F_i to generate the feature map R_i ∈ R^(H×W×C):

    R_i = F_i + Σ_K up(R̃_i^K)    (2)

where up(·) brings each branch's map back to the H×W resolution. For Equation (2), three different subdivisions of F_i (3×3, 5×5, and 7×7 subregions) are used to compute the set of feature maps {R̃_i^K}. Dividing the feature map into more subregions would significantly increase the number of parameters that need to be learned.
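Equations (1) and (2) can be sketched as a runnable example. This is not the patent's implementation: the learned K×K convolution/deconvolution pair is replaced here by a uniform mean with ReLU, and the redistribution step is a plain broadcast, purely to make the data flow of the RCE concrete.

```python
import numpy as np

def rce_branch(F, K):
    # Eq. (1): sum the neurons inside each of the K x K subregions S(x, y).
    H, W, C = F.shape
    hs, ws = H // K, W // K
    Fbar = F[:K * hs, :K * ws].reshape(K, hs, K, ws, C).sum(axis=(1, 3))  # (K, K, C)
    # Global representation connecting all subregions. (In the patent this is
    # a learned K x K convolution with ReLU; uniform weights are an assumption.)
    g = np.maximum(Fbar.mean(axis=(0, 1)), 0.0)   # C-dimensional vector, FIG. 6(c)
    # Redistribute the global representation back to every subregion and
    # broadcast it over the full H x W grid (stand-in for the deconvolution).
    return np.broadcast_to(g, (H, W, C))

def rce(F, scales=(3, 5, 7)):
    # Eq. (2): add the context maps of all parallel branches to the input map.
    return F + sum(rce_branch(F, K) for K in scales)

F = np.random.default_rng(1).standard_normal((21, 21, 16))
R = rce(F)
print(R.shape)  # same H x W x C as the input feature map
```

Note that every output position of R receives a contribution from every subregion at every scale, which is the property the text relies on when it says each subregion can influence any position.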
In the embodiments of this application, for each layer's input feature map, the RCE mechanism uses multiple branches; in each branch the feature map is divided into subregions of one scale, and a global representation is computed from the subregions of that scale, so each branch corresponds to one global representation. The global representation of each branch is then redistributed to the subregions divided at that branch's scale to obtain the branch's result, and finally the results of all branches are added to the input feature map to generate a context feature map based on that input feature map. Because the multiple global representations corresponding to the input feature map contain the context information of subregions at different scales, the resulting context feature map contains more comprehensive information.
The context feature map of each layer's subregions obtained through the RCE mechanism is propagated to the feature maps of the other layers through dense connections, and each of those layers in turn obtains the context feature map of its own subregions through the RCE mechanism. Thanks to the RCE mechanism, subregions of every scale in each layer's feature map can therefore influence any position of the other layers' feature maps. The dense paths already let all feature maps communicate directly when context feature maps are propagated, enhancing feature maps at all levels without the attenuation of indirect propagation; the RCE mechanism further allows subregions of different scales in each layer to influence any position of the other layers' feature maps. The accuracy of the object segmentation result is thus further improved.
In one embodiment, the RCE comprises multiple parallel branches for each layer's feature map, and each parallel branch processes the subregions of one scale.
Specifically, as shown in FIG. 6, the RCE mechanism processes each layer's feature map with multiple parallel branches (three branches in FIG. 6), and different branches divide the input feature map at different scales. In FIG. 6, the input feature map (a) is divided at three different scales, into 3×3, 5×5, and 7×7 subregions, with all subregions in one branch sharing the same scale; each parallel branch processes the subregions of its own scale. Of course, the three scales used in the above embodiment are only an example: one, two, four, or more different scales may be used. The input feature map may also be divided irregularly, i.e., the subregions within a branch may differ in size, provided the subregion sizes in each branch follow their own specific pattern.
The RCE mechanism first processes the input feature map (a) with a separate convolutional layer, then computes the subregion features (b) from the result, and, within each branch, computes a weighted sum of the subregions of the same scale to obtain the global representation at that scale. After the global representation is obtained, it is redistributed to the subregions at that scale, and those subregions are aggregated to generate a context feature map based on the subregions of each layer's feature map.
In the embodiments of this application, the RCE mechanism divides the input feature map at different scales in different branches, so the global representations computed from the different branches differ. Each global representation is redistributed to the subregions at its own scale, and those subregions are aggregated to generate the context feature map based on the subregions of each layer's feature map. The resulting context feature map based on the input feature map thus includes all the information contained in these distinct global representations, so it is more comprehensive and avoids the information loss that a single branch would cause.
In one embodiment, as shown in FIG. 7, the process of iterative fusion and updating includes:
S702: Fuse the first group and the second group of feature maps of the first stage to obtain the first group of feature maps of the second stage.
Specifically, FIG. 8 is a schematic diagram of the iterative fusion and update process, where t denotes the stage (0 ≤ t ≤ T). FIG. 8 shows two stages: data of the first stage carry the subscript t, and data of the second stage carry the subscript t+1. Let D_i^t and U_i^t denote the feature maps at layer i of stage t of the top-down and bottom-up networks, respectively. Taking t = 1, D_i^1 is the first group of feature maps (layer i) of the first stage, produced by the top-down network, and U_i^1 is the second group of feature maps (layer i) of the first stage, produced by the bottom-up network.

Fusing the first group and the second group of feature maps of the first stage yields the first group of feature maps of the second stage. In practice, the first and second groups of feature maps of a given layer of the first stage are fused to obtain that layer's first group of feature maps of the second stage; in FIG. 8, this means fusing the feature maps D_i^t and U_i^t into D_i^{t+1}. The feature maps of the other layers of the first stage are handled in the same way: each layer's first and second groups are fused to obtain that layer's first group of feature maps of the second stage.
S704: Fuse the first group of feature maps of the second stage with the second group of feature maps of the first stage to obtain the second group of feature maps of the second stage.

In practice, the first group of feature maps of a given layer of the second stage is fused with that layer's second group of feature maps of the first stage to obtain that layer's second group of feature maps of the second stage. In FIG. 8, this means fusing the feature maps D_i^{t+1} and U_i^t into U_i^{t+1}; the second groups of feature maps of the other layers of the second stage are generated in the same way.
Context information therefore propagates between the two networks in a zigzag manner, which is why the network of the embodiments of this application is also called a zigzag network (ZigZagNet). The zigzag network comprises densely connected, continuous top-down and bottom-up networks, between which feature maps are iteratively fused and updated in a zigzag manner to propagate context information.
S706: Update the first group of feature maps of the second stage in the top-down network according to the RCE, obtaining the updated first group of feature maps of the second stage.
S708: Update the second group of feature maps of the second stage in the bottom-up network according to the RCE, obtaining the updated second group of feature maps of the second stage.
After the second-stage feature maps are obtained, they are updated in their respective networks according to the RCE to obtain the updated second-stage feature maps. The second-stage feature maps comprise a first group and a second group, where the first group comprises multiple layers of feature maps and the second group comprises the same number of layers. The update proceeds as follows: in the top-down network, a context feature map based on the subregions of each layer's feature map is generated according to the RCE and propagated through the top-down network to the feature maps of its other layers, thereby updating them. Likewise, in the bottom-up network, a context feature map based on the subregions of each layer's feature map is generated according to the RCE and propagated through the bottom-up network to the feature maps of its other layers, thereby updating them. The two sets of updated feature maps together constitute the updated second-stage feature maps.
S710: Use the updated second-stage feature maps as the input of the next round of iterative fusion and update computation, and repeat until the preset number of iterations is reached, obtaining the target feature maps.
Specifically, the updated second-stage feature maps are iteratively fused as input to obtain the third-stage feature maps, which are then updated in their respective networks according to the RCE to obtain the updated third-stage feature maps. The next rounds of fusion and update computation proceed in the same way until the preset number of iterations is reached, yielding the target feature maps. In the embodiments of this application the number of iterations may be three; in other embodiments it may be any other value.
In the embodiments of this application, iterative fusion makes the context information contained in the feature maps propagate between the top-down and bottom-up networks in a zigzag manner. Context information is thus exchanged iteratively between the two networks, strengthening the direct information exchange between them, avoiding the information loss that a single network would cause, and improving the accuracy of object segmentation.
It should be understood that, although the operations in the flowchart of FIG. 7 are displayed sequentially as indicated by the arrows, they are not necessarily executed in that order. Unless explicitly stated herein, their execution is not strictly ordered and they may be executed in other orders. Moreover, at least some of the operations in FIG. 7 may comprise multiple sub-operations or stages, which are not necessarily completed at the same moment but may be executed at different moments, and whose execution order is not necessarily sequential: they may be executed in turn or alternately with other operations, or with at least part of the sub-operations or stages of other operations.
In one embodiment, the computation of the iterative fusion and update process shown in FIG. 8 is described in detail.
在阶段t+1中自顶向下的网络计算特征图有
Figure PCTCN2019081484-appb-000022
In stage t+1, the top-down network calculation feature map has
Figure PCTCN2019081484-appb-000022
Figure PCTCN2019081484-appb-000023
Figure PCTCN2019081484-appb-000023
其中t=0,…,T-1。在本申请实施例中T最大为3,当然也可以设定T最大取到其他数值。通过将来自更高层次
Figure PCTCN2019081484-appb-000024
的上下文特征图集与融合的特征图集
Figure PCTCN2019081484-appb-000025
的乘积相加,对特征映射
Figure PCTCN2019081484-appb-000026
进行建模,其中
Where t=0,...,T-1. In the embodiment of the present application, the maximum value of T is 3. Of course, the maximum value of T may be set to other values. By coming from a higher level
Figure PCTCN2019081484-appb-000024
Context feature atlas and fusion feature atlas
Figure PCTCN2019081484-appb-000025
Add the product of, the feature map
Figure PCTCN2019081484-appb-000026
For modeling, where
Figure PCTCN2019081484-appb-000027
Figure PCTCN2019081484-appb-000027
其中
Figure PCTCN2019081484-appb-000028
是卷积内核,σ表示Relu激活函数,初始化时t=0,使用B i表示由骨架FCN计算的特征图,用它构造
Figure PCTCN2019081484-appb-000029
在接下来的迭代中,通过卷积和激活它们的总和,将上一个迭代中自顶向下和自底向上的网络生成的
Figure PCTCN2019081484-appb-000030
Figure PCTCN2019081484-appb-000031
融合起来。因此,与传统的单向上下文信息传播不同,自顶向下网络接收上一个迭代的自顶向下和自底向上的上下文信息,以细化新的特征图
Figure PCTCN2019081484-appb-000032
此外,使用区域上下文编码(RCE)来生成基于
Figure PCTCN2019081484-appb-000033
的子区域的上下文特征图
Figure PCTCN2019081484-appb-000034
RCE将分区之间的关系编码到上下文特征图中。通过使用不同的子区域尺度,为
Figure PCTCN2019081484-appb-000035
提供了更丰富的上下文信息。
among them
Figure PCTCN2019081484-appb-000028
Is the convolution kernel, σ represents the Relu activation function, t=0 during initialization, and uses B i to represent the feature map calculated by the skeleton FCN, and use it to construct
Figure PCTCN2019081484-appb-000029
In the next iteration, through convolution and activation of their sum, the top-down and bottom-up networks in the previous iteration are generated
Figure PCTCN2019081484-appb-000030
with
Figure PCTCN2019081484-appb-000031
Fusion. Therefore, unlike the traditional one-way contextual information propagation, the top-down network receives the top-down and bottom-up context information of the previous iteration to refine the new feature map
Figure PCTCN2019081484-appb-000032
In addition, regional context coding (RCE) is used to generate
Figure PCTCN2019081484-appb-000033
Contextual feature map of subregions
Figure PCTCN2019081484-appb-000034
RCE encodes the relationship between the partitions into the context feature map. By using different sub-region scales,
Figure PCTCN2019081484-appb-000035
Provides richer contextual information.
同样的,使用自底向上的网络来计算特征图
Figure PCTCN2019081484-appb-000036
有:
Similarly, use a bottom-up network to calculate feature maps
Figure PCTCN2019081484-appb-000036
Have:
Figure PCTCN2019081484-appb-000037
Figure PCTCN2019081484-appb-000037
其中有:Including:
Figure PCTCN2019081484-appb-000038
Figure PCTCN2019081484-appb-000038
注意,这里不像公式4那样融合阶段t的两个特征图,而是融合图
Figure PCTCN2019081484-appb-000039
Figure PCTCN2019081484-appb-000040
因为在流程的这一点上
Figure PCTCN2019081484-appb-000041
已经可用,并且包含比
Figure PCTCN2019081484-appb-000042
更精确的信息。最后,利用公式(4)融合图
Figure PCTCN2019081484-appb-000043
Figure PCTCN2019081484-appb-000044
得到用于分割的图
Figure PCTCN2019081484-appb-000045
Note that instead of fusing the two feature maps of stage t as in formula (4), the fused maps are
Figure PCTCN2019081484-appb-000039
and
Figure PCTCN2019081484-appb-000040
because at this point in the pipeline
Figure PCTCN2019081484-appb-000041
is already available and contains more precise information than
Figure PCTCN2019081484-appb-000042
Finally, formula (4) is used to fuse the maps
Figure PCTCN2019081484-appb-000043
and
Figure PCTCN2019081484-appb-000044
to obtain the map used for segmentation:
Figure PCTCN2019081484-appb-000045
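Formula (4) itself appears here only as an image placeholder, but the surrounding description (the sum of two feature maps is convolved and activated) suggests the following minimal sketch. The 1x1-convolution kernel shape and the use of a plain matrix for it are assumptions for illustration, not the patent's exact operator:

```python
import numpy as np

def fuse(a, b, kernel):
    """Sketch of the fusion step: sum the two feature maps,
    apply a 1x1 convolution (kernel: C_out x C_in), then ReLU.
    a, b: arrays of shape (C_in, H, W)."""
    summed = a + b                                   # element-wise sum of the maps
    conv = np.einsum('oc,chw->ohw', kernel, summed)  # 1x1 convolution
    return np.maximum(conv, 0.0)                     # ReLU activation
```

With an identity kernel, fusing an all-ones map with an all-twos map simply yields an all-threes map, which makes the operation easy to sanity-check.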
In one embodiment, acquiring the initial feature map of the image to be processed includes:
Inputting the image to be processed into the bottom-up network to compute the initial feature map.
In this embodiment of the present application, the bottom-up network is shown on the left of FIG. 2(d); inputting the image to be processed into this bottom-up network for convolution computation yields the initial feature map.
In one embodiment, the target feature map is a feature map obtained by the bottom-up network.
In this embodiment of the present application, after multiple iterations of fusion and update computation, once the preset number of iterations is reached, the feature maps of the final stage are obtained. The final-stage feature maps include a first set of feature maps, computed by the top-down network, and a second set of feature maps, computed by the bottom-up network. Object segmentation is performed based on the second set of feature maps of the final stage, so the target feature map is the feature map obtained by the bottom-up network in the final stage. This feature map is the most up-to-date and contains the richest contextual information, so using it for object segmentation improves the accuracy of the segmentation results.
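The patent does not spell out here how labels are read off the target feature map; a common approach, shown purely as an assumption, is a per-pixel 1x1 classification layer followed by an argmax over classes:

```python
import numpy as np

def segment(target_feature_map, classifier):
    """Per-pixel segmentation from the final bottom-up feature map.
    classifier: hypothetical (num_classes, C) weight matrix of a
    1x1 classification layer; returns an (H, W) label map."""
    # score every pixel for every class, then take the best class
    scores = np.einsum('kc,chw->khw', classifier, target_feature_map)
    return scores.argmax(axis=0)
```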
In one embodiment, as shown in FIG. 9, an object segmentation apparatus 900 is provided. The apparatus includes an initial feature map acquisition module 920, a feature map generation module 940, an iterative processing module 960, and an object segmentation module 980. Among them,
The initial feature map acquisition module 920 is configured to acquire the initial feature map of the image to be processed;
The feature map generation module 940 is configured to input the initial feature map into consecutive top-down and bottom-up networks to compute the first-stage feature maps, and to update the first-stage feature maps in the top-down network and the bottom-up network respectively according to regional context coding (RCE) to obtain the updated first-stage feature maps;
The iterative processing module 960 is configured to use the updated first-stage feature maps as input for iterative fusion and update computation until the preset number of iterations is reached to obtain the target feature map, where the target feature map is a feature map obtained by the bottom-up network;
The object segmentation module 980 is configured to perform object segmentation on the image to be processed according to the target feature map.
In one embodiment, as shown in FIG. 10, the feature map generation module 940 includes a first-set feature map generation module 942 and a second-set feature map generation module 944. Among them,
The first-set feature map generation module 942 is configured to input the initial feature map into the top-down network to compute the first set of feature maps in the first-stage feature maps;
The second-set feature map generation module 944 is configured to input the first set of feature maps in the first-stage feature maps into the bottom-up network to compute the second set of feature maps in the first-stage feature maps.
In one embodiment, as shown in FIG. 10, the feature map generation module 940 further includes a feature map update module 946. Among them,
The feature map update module 946 is configured to generate, in the top-down network according to regional context coding (RCE), a contextual feature map based on the sub-regions of each layer's feature map, propagate the contextual feature map to the feature maps of the other layers in the top-down network, and update those feature maps to obtain the first set of feature maps in the updated first-stage feature maps;
The feature map update module 946 is further configured to generate, in the bottom-up network according to regional context coding (RCE), a contextual feature map based on the sub-regions of each layer's feature map, propagate the contextual feature map to the feature maps of the other layers in the bottom-up network, and update those feature maps to obtain the second set of feature maps in the updated first-stage feature maps.
In one embodiment, as shown in FIG. 10, the feature map update module 946 includes a regional context coding (RCE) module 946a. The RCE module 946a is configured to: divide each layer's feature map into sub-regions of different scales; compute a weighted sum over the sub-regions of the same scale to obtain a global representation of the sub-regions of that scale; redistribute the global representation to the sub-regions of that scale; and aggregate the sub-regions of the scales to generate a contextual feature map based on the sub-regions of each layer's feature map.
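The RCE steps just listed (divide into sub-regions per scale, pool each into a global representation, redistribute, and aggregate) can be sketched as follows. Uniform pooling weights and spatial dimensions evenly divisible by each scale are assumptions made for illustration:

```python
import numpy as np

def rce(feature_map, scales=(1, 2, 4)):
    """Sketch of regional context coding (RCE): for each scale s,
    split the (C, H, W) feature map into s x s sub-regions, pool each
    sub-region into a global representation (uniform weights here),
    redistribute it over the sub-region, and aggregate all scales."""
    c, h, w = feature_map.shape
    context = np.zeros_like(feature_map)
    for s in scales:
        rh, rw = h // s, w // s  # sub-region size (assumes H, W divisible by s)
        for i in range(s):
            for j in range(s):
                region = feature_map[:, i*rh:(i+1)*rh, j*rw:(j+1)*rw]
                g = region.mean(axis=(1, 2), keepdims=True)    # global representation
                context[:, i*rh:(i+1)*rh, j*rw:(j+1)*rw] += g  # redistribute
    return context / len(scales)  # aggregate the scales
```

In practice each scale would correspond to one of the parallel RCE branches; here they are aggregated by a simple average.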
In one embodiment, as shown in FIG. 10, the iterative processing module 960 is further configured to: fuse the first set and the second set of feature maps in the first-stage feature maps to obtain the first set of feature maps in the second-stage feature maps;
Fuse the first set of feature maps in the second-stage feature maps with the second set of feature maps in the first-stage feature maps to obtain the second set of feature maps in the second-stage feature maps;
Update the first set of feature maps in the second-stage feature maps in the top-down network according to regional context coding (RCE) to obtain the updated first set of feature maps in the second-stage feature maps;
Update the second set of feature maps in the second-stage feature maps in the bottom-up network according to regional context coding (RCE) to obtain the updated second set of feature maps in the second-stage feature maps;
Use the updated second-stage feature maps as the input for the next iteration of fusion and update computation, until the preset number of iterations is reached and the target feature map is obtained.
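One possible reading of the stage-to-stage update above, sketched in plain NumPy. The fusion operator (1x1 convolution of the sum, then ReLU) and the no-op RCE stand-in are assumptions; the patent's actual layers are not reproduced here:

```python
import numpy as np

def fuse(a, b, kernel):
    # formula (4) analog: 1x1 convolution of the sum, then ReLU
    return np.maximum(np.einsum('oc,chw->ohw', kernel, a + b), 0.0)

def iterate(f_top, f_bottom, kernel, num_iters=3, rce=lambda x: x):
    """Iterative fusion/update loop: each stage fuses the previous
    stage's two groups of feature maps, refines both with RCE, and
    the final bottom-up (second-group) map is the target."""
    for _ in range(num_iters):
        f_top_new = rce(fuse(f_top, f_bottom, kernel))     # new first group
        f_bottom = rce(fuse(f_top_new, f_bottom, kernel))  # new second group
        f_top = f_top_new
    return f_bottom  # target feature map for segmentation
```

Note how the new second group is fused from the *new* first group, matching the description that the fresher map is already available at that point in the pipeline.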
In one embodiment, as shown in FIG. 10, the initial feature map acquisition module 920 is further configured to input the image to be processed into the bottom-up network to compute the initial feature map.
FIG. 11 shows an internal structure diagram of a computer device in one embodiment. The computer device may specifically be a terminal or a server. As shown in FIG. 11, the computer device includes a processor, a memory, a network interface, an input device, a display screen, a camera, a sound collection device, and a speaker connected through a system bus. The memory includes a non-volatile storage medium and an internal memory. The non-volatile storage medium of the computer device stores an operating system and may also store a computer program which, when executed by the processor, causes the processor to implement the object segmentation method. A computer program may also be stored in the internal memory which, when executed by the processor, causes the processor to execute the object segmentation method. The display screen of the computer device may be a liquid crystal display or an electronic ink display; the input device may be a touch layer covering the display screen, a button, trackball, or touchpad provided on the housing of the computer device, or an external keyboard, touchpad, or mouse.
Those skilled in the art can understand that the structure shown in FIG. 11 is only a block diagram of part of the structure related to the solution of the present application and does not constitute a limitation on the computer device to which the solution is applied. A specific computer device may include more or fewer components than shown in the figure, combine certain components, or have a different arrangement of components.
In one embodiment, the object segmentation apparatus provided in the present application may be implemented in the form of a computer program, and the computer program may run on the computer device shown in FIG. 11. The memory of the computer device may store the program modules constituting the object segmentation apparatus, such as the initial feature map acquisition module 920, the feature map generation module 940, the iterative processing module 960, and the object segmentation module 980 shown in FIG. 9. The computer program composed of these program modules causes the processor to execute the operations in the object segmentation methods of the embodiments of the present application described in this specification.
For example, the computer device shown in FIG. 11 may perform operation S102 through the initial feature map acquisition module 920 in the object segmentation apparatus 900 shown in FIG. 9, operation S104 through the feature map generation module 940, operation S106 through the iterative processing module 960, and operation S108 through the object segmentation module 980.
Each module in the above object segmentation apparatus may be implemented in whole or in part by software, hardware, or a combination thereof. The network interface may be an Ethernet card or a wireless network card, among others. The above modules may be embedded in hardware form in, or independent of, the processor in the server, or stored in software form in the memory of the server, so that the processor can invoke and execute the operations corresponding to each module.
In one embodiment, a computer device is provided, including a memory and a processor, the memory storing a computer program which, when executed by the processor, causes the processor to perform the operations of the above object segmentation method. The operations of the object segmentation method here may be the operations in the object segmentation methods of the foregoing embodiments.
In one embodiment, a computer-readable storage medium is provided, storing a computer program which, when executed by a processor, causes the processor to perform the operations of the above object segmentation method. The operations of the object segmentation method here may be the operations in the object segmentation methods of the foregoing embodiments.
A person of ordinary skill in the art can understand that all or part of the processes in the methods of the above embodiments may be implemented by a computer program instructing relevant hardware; the program may be stored in a non-volatile computer-readable storage medium and, when executed, may include the processes of the embodiments of the above methods. The storage medium may be a magnetic disk, an optical disc, a read-only memory (ROM), or the like.
The technical features of the above embodiments may be combined arbitrarily. For brevity of description, not all possible combinations of the technical features in the above embodiments are described; however, as long as a combination of these technical features involves no contradiction, it should be considered within the scope of this specification.
The above embodiments express only several implementations of the present application, and their descriptions are relatively specific and detailed, but they should not therefore be construed as limiting the scope of the invention patent. It should be noted that those of ordinary skill in the art can make several modifications and improvements without departing from the concept of the present application, and these all fall within the protection scope of the present application. Therefore, the protection scope of this patent shall be subject to the appended claims.

Claims (18)

  1. An object segmentation method, comprising:
    acquiring an initial feature map of an image to be processed;
    inputting the initial feature map into consecutive top-down and bottom-up networks to compute first-stage feature maps, and updating the first-stage feature maps in the top-down network and the bottom-up network respectively according to regional context coding (RCE) to obtain updated first-stage feature maps;
    using the updated first-stage feature maps as input for iterative fusion and update processing until a preset number of iterations is reached to obtain a target feature map, the target feature map being a feature map obtained by the bottom-up network; and
    performing object segmentation on the image to be processed according to the target feature map.
  2. The method according to claim 1, wherein the consecutive top-down and bottom-up networks are connected by dense paths.
  3. The method according to claim 2, wherein inputting the initial feature map into the consecutive top-down and bottom-up networks to compute the first-stage feature maps comprises:
    the first-stage feature maps comprising two sets of feature maps;
    inputting the initial feature map into the top-down network to compute a first set of feature maps in the first-stage feature maps; and
    inputting the first set of feature maps in the first-stage feature maps into the bottom-up network to compute a second set of feature maps in the first-stage feature maps.
  4. The method according to claim 3, wherein updating the first-stage feature maps in the top-down network and the bottom-up network respectively according to the regional context coding (RCE) to obtain the updated first-stage feature maps comprises:
    generating, in the top-down network according to the regional context coding (RCE), a contextual feature map based on sub-regions of each layer's feature map, propagating the contextual feature map to the feature maps of the other layers in the top-down network, and updating the feature maps of the other layers in the top-down network to obtain the first set of feature maps in the updated first-stage feature maps; and
    generating, in the bottom-up network according to the regional context coding (RCE), a contextual feature map based on sub-regions of each layer's feature map, propagating the contextual feature map to the feature maps of the other layers in the bottom-up network, and updating the feature maps of the other layers in the bottom-up network to obtain the second set of feature maps in the updated first-stage feature maps.
  5. The method according to claim 4, wherein generating the contextual feature map based on the sub-regions of each layer's feature map according to the regional context coding (RCE) comprises:
    dividing each layer's feature map into sub-regions of different scales;
    performing a weighted summation over the sub-regions of a same scale to obtain a global representation of the sub-regions of the scale; and
    redistributing the global representation to the sub-regions of the scale, and aggregating the sub-regions of the scale to generate the contextual feature map based on the sub-regions of each layer's feature map.
  6. The method according to claim 5, wherein the regional context coding (RCE) comprises, for each layer's feature map, a plurality of parallel branches, each of the parallel branches separately processing sub-regions of a same scale.
  7. The method according to claim 3, wherein the iterative fusion and update processing comprises:
    fusing the first set of feature maps and the second set of feature maps in the first-stage feature maps to obtain a first set of feature maps in second-stage feature maps;
    fusing the first set of feature maps in the second-stage feature maps with the second set of feature maps in the first-stage feature maps to obtain a second set of feature maps in the second-stage feature maps;
    updating the first set of feature maps in the second-stage feature maps in the top-down network according to the regional context coding (RCE) to obtain an updated first set of feature maps in the second-stage feature maps;
    updating the second set of feature maps in the second-stage feature maps in the bottom-up network according to the regional context coding (RCE) to obtain an updated second set of feature maps in the second-stage feature maps; and
    using the updated second-stage feature maps as the input for the next iteration of fusion and update computation until the preset number of iterations is reached to obtain the target feature map.
  8. The method according to claim 1, wherein acquiring the initial feature map of the image to be processed comprises:
    inputting the image to be processed into the bottom-up network to compute the initial feature map.
  9. An object segmentation apparatus, comprising:
    an initial feature map acquisition module, configured to acquire an initial feature map of an image to be processed;
    a feature map generation module, configured to input the initial feature map into consecutive top-down and bottom-up networks to compute first-stage feature maps, and to update the first-stage feature maps in the top-down network and the bottom-up network respectively according to regional context coding (RCE) to obtain updated first-stage feature maps;
    an iterative processing module, configured to use the updated first-stage feature maps as input for iterative fusion and update computation until a preset number of iterations is reached to obtain a target feature map; and
    an object segmentation module, configured to perform object segmentation on the image to be processed according to the target feature map.
  10. A computer-readable storage medium storing a computer program which, when executed by a processor, causes the processor to perform the operations of the method according to any one of claims 1 to 8.
  11. A computer device, comprising a memory and a processor, the memory storing a computer program which, when executed by the processor, causes the processor to perform the following operations:
    acquiring an initial feature map of an image to be processed;
    inputting the initial feature map into consecutive top-down and bottom-up networks to compute first-stage feature maps, and updating the first-stage feature maps in the top-down network and the bottom-up network respectively according to regional context coding (RCE) to obtain updated first-stage feature maps;
    using the updated first-stage feature maps as input for iterative fusion and update processing until a preset number of iterations is reached to obtain a target feature map, the target feature map being a feature map obtained by the bottom-up network; and
    performing object segmentation on the image to be processed according to the target feature map.
  12. The computer device according to claim 11, wherein, when the computer program is executed by the processor, the processor is caused to perform the following operation:
    the consecutive top-down and bottom-up networks being connected by dense paths.
  13. The computer device according to claim 12, wherein, when the computer program is executed by the processor, the processor is caused to perform the following operations, and wherein inputting the initial feature map into the consecutive top-down and bottom-up networks to compute the first-stage feature maps comprises:
    the first-stage feature maps comprising two sets of feature maps;
    inputting the initial feature map into the top-down network to compute a first set of feature maps in the first-stage feature maps; and
    inputting the first set of feature maps in the first-stage feature maps into the bottom-up network to compute a second set of feature maps in the first-stage feature maps.
  14. The computer device according to claim 13, wherein, when the computer program is executed by the processor, the processor is caused to perform the following operations, and wherein updating the first-stage feature maps in the top-down network and the bottom-up network respectively according to the regional context coding (RCE) to obtain the updated first-stage feature maps comprises:
    generating, in the top-down network according to the regional context coding (RCE), a contextual feature map based on sub-regions of each layer's feature map, propagating the contextual feature map to the feature maps of the other layers in the top-down network, and updating the feature maps of the other layers in the top-down network to obtain the first set of feature maps in the updated first-stage feature maps; and
    generating, in the bottom-up network according to the regional context coding (RCE), a contextual feature map based on sub-regions of each layer's feature map, propagating the contextual feature map to the feature maps of the other layers in the bottom-up network, and updating the feature maps of the other layers in the bottom-up network to obtain the second set of feature maps in the updated first-stage feature maps.
  15. The computer device according to claim 14, wherein, when the computer program is executed by the processor, the processor is caused to perform the following operations, and wherein generating the contextual feature map based on the sub-regions of each layer's feature map according to the regional context coding (RCE) comprises:
    dividing each layer's feature map into sub-regions of different scales;
    performing a weighted summation over the sub-regions of a same scale to obtain a global representation of the sub-regions of the scale; and
    redistributing the global representation to the sub-regions of the scale, and aggregating the sub-regions of the scale to generate the contextual feature map based on the sub-regions of each layer's feature map.
  16. The computer device according to claim 15, wherein, when the computer program is executed by the processor, the processor is caused to perform the following operation: the regional context coding (RCE) comprises, for each layer's feature map, a plurality of parallel branches, each of the parallel branches separately processing sub-regions of a same scale.
  17. The computer device according to claim 13, wherein, when the computer program is executed by the processor, the processor is caused to perform the following operations, and wherein the iterative fusion and update processing comprises:
    fusing the first set of feature maps and the second set of feature maps in the first-stage feature maps to obtain a first set of feature maps in second-stage feature maps;
    fusing the first set of feature maps in the second-stage feature maps with the second set of feature maps in the first-stage feature maps to obtain a second set of feature maps in the second-stage feature maps;
    updating the first set of feature maps in the second-stage feature maps in the top-down network according to the regional context coding (RCE) to obtain an updated first set of feature maps in the second-stage feature maps;
    updating the second set of feature maps in the second-stage feature maps in the bottom-up network according to the regional context coding (RCE) to obtain an updated second set of feature maps in the second-stage feature maps; and
    using the updated second-stage feature maps as the input for the next iteration of fusion and update computation until the preset number of iterations is reached to obtain the target feature map.
  18. The computer device according to claim 11, wherein, when the computer program is executed by the processor, the processor is caused to perform the following operations, and wherein acquiring the initial feature map of the image to be processed comprises:
    inputting the image to be processed into the bottom-up network to compute the initial feature map.
PCT/CN2019/081484 2019-03-21 2019-04-04 Object segmentation method and apparatus, computer readable storage medium, and computer device WO2020186563A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910217342.1A CN110084816B (en) 2019-03-21 2019-03-21 Object segmentation method, device, computer-readable storage medium and computer equipment
CN201910217342.1 2019-03-21

Publications (1)

Publication Number Publication Date
WO2020186563A1 true WO2020186563A1 (en) 2020-09-24

Family

ID=67413426

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/081484 WO2020186563A1 (en) 2019-03-21 2019-04-04 Object segmentation method and apparatus, computer readable storage medium, and computer device

Country Status (2)

Country Link
CN (1) CN110084816B (en)
WO (1) WO2020186563A1 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110084816B (en) * 2019-03-21 2021-04-06 深圳大学 Object segmentation method, device, computer-readable storage medium and computer equipment
CN113421276B (en) * 2021-07-02 2023-07-21 深圳大学 Image processing method, device and storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120092357A1 (en) * 2010-10-14 2012-04-19 Microsoft Corporation Region-Based Image Manipulation
CN108647695A (en) * 2018-05-02 2018-10-12 武汉科技大学 Soft image conspicuousness detection method based on covariance convolutional neural networks
CN109118491A (en) * 2018-07-30 2019-01-01 深圳先进技术研究院 A kind of image partition method based on deep learning, system and electronic equipment
CN109472298A (en) * 2018-10-19 2019-03-15 天津大学 Depth binary feature pyramid for the detection of small scaled target enhances network
CN110084816A (en) * 2019-03-21 2019-08-02 深圳大学 Method for segmenting objects, device, computer readable storage medium and computer equipment

Family Cites Families (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101916379A (en) * 2010-09-03 2010-12-15 华中科技大学 Target search and recognition method based on object accumulation visual attention mechanism
CN102073700B (en) * 2010-12-30 2012-12-19 浙江大学 Discovery method of complex network community
CN102074013B (en) * 2011-01-26 2012-11-28 刘国英 Wavelet multi-scale Markov network model-based image segmentation method
US9042648B2 (en) * 2012-02-23 2015-05-26 Microsoft Technology Licensing, Llc Salient object segmentation
CN103049340A (en) * 2012-10-26 2013-04-17 中山大学 Image super-resolution reconstruction method of visual vocabularies and based on texture context constraint
CN104463191A (en) * 2014-10-30 2015-03-25 华南理工大学 Robot visual processing method based on attention mechanism
US10176388B1 (en) * 2016-11-14 2019-01-08 Zoox, Inc. Spatial and temporal information for semantic segmentation
CN107609460B (en) * 2017-05-24 2021-02-02 南京邮电大学 Human body behavior recognition method integrating space-time dual network flow and attention mechanism
CN107368787B (en) * 2017-06-16 2020-11-10 长安大学 Traffic sign identification method for deep intelligent driving application
CN107909082B (en) * 2017-10-30 2020-07-31 东南大学 Sonar image target identification method based on deep learning technology
CN107909581B (en) * 2017-11-03 2019-01-29 杭州依图医疗技术有限公司 Lobe of the lung section dividing method, device, system, storage medium and the equipment of CT images
CN108171711A (en) * 2018-01-17 2018-06-15 深圳市唯特视科技有限公司 A kind of infant's brain Magnetic Resonance Image Segmentation method based on complete convolutional network
CN109255790A (en) * 2018-07-27 2019-01-22 北京工业大学 A kind of automatic image marking method of Weakly supervised semantic segmentation
CN109102513A (en) * 2018-08-15 2018-12-28 黄淮学院 A kind of partitioning algorithm of the medical image based on mathematical morphology
CN109447990B (en) * 2018-10-22 2021-06-22 北京旷视科技有限公司 Image semantic segmentation method and device, electronic equipment and computer readable medium

Also Published As

Publication number Publication date
CN110084816B (en) 2021-04-06
CN110084816A (en) 2019-08-02

Similar Documents

Publication Publication Date Title
US10733431B2 (en) Systems and methods for optimizing pose estimation
US10692243B2 (en) Optimizations for dynamic object instance detection, segmentation, and structure mapping
US10810435B2 (en) Segmenting objects in video sequences
CN111670457B (en) Optimization of dynamic object instance detection, segmentation and structure mapping
EP3493106B1 (en) Optimizations for dynamic object instance detection, segmentation, and structure mapping
CN109493417B (en) Three-dimensional object reconstruction method, device, equipment and storage medium
WO2020228522A1 (en) Target tracking method and apparatus, storage medium and electronic device
EP3493104A1 (en) Optimizations for dynamic object instance detection, segmentation, and structure mapping
US20150302317A1 (en) Non-greedy machine learning for high accuracy
JP7013489B2 (en) Learning device, live-action image classification device generation system, live-action image classification device generation device, learning method and program
US20200265294A1 (en) Object Animation Using Generative Neural Networks
WO2020186563A1 (en) Object segmentation method and apparatus, computer readable storage medium, and computer device
CN112214775A (en) Injection type attack method and device for graph data, medium and electronic equipment
CN112464042B (en) Task label generating method and related device for convolution network according to relationship graph
CN107003834A (en) Pedestrian detection apparatus and method
CN112819157A (en) Neural network training method and device and intelligent driving control method and device
CN113554653A (en) Semantic segmentation method for long-tail distribution of point cloud data based on mutual information calibration
CN108780377A (en) Object Management group using computing device and visualization
CN112800276A (en) Video cover determination method, device, medium and equipment
CN113723411B (en) Feature extraction method and segmentation system for semantic segmentation of remote sensing image
CN117037244A (en) Face security detection method, device, computer equipment and storage medium
CN110781223A (en) Data processing method and device, processor, electronic equipment and storage medium
JP2021527859A (en) Irregular shape segmentation in an image using deep region expansion
CN113191208B (en) Feature extraction method and computer equipment for remote sensing image instance segmentation
JP6892557B2 (en) Learning device, image generator, learning method, image generation method and program

Legal Events

Date Code Title Description

121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 19919778; Country of ref document: EP; Kind code of ref document: A1)

NENP Non-entry into the national phase (Ref country code: DE)

32PN Ep: public notification in the ep bulletin as address of the addressee cannot be established (Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 07.02.2022))

122 Ep: pct application non-entry in european phase (Ref document number: 19919778; Country of ref document: EP; Kind code of ref document: A1)