WO2020228405A1 - Image processing method, device and electronic equipment - Google Patents

Image processing method, device and electronic equipment

Info

Publication number
WO2020228405A1
WO2020228405A1 (PCT/CN2020/079192)
Authority
WO
WIPO (PCT)
Prior art keywords
image
layer
sampling
convolutional
layers
Prior art date
Application number
PCT/CN2020/079192
Other languages
English (en)
French (fr)
Inventor
李华夏
Original Assignee
北京字节跳动网络技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京字节跳动网络技术有限公司
Publication of WO2020228405A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/26 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V10/267 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds

Definitions

  • the present disclosure relates to the field of data processing technology, and in particular to an image processing method, device and electronic equipment.
  • image processing tasks can be completed by artificial intelligence.
  • neural networks have been widely applied in the field of computer image recognition, for example to recognize different people in an image, or to automatically recognize different objects on the road in autonomous driving. These tasks all constitute image semantic recognition.
  • image semantic recognition involves image semantic segmentation. Image semantic segmentation is generally modeled as a pixel-level multi-classification problem, whose goal is to assign each pixel of an image to one of multiple predefined categories.
  • the embodiments of the present disclosure provide an image processing method, device, and electronic device, which at least partially solve the problems in the prior art.
  • embodiments of the present disclosure provide an image processing method, including:
  • a segmentation network for performing image processing on the first image is set.
  • the segmentation network includes a plurality of convolutional layers and down-sampling layers.
  • the convolutional layers and the down-sampling layers are arranged alternately.
  • the convolutional layers perform feature extraction on the target object in the first image, and the down-sampling layers down-sample the images output by the convolutional layers;
  • after the second down-sampling layer in the segmentation network, multiple parallel convolutional layers with different sampling rates are set; the parallel convolutional layers are used to process the image output by the second down-sampling layer.
  • the image features extracted on each parallel convolutional layer are fused to form a second image;
  • by performing target recognition on the second image, a third image containing the target object is acquired.
  • the performing target recognition on the second image includes:
  • after the parallel convolutional layers, a third down-sampling layer is provided, and the third down-sampling layer performs a down-sampling operation on the second image.
  • the performing target recognition on the second image further includes:
  • after the third down-sampling layer, a plurality of up-sampling layers are set, and the up-sampling layers perform an up-sampling operation on the image output by the third down-sampling layer.
  • the performing target recognition on the second image further includes:
  • target recognition is performed on the image output by the upsampling layer.
  • the method further includes:
  • connecting convolutional layers that output feature images of the same size includes:
  • connecting the convolutional layers based on a residual function.
  • connecting the convolutional layers based on the residual function includes:
  • the image features extracted on each parallel convolutional layer are merged to form a second image, including:
  • different weight values are assigned to the multiple feature vector matrices, and the sum of the feature vector matrices weighted by these values is used as the representation matrix of the second image.
  • an image processing device including:
  • An obtaining module used to obtain the first image containing the target object
  • the setting module is configured to set a segmentation network for performing image processing on the first image.
  • the segmentation network includes a plurality of convolutional layers and downsampling layers.
  • the convolutional layers and the down-sampling layers are arranged alternately; the convolutional layers perform feature extraction on the target object in the first image, and the down-sampling layers perform a down-sampling operation on the images output by the convolutional layers;
  • the processing module is used to set multiple parallel convolutional layers with different sampling rates after the second down-sampling layer in the segmentation network.
  • the parallel convolutional layers are used to process the image output by the second down-sampling layer.
  • the image features extracted on each parallel convolutional layer are fused to form a second image;
  • the execution module is configured to obtain a third image containing the target object by performing target recognition on the second image.
  • an embodiment of the present disclosure also provides an electronic device, which includes:
  • At least one processor and,
  • a memory communicatively connected with the at least one processor; wherein,
  • the memory stores instructions that can be executed by the at least one processor, and the instructions are executed by the at least one processor, so that the at least one processor can execute the image processing method in the foregoing first aspect or any implementation of the first aspect.
  • embodiments of the present disclosure also provide a non-transitory computer-readable storage medium that stores computer instructions, the computer instructions being used to make a computer execute the image processing method in the foregoing first aspect or any implementation of the first aspect.
  • the embodiments of the present disclosure also provide a computer program product.
  • the computer program product includes a computer program stored on a non-transitory computer-readable storage medium.
  • the computer program includes program instructions which, when executed by a computer, cause the computer to execute the image processing method in the foregoing first aspect or any implementation manner of the first aspect.
  • the image processing solution in the embodiments of the present disclosure includes: acquiring a first image containing a target object; setting a segmentation network for performing image processing on the first image, the segmentation network including multiple convolutional layers and down-sampling layers, where the convolutional layers and the down-sampling layers are arranged alternately, the convolutional layers perform feature extraction on the target object in the first image, and the down-sampling layers perform a down-sampling operation on the images output by the convolutional layers; after the second down-sampling layer in the segmentation network, multiple parallel convolutional layers with different sampling rates are set.
  • the parallel convolutional layers are used to process the image output by the second down-sampling layer, and the image features extracted on each parallel convolutional layer are fused to form a second image; by performing target recognition on the second image, a third image containing the target object is obtained.
  • FIG. 1 is a schematic diagram of an image processing flow provided by an embodiment of the disclosure
  • FIG. 2 is a schematic diagram of a neural network model provided by an embodiment of the disclosure.
  • FIG. 3 is a schematic diagram of another image processing flow provided by an embodiment of the disclosure.
  • FIG. 4 is a schematic diagram of another image processing flow provided by an embodiment of the disclosure.
  • FIG. 5 is a schematic structural diagram of an image processing apparatus provided by an embodiment of the disclosure.
  • FIG. 6 is a schematic diagram of an electronic device provided by an embodiment of the disclosure.
  • the embodiment of the present disclosure provides an image processing method.
  • the image processing method provided in this embodiment can be executed by a computing device, and the computing device can be implemented as software, or as a combination of software and hardware, and the computing device can be integrated in a server, terminal device, etc.
  • an image processing method provided by an embodiment of the present disclosure includes the following steps:
  • S101 Acquire a first image containing a target object.
  • the target object is the content to be acquired by the solution of the present disclosure.
  • the target object may be a person with various actions, an animal with behavior characteristics, or a stationary object.
  • the target object is usually contained in a certain scene.
  • a photo containing a portrait of a person usually also contains a background.
  • the background may include trees, mountains, rivers, and other people.
  • if you want to extract the target object separately from the image, you need to identify and process the target object separately.
  • various behaviors of the target object can be analyzed.
  • the first image is an image that contains the target object.
  • the first image can be one of a series of pre-stored photos, a video frame extracted from a pre-saved video, or one or more frames extracted from a live video stream.
  • the first image may include multiple objects.
  • the photo used to describe the action of the person may include the target person, other people with the target person, trees, buildings, etc.
  • the target person constitutes the target object of the first image, and other people, trees, buildings, etc. together with the target person constitute the background image. Based on actual needs, one or more objects can be selected as target objects in the first image.
  • the target object can be obtained from a video file: the video captured of the target object contains multiple frames, and multiple images containing one or more continuous actions of the target object can be selected from the video frames to form an image set.
  • the first image containing the target object can be obtained.
  • the segmentation network includes a plurality of convolutional layers and downsampling layers.
  • the convolutional layers and the down-sampling layers are arranged alternately; the convolutional layers perform feature extraction on the target object in the first image, and the down-sampling layers perform a down-sampling operation on the images output by the convolutional layers.
  • the segmentation network includes a convolutional layer, a sampling layer and a fully connected layer.
  • the main parameters of the convolutional layer include the size of the convolution kernel and the number of input feature maps.
  • Each convolutional layer can contain several feature maps of the same size.
  • the feature values of the same layer share weights.
  • the convolution kernels within each layer have the same size.
  • the convolution layer performs convolution calculation on the input image and extracts the layout features of the input image.
  • the feature extraction layer of the convolutional layer can be connected to the sampling layer.
  • the sampling layer is used to find the local average of the input image and perform secondary feature extraction.
  • by connecting the sampling layer to the convolutional layer, the neural network model remains robust to the input image.
  • the sampling layer may include an up-sampling layer and a down-sampling layer.
  • the up-sampling layer adds pixel information in the image by interpolating the input image.
  • the down-sampling layer extracts features from the input image while reducing its resolution.
  • a pooling layer (not shown in the figure) can also be provided after the convolutional layer.
  • the pooling layer applies max pooling to the output of the convolutional layer, which better extracts the invariant features of the input image.
  • the fully connected layer integrates the features in the image feature maps that have passed through multiple convolutional layers and pooling layers, and obtains the classification features of the input image features for image classification.
  • the fully connected layer maps the feature map generated by the convolutional layer into a fixed-length feature vector.
  • the feature vector contains the combined information of all the features of the input image, and the feature vector retains the most characteristic image features in the image to complete the image classification task. In this way, the prediction map corresponding to the input image can be calculated, thereby determining the target object contained in the first image.
  • to increase the computation speed of the segmentation network, down-sampling layers are set in the segmentation network, alternating with the convolutional layers.
  • the convolutional layers perform feature extraction on the target object in the first image, and the down-sampling layers down-sample the images output by the convolutional layers.
  • this arrangement increases the speed at which the segmentation network processes the first image.
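  • For illustration only, a minimal sketch of such an interleaved stack is shown below; the framework (PyTorch), the layer counts, the channel widths and the use of max pooling as the down-sampling operation are assumptions of this sketch, not details fixed by the disclosure.

```python
import torch
import torch.nn as nn

# Hypothetical backbone: convolutional layers alternating with
# down-sampling layers, as described above.
backbone = nn.Sequential(
    nn.Conv2d(3, 32, kernel_size=3, padding=1),   # convolutional layer: feature extraction
    nn.ReLU(inplace=True),
    nn.MaxPool2d(2),                              # first down-sampling layer
    nn.Conv2d(32, 64, kernel_size=3, padding=1),  # convolutional layer: feature extraction
    nn.ReLU(inplace=True),
    nn.MaxPool2d(2),                              # second down-sampling layer
)

x = torch.randn(1, 3, 224, 224)  # a first image (batch of one)
features = backbone(x)           # quarter-resolution features fed to the parallel layers
print(features.shape)            # torch.Size([1, 64, 56, 56])
```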
  • a disadvantage of traditional neural networks is that they require fixed-size input images.
  • the images input into the neural network may already have been cropped or distorted.
  • cropped or distorted images suffer content loss, which reduces the accuracy with which the neural network recognizes the objects to be recognized in the input image.
  • when the size of the same target object varies across images, the recognition accuracy of the neural network for the target object is also reduced.
  • parallel convolutional layers are set in the segmentation network. Specifically, after the second down-sampling layer in the segmentation network, multiple parallel convolutional layers with different sampling rates are set; the parallel convolutional layers are used to process the image output by the second down-sampling layer, and the image features extracted on each parallel convolutional layer are fused to form a second image.
  • the input image or the target object in the input image can be any aspect ratio or any size.
  • the segmentation network can extract features at different scales.
  • for example, the parallel convolutional layers can use 4×4, 2×2 and 1×1 convolution kernels to compute features on the input image separately, yielding 3 independently processed images; fusing these 3 images forms the second image (see the sketch below). Since the formation of the second image is unaffected by the size or aspect ratio of the input image, the robustness of the segmentation network is further improved.
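  • As a hedged sketch of the parallel branches described above (the PyTorch framing, the learnable fusion weights and the bilinear resizing of branch outputs to a common size are assumptions of this sketch, not statements about the disclosed network):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ParallelConvFusion(nn.Module):
    """Three parallel convolutional branches (4x4, 2x2 and 1x1 kernels,
    per the example above) whose outputs are fused into a 'second image'."""
    def __init__(self, channels=64):
        super().__init__()
        self.branch4 = nn.Conv2d(channels, channels, kernel_size=4)
        self.branch2 = nn.Conv2d(channels, channels, kernel_size=2)
        self.branch1 = nn.Conv2d(channels, channels, kernel_size=1)
        # one learnable fusion weight per branch (trained with the network)
        self.weights = nn.Parameter(torch.ones(3) / 3)

    def forward(self, x):
        h, w = x.shape[-2:]
        outs = [self.branch4(x), self.branch2(x), self.branch1(x)]
        # bring the three independently processed images to a common size
        outs = [F.interpolate(o, size=(h, w), mode="bilinear",
                              align_corners=False) for o in outs]
        # fuse: weighted sum of the per-branch feature maps
        return sum(wgt * o for wgt, o in zip(self.weights, outs))

fusion = ParallelConvFusion(64)
second_image = fusion(torch.randn(1, 64, 56, 56))
print(second_image.shape)  # torch.Size([1, 64, 56, 56])
```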
  • each embodiment is not limited to the detection of objects of a specific size, shape, or type, nor is it limited to the detection of images of a specific size, type, or content.
  • the system for image processing using parallel convolutional layer pooling according to various embodiments can be applied to images of any size, type, or content.
  • while the parallel convolutional layers improve data robustness, they also increase the computational burden of the system. For this reason, the parallel convolutional layers are set after the second down-sampling layer in the segmentation network: at that point the image output by the second down-sampling layer still has enough features to satisfy the parallel sampling layers, while the first image, having been processed by two sampling layers, requires far less computation. This satisfies the robustness of the parallel convolutional layers while reducing their computational cost.
  • S104 Acquire a third image containing the target object by performing target recognition on the second image.
  • the size of the second image can be adjusted; for example, a constraint min(a, b) = c can be constructed, where a is the width of the second image, b is its height, and c is a predefined scale (for example, 256), and feature maps can be extracted from the entire image.
  • taking 3 parallel convolutional layers (1×1, 3×3 and 6×6, for a total of 46 feature vectors) as an example,
  • these 3 parallel convolutional layers can be used to pool features for each candidate window.
  • an 11776-dimensional (256×46) representation is generated for each window. These representations can be provided to the fully connected layer of the segmentation network, and the fully connected layer performs target recognition based on them.
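  • The 11776-dimensional figure above can be reproduced with a short sketch; the choice of adaptive max pooling is an assumption, as the disclosure does not fix the pooling operator.

```python
import torch
import torch.nn.functional as F

# Pooling over 1x1, 3x3 and 6x6 grids gives 1 + 9 + 36 = 46 bins;
# with 256 channels this yields the 256 x 46 = 11776-dimensional
# per-window representation mentioned above.
def pooled_representation(feature_map, grids=(1, 3, 6)):
    parts = [F.adaptive_max_pool2d(feature_map, g).flatten(1) for g in grids]
    return torch.cat(parts, dim=1)

window = torch.randn(1, 256, 24, 24)        # features of one candidate window
print(pooled_representation(window).shape)  # torch.Size([1, 11776])
```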
  • after the parallel convolutional layers, a third down-sampling layer can be set, where the third down-sampling layer performs a down-sampling operation on the second image.
  • for scenarios using high-speed computing devices such as GPUs, the feature information contained in the image can be increased by raising the image's pixel count.
  • in that case, multiple (for example, 3) up-sampling layers can be set after the third down-sampling layer, where the up-sampling layers perform an up-sampling operation on the image output by the third down-sampling layer.
  • the performing target recognition on the second image may include:
  • taking x1, x2 and x3 as the outputs of the parallel convolutional layers, the outputs a1, a2 and a3 of the fully connected layer can be expressed by the formula (a1, a2, a3)ᵀ = W · (x1, x2, x3)ᵀ + b, where W is the weight matrix and b is the bias vector.
  • the weight matrix contains different weight values, which are obtained by training the segmentation network.
  • the bias vector contains different bias values, which can be obtained by training the segmentation network.
  • S303 Perform target recognition on the image output by the upsampling layer based on the weight value and the bias value.
  • the target object contained in the second image can be quickly recognized.
  • the process of constructing a segmentation network may further include the following steps:
  • depending on requirements, multiple convolutional layers can be set in the segmentation network.
  • by configuring different convolution kernels for different convolutional layers, the images to be processed can be handled accordingly.
  • depending on the convolution kernels and input images, the feature images output by different convolutional layers will differ in size.
  • the output size of each convolutional layer can be obtained by computing over the input parameters and convolution kernels of all the convolutional layers.
  • shallow features carry more image detail, and deep features carry more semantics.
  • for convolutional layers that produce outputs of the same size, connections are added between the layers, thereby reducing jagged edges in the image.
  • in the process of implementing step S403, according to a specific implementation manner of the embodiment of the present disclosure, the following steps may also be included:
  • a mapping function W(xi) can be set for the i-th convolutional layer, along with the input xi of the i-th convolutional layer and the output F(xi) of the i-th convolutional layer; then F(xi) + W(xi) is used as the input of the (i+2)-th convolutional layer. In this way, the convolutional layers are connected, as sketched below.
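  • A minimal sketch of this connection pattern follows; implementing the mapping function W(xi) as a 1x1 convolution is an assumption made for illustration only.

```python
import torch
import torch.nn as nn

class ResidualPair(nn.Module):
    """Connects convolutional layers of equal output size as described
    above: F(xi) + W(xi) becomes the input of the (i+2)-th layer."""
    def __init__(self, channels=64):
        super().__init__()
        self.conv_i = nn.Conv2d(channels, channels, 3, padding=1)   # layer i
        self.conv_i1 = nn.Conv2d(channels, channels, 3, padding=1)  # layer i+1
        self.mapping = nn.Conv2d(channels, channels, 1)             # W(.), assumed 1x1 conv

    def forward(self, xi):
        fxi = self.conv_i1(torch.relu(self.conv_i(xi)))  # F(xi)
        return fxi + self.mapping(xi)                    # input of layer i+2

block = ResidualPair(64)
print(block(torch.randn(1, 64, 56, 56)).shape)  # torch.Size([1, 64, 56, 56])
```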
  • a convolution kernel of the same size can be set in multiple parallel convolution layers.
  • based on the convolution kernels, feature extraction is performed on the images input to the multiple parallel convolutional layers, forming multiple feature vector matrices.
  • different weight values are assigned to the multiple feature vector matrices, and the sum of the feature vector matrices weighted by these values is used as the representation matrix of the second image, finally forming the second image.
  • an embodiment of the present disclosure also discloses an image processing device 50, including:
  • the acquiring module 501 is configured to acquire the first image containing the target object.
  • the target object is the content to be acquired by the solution of the present disclosure.
  • the target object may be a person with various actions, an animal with behavior characteristics, or a stationary object.
  • the target object is usually contained in a certain scene.
  • a photo containing a portrait of a person usually also contains a background.
  • the background may include trees, mountains, rivers, and other people.
  • if you want to extract the target object separately from the image, you need to identify and process the target object separately.
  • various behaviors of the target object can be analyzed.
  • the first image is an image that contains the target object.
  • the first image can be one of a series of pre-stored photos, a video frame extracted from a pre-saved video, or one or more frames extracted from a live video stream.
  • the first image may include multiple objects.
  • the photo used to describe the action of the person may include the target person, other people with the target person, trees, buildings, etc.
  • the target person constitutes the target object of the first image, and other people, trees, buildings, etc. together with the target person constitute the background image. Based on actual needs, one or more objects can be selected as target objects in the first image.
  • the target object can be obtained from a video file: the video captured of the target object contains multiple frames, and multiple images containing one or more continuous actions of the target object can be selected from the video frames to form an image set.
  • the first image containing the target object can be obtained.
  • the setting module 502 is configured to set a segmentation network for performing image processing on the first image.
  • the segmentation network includes multiple convolutional layers and down-sampling layers.
  • the convolutional layers and the down-sampling layers are arranged alternately.
  • the convolutional layers perform feature extraction on the target object in the first image.
  • the down-sampling layers perform a down-sampling operation on the images output by the convolutional layers.
  • the segmentation network includes a convolutional layer, a sampling layer and a fully connected layer.
  • the main parameters of the convolutional layer include the size of the convolution kernel and the number of input feature maps.
  • Each convolutional layer can contain several feature maps of the same size.
  • the feature values of the same layer share weights.
  • the convolution kernels within each layer have the same size.
  • the convolution layer performs convolution calculation on the input image and extracts the layout features of the input image.
  • the feature extraction layer of the convolutional layer can be connected to the sampling layer.
  • the sampling layer is used to find the local average value of the input image and perform secondary feature extraction.
  • by connecting the sampling layer to the convolutional layer, the neural network model remains robust to the input image.
  • the sampling layer may include an up-sampling layer and a down-sampling layer.
  • the up-sampling layer adds pixel information in the image by interpolating the input image.
  • the down-sampling layer extracts features from the input image while reducing its resolution.
  • a pooling layer (not shown in the figure) can also be provided after the convolutional layer.
  • the pooling layer applies max pooling to the output of the convolutional layer, which better extracts the invariant features of the input image.
  • the fully connected layer integrates the features in the image feature maps that have passed through multiple convolutional layers and pooling layers, and obtains the classification features of the input image features for image classification.
  • the fully connected layer maps the feature map generated by the convolutional layer into a fixed-length feature vector.
  • the feature vector contains the combined information of all the features of the input image, and the feature vector retains the most characteristic image features in the image to complete the image classification task. In this way, the prediction map corresponding to the input image can be calculated, thereby determining the target object contained in the first image.
  • to increase the computation speed of the segmentation network, down-sampling layers are set in the segmentation network, alternating with the convolutional layers.
  • the convolutional layers perform feature extraction on the target object in the first image, and the down-sampling layers down-sample the images output by the convolutional layers.
  • this arrangement increases the speed at which the segmentation network processes the first image.
  • the processing module 503 is configured to set multiple parallel convolutional layers with different sampling rates after the second downsampling layer in the segmentation network, and the parallel convolutional layers are used to process the image output by the second downsampling layer, The image features extracted on each parallel convolutional layer are merged to form a second image.
  • a disadvantage of traditional neural networks is that they require fixed-size input images.
  • the images input into the neural network may already have been cropped or distorted.
  • cropped or distorted images suffer content loss, which reduces the accuracy with which the neural network recognizes the objects to be recognized in the input image.
  • when the size of the same target object varies across images, the recognition accuracy of the neural network for the target object is also reduced.
  • parallel convolutional layers are set in the segmentation network. Specifically, after the second down-sampling layer in the segmentation network, multiple parallel convolutional layers with different sampling rates are set; the parallel convolutional layers are used to process the image output by the second down-sampling layer, and the image features extracted on each parallel convolutional layer are fused to form a second image.
  • the input image or the target object in the input image can have any aspect ratio or any size.
  • the segmentation network can extract features at different scales.
  • for example, the parallel convolutional layers can use 4×4, 2×2 and 1×1 convolution kernels to compute features on the input image separately, yielding 3 independently processed images; fusing these 3 images forms the second image. Since the formation of the second image is unaffected by the size or aspect ratio of the input image, the robustness of the segmentation network is further improved.
  • each embodiment is not limited to the detection of objects of a specific size, shape, or type, nor is it limited to the detection of images of a specific size, type, or content.
  • the system for image processing using parallel convolutional layer pooling according to various embodiments can be applied to images of any size, type, or content.
  • while the parallel convolutional layers improve data robustness, they also increase the computational burden of the system. For this reason, the parallel convolutional layers are set after the second down-sampling layer in the segmentation network: at that point the image output by the second down-sampling layer still has enough features to satisfy the parallel sampling layers, while the first image, having been processed by two sampling layers, requires far less computation. This satisfies the robustness of the parallel convolutional layers while reducing their computational cost.
  • the execution module 504 is configured to obtain a third image containing the target object by performing target recognition on the second image.
  • the size of the second image can be adjusted; for example, a constraint min(a, b) = c can be constructed, where a is the width of the second image, b is its height, and c is a predefined scale (for example, 256), and feature maps can be extracted from the entire image.
  • taking 3 parallel convolutional layers (1×1, 3×3 and 6×6, for a total of 46 feature vectors) as an example,
  • these 3 parallel convolutional layers can be used to pool features for each candidate window.
  • an 11776-dimensional (256×46) representation is generated for each window. These representations can be provided to the fully connected layer of the segmentation network, and the fully connected layer performs target recognition based on them.
  • the device shown in FIG. 5 can correspondingly execute the content of the foregoing method embodiment; for parts of this embodiment not described in detail, refer to the foregoing method embodiment, which is not repeated here.
  • an electronic device 60 which includes:
  • At least one processor and,
  • a memory communicatively connected with the at least one processor; wherein,
  • the memory stores an instruction executable by the at least one processor, and the instruction is executed by the at least one processor, so that the at least one processor can execute the image processing method in the foregoing method embodiment.
  • the embodiments of the present disclosure also provide a non-transitory computer-readable storage medium that stores computer instructions, the computer instructions being used to make the computer execute the image processing method in the foregoing method embodiments.
  • the embodiments of the present disclosure also provide a computer program product, the computer program product including a computer program stored on a non-transitory computer-readable storage medium, the computer program including program instructions which, when executed by a computer, cause the computer to execute the image processing method in the foregoing method embodiment.
  • Fig. 6 shows a schematic structural diagram of an electronic device 60 suitable for implementing embodiments of the present disclosure.
  • Electronic devices in the embodiments of the present disclosure may include, but are not limited to, mobile terminals such as mobile phones, notebook computers, digital broadcast receivers, PDAs (personal digital assistants), PADs (tablet computers), PMPs (portable multimedia players) and vehicle-mounted terminals (for example, car navigation terminals), as well as fixed terminals such as digital TVs and desktop computers.
  • the electronic device shown in FIG. 6 is only an example, and should not bring any limitation to the function and scope of use of the embodiments of the present disclosure.
  • the electronic device 60 may include a processing device (such as a central processing unit or a graphics processor) 601, which can perform various appropriate actions and processing according to a program stored in a read-only memory (ROM) 602 or a program loaded from a storage device 608 into a random access memory (RAM) 603.
  • the RAM 603 also stores various programs and data required for the operation of the electronic device 60.
  • the processing device 601, the ROM 602, and the RAM 603 are connected to each other through a bus 604.
  • An input/output (I/O) interface 605 is also connected to the bus 604.
  • the following devices can be connected to the I/O interface 605: input devices 606 such as a touch screen, touch pad, keyboard, mouse, image sensor, microphone, accelerometer and gyroscope; output devices 607 such as a liquid crystal display (LCD), speakers and vibrators; storage devices 608 such as magnetic tapes and hard disks; and a communication device 609.
  • the communication device 609 may allow the electronic device 60 to perform wireless or wired communication with other devices to exchange data.
  • although the figure shows the electronic device 60 with various components, it should be understood that it is not required to implement or possess all of them; more or fewer components may alternatively be implemented or provided.
  • the process described above with reference to the flowchart can be implemented as a computer software program.
  • the embodiments of the present disclosure include a computer program product, which includes a computer program carried on a computer-readable medium, and the computer program contains program code for executing the method shown in the flowchart.
  • the computer program may be downloaded and installed from the network through the communication device 609, or installed from the storage device 608, or installed from the ROM 602.
  • when the computer program is executed by the processing device 601, the above-mentioned functions defined in the method of the embodiments of the present disclosure are executed.
  • the above-mentioned computer-readable medium in the present disclosure may be a computer-readable signal medium or a computer-readable storage medium or any combination of the two.
  • the computer-readable storage medium may be, for example, but not limited to, an electric, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples of computer-readable storage media may include, but are not limited to: an electrical connection with one or more wires, a portable computer disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disk read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above.
  • a computer-readable storage medium may be any tangible medium that contains or stores a program, and the program may be used by or in combination with an instruction execution system, apparatus, or device.
  • a computer-readable signal medium may include a data signal propagated in a baseband or as a part of a carrier wave, and a computer-readable program code is carried therein. This propagated data signal can take many forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the foregoing.
  • the computer-readable signal medium may also be any computer-readable medium other than the computer-readable storage medium.
  • the computer-readable signal medium may send, propagate, or transmit the program for use by or in combination with the instruction execution system, apparatus, or device.
  • the program code contained on the computer-readable medium can be transmitted by any suitable medium, including but not limited to: wire, optical cable, RF (Radio Frequency), etc., or any suitable combination of the above.
  • the above-mentioned computer-readable medium may be included in the above-mentioned electronic device; or it may exist alone without being assembled into the electronic device.
  • the above-mentioned computer-readable medium carries one or more programs, and when the above-mentioned one or more programs are executed by the electronic device, the electronic device: obtains at least two Internet protocol addresses; and sends to the node evaluation device including the at least two A node evaluation request for an Internet Protocol address, wherein the node evaluation device selects an Internet Protocol address from the at least two Internet Protocol addresses and returns it; receives the Internet Protocol address returned by the node evaluation device; wherein, the obtained The Internet Protocol address indicates the edge node in the content distribution network.
  • the aforementioned computer-readable medium carries one or more programs, and when the aforementioned one or more programs are executed by the electronic device, the electronic device: receives a node evaluation request including at least two Internet Protocol addresses; Among the at least two Internet Protocol addresses, select an Internet Protocol address; return the selected Internet Protocol address; wherein, the received Internet Protocol address indicates an edge node in the content distribution network.
  • the computer program code used to perform the operations of the present disclosure may be written in one or more programming languages or a combination thereof.
  • such programming languages include object-oriented programming languages such as Java, Smalltalk and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages.
  • the program code can be executed entirely on the user's computer, partly on the user's computer, executed as an independent software package, partly on the user's computer and partly executed on a remote computer, or entirely executed on the remote computer or server.
  • the remote computer can be connected to the user's computer through any kind of network including a local area network (LAN) or a wide area network (WAN), or it can be connected to an external computer (for example, using an Internet service provider to pass Internet connection).
  • each block in the flowchart or block diagram can represent a module, program segment, or part of code, and the module, program segment, or part of code contains one or more executable instructions for realizing the specified logical function.
  • the functions marked in the block may also occur in a different order from the order marked in the drawings. For example, two blocks shown in succession can actually be executed substantially in parallel, and they can sometimes be executed in the reverse order, depending on the functions involved.
  • each block in the block diagram and/or flowchart, and combinations of blocks in the block diagram and/or flowchart, can be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
  • the units involved in the embodiments described in the present disclosure can be implemented in software or hardware. Wherein, the name of the unit does not constitute a limitation on the unit itself under certain circumstances.
  • the first obtaining unit can also be described as "a unit for obtaining at least two Internet Protocol addresses".

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

An image processing method, device and electronic equipment, belonging to the field of data processing technology. The method includes: acquiring a first image containing a target object; setting a segmentation network for performing image processing on the first image; after the second down-sampling layer in the segmentation network, setting multiple parallel convolutional layers with different sampling rates, the parallel convolutional layers being used to process the image output by the second down-sampling layer, with the image features extracted on each parallel convolutional layer fused to form a second image; and obtaining a third image containing the target object by performing target recognition on the second image. This solution can improve the accuracy of target recognition.

Description

Image processing method, device and electronic equipment
Cross-reference to related applications
This application claims priority to Chinese patent application No. 201910403859.X, filed on May 15, 2019 and entitled "Image processing method, device and electronic equipment", the entire contents of which are incorporated herein by reference.
Technical Field
The present disclosure relates to the field of data processing technology, and in particular to an image processing method, device and electronic equipment.
Background
With the development of artificial intelligence technology, more and more image processing work can be completed by means of artificial intelligence. Neural networks, as one means of implementing artificial intelligence, have been widely applied in the field of computer image recognition, for example to recognize different people in an image, or to automatically recognize different objects on the road in autonomous driving. These tasks all constitute image semantic recognition. Image semantic recognition involves image semantic segmentation, which is generally modeled as a pixel-level multi-classification problem whose goal is to assign each pixel of an image to one of multiple predefined categories.
Most existing image semantic segmentation methods are based on encoder-decoder convolutional neural networks. Although such a network structure can achieve good segmentation results, once an encoder-decoder structure is adopted, the spatial resolution of the feature maps is significantly reduced during encoding; even though the original resolution is restored during up-sampling, spatial detail is inevitably lost, which reduces the accuracy of target recognition.
Summary
In view of this, the embodiments of the present disclosure provide an image processing method, device and electronic equipment, which at least partially solve the problems existing in the prior art.
In a first aspect, an embodiment of the present disclosure provides an image processing method, including:
acquiring a first image containing a target object;
setting a segmentation network for performing image processing on the first image, the segmentation network including multiple convolutional layers and down-sampling layers, the convolutional layers and the down-sampling layers being arranged alternately, the convolutional layers performing feature extraction on the target object in the first image, and the down-sampling layers performing a down-sampling operation on the images output by the convolutional layers;
after the second down-sampling layer in the segmentation network, setting multiple parallel convolutional layers with different sampling rates, the parallel convolutional layers being used to process the image output by the second down-sampling layer, with the image features extracted on each parallel convolutional layer fused to form a second image;
obtaining a third image containing the target object by performing target recognition on the second image.
According to a specific implementation of the embodiment of the present disclosure, performing target recognition on the second image includes:
after the parallel convolutional layers, setting a third down-sampling layer, the third down-sampling layer performing a down-sampling operation on the second image.
According to a specific implementation of the embodiment of the present disclosure, performing target recognition on the second image further includes:
after the third down-sampling layer, setting multiple up-sampling layers, the up-sampling layers performing an up-sampling operation on the image output by the third down-sampling layer.
According to a specific implementation of the embodiment of the present disclosure, performing target recognition on the second image further includes:
setting a fully connected layer in the segmentation network;
in the fully connected layer, setting different weight values for the images output by different nodes of the parallel convolutional layers, as well as bias values for all nodes of the sampling layer;
performing target recognition on the image output by the up-sampling layers based on the weight values and the bias values.
According to a specific implementation of the embodiment of the present disclosure, the method further includes:
obtaining all the convolutional layers in the segmentation network;
obtaining the image size of the feature image output by each of the convolutional layers;
connecting convolutional layers that output the same image size.
According to a specific implementation of the embodiment of the present disclosure, connecting convolutional layers that output the same image size includes:
among N convolutional layers x that output the same image size, obtaining the input xi and the output H(xi) of the i-th convolutional layer;
based on xi and H(xi), constructing the residual function F(xi) = H(xi) - xi of the i-th convolutional layer;
connecting the convolutional layers based on the residual function.
According to a specific implementation of the embodiment of the present disclosure, connecting the convolutional layers based on the residual function includes:
setting a mapping function W(xi) for the i-th convolutional layer;
obtaining the input xi of the i-th convolutional layer and the output F(xi) of the i-th convolutional layer;
using F(xi) + W(xi) as the input of the (i+2)-th convolutional layer.
According to a specific implementation of the embodiment of the present disclosure, fusing the image features extracted on each parallel convolutional layer to form the second image includes:
setting convolution kernels of the same size in the multiple parallel convolutional layers;
based on the convolution kernels, performing feature extraction on the images input to the multiple parallel convolutional layers to form multiple feature vector matrices;
assigning different weight values to the multiple feature vector matrices, and using the sum of the feature vector matrices weighted by the different weight values as the representation matrix of the second image.
In a second aspect, an embodiment of the present disclosure discloses an image processing device, including:
an acquiring module, configured to acquire a first image containing a target object;
a setting module, configured to set a segmentation network for performing image processing on the first image, the segmentation network including multiple convolutional layers and down-sampling layers, the convolutional layers and the down-sampling layers being arranged alternately, the convolutional layers performing feature extraction on the target object in the first image, and the down-sampling layers performing a down-sampling operation on the images output by the convolutional layers;
a processing module, configured to set, after the second down-sampling layer in the segmentation network, multiple parallel convolutional layers with different sampling rates, the parallel convolutional layers being used to process the image output by the second down-sampling layer, with the image features extracted on each parallel convolutional layer fused to form a second image;
an execution module, configured to obtain a third image containing the target object by performing target recognition on the second image.
In a third aspect, an embodiment of the present disclosure further provides an electronic device, including:
at least one processor; and
a memory communicatively connected with the at least one processor; wherein
the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor so that the at least one processor can execute the image processing method in the foregoing first aspect or any implementation of the first aspect.
In a fourth aspect, an embodiment of the present disclosure further provides a non-transitory computer-readable storage medium storing computer instructions, the computer instructions being used to cause a computer to execute the image processing method in the foregoing first aspect or any implementation of the first aspect.
In a fifth aspect, an embodiment of the present disclosure further provides a computer program product, the computer program product including a computer program stored on a non-transitory computer-readable storage medium, the computer program including program instructions which, when executed by a computer, cause the computer to execute the image processing method in the foregoing first aspect or any implementation of the first aspect.
The image processing solution in the embodiments of the present disclosure includes: acquiring a first image containing a target object; setting a segmentation network for performing image processing on the first image, the segmentation network including multiple convolutional layers and down-sampling layers, the convolutional layers and the down-sampling layers being arranged alternately, the convolutional layers performing feature extraction on the target object in the first image, and the down-sampling layers performing a down-sampling operation on the images output by the convolutional layers; after the second down-sampling layer in the segmentation network, setting multiple parallel convolutional layers with different sampling rates, the parallel convolutional layers being used to process the image output by the second down-sampling layer, with the image features extracted on each parallel convolutional layer fused to form a second image; and obtaining a third image containing the target object by performing target recognition on the second image. Through the solution of the present disclosure, the accuracy of target recognition is improved.
Brief Description of the Drawings
To explain the technical solutions of the embodiments of the present disclosure more clearly, the drawings needed in the embodiments are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present disclosure; for those of ordinary skill in the art, other drawings can be obtained from these drawings without creative work.
FIG. 1 is a schematic diagram of an image processing flow provided by an embodiment of the present disclosure;
FIG. 2 is a schematic diagram of a neural network model provided by an embodiment of the present disclosure;
FIG. 3 is a schematic diagram of another image processing flow provided by an embodiment of the present disclosure;
FIG. 4 is a schematic diagram of another image processing flow provided by an embodiment of the present disclosure;
FIG. 5 is a schematic structural diagram of an image processing device provided by an embodiment of the present disclosure;
FIG. 6 is a schematic diagram of an electronic device provided by an embodiment of the present disclosure.
Detailed Description
The embodiments of the present disclosure are described in detail below with reference to the drawings.
The implementations of the present disclosure are illustrated below by specific examples, and those skilled in the art can easily understand other advantages and effects of the present disclosure from the contents disclosed in this specification. Obviously, the described embodiments are only some of the embodiments of the present disclosure, not all of them. The present disclosure can also be implemented or applied through other different specific implementations, and the details in this specification can be modified or changed in various ways based on different viewpoints and applications without departing from the spirit of the present disclosure. It should be noted that, in the absence of conflict, the following embodiments and the features therein can be combined with each other. All other embodiments obtained by those of ordinary skill in the art based on the embodiments in the present disclosure without creative work fall within the protection scope of the present disclosure.
It should be noted that various aspects of the embodiments within the scope of the appended claims are described below. It should be apparent that the aspects described herein can be embodied in a wide variety of forms, and any specific structure and/or function described herein is merely illustrative. Based on the present disclosure, those skilled in the art should understand that one aspect described herein can be implemented independently of any other aspect, and two or more of these aspects can be combined in various ways. For example, a device can be implemented and/or a method can be practiced using any number of the aspects set forth herein. In addition, such a device can be implemented and/or such a method can be practiced using other structures and/or functionality in addition to one or more of the aspects set forth herein.
It should also be noted that the illustrations provided in the following embodiments only explain the basic concept of the present disclosure in a schematic way; the drawings show only the components related to the present disclosure rather than being drawn according to the number, shape and size of the components in actual implementation. In actual implementation, the type, quantity and proportion of each component can be changed arbitrarily, and the component layout may also be more complex.
In addition, in the following description, specific details are provided to facilitate a thorough understanding of the examples. However, those skilled in the art will understand that the aspects can be practiced without these specific details.
An embodiment of the present disclosure provides an image processing method. The image processing method provided in this embodiment can be executed by a computing device; the computing device can be implemented as software, or as a combination of software and hardware, and can be integrated in a server, a terminal device, or the like.
Referring to FIG. 1, an image processing method provided by an embodiment of the present disclosure includes the following steps:
S101: Acquire a first image containing a target object.
The target object is the content to be acquired by the solution of the present disclosure. As an example, the target object may be a person performing various actions, an animal with distinctive behaviors, or a stationary object.
The target object is usually contained in a certain scene. For example, a photo containing a portrait usually also contains a background, which may include trees, mountains, rivers and other people. If the target object is to be extracted separately from the image, it needs to be identified and processed separately. Based on the extracted target object, its various behaviors can be analyzed.
The first image is an image containing the target object. The first image may be one of a series of pre-stored photos, a video frame extracted from a pre-saved video, or one or more frames extracted from a live video stream. The first image may contain multiple objects; for example, a photo describing a person's actions may contain the target person, other people accompanying the target person, trees, buildings and so on. The target person constitutes the target object of the first image, while the other people, trees, buildings, etc. together constitute the background image. Based on actual needs, one or more objects can be selected as target objects in the first image.
As an example, the target object can be obtained from a video file: the video captured of the target object contains multiple frames, and multiple images containing one or more continuous actions of the target object can be selected from the video frames to form an image set. By selecting from the images in the image set, the first image containing the target object can be obtained.
S102: Set a segmentation network for performing image processing on the first image, the segmentation network including multiple convolutional layers and down-sampling layers, the convolutional layers and the down-sampling layers being arranged alternately, the convolutional layers performing feature extraction on the target object in the first image, and the down-sampling layers performing a down-sampling operation on the images output by the convolutional layers.
To perform image processing on the first image, a segmentation network based on a neural network model is constructed. Referring to FIG. 2, the segmentation network includes convolutional layers, sampling layers and a fully connected layer.
The main parameters of a convolutional layer include the size of the convolution kernel and the number of input feature maps. Each convolutional layer can contain several feature maps of the same size; the feature values of the same layer share weights, and the convolution kernels within each layer have the same size. The convolutional layer performs convolution on the input image and extracts its layout features.
The feature extraction layer of each convolutional layer can be connected to a sampling layer, which computes local averages of the input image and performs secondary feature extraction. Connecting sampling layers to convolutional layers keeps the neural network model robust to the input image.
The sampling layers may include up-sampling layers and down-sampling layers. An up-sampling layer adds pixel information to the image by interpolating the input image; a down-sampling layer extracts features from the input image while reducing its resolution.
To speed up the training of the segmentation network, a pooling layer (not shown in the figure) can also be set after a convolutional layer. The pooling layer applies max pooling to the output of the convolutional layer, which better extracts the invariant features of the input image.
The fully connected layer integrates the features in the image feature maps that have passed through multiple convolutional and pooling layers, obtaining the classification features of the input image for image classification. In the neural network model of the segmentation network, the fully connected layer maps the feature maps produced by the convolutional layers into a fixed-length feature vector. This feature vector contains the combined information of all features of the input image and retains the most characteristic image features to complete the image classification task. In this way, the prediction map corresponding to the input image can be computed, thereby determining the target object contained in the first image.
To increase the computation speed of the segmentation network, down-sampling layers are set in the segmentation network, alternating with the convolutional layers; the convolutional layers perform feature extraction on the target object in the first image, and the down-sampling layers down-sample the images output by the convolutional layers. This arrangement increases the speed at which the segmentation network processes the first image.
S103: After the second down-sampling layer in the segmentation network, set multiple parallel convolutional layers with different sampling rates, the parallel convolutional layers being used to process the image output by the second down-sampling layer, with the image features extracted on each parallel convolutional layer fused to form a second image.
The disadvantage of traditional neural networks is that they require fixed-size input images. In practice, because of differences in image processing, images fed into the neural network may already have been cropped or distorted; such images suffer content loss, which reduces the accuracy with which the network recognizes the objects to be recognized in the input image. In addition, when the size of the same target object varies across images, the recognition accuracy of the neural network for the target object also drops.
To further improve the adaptability of the segmentation network to the first image, referring to FIG. 2, parallel convolutional layers are set in the segmentation network. Specifically, after the second down-sampling layer in the segmentation network, multiple parallel convolutional layers with different sampling rates are set; the parallel convolutional layers process the image output by the second down-sampling layer, and the image features extracted on each parallel convolutional layer are fused to form a second image.
With parallel convolutional layers, the input image, or the target object in it, can have any aspect ratio or size. When input images come at different scales, the segmentation network can extract features at different scales. For example, the parallel convolutional layers can use 4×4, 2×2 and 1×1 convolution kernels to compute features on the input image separately, yielding 3 independently processed images; fusing these 3 images forms the second image. Since the formation of the second image is unaffected by the size or aspect ratio of the input image, the robustness of the segmentation network is further improved.
With this implementation, the embodiments are not limited to detecting objects of a specific size, shape or type, nor to detecting images of a specific size, type or content. A system for image processing using parallel convolutional layer pooling according to the embodiments can operate on images of any size, type or content.
While the parallel convolutional layers improve data robustness, they also increase the computational burden of the system. For this reason, the parallel convolutional layers are placed after the second down-sampling layer in the segmentation network: at that point the image output by the second down-sampling layer still has enough features to satisfy the parallel sampling layers, while the first image, having been processed by two sampling layers, requires far less computation. This satisfies the robustness of the parallel convolutional layers while reducing their computational cost. The reason is that if the parallel sampling layers were instead placed after a third sampling layer, the first image would lose too many features after three sampling layers, leaving the parallel convolutional layers with insufficient features and degrading their recognition of the target object.
S104: Obtain a third image containing the target object by performing target recognition on the second image.
The size of the second image can be adjusted. For example, a constraint min(a, b) = c can be constructed, where a is the width of the second image, b is its height, and c is a predefined scale (for example, 256), and feature maps can be extracted from the entire image. For example, taking 3 parallel convolutional layers (1×1, 3×3 and 6×6, for a total of 46 feature vectors) as an example, these 3 parallel convolutional layers can be used to pool features for each candidate window, generating an 11776-dimensional (256×46) representation per window. These representations can be provided to the fully connected layer of the segmentation network, which performs target recognition based on them. The recognized target object is saved as a separate image, forming the third image.
To further improve the processing efficiency of the segmentation network, according to a specific implementation of the embodiment of the present disclosure, in the course of performing target recognition on the second image, a third down-sampling layer can be set after the parallel convolutional layers; the third down-sampling layer performs a down-sampling operation on the second image. Setting the third down-sampling layer further reduces the pixel count of the second image and the computation required of the segmentation network.
For scenarios using high-speed computing devices such as GPUs, the feature information contained in the image can be increased by raising the image's pixel count. In this case, multiple (for example, 3) up-sampling layers can be set after the third down-sampling layer; the up-sampling layers perform an up-sampling operation on the image output by the third down-sampling layer. Setting multiple up-sampling layers adds more image detail to the second image through interpolation and similar methods.
Referring to FIG. 3, according to a specific implementation of the embodiment of the present disclosure, performing target recognition on the second image may include:
S301: Set a fully connected layer in the segmentation network.
S302: In the fully connected layer, set different weight values for the images output by different nodes of the parallel convolutional layers, as well as bias values for all nodes of the sampling layer.
Taking x1, x2 and x3 as the outputs of the parallel convolutional layers, the outputs a1, a2 and a3 of the fully connected layer can be expressed by the following formula:

(a1, a2, a3)ᵀ = W · (x1, x2, x3)ᵀ + b

where W and b are the weight matrix and the bias vector, respectively. The weight matrix contains different weight values, which are obtained by training the segmentation network; the bias vector contains different bias values, which can likewise be obtained by training the segmentation network.
S303: Perform target recognition on the image output by the up-sampling layers based on the weight values and the bias values.
Through steps S301-S303, the target object contained in the second image can be recognized quickly.
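A minimal sketch of the fully connected computation above, a = Wx + b, follows; the 3×3 shape of W is illustrative, and in practice W and b would be obtained by training the segmentation network:

```python
import torch

# Sketch of the fully connected layer's computation: a = W x + b.
# x stacks the outputs (x1, x2, x3) of the parallel convolutional layers.
# The values below are illustrative; W and b are learned by training.
x = torch.tensor([0.2, 0.5, 0.3])  # x1, x2, x3
W = torch.randn(3, 3)              # weight matrix (trained)
b = torch.randn(3)                 # bias vector (trained)
a = W @ x + b                      # outputs a1, a2, a3
print(a)
```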
Referring to FIG. 4, according to a specific implementation of the embodiment of the present disclosure, the process of constructing the segmentation network may further include the following steps:
S401: Obtain all the convolutional layers in the segmentation network.
Depending on requirements, multiple convolutional layers can be set in the segmentation network; by configuring different convolution kernels for different convolutional layers, the images to be processed can be handled accordingly.
S402: Obtain the image size of the feature image output by each of the convolutional layers.
Depending on the convolution kernels and the input images, the feature images output by different convolutional layers will differ in size. The output size of each convolutional layer can be obtained by computing over the input parameters and convolution kernels of all the convolutional layers.
S403: Connect convolutional layers that output the same image size.
In a deep learning network, shallow features carry more image detail while deep features carry more semantics. To combine shallow and deep features, connections are added between convolutional layers that produce outputs of the same size, thereby reducing jagged edges in the image.
In implementing step S403, according to a specific implementation of the embodiment of the present disclosure, the following steps may also be included:
S4031: Among N convolutional layers x that output the same image size, obtain the input xi and the output H(xi) of the i-th convolutional layer.
S4032: Based on xi and H(xi), construct the residual function F(xi) = H(xi) - xi of the i-th convolutional layer.
S4033: Connect the convolutional layers based on the residual function.
Specifically, a mapping function W(xi) can be set for the i-th convolutional layer, along with the input xi of the i-th convolutional layer and the output F(xi) of the i-th convolutional layer; F(xi) + W(xi) is then used as the input of the (i+2)-th convolutional layer. In this way, the convolutional layers are connected.
In the process of forming the second image, to extract the features of the second image quickly, convolution kernels of the same size can be set in the multiple parallel convolutional layers; with these kernels, feature extraction is performed on the images input to the multiple parallel convolutional layers, forming multiple feature vector matrices. Based on the training of the segmentation network, different weight values are assigned to the multiple feature vector matrices, and the sum of the feature vector matrices weighted by these values is used as the representation matrix of the second image, finally forming the second image, as sketched below.
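A minimal sketch of this weighted fusion follows; the weight values and matrix sizes are illustrative assumptions, while in the disclosure the weights are assigned based on how the segmentation network was trained:

```python
import numpy as np

# Feature vector matrices extracted by the parallel convolutional layers
# (sizes are illustrative). The representation matrix of the second image
# is their weighted sum.
feature_matrices = [np.random.rand(56, 56) for _ in range(3)]
weights = [0.5, 0.3, 0.2]  # assumed weight values (learned in practice)
representation = sum(w * m for w, m in zip(weights, feature_matrices))
print(representation.shape)  # (56, 56)
```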
Corresponding to the above method embodiment, referring to FIG. 5, an embodiment of the present disclosure further discloses an image processing device 50, including:
an acquiring module 501, configured to acquire a first image containing a target object.
The target object is the content to be acquired by the solution of the present disclosure. As an example, the target object may be a person performing various actions, an animal with distinctive behaviors, or a stationary object.
The target object is usually contained in a certain scene. For example, a photo containing a portrait usually also contains a background, which may include trees, mountains, rivers and other people. If the target object is to be extracted separately from the image, it needs to be identified and processed separately. Based on the extracted target object, its various behaviors can be analyzed.
The first image is an image containing the target object. The first image may be one of a series of pre-stored photos, a video frame extracted from a pre-saved video, or one or more frames extracted from a live video stream. The first image may contain multiple objects; for example, a photo describing a person's actions may contain the target person, other people accompanying the target person, trees, buildings and so on. The target person constitutes the target object of the first image, while the other people, trees, buildings, etc. together constitute the background image. Based on actual needs, one or more objects can be selected as target objects in the first image.
As an example, the target object can be obtained from a video file: the video captured of the target object contains multiple frames, and multiple images containing one or more continuous actions of the target object can be selected from the video frames to form an image set. By selecting from the images in the image set, the first image containing the target object can be obtained.
A setting module 502, configured to set a segmentation network for performing image processing on the first image, the segmentation network including multiple convolutional layers and down-sampling layers, the convolutional layers and the down-sampling layers being arranged alternately, the convolutional layers performing feature extraction on the target object in the first image, and the down-sampling layers performing a down-sampling operation on the images output by the convolutional layers.
To perform image processing on the first image, a segmentation network based on a neural network model is constructed. Referring to FIG. 2, the segmentation network includes convolutional layers, sampling layers and a fully connected layer.
The main parameters of a convolutional layer include the size of the convolution kernel and the number of input feature maps. Each convolutional layer can contain several feature maps of the same size; the feature values of the same layer share weights, and the convolution kernels within each layer have the same size. The convolutional layer performs convolution on the input image and extracts its layout features.
The feature extraction layer of each convolutional layer can be connected to a sampling layer, which computes local averages of the input image and performs secondary feature extraction. Connecting sampling layers to convolutional layers keeps the neural network model robust to the input image.
The sampling layers may include up-sampling layers and down-sampling layers. An up-sampling layer adds pixel information to the image by interpolating the input image; a down-sampling layer extracts features from the input image while reducing its resolution.
To speed up the training of the segmentation network, a pooling layer (not shown in the figure) can also be set after a convolutional layer. The pooling layer applies max pooling to the output of the convolutional layer, which better extracts the invariant features of the input image.
The fully connected layer integrates the features in the image feature maps that have passed through multiple convolutional and pooling layers, obtaining the classification features of the input image for image classification. In the neural network model of the segmentation network, the fully connected layer maps the feature maps produced by the convolutional layers into a fixed-length feature vector. This feature vector contains the combined information of all features of the input image and retains the most characteristic image features to complete the image classification task. In this way, the prediction map corresponding to the input image can be computed, thereby determining the target object contained in the first image.
To increase the computation speed of the segmentation network, down-sampling layers are set in the segmentation network, alternating with the convolutional layers; the convolutional layers perform feature extraction on the target object in the first image, and the down-sampling layers down-sample the images output by the convolutional layers. This arrangement increases the speed at which the segmentation network processes the first image.
A processing module 503, configured to set, after the second down-sampling layer in the segmentation network, multiple parallel convolutional layers with different sampling rates, the parallel convolutional layers being used to process the image output by the second down-sampling layer, with the image features extracted on each parallel convolutional layer fused to form a second image.
The disadvantage of traditional neural networks is that they require fixed-size input images. In practice, because of differences in image processing, images fed into the neural network may already have been cropped or distorted; such images suffer content loss, which reduces the accuracy with which the network recognizes the objects to be recognized in the input image. In addition, when the size of the same target object varies across images, the recognition accuracy of the neural network for the target object also drops.
To further improve the adaptability of the segmentation network to the first image, referring to FIG. 2, parallel convolutional layers are set in the segmentation network. Specifically, after the second down-sampling layer in the segmentation network, multiple parallel convolutional layers with different sampling rates are set; the parallel convolutional layers process the image output by the second down-sampling layer, and the image features extracted on each parallel convolutional layer are fused to form a second image.
With parallel convolutional layers, the input image, or the target object in it, can have any aspect ratio or size. When input images come at different scales, the segmentation network can extract features at different scales. For example, the parallel convolutional layers can use 4×4, 2×2 and 1×1 convolution kernels to compute features on the input image separately, yielding 3 independently processed images; fusing these 3 images forms the second image. Since the formation of the second image is unaffected by the size or aspect ratio of the input image, the robustness of the segmentation network is further improved.
With this implementation, the embodiments are not limited to detecting objects of a specific size, shape or type, nor to detecting images of a specific size, type or content. A system for image processing using parallel convolutional layer pooling according to the embodiments can operate on images of any size, type or content.
While the parallel convolutional layers improve data robustness, they also increase the computational burden of the system. For this reason, the parallel convolutional layers are placed after the second down-sampling layer in the segmentation network: at that point the image output by the second down-sampling layer still has enough features to satisfy the parallel sampling layers, while the first image, having been processed by two sampling layers, requires far less computation. This satisfies the robustness of the parallel convolutional layers while reducing their computational cost. The reason is that if the parallel sampling layers were instead placed after a third sampling layer, the first image would lose too many features after three sampling layers, leaving the parallel convolutional layers with insufficient features and degrading their recognition of the target object.
An execution module 504, configured to obtain a third image containing the target object by performing target recognition on the second image.
The size of the second image can be adjusted. For example, a constraint min(a, b) = c can be constructed, where a is the width of the second image, b is its height, and c is a predefined scale (for example, 256), and feature maps can be extracted from the entire image. For example, taking 3 parallel convolutional layers (1×1, 3×3 and 6×6, for a total of 46 feature vectors) as an example, these 3 parallel convolutional layers can be used to pool features for each candidate window, generating an 11776-dimensional (256×46) representation per window. These representations can be provided to the fully connected layer of the segmentation network, which performs target recognition based on them. The recognized target object is saved as a separate image, forming the third image.
The device shown in FIG. 5 can correspondingly execute the content of the foregoing method embodiment; for parts of this embodiment not described in detail, refer to the foregoing method embodiment, which is not repeated here.
Referring to FIG. 6, an embodiment of the present disclosure further provides an electronic device 60, including:
at least one processor; and
a memory communicatively connected with the at least one processor; wherein
the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor so that the at least one processor can execute the image processing method in the foregoing method embodiment.
An embodiment of the present disclosure further provides a non-transitory computer-readable storage medium storing computer instructions, the computer instructions being used to cause a computer to execute the image processing method in the foregoing method embodiments.
An embodiment of the present disclosure further provides a computer program product, the computer program product including a computer program stored on a non-transitory computer-readable storage medium, the computer program including program instructions which, when executed by a computer, cause the computer to execute the image processing method in the foregoing method embodiment.
Reference is now made to FIG. 6, which shows a schematic structural diagram of an electronic device 60 suitable for implementing the embodiments of the present disclosure. Electronic devices in the embodiments of the present disclosure may include, but are not limited to, mobile terminals such as mobile phones, notebook computers, digital broadcast receivers, PDAs (personal digital assistants), PADs (tablet computers), PMPs (portable multimedia players) and vehicle-mounted terminals (for example, car navigation terminals), as well as fixed terminals such as digital TVs and desktop computers. The electronic device shown in FIG. 6 is only an example and should not impose any limitation on the function and scope of use of the embodiments of the present disclosure.
As shown in FIG. 6, the electronic device 60 may include a processing device (such as a central processing unit or a graphics processor) 601, which can perform various appropriate actions and processing according to a program stored in a read-only memory (ROM) 602 or a program loaded from a storage device 608 into a random access memory (RAM) 603. The RAM 603 also stores various programs and data required for the operation of the electronic device 60. The processing device 601, the ROM 602 and the RAM 603 are connected to each other through a bus 604. An input/output (I/O) interface 605 is also connected to the bus 604.
Generally, the following devices can be connected to the I/O interface 605: input devices 606 such as a touch screen, touch pad, keyboard, mouse, image sensor, microphone, accelerometer and gyroscope; output devices 607 such as a liquid crystal display (LCD), speakers and vibrators; storage devices 608 such as magnetic tapes and hard disks; and a communication device 609. The communication device 609 may allow the electronic device 60 to communicate wirelessly or by wire with other devices to exchange data. Although the figure shows the electronic device 60 with various components, it should be understood that it is not required to implement or possess all of them; more or fewer components may alternatively be implemented or provided.
In particular, according to the embodiments of the present disclosure, the process described above with reference to the flowchart can be implemented as a computer software program. For example, the embodiments of the present disclosure include a computer program product, which includes a computer program carried on a computer-readable medium, the computer program containing program code for executing the method shown in the flowchart. In such an embodiment, the computer program may be downloaded and installed from the network through the communication device 609, or installed from the storage device 608, or installed from the ROM 602. When the computer program is executed by the processing device 601, the above-mentioned functions defined in the methods of the embodiments of the present disclosure are executed.
It should be noted that the above-mentioned computer-readable medium in the present disclosure may be a computer-readable signal medium or a computer-readable storage medium, or any combination of the two. The computer-readable storage medium may be, for example, but not limited to, an electric, magnetic, optical, electromagnetic, infrared or semiconductor system, apparatus or device, or any combination of the above. More specific examples of computer-readable storage media may include, but are not limited to: an electrical connection with one or more wires, a portable computer disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disk read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In the present disclosure, a computer-readable storage medium may be any tangible medium that contains or stores a program for use by or in combination with an instruction execution system, apparatus or device. In the present disclosure, a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, carrying computer-readable program code. Such a propagated data signal can take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium; it may send, propagate or transmit a program for use by or in combination with an instruction execution system, apparatus or device. The program code contained on the computer-readable medium can be transmitted by any suitable medium, including but not limited to wire, optical cable, RF (radio frequency), etc., or any suitable combination of the above.
The above-mentioned computer-readable medium may be included in the above-mentioned electronic device, or it may exist alone without being assembled into the electronic device.
The above-mentioned computer-readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to: obtain at least two Internet Protocol addresses; send to a node evaluation device a node evaluation request including the at least two Internet Protocol addresses, wherein the node evaluation device selects an Internet Protocol address from the at least two Internet Protocol addresses and returns it; and receive the Internet Protocol address returned by the node evaluation device; wherein the obtained Internet Protocol address indicates an edge node in a content distribution network.
Alternatively, the above-mentioned computer-readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to: receive a node evaluation request including at least two Internet Protocol addresses; select an Internet Protocol address from the at least two Internet Protocol addresses; and return the selected Internet Protocol address; wherein the received Internet Protocol address indicates an edge node in a content distribution network.
The computer program code for performing the operations of the present disclosure may be written in one or more programming languages or a combination thereof; such languages include object-oriented programming languages such as Java, Smalltalk and C++, as well as conventional procedural programming languages such as the "C" language or similar. The program code can be executed entirely on the user's computer, partly on the user's computer, as an independent software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In cases involving a remote computer, the remote computer can be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or it can be connected to an external computer (for example, through the Internet using an Internet service provider).
The flowcharts and block diagrams in the drawings illustrate the possible architectures, functions and operations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in a flowchart or block diagram can represent a module, program segment or part of code that contains one or more executable instructions for realizing the specified logical function. It should also be noted that in some alternative implementations, the functions marked in the blocks may occur in an order different from that marked in the drawings. For example, two blocks shown in succession can actually be executed substantially in parallel, and sometimes in the reverse order, depending on the functions involved. It should also be noted that each block in the block diagrams and/or flowcharts, and combinations of blocks, can be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
The units involved in the embodiments described in the present disclosure can be implemented in software or hardware. The name of a unit does not in some cases constitute a limitation on the unit itself; for example, the first obtaining unit can also be described as "a unit for obtaining at least two Internet Protocol addresses".
It should be understood that parts of the present disclosure can be implemented in hardware, software, firmware or a combination thereof.
The above are only specific implementations of the present disclosure, but the protection scope of the present disclosure is not limited thereto. Any change or replacement that can easily be conceived by those skilled in the art within the technical scope disclosed in the present disclosure shall be covered by the protection scope of the present disclosure. Therefore, the protection scope of the present disclosure shall be subject to the protection scope of the claims.

Claims (11)

  1. An image processing method, comprising:
    acquiring a first image containing a target object;
    setting a segmentation network for performing image processing on the first image, the segmentation network comprising multiple convolutional layers and down-sampling layers, the convolutional layers and the down-sampling layers being arranged alternately, the convolutional layers performing feature extraction on the target object in the first image, and the down-sampling layers performing a down-sampling operation on the images output by the convolutional layers;
    after the second down-sampling layer in the segmentation network, setting multiple parallel convolutional layers with different sampling rates, the parallel convolutional layers being used to process the image output by the second down-sampling layer, with the image features extracted on each parallel convolutional layer fused to form a second image;
    obtaining a third image containing the target object by performing target recognition on the second image.
  2. The method according to claim 1, wherein performing target recognition on the second image comprises:
    after the parallel convolutional layers, setting a third down-sampling layer, the third down-sampling layer performing a down-sampling operation on the second image.
  3. The method according to claim 2, wherein performing target recognition on the second image further comprises:
    after the third down-sampling layer, setting multiple up-sampling layers, the up-sampling layers performing an up-sampling operation on the image output by the third down-sampling layer.
  4. The method according to claim 3, wherein performing target recognition on the second image further comprises:
    setting a fully connected layer in the segmentation network;
    in the fully connected layer, setting different weight values for the images output by different nodes of the parallel convolutional layers, as well as bias values for all nodes of the sampling layer;
    performing target recognition on the image output by the up-sampling layers based on the weight values and the bias values.
  5. The method according to claim 1, further comprising:
    obtaining all the convolutional layers in the segmentation network;
    obtaining the image size of the feature image output by each of the convolutional layers;
    connecting convolutional layers that output the same image size.
  6. The method according to claim 5, wherein connecting convolutional layers that output the same image size comprises:
    among N convolutional layers x that output the same image size, obtaining the input xi and the output H(xi) of the i-th convolutional layer;
    based on xi and H(xi), constructing the residual function F(xi) = H(xi) - xi of the i-th convolutional layer;
    connecting the convolutional layers based on the residual function.
  7. The method according to claim 6, wherein connecting the convolutional layers based on the residual function comprises:
    setting a mapping function W(xi) for the i-th convolutional layer;
    obtaining the input xi of the i-th convolutional layer and the output F(xi) of the i-th convolutional layer;
    using F(xi) + W(xi) as the input of the (i+2)-th convolutional layer.
  8. The method according to claim 1, wherein fusing the image features extracted on each parallel convolutional layer to form the second image comprises:
    setting convolution kernels of the same size in the multiple parallel convolutional layers;
    based on the convolution kernels, performing feature extraction on the images input to the multiple parallel convolutional layers to form multiple feature vector matrices;
    assigning different weight values to the multiple feature vector matrices, and using the sum of the feature vector matrices weighted by the different weight values as the representation matrix of the second image.
  9. An image processing device, comprising:
    an acquiring module, configured to acquire a first image containing a target object;
    a setting module, configured to set a segmentation network for performing image processing on the first image, the segmentation network comprising multiple convolutional layers and down-sampling layers, the convolutional layers and the down-sampling layers being arranged alternately, the convolutional layers performing feature extraction on the target object in the first image, and the down-sampling layers performing a down-sampling operation on the images output by the convolutional layers;
    a processing module, configured to set, after the second down-sampling layer in the segmentation network, multiple parallel convolutional layers with different sampling rates, the parallel convolutional layers being used to process the image output by the second down-sampling layer, with the image features extracted on each parallel convolutional layer fused to form a second image;
    an execution module, configured to obtain a third image containing the target object by performing target recognition on the second image.
  10. An electronic device, comprising:
    at least one processor; and
    a memory communicatively connected with the at least one processor; wherein
    the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor so that the at least one processor can execute the image processing method of any one of claims 1-8.
  11. A non-transitory computer-readable storage medium storing computer instructions, the computer instructions being used to cause a computer to execute the image processing method of any one of claims 1-8.
PCT/CN2020/079192 2019-05-15 2020-03-13 Image processing method, device and electronic equipment WO2020228405A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910403859.XA CN110222726A (zh) 2019-05-15 2019-05-15 Image processing method, device and electronic equipment
CN201910403859.X 2019-05-15

Publications (1)

Publication Number Publication Date
WO2020228405A1 true WO2020228405A1 (zh) 2020-11-19

Family

ID=67821169

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/079192 WO2020228405A1 (zh) Image processing method, device and electronic equipment 2019-05-15 2020-03-13

Country Status (2)

Country Link
CN (1) CN110222726A (zh)
WO (1) WO2020228405A1 (zh)


Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110222726A (zh) 2019-05-15 2019-09-10 北京字节跳动网络技术有限公司 Image processing method, device and electronic equipment
CN111369468B (zh) * 2020-03-09 2022-02-01 北京字节跳动网络技术有限公司 Image processing method, device, electronic equipment and computer-readable medium
CN111931600B (zh) * 2020-07-21 2021-04-06 深圳市鹰硕教育服务有限公司 Smart pen image processing method, device and electronic equipment
CN113691863B (zh) * 2021-07-05 2023-06-20 浙江工业大学 Lightweight method for extracting video key frames
CN113936220B (zh) * 2021-12-14 2022-03-04 深圳致星科技有限公司 Image processing method, storage medium, electronic equipment and image processing device
CN117437429A (zh) * 2022-07-15 2024-01-23 华为技术有限公司 Image data processing method, device and storage medium


Family Cites Families (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3582151A1 (en) * 2015-08-15 2019-12-18 Salesforce.com, Inc. Three-dimensional (3d) convolution with 3d batch normalization
CN106920227B (zh) * 2016-12-27 2019-06-07 北京工业大学 Retinal blood vessel segmentation method combining deep learning with traditional methods
CN107292352B (zh) * 2017-08-07 2020-06-02 北京中星微人工智能芯片技术有限公司 Image classification method and device based on a convolutional neural network
CN107657257A (zh) * 2017-08-14 2018-02-02 中国矿业大学 Semantic image segmentation method based on a multi-channel convolutional neural network
CN107909113B (zh) * 2017-11-29 2021-11-16 北京小米移动软件有限公司 Traffic accident image processing method, device and storage medium
CN108022647B (zh) * 2017-11-30 2022-01-25 东北大学 Method for predicting benign or malignant pulmonary nodules based on a ResNet-Inception model
CN108615010B (zh) * 2018-04-24 2022-02-11 重庆邮电大学 Facial expression recognition method based on feature map fusion of parallel convolutional neural networks
CN108986124A (zh) * 2018-06-20 2018-12-11 天津大学 Retinal blood vessel image segmentation method using a multi-scale feature convolutional neural network
CN109389030B (zh) * 2018-08-23 2022-11-29 平安科技(深圳)有限公司 Facial landmark detection method, device, computer equipment and storage medium
CN109344878B (zh) * 2018-09-06 2021-03-30 北京航空航天大学 ResNet-based eagle-brain-inspired feature integration method for small target recognition

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107862287A (zh) * 2017-11-08 2018-03-30 吉林大学 Method for recognizing objects in a small area ahead and providing vehicle early warning
CN110046607A (zh) * 2019-04-26 2019-07-23 西安因诺航空科技有限公司 Deep-learning-based method for detecting prefabricated houses or building materials in UAV remote sensing images
CN110222726A (zh) * 2019-05-15 2019-09-10 北京字节跳动网络技术有限公司 Image processing method, device and electronic equipment
CN110456805A (zh) * 2019-06-24 2019-11-15 深圳慈航无人智能系统技术有限公司 UAV intelligent tracking flight system and method

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112651983A (zh) * 2020-12-15 2021-04-13 北京百度网讯科技有限公司 Stitched image recognition method, device, electronic equipment and storage medium
CN112651983B (zh) * 2020-12-15 2023-08-01 北京百度网讯科技有限公司 Stitched image recognition method, device, electronic equipment and storage medium
CN113469083A (zh) * 2021-07-08 2021-10-01 西安电子科技大学 SAR image target classification method and system based on an anti-aliasing convolutional neural network
CN113469083B (zh) * 2021-07-08 2024-05-31 西安电子科技大学 SAR image target classification method and system based on an anti-aliasing convolutional neural network

Also Published As

Publication number Publication date
CN110222726A (zh) 2019-09-10

Similar Documents

Publication Publication Date Title
WO2020228405A1 (zh) Image processing method, device and electronic equipment
CN110189246B (zh) Image stylization generation method, device and electronic equipment
CN110399848A (zh) Video cover generation method, device and electronic equipment
JP2023547917A (ja) Image segmentation method, device, equipment and storage medium
CN110796664B (zh) Image processing method, device, electronic equipment and computer-readable storage medium
WO2020228383A1 (zh) Mouth shape generation method, device and electronic equipment
WO2022237811A1 (zh) Image processing method, device and equipment
CN110399847B (zh) Key frame extraction method, device and electronic equipment
CN112232311B (zh) Face tracking method, device and electronic equipment
CN111222509A (zh) Target detection method, device and electronic equipment
CN110211017B (zh) Image processing method, device and electronic equipment
CN110287816B (zh) Vehicle door action detection method, device and computer-readable storage medium
CN110197459B (zh) Image stylization generation method, device and electronic equipment
WO2024012255A1 (zh) Semantic segmentation model training method, device, electronic equipment and storage medium
WO2024041235A1 (zh) Image processing method, device, equipment, storage medium and program product
CN110060324B (zh) Image rendering method, device and electronic equipment
CN115100536B (zh) Building recognition method, device, electronic equipment and computer-readable medium
CN116704203A (zh) Target detection method, device, electronic equipment and computer-readable storage medium
CN114419322A (zh) Image instance segmentation method, device, electronic equipment and storage medium
CN112052863B (zh) Image detection method and device, computer storage medium and electronic equipment
CN113762017B (zh) Action recognition method, device, equipment and storage medium
WO2021073204A1 (zh) Object display method, device, electronic equipment and computer-readable storage medium
CN115082828A (zh) Video key frame extraction method and device based on dominating sets
CN111200705B (zh) Image processing method and device
CN113808151A (zh) Weak semantic contour detection method, device, equipment and storage medium for live-streaming images

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20806268

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20806268

Country of ref document: EP

Kind code of ref document: A1

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 22.03.2022)
