WO2020228405A1 - Image processing method and apparatus, and electronic device - Google Patents

Image processing method and apparatus, and electronic device

Info

Publication number
WO2020228405A1
WO2020228405A1 (PCT/CN2020/079192; CN2020079192W)
Authority
WO
WIPO (PCT)
Prior art keywords
image
layer
sampling
convolutional
layers
Prior art date
Application number
PCT/CN2020/079192
Other languages
English (en)
Chinese (zh)
Inventor
李华夏
Original Assignee
北京字节跳动网络技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京字节跳动网络技术有限公司
Publication of WO2020228405A1

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/26Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V10/267Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds

Definitions

  • the present disclosure relates to the field of data processing technology, and in particular to an image processing method, device and electronic equipment.
  • image processing tasks can be completed by artificial intelligence.
  • neural networks have been widely applied in the field of computer image recognition, for example to recognize different people in an image, or to automatically recognize different objects on the road in autonomous driving. These tasks all fall under image semantic recognition.
  • image semantic recognition involves image semantic segmentation. Image semantic segmentation is generally modeled as a pixel-level multi-classification problem, whose goal is to assign each pixel of an image to one of multiple predefined categories.
  • the embodiments of the present disclosure provide an image processing method, device, and electronic device, which at least partially solve the problems in the prior art.
  • embodiments of the present disclosure provide an image processing method, including:
  • a segmentation network for performing image processing on the first image is set.
  • the segmentation network includes a plurality of convolutional layers and down-sampling layers.
  • the convolutional layers and the down-sampling layers are arranged in alternation;
  • the convolutional layer performs feature extraction on the target object in the first image, and the down-sampling layer down-samples the image output by the convolutional layer;
  • multiple parallel convolutional layers with different sampling rates are set after the second down-sampling layer in the segmentation network, and the parallel convolutional layers are used to process the image output by the second down-sampling layer;
  • the image features extracted on each parallel convolutional layer are fused to form a second image;
  • a third image containing the target object is acquired.
  • the performing target recognition on the second image includes:
  • a third down-sampling layer is provided, and the third down-sampling layer performs a down-sampling operation on the second image.
  • the performing target recognition on the second image further includes:
  • after the third down-sampling layer, a plurality of up-sampling layers are set, and the up-sampling layers perform an up-sampling operation on the image output by the third down-sampling layer.
  • the performing target recognition on the second image further includes:
  • target recognition is performed on the image output by the upsampling layer.
  • the method further includes:
  • connecting convolutional layers whose outputs have the same image size, where the connecting of the convolutional layers includes:
  • the convolutional layer is connected based on the residual function.
  • the connecting of the convolutional layers based on the residual function includes:
  • the image features extracted on each parallel convolutional layer are merged to form a second image, including:
  • different weight values are assigned to the multiple feature vector matrices, and the sum of the feature vector matrices with their different weight values is used as the representation matrix of the second image.
  • an image processing device including:
  • An obtaining module used to obtain the first image containing the target object
  • the setting module is configured to set a segmentation network for performing image processing on the first image.
  • the segmentation network includes a plurality of convolutional layers and downsampling layers.
  • the convolutional layers and the down-sampling layers are distributed in alternation; the convolutional layer performs feature extraction on the target object in the first image, and the down-sampling layer performs a down-sampling operation on the image output by the convolutional layer;
  • the processing module is used to set multiple parallel convolutional layers with different sampling rates after the second down-sampling layer in the segmentation network.
  • the parallel convolutional layers are used to process the image output by the second down-sampling layer.
  • the image features extracted on each parallel convolutional layer are fused to form a second image;
  • the execution module is configured to obtain a third image containing the target object by performing target recognition on the second image.
  • an embodiment of the present disclosure also provides an electronic device, which includes:
  • At least one processor and,
  • a memory communicatively connected with the at least one processor; wherein,
  • the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor, so that the at least one processor can execute the image processing method in the foregoing first aspect or any implementation of the first aspect.
  • embodiments of the present disclosure also provide a non-transitory computer-readable storage medium that stores computer instructions, the computer instructions being used to make a computer execute the image processing method in the foregoing first aspect or any implementation of the first aspect.
  • the embodiments of the present disclosure also provide a computer program product.
  • the computer program product includes a computer program stored on a non-transitory computer-readable storage medium.
  • the computer program includes program instructions; when the program instructions are executed by a computer, the computer is caused to execute the image processing method in the foregoing first aspect or any implementation manner of the first aspect.
  • the image processing solution in the embodiments of the present disclosure includes: acquiring a first image containing a target object; setting a segmentation network for performing image processing on the first image, the segmentation network including multiple convolutional layers and down-sampling layers arranged in alternation, where the convolutional layer performs feature extraction on the target object in the first image and the down-sampling layer performs a down-sampling operation on the image output by the convolutional layer; setting, after the second down-sampling layer in the segmentation network, multiple parallel convolutional layers with different sampling rates, the parallel convolutional layers being used to process the image output by the second down-sampling layer, with the image features extracted on each parallel convolutional layer fused to form a second image; and obtaining a third image containing the target object by performing target recognition on the second image.
  • FIG. 1 is a schematic diagram of an image processing flow provided by an embodiment of the disclosure
  • FIG. 2 is a schematic diagram of a neural network model provided by an embodiment of the disclosure.
  • FIG. 3 is a schematic diagram of another image processing flow provided by an embodiment of the disclosure.
  • FIG. 4 is a schematic diagram of another image processing flow provided by an embodiment of the disclosure.
  • FIG. 5 is a schematic structural diagram of an image processing apparatus provided by an embodiment of the disclosure.
  • FIG. 6 is a schematic diagram of an electronic device provided by an embodiment of the disclosure.
  • the embodiment of the present disclosure provides an image processing method.
  • the image processing method provided in this embodiment can be executed by a computing device, and the computing device can be implemented as software, or as a combination of software and hardware, and the computing device can be integrated in a server, terminal device, etc.
  • an image processing method provided by an embodiment of the present disclosure includes the following steps:
  • S101 Acquire a first image containing a target object.
  • the target object is the content to be acquired by the solution of the present disclosure.
  • the target object may be a person with various actions, an animal with behavior characteristics, or a stationary object.
  • the target object is usually contained in a certain scene.
  • a photo containing a portrait of a person usually also contains a background.
  • the background may include trees, mountains, rivers, and other people.
  • if you want to extract the target object separately from the image, you need to identify and process the target object separately.
  • various behaviors of the target object can be analyzed.
  • the first image is an image that contains the target object.
  • the first image can be one of a series of pre-stored photos, a video frame extracted from a pre-saved video, or one or more frames extracted from a live video stream.
  • the first image may include multiple objects.
  • the photo used to describe the action of the person may include the target person, other people with the target person, trees, buildings, etc.
  • the target person constitutes the target object of the first image, while the other people, trees, buildings, etc. constitute the background image. Based on actual needs, one or more objects can be selected as target objects in the first image.
  • the target object can also be obtained from a video file: the video capturing the target object contains multiple frames, and multiple images containing one or more continuous actions of the target object can be selected from those frames to form an image set.
  • the first image containing the target object can be obtained.
  • the segmentation network includes a plurality of convolutional layers and downsampling layers.
  • the convolutional layer and the downsampling layer are spaced apart, and the convolutional layer is Feature extraction is performed on the target object in the first image, and the down-sampling layer performs a down-sampling operation on the image output by the convolutional layer.
  • the segmentation network includes a convolutional layer, a sampling layer and a fully connected layer.
  • the main parameters of the convolutional layer include the size of the convolution kernel and the number of input feature maps.
  • Each convolutional layer can contain several feature maps of the same size.
  • within the same layer, the feature maps share weights, and the convolution kernels in each layer have the same size.
  • the convolution layer performs convolution calculation on the input image and extracts the layout features of the input image.
  • the feature extraction layer of the convolutional layer can be connected to the sampling layer.
  • the sampling layer is used to find the local average of the input image and perform secondary feature extraction.
  • the neural network model can thereby ensure good robustness to the input image.
  • the sampling layer may include an up-sampling layer and a down-sampling layer.
  • the up-sampling layer adds pixel information in the image by interpolating the input image.
  • the down-sampling layer reduces the size of the input image while extracting its features.
  • a pooling layer (not shown in the figure) can also be provided after the convolutional layer.
  • the pooling layer uses max pooling to process the output of the convolutional layer, which can better extract the invariant features of the input image.
  • the fully connected layer integrates the features in the image feature maps that have passed through multiple convolutional layers and pooling layers, and obtains the classification features of the input image features for image classification.
  • the fully connected layer maps the feature map generated by the convolutional layer into a fixed-length feature vector.
  • the feature vector contains the combined information of all the features of the input image, and the feature vector retains the most characteristic image features in the image to complete the image classification task. In this way, the prediction map corresponding to the input image can be calculated, thereby determining the target object contained in the first image.
  • down-sampling layers are set in the segmentation network, with the down-sampling layers and convolutional layers distributed in alternation.
  • the convolutional layer performs feature extraction on the target object in the first image, and the down-sampling layer down-samples the image output by the convolutional layer.
  • the calculation speed of the segmentation network for the first image is improved.
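  • As an illustration only, the interleaved structure described above can be sketched in PyTorch as follows; the layer count, channel widths, and kernel sizes are assumptions for illustration, not values fixed by the embodiment:

```python
import torch.nn as nn

class SegmentationBackbone(nn.Module):
    """Convolutional layers alternated with down-sampling layers,
    ending at the second down-sampling layer (illustrative widths)."""
    def __init__(self, in_channels=3):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(in_channels, 64, kernel_size=3, padding=1),  # feature extraction
            nn.ReLU(inplace=True),
            nn.MaxPool2d(2),                                       # first down-sampling layer
            nn.Conv2d(64, 128, kernel_size=3, padding=1),          # feature extraction on the down-sampled image
            nn.ReLU(inplace=True),
            nn.MaxPool2d(2),                                       # second down-sampling layer
        )

    def forward(self, x):
        # The output here feeds the parallel convolutional layers described below.
        return self.features(x)
```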
  • the disadvantage of traditional neural networks is that they need to input fixed-size images.
  • the images input into the neural network may have been cropped or distorted.
  • cropped or distorted images suffer content loss, which reduces the neural network's recognition accuracy for the object to be recognized in the input image.
  • the recognition accuracy of the target object by the neural network will also be reduced.
  • Parallel convolutional layers are set in the segmentation network. Specifically, after the second down-sampling layer in the segmentation network, multiple parallel convolutional layers with different sampling rates are set; the parallel convolutional layers are used to process the image output by the second down-sampling layer, and the image features extracted on each parallel convolutional layer are fused to form a second image.
  • the input image, or the target object in the input image, can have any aspect ratio or any size.
  • the segmentation network can extract features at different scales.
  • for example, the parallel convolutional layers can use 4×4, 2×2, and 1×1 convolution kernels to perform feature calculations on the input image respectively, yielding 3 independently processed images; merging these 3 independently processed images forms a second image. Since the formation of the second image is not affected by the actual size or proportions of the input image, the robustness of the segmentation network is further improved.
  • each embodiment is not limited to the detection of objects of a specific size, shape, or type, nor is it limited to the detection of images of a specific size, type, or content.
  • the system for image processing using parallel convolutional layer pooling according to various embodiments can be applied to images of any size, type, or content.
  • the parallel convolutional layers improve robustness but also increase the computational burden of the system. For this reason, the parallel convolutional layers are set after the second down-sampling layer in the segmentation network: at that point, the image output by the second down-sampling layer has sufficient features to meet the requirements of the parallel convolutional layers, while the amount of data to compute has been greatly reduced by the two preceding down-sampling layers. This satisfies the robustness of the parallel convolutional layers while also reducing their computational cost.
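  • For illustration, a sketch of such parallel branches follows. The example above names 4×4, 2×2, and 1×1 kernels; odd sizes (5, 3, 1) are substituted here so that symmetric padding keeps all branch outputs the same spatial size — an implementation assumption, not a requirement of the disclosure:

```python
import torch
import torch.nn as nn

class ParallelBranches(nn.Module):
    """Parallel convolutional branches over the output of the second
    down-sampling layer; the independently computed feature maps are
    fused (here by summation) to form the "second image"."""
    def __init__(self, channels=128):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Conv2d(channels, channels, kernel_size=k, padding=k // 2)
            for k in (5, 3, 1)  # assumed odd sizes standing in for 4x4, 2x2, 1x1
        ])

    def forward(self, x):
        # Each branch processes the same input independently.
        outputs = [branch(x) for branch in self.branches]
        # Fuse the independently processed feature maps.
        return torch.stack(outputs, dim=0).sum(dim=0)
```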
  • S104 Acquire a third image containing the target object by performing target recognition on the second image.
  • the size of the second image can be adjusted.
  • Taking 3 parallel convolutional layers (1×1, 3×3, and 6×6, a total of 46 feature vectors) as an example, these 3 parallel convolutional layers can be used to pool features for each candidate window.
  • An 11776-dimensional (256×46) representation is generated for each window. These representations can be provided to the fully connected layer of the segmentation network, and the fully connected layer performs target recognition based on these representations.
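  • As a hedged illustration of where 256 × 46 = 11776 comes from, the sketch below pools one candidate window's 256-channel feature map over 1×1, 3×3, and 6×6 grids (1 + 9 + 36 = 46 bins); the choice of adaptive max pooling is an assumption:

```python
import torch
import torch.nn as nn

def pyramid_pool(window_features: torch.Tensor) -> torch.Tensor:
    """Pool a (256, H, W) candidate-window feature map over 1x1, 3x3,
    and 6x6 grids: 46 bins x 256 channels = 11776 values."""
    pooled = [
        nn.AdaptiveMaxPool2d(bins)(window_features).flatten(start_dim=1)
        for bins in (1, 3, 6)
    ]
    return torch.cat(pooled, dim=1).flatten()

# A window of arbitrary spatial size still yields an 11776-dimensional vector.
window = torch.randn(256, 13, 17)
assert pyramid_pool(window).shape == (11776,)
```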
  • a third down-sampling layer can be set, where the third down-sampling layer performs a down-sampling operation on the second image.
  • the feature information contained in the image can be enriched by increasing the image's pixel information.
  • multiple (for example, 3) up-sampling layers can be set after the third down-sampling layer, where the up-sampling layers perform an up-sampling operation on the image output by the third down-sampling layer.
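  • A minimal sketch of this down-then-up head, assuming max pooling for the down-sampling and bilinear interpolation for the up-sampling (the text fixes neither operator):

```python
import torch.nn as nn

# Third down-sampling layer followed by 3 up-sampling layers that
# restore pixel information by interpolation.
recognition_head = nn.Sequential(
    nn.MaxPool2d(2),  # third down-sampling layer, applied to the second image
    nn.Upsample(scale_factor=2, mode='bilinear', align_corners=False),
    nn.Upsample(scale_factor=2, mode='bilinear', align_corners=False),
    nn.Upsample(scale_factor=2, mode='bilinear', align_corners=False),
)
```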
  • the performing target recognition on the second image may include:
  • the output a1, a2, and a3 of the fully connected layer can be expressed by the following formula:
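  • The formula itself does not survive in this text. A standard fully connected form consistent with the weight-matrix and bias-vector description below would be the following reconstruction (an assumption, not necessarily the patent's exact expression):

$$a_i = f(W_i x + b_i), \quad i = 1, 2, 3$$

  • here x is the feature vector input to the fully connected layer, W_i is the i-th row of the weight matrix, b_i is the corresponding bias value, and f is the activation function.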
  • the weight matrix contains different weight values, which are obtained by training the segmentation network.
  • the bias vector contains different bias values, which can be obtained by training the segmentation network.
  • S303 Perform target recognition on the image output by the upsampling layer based on the weight value and the bias value.
  • the target object contained in the second image can be quickly recognized.
  • the process of constructing a segmentation network may further include the following steps:
  • multiple convolutional layers can be set in the segmentation network.
  • the images that need to be processed can be processed accordingly.
  • the size of the feature image output by different convolution layers will also be different.
  • based on the input parameters and convolution kernels of all convolutional layers, the size of each convolutional layer's output image can be calculated.
  • the shallow features contain more image detail, while the deep features contain more semantic information.
  • for convolutional layers that produce outputs of the same size, connections between those layers can be added, thereby reducing the jagged-edge problem in the image.
  • In the process of implementing step S403, according to a specific implementation manner of the embodiment of the present disclosure, the following steps may also be included:
  • a mapping function W(x_i) can be set for the i-th convolutional layer, where x_i is the input of the i-th convolutional layer and F(x_i) is its output; F(x_i) + W(x_i) is then used as the input of the (i+2)-th convolutional layer. In this way, the convolutional layers are connected.
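  • A hedged PyTorch sketch of this residual connection, assuming two stacked convolutions as F and a 1×1 convolution as the mapping W (the text fixes neither choice):

```python
import torch.nn as nn

class ResidualLink(nn.Module):
    """Feeds F(x_i) + W(x_i) to the (i+2)-th convolutional layer,
    connecting layers whose outputs have the same image size."""
    def __init__(self, channels=128):
        super().__init__()
        self.F = nn.Sequential(                    # convolutional layers i and i+1
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
        )
        self.W = nn.Conv2d(channels, channels, 1)  # mapping W(x_i) on the shortcut

    def forward(self, x):
        return self.F(x) + self.W(x)               # input to layer i+2
```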
  • a convolution kernel of the same size can be set in multiple parallel convolution layers.
  • feature extraction is performed on the images input to the multiple parallel convolutional layers to form multiple feature vector matrices.
  • different weight values are assigned to the multiple feature vector matrices, and the sum of the feature vector matrices with their different weight values is used as the representation matrix of the second image, finally forming the second image.
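  • A minimal sketch of this weighted fusion, assuming 3 branches and learnable scalar weights (the text only requires that the weight values differ):

```python
import torch
import torch.nn as nn

class WeightedFusion(nn.Module):
    """Weighted sum of the feature matrices from the parallel
    convolutional layers; the sum is the representation matrix
    of the second image."""
    def __init__(self, num_branches=3):
        super().__init__()
        self.weights = nn.Parameter(torch.ones(num_branches) / num_branches)

    def forward(self, feature_maps):
        # feature_maps: list of equally shaped (N, C, H, W) tensors, one per branch.
        stacked = torch.stack(feature_maps, dim=0)  # (branches, N, C, H, W)
        w = self.weights.view(-1, 1, 1, 1, 1)       # one scalar weight per branch
        return (w * stacked).sum(dim=0)             # representation matrix of the second image
```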
  • an embodiment of the present disclosure also discloses an image processing device 50, including:
  • the acquiring module 501 is configured to acquire the first image containing the target object.
  • the target object is the content to be acquired by the solution of the present disclosure.
  • the target object may be a person with various actions, an animal with behavior characteristics, or a stationary object.
  • the target object is usually contained in a certain scene.
  • a photo containing a portrait of a person usually also contains a background.
  • the background may include trees, mountains, rivers, and other people.
  • if you want to extract the target object separately from the image, you need to identify and process the target object separately.
  • various behaviors of the target object can be analyzed.
  • the first image is an image that contains the target object.
  • the first image can be one of a series of pre-stored photos, a video frame extracted from a pre-saved video, or one or more frames extracted from a live video stream.
  • the first image may include multiple objects.
  • the photo used to describe the action of the person may include the target person, other people with the target person, trees, buildings, etc.
  • the target person constitutes the target object of the first image, while the other people, trees, buildings, etc. constitute the background image. Based on actual needs, one or more objects can be selected as target objects in the first image.
  • the target object can also be obtained from a video file: the video capturing the target object contains multiple frames, and multiple images containing one or more continuous actions of the target object can be selected from those frames to form an image set.
  • the first image containing the target object can be obtained.
  • the setting module 502 is configured to set a segmentation network for performing image processing on the first image.
  • the segmentation network includes multiple convolutional layers and down-sampling layers.
  • the convolutional layers and the down-sampling layers are distributed in alternation; the convolutional layer performs feature extraction on the target object in the first image, and the down-sampling layer performs a down-sampling operation on the image output by the convolutional layer.
  • the segmentation network includes a convolutional layer, a sampling layer and a fully connected layer.
  • the main parameters of the convolutional layer include the size of the convolution kernel and the number of input feature maps.
  • Each convolutional layer can contain several feature maps of the same size.
  • within the same layer, the feature maps share weights, and the convolution kernels in each layer have the same size.
  • the convolution layer performs convolution calculation on the input image and extracts the layout features of the input image.
  • the feature extraction layer of the convolutional layer can be connected to the sampling layer.
  • the sampling layer is used to find the local average value of the input image and perform secondary feature extraction.
  • the neural network model can thereby ensure good robustness to the input image.
  • the sampling layer may include an up-sampling layer and a down-sampling layer.
  • the up-sampling layer adds pixel information in the image by interpolating the input image.
  • the down-sampling layer reduces the size of the input image while extracting its features.
  • a pooling layer (not shown in the figure) can also be provided after the convolutional layer.
  • the pooling layer uses max pooling to process the output of the convolutional layer, which can better extract the invariant features of the input image.
  • the fully connected layer integrates the features in the image feature maps that have passed through multiple convolutional layers and pooling layers, and obtains the classification features of the input image features for image classification.
  • the fully connected layer maps the feature map generated by the convolutional layer into a fixed-length feature vector.
  • the feature vector contains the combined information of all the features of the input image, and the feature vector retains the most characteristic image features in the image to complete the image classification task. In this way, the prediction map corresponding to the input image can be calculated, thereby determining the target object contained in the first image.
  • down-sampling layers are set in the segmentation network, with the down-sampling layers and convolutional layers distributed in alternation.
  • the convolutional layer performs feature extraction on the target object in the first image, and the down-sampling layer down-samples the image output by the convolutional layer.
  • the calculation speed of the segmentation network for the first image is improved.
  • the processing module 503 is configured to set multiple parallel convolutional layers with different sampling rates after the second downsampling layer in the segmentation network, and the parallel convolutional layers are used to process the image output by the second downsampling layer, The image features extracted on each parallel convolutional layer are merged to form a second image.
  • the disadvantage of traditional neural networks is that they need to input fixed-size images.
  • the images input into the neural network may have been cropped or distorted.
  • cropped or distorted images suffer content loss, which reduces the neural network's recognition accuracy for the object to be recognized in the input image.
  • the recognition accuracy of the target object by the neural network will also be reduced.
  • Parallel convolutional layers are set in the segmentation network. Specifically, after the second down-sampling layer in the segmentation network, multiple parallel convolutional layers with different sampling rates are set; the parallel convolutional layers are used to process the image output by the second down-sampling layer, and the image features extracted on each parallel convolutional layer are fused to form a second image.
  • the input image or the target object in the input image can have any aspect ratio or any size.
  • the segmentation network can extract features at different scales.
  • for example, the parallel convolutional layers can use 4×4, 2×2, and 1×1 convolution kernels to perform feature calculations on the input image respectively, yielding 3 independently processed images; merging these 3 independently processed images forms a second image. Since the formation of the second image is not affected by the actual size or proportions of the input image, the robustness of the segmentation network is further improved.
  • each embodiment is not limited to the detection of objects of a specific size, shape, or type, nor is it limited to the detection of images of a specific size, type, or content.
  • the system for image processing using parallel convolutional layer pooling according to various embodiments can be applied to images of any size, type, or content.
  • the parallel convolutional layers improve robustness but also increase the computational burden of the system. For this reason, the parallel convolutional layers are set after the second down-sampling layer in the segmentation network: at that point, the image output by the second down-sampling layer has sufficient features to meet the requirements of the parallel convolutional layers, while the amount of data to compute has been greatly reduced by the two preceding down-sampling layers. This satisfies the robustness of the parallel convolutional layers while also reducing their computational cost.
  • the execution module 504 is configured to obtain a third image containing the target object by performing target recognition on the second image.
  • the size of the second image can be adjusted.
  • Taking 3 parallel convolutional layers (1×1, 3×3, and 6×6, a total of 46 feature vectors) as an example, these 3 parallel convolutional layers can be used to pool features for each candidate window.
  • An 11776-dimensional (256×46) representation is generated for each window. These representations can be provided to the fully connected layer of the segmentation network, and the fully connected layer performs target recognition based on these representations.
  • the device shown in FIG. 5 can correspondingly execute the content in the foregoing method embodiment.
  • an electronic device 60 which includes:
  • At least one processor and,
  • a memory communicatively connected with the at least one processor; wherein,
  • the memory stores an instruction executable by the at least one processor, and the instruction is executed by the at least one processor, so that the at least one processor can execute the image processing method in the foregoing method embodiment.
  • the embodiments of the present disclosure also provide a non-transitory computer-readable storage medium that stores computer instructions, and the computer instructions are used to make a computer execute the image processing method in the foregoing method embodiments.
  • the embodiments of the present disclosure also provide a computer program product; the computer program product includes a computer program stored on a non-transitory computer-readable storage medium, and the computer program includes program instructions; when the program instructions are executed by a computer, the computer executes the image processing method in the foregoing method embodiment.
  • Fig. 6 shows a schematic structural diagram of an electronic device 60 suitable for implementing embodiments of the present disclosure.
  • Electronic devices in the embodiments of the present disclosure may include, but are not limited to, mobile terminals such as mobile phones, notebook computers, digital broadcast receivers, PDAs (personal digital assistants), PADs (tablets), PMPs (portable multimedia players), and vehicle-mounted terminals (for example, car navigation terminals), as well as fixed terminals such as digital TVs and desktop computers.
  • the electronic device shown in FIG. 6 is only an example, and should not bring any limitation to the function and scope of use of the embodiments of the present disclosure.
  • the electronic device 60 may include a processing device (such as a central processing unit, graphics processor, etc.) 601, which can perform various appropriate actions and processing according to a program stored in a read-only memory (ROM) 602 or a program loaded from a storage device 608 into a random access memory (RAM) 603.
  • the RAM 603 also stores various programs and data required for the operation of the electronic device 60.
  • the processing device 601, the ROM 602, and the RAM 603 are connected to each other through a bus 604.
  • An input/output (I/O) interface 605 is also connected to the bus 604.
  • the following devices can be connected to the I/O interface 605: input devices 606 such as a touch screen, touch panel, keyboard, mouse, image sensor, microphone, accelerometer, or gyroscope; output devices 607 such as a liquid crystal display (LCD), speaker, or vibrator; storage devices 608 such as a magnetic tape or hard disk; and a communication device 609.
  • the communication device 609 may allow the electronic device 60 to perform wireless or wired communication with other devices to exchange data.
  • although the figure shows the electronic device 60 with various devices, it should be understood that it is not required to implement or have all of the devices shown; more or fewer devices may alternatively be implemented or provided.
  • the process described above with reference to the flowchart can be implemented as a computer software program.
  • the embodiments of the present disclosure include a computer program product, which includes a computer program carried on a computer-readable medium, and the computer program contains program code for executing the method shown in the flowchart.
  • the computer program may be downloaded and installed from the network through the communication device 609, or installed from the storage device 608, or installed from the ROM 602.
  • when the computer program is executed by the processing device 601, the above-mentioned functions defined in the method of the embodiment of the present disclosure are executed.
  • the above-mentioned computer-readable medium in the present disclosure may be a computer-readable signal medium or a computer-readable storage medium or any combination of the two.
  • the computer-readable storage medium may be, for example, but not limited to, an electric, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples of computer-readable storage media may include, but are not limited to: an electrical connection with one or more wires, a portable computer disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above.
  • a computer-readable storage medium may be any tangible medium that contains or stores a program, and the program may be used by or in combination with an instruction execution system, apparatus, or device.
  • a computer-readable signal medium may include a data signal propagated in a baseband or as a part of a carrier wave, and a computer-readable program code is carried therein. This propagated data signal can take many forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the foregoing.
  • the computer-readable signal medium may also be any computer-readable medium other than the computer-readable storage medium.
  • the computer-readable signal medium may send, propagate, or transmit the program for use by or in combination with the instruction execution system, apparatus, or device.
  • the program code contained on the computer-readable medium can be transmitted by any suitable medium, including but not limited to: wire, optical cable, RF (Radio Frequency), etc., or any suitable combination of the above.
  • the above-mentioned computer-readable medium may be included in the above-mentioned electronic device; or it may exist alone without being assembled into the electronic device.
  • the above-mentioned computer-readable medium carries one or more programs, and when the above-mentioned one or more programs are executed by the electronic device, the electronic device: obtains at least two Internet protocol addresses; and sends to the node evaluation device including the at least two A node evaluation request for an Internet Protocol address, wherein the node evaluation device selects an Internet Protocol address from the at least two Internet Protocol addresses and returns it; receives the Internet Protocol address returned by the node evaluation device; wherein, the obtained The Internet Protocol address indicates the edge node in the content distribution network.
  • the aforementioned computer-readable medium carries one or more programs, and when the aforementioned one or more programs are executed by the electronic device, the electronic device: receives a node evaluation request including at least two Internet Protocol addresses; Among the at least two Internet Protocol addresses, select an Internet Protocol address; return the selected Internet Protocol address; wherein, the received Internet Protocol address indicates an edge node in the content distribution network.
  • the computer program code used to perform the operations of the present disclosure may be written in one or more programming languages or a combination thereof.
  • the above-mentioned programming languages include object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages.
  • the program code can be executed entirely on the user's computer, partly on the user's computer, executed as an independent software package, partly on the user's computer and partly executed on a remote computer, or entirely executed on the remote computer or server.
  • the remote computer can be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or it can be connected to an external computer (for example, through the Internet using an Internet service provider).
  • each block in the flowchart or block diagram can represent a module, program segment, or part of code, and the module, program segment, or part of code contains one or more executable instructions for realizing the specified logical function.
  • the functions marked in the block may also occur in a different order from the order marked in the drawings. For example, two blocks shown in succession can actually be executed substantially in parallel, and they can sometimes be executed in the reverse order, depending on the functions involved.
  • each block in the block diagram and/or flowchart, and combinations of blocks in the block diagram and/or flowchart, can be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
  • the units involved in the embodiments described in the present disclosure can be implemented in software or hardware. Wherein, the name of the unit does not constitute a limitation on the unit itself under certain circumstances.
  • the first obtaining unit can also be described as "a unit for obtaining at least two Internet Protocol addresses.”

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to an image processing method and apparatus, and an electronic device, in the technical field of data processing. The method comprises: obtaining a first image containing a target object; providing a segmentation network for performing image processing on the first image; providing a plurality of parallel convolutional layers having different sampling rates after a second down-sampling layer in the segmentation network, the parallel convolutional layers being used to process an image output by the second down-sampling layer, and image features extracted by each parallel convolutional layer forming a second image through fusion; and obtaining a third image containing the target object by performing target recognition on the second image. The present invention can improve the accuracy of target recognition.
PCT/CN2020/079192 2019-05-15 2020-03-13 Image processing method and apparatus, and electronic device WO2020228405A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910403859.XA CN110222726A (zh) 2019-05-15 2019-05-15 Image processing method and apparatus, and electronic device
CN201910403859.X 2019-05-15

Publications (1)

Publication Number Publication Date
WO2020228405A1 (fr)

Family

ID=67821169

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/079192 WO2020228405A1 (fr) 2019-05-15 2020-03-13 Image processing method and apparatus, and electronic device

Country Status (2)

Country Link
CN (1) CN110222726A (fr)
WO (1) WO2020228405A1 (fr)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112651983A (zh) * 2020-12-15 2021-04-13 北京百度网讯科技有限公司 Stitched image recognition method and apparatus, electronic device, and storage medium
CN113469083A (zh) * 2021-07-08 2021-10-01 西安电子科技大学 SAR image target classification method and system based on an anti-aliasing convolutional neural network

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110222726A (zh) * 2019-05-15 2019-09-10 北京字节跳动网络技术有限公司 图像处理方法、装置及电子设备
CN111369468B (zh) * 2020-03-09 2022-02-01 北京字节跳动网络技术有限公司 图像处理方法、装置、电子设备及计算机可读介质
CN111931600B (zh) * 2020-07-21 2021-04-06 深圳市鹰硕教育服务有限公司 智能笔图像处理方法、装置及电子设备
CN113691863B (zh) * 2021-07-05 2023-06-20 浙江工业大学 一种提取视频关键帧的轻量化方法
CN113936220B (zh) * 2021-12-14 2022-03-04 深圳致星科技有限公司 图像处理方法、存储介质、电子设备及图像处理装置
CN117437429A (zh) * 2022-07-15 2024-01-23 华为技术有限公司 图像数据处理方法、装置和存储介质

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107862287A (zh) * 2017-11-08 2018-03-30 吉林大学 Method for recognizing objects in a small area ahead and vehicle early warning
CN110046607A (zh) * 2019-04-26 2019-07-23 西安因诺航空科技有限公司 Deep-learning-based method for detecting prefabricated houses or building materials in UAV remote-sensing images
CN110222726A (zh) * 2019-05-15 2019-09-10 北京字节跳动网络技术有限公司 Image processing method and apparatus, and electronic device
CN110456805A (zh) * 2019-06-24 2019-11-15 深圳慈航无人智能***技术有限公司 Intelligent tracking flight system and method for unmanned aerial vehicles

Family Cites Families (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10282663B2 (en) * 2015-08-15 2019-05-07 Salesforce.Com, Inc. Three-dimensional (3D) convolution with 3D batch normalization
CN106920227B (zh) * 2016-12-27 2019-06-07 北京工业大学 基于深度学习与传统方法相结合的视网膜血管分割方法
CN107292352B (zh) * 2017-08-07 2020-06-02 北京中星微人工智能芯片技术有限公司 基于卷积神经网络的图像分类方法和装置
CN107657257A (zh) * 2017-08-14 2018-02-02 中国矿业大学 一种基于多通道卷积神经网络的语义图像分割方法
CN107909113B (zh) * 2017-11-29 2021-11-16 北京小米移动软件有限公司 交通事故图像处理方法、装置及存储介质
CN108022647B (zh) * 2017-11-30 2022-01-25 东北大学 基于ResNet-Inception模型的肺结节良恶性预测方法
CN108615010B (zh) * 2018-04-24 2022-02-11 重庆邮电大学 基于平行卷积神经网络特征图融合的人脸表情识别方法
CN108986124A (zh) * 2018-06-20 2018-12-11 天津大学 结合多尺度特征卷积神经网络视网膜血管图像分割方法
CN109389030B (zh) * 2018-08-23 2022-11-29 平安科技(深圳)有限公司 人脸特征点检测方法、装置、计算机设备及存储介质
CN109344878B (zh) * 2018-09-06 2021-03-30 北京航空航天大学 一种基于ResNet的仿鹰脑特征整合小目标识别方法

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107862287A (zh) * 2017-11-08 2018-03-30 吉林大学 Method for recognizing objects in a small area ahead and vehicle early warning
CN110046607A (zh) * 2019-04-26 2019-07-23 西安因诺航空科技有限公司 Deep-learning-based method for detecting prefabricated houses or building materials in UAV remote-sensing images
CN110222726A (zh) * 2019-05-15 2019-09-10 北京字节跳动网络技术有限公司 Image processing method and apparatus, and electronic device
CN110456805A (zh) * 2019-06-24 2019-11-15 深圳慈航无人智能***技术有限公司 Intelligent tracking flight system and method for unmanned aerial vehicles

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112651983A (zh) * 2020-12-15 2021-04-13 北京百度网讯科技有限公司 Stitched image recognition method and apparatus, electronic device, and storage medium
CN112651983B (zh) * 2020-12-15 2023-08-01 北京百度网讯科技有限公司 Stitched image recognition method and apparatus, electronic device, and storage medium
CN113469083A (zh) * 2021-07-08 2021-10-01 西安电子科技大学 SAR image target classification method and system based on an anti-aliasing convolutional neural network
CN113469083B (zh) * 2021-07-08 2024-05-31 西安电子科技大学 SAR image target classification method and system based on an anti-aliasing convolutional neural network

Also Published As

Publication number Publication date
CN110222726A (zh) 2019-09-10

Similar Documents

Publication Publication Date Title
WO2020228405A1 (fr) Image processing method and apparatus, and electronic device
CN110189246B (zh) Image stylization generation method and apparatus, and electronic device
JP2023547917A (ja) Image segmentation method and apparatus, device, and storage medium
CN110399848A (zh) Video cover generation method and apparatus, and electronic device
CN110070551B (zh) Video image rendering method and apparatus, and electronic device
WO2020228383A1 (fr) Mouth shape generation method and apparatus, and electronic device
CN110796664B (zh) Image processing method and apparatus, electronic device, and computer-readable storage medium
WO2022237811A1 (fr) Image processing method and apparatus, and device
CN110399847B (zh) Key frame extraction method and apparatus, and electronic device
CN112232311B (zh) Face tracking method and apparatus, and electronic device
CN111222509A (zh) Target detection method and apparatus, and electronic device
CN110211017B (zh) Image processing method and apparatus, and electronic device
CN110555861B (zh) Optical flow calculation method and apparatus, and electronic device
CN110197459B (zh) Image stylization generation method and apparatus, and electronic device
WO2024012255A1 (fr) Semantic segmentation model training method and apparatus, electronic device, and storage medium
WO2024041235A1 (fr) Image processing method and apparatus, device, storage medium, and program product
CN110060324B (zh) Image rendering method and apparatus, and electronic device
CN114419322B (zh) Image instance segmentation method and apparatus, electronic device, and storage medium
CN115100536B (zh) Building recognition method and apparatus, electronic device, and computer-readable medium
CN112052863B (zh) Image detection method and apparatus, computer storage medium, and electronic device
WO2021073204A1 (fr) Object display method and apparatus, electronic device, and computer-readable storage medium
CN115311414A (zh) Digital-twin-based real-scene rendering method and apparatus, and related device
CN115082828A (zh) Dominant-set-based video key frame extraction method and apparatus
CN111200705B (zh) Image processing method and apparatus
CN113808151A (zh) Weak-semantic contour detection method and apparatus for live-streaming images, device, and storage medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20806268

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20806268

Country of ref document: EP

Kind code of ref document: A1

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 22.03.2022)
