WO2022134971A1 - Training method for noise reduction model and related apparatus - Google Patents

Training method for noise reduction model and related apparatus

Info

Publication number
WO2022134971A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
pixels
sub
noise reduction
target
Prior art date
Application number
PCT/CN2021/131656
Other languages
English (en)
French (fr)
Inventor
李松江
黄涛
贾旭
刘健庄
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Publication of WO2022134971A1

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 - Image enhancement or restoration
    • G06T5/70 - Denoising; Smoothing
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/21 - Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 - Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 - Machine learning
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/045 - Combinations of networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T2207/20 - Special algorithmic details
    • G06T2207/20081 - Training; Learning
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T2207/20 - Special algorithmic details
    • G06T2207/20084 - Artificial neural networks [ANN]

Definitions

  • The present application relates to the technical field of artificial intelligence, and in particular to a training method for a noise reduction model and a related device.
  • Artificial intelligence is a theory, method, technology and application system that uses digital computers, or machines controlled by digital computers, to simulate, extend and expand human intelligence, perceive the environment, acquire knowledge, and use knowledge to obtain the best results.
  • In other words, artificial intelligence is a branch of computer science that attempts to understand the essence of intelligence and to produce a new kind of intelligent machine that responds in a way similar to human intelligence.
  • Artificial intelligence studies the design principles and implementation methods of various intelligent machines, so that machines have the capabilities of perception, reasoning and decision-making.
  • To improve the quality of such noisy images, image noise reduction methods have emerged.
  • Image denoising methods refer to applying algorithms to remove noise from observed noisy images, preserve image details, and reconstruct corresponding clean images.
  • Image noise reduction methods have important application value in fields such as mobile phone photography, high-definition television, surveillance equipment, satellite imagery, and medical imaging.
  • At present, images are denoised mainly through learning-based denoising models (e.g., convolutional neural networks).
  • The denoising effect of a denoising model largely depends on the training data used to train it, i.e., pairs of noisy and clean images.
  • However, acquiring noisy-clean image pairs is often very difficult. There is therefore an urgent need for a method that can train a denoising model without noisy-clean image pairs.
  • The embodiments of the present application provide a training method for a noise reduction model and a related device.
  • A pair of sub-image samples is obtained; one sub-image of the pair is used as the input of the noise reduction model, and the other sub-image of the pair is used as the expected output of the noise reduction model, thereby realizing the training of the noise reduction model.
  • In this way, the noise reduction model can be trained based on noisy images, without obtaining the clean image corresponding to each noisy image, which reduces the training difficulty of the noise reduction model.
  • A first aspect of the present application provides a training method for a noise reduction model. The training method can be applied to scenarios such as terminal photography, medical imaging, or surveillance video, to achieve image noise reduction.
  • The method includes: the terminal acquires an image sample to be denoised from a sample set, where the sample set includes a plurality of image samples to be denoised.
  • The terminal performs a first random downsampling process and a second random downsampling process on the image sample to be denoised to obtain a first sub-image and a second sub-image respectively, where the first sub-image and the second sub-image have the same resolution.
  • That is, the terminal performs random downsampling twice on the same image sample to be denoised to obtain the first sub-image and the second sub-image.
  • The first sub-image and the second sub-image are different images; their resolutions are the same, and both are lower than the resolution of the image sample to be denoised.
  • The random downsampling process refers to randomly sampling pixels in the image sample to be denoised based on a set sampling method, and piecing the sampled pixels together to obtain a sub-image whose resolution is smaller than that of the image sample to be denoised.
  • The first random downsampling process and the second random downsampling process sample pixels in the same manner.
  • Since the first random downsampling process and the second random downsampling process are two independent random pixel-sampling processes, the first sub-image obtained by the first process and the second sub-image obtained by the second process are, with high probability, not the same.
  • The terminal inputs the first sub-image into a noise reduction model to obtain a first target image, where the noise reduction model includes, but is not limited to, a learning-based model such as a convolutional neural network, or a noise reduction model based on sparse feature expression.
  • The terminal obtains a first loss function according to the first target image and the second sub-image, where the first loss function is used to indicate the difference between the first target image and the second sub-image.
  • The terminal trains the noise reduction model according to at least the first loss function until a model training condition is met, and a target noise reduction model is obtained.
  • The model training condition means that the first loss function obtained by the terminal is smaller than a preset threshold.
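The training loop described above can be sketched as follows. This is a deliberately toy illustration: the "model" is a single learnable gain w (so the output is w times the input) in place of the patent's neural network, and the cell size, learning rate and threshold are our assumptions, not values from the source.

```python
import numpy as np

def train_denoiser(noisy, threshold=1e-6, lr=0.1, max_steps=500, seed=0):
    """Toy training loop: fit a single gain w so that w * sub1 matches sub2,
    the paired sub-image, stopping once the first loss falls below `threshold`
    (the patent's training condition) or max_steps is reached."""
    rng = np.random.default_rng(seed)
    h, w_img = noisy.shape
    H, W = h // 2, w_img // 2
    # one random 2x2-cell sub-sampling pair (two different pixels per cell)
    flat = noisy[:H * 2, :W * 2].reshape(H, 2, W, 2).swapaxes(1, 2).reshape(H, W, 4)
    i1 = rng.integers(0, 4, size=(H, W))
    i2 = (i1 + rng.integers(1, 4, size=(H, W))) % 4        # never equal to i1
    rows, cols = np.arange(H)[:, None], np.arange(W)[None, :]
    sub1, sub2 = flat[rows, cols, i1], flat[rows, cols, i2]

    w, loss = 0.0, np.inf
    for _ in range(max_steps):
        loss = np.mean((w * sub1 - sub2) ** 2)             # first loss
        if loss < threshold:                               # training condition
            break
        w -= lr * 2.0 * np.mean((w * sub1 - sub2) * sub1)  # gradient step
    return w, loss
```

Because the two sub-images carry independent noise realizations of the same scene, minimizing this loss drives the model toward reproducing the scene rather than the noise.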
  • In this way, a sub-image sample pair is obtained by performing random downsampling twice on the image sample to be denoised; one sub-image of the pair is used as the input of the noise reduction model, and the other sub-image is used as the expected output of the noise reduction model, thereby realizing the training of the noise reduction model.
  • Thus the noise reduction model can be trained based on noisy images, without obtaining the clean image corresponding to each noisy image, which reduces the training difficulty of the noise reduction model.
  • In one possible implementation, the terminal performs a first random downsampling process and a second random downsampling process on the image sample to be denoised to obtain a first sub-image and a second sub-image respectively.
  • Specifically, the terminal divides the image sample to be denoised into M image units, each of which includes n*n pixels.
  • The terminal performs a first random selection of pixels in each of the M image units to obtain M first pixels.
  • The terminal pieces the M first pixels together, according to the positions in the image to be denoised of the image units corresponding to the M first pixels, to obtain the first sub-image.
  • The terminal performs a second random selection of pixels in each of the M image units to obtain M second pixels.
  • The terminal pieces the M second pixels together, according to the positions in the image to be denoised of the image units corresponding to the M second pixels, to obtain the second sub-image.
  • In one possible implementation, the terminal performing the second random selection of pixels in each of the M image units includes: the terminal acquires n*n-1 target pixels in each of the M image units, where the n*n-1 target pixels are the pixels in each image unit that were not selected during the first random selection of pixels.
  • The terminal then performs a random selection of pixels among the n*n-1 target pixels in each of the M image units to obtain M second pixels; the M second pixels are all different from the M first pixels.
  • That is, when the terminal performs the second random selection of pixels for each image unit, it first determines the pixels in that image unit that were not selected as the first pixel, i.e., the n*n-1 target pixels. The terminal then randomly selects one pixel from these n*n-1 target pixels as the second pixel. In this way, the first pixel and the second pixel selected in each image unit are necessarily different pixels. For example, each pixel in the first sub-image and the second sub-image shown in FIG. 5 is a different pixel.
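The cell-wise two-pixel selection described above can be sketched in a few lines of NumPy. This is a hedged sketch: the function name, the cell size n, and the modular index trick for avoiding the first pick are our illustrative choices, not from the patent.

```python
import numpy as np

def subsample_pair(img, n=2, seed=None):
    """Divide `img` (H x W) into n x n image units and pick two *different*
    pixels from every unit: `first` is the first random selection, and
    `second` is drawn from the remaining n*n-1 target pixels by shifting
    the first index by a nonzero offset modulo n*n."""
    rng = np.random.default_rng(seed)
    h, w = img.shape
    H, W = h // n, w // n                          # sub-image resolution
    units = img[:H * n, :W * n].reshape(H, n, W, n).swapaxes(1, 2)
    flat = units.reshape(H, W, n * n)              # (H, W, n*n) pixels per unit
    first = rng.integers(0, n * n, size=(H, W))
    second = (first + rng.integers(1, n * n, size=(H, W))) % (n * n)
    rows, cols = np.arange(H)[:, None], np.arange(W)[None, :]
    return flat[rows, cols, first], flat[rows, cols, second]
```

With 2x2 units the resolution is halved in each dimension, consistent with both sub-images sharing one resolution lower than the source image.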
  • In this way, the first sub-image and the second sub-image obtained by the random downsampling processes are two completely independent images; that is, there is no strong correlation between the first sub-image and the second sub-image, so that two images of the same scene with random noise can be better simulated based on the first sub-image and the second sub-image, thereby improving the training effect of the noise reduction model.
  • In one possible implementation, the terminal performing a random selection of pixels among the n*n-1 target pixels in each of the M image units includes: among the n*n-1 target pixels in each image unit, randomly selecting a second pixel adjacent to the pixel chosen in the first random selection, to obtain M second pixels, such that each of the M second pixels is adjacent to its corresponding first pixel.
  • That is, the terminal may first determine, among the n*n-1 target pixels, those target pixels that are adjacent to the first pixel selected in the first random selection. The terminal then randomly selects one of these adjacent target pixels as the second pixel. In this way, the first pixel and the second pixel selected in each image unit are necessarily adjacent pixels.
  • By selecting a target pixel adjacent to the first pixel as the second pixel, a higher similarity between the second sub-image and the first sub-image can be ensured; that is, two images of the same scene with random noise can be better simulated based on the first sub-image and the second sub-image, thereby improving the training effect of the noise reduction model.
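For 2x2 image units, the adjacency constraint can be expressed with a small lookup table of in-cell 4-neighbours. Again an illustrative sketch under our own naming; the patent does not prescribe this exact mechanism.

```python
import numpy as np

# Row-major pixel indices inside a 2x2 unit:  0 1
#                                             2 3
# Each pixel has exactly two 4-neighbours within the unit.
ADJ = np.array([[1, 2], [0, 3], [0, 3], [1, 2]])

def subsample_pair_adjacent(img, seed=None):
    """Like the basic sampler, but the second pixel of each 2x2 unit is
    always horizontally or vertically adjacent to the first pick."""
    rng = np.random.default_rng(seed)
    h, w = img.shape
    H, W = h // 2, w // 2
    flat = img[:H * 2, :W * 2].reshape(H, 2, W, 2).swapaxes(1, 2).reshape(H, W, 4)
    first = rng.integers(0, 4, size=(H, W))
    second = ADJ[first, rng.integers(0, 2, size=(H, W))]   # one of two neighbours
    rows, cols = np.arange(H)[:, None], np.arange(W)[None, :]
    return flat[rows, cols, first], flat[rows, cols, second]
```

Constraining the two picks to neighbouring positions keeps the paired sub-images closely aligned, which is exactly the similarity property the text argues for.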
  • In one possible implementation, the method further includes: the terminal inputs the image sample to be denoised into the noise reduction model to obtain a second target image.
  • The terminal performs down-sampling on the second target image based on the sampling positions of the pixels in the first random down-sampling process to obtain a first sub-target image. That is, each pixel of the first sub-target image is sampled from the same position in the source image (before downsampling) as the corresponding pixel of the first sub-image.
  • The terminal performs down-sampling on the second target image based on the sampling positions of the pixels in the second random down-sampling process to obtain a second sub-target image.
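Re-applying recorded sampling positions to the denoised full image can be sketched as follows. The sketch assumes 2x2 units and an `idx` array of in-unit indices saved from the earlier random selection; the names are ours.

```python
import numpy as np

def subsample_at(img, idx):
    """Sub-sample `img` at previously recorded 2x2-unit pixel positions.
    `idx` is an (H, W) array of in-unit indices (0..3) saved from the
    first or second random downsampling; applying it to the denoised
    full image yields the matching sub-target image."""
    H, W = idx.shape
    flat = img[:H * 2, :W * 2].reshape(H, 2, W, 2).swapaxes(1, 2).reshape(H, W, 4)
    rows, cols = np.arange(H)[:, None], np.arange(W)[None, :]
    return flat[rows, cols, idx]
```

Using the same `idx` for both the noisy input and the denoised output guarantees the pixel positions match, which is the property the second loss relies on.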
  • The terminal obtains a second loss function according to the first target image, the second sub-image, the first sub-target image and the second sub-target image.
  • The terminal trains the noise reduction model according to at least the first loss function and the second loss function.
  • In this way, the noise reduction model can be further constrained, so that it does not output an over-smoothed image merely because the pixels corresponding to the first sub-image and the second sub-image occupy different positions in the image sample to be denoised.
  • Avoiding an over-smoothed image means that the high-frequency detail information in the noisy image is protected from being removed by the noise reduction model, so that the denoised image still retains the high-frequency detail information.
  • In one possible implementation, the terminal training the noise reduction model according to at least the first loss function and the second loss function includes: the terminal trains the noise reduction model according to at least the first loss function, a first weight coefficient, the second loss function and a second weight coefficient.
  • the first weight coefficient is used to indicate the weight of the first loss function
  • the second weight coefficient is used to indicate the weight of the second loss function.
  • Generally, the stronger the noise reduction intensity, the less noise remains, but the more high-frequency detail of the image is lost. Therefore, in practical applications, the first weight coefficient and/or the second weight coefficient can be adjusted to achieve a balance between noise reduction strength and the loss of high-frequency detail.
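One plausible form of the combined objective can be written out directly; the exact formula of the second loss below is our assumption (it mirrors the widely used "denoise-then-subsample versus subsample-then-denoise" consistency residual), not a quotation from the patent.

```python
import numpy as np

def first_loss(first_target, sub2):
    """Difference between the model output on sub-image 1 and sub-image 2."""
    return np.mean((first_target - sub2) ** 2)

def second_loss(first_target, sub2, sub_target1, sub_target2):
    """Assumed consistency term: the residual of the sub-image path should
    match the residual of the full-image path (denoise the full image
    first, then sub-sample at the same positions)."""
    return np.mean(((first_target - sub2) - (sub_target1 - sub_target2)) ** 2)

def total_loss(first_target, sub2, sub_target1, sub_target2, w1=1.0, w2=1.0):
    """Weighted sum; raising w2 strengthens the constraint that protects
    high-frequency detail, while w1 controls raw denoising strength."""
    return (w1 * first_loss(first_target, sub2)
            + w2 * second_loss(first_target, sub2, sub_target1, sub_target2))
```

Tuning w1 against w2 is the balance between noise reduction strength and detail preservation that the text describes.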
  • the noise reduction model includes a learning-based noise reduction model such as a convolutional neural network or a noise reduction model based on sparse feature expression.
  • A second aspect of the present application provides an image denoising method, including: acquiring an image to be denoised; and inputting the image to be denoised into a target denoising model to obtain a denoised image.
  • The target noise reduction model is obtained by training a noise reduction model based on at least a first loss function. The first loss function is obtained based on a first target image and a second sub-image, and is used to indicate the difference between the first target image and the second sub-image. The first target image is obtained by inputting a first sub-image into the noise reduction model. The first sub-image and the second sub-image are obtained by performing a first random downsampling process and a second random downsampling process, respectively, on an image sample to be denoised; the first sub-image and the second sub-image have the same resolution.
  • In one possible implementation, the first sub-image is obtained from M first pixels, where the M first pixels are obtained by dividing the image sample to be denoised into M image units and performing a first random selection of pixels in each of the M image units.
  • The second sub-image is obtained from M second pixels, where the M second pixels are obtained by performing a second random selection of pixels in each of the M image units.
  • In one possible implementation, the M second pixels are obtained by performing a random selection among the n*n-1 target pixels in each of the M image units, where the n*n-1 target pixels are the pixels not selected during the first random selection in each image unit; the M second pixels are all different from the M first pixels.
  • In one possible implementation, the M second pixels are obtained by randomly selecting, among the n*n-1 target pixels in each of the M image units, a second pixel adjacent to the pixel selected in the first random selection; each of the M second pixels is adjacent to its corresponding first pixel.
  • In one possible implementation, the target noise reduction model is obtained by training the noise reduction model based on at least the first loss function and a second loss function.
  • The second loss function is obtained according to the first target image, the second sub-image, a first sub-target image and a second sub-target image.
  • The first sub-target image is obtained by down-sampling a second target image based on the sampling positions of the pixels in the first random down-sampling process.
  • The second sub-target image is obtained by down-sampling the second target image based on the sampling positions of the pixels in the second random down-sampling process.
  • The second target image is obtained by inputting the image sample to be denoised into the noise reduction model.
  • In one possible implementation, the target noise reduction model is obtained by training based on at least the first loss function, a first weight coefficient, the second loss function and a second weight coefficient.
  • the first weight coefficient is used to indicate the weight of the first loss function
  • the second weight coefficient is used to indicate the weight of the second loss function.
  • the noise reduction model includes a convolutional neural network or a noise reduction model based on sparse feature expression.
  • A third aspect of the present application provides a model training device, including: an acquisition unit and a processing unit.
  • The acquiring unit is configured to acquire image samples to be denoised from a sample set, where the sample set includes a plurality of image samples to be denoised;
  • The processing unit is configured to perform a first random downsampling process and a second random downsampling process on the image samples to be denoised to obtain a first sub-image and a second sub-image respectively, where the first sub-image and the second sub-image have the same resolution;
  • The processing unit is further configured to input the first sub-image into a noise reduction model to obtain a first target image;
  • The processing unit is further configured to obtain a first loss function according to the first target image and the second sub-image, where the first loss function is used to indicate the difference between the first target image and the second sub-image;
  • The processing unit is further configured to train the noise reduction model according to at least the first loss function to obtain a target noise reduction model.
  • In one possible implementation, the processing unit is further configured to: divide the image sample to be denoised into M image units, each of which includes n*n pixels; perform a first random selection of pixels in each of the M image units to obtain M first pixels, and obtain the first sub-image from the M first pixels; and perform a second random selection of pixels in each of the M image units to obtain M second pixels, and obtain the second sub-image from the M second pixels.
  • In one possible implementation, the acquiring unit is further configured to acquire n*n-1 target pixels in each of the M image units, where the n*n-1 target pixels are the pixels not selected during the first random selection in each image unit; the processing unit is further configured to perform a random selection among the n*n-1 target pixels in each of the M image units to obtain M second pixels, the M second pixels being different from the M first pixels.
  • In one possible implementation, the processing unit is further configured to randomly select, among the n*n-1 target pixels in each of the M image units, a second pixel adjacent to the pixel selected in the first random selection, to obtain M second pixels, each of which is adjacent to its corresponding first pixel.
  • In one possible implementation, the processing unit is further configured to: input the image sample to be denoised into the noise reduction model to obtain a second target image; perform down-sampling on the second target image based on the sampling positions of the pixels in the first random down-sampling process to obtain a first sub-target image; perform down-sampling on the second target image based on the sampling positions of the pixels in the second random down-sampling process to obtain a second sub-target image; obtain a second loss function according to the first target image, the second sub-image, the first sub-target image and the second sub-target image; and train the noise reduction model according to at least the first loss function and the second loss function.
  • In one possible implementation, the processing unit is further configured to train the noise reduction model according to at least the first loss function, a first weight coefficient, the second loss function and a second weight coefficient; the first weight coefficient is used to indicate the weight of the first loss function, and the second weight coefficient is used to indicate the weight of the second loss function.
  • the noise reduction model includes a convolutional neural network or a noise reduction model based on sparse feature expression.
  • A fourth aspect of the present application provides an image noise reduction device, including: an acquisition unit and a processing unit.
  • The acquisition unit is configured to acquire an image to be denoised;
  • The processing unit is configured to input the image to be denoised into a target denoising model to obtain a denoised image;
  • The target denoising model is obtained by training a noise reduction model based on at least a first loss function. The first loss function is obtained based on a first target image and a second sub-image, and is used to indicate the difference between the first target image and the second sub-image. The first target image is obtained by inputting a first sub-image into the noise reduction model. The first sub-image and the second sub-image are obtained by performing a first random downsampling process and a second random downsampling process, respectively, on an image sample to be denoised; the first sub-image and the second sub-image have the same resolution.
  • In one possible implementation, the first sub-image is obtained from M first pixels, where the M first pixels are obtained by dividing the image sample to be denoised into M image units and performing a first random selection of pixels in each of the M image units.
  • The second sub-image is obtained from M second pixels, where the M second pixels are obtained by performing a second random selection of pixels in each of the M image units.
  • In one possible implementation, the M second pixels are obtained by performing a random selection among the n*n-1 target pixels in each of the M image units, where the n*n-1 target pixels are the pixels not selected during the first random selection in each image unit; the M second pixels are all different from the M first pixels.
  • In one possible implementation, the M second pixels are obtained by randomly selecting, among the n*n-1 target pixels in each of the M image units, a second pixel adjacent to the pixel selected in the first random selection; each of the M second pixels is adjacent to its corresponding first pixel.
  • In one possible implementation, the target noise reduction model is obtained by training the noise reduction model based on at least the first loss function and a second loss function.
  • The second loss function is obtained according to the first target image, the second sub-image, a first sub-target image and a second sub-target image.
  • The first sub-target image is obtained by down-sampling a second target image based on the sampling positions of the pixels in the first random down-sampling process.
  • The second sub-target image is obtained by down-sampling the second target image based on the sampling positions of the pixels in the second random down-sampling process.
  • The second target image is obtained by inputting the image sample to be denoised into the noise reduction model.
  • In one possible implementation, the target noise reduction model is obtained by training based on at least the first loss function, a first weight coefficient, the second loss function and a second weight coefficient; the first weight coefficient is used to indicate the weight of the first loss function, and the second weight coefficient is used to indicate the weight of the second loss function.
  • the noise reduction model includes a convolutional neural network or a noise reduction model based on sparse feature expression.
  • A fifth aspect of the present application provides a model training device, which may include a processor coupled to a memory; the memory stores program instructions, and the method described in the first aspect is implemented when the program instructions stored in the memory are executed by the processor.
  • A sixth aspect of the present application provides an image noise reduction device, which may include a processor coupled to a memory; the memory stores program instructions, and the method described in the second aspect is implemented when the program instructions stored in the memory are executed by the processor.
  • For the steps performed by the processor in each possible implementation of the second aspect, reference may be made to the second aspect for details, which are not repeated here.
  • A seventh aspect of the present application provides a computer-readable storage medium storing a computer program which, when run on a computer, causes the computer to execute the method described in the first aspect.
  • An eighth aspect of the present application provides a computer-readable storage medium storing a computer program which, when run on a computer, causes the computer to execute the method described in the second aspect.
  • A ninth aspect of the present application provides a circuit system, where the circuit system includes a processing circuit configured to perform the method of the first aspect or the second aspect.
  • A tenth aspect of the present application provides a computer program that, when run on a computer, causes the computer to execute the method described in the first aspect or the second aspect.
  • An eleventh aspect of the present application provides a chip system, where the chip system includes a processor configured to support a server or a threshold value obtaining device in implementing the functions involved in the above aspects, for example, sending or processing the data and/or information involved in the above methods.
  • the chip system further includes a memory for storing necessary program instructions and data of the server or the communication device.
  • the chip system may be composed of chips, or may include chips and other discrete devices.
  • FIG. 1 is a schematic structural diagram of an artificial intelligence main frame provided by an embodiment of the present application.
  • FIG. 2a is an image processing system provided by an embodiment of the present application.
  • FIG. 2b is another image processing system provided by an embodiment of the present application.
  • FIG. 2c is a schematic diagram of a related device for image processing provided by an embodiment of the present application.
  • FIG. 3a is a schematic diagram of the architecture of a system 100 provided by an embodiment of the present application.
  • FIG. 3b is a schematic diagram of image noise reduction provided by an embodiment of the present application.
  • FIG. 4 is a schematic flowchart of a training method for a noise reduction model provided by an embodiment of the present application.
  • FIG. 5 is a schematic diagram of a random downsampling process provided by an embodiment of the present application.
  • FIG. 6 is a schematic diagram of another random downsampling process provided by an embodiment of the present application.
  • FIG. 7 is a schematic diagram of another random downsampling process provided by an embodiment of the present application.
  • FIG. 8 is a schematic diagram of a training process of a noise reduction model provided by an embodiment of the present application.
  • FIG. 9 is a schematic diagram of a training process of another noise reduction model provided by an embodiment of the present application.
  • FIG. 10 is a schematic flowchart of an experiment for determining the noise reduction effect of a noise reduction model provided by an embodiment of the application;
  • FIG. 11 is a schematic diagram illustrating a comparison of noise reduction effects of various noise reduction methods provided in an embodiment of the present application.
  • FIG. 12 is another schematic flowchart of an experiment for determining the noise reduction effect of a noise reduction model provided by an embodiment of the application;
  • FIG. 13 is a schematic diagram of a comparison of noise reduction indicators of multiple noise reduction methods provided in an embodiment of the present application.
  • FIG. 14 is a schematic diagram illustrating the comparison of noise reduction effects of various noise reduction methods provided in an embodiment of the present application.
  • FIG. 15 is a schematic flowchart of an image noise reduction method provided by an embodiment of the present application.
  • FIG. 16 is a schematic structural diagram of a model training apparatus provided by an embodiment of the application.
  • FIG. 17 is a schematic structural diagram of an image noise reduction apparatus provided by an embodiment of the application.
  • FIG. 18 is a schematic structural diagram of an execution device provided by an embodiment of the present application.
  • FIG. 19 is a schematic structural diagram of a training device provided by an embodiment of the application.
  • FIG. 20 is a schematic structural diagram of a chip provided by an embodiment of the present application.
  • Figure 1 shows a schematic structural diagram of the main frame of artificial intelligence.
  • The above-mentioned artificial intelligence framework is explained along two dimensions: the "intelligent information chain" (horizontal axis) and the "IT value chain" (vertical axis).
  • the "intelligent information chain” reflects a series of processes from data acquisition to processing. For example, it can be the general process of intelligent information perception, intelligent information representation and formation, intelligent reasoning, intelligent decision-making, intelligent execution and output. In this process, data has gone through the process of "data-information-knowledge-wisdom".
  • the "IT value chain" reflects the value that artificial intelligence brings to the information technology industry, from the underlying infrastructure of artificial intelligence and information (the provision and processing of technology implementations) to the industrial ecological process of the system.
  • the infrastructure provides computing power support for the artificial intelligence system, realizes communication with the outside world, and is supported by the basic platform. Communication with the outside world is realized through sensors; computing power is provided by smart chips (hardware acceleration chips such as CPUs, NPUs, GPUs, ASICs, and FPGAs); the basic platform includes distributed computing frameworks and network-related platform guarantees and support, and can include cloud storage and computing, interconnection networks, and the like. For example, sensors communicate with the outside world to obtain data, and these data are provided to the smart chips in the distributed computing system provided by the basic platform for calculation.
  • the data on the upper layer of the infrastructure is used to represent the data sources in the field of artificial intelligence.
  • the data involves graphics, images, voice, and text, as well as IoT data from traditional devices, including business data from existing systems and sensory data such as force, displacement, liquid level, temperature, and humidity.
  • Data processing usually includes data training, machine learning, deep learning, search, reasoning, decision-making, etc.
  • machine learning and deep learning can perform symbolic and formalized intelligent information modeling, extraction, preprocessing, training, etc. on data.
  • Reasoning refers to the process of simulating the intelligent human reasoning method in a computer or intelligent system, using formalized information to carry out machine thinking and solve problems according to a reasoning control strategy; the typical functions are search and matching.
  • Decision-making refers to the process of making decisions after intelligent information is reasoned, usually providing functions such as classification, sorting, and prediction.
  • some general capabilities can be formed based on the results of data processing, such as algorithms or a general system, such as translation, text analysis, computer vision processing, speech recognition, image identification, etc.
  • Intelligent products and industry applications refer to the products and applications of artificial intelligence systems in various fields; they are the encapsulation of the overall artificial intelligence solution and the productization of intelligent information decision-making to achieve landing applications. The application fields mainly include intelligent terminals, intelligent transportation, smart healthcare, autonomous driving, smart city, etc.
  • FIG. 2a is an image processing system provided by an embodiment of the present application, where the image processing system includes a user equipment and a data processing device.
  • the user equipment includes smart terminals such as mobile phones, personal computers, or information processing centers.
  • the user equipment is the initiator of image processing. As the initiator of the image enhancement request, the user usually initiates the request through the user equipment.
  • the above-mentioned data processing device may be a device or server with data processing functions, such as a cloud server, a network server, an application server, and a management server.
  • the data processing device receives the image enhancement request from the intelligent terminal through the interactive interface, and then performs image processing in the form of machine learning, deep learning, search, reasoning, and decision-making through the memory for storing data and the processor for data processing.
  • the memory in the data processing device may be a general term, including local storage and a database for storing historical data.
  • the database may be on the data processing device or on other network servers.
  • the user equipment can receive instructions from the user. For example, the user equipment can acquire an image input/selected by the user, and then initiate a request to the data processing device, so that the data processing device executes an image noise reduction application on the image obtained by the user equipment, thereby obtaining a corresponding processing result for the image.
  • the user equipment may acquire an image input by the user, and then initiate an image noise reduction request to the data processing device, so that the data processing device performs image noise reduction on the image, thereby obtaining a noise-reduced image.
  • the data processing device may execute the training method of the noise reduction model according to the embodiment of the present application.
  • Fig. 2b is another image processing system provided by the embodiment of the application.
  • the user equipment is directly used as the data processing device, and the user equipment can directly obtain the input from the user and process it directly by the hardware of the user equipment itself. The specific process is similar to that of FIG. 2a; reference may be made to the above description, and details are not repeated here.
  • the user equipment can receive instructions from the user, for example, the user equipment can acquire an image selected by the user in the user equipment, and then the user equipment itself executes an image processing application for the image, Thereby, the corresponding processing result for the image is obtained.
  • the user equipment itself can execute the training method of the noise reduction model according to the embodiment of the present application.
  • FIG. 2c is a schematic diagram of a related device for image processing provided by an embodiment of the present application.
  • the user equipment in the above-mentioned FIGS. 2a and 2b may specifically be the local device 301 or the local device 302 in FIG. 2c, and the data processing device in FIG. 2a may specifically be the execution device 210 in FIG. 2c, wherein the data storage system 250 may store the data to be processed by the execution device 210; the data storage system 250 may be integrated on the execution device 210, or may be set on the cloud or on other network servers.
  • the processors in FIGS. 2a and 2b may perform data training/machine learning/deep learning through a neural network model or another model (e.g., a support vector machine-based model), and use the finally trained or learned model to execute an image processing application on the image, so as to obtain the corresponding processing result.
  • FIG. 3a is a schematic diagram of the architecture of a system 100 provided by an embodiment of the present application.
  • the execution device 110 is configured with an input/output (I/O) interface 112 for performing data interaction with external devices.
  • the user may input data to the I/O interface 112 through the client device 140, and the input data may include: various tasks to be scheduled, callable resources, and other parameters in this embodiment of the present application.
  • the execution device 110 may call the data storage system 150
  • the data, codes, etc. in the corresponding processing can also be stored in the data storage system 150 .
  • the I/O interface 112 returns the processing results to the client device 140 for provision to the user.
  • the training device 120 can generate corresponding target models/rules based on different training data for different goals or tasks, and the corresponding target models/rules can be used to achieve the above-mentioned goals or complete the above-mentioned tasks, thereby providing the user with the desired result.
  • the training data may be stored in the database 130 and come from training samples collected by the data collection device 160 .
  • the user can manually specify input data, which can be operated through the interface provided by the I/O interface 112 .
  • the client device 140 can automatically send the input data to the I/O interface 112 . If the user's authorization is required to request the client device 140 to automatically send the input data, the user can set corresponding permissions in the client device 140 .
  • the user can view the result output by the execution device 110 on the client device 140, and the specific presentation form can be a specific manner such as display, sound, and action.
  • the client device 140 can also be used as a data collection terminal to collect the input data of the input I/O interface 112 and the output result of the output I/O interface 112 as new sample data as shown in the figure, and store them in the database 130 .
  • the I/O interface 112 may directly store the input data input into the I/O interface 112 and the output result of the I/O interface 112, as shown in the figure, in the database 130 as new sample data.
  • FIG. 3a is only a schematic diagram of a system architecture provided by an embodiment of the present application, and the positional relationship between the devices, devices, modules, etc. shown in the figure does not constitute any limitation.
  • the data storage system 150 is an external memory relative to the execution device 110; in other cases, the data storage system 150 may also be placed in the execution device 110.
  • the neural network can be obtained by training according to the training device 120.
  • An embodiment of the present application further provides a chip, where the chip includes a neural network processor NPU.
  • the chip can be set in the execution device 110 as shown in Fig. 3a to complete the calculation work of the calculation module 111.
  • the chip can also be set in the training device 120 as shown in FIG. 3a to complete the training work of the training device 120 and output the target model/rule.
  • the neural network processor NPU is mounted on the main central processing unit (CPU) (host CPU) as a co-processor, and tasks are allocated by the main CPU.
  • the core part of the NPU is an arithmetic circuit, and the controller controls the arithmetic circuit to extract the data in the memory (weight memory or input memory) and perform operations.
  • the arithmetic circuit includes multiple processing units (process engines, PEs).
  • the arithmetic circuit is a two-dimensional systolic array.
  • the arithmetic circuit may also be a one-dimensional systolic array or other electronic circuit capable of performing mathematical operations such as multiplication and addition.
  • the arithmetic circuit is a general-purpose matrix processor.
  • the operation circuit fetches the data corresponding to the matrix B from the weight memory, and buffers it on each PE in the operation circuit.
  • the arithmetic circuit fetches the data of matrix A from the input memory and performs matrix operation on matrix B, and stores the partial result or final result of the matrix in an accumulator.
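As a rough illustration of this data flow (a sketch only: the function name and software loop are ours, and a real NPU pipelines this across processing engines in hardware rather than looping), the matrix operation with buffered weight tiles and an accumulator can be modeled as:

```python
import numpy as np

def tiled_matmul(A, B, tile=2):
    """Toy model of the arithmetic-circuit flow described above: each
    B-tile plays the role of the weight data buffered on the PEs, each
    A-tile is streamed in from the input memory, and the partial
    results are summed in an accumulator."""
    m, k = A.shape
    k2, n = B.shape
    assert k == k2
    C = np.zeros((m, n))                  # the accumulator
    for p in range(0, k, tile):           # walk the shared dimension in tiles
        A_tile = A[:, p:p + tile]         # fetched from 'input memory'
        B_tile = B[p:p + tile, :]         # fetched from 'weight memory'
        C += A_tile @ B_tile              # accumulate the partial result
    return C

A = np.arange(12, dtype=float).reshape(3, 4)
B = np.arange(8, dtype=float).reshape(4, 2)
C = tiled_matmul(A, B)                    # equals A @ B
```

The sum of partial tile products equals the full matrix product, which is why the accumulator can hold either a partial or the final result.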
  • the vector calculation unit can further process the output of the arithmetic circuit, such as vector multiplication, vector addition, exponential operation, logarithmic operation, size comparison and so on.
  • the vector computing unit can be used for network computation of non-convolutional/non-FC layers in neural networks, such as pooling, batch normalization, local response normalization, etc.
  • the vector computation unit can store the processed output vector to a unified buffer.
  • the vector computing unit may apply a nonlinear function to the output of the arithmetic circuit, such as a vector of accumulated values, to generate activation values.
  • the vector computation unit generates normalized values, merged values, or both.
  • the vector of processed outputs can be used as activation input to an operational circuit, such as for use in subsequent layers in a neural network.
  • Unified memory is used to store input data as well as output data.
  • the direct memory access controller (DMAC) transfers the input data in the external memory to the input memory and/or the unified memory, stores the weight data in the external memory into the weight memory, and stores the data in the unified memory into the external memory.
  • the bus interface unit (BIU) is used to realize the interaction between the main CPU, the DMAC and the instruction fetch memory through the bus.
  • the instruction fetch buffer connected to the controller is used to store the instructions used by the controller;
  • the controller is used for invoking the instructions cached in the instruction fetch buffer to control the working process of the operation accelerator.
  • the unified memory, input memory, weight memory and instruction fetch memory are all on-chip memories
  • the external memory is the memory outside the NPU
  • the external memory can be double data rate synchronous dynamic random access memory (double data rate synchronous dynamic random access memory, DDR SDRAM), high bandwidth memory (HBM), or other readable and writable memory.
  • a neural network can be composed of neural units, and a neural unit can refer to an operation unit that takes xs and an intercept 1 as inputs, where the output of the operation unit can be: h_{W,b}(x) = f(W^T x) = f(∑_{s=1}^{n} W_s·x_s + b), where s = 1, 2, ..., n, n is a natural number greater than 1, Ws is the weight of xs, and b is the bias of the neural unit.
  • f is the activation function of the neural unit, which is used to introduce nonlinear characteristics into the neural network to convert the input signal of the neural unit into an output signal. The output signal of the activation function can be used as the input of the next convolutional layer.
  • the activation function can be, for example, a sigmoid function.
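For illustration, the output of a single neural unit with a sigmoid activation can be computed directly (a minimal sketch; the function name and the example numbers are ours):

```python
import math

def neural_unit(xs, ws, b):
    """Output of one neural unit: f(sum_s Ws*xs + b), with a sigmoid
    activation f, one common choice as noted in the text."""
    z = sum(w * x for w, x in zip(ws, xs)) + b   # weighted sum plus bias
    return 1.0 / (1.0 + math.exp(-z))            # sigmoid activation

# With inputs [0.5, -1.0], weights [2.0, 1.0] and bias 0, the weighted
# sum z is 0.5*2.0 + (-1.0)*1.0 + 0 = 0, so the sigmoid output is 0.5.
y = neural_unit(xs=[0.5, -1.0], ws=[2.0, 1.0], b=0.0)
```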
  • a neural network is a network formed by connecting many of the above single neural units together, that is, the output of one neural unit can be the input of another neural unit.
  • the input of each neural unit can be connected with the local receptive field of the previous layer to extract the features of the local receptive field, and the local receptive field can be an area composed of several neural units.
  • the work of each layer in a neural network can be described mathematically. From the physical level, the work of each layer in the neural network can be understood as a transformation from the input space to the output space (that is, from the row space to the column space of a matrix) through five operations on the input space (the set of input vectors). The five operations include: 1. dimension raising/lowering; 2. enlarging/reducing; 3. rotation; 4. translation; 5. "bending". The operations 1, 2, and 3 are completed by W·x, the operation 4 is completed by +b, and the operation 5 is realized by a(); in other words, each layer computes a(W·x + b).
  • W is the weight vector, and each value in the vector represents the weight value of a neuron in the neural network of this layer.
  • the vector W determines the space transformation from the input space to the output space described above, that is, the weight W of each layer controls how the space is transformed.
  • the purpose of training the neural network is to finally obtain the weight matrix of all layers of the trained neural network (the weight matrix formed by the vectors W of many layers). Therefore, the training process of the neural network is essentially learning the way to control the spatial transformation, and more specifically, learning the weight matrix.
  • the neural network can use the error back propagation (BP) algorithm to correct the size of the parameters in the initial neural network model during the training process, so that the reconstruction error loss of the neural network model becomes smaller and smaller.
  • the input signal is passed forward until the output, where an error loss is generated, and the parameters in the initial neural network model are updated by back-propagating the error loss information, so that the error loss converges.
  • the back-propagation algorithm is a back-propagation movement dominated by error loss, aiming to obtain the parameters of the optimal neural network model, such as the weight matrix.
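A minimal sketch of this forward-pass / back-propagation loop for a single linear unit with a squared-error loss (illustrative only; the variable names, learning rate, and toy target are ours, not part of the patent):

```python
def train_step(w, b, x, target, lr=0.1):
    """One forward pass plus one backward (error back propagation)
    update for a single linear unit y = w*x + b."""
    y = w * x + b                 # forward: signal passed to the output
    loss = (y - target) ** 2      # error loss at the output
    dy = 2.0 * (y - target)       # dL/dy, propagated backward
    w -= lr * dy * x              # dL/dw = dL/dy * x (chain rule)
    b -= lr * dy                  # dL/db = dL/dy
    return w, b, loss

# Repeated updates make the error loss converge toward zero.
w, b = 0.0, 0.0
losses = []
for _ in range(50):
    w, b, loss = train_step(w, b, x=1.5, target=3.0)
    losses.append(loss)
```

Each step shrinks the output error by a fixed factor here, so the recorded losses decrease monotonically toward zero, which is the convergence behavior the text describes.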
  • Image enhancement refers to processing the brightness, color, contrast, saturation, dynamic range, etc. of an image to meet certain specific indicators. Simply put, in the process of image processing, by purposefully emphasizing the overall or local characteristics of the image, the original unclear image becomes clear or some interesting features are emphasized, and the difference between the features of different objects in the image is enlarged. It can improve the image quality and enrich the amount of image information, and can strengthen the image interpretation and recognition effect to meet the needs of some special analysis.
  • image enhancement may include, but is not limited to, image super-resolution reconstruction, image noise reduction, image dehazing, image deblurring, and image contrast enhancement.
  • Image denoising method refers to the application of algorithms to remove noise from the observed noisy image, retain the more important details in the image, and reconstruct the corresponding clean image.
  • the reconstructed image looks clear and clean, and the image quality can be improved through image noise reduction, which is beneficial to the subsequent image processing processes such as image classification or object recognition.
  • image noise reduction is a very important step.
  • many image noise reduction methods have been applied in academia and industry, and image noise reduction is one of the research hotspots in the current field of image processing technology.
  • FIG. 3b is a schematic diagram of an image noise reduction provided by an embodiment of the present application. As shown in Figure 3b, the noise in the image can be eliminated as much as possible through image noise reduction, and the quality of the image can be improved.
  • the neural network training method provided in the embodiment of the present application involves the processing of images, and can be specifically applied to data processing methods such as data training, machine learning, deep learning, etc., to symbolize and form the training data (such as the images in the present application).
  • intelligent information modeling, extraction, preprocessing, training, etc. to finally obtain a trained image processing model;
  • the image noise reduction method provided in the embodiment of the present application can use the above-mentioned trained noise reduction model to convert the input data ( The image to be processed in this application) is input into the trained image processing model to obtain output data (such as the target image in this application).
  • the training method of the noise reduction model and the image noise reduction method provided by the embodiments of the present application are inventions based on the same concept, and can also be understood as two parts in a system, or two parts of an overall process Stages: such as model training stage and model application stage.
  • image noise reduction methods refer to applying algorithms to remove noise from observed noisy images, preserve image details, and reconstruct corresponding clean images. By extracting features from noisy images, removing noise and filling details by means of image prior knowledge, image self-similarity and complementary information of multi-frame images, generating corresponding high-quality images is a common idea in image noise reduction research.
  • image noise reduction technology has important application value in the fields of mobile phone photography, high-definition TV, surveillance equipment, satellite imagery and medical imaging.
  • image noise reduction algorithms are mainly divided into traditional filtering methods and learning-based methods.
  • Traditional filtering methods include, for example, mean filtering, Gaussian filtering, bilateral filtering, and three-dimensional block-matching filtering (Block-Matching and 3D filtering, BM3D).
  • the traditional filtering method mainly uses the similarity of the image to reduce the random noise in the image through filtering and smoothing, while retaining the high-frequency signal of the image itself.
  • most effective image denoising methods are a combination of multiple methods, which can not only preserve edge information well, but also remove noise in images.
  • the median filtering method and the wavelet filtering method are combined to perform image filtering to achieve a better image noise reduction effect.
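As an illustration of combining filters, the sketch below pairs a median stage with a simple mean-smoothing stage (the smoothing stage is a stand-in for the wavelet stage mentioned above; the 3x3 window, reflection padding, and function names are our assumptions):

```python
import numpy as np

def _windows3x3(img):
    """All nine 3x3-shifted views of the image (reflect-padded edges)."""
    padded = np.pad(img, 1, mode='reflect')
    h, w = img.shape
    return np.stack([padded[i:i + h, j:j + w]
                     for i in range(3) for j in range(3)])

def median3x3(img):
    """3x3 median filter: strong against impulse ('salt') noise."""
    return np.median(_windows3x3(img), axis=0)

def mean3x3(img):
    """3x3 mean (box) filter, used here as a simple smoothing stage."""
    return np.mean(_windows3x3(img), axis=0)

# Combining the two: the median stage removes the impulse pixel that
# smoothing alone would only smear across its neighbors, and the mean
# stage then suppresses the remaining random noise.
noisy = np.ones((8, 8))
noisy[3, 3] = 100.0                       # an impulse-noise pixel
denoised = mean3x3(median3x3(noisy))      # back to an all-ones image
```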
  • the traditional noise reduction algorithm finds the rules from the noise image, and then performs the corresponding noise reduction processing.
  • the regularity cannot be found from the noise image itself, the traditional filtering method performs poorly, and it is difficult to achieve a good image noise reduction effect, thus limiting the further improvement of image noise reduction performance.
  • the learning-based image denoising method is a data-driven method, which realizes image denoising by learning the regularity of large-scale noisy image-clean image pairs by the denoising model, so as to achieve a good image denoising effect.
  • Image denoising methods based on deep neural networks can generate cleaner, clearer images with fewer artifacts, further promoting the development of image denoising techniques.
  • the denoising effect of current supervised learning-based image denoising networks largely depends on the training data used to train the network, i.e. noisy image-clean image pairs.
  • the noise image-clean image pair refers to a noise image and a clean and noise-free image corresponding to the noise image.
  • a noisy image-clean image pair is a pair of images in the same scene. The scene information included in the noisy image and the clean image is the same. The difference is that the clean image does not include noise.
  • medical images are generated by instruments that generate rays or electromagnetic waves and act on the human body. Due to the reasons of the instrument itself, the images obtained by the instrument often contain a lot of random noise, that is, it is often difficult to obtain clean and noise-free images by the instrument. In addition, due to the particularity of medical images, the process of taking medical images is likely to have a certain negative impact on the human body. Therefore, in practical applications, it is often difficult to obtain multiple medical images of the same part.
  • this embodiment proposes a method of training the noise reduction model only by using the noise image.
  • Sub-image sample pairs are obtained by performing two random downsampling processes on the image sample to be denoised; one sub-image in the sub-image sample pair is used as the input of the noise reduction model, and the other sub-image in the sub-image sample pair is used as the expected output of the noise reduction model, thereby realizing the training of the noise reduction model.
  • the training of the noise reduction model can be realized based on the noise image, and there is no need to obtain the clean image corresponding to the noise image, which reduces the training difficulty of the noise reduction model.
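The idea can be sketched in NumPy with a deliberately tiny one-parameter "model" (everything here, including the 2x2 cells, the scalar model, and the learning rate, is our simplification rather than the patent's network; the point is that both the input and the expected output are drawn from the same noisy image, with no clean image involved):

```python
import numpy as np

rng = np.random.default_rng(0)

def random_pair_subsample(img, rng):
    """In every 2x2 cell, pick two different pixels: one goes into the
    first sub-image (model input), the other into the second sub-image
    (expected output). A simplified stand-in for the two random
    downsampling passes."""
    h, w = img.shape
    sub1 = np.empty((h // 2, w // 2))
    sub2 = np.empty((h // 2, w // 2))
    for i in range(0, h, 2):
        for j in range(0, w, 2):
            cell = img[i:i + 2, j:j + 2].ravel()
            k1, k2 = rng.choice(4, size=2, replace=False)
            sub1[i // 2, j // 2] = cell[k1]
            sub2[i // 2, j // 2] = cell[k2]
    return sub1, sub2

# A noisy observation of a constant scene; no clean image is ever used.
noisy = 1.0 + 0.1 * rng.standard_normal((16, 16))

scale = 0.0                                   # the toy model's parameter
for step in range(200):
    inp, target = random_pair_subsample(noisy, rng)
    pred = scale * inp                        # model output for sub-image 1
    grad = 2.0 * np.mean((pred - target) * inp)   # d(MSE)/d(scale)
    scale -= 0.5 * grad
# The two sub-images share the signal but not the noise, so the best
# prediction reproduces the underlying signal: scale approaches 1.
```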
  • the training method of the noise reduction model provided by the embodiment of the present application can be applied to a terminal, where the terminal is a device capable of performing model training.
  • the terminal can be, for example, a personal computer (PC), a notebook computer, a server, a mobile phone, a tablet computer, a mobile internet device (MID), a wearable device, a virtual reality (VR) device, an augmented reality (AR) device, a wireless terminal in industrial control, a wireless terminal in self driving, a wireless terminal in remote medical surgery, a wireless terminal in smart grid, a wireless terminal in transportation safety, a wireless terminal in smart city, a wireless terminal in smart home, etc.
  • the terminal may be a device running an Android system, an iOS system, a Windows system, or another system.
  • the training method of the noise reduction model provided in this embodiment can be applied to scenes that need to convert a noisy image into a clean image, such as terminal device photography, video supervision, and medical image processing.
  • the training method of the noise reduction model provided by this embodiment can train a noise reduction model for image noise reduction, so as to achieve image noise reduction.
  • the shortcoming of terminal device photography mainly lies in the image quality in dark-light scenes. In scenes such as dark light or at night, because the ambient light is too weak, the photos taken by the terminal device will have very obvious noise, which greatly affects the image quality. It should be understood that since the noise in the image captured by the terminal device is caused by poor ambient light, in this scenario it is difficult to obtain a clean image of the scene through the terminal device, that is, it is difficult to obtain a noisy image-clean image pair.
  • Imaging instruments include, for example, Magnetic Resonance Imaging (MRI) and Computed Tomography (CT) instruments. Imaging instruments introduce noise in the process of imaging, and larger noise will reduce the signal-to-noise ratio and seriously affect subsequent image processing. Therefore, in medical image processing, denoising the noisy images generated by imaging instruments is an indispensable link.
  • FIG. 4 is a schematic flowchart of a training method for a noise reduction model provided by an embodiment of the present application. As shown in FIG. 4 , a method for training a noise reduction model provided by an embodiment of the present application includes the following steps 401-405.
  • Step 401: Obtain image samples to be denoised from a sample set, where the sample set includes a plurality of image samples to be denoised.
  • the terminal may obtain a sample set including a plurality of image samples to be noise-reduced.
  • the image samples to be denoised in the sample set may be images collected by the terminal in different scenarios, and these images collected by the terminal all include noise.
  • the image samples to be denoised included in the sample set are medical images of the same type; for example, the multiple image samples to be denoised in the sample set are all MRI images, or the multiple image samples to be denoised in the sample set are all CT images.
  • the plurality of image samples to be denoised included in the sample set are images of the same type, and these image samples to be denoised all include noise.
  • the terminal may acquire a corresponding sample set according to the scene to which the training method of the noise reduction model is applied, and train the noise reduction model based on the image samples to be noise-reduced in the sample set.
  • Step 402: Perform a first random downsampling process and a second random downsampling process on the image sample to be denoised to obtain a first sub-image and a second sub-image respectively, where the first sub-image and the second sub-image have the same resolution.
  • After obtaining the image sample to be denoised, the terminal performs two random downsampling processes on the same image sample to be denoised to obtain the first sub-image and the second sub-image respectively.
  • the first sub-image and the second sub-image are different images; the resolutions of the first sub-image and the second sub-image are the same, and both are lower than the resolution of the image sample to be denoised.
  • the random downsampling process refers to randomly sampling pixels in the image sample to be denoised based on a set sampling method, and piecing together the pixels obtained by sampling to obtain a sub-image with a resolution smaller than that of the image sample to be denoised.
  • the above-mentioned first random downsampling process and the second random downsampling process perform pixel sampling in the same manner.
  • the first random downsampling process and the second random downsampling process are two independent random pixel sampling processes, so with high probability the first sub-image obtained by the first random downsampling process and the second sub-image obtained by the second random downsampling process are not the same.
  • Mode 1: The image sample to be denoised is divided into multiple image units, and one pixel is randomly sampled in each image unit, so as to obtain a sub-image composed of the multiple sampled pixels.
  • the terminal evenly divides the image sample to be denoised into M image units, and each of the M image units includes n*n pixels. Then, the terminal performs the first random selection of pixels in each of the M image units, that is, randomly selects one pixel from the n*n pixels of each image unit as the first pixel, thereby obtaining M first pixels. According to the M first pixels obtained by random sampling, the terminal pieces together the M first pixels according to the positions, in the image to be denoised, of the image units corresponding to the M first pixels, to obtain the first sub-image.
  • the terminal performs a second random selection of pixels in each of the M image units, that is, randomly selects a pixel again as the second pixel among the n*n pixels of each image unit, thereby obtaining M second pixels.
  • the terminal pieces together the M second pixels according to the positions, in the image to be denoised, of the image units corresponding to the M second pixels, to obtain the second sub-image.
  • FIG. 5 is a schematic diagram of a random downsampling process provided by an embodiment of the present application. As shown in FIG. 5 , it is assumed that the resolution of the image to be denoised is 4*4, that is, the image to be denoised is composed of 4*4 pixels. In FIG. 5, the image to be denoised is divided into 4 image units, and each image unit includes 2*2 pixels.
  • the pixel in the upper left corner of the first image unit (i.e., pixel 1A) is randomly sampled as the first pixel;
  • the pixel in the upper right corner of the second image unit (i.e., pixel 1B) is randomly sampled as the first pixel;
  • the pixel in the lower right corner of the third image unit (i.e., pixel 1C) is randomly sampled as the first pixel;
  • the pixel in the lower right corner of the fourth image unit (i.e., pixel 1D) is randomly sampled as the first pixel.
  • According to the positions, in the image to be denoised, of the image units corresponding to the sampled pixels (i.e., pixel 1A, pixel 1B, pixel 1C, and pixel 1D), the sampled pixels can be pieced together to obtain the first sub-image.
• The pixel in the lower right corner of the first image unit (i.e., pixel 2A) is randomly sampled as the second pixel;
• the pixel in the lower left corner of the second image unit (i.e., pixel 2B) is randomly sampled as the second pixel;
• the pixel in the upper left corner of the third image unit (i.e., pixel 2C) is randomly sampled as the second pixel;
• the pixel in the upper right corner of the fourth image unit (i.e., pixel 2D) is randomly sampled as the second pixel.
• The sampled pixels (i.e., pixel 2A, pixel 2B, pixel 2C, and pixel 2D) can then be pieced together according to the positions, in the image to be denoised, of their corresponding image units to obtain the second sub-image.
• Optionally, the terminal may determine n*n-1 target pixels in each of the M image units, where the n*n-1 target pixels are the pixels in each image unit that were not selected when the first random selection of pixels was performed. Then, the terminal performs a random selection of pixels among the n*n-1 target pixels in each of the M image units to obtain M second pixels, and the M second pixels are all different from the M first pixels.
• That is, when the terminal performs the second random selection of pixels for each image unit, it first determines the pixels in the image unit that were not selected as the first pixel, that is, the n*n-1 target pixels. Then, the terminal randomly selects one pixel from the n*n-1 target pixels as the second pixel. In this way, the first pixel and the second pixel selected in each image unit are necessarily different pixels. For example, each pair of corresponding pixels in the first sub-image and the second sub-image shown in FIG. 5 are different pixels.
• In this way, the first sub-image and the second sub-image obtained by the random downsampling process are two completely independent images, that is, there is no strong correlation between them, so that two images with random noise in the same scene can be better simulated based on the first sub-image and the second sub-image, thereby improving the training effect of the noise reduction model.
• Otherwise, each second pixel in the second sub-image may happen to be the same as the corresponding first pixel, that is, the second sub-image may turn out to be identical to the first sub-image.
• Optionally, when the terminal performs the random selection of pixels among the n*n-1 target pixels in each of the M image units, the terminal may randomly select, from the n*n-1 target pixels in each image unit, a second pixel adjacent to the pixel selected in the first random selection, thereby obtaining M second pixels, where each of the M second pixels is adjacent to the corresponding first pixel.
• That is, the terminal may first determine, among the n*n-1 target pixels, the target pixels adjacent to the first pixel selected in the first random selection of pixels. Then, the terminal randomly selects one of these adjacent target pixels as the second pixel. In this way, the first pixel and the second pixel selected in each image unit are necessarily adjacent pixels.
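The constraint that the second pixel be adjacent to the first can be sketched per image unit as follows (illustrative names; 4-adjacency is assumed because the FIG. 6 description excludes diagonally opposite picks):

```python
# Illustrative sketch: pick a first pixel in an n*n cell, then pick a second
# pixel uniformly among the in-cell pixels that are 4-adjacent to it, so the
# two picks are always different and always neighbours.
import numpy as np

def pick_adjacent_pair(n=2, rng=None):
    """Return ((r1, c1), (r2, c2)) with the second position adjacent to the first."""
    rng = np.random.default_rng() if rng is None else rng
    r1, c1 = rng.integers(0, n, size=2)           # first random selection
    neighbours = [(r1 + dr, c1 + dc)
                  for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1))
                  if 0 <= r1 + dr < n and 0 <= c1 + dc < n]
    r2, c2 = neighbours[rng.integers(0, len(neighbours))]
    return (r1, c1), (r2, c2)
```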
  • FIG. 6 is a schematic diagram of another random downsampling process provided by an embodiment of the present application.
• As shown in FIG. 6, each second pixel (i.e., pixel 2A, pixel 2B, pixel 2C, and pixel 2D) is adjacent to the corresponding first pixel (i.e., pixel 1A, pixel 1B, pixel 1C, and pixel 1D); there is no case in which a second pixel is located diagonally opposite its first pixel.
• For each image unit, adjacent pixels within the unit may be considered to have higher similarity. Therefore, by selecting a target pixel adjacent to the first pixel as the second pixel, a higher similarity between the obtained second sub-image and the first sub-image can be ensured, that is, the first sub-image and the second sub-image better simulate two images with random noise in the same scene, thereby improving the training effect of the noise reduction model.
• Method 2: determine the pixels that need to be sampled in the image sample to be denoised based on the downsampling ratio, and for each such pixel randomly sample one pixel from the multiple pixels around it, so as to obtain a sub-image composed of the multiple sampled pixels.
• For an image to be denoised with a resolution of W*H, it may be predetermined that the image is downsampled into sub-images with a resolution of (W-x)*(H-x), that is, the resolutions of the first sub-image and the second sub-image are both (W-x)*(H-x).
• First, the terminal determines the pixels in the image to be denoised that are to be subjected to the sampling processing; after the sampling processing is performed, each such pixel is used to produce one pixel in the sub-image.
• Then, the terminal performs the sampling processing on each pixel that needs to be sampled, that is, it randomly samples one pixel from the multiple pixels around that pixel (or from those surrounding pixels together with the pixel itself), and the sampled pixel is used as the corresponding pixel in the sub-image; finally, a sub-image composed of the multiple sampled pixels is obtained.
• Specifically, the terminal may randomly select a row from the 1st row to the x-th row as the starting row for performing the sampling processing of the pixels, and successively perform the sampling processing on the pixels of the H-x rows starting from this row; similarly, the terminal may randomly select a column from the 1st column to the x-th column as the starting column, and successively perform the sampling processing on the pixels of the W-x columns starting from this column.
  • FIG. 7 is a schematic diagram of another random downsampling process provided by an embodiment of the present application.
  • the image to be denoised is an image with a resolution of 4*4, and the image to be denoised includes 16 pixels, which are pixel 1 to pixel 16 respectively.
  • the image to be denoised needs to be down-sampled into a sub-image with a resolution of (4-2)*(4-2), that is, the resolution of the sub-image is 2*2.
• Based on the downsampling ratio, it is determined that pixel 6, pixel 7, pixel 10, and pixel 11 in the image to be denoised are the sampling targets, and random downsampling processing is performed on these four pixels.
• For pixel 6, the multiple pixels surrounding pixel 6, i.e., pixel 1, pixel 2, pixel 3, pixel 5, pixel 7, pixel 9, pixel 10, and pixel 11, are determined. Then, one of these surrounding pixels is randomly selected as a pixel in the sub-image. For example, as shown in FIG. 7, when pixel 6 is the target of the random downsampling process, pixel 2 is selected as the pixel in the sub-image.
• Optionally, one pixel may instead be randomly selected from the multiple pixels around pixel 6 together with pixel 6 itself, that is, pixel 6 itself may also be selected as a pixel in the sub-image.
  • pixel 7 a plurality of pixels surrounding pixel 7, ie, pixel 2, pixel 3, pixel 4, pixel 6, pixel 8, pixel 10, pixel 11, and pixel 12, are determined. Then, one pixel is randomly selected as the pixel in the sub-image among the plurality of pixels around the pixel 7 . For example, as shown in FIG. 7, when the random downsampling process is performed with the pixel 7 as the target, the pixel 8 is selected as the pixel in the sub-image.
  • a plurality of pixels surrounding pixel 10, ie, pixel 5, pixel 6, pixel 7, pixel 9, pixel 11, pixel 13, pixel 14, and pixel 15, are determined. Then, a pixel is randomly selected among the plurality of pixels around pixel 10 as a pixel in the sub-image. For example, as shown in FIG. 7, when the random downsampling process is performed with the pixel 10 as the target, the pixel 9 is selected as the pixel in the sub-image.
  • a plurality of pixels surrounding pixel 11, ie, pixel 6, pixel 7, pixel 8, pixel 10, pixel 12, pixel 14, pixel 15, and pixel 16, are determined. Then, one pixel is randomly selected as a pixel in the sub-image among the plurality of pixels around the pixel 11 . For example, as shown in FIG. 7 , when the pixel 11 is the target of performing the random downsampling process, the pixel 12 is selected as the pixel in the sub-image.
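Method 2 can be sketched for the FIG. 7 example as follows (an illustrative sketch; the symmetric-border placement of the target pixels and the function name are simplifying assumptions, whereas the patent describes a randomly chosen starting row and column):

```python
# Illustrative sketch of Method 2: downsample a square image to (side - x)
# in each dimension by replacing each interior target pixel with a random
# pixel from its 8-neighbourhood (the target itself is excluded here; the
# text notes it may optionally be included).
import numpy as np

def method2_downsample(image, x=2, rng=None):
    rng = np.random.default_rng() if rng is None else rng
    h, w = image.shape
    out = np.empty((h - x, w - x), dtype=image.dtype)
    off = x // 2  # symmetric border: an assumption for this sketch
    for i in range(h - x):
        for j in range(w - x):
            ti, tj = i + off, j + off            # target pixel position
            while True:                          # draw one of the 8 neighbours
                di, dj = rng.integers(-1, 2, size=2)
                if (di, dj) != (0, 0):
                    break
            out[i, j] = image[ti + di, tj + dj]
    return out
```

For the 4*4 example, the targets are the four interior pixels (pixels 6, 7, 10, and 11 in the figure's numbering) and the result is a 2*2 sub-image.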
  • Step 403 Input the first sub-image into a noise reduction model to obtain a first target image.
  • the terminal may input the first sub-image into the noise reduction model to obtain the first target image output by the noise reduction model.
  • the noise reduction model is used to perform noise reduction processing on the input image, and output a clean image after noise reduction.
• The noise reduction model is a learnable model. Before the noise reduction model is trained, its noise reduction ability is poor; during the training process, the noise reduction module inside the model is continuously optimized and its noise reduction ability is continuously enhanced; after training, the noise reduction model can be used to achieve image noise reduction.
  • the noise reduction model includes, but is not limited to, a learning-based model such as a convolutional neural network or a noise reduction model based on sparse feature expression, and this embodiment does not specifically limit the specific structure of the noise reduction model.
  • Step 404 Obtain a first loss function according to the first target image and the second sub-image, where the first loss function is used to indicate the difference between the first target image and the second sub-image.
  • the first sub-image and the second sub-image may be used as a sample pair for training.
  • the first sub-image is used as the input value of the noise reduction model
  • the second sub-image is used as the expected output value of the noise reduction model.
• After training with such a sample pair, a first loss function can be obtained, and the first loss function is used to indicate the difference between the first target image and the second sub-image. Based on the first loss function, the noise reduction model can be guided to learn the noise reduction capability.
  • the first sub-image is g 1 (y)
  • the second sub-image is g 2 (y)
  • the noise reduction model is f.
  • the image output by the noise reduction model f is f(g 1 (y)).
• A possible example of the first loss function is shown in Equation 1:
loss1 = |f(g 1 (y)) - g 2 (y)| p     (Equation 1)
(reconstructed from the variable descriptions below: the difference is taken pixel-wise and the absolute difference is raised to the power p, then accumulated over all pixels).
  • loss1 is the first loss function
  • f(g 1 (y)) is the image output by the noise reduction model f after inputting the first sub-image g 1 (y)
  • g 2 (y) is the second sub-image
• p is the power; p may take a value such as 1 or 2.
  • f(g 1 (y))-g 2 (y) can be understood as subtracting the value of the pixel in the f(g 1 (y)) image from the value of the corresponding pixel in the g 2 (y) image .
  • the value of a pixel in an image can be 0-255 or 0-4095, and different values are used to represent different colors.
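Equation 1 can be computed as follows (a NumPy sketch; averaging the per-pixel terms, rather than summing them, is an assumption about the reduction):

```python
# Sketch of Equation 1: loss1 = |f(g1(y)) - g2(y)|^p, reduced over pixels.
# f_g1_y is the denoiser output for the first sub-image, g2_y the second
# sub-image; both are arrays of the same shape.
import numpy as np

def loss1(f_g1_y, g2_y, p=2):
    """First loss: mean p-th-power difference between the denoised first
    sub-image and the second sub-image."""
    return np.mean(np.abs(f_g1_y - g2_y) ** p)
```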
  • Step 405 Train the noise reduction model at least according to the first loss function to obtain a target noise reduction model.
  • the terminal may train the noise reduction model based on the value of the first loss function.
• Specifically, the process in which the terminal trains the noise reduction model based on the first loss function includes: the terminal adjusts the parameters in the noise reduction model based on the value of the first loss function, and repeatedly executes steps 401-405 so as to continuously adjust the parameters of the noise reduction model; when the obtained first loss function is less than a preset threshold, it can be determined that the model training conditions have been met, and the target noise reduction model is obtained.
  • the target noise reduction model is the trained noise reduction model, which can be used for subsequent image noise reduction.
  • the terminal may also train the noise reduction model based on the first loss function and the second loss function.
  • the terminal may input the image samples to be noise reduction into the noise reduction model to obtain the second target image output by the noise reduction model. Based on the sampling positions of the pixels in the first random downsampling process, the terminal performs downsampling processing on the second target image to obtain a first sub-target image. Based on the sampling positions of pixels in the second random downsampling process, the terminal performs downsampling processing on the second target image to obtain a second sub-target image.
• That the terminal performs downsampling processing on the second target image based on the sampling positions of the pixels in the first random downsampling process means that, when the terminal performs downsampling processing on the second target image, the downsampling is performed in the same manner as the first random downsampling process.
• For example, the first random downsampling process is as follows: pixels are collected at the upper left corner of the first image unit, at the upper right corner of the second image unit, at the lower right corner of the third image unit, and at the lower left corner of the fourth image unit. Then, based on the same method as the first random downsampling process, the terminal may divide the second target image into four image units and, at the corresponding position in each image unit, collect the corresponding pixels in the four image units of the second target image, so as to obtain the first sub-target image. That is to say, each pair of corresponding pixels in the first sub-target image and the first sub-image have the same positions in their respective source images before downsampling.
  • the terminal may record the position of each first pixel collected during the first random downsampling process. Then, the terminal determines the corresponding position in the second target image based on the position of each first pixel, and collects pixels at the corresponding position in the second target image as pixels on the first sub-target image. In addition, the terminal may also generate a first sampler for performing the first random downsampling process and a second sampler for performing the second random downsampling process, wherein the first sampler and the second sampler The way in which downsampling is done is fixed.
  • the terminal can perform down-sampling processing on the second target image based on the first sampler to obtain the first sub-target image; and the terminal can perform down-sampling processing on the second target image based on the second sampler to obtain the second sub-target image .
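A sampler whose downsampling manner is fixed, as described above, might be implemented by drawing the in-cell offsets once and reusing them (an illustrative sketch; the class name and interface are assumptions):

```python
# Illustrative fixed sampler: the random in-cell offsets are drawn once at
# construction, so applying the sampler to the noisy image y and later to the
# denoised image f(y) collects pixels at exactly the same positions.
import numpy as np

class FixedCellSampler:
    def __init__(self, h, w, n=2, rng=None):
        rng = np.random.default_rng() if rng is None else rng
        self.n = n
        # one (row, col) offset per n*n cell, fixed for the sampler's lifetime
        self.offsets = rng.integers(0, n, size=(h // n, w // n, 2))

    def __call__(self, image):
        n = self.n
        gh, gw = self.offsets.shape[:2]
        out = np.empty((gh, gw), dtype=image.dtype)
        for i in range(gh):
            for j in range(gw):
                r, c = self.offsets[i, j]
                out[i, j] = image[i * n + r, j * n + c]
        return out
```

Two such samplers (g1 and g2) applied to y give the two sub-images, and the same two samplers applied to f(y) give the two sub-target images.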
  • the terminal obtains a second loss function according to the first target image, the second sub-image, the first sub-target image and the second sub-target image.
  • the second loss function is mainly used to indicate the difference between the first sub-target image and the second sub-target image.
  • the first sub-image is g 1 (y)
  • the second sub-image is g 2 (y)
  • the noise reduction model is f.
  • the image output by the noise reduction model f is f(g 1 (y)).
• the first sub-target image is g 1 (f(y))
• the second sub-target image is g 2 (f(y)).
• A possible example of the second loss function is shown in Equation 2:
loss2 = |f(g 1 (y)) - g 2 (y) - (g 1 (f(y)) - g 2 (f(y)))| p     (Equation 2)
(reconstructed from the variable descriptions below; the absolute difference is taken pixel-wise and raised to the power p).
  • loss2 is the second loss function
  • f(g 1 (y)) is the image output by the noise reduction model f after inputting the first sub-image g 1 (y)
  • g 2 (y) is the second sub-image
• g 1 (f(y)) is the first sub-target image
• g 2 (f(y)) is the second sub-target image
  • p is a power number.
• The second loss function is a correction term for the inconsistent positions of the two sub-images during the downsampling process. Its purpose is to constrain the noise reduction model so that the model does not produce an excessively smooth image due to the inconsistent positions, in the image sample to be denoised, of the corresponding pixels of the two sub-images.
• Specifically, the clean images corresponding to the first sub-image and the second sub-image obtained after the two random downsampling processes are not exactly the same; the clean images corresponding to the two sub-images only stand in a neighboring-similarity relationship.
• Without the correction term, the noise reduction ability learned by the noise reduction model will not only erase the noise in the image, but also erase the difference between the ground-truth images (i.e., clean images) corresponding to the two sub-images. That is to say, the noise reduction ability of the model becomes too strong, so that high-frequency detail information in the noisy image is processed as noise. Therefore, on the basis of the first loss function, introducing the second loss function can protect the high-frequency detail information in the noisy image from being removed by the noise reduction model and ensure the resolution of the image after noise reduction.
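Under the same symbols, the second loss function can be computed as follows (a NumPy sketch; the mean reduction is an assumption):

```python
# Sketch of Equation 2: the residual f(g1(y)) - g2(y) is compared against the
# position-correction term g1(f(y)) - g2(f(y)).
import numpy as np

def loss2(f_g1_y, g2_y, g1_f_y, g2_f_y, p=2):
    """Second loss: p-th-power difference between the denoising residual and
    the sub-target-image residual."""
    return np.mean(np.abs((f_g1_y - g2_y) - (g1_f_y - g2_f_y)) ** p)
```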
  • the terminal trains the noise reduction model at least according to the first loss function and the second loss function, until the noise reduction model satisfies the model training conditions.
  • the terminal may add the first loss function and the second loss function to obtain a total loss function, and then train the noise reduction model based on the total loss function.
  • the first loss function and the second loss function also have corresponding weight coefficients, that is, the terminal calculates the total loss function based on the weight coefficients corresponding to the first loss function and the second loss function. That is, the terminal trains the noise reduction model according to at least the first loss function, the first weight coefficient, the second loss function and the second weight coefficient; wherein the first weight coefficient is used to indicate The weight of the first loss function, and the second weight coefficient is used to indicate the weight of the second loss function.
• The process of calculating the total loss function based on the first loss function, the first weight coefficient, the second loss function, and the second weight coefficient may be as shown in Equation 3:
loss = w 1 *loss1 + w 2 *loss2     (Equation 3)
where w 1 is the first weight coefficient and w 2 is the second weight coefficient (a weighted sum is one possible form).
  • the first weight coefficient and/or the second weight coefficient may be adjusted according to actual noise reduction requirements.
• Generally, the larger the second weight coefficient, the weaker the noise reduction intensity: more noise remains, but less of the high-frequency detail of the image is lost. The smaller the second weight coefficient, the stronger the noise reduction intensity: more noise is removed, but more high-frequency detail is lost.
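The weighted combination of the two losses might look like the following (a trivial sketch; the default weights are assumptions):

```python
# Sketch of Equation 3: total loss as a weighted sum of the two losses.
# w1 and w2 are the first and second weight coefficients.
def total_loss(l1, l2, w1=1.0, w2=1.0):
    return w1 * l1 + w2 * l2
```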
• Based on a large number of such sample pairs, a noise reduction model with noise reduction capability can be obtained by training. It is understandable that when there are few samples used to train the noise reduction model, what the model actually learns is the transformation relationship between the two noise patterns. When there are enough samples, since the noise in the samples is random and fluctuates around the true value, then, from the perspective of minimizing the loss function, the noise reduction model can learn clean, noise-free images. This is because it is impossible for the noise reduction model to learn a fixed noise transformation law that minimizes the loss function, since the noise is always random.
  • the noise reduction model minimizes the loss function by converting the randomly fluctuating noise in the sample into an intermediate value, and the intermediate value of a large amount of noise actually happens to be the true value corresponding to the noise. Therefore, a noise reduction model with noise reduction capability can be trained based on the sample pairs composed of noise images.
• Consider the indoor temperature estimation problem: suppose a series of observed temperatures (y1, y2, y3, y4, ...) is obtained in a certain way. Then, obtaining the real temperature z based on the series of observed temperatures can be modeled as Equation 4:
z = argmin z E y {L(z, y)}     (Equation 4)
• Here, argmin represents minimizing the expectation of the loss function L, which is a function of z; the loss is minimized over all observed temperatures y. Therefore, L can be regarded as depending on a probability distribution with y as a variable, and minimizing the loss over all samples is actually minimizing the mean of the losses of all samples. If the distance measure is the L2 norm, then the real temperature z is exactly the mean of y. The temperature y(i) of any individual observation does not matter at all; the goal of the optimization only concerns the mean of all observations.
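The claim that the L2-optimal z is the mean of the observations can be checked numerically (the temperature values below are illustrative, not from the patent):

```python
# Numerical check of the temperature example: with the L2 norm as the
# distance measure, the z that minimises the summed loss over all
# observations is their mean.
import numpy as np

y = np.array([19.7, 20.1, 20.4, 19.8])          # observed temperatures
candidates = np.linspace(19.0, 21.0, 2001)      # candidate values of z
losses = [np.sum((y - z) ** 2) for z in candidates]
z_star = candidates[int(np.argmin(losses))]
# z_star coincides (to grid precision) with y.mean()
```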
• For the image noise reduction problem, suppose the input noisy image is x(i) and the output clear image is y(i). Then, the image noise reduction problem can be modeled as Equation 5:
θ = argmin θ E (x, y) {L(f θ (x), y)}     (Equation 5)
• In Equation 5, θ is actually the weight parameter of the noise reduction model. Also, x and y are not independent of each other, so Equation 5 can be transformed into Equation 6:
θ = argmin θ E x {E y|x {L(f θ (x), y)}}     (Equation 6)
• Similar to the temperature estimation problem, the expected output y can be replaced by a value y' that contains noise, as long as the expectation of y' equals the clean value. That is to say, taking a noisy image as the expected output value of the noise reduction model, a noise reduction model with noise reduction capability can still be trained.
  • random downsampling is performed twice on the same noise image to obtain two sub-images.
  • These two sub-images can be used to simulate two images with random noise in the same scene. Therefore, the training of the noise reduction model can also be realized based on the sub-images obtained by downsampling.
• Compared with directly collecting pairs of noisy images, the present embodiment obtains sub-images as sample pairs by performing random downsampling twice on the same noisy image, so the sample pairs are readily available. In some special scenes or fields, it is in fact difficult to obtain two noisy images corresponding to the same clean image.
  • FIG. 8 is a schematic diagram of a training process of a noise reduction model provided by an embodiment of the application
  • FIG. 9 is a schematic diagram of a training process of another noise reduction model provided by an embodiment of the application.
  • the training process of the noise reduction model includes steps 801-809.
  • Step 801 select the noise image y from the sample set.
  • the terminal acquires a sample set including a large number of noisy images, and selects a noise image y that is not used for training in the sample set.
  • Step 802 construct a sampler g1 and a sampler g2.
  • the terminal may configure the sampler g1 and the sampler g2 in advance.
  • the sampler g1 is used to perform the first random downsampling process to downsample the noise image y.
  • the sampler g2 is used to perform a second random downsampling process to downsample the noise image y for a second time.
  • Step 803 perform down-sampling processing on the noise image y through the sampler g1 and the sampler g2 to obtain a noise sub-image g1(y) and a noise sub-image g2(y).
  • Step 804 the noise sub-image g1(y) is denoised by the denoising module to obtain a denoised image f(g1(y)).
  • the noise sub-image g1(y) can be input into the noise reduction module, and the noise sub-image g1(y) can be denoised by the noise reduction module to obtain the denoised image f (g1(y)).
  • the noise reduction module may be the above-mentioned learning-based noise reduction model, such as a convolutional neural network or a noise reduction model based on sparse feature expression.
  • Step 805 the noise image y is denoised by the denoising module to obtain the denoised image f(y).
  • Step 806 perform down-sampling processing on the denoised image f(y) by the sampler g1 and the sampler g2 to obtain the denoising sub-image g1(f(y)) and the denoising sub-image g2(f(y)) .
• That is, the sampler g1 and the sampler g2 are used to downsample the denoised image f(y), to obtain the denoising sub-image g1(f(y)) and the denoising sub-image g2(f(y)), respectively.
  • Step 807 calculate the loss function.
  • the loss function is calculated from the first loss function, the second loss function, and the weight coefficients corresponding to the first loss function and the second loss function.
  • the first loss function may be calculated based on the denoised image f(g1(y)) and the noise sub-image g2(y);
  • the second loss function may be calculated based on the denoised image f(g1(y)) , noise sub-image g2(y), noise-reduction sub-image g1(f(y)) and noise-reduction sub-image g2(f(y)).
  • Step 808 update the parameters of the noise reduction module.
  • the parameters of the noise reduction module can be adaptively updated based on the value of the loss function.
  • Step 809 determine whether the noise reduction module has converged.
• Based on the value of the loss function, it is judged whether the noise reduction module has converged, that is, whether the noise reduction module has met the training conditions. If the noise reduction module has converged, it is considered that the noise reduction module has been trained and can be output as a trained noise reduction module; if the noise reduction module has not converged, return to step 801 and continue to perform steps 801-808 to train the noise reduction module until it converges.
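Steps 801-809 can be illustrated end to end with a heavily simplified toy (the "denoiser" is a single learnable gain, the samplers are fixed even/odd pixel splits rather than random samplers, and all names and values are assumptions for illustration only):

```python
# Toy sketch of the training loop in steps 801-809, trained with loss1 only.
import numpy as np

rng = np.random.default_rng(0)
w = 0.0          # parameter of the toy "denoiser" f(x) = w * x (step 802-ish)
lr = 0.01
for step in range(300):
    clean = np.full(64, 5.0)
    y = clean + rng.normal(0.0, 1.0, size=64)   # step 801: draw a noisy sample
    g1y, g2y = y[0::2], y[1::2]                 # step 803: two sub-samplings
    pred = w * g1y                              # step 804: denoise g1(y)
    grad = np.mean(2.0 * (pred - g2y) * g1y)    # gradient of loss1 w.r.t. w
    w -= lr * grad                              # step 808: parameter update
# w settles near a value slightly below 1, i.e. it learns to shrink the
# random noise toward the underlying signal even though no clean target
# was ever used.
```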
  • this embodiment provides multiple experiments to evaluate the noise reduction effect of the noise reduction model trained by the training method based on the noise reduction model.
  • FIG. 10 is a schematic flowchart of an experiment for determining a noise reduction effect of a noise reduction model provided by an embodiment of the present application.
  • a batch of clean images without noise is obtained first, and then noise is randomly synthesized in the clean images to obtain a batch of noisy images.
• The noisy image is input into the noise reduction network trained with the above training method to obtain the output image, that is, the denoised image.
  • the index between the two is calculated to determine the gap between the denoised image and the clean image, so as to determine the noise reduction effect of the noise reduction network.
• Synthesizing noise on clean images in this way is a commonly used means of measuring a noise reduction method.
  • Noise-containing images are synthesized by artificially adding Gaussian noise and Poisson noise.
  • the ImageNet dataset is used as training data, which contains 50,000 high-definition pictures, covering most scenes of daily life.
• In addition, the Kodak image (Kodak) dataset, the low-complexity single-image super-resolution with non-negative neighbor embedding (Set14) dataset, and the Berkeley image segmentation (BSD300) dataset are selected as the test sets, to facilitate horizontal comparison with other noise reduction methods.
  • a U-Net network is built on the PyTorch platform as a noise reduction network.
  • the peak signal-to-noise ratio (PSNR) of each test image is calculated separately, and finally the average of the entire test set is calculated. PSNR.
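PSNR here is the standard definition (shown for 8-bit images; this is the usual formula, not copied from the patent):

```python
# Standard PSNR: 10 * log10(MAX^2 / MSE) between a reference image and a
# denoised image, with MAX = 255 for 8-bit images.
import numpy as np

def psnr(clean, denoised, max_val=255.0):
    mse = np.mean((clean.astype(np.float64) - denoised.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(max_val ** 2 / mse)
```

The test-set figure is then the mean of this value over all test images.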
  • the UNet network is used as the noise reduction network.
• Training the network: based on the constructed training set, noise reduction network, sampler, and loss function, the training method of the noise reduction model provided in this embodiment is used to train the noise reduction network to convergence.
  • FIG. 11 is a schematic diagram for comparison of noise reduction effects of various noise reduction methods provided in an embodiment of the present application.
• Among them, the Noise2Noise method is the method described above that needs to be trained on multiple noisy images corresponding to the same clean image. However, in practical applications, multiple noisy images corresponding to the same clean image are actually very difficult to obtain. Therefore, compared with the training method of the noise reduction model provided by the embodiments of the present application, the Noise2Noise method is difficult to implement in most scenarios.
• The above experiments obtain the noisy image and its corresponding clean image by synthesizing noise on the clean image.
  • an experiment will be performed on the training method of the noise reduction model provided by the embodiment of the present application by taking a scene where a terminal takes a picture as an example.
  • FIG. 12 is a schematic flowchart of another experiment for determining the noise reduction effect of the noise reduction model provided by the embodiment of the present application.
  • multiple noise images in the same scene are acquired through the camera sensor, that is, noise image 1 , noise image 2 . . . noise image N shown in FIG. 12 .
  • the multiple noise images may be collected by the camera in a scene with poor light, so the multiple noise images include obvious noise.
  • the index between the two is calculated to determine the gap between the denoised image and the clean image, so as to determine the noise reduction effect of the noise reduction network.
  • the clean image can be obtained by averaging the multiple noisy images. Since the camera sensor cannot obtain a clean image, by averaging multiple noisy images, an image that is as close to the real clean image as possible can be obtained.
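The averaging argument can be checked numerically (an illustrative sketch with synthetic Gaussian noise; values are assumptions):

```python
# Pseudo-clean reference: averaging many independent noisy shots of the same
# scene approaches the true clean image, since the noise mean tends to zero.
import numpy as np

rng = np.random.default_rng(0)
clean = np.full((8, 8), 100.0)
shots = [clean + rng.normal(0.0, 10.0, clean.shape) for _ in range(256)]
pseudo_clean = np.mean(shots, axis=0)
# the residual error shrinks roughly like sigma / sqrt(N)
```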
  • a Smartphone Image Denoising Dataset (SIDD) dataset constructed based on sensors of a real mobile phone is used.
  • the SIDD dataset is a real scene dataset collected by the camera sensor of a mobile phone in a dark scene, and its corresponding clean image is obtained by weighted average of multiple noisy images.
  • the UNet network model is used as the noise reduction network.
• Training the network: based on the constructed training set, noise reduction network, sampler, and loss function, the training method of the noise reduction model provided in this embodiment is used to train the noise reduction network to convergence.
• In addition, this method is compared horizontally with several current mainstream image noise reduction methods. Multiple models were trained based on the SIDD dataset, and the average PSNR and structural similarity index measure (SSIM) on both the SIDD validation data and the benchmark data were calculated. Finally, a comparison of the indicators and a comparison of the real noise reduction effects were carried out. The comparison results of the various methods are shown in FIG. 13 and FIG. 14; the higher the PSNR and SSIM, the better the noise reduction effect.
• FIG. 13 is a schematic diagram showing a comparison of noise reduction indicators of various noise reduction methods provided by an embodiment of the present application, and FIG. 14 is a schematic diagram showing a comparison of noise reduction effects of various noise reduction methods provided by an embodiment of the present application. It can be seen from FIG. 13 and FIG. 14 that the PSNR and SSIM of the training method of the noise reduction model provided by the embodiment of the present application are higher than those of the other methods, that is, the noise reduction effect is better.
  • FIG. 15 is a schematic flowchart of an image noise reduction method provided by an embodiment of the present application. As shown in FIG. 15 , the image noise reduction method includes steps 1501-1502.
  • Step 1501 acquiring an image to be denoised.
  • the image to be denoised is a noise image that contains noise and needs to be denoised in the actual application process.
• For example, the image to be denoised may be an image captured by the terminal, a medical image, or a surveillance image. This embodiment does not specifically limit the type of the image to be denoised.
  • Step 1502 Input the image to be denoised into a target denoising model to obtain a denoised image.
• The target noise reduction model is obtained by training a noise reduction model based on at least a first loss function. The first loss function is obtained based on a first target image and a second sub-image, and is used to indicate the difference between the first target image and the second sub-image. The first target image is obtained by inputting a first sub-image into the noise reduction model. The first sub-image and the second sub-image are obtained by performing a first random downsampling process and a second random downsampling process, respectively, on an image sample to be denoised, and the resolutions of the first sub-image and the second sub-image are the same.
  • the target noise reduction model is obtained by training based on the training method of the noise reduction model described in the above-mentioned embodiment. For the specific training process, reference may be made to the description of steps 401-405, which will not be repeated here.
  • The first sub-image is obtained according to M first pixels; the M first pixels are obtained by performing a first random selection of pixels in each of M image units after the image sample to be denoised is divided into the M image units.
  • The second sub-image is obtained according to M second pixels; the M second pixels are obtained by performing a second random selection of pixels in each of the M image units after the image sample to be denoised is divided into the M image units.
  • The M second pixels are obtained by performing a random selection of pixels among n*n-1 target pixels in each of the M image units; the n*n-1 target pixels are the pixels in each image unit that were not selected during the first random selection of pixels, and the M second pixels are all different from the M first pixels.
  • The M second pixels are obtained by randomly selecting, among the n*n-1 target pixels in each of the M image units, a second pixel adjacent to the pixel selected in the first random selection; each of the M second pixels is adjacent to the corresponding first pixel.
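  The neighbor-pixel selection described above can be sketched in plain code. This is only a minimal illustration, not the patented implementation; the function name, the `(row, col)` cell representation, and the restriction to 4-adjacency are assumptions (the embodiment only requires that the second pixel be "adjacent").

```python
import random

def sample_neighbor_pair(cell):
    """cell: list of (row, col) positions of the n*n pixels in one image unit.
    Returns a first pixel chosen at random and a second pixel drawn from the
    remaining n*n-1 target pixels that are adjacent to the first one."""
    first = random.choice(cell)                      # first random selection
    targets = [p for p in cell if p != first]        # the n*n-1 target pixels
    # keep only target pixels 4-adjacent to the first pixel (an assumption)
    neighbors = [p for p in targets
                 if abs(p[0] - first[0]) + abs(p[1] - first[1]) == 1]
    second = random.choice(neighbors)                # second random selection
    return first, second
```

  Because the second pixel is drawn from the unit's remaining pixels, the two selections can never coincide, which matches the requirement that the M second pixels differ from the M first pixels.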
  • The target noise reduction model is obtained by training the noise reduction model based on at least the first loss function and a second loss function. The second loss function is obtained based on the first target image, the second sub-image, a first sub-target image and a second sub-target image. The first sub-target image is obtained by downsampling a second target image based on the sampling positions of the pixels in the first random downsampling process.
  • The second sub-target image is obtained by downsampling the second target image based on the sampling positions of the pixels in the second random downsampling process.
  • The second target image is obtained by inputting the image sample to be denoised into the denoising model.
  • The target noise reduction model is obtained by training the noise reduction model based on at least the first loss function, a first weight coefficient, the second loss function and a second weight coefficient, where the first weight coefficient is used to indicate the weight of the first loss function, and the second weight coefficient is used to indicate the weight of the second loss function.
  • the noise reduction model includes a convolutional neural network or a noise reduction model based on sparse feature expression.
  • FIG. 16 is a schematic structural diagram of a model training apparatus provided by an embodiment of the present application.
  • the model training apparatus includes: an acquisition unit 1601 and a processing unit 1602 .
  • The obtaining unit 1601 is configured to obtain an image sample to be denoised from a sample set, the sample set including a plurality of image samples to be denoised; the processing unit 1602 is configured to perform a first random downsampling process and a second random downsampling process on the image sample to be denoised to obtain a first sub-image and a second sub-image respectively, the first sub-image and the second sub-image having the same resolution.
  • The processing unit 1602 is further configured to: divide the image sample to be denoised into M image units, each of the M image units including n*n pixels; perform a first random selection of pixels in each of the M image units to obtain M first pixels, and obtain the first sub-image according to the M first pixels; and perform a second random selection of pixels in each of the M image units to obtain M second pixels, and obtain the second sub-image according to the M second pixels.
  • The acquiring unit 1601 is further configured to acquire n*n-1 target pixels in each of the M image units, the n*n-1 target pixels being the pixels in each image unit that were not selected during the first random selection of pixels; the processing unit 1602 is further configured to perform a random selection of pixels among the n*n-1 target pixels in each of the M image units to obtain M second pixels, the M second pixels being all different from the M first pixels.
  • The processing unit 1602 is further configured to randomly select, among the n*n-1 target pixels in each of the M image units, a second pixel adjacent to the pixel selected in the first random selection, obtaining M second pixels, each of the M second pixels being adjacent to the corresponding first pixel.
  • The processing unit 1602 is further configured to: input the image sample to be denoised into the denoising model to obtain a second target image; downsample the second target image based on the sampling positions of the pixels in the first random downsampling process to obtain a first sub-target image; downsample the second target image based on the sampling positions of the pixels in the second random downsampling process to obtain a second sub-target image; obtain a second loss function according to the first target image, the second sub-image, the first sub-target image and the second sub-target image; and train the noise reduction model according to at least the first loss function and the second loss function.
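  As a rough sketch of how the two losses listed above combine, the fragment below computes the first loss between f(g1(y)) and g2(y), and the second loss between the corresponding residuals. Images are flattened lists, and `f`, `g1`, `g2` are placeholders for the denoising model and the two position-matched downsamplers; all names, and the use of mean squared error, are assumptions for illustration.

```python
def mse(a, b):
    """Mean squared error between two equal-length lists."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def training_losses(f, g1, g2, y, lam=1.0):
    """Returns (first_loss, second_loss, total) for one sample y.
    f: denoising model; g1/g2: the first/second random downsamplers,
    reused with the same sampling positions on f(y)."""
    first_target = f(g1(y))          # first target image
    second_sub = g2(y)               # second sub-image (training target)
    first_loss = mse(first_target, second_sub)
    fy = f(y)                        # second target image
    resid_a = [p - q for p, q in zip(first_target, second_sub)]
    resid_b = [p - q for p, q in zip(g1(fy), g2(fy))]   # sub-target images
    second_loss = mse(resid_a, resid_b)
    return first_loss, second_loss, first_loss + lam * second_loss
```

  Note that for an identity "denoiser" the residuals coincide and the second loss vanishes, which is consistent with its role as a regularizer on the downsampling-position mismatch rather than as a data-fitting term.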
  • The processing unit 1602 is further configured to train the noise reduction model based on at least the first loss function, a first weight coefficient, the second loss function and a second weight coefficient, where the first weight coefficient is used to indicate the weight of the first loss function, and the second weight coefficient is used to indicate the weight of the second loss function.
  • the noise reduction model includes a convolutional neural network or a noise reduction model based on sparse feature expression.
  • FIG. 17 is a schematic structural diagram of an image noise reduction apparatus provided by an embodiment of the present application.
  • the image noise reduction apparatus includes: an acquisition unit 1701 and a processing unit 1702 .
  • the acquiring unit 1701 is configured to acquire the image to be denoised;
  • The processing unit 1702 is configured to input the image to be denoised into a target denoising model to obtain a denoised image. The target denoising model is obtained by training a noise reduction model based on at least a first loss function; the first loss function is obtained based on a first target image and a second sub-image and is used to indicate the difference between the first target image and the second sub-image; the first target image is obtained by inputting a first sub-image into the noise reduction model; and the first sub-image and the second sub-image are obtained by performing a first random downsampling process and a second random downsampling process respectively on an image sample to be denoised, the resolutions of the first sub-image and the second sub-image being the same.
  • The first sub-image is obtained according to M first pixels; the M first pixels are obtained by performing a first random selection of pixels in each of M image units after the image sample to be denoised is divided into the M image units.
  • The second sub-image is obtained according to M second pixels; the M second pixels are obtained by performing a second random selection of pixels in each of the M image units after the image sample to be denoised is divided into the M image units.
  • The M second pixels are obtained by performing a random selection of pixels among n*n-1 target pixels in each of the M image units; the n*n-1 target pixels are the pixels in each image unit that were not selected during the first random selection of pixels, and the M second pixels are all different from the M first pixels.
  • The M second pixels are obtained by randomly selecting, among the n*n-1 target pixels in each of the M image units, a second pixel adjacent to the pixel selected in the first random selection, each of the M second pixels being adjacent to the corresponding first pixel.
  • The target noise reduction model is obtained by training the noise reduction model based on at least the first loss function and a second loss function; the second loss function is obtained according to the first target image, the second sub-image, a first sub-target image and a second sub-target image.
  • The first sub-target image is obtained by downsampling a second target image based on the sampling positions of the pixels in the first random downsampling process.
  • The second sub-target image is obtained by downsampling the second target image based on the sampling positions of the pixels in the second random downsampling process.
  • The second target image is obtained by inputting the image sample to be denoised into the denoising model.
  • The target noise reduction model is obtained by training the noise reduction model based on at least the first loss function, a first weight coefficient, the second loss function and a second weight coefficient, where the first weight coefficient is used to indicate the weight of the first loss function, and the second weight coefficient is used to indicate the weight of the second loss function.
  • the noise reduction model includes a convolutional neural network or a noise reduction model based on sparse feature expression.
  • FIG. 18 is a schematic structural diagram of the execution device provided by an embodiment of the present application. The execution device may be a smart wearable device, a server, or the like, which is not limited here.
  • the data processing apparatus described in the embodiment corresponding to FIG. 18 may be deployed on the execution device 1800 to implement the function of data processing in the embodiment corresponding to FIG. 18 .
  • the execution device 1800 includes: a receiver 1801, a transmitter 1802, a processor 1803, and a memory 1804 (wherein the number of processors 1803 in the execution device 1800 may be one or more, and one processor is taken as an example in FIG. 18 ) , wherein the processor 1803 may include an application processor 18031 and a communication processor 18032.
  • the receiver 1801, the transmitter 1802, the processor 1803, and the memory 1804 may be connected by a bus or otherwise.
  • Memory 1804 may include read-only memory and random access memory, and provides instructions and data to processor 1803 .
  • a portion of memory 1804 may also include non-volatile random access memory (NVRAM).
  • The memory 1804 stores operating instructions, executable modules or data structures, or a subset or an extended set thereof, where the operating instructions may include various operating instructions for implementing various operations.
  • the processor 1803 controls the operation of the execution device.
  • various components of the execution device are coupled together through a bus system, where the bus system may include a power bus, a control bus, a status signal bus, and the like in addition to a data bus.
  • the various buses are referred to as bus systems in the figures.
  • the methods disclosed in the above embodiments of the present application may be applied to the processor 1803 or implemented by the processor 1803 .
  • the processor 1803 may be an integrated circuit chip, which has signal processing capability. In the implementation process, each step of the above-mentioned method can be completed by an integrated logic circuit of hardware in the processor 1803 or an instruction in the form of software.
  • The above-mentioned processor 1803 may be a general-purpose processor, a digital signal processor (DSP), a microprocessor or a microcontroller, and may further include an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
  • the processor 1803 may implement or execute the methods, steps, and logical block diagrams disclosed in the embodiments of this application.
  • a general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
  • the steps of the method disclosed in conjunction with the embodiments of the present application may be directly embodied as executed by a hardware decoding processor, or executed by a combination of hardware and software modules in the decoding processor.
  • the software modules may be located in random access memory, flash memory, read-only memory, programmable read-only memory or electrically erasable programmable memory, registers and other storage media mature in the art.
  • the storage medium is located in the memory 1804, and the processor 1803 reads the information in the memory 1804, and completes the steps of the above method in combination with its hardware.
  • the receiver 1801 can be used to receive input numerical or character information, and generate signal input related to the relevant settings and function control of the execution device.
  • The transmitter 1802 can be used to output digital or character information through a first interface; the transmitter 1802 can also be used to send instructions to a disk group through the first interface to modify data in the disk group; the transmitter 1802 can also include a display device such as a display screen.
  • the processor 1803 is configured to execute the training method of the noise reduction model executed by the execution device in the embodiment corresponding to FIG. 4 .
  • FIG. 19 is a schematic structural diagram of the training device provided by the embodiment of the present application.
  • The training device 1900 is implemented by one or more servers. The training device 1900 may vary widely by configuration or performance, and may include one or more central processing units (CPUs) 1919 (for example, one or more processors), memory 1932, and one or more storage media 1930 (for example, one or more mass storage devices) storing applications 1942 or data 1944.
  • the memory 1932 and the storage medium 1930 may be short-term storage or persistent storage.
  • the program stored in the storage medium 1930 may include one or more modules (not shown in the figure), and each module may include a series of instructions to operate on the training device.
  • the central processing unit 1919 may be configured to communicate with the storage medium 1930 to execute a series of instruction operations in the storage medium 1930 on the training device 1900 .
  • The training device 1900 may also include one or more power supplies 1926, one or more wired or wireless network interfaces 1950, one or more input/output interfaces 1958, and one or more operating systems 1941, such as Windows Server™, Mac OS X™, Unix™, Linux™ and FreeBSD™.
  • the training device may perform the steps in the embodiment corresponding to FIG. 4 .
  • Embodiments of the present application also provide a computer program product that, when running on a computer, causes the computer to perform the steps performed by the aforementioned execution device, or causes the computer to perform the steps performed by the aforementioned training device.
  • Embodiments of the present application further provide a computer-readable storage medium, where a program for performing signal processing is stored in the computer-readable storage medium; when it runs on a computer, it causes the computer to perform the steps performed by the aforementioned execution device, or causes the computer to perform the steps performed by the aforementioned training device.
  • the execution device, training device, or terminal device provided in this embodiment of the present application may specifically be a chip, and the chip includes: a processing unit and a communication unit, the processing unit may be, for example, a processor, and the communication unit may be, for example, an input/output interface, pins or circuits, etc.
  • The processing unit can execute the computer-executable instructions stored in the storage unit, so that the chip in the execution device executes the data processing method described in the above embodiments, or the chip in the training device executes the data processing method described in the above embodiments.
  • the storage unit is a storage unit in the chip, such as a register, a cache, etc.
  • The storage unit may also be a storage unit located outside the chip in the wireless access device, such as a read-only memory (ROM) or another type of static storage device that can store static information and instructions, or a random access memory (RAM).
  • FIG. 20 is a schematic structural diagram of a chip provided by an embodiment of the present application.
  • the chip may be represented as a neural network processor NPU 2000, and the NPU 2000 is mounted as a co-processor to the main CPU (Host CPU), tasks are allocated by the Host CPU.
  • the core part of the NPU is the arithmetic circuit 2003, which is controlled by the controller 2004 to extract the matrix data in the memory and perform multiplication operations.
  • the arithmetic circuit 2003 includes multiple processing units (Process Engine, PE). In some implementations, the arithmetic circuit 2003 is a two-dimensional systolic array. The arithmetic circuit 2003 may also be a one-dimensional systolic array or other electronic circuitry capable of performing mathematical operations such as multiplication and addition. In some implementations, the arithmetic circuit 2003 is a general-purpose matrix processor.
  • the arithmetic circuit fetches the data corresponding to the matrix B from the weight memory 2002 and buffers it on each PE in the arithmetic circuit.
  • the arithmetic circuit fetches the data of matrix A and matrix B from the input memory 2001 to perform matrix operation, and stores the partial result or final result of the matrix in an accumulator 2008 .
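  The multiply-accumulate dataflow described for the arithmetic circuit and the accumulator 2008 can be imitated in plain code. This is purely a behavioral sketch of a matrix multiplication with an explicit accumulator, not a model of the NPU's actual systolic-array timing or memory layout.

```python
def matmul_accumulate(A, B):
    """Multiply matrices A (n x k) and B (k x m), collecting partial sums in
    an explicit accumulator, loosely mirroring how the arithmetic circuit
    buffers matrix B on the PEs and streams in rows of matrix A."""
    n, k, m = len(A), len(B), len(B[0])
    acc = [[0] * m for _ in range(n)]    # plays the role of accumulator 2008
    for i in range(n):
        for j in range(m):
            for t in range(k):           # partial results accumulate here
                acc[i][j] += A[i][t] * B[t][j]
    return acc
```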
  • Unified memory 2006 is used to store input data and output data.
  • The weight data is transferred directly to the weight memory 2002 through the direct memory access controller (DMAC) 2005.
  • Input data is also transferred to unified memory 2006 via the DMAC.
  • The BIU, that is, the bus interface unit 2013, is used for the interaction between the AXI bus and the DMAC and the instruction fetch buffer (IFB) 2009.
  • the bus interface unit 2013 (Bus Interface Unit, BIU for short) is used for the instruction fetch memory 2009 to obtain instructions from the external memory, and also for the storage unit access controller 2005 to obtain the original data of the input matrix A or the weight matrix B from the external memory.
  • the DMAC is mainly used to transfer the input data in the external memory DDR to the unified memory 2006 , the weight data to the weight memory 2002 , or the input data to the input memory 2001 .
  • The vector calculation unit 2007 includes a plurality of operation processing units and, if necessary, further processes the output of the operation circuit 2003, for example by vector multiplication, vector addition, exponential operations, logarithmic operations and size comparison. It is mainly used for non-convolutional/fully-connected layer computation in neural networks, such as batch normalization, pixel-level summation and upsampling of feature planes.
  • the vector computation unit 2007 can store the vector of processed outputs to the unified memory 2006.
  • The vector calculation unit 2007 can apply a linear or nonlinear function to the output of the operation circuit 2003, for example performing linear interpolation on the feature planes extracted by the convolution layers, or applying the function to a vector of accumulated values to generate activation values.
  • the vector computation unit 2007 generates normalized values, pixel-level summed values, or both.
  • the vector of processed outputs can be used as activation input to the arithmetic circuit 2003, eg, for use in subsequent layers in a neural network.
  • the instruction fetch memory (instruction fetch buffer) 2009 connected to the controller 2004 is used to store the instructions used by the controller 2004;
  • The unified memory 2006, the input memory 2001, the weight memory 2002 and the instruction fetch memory 2009 are all on-chip memories; the external memory is a memory external to the NPU hardware architecture.
  • the processor mentioned in any one of the above may be a general-purpose central processing unit, a microprocessor, an ASIC, or one or more integrated circuits for controlling the execution of the above program.
  • The device embodiments described above are only schematic; the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, which may be located in one place or distributed over multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
  • the connection relationship between the modules indicates that there is a communication connection between them, which may be specifically implemented as one or more communication buses or signal lines.
  • the computer program product includes one or more computer instructions.
  • the computer may be a general purpose computer, special purpose computer, computer network, or other programmable device.
  • The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another computer-readable storage medium; for example, the computer instructions may be transmitted from a website, computer, training device or data center to another website, computer, training device or data center by wired (for example, coaxial cable, optical fiber, digital subscriber line (DSL)) or wireless (for example, infrared, radio, microwave) means.
  • The computer-readable storage medium may be any available medium that a computer can access, or a data storage device, such as a training device or a data center, integrating one or more available media.
  • the usable media may be magnetic media (eg, floppy disks, hard disks, magnetic tapes), optical media (eg, DVD), or semiconductor media (eg, Solid State Disk (SSD)), and the like.


Abstract

This application discloses a training method for a noise reduction model, applied in the field of artificial intelligence and belonging to computer vision technology. The method includes: acquiring an image sample to be denoised from a sample set, the sample set including a plurality of image samples to be denoised; performing a first random downsampling process and a second random downsampling process on the image sample to be denoised to obtain a first sub-image and a second sub-image respectively; inputting the first sub-image into a noise reduction model to obtain a first target image; obtaining a first loss function according to the first target image and the second sub-image, the first loss function being used to indicate the difference between the first target image and the second sub-image; and training the noise reduction model based on at least the first loss function to obtain a target noise reduction model. In this solution, the noise reduction model can be trained based on noisy images alone, without acquiring the clean images corresponding to the noisy images, which reduces the difficulty of training the noise reduction model.

Description

A training method for a noise reduction model and a related apparatus
This application claims priority to Chinese patent application No. 202011565423.X, filed with the China National Intellectual Property Administration on December 25, 2020 and entitled "A training method for a noise reduction model and a related apparatus", the entire contents of which are incorporated herein by reference.
Technical Field
This application relates to the technical field of artificial intelligence, and in particular to a training method for a noise reduction model and a related apparatus.
Background
Artificial intelligence (AI) is a theory, method, technology and application system that uses digital computers or machines controlled by digital computers to simulate, extend and expand human intelligence, perceive the environment, acquire knowledge and use knowledge to obtain the best results. In other words, artificial intelligence is a branch of computer science that attempts to understand the essence of intelligence and produce a new kind of intelligent machine that can react in a manner similar to human intelligence. Artificial intelligence studies the design principles and implementation methods of various intelligent machines, so that the machines have the functions of perception, reasoning and decision-making.
At present, during the generation or transmission of an image, the image is often subject to interference from the imaging device or external environmental noise, producing noise that affects image quality. An image that includes noise due to such interference is usually called a noisy image or noise image. To improve the quality of such images, image noise reduction methods have emerged. An image noise reduction method applies an algorithm to remove noise from an observed noisy image while retaining image details, so as to reconstruct the corresponding clean image. Image noise reduction methods currently have important application value in fields such as mobile phone photography, high-definition television, surveillance equipment, satellite imagery and medical imaging.
In the related art, images are mainly denoised by learning-based noise reduction models (for example, convolutional neural networks). By training a noise reduction model with a large amount of training data, a noise reduction model with a good noise reduction effect can be obtained, thereby realizing image noise reduction.
Generally speaking, the noise reduction effect of a noise reduction model depends largely on the training data used to train it, that is, on noisy image-clean image pairs. However, in the field of image processing, it is often very difficult to obtain noisy image-clean image pairs. Therefore, a method that can train a noise reduction model without noisy image-clean image pairs is urgently needed.
Summary
Embodiments of this application provide a training method for a noise reduction model and a related apparatus. Two random downsampling processes are performed separately on an image sample to be denoised to obtain a pair of sub-image samples; one sub-image of the pair serves as the input value of the noise reduction model, and the other sub-image of the pair serves as the expected output value of the noise reduction model, thereby realizing the training of the noise reduction model. The noise reduction model can be trained based on noisy images alone, without acquiring the clean images corresponding to the noisy images, which reduces the difficulty of training the noise reduction model.
A first aspect of this application provides a training method for a noise reduction model. The training method can be applied to scenarios such as terminal photography, medical images or surveillance video to realize image noise reduction. The method includes: a terminal acquires an image sample to be denoised from a sample set, where the sample set includes a plurality of image samples to be denoised. The terminal performs a first random downsampling process and a second random downsampling process on the image sample to be denoised to obtain a first sub-image and a second sub-image respectively, where the first sub-image and the second sub-image have the same resolution. That is, the terminal performs two separate random downsampling processes on the same image sample to be denoised to obtain the first sub-image and the second sub-image. The first sub-image and the second sub-image are different images; they have the same resolution, and their resolution is lower than that of the image sample to be denoised. Random downsampling refers to randomly sampling pixels in the image sample to be denoised based on a set sampling manner, and assembling the sampled pixels into a sub-image whose resolution is smaller than that of the image sample to be denoised. The pixel sampling manner of the first random downsampling process and the second random downsampling process is the same. However, since the two processes are independent random sampling processes, the first sub-image obtained by the first random downsampling process and the second sub-image obtained by the second random downsampling process are, with very high probability, different.
The terminal inputs the first sub-image into a noise reduction model to obtain a first target image, where the noise reduction model includes, but is not limited to, a learning-based model such as a convolutional neural network or a noise reduction model based on sparse feature expression. The terminal obtains a first loss function according to the first target image and the second sub-image, where the first loss function is used to indicate the difference between the first target image and the second sub-image. Finally, the terminal trains the noise reduction model based on at least the first loss function until a model training condition is satisfied, obtaining a target noise reduction model. The model training condition means that the first loss function obtained by the terminal is smaller than a preset threshold.
In this solution, two random downsampling processes are performed separately on the image sample to be denoised to obtain a pair of sub-image samples; one sub-image of the pair serves as the input value of the noise reduction model, and the other serves as the expected output value, thereby realizing the training of the noise reduction model. The noise reduction model can be trained based on noisy images alone, without acquiring the clean images corresponding to the noisy images, which reduces the difficulty of training the noise reduction model.
Optionally, in a possible implementation, the terminal performing the first random downsampling process and the second random downsampling process on the image sample to be denoised to obtain the first sub-image and the second sub-image respectively includes: the terminal divides the image sample to be denoised into M image units, where each of the M image units includes n*n pixels. The terminal performs a first random selection of pixels in each of the M image units to obtain M first pixels. According to the positions, in the image to be denoised, of the image units corresponding to the M first pixels, the terminal assembles the M first pixels to obtain the first sub-image.
Similarly, the terminal performs a second random selection of pixels in each of the M image units to obtain M second pixels. According to the positions, in the image to be denoised, of the image units corresponding to the M second pixels, the terminal assembles the M second pixels to obtain the second sub-image.
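  The division into image units and the two random pixel selections can be sketched as follows, here for n=2 (a choice made only for illustration); the function name is hypothetical, and the second pixel is drawn from the unit's remaining pixels so that the two sub-images never share a pixel, as in the implementation described below.

```python
import random

def neighbor_subimages(img):
    """img: 2D list whose height and width are even. Splits img into 2x2
    image units and, per unit, draws one pixel for each sub-image (the
    second pixel is taken from the pixels the first selection missed)."""
    sub1, sub2 = [], []
    for i in range(0, len(img), 2):
        row1, row2 = [], []
        for j in range(0, len(img[0]), 2):
            unit = [(i + di, j + dj) for di in (0, 1) for dj in (0, 1)]
            p1 = random.choice(unit)                          # first selection
            p2 = random.choice([p for p in unit if p != p1])  # second selection
            row1.append(img[p1[0]][p1[1]])
            row2.append(img[p2[0]][p2[1]])
        sub1.append(row1)
        sub2.append(row2)
    return sub1, sub2
```

  Each sub-image keeps one pixel per unit, so both sub-images have the same resolution, half that of the input along each axis for n=2.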
Optionally, in a possible implementation, the terminal performing the second random selection of pixels in each of the M image units includes: the terminal acquires n*n-1 target pixels in each of the M image units, where the n*n-1 target pixels are the pixels in each image unit that were not selected during the first random selection of pixels. The terminal performs a random selection of pixels among the n*n-1 target pixels in each of the M image units to obtain M second pixels, where the M second pixels are all different from the M first pixels.
That is to say, when performing the second random selection of pixels for each image unit, the terminal first needs to determine the pixels in each image unit that were not selected as first pixels, that is, the n*n-1 target pixels in each image unit. Then, the terminal randomly selects one of these n*n-1 target pixels as the second pixel. In this way, the first pixel and the second pixel selected by the terminal in each image unit are necessarily different pixels. For example, every pixel in the first sub-image and the second sub-image shown in Figure 5 is a different pixel.
By ensuring that the downsampling process for the second sub-image does not sample the same pixels as the first sub-image, it can be guaranteed that the first sub-image and the second sub-image obtained by random downsampling are two completely independent images; that is, there is no excessively strong correlation between them. The first sub-image and the second sub-image can therefore better simulate two images of the same scene with random noise, which improves the training effect of the noise reduction model.
Optionally, in a possible implementation, the terminal performing a random selection of pixels among the n*n-1 target pixels in each of the M image units includes: among the n*n-1 target pixels in each of the M image units, the terminal randomly selects a second pixel adjacent to the pixel selected in the first random selection, obtaining M second pixels, where each of the M second pixels is adjacent to the corresponding first pixel.
That is to say, for the n*n-1 target pixels in each image unit, the terminal may first determine which of these n*n-1 target pixels are adjacent to the first pixel selected during the first random selection. Then, the terminal randomly selects one of the determined target pixels adjacent to the first pixel as the second pixel. In this way, the first pixel and the second pixel selected by the terminal in each image unit are necessarily adjacent pixels.
By selecting a target pixel adjacent to the first pixel as the second pixel, a higher similarity between the obtained second sub-image and the first sub-image can be ensured; that is, the first sub-image and the second sub-image better simulate two images of the same scene with random noise, which improves the training effect of the noise reduction model.
Optionally, in a possible implementation, the method further includes: the terminal inputs the image sample to be denoised into the noise reduction model to obtain a second target image. Based on the sampling positions of the pixels in the first random downsampling process, the terminal downsamples the second target image to obtain a first sub-target image. That is, each pair of corresponding pixels in the first sub-target image and the first sub-image has the same position in the source image before downsampling. Based on the sampling positions of the pixels in the second random downsampling process, the terminal downsamples the second target image to obtain a second sub-target image. The terminal obtains a second loss function according to the first target image, the second sub-image, the first sub-target image and the second sub-target image. The terminal trains the noise reduction model based on at least the first loss function and the second loss function.
In this solution, by introducing the second loss function, the noise reduction model can be further constrained so that it does not generate over-smoothed images due to the inconsistent positions, in the image sample to be denoised, of the pixels corresponding to the first sub-image and the second sub-image. That is, the high-frequency detail information in the noisy image is protected from being removed by the noise reduction model, ensuring that the denoised image still retains high-frequency details.
Optionally, in a possible implementation, the terminal training the noise reduction model based on at least the first loss function and the second loss function includes: the terminal trains the noise reduction model based on at least the first loss function, a first weight coefficient, the second loss function and a second weight coefficient. The first weight coefficient is used to indicate the weight of the first loss function, and the second weight coefficient is used to indicate the weight of the second loss function.
The larger the second weight coefficient, the weaker the noise reduction strength; the weaker the noise reduction strength, the more noise remains, but the less high-frequency detail is lost. The smaller the second weight coefficient, the stronger the noise reduction strength; the stronger the noise reduction strength, the less noise remains, but the more high-frequency detail is lost. Therefore, in practical applications, a balance between noise reduction strength and the degree of loss of high-frequency detail can be achieved by adjusting the first weight coefficient and/or the second weight coefficient.
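  Putting the pieces above into symbols (an illustrative formulation, not taken verbatim from the application): writing $y$ for the image sample to be denoised, $f_\theta$ for the noise reduction model, and $g_1, g_2$ for the first and second random downsamplers reused at the same sampling positions, the weighted training objective can be written as

```latex
\mathcal{L}(\theta) =
\lambda_1 \,\bigl\| f_\theta(g_1(y)) - g_2(y) \bigr\|_2^2
+ \lambda_2 \,\bigl\| f_\theta(g_1(y)) - g_2(y)
  - \bigl( g_1(f_\theta(y)) - g_2(f_\theta(y)) \bigr) \bigr\|_2^2
```

  where $\lambda_1$ and $\lambda_2$ are the first and second weight coefficients; the squared L2 norm is an assumption, as this passage does not fix the distance measure.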
Optionally, in a possible implementation, the noise reduction model includes a learning-based noise reduction model such as a convolutional neural network or a noise reduction model based on sparse feature expression.
A second aspect of this application provides an image noise reduction method, including: acquiring an image to be denoised; and inputting the image to be denoised into a target noise reduction model to obtain a denoised image. The target noise reduction model is obtained by training a noise reduction model based on at least a first loss function; the first loss function is obtained based on a first target image and a second sub-image and is used to indicate the difference between the first target image and the second sub-image; the first target image is obtained by inputting a first sub-image into the noise reduction model; the first sub-image and the second sub-image are obtained by performing a first random downsampling process and a second random downsampling process respectively on an image sample to be denoised; and the first sub-image and the second sub-image have the same resolution.
Optionally, in a possible implementation, the first sub-image is obtained according to M first pixels, the M first pixels being obtained by performing a first random selection of pixels in each of M image units after the image sample to be denoised is divided into the M image units; the second sub-image is obtained according to M second pixels, the M second pixels being obtained by performing a second random selection of pixels in each of the M image units after the image sample to be denoised is divided into the M image units.
Optionally, in a possible implementation, the M second pixels are obtained by performing a random selection of pixels among n*n-1 target pixels in each of the M image units, the n*n-1 target pixels being the pixels in each image unit that were not selected during the first random selection of pixels, and the M second pixels are all different from the M first pixels.
Optionally, in a possible implementation, the M second pixels are obtained by randomly selecting, among the n*n-1 target pixels in each of the M image units, a second pixel adjacent to the pixel selected in the first random selection, each of the M second pixels being adjacent to the corresponding first pixel.
Optionally, in a possible implementation, the target noise reduction model is obtained by training the noise reduction model based on at least the first loss function and a second loss function; the second loss function is obtained according to the first target image, the second sub-image, a first sub-target image and a second sub-target image; the first sub-target image is obtained by downsampling a second target image based on the sampling positions of the pixels in the first random downsampling process; the second sub-target image is obtained by downsampling the second target image based on the sampling positions of the pixels in the second random downsampling process; and the second target image is obtained by inputting the image sample to be denoised into the noise reduction model.
Optionally, in a possible implementation, the target noise reduction model is obtained by training the noise reduction model based on at least the first loss function, a first weight coefficient, the second loss function and a second weight coefficient. The first weight coefficient is used to indicate the weight of the first loss function, and the second weight coefficient is used to indicate the weight of the second loss function.
Optionally, in a possible implementation, the noise reduction model includes a convolutional neural network or a noise reduction model based on sparse feature expression.
A third aspect of this application provides a model training apparatus, including an acquisition unit and a processing unit. The acquisition unit is configured to acquire an image sample to be denoised from a sample set, the sample set including a plurality of image samples to be denoised. The processing unit is configured to perform a first random downsampling process and a second random downsampling process on the image sample to be denoised to obtain a first sub-image and a second sub-image respectively, the first sub-image and the second sub-image having the same resolution; the processing unit is further configured to input the first sub-image into a noise reduction model to obtain a first target image; the processing unit is further configured to obtain a first loss function according to the first target image and the second sub-image, the first loss function being used to indicate the difference between the first target image and the second sub-image; and the processing unit is further configured to train the noise reduction model based on at least the first loss function to obtain a target noise reduction model.
Optionally, in a possible implementation, the processing unit is further configured to: divide the image sample to be denoised into M image units, each of the M image units including n*n pixels; perform a first random selection of pixels in each of the M image units to obtain M first pixels, and obtain the first sub-image according to the M first pixels; and perform a second random selection of pixels in each of the M image units to obtain M second pixels, and obtain the second sub-image according to the M second pixels.
Optionally, in a possible implementation, the acquisition unit is further configured to acquire n*n-1 target pixels in each of the M image units, the n*n-1 target pixels being the pixels in each image unit that were not selected during the first random selection of pixels; and the processing unit is further configured to perform a random selection of pixels among the n*n-1 target pixels in each of the M image units to obtain M second pixels, the M second pixels being all different from the M first pixels.
Optionally, in a possible implementation, the processing unit is further configured to randomly select, among the n*n-1 target pixels in each of the M image units, a second pixel adjacent to the pixel selected in the first random selection, obtaining M second pixels, each of the M second pixels being adjacent to the corresponding first pixel.
Optionally, in a possible implementation, the processing unit is further configured to: input the image sample to be denoised into the noise reduction model to obtain a second target image; downsample the second target image based on the sampling positions of the pixels in the first random downsampling process to obtain a first sub-target image; downsample the second target image based on the sampling positions of the pixels in the second random downsampling process to obtain a second sub-target image; obtain a second loss function according to the first target image, the second sub-image, the first sub-target image and the second sub-target image; and train the noise reduction model based on at least the first loss function and the second loss function.
Optionally, in a possible implementation, the processing unit is further configured to train the noise reduction model based on at least the first loss function, a first weight coefficient, the second loss function and a second weight coefficient, where the first weight coefficient is used to indicate the weight of the first loss function, and the second weight coefficient is used to indicate the weight of the second loss function.
Optionally, in a possible implementation, the noise reduction model includes a convolutional neural network or a noise reduction model based on sparse feature expression.
A fourth aspect of this application provides an image noise reduction apparatus, including an acquisition unit and a processing unit. The acquisition unit is configured to acquire an image to be denoised. The processing unit is configured to input the image to be denoised into a target noise reduction model to obtain a denoised image, where the target noise reduction model is obtained by training a noise reduction model based on at least a first loss function; the first loss function is obtained based on a first target image and a second sub-image and is used to indicate the difference between the first target image and the second sub-image; the first target image is obtained by inputting a first sub-image into the noise reduction model; the first sub-image and the second sub-image are obtained by performing a first random downsampling process and a second random downsampling process respectively on an image sample to be denoised; and the first sub-image and the second sub-image have the same resolution.
Optionally, in a possible implementation, the first sub-image is obtained according to M first pixels, the M first pixels being obtained by performing a first random selection of pixels in each of M image units after the image sample to be denoised is divided into the M image units; the second sub-image is obtained according to M second pixels, the M second pixels being obtained by performing a second random selection of pixels in each of the M image units after the image sample to be denoised is divided into the M image units.
Optionally, in a possible implementation, the M second pixels are obtained by performing a random selection of pixels among n*n-1 target pixels in each of the M image units, the n*n-1 target pixels being the pixels in each image unit that were not selected during the first random selection of pixels, and the M second pixels are all different from the M first pixels.
Optionally, in a possible implementation, the M second pixels are obtained by randomly selecting, among the n*n-1 target pixels in each of the M image units, a second pixel adjacent to the pixel selected in the first random selection, each of the M second pixels being adjacent to the corresponding first pixel.
Optionally, in a possible implementation, the target noise reduction model is obtained by training the noise reduction model based on at least the first loss function and a second loss function; the second loss function is obtained according to the first target image, the second sub-image, a first sub-target image and a second sub-target image; the first sub-target image is obtained by downsampling a second target image based on the sampling positions of the pixels in the first random downsampling process; the second sub-target image is obtained by downsampling the second target image based on the sampling positions of the pixels in the second random downsampling process; and the second target image is obtained by inputting the image sample to be denoised into the noise reduction model.
Optionally, in a possible implementation, the target noise reduction model is obtained by training the noise reduction model based on at least the first loss function, a first weight coefficient, the second loss function and a second weight coefficient, where the first weight coefficient is used to indicate the weight of the first loss function, and the second weight coefficient is used to indicate the weight of the second loss function.
Optionally, in a possible implementation, the noise reduction model includes a convolutional neural network or a noise reduction model based on sparse feature expression.
本申请第五方面提供了一种模型训练装置,可以包括处理器,处理器和存储器耦合,存储器存储有程序指令,当存储器存储的程序指令被处理器执行时实现上述第一方面所述 的方法。对于处理器执行第一方面的各个可能实现方式中的步骤,具体均可以参阅第一方面,此处不再赘述。
本申请第六方面提供了一种图像降噪装置,可以包括处理器,处理器和存储器耦合,存储器存储有程序指令,当存储器存储的程序指令被处理器执行时实现上述第二方面所述的方法。对于处理器执行第二方面的各个可能实现方式中的步骤,具体均可以参阅第二方面,此处不再赘述。
本申请第七方面提供了一种计算机可读存储介质,所述计算机可读存储介质中存储有计算机程序,当其在计算机上运行时,使得计算机执行上述第一方面所述的方法。
本申请第八方面提供了一种计算机可读存储介质,所述计算机可读存储介质中存储有计算机程序,当其在计算机上运行时,使得计算机执行上述第二方面所述的方法。
本申请第九方面提供了一种电路系统,所述电路系统包括处理电路,所述处理电路配置为执行上述第一方面或第二方面所述的方法。
本申请第十方面提供了一种计算机程序,当其在计算机上运行时,使得计算机执行上述第一方面或第二方面所述的方法。
本申请第十一方面提供了一种芯片系统,该芯片系统包括处理器,用于支持服务器或门限值获取装置实现上述方面中所涉及的功能,例如,发送或处理上述方法中所涉及的数据和/或信息。在一种可能的设计中,所述芯片系统还包括存储器,所述存储器,用于保存服务器或通信设备必要的程序指令和数据。该芯片系统,可以由芯片构成,也可以包括芯片和其他分立器件。
附图说明
图1为本申请实施例提供的人工智能主体框架的一种结构示意图;
图2a为本申请实施例提供的一种图像处理***;
图2b为本申请实施例提供的另一种图像处理***;
图2c为本申请实施例提供的图像处理的相关设备的示意图;
图3a为本申请实施例提供的一种系统100架构的示意图;
图3b为本申请实施例提供的一种图像降噪的示意图;
图4为本申请实施例提供的一种降噪模型的训练方法的流程示意图;
图5为本申请实施例提供的一种随机下采样处理的示意图;
图6为本申请实施例提供的另一种随机下采样处理的示意图;
图7为本申请实施例提供的另一种随机下采样处理的示意图;
图8为本申请实施例提供的一种降噪模型的训练流程示意图;
图9为本申请实施例提供的另一种降噪模型的训练流程示意图;
图10为本申请实施例提供的一种确定降噪模型的降噪效果的实验流程示意图;
图11为本申请实施例提供的多种降噪方法的降噪效果对比示意图;
图12为本申请实施例提供的另一种确定降噪模型的降噪效果的实验流程示意图;
图13为本申请实施例提供的多种降噪方法的降噪指标对比示意图;
图14为本申请实施例提供的多种降噪方法的降噪效果对比示意图;
图15为本申请实施例提供的一种图像降噪方法的流程示意图;
图16为本申请实施例提供的一种模型训练装置的结构示意图;
图17为本申请实施例提供的一种图像降噪装置的结构示意图;
图18为本申请实施例提供的执行设备的一种结构示意图;
图19为本申请实施例提供的训练设备的一种结构示意图;
图20为本申请实施例提供的芯片的一种结构示意图。
具体实施方式
下面结合本发明实施例中的附图对本发明实施例进行描述。本发明的实施方式部分使用的术语仅用于对本发明的具体实施例进行解释,而非旨在限定本发明。
下面结合附图,对本申请的实施例进行描述。本领域普通技术人员可知,随着技术的发展和新场景的出现,本申请实施例提供的技术方案对于类似的技术问题,同样适用。
本申请的说明书和权利要求书及上述附图中的术语“第一”、“第二”等是用于区别类似的对象,而不必用于描述特定的顺序或先后次序。应该理解这样使用的术语在适当情况下可以互换,这仅仅是描述本申请的实施例中对相同属性的对象在描述时所采用的区分方式。此外,术语“包括”和“具有”以及他们的任何变形,意图在于覆盖不排他的包含,以便包含一系列单元的过程、方法、***、产品或设备不必限于那些单元,而是可包括没有清楚地列出的或对于这些过程、方法、产品或设备固有的其它单元。
首先对人工智能系统总体工作流程进行描述,请参见图1,图1示出的为人工智能主体框架的一种结构示意图,下面从“智能信息链”(水平轴)和“IT价值链”(垂直轴)两个维度对上述人工智能主体框架进行阐述。其中,“智能信息链”反映从数据的获取到处理的一系列过程。举例来说,可以是智能信息感知、智能信息表示与形成、智能推理、智能决策、智能执行与输出的一般过程。在这个过程中,数据经历了“数据—信息—知识—智慧”的凝练过程。“IT价值链”从人工智能的底层基础设施、信息(提供和处理技术实现)到系统的产业生态过程,反映人工智能为信息技术产业带来的价值。
(1)基础设施
基础设施为人工智能***提供计算能力支持,实现与外部世界的沟通,并通过基础平台实现支撑。通过传感器与外部沟通;计算能力由智能芯片(CPU、NPU、GPU、ASIC、FPGA等硬件加速芯片)提供;基础平台包括分布式计算框架及网络等相关的平台保障和支持,可以包括云存储和计算、互联互通网络等。举例来说,传感器和外部沟通获取数据,这些数据提供给基础平台提供的分布式计算***中的智能芯片进行计算。
(2)数据
基础设施的上一层的数据用于表示人工智能领域的数据来源。数据涉及到图形、图像、语音、文本,还涉及到传统设备的物联网数据,包括已有***的业务数据以及力、位移、液位、温度、湿度等感知数据。
(3)数据处理
数据处理通常包括数据训练,机器学习,深度学习,搜索,推理,决策等方式。
其中,机器学习和深度学习可以对数据进行符号化和形式化的智能信息建模、抽取、预处理、训练等。
推理是指在计算机或智能***中,模拟人类的智能推理方式,依据推理控制策略,利用形式化的信息进行机器思维和求解问题的过程,典型的功能是搜索与匹配。
决策是指智能信息经过推理后进行决策的过程,通常提供分类、排序、预测等功能。
(4)通用能力
对数据经过上面提到的数据处理后,进一步基于数据处理的结果可以形成一些通用的能力,比如可以是算法或者一个通用***,例如,翻译,文本的分析,计算机视觉的处理,语音识别,图像的识别等等。
(5)智能产品及行业应用
智能产品及行业应用指人工智能***在各领域的产品和应用,是对人工智能整体解决方案的封装,将智能信息决策产品化、实现落地应用,其应用领域主要包括:智能终端、智能交通、智能医疗、自动驾驶、智慧城市等。
接下来介绍几种本申请的应用场景。
图2a为本申请实施例提供的一种图像处理***,该图像处理***包括用户设备以及数据处理设备。其中,用户设备包括手机、个人电脑或者信息处理中心等智能终端。用户设备为图像处理的发起端,作为图像增强请求的发起方,通常由用户通过用户设备发起请求。
上述数据处理设备可以是云服务器、网络服务器、应用服务器以及管理服务器等具有数据处理功能的设备或服务器。数据处理设备通过交互接口接收来自智能终端的图像增强请求,再通过存储数据的存储器以及数据处理的处理器环节进行机器学习,深度学习,搜索,推理,决策等方式的图像处理。数据处理设备中的存储器可以是一个统称,包括本地存储以及存储历史数据的数据库,数据库可以在数据处理设备上,也可以在其它网络服务器上。
在图2a所示的图像处理***中,用户设备可以接收用户的指令,例如用户设备可以获取用户输入/选择的一张图像,然后向数据处理设备发起请求,使得数据处理设备针对用户设备得到的该图像执行图像降噪应用,从而得到针对该图像的对应的处理结果。示例性的,用户设备可以获取用户输入的一张图像,然后向数据处理设备发起图像降噪请求,使得数据处理设备对该图像进行图像降噪,从而得到降噪后的图像。
在图2a中,数据处理设备可以执行本申请实施例的降噪模型的训练方法。
图2b为本申请实施例提供的另一种图像处理***,在图2b中,用户设备直接作为数据处理设备,该用户设备能够直接获取来自用户的输入并直接由用户设备本身的硬件进行处理,具体过程与图2a相似,可参考上面的描述,在此不再赘述。
在图2b所示的图像处理***中,用户设备可以接收用户的指令,例如用户设备可以获取用户在用户设备中所选择的一张图像,然后再由用户设备自身针对该图像执行图像处理应用,从而得到针对该图像的对应的处理结果。
在图2b中,用户设备自身就可以执行本申请实施例的降噪模型的训练方法。
图2c是本申请实施例提供的图像处理的相关设备的示意图。
上述图2a和图2b中的用户设备具体可以是图2c中的本地设备301或者本地设备302,图2a中的数据处理设备具体可以是图2c中的执行设备210,其中,数据存储***250可以存储执行设备210的待处理数据,数据存储***250可以集成在执行设备210上,也可以设置在云上或其它网络服务器上。
图2a和图2b中的处理器可以通过神经网络模型或者其它模型(例如,基于支持向量机的模型)进行数据训练/机器学习/深度学习,并利用数据最终训练或者学习得到的模型针对图像执行图像处理应用,从而得到相应的处理结果。
图3a是本申请实施例提供的一种系统100架构的示意图,在图3a中,执行设备110配置输入/输出(input/output,I/O)接口112,用于与外部设备进行数据交互,用户可以通过客户设备140向I/O接口112输入数据,所述输入数据在本申请实施例中可以包括:各个待调度任务、可调用资源以及其他参数。
在执行设备110对输入数据进行预处理,或者在执行设备110的计算模块111执行计算等相关的处理(比如进行本申请中神经网络的功能实现)过程中,执行设备110可以调用数据存储***150中的数据、代码等以用于相应的处理,也可以将相应处理得到的数据、指令等存入数据存储***150中。
最后,I/O接口112将处理结果返回给客户设备140,从而提供给用户。
值得说明的是,训练设备120可以针对不同的目标或称不同的任务,基于不同的训练数据生成相应的目标模型/规则,该相应的目标模型/规则即可以用于实现上述目标或完成上述任务,从而为用户提供所需的结果。其中,训练数据可以存储在数据库130中,且来自于数据采集设备160采集的训练样本。
在图3a中所示情况下,用户可以手动给定输入数据,该手动给定可以通过I/O接口112提供的界面进行操作。另一种情况下,客户设备140可以自动地向I/O接口112发送输入数据,如果要求客户设备140自动发送输入数据需要获得用户的授权,则用户可以在客户设备140中设置相应权限。用户可以在客户设备140查看执行设备110输出的结果,具体的呈现形式可以是显示、声音、动作等具体方式。客户设备140也可以作为数据采集端,采集如图所示输入I/O接口112的输入数据及输出I/O接口112的输出结果作为新的样本数据,并存入数据库130。当然,也可以不经过客户设备140进行采集,而是由I/O接口112直接将如图所示输入I/O接口112的输入数据及输出I/O接口112的输出结果,作为新的样本数据存入数据库130。
值得注意的是,图3a仅是本申请实施例提供的一种***架构的示意图,图中所示设备、器件、模块等之间的位置关系不构成任何限制,例如,在图3a中,数据存储***150相对执行设备110是外部存储器,在其它情况下,也可以将数据存储***150置于执行设备110中。如图3a所示,可以根据训练设备120训练得到神经网络。
本申请实施例还提供了一种芯片,该芯片包括神经网络处理器NPU。该芯片可以被设置在如图3a所示的执行设备110中,用以完成计算模块111的计算工作。该芯片也可以被设置在如图3a所示的训练设备120中,用以完成训练设备120的训练工作并输出目标模型/规则。
神经网络处理器NPU作为协处理器挂载到主中央处理器(central processing unit,CPU)(host CPU)上,由主CPU分配任务。NPU的核心部分为运算电路,控制器控制运算电路提取存储器(权重存储器或输入存储器)中的数据并进行运算。
在一些实现中,运算电路内部包括多个处理单元(process engine,PE)。在一些实现中,运算电路是二维脉动阵列。运算电路还可以是一维脉动阵列或者能够执行例如乘法和加法这样的数学运算的其它电子线路。在一些实现中,运算电路是通用的矩阵处理器。
举例来说,假设有输入矩阵A,权重矩阵B,输出矩阵C。运算电路从权重存储器中取矩阵B相应的数据,并缓存在运算电路中每一个PE上。运算电路从输入存储器中取矩阵A数据与矩阵B进行矩阵运算,得到的矩阵的部分结果或最终结果,保存在累加器(accumulator)中。
向量计算单元可以对运算电路的输出做进一步处理,如向量乘,向量加,指数运算,对数运算,大小比较等等。例如,向量计算单元可以用于神经网络中非卷积/非FC层的网络计算,如池化(pooling),批归一化(batch normalization),局部响应归一化(local response normalization)等。
在一些实现中,向量计算单元能将经处理的输出的向量存储到统一缓存器。例如,向量计算单元可以将非线性函数应用到运算电路的输出,例如累加值的向量,用以生成激活值。在一些实现中,向量计算单元生成归一化的值、合并值,或二者均有。在一些实现中,处理过的输出的向量能够用作运算电路的激活输入,例如用于在神经网络中的后续层中的使用。
统一存储器用于存放输入数据以及输出数据。
存储单元访问控制器(direct memory access controller,DMAC)用于将外部存储器中的输入数据搬运到输入存储器和/或统一存储器、将外部存储器中的权重数据存入权重存储器,以及将统一存储器中的数据存入外部存储器。
总线接口单元(bus interface unit,BIU),用于通过总线实现主CPU、DMAC和取指存储器之间的交互。
与控制器连接的取指存储器(instruction fetch buffer),用于存储控制器使用的指令;
控制器,用于调用取指存储器中缓存的指令,实现控制该运算加速器的工作过程。
一般地,统一存储器,输入存储器,权重存储器以及取指存储器均为片上(On-Chip)存储器,外部存储器为该NPU外部的存储器,该外部存储器可以为双倍数据率同步动态随机存储器(double data rate synchronous dynamic random access memory,DDR SDRAM)、高带宽存储器(high bandwidth memory,HBM)或其他可读可写的存储器。
由于本申请实施例涉及大量神经网络的应用,为了便于理解,下面先对本申请实施例涉及的相关术语及神经网络等相关概念进行介绍。
(1)神经网络
神经网络可以是由神经单元组成的,神经单元可以是指以xs和截距1为输入的运算单元,该运算单元的输出可以为:
h_{W,b}(x) = f(W^T x) = f(∑_{s=1}^{n} W_s·x_s + b)
其中,s=1、2、……n,n为大于1的自然数,Ws为xs的权重,b为神经单元的偏置。f为神经单元的激活函数(activation functions),用于将非线性特性引入神经网络中,来将神经单元中的输入信号转换为输出信号。该激活函数的输出信号可以作为下一层卷积层的输入。激活函数可以是sigmoid函数。神经网络是将许多个上述单一的神经单元联结在一起形成的网络,即一个神经单元的输出可以是另一个神经单元的输入。每个神经单元的输入可以与前一层的局部接受域相连,来提取局部接受域的特征,局部接受域可以是由若干个神经单元组成的区域。
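作为示意(下述代码及其中的函数名、数值均为本说明引入的假设,并非本申请限定的实现),单个神经单元的上述计算过程可以用几行Python代码表示,激活函数以sigmoid为例:

```python
import math

def neuron_output(xs, ws, b):
    """单个神经单元的输出: 对输入加权求和后加上偏置b, 再经过sigmoid激活函数。"""
    s = sum(w * x for w, x in zip(ws, xs)) + b
    return 1.0 / (1.0 + math.exp(-s))  # sigmoid激活函数f

# 示例(数值为假设): 两个输入xs=[1.0, 2.0], 对应权重Ws=[0.5, -0.25], 偏置b=0
out = neuron_output([1.0, 2.0], [0.5, -0.25], 0.0)
```

此例中加权和恰为0.5*1.0+(-0.25)*2.0+0=0,故输出为sigmoid(0)=0.5。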
神经网络中的每一层的工作可以用数学表达式
y = a(W·x + b)
来描述:从物理层面神经网络中的每一层的工作可以理解为通过五种对输入空间(输入向量的集合)的操作,完成输入空间到输出空间的变换(即矩阵的行空间到列空间),这五种操作包括:1、升维/降维;2、放大/缩小;3、旋转;4、平移;5、“弯曲”。其中1、2、3的操作由
W·x
完成,4的操作由+b完成,5的操作则由a()来实现。这里之所以用“空间”二字来表述是因为被分类的对象并不是单个事物,而是一类事物,空间是指这类事物所有个体的集合。其中,W是权重向量,该向量中的每一个值表示该层神经网络中的一个神经元的权重值。该向量W决定着上文所述的输入空间到输出空间的空间变换,即每一层的权重W控制着如何变换空间。训练神经网络的目的,也就是最终得到训练好的神经网络的所有层的权重矩阵(由很多层的向量W形成的权重矩阵)。因此,神经网络的训练过程本质上就是学习控制空间变换的方式,更具体的就是学习权重矩阵。
因为希望神经网络的输出尽可能的接近真正想要预测的值,所以可以通过比较当前网络的预测值和真正想要的目标值,再根据两者之间的差异情况来更新每一层神经网络的权重向量(当然,在第一次更新之前通常会有初始化的过程,即为神经网络中的各层预先配置参数),比如,如果网络的预测值高了,就调整权重向量让它预测低一些,不断的调整,直到神经网络能够预测出真正想要的目标值。因此,就需要预先定义“如何比较预测值和目标值之间的差异”,这便是损失函数(loss function)或目标函数(objective function),它们是用于衡量预测值和目标值的差异的重要方程。其中,以损失函数举例,损失函数的输出值(loss)越高表示差异越大,那么神经网络的训练就变成了尽可能缩小这个loss的过程。
(2)反向传播算法
神经网络可以采用误差反向传播(back propagation,BP)算法在训练过程中修正初始的神经网络模型中参数的大小,使得神经网络模型的重建误差损失越来越小。具体地,前向传递输入信号直至输出会产生误差损失,通过反向传播误差损失信息来更新初始的神经网络模型中参数,从而使误差损失收敛。反向传播算法是以误差损失为主导的反向传播运动,旨在得到最优的神经网络模型的参数,例如权重矩阵。
(3)图像增强
图像增强指的是对图像的亮度、颜色、对比度、饱和度、动态范围等进行处理,以满足某种特定指标。简单来说,在图像处理过程中,通过有目的地强调图像的整体或局部特性,将原来不清晰的图像变得清晰或强调某些感兴趣的特征,扩大图像中不同物体特征之间的差别,抑制不感兴趣的特征,从而起到改善图像质量、丰富图像信息量的作用,能够加强图像判读和识别效果,满足某些特殊分析的需要。示例性的,图像增强可以包括但不限于图像超分辨率重构、图像降噪、图像去雾、图像去模糊以及图像对比度增强。
(4)图像降噪
图像降噪方法是指应用算法从观测到的噪声图像中去除噪声,保留图像中较为重要的细节信息,重建出相应的干净图像。重建后得到的图像看似清晰且洁净,通过图像降噪,能够提高图像质量,有利于后续对图像执行图像分类或物体识别等图像处理流程。在图像处理的流程中,图像降噪是一个十分重要的步骤。目前学术界和工业界有许多图像降噪方法得到应用,图像降噪是当前图像处理技术领域的研究热点之一。可以参阅图3b,图3b为本申请实施例提供的一种图像降噪的示意图。如图3b所示,通过图像降噪可以将图像中的噪点尽可能地消除,提高图像的质量。
下面从神经网络的训练侧和神经网络的应用侧对本申请提供的方法进行描述。
本申请实施例提供的神经网络的训练方法,涉及图像的处理,具体可以应用于数据训练、机器学习、深度学习等数据处理方法,对训练数据(如本申请中的图像)进行符号化和形式化的智能信息建模、抽取、预处理、训练等,最终得到训练好的图像处理模型;并且,本申请实施例提供的图像降噪方法可以运用上述训练好的降噪模型,将输入数据(如本申请中的待处理图像)输入到所述训练好的图像处理模型中,得到输出数据(如本申请中目标图像)。需要说明的是,本申请实施例提供的降噪模型的训练方法和图像降噪方法是基于同一个构思产生的发明,也可以理解为一个***中的两个部分,或一个整体流程的两个阶段:如模型训练阶段和模型应用阶段。
目前,在图像的生成或传输过程中,往往容易受到成像设备或外部环境噪声的干扰,而产生影响图像质量的噪声。这种由于受到干扰而包括噪声的图像通常称为含噪图像或噪声图像。为提高这种图像的质量,图像降噪方法应运而生。图像降噪方法是指应用算法从观测到的噪声图像中去除噪声,保留图像细节,重建出相应的干净图像。通过对噪声图像提取特征,借助图像先验知识、图像自相似性和多帧图像互补信息等手段去除噪声、填充细节,生成对应的高质量图像是图像降噪研究的常用思路。在工业界,图像降噪技术在手机拍照、高清电视、监控设备、卫星图像和医学影像等领域有重要的应用价值。
通常来说,图像降噪算法主要分为传统滤波方法和基于学习的方法两大类。传统滤波方法通过平滑降低图像中随机性的噪声,同时保留图像本身的高频信号。一般地,效果较好的图像降噪方法大多是多种方法的结合,既能很好地保持边缘信息,又能去除图像中的噪声。比如将中值滤波方法和小波滤波方法结合起来进行图像滤波,以实现较好的图像降噪效果。总的来说,传统的降噪算法都是从噪声图像中找出规律,然后再进行相应的降噪处理。在从噪声图像本身无法找到规律的情况下,传统滤波方法则表现较差,难以达到较好的图像降噪效果,从而限制了图像降噪性能的进一步提升。
在这种情况下,基于学习的图像降噪方法应运而生。基于学习的图像降噪方法是数据驱动的方法,通过由降噪模型学习大规模的噪声图像-干净图像对中的规律,来实现图像的降噪,以达到良好的图像降噪效果。近些年,深度神经网络(Deep Neural Network,DNN)凭借其强大的学习能力,迅速超越了传统的图像降噪方法,取得了巨大的成功。基于深度神经网络的图像降噪方法能够生成更加干净、更加清晰、更少伪影的干净图片,进一步推动了图像降噪技术的发展。
得益于处理器算力的不断提升以及深度卷积神经网络的快速发展,图像降噪网络的效果得到了大幅度的提高,这进一步推动了基于深度学习的图像降噪网络的应用。当前基于监督学习(Supervised Learning)的图像降噪网络的降噪效果很大程度上取决于用于训练网络的训练数据,即噪声图像-干净图像对。其中,噪声图像-干净图像对是指噪声图像以及该噪声图像对应的干净无噪声图像。简单来说,噪声图像-干净图像对是一对相同场景下的图像,噪声图像和干净图像中所包括的场景信息是相同的,区别在于干净图像中并不包括噪声。
然而,在图像处理领域,噪声图像-干净图像对的获取往往十分困难。例如,对于拍照领域来说,由于受到成像设备和外部环境噪声的干扰,相机所拍摄的图像往往都是含有噪声的。尽管可以通过增加曝光时间、多帧平滑等技术手段来获得相对较为干净的图像(即包含噪声较少的图像),但是其局限性也非常明显。尤其是对于动态的场景,不同时间下所拍摄的图像内容本身就存在变化,因此在动态场景下往往难以获得噪声图像-干净图像对。
又例如,对于医学图像领域来说,医学图像是由仪器产生射线或电磁波并作用于人体而生成的。由于仪器本身的原因,通过仪器所获得的图像中往往含有大量的随机噪声,即通过仪器往往难以获得干净无噪声的图像。此外,由于医学图像的特殊性,拍摄医学图像的过程中容易对人体产生一定的负面影响,因此在实际应用过程中,往往难以获取多张相同部位下的医学图像。
针对于在降噪模型训练过程中,噪声图像-干净图像对难以获得的问题,本实施例中提出了仅仅使用噪声图像来训练降噪模型的方法。通过对待降噪图像样本分别执行两次随机下采样处理,获得子图像样本对,并将子图像样本对中的一个子图像作为降噪模型的输入值,子图像样本对中的另一个子图像则作为降噪模型的输出期望值,从而实现降噪模型的训练。基于噪声图像即可实现降噪模型的训练,无需获取噪声图像对应的干净图像,降低了降噪模型的训练难度。
为了便于理解,以下将对本实施例所提供的降噪模型的训练方法所应用的设备以及场景进行介绍。
本申请实施例所提供的降噪模型的训练方法可以应用于终端上,该终端为能够执行模型训练的设备。示例性地,该终端例如可以是个人电脑(personal computer,PC)、笔记本电脑、服务器、手机(mobile phone)、平板电脑、移动互联网设备(mobile internet device,MID)、可穿戴设备、虚拟现实(virtual reality,VR)设备、增强现实(augmented reality,AR)设备、工业控制(industrial control)中的无线终端、无人驾驶(self driving)中的无线终端、远程手术(remote medical surgery)中的无线终端、智能电网(smart grid)中的无线终端、运输安全(transportation safety)中的无线终端、智慧城市(smart city)中的无线终端、智慧家庭(smart home)中的无线终端等。该终端可以是运行安卓系统、IOS系统、windows系统以及其他系统的设备。
本实施例所提供的降噪模型的训练方法可以应用于终端设备拍照、视频监控以及医学图像处理等需要将噪声图像转换为干净图像的场景,通过本实施例所提供的降噪模型的训练方法,能够训练得到用于图像降噪的降噪模型,实现图像降噪。
随着智能手机、平板电脑等便携式终端设备的普及,拍照已经成为智能设备必不可少的功能。随着拍照功能在终端设备的重要性不断提升,人们对提升终端设备的拍照质量提出了更多的要求。目前,终端设备拍照的短板主要在于暗光场景下的图像质量。在暗光或者夜间等场景下,由于环境光太弱,终端设备拍出来的照片会有非常明显的噪点,极大地影响了图像质量。应理解,由于终端设备拍摄得到的图像中的噪声是由环境光线太差而导致的,因此在这种场景下,难以通过终端设备获取该场景下的干净图像,即难以获得噪声图像-干净图像对。
医学图像是指通过成像仪器获取人体器官等部位的图像,例如磁共振成像(Magnetic Resonance Imaging,MRI)、计算机断层扫描(Computed Tomography,CT)等。成像仪器在成像的过程中会引入噪声,而较大的噪声会降低信噪比,严重影响后续的图像处理。因此,在医学图像处理中,对成像仪器产生的含噪声图像进行降噪处理是一个必不可少的环节。
由于医学图像的特殊性,很多医学图像只能拍一张,对同一个场景同时拍多张含噪声图像十分困难,而获取干净图像更是难以实现。因此,在医学图像降噪领域,同样难以获得噪声图像-干净图像对,容易影响降噪模型的正常训练。
可以参阅图4,图4为本申请实施例提供的一种降噪模型的训练方法的流程示意图。如图4所示,本申请实施例提供的一种降噪模型的训练方法包括以下的步骤401-405。
步骤401,从样本集合中获取待降噪图像样本,所述样本集合包括多个待降噪图像样本。
在本实施例提供的降噪模型的训练方法应用于终端拍照场景时,即需要训练得到用于对拍照得到的图像进行降噪的模型时,终端可以获取包括多个待降噪图像样本的样本集合。该样本集合中的待降噪图像样本可以为终端在不同的场景下所采集的图像,且终端所采集的这些图像均包括有噪声。
在本实施例提供的降噪模型的训练方法应用于医学图像场景时,该样本集合中所包括的待降噪图像样本则为同类型的医学图像,例如样本集合中的多个待降噪图像样本均为MRI图像,或者样本集合中的多个待降噪图像样本均为CT图像。
总的来说,样本集合中所包括的多个待降噪图像样本为相同类型的图像,且这些待降噪图像样本均包括有噪声。终端可以根据降噪模型的训练方法所应用的场景获取相应的样本集合,并且基于样本集合中的待降噪图像样本对降噪模型进行训练。
步骤402,对所述待降噪图像样本执行第一次随机下采样处理以及第二次随机下采样处理,分别得到第一子图像和第二子图像,所述第一子图像与所述第二子图像的分辨率相同。
在获得待降噪图像样本之后,终端对同一个待降噪图像样本分别执行两次随机下采样处理,得到第一子图像和第二子图像。其中,第一子图像和第二子图像为不同的图像,第一子图像和第二子图像的分辨率相同,且第一子图像和第二子图像的分辨率低于待降噪图像样本的分辨率。
其中,随机下采样处理是指基于设定的采样方式,在待降噪图像样本中随机进行像素的采样,并基于采样获得的像素拼凑得到分辨率小于待降噪图像样本的子图像。上述的第一次随机下采样处理和第二次随机下采样处理进行像素采样的方式是相同的。但是由于第一次随机下采样处理和第二次随机下采样处理是两次独立的像素随机采样过程,因此第一次随机下采样处理得到的第一子图像和第二次随机下采样处理得到的第二子图像极大概率是不一样的。
为便于理解,以下将介绍本实施例所提供的执行随机下采样处理的两种方式。应理解,在实际应用中,还可以采用其他的随机下采样处理方式,在此不对随机下采样处理的方式做具体限定。
方式一、将待降噪图像样本划分为多个图像单元,在每个图像单元中随机采样一个像素,从而得到由多个采样得到的像素构成的子图像。
示例性地,终端将所述待降噪图像样本平均地划分为M个图像单元,这M个图像单元中的每个图像单元均包括n*n个像素。然后,终端在所述M个图像单元中的每个图像单元中执行像素的第一次随机选择,即在每个图像单元的n*n个像素中随机选择一个像素作为第一像素,从而得到M个第一像素。根据随机采样得到的M个第一像素,终端根据M个第一像素对应的图像单元在待降噪图像中的位置,将M个第一像素进行拼凑,得到所述第一子图像。
类似地,终端在所述M个图像单元中的每个图像单元中执行像素的第二次随机选择,即在每个图像单元的n*n个像素中再次随机选择一个像素作为第二像素,从而得到M个第二像素。根据随机采样得到的M个第二像素,终端根据M个第二像素对应的图像单元在待降噪图像中的位置,将M个第二像素进行拼凑,得到所述第二子图像。
可以参阅图5,图5为本申请实施例提供的一种随机下采样处理的示意图。如图5所示,假设待降噪图像的分辨率为4*4,即待降噪图像由4*4个像素构成。图5中将待降噪图像划分为4个图像单元,每个图像单元包括2*2个像素。
在对待降噪图像执行第一次随机下采样处理的过程中,对于第一个图像单元(即待降噪图像中左上角的图像单元),随机采样了第一个图像单元中左上角的像素(即像素1A)作为第一像素。对于第二个图像单元(即待降噪图像中右上角的图像单元),随机采样了第二个图像单元中右上角的像素(即像素1B)作为第一像素;对于第三个图像单元(即待降噪图像中左下角的图像单元),随机采样了第三个图像单元中右下角的像素(即像素1C)作为第一像素;对于第四个图像单元(即待降噪图像中右下角的图像单元),随机采样了第四个图像单元中右下角的像素(即像素1D)作为第一像素。基于对四个图像单元采样得到的像素,可以根据采样得到的像素(即像素1A、像素1B、像素1C、像素1D)对应的图像单元在待降噪图像中所在的位置,拼凑得到第一子图像。
在对待降噪图像执行第二次随机下采样处理的过程中,对于第一个图像单元,随机采样了第一个图像单元中右下角的像素(即像素2A)作为第二像素。对于第二个图像单元,随机采样了第二个图像单元中左下角的像素(即像素2B)作为第二像素;对于第三个图像单元,随机采样了第三个图像单元中左上角的像素(即像素2C)作为第二像素;对于第四个图像单元,随机采样了第四个图像单元中右上角的像素(即像素2D)作为第二像素。基于对四个图像单元采样得到的像素,可以根据采样得到的像素(即像素2A、像素2B、像素2C、像素2D)对应的图像单元在待降噪图像中所在的位置,拼凑得到第二子图像。
可选的,当终端在所述M个图像单元中的每个图像单元中执行像素的第二次随机选择时,终端可以获取所述M个图像单元中的每个图像单元中的n*n-1个目标像素,所述n*n-1个目标像素为每个图像单元中在执行像素的第一次随机选择时没有被选中的像素。然后,终端在所述M个图像单元中的每个图像单元中的n*n-1个目标像素中执行像素的随机选择,获得M个第二像素,所述M个第二像素与所述M个第一像素均不相同。
也就是说,终端在对每个图像单元执行像素的第二次随机选择时,终端需要先确定每个图像单元中没有被选为第一像素的像素,即每个图像单元中的n*n-1个目标像素。然后,终端再在这n*n-1个目标像素中随机选择一个像素作为第二像素。这样,终端在每个图像单元中选择到的第一像素和第二像素必然是不相同的像素。例如,如图5所示的第一子图像和第二子图像中的每个像素都是不相同的像素。
通过确保第二子图像的下采样处理过程中不会采样到与第一子图像中相同的像素,可以保证随机下采样处理得到的第一子图像和第二子图像是两个完全独立的图像,即第一子图像与第二子图像之间没有过强的相关性,从而能够基于第一子图像和第二子图像更好地模拟相同场景下带有随机噪声的两个图像,进而提高降噪模型的训练效果。例如,在没有确保第二子图像的下采样处理过程中不会采样到与第一子图像中相同的像素时,在一些较为极端的情况下,第二子图像中的每个第二像素均可能是与第一像素相同的,即第二子图像与第一子图像相同。此时,基于相同的两个子图像进行降噪模型的训练,则难以获得较好的训练效果。
可选的,当终端在所述M个图像单元中的每个图像单元中的n*n-1个目标像素中执行像素的随机选择时,终端可以在所述M个图像单元中的每个图像单元中的n*n-1个目标像素中,随机选择一个与第一次随机选择时所选中的像素相邻的第二像素,获得M个第二像素,所述M个第二像素中的每个第二像素均与对应的第一像素相邻。
也就是说,对于每个图像单元中的n*n-1个目标像素,终端可以先确定这n*n-1个目标像素中与执行像素的第一次随机选择时所选中的第一像素相邻的目标像素。然后,终端在所确定的与第一像素相邻的目标像素中随机选择一个像素作为第二像素。这样,终端在每个图像单元中选择到的第一像素和第二像素必然是相邻的像素。
示例性地,可以参阅图6,图6为本申请实施例提供的另一种随机下采样处理的示意图。如图6所示,第二子图像中的每个第二像素(即像素2A、像素2B、像素2C、像素2D)均与其对应的第一像素(即像素1A、像素1B、像素1C、像素1D)相邻,不存在第二像素位于第一像素的对角的情况。
本实施例中,对于每个图像单元,可以认为图像单元中相邻的像素具有更高的相似性。因此,通过选择与第一像素相邻的目标像素作为第二像素,可以确保得到的第二子图像与第一子图像之间有更高的相似性,即基于第一子图像和第二子图像更好地模拟相同场景下带有随机噪声的两个图像,进而提高降噪模型的训练效果。
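结合上述约束(两个像素“不相同”且“相邻”),方式一可以用如下numpy代码示意实现(假设n=2、图像为单通道,函数名与随机种子均为本说明引入的假设):

```python
import numpy as np

def neighbor_subsample(img, rng):
    """将H*W单通道图像划分为2*2图像单元, 在每个单元内随机取一对相邻像素,
    分别拼成第一子图像和第二子图像(分辨率均为 H/2 * W/2)。"""
    h, w = img.shape
    sub1 = np.empty((h // 2, w // 2), dtype=img.dtype)
    sub2 = np.empty_like(sub1)
    # 2*2单元内4个位置中两两相邻的组合(不含对角关系)
    adjacent_pairs = [((0, 0), (0, 1)), ((0, 0), (1, 0)),
                      ((0, 1), (1, 1)), ((1, 0), (1, 1))]
    for i in range(h // 2):
        for j in range(w // 2):
            (r1, c1), (r2, c2) = adjacent_pairs[rng.integers(len(adjacent_pairs))]
            if rng.integers(2):  # 随机交换两个位置, 使两次选择对称
                (r1, c1), (r2, c2) = (r2, c2), (r1, c1)
            sub1[i, j] = img[2 * i + r1, 2 * j + c1]
            sub2[i, j] = img[2 * i + r2, 2 * j + c2]
    return sub1, sub2

rng = np.random.default_rng(0)
noisy = rng.normal(size=(4, 4))  # 模拟一张4*4的噪声图像
g1, g2 = neighbor_subsample(noisy, rng)
```

这样得到的两张子图分辨率均为原图的一半,且逐位置对应的两个像素取自同一图像单元、互不相同并互为相邻像素。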
方式二、基于下采样的比例,确定待降噪图像样本中需要执行采样处理的像素,并在该像素周围的多个像素中随机采样一个像素,从而得到由多个采样得到的像素构成的子图像。
示例性地,对于一个分辨率为W*H的待降噪图像,可以预先确定将该待降噪图像下采样处理为分辨率为(W-x)*(H-x)的子图像,即第一子图像和第二子图像的分辨率均为(W-x)*(H-x)。基于下采样的比例,即W/(W-x)以及H/(H-x),终端确定待降噪图像中需要执行采样处理的像素,该像素在执行采样处理后需要作为子图像中的像素。然后,终端对该需要执行采样处理的像素执行采样处理,即在该像素周围的多个像素或者该像素周围的多个像素以及该像素中随机采样一个像素,采样得到的像素即作为子图像中的像素,最终得到由多个采样得到的像素构成的子图像。例如,终端可以在第1行至第x行中随机选择一行作为执行像素的采样处理的开始行,从该行开始连续对H-x行的像素执行采样处理;类似地,终端可以在第1列至第x列中随机选择一列作为执行像素的采样处理的开始列,从该列开始连续对W-x列的像素执行采样处理。
可以参阅图7,图7为本申请实施例提供的另一种随机下采样处理的示意图。如图7所示,假设待降噪图像为一个分辨率为4*4的图像,待降噪图像中包括16个像素,分别为像素1-像素16。该待降噪图像需要下采样处理为分辨为(4-2)*(4-2)的子图像,即子图像的分辨率为2*2。基于下采样的比例,确定以待降噪图像中的像素6、像素7、像素10以及像素11为目标,对这四个像素执行随机下采样处理。
在对像素6执行随机下采样处理的过程中,确定像素6周围的多个像素,即像素1、像素2、像素3、像素5、像素7、像素9、像素10以及像素11。然后,在像素6周围的多个像素中随机选择一个像素作为子图像中的像素。例如,如图7所示,在以像素6为执行随机下采样处理的目标时,选择了像素2作为子图像中的像素。可选的,也可以是在像素6周围的多个像素以及像素6中随机选择一个像素作为子图像中的像素,即像素6本身也可能被选到作为子图像中的像素。
在对像素7执行随机下采样处理的过程中,确定像素7周围的多个像素,即像素2、像素3、像素4、像素6、像素8、像素10、像素11以及像素12。然后,在像素7周围的多个像素中随机选择一个像素作为子图像中的像素。例如,如图7所示,在以像素7为执行随机下采样处理的目标时,选择了像素8作为子图像中的像素。
在对像素10执行随机下采样处理的过程中,确定像素10周围的多个像素,即像素5、像素6、像素7、像素9、像素11、像素13、像素14以及像素15。然后,在像素10周围的多个像素中随机选择一个像素作为子图像中的像素。例如,如图7所示,在以像素10为执行随机下采样处理的目标时,选择了像素9作为子图像中的像素。
在对像素11执行随机下采样处理的过程中,确定像素11周围的多个像素,即像素6、像素7、像素8、像素10、像素12、像素14、像素15以及像素16。然后,在像素11周围的多个像素中随机选择一个像素作为子图像中的像素。例如,如图7所示,在以像素11为执行随机下采样处理的目标时,选择了像素12作为子图像中的像素。
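方式二中“在目标像素的邻域内随机取样”这一步可以示意如下(假设图像为单通道,目标像素如图7取图像中部的4个像素;是否允许采到目标像素自身通过参数控制,函数名与随机种子均为示意性假设):

```python
import numpy as np

def neighborhood_sample(img, targets, rng, include_center=False):
    """对每个目标像素, 在其8邻域(可选地包含其自身)中随机取一个像素值。
    targets为目标像素的(行, 列)坐标列表, 返回取样得到的像素值列表。"""
    h, w = img.shape
    samples = []
    for (r, c) in targets:
        cands = [(r + dr, c + dc)
                 for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                 if (dr, dc) != (0, 0) or include_center]
        cands = [(rr, cc) for rr, cc in cands if 0 <= rr < h and 0 <= cc < w]
        rr, cc = cands[rng.integers(len(cands))]
        samples.append(img[rr, cc])
    return samples

rng = np.random.default_rng(1)
img = np.arange(16).reshape(4, 4)  # 模拟图7中的4*4图像(此处像素取值0-15)
# 对应图7中的像素6、像素7、像素10、像素11(0起始坐标)
sub = neighborhood_sample(img, [(1, 1), (1, 2), (2, 1), (2, 2)], rng)
```

每个取样结果都落在对应目标像素的邻域内,四个取样值即可按位置拼成2*2的子图像。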
步骤403,将所述第一子图像输入降噪模型,得到第一目标图像。
在得到第一子图像之后,终端可以将第一子图像输入降噪模型中,得到降噪模型所输出的第一目标图像。其中,该降噪模型用于对输入的图像进行降噪处理,并输出降噪后的干净图像。该降噪模型是一个可以学习的模型,在对降噪模型进行训练之前,降噪模型的降噪能力较差;在训练的过程中,降噪模型内部的降噪模块不断被优化,降噪能力不断增强;在训练结束之后,降噪模型即可用于实现图像的降噪。
其中,该降噪模型包括但不限于卷积神经网络或基于稀疏特征表达的降噪模型等基于学习的模型,本实施例并不对降噪模型的具体结构做具体限定。
步骤404,根据所述第一目标图像和所述第二子图像获取第一损失函数,所述第一损失函数用于指示所述第一目标图像和所述第二子图像之间的差异。
本实施例中,在对待降噪图像进行随机下采样处理,得到第一子图像和第二子图像之后,可以将第一子图像和第二子图像作为用于训练的样本对。其中,第一子图像作为降噪模型的输入值,第二子图像则作为降噪模型的期望输出值。基于降噪模型的实际输出值(即第一目标图像)以及降噪模型的期望输出值(即第二子图像),可以获取第一损失函数,该第一损失函数用于指示所述第一目标图像和所述第二子图像之间的差异。基于该第一损失函数,可以指导降噪模型学习到降噪能力。
示例性地,假设第一子图像为g1(y),第二子图像为g2(y),降噪模型为f。第一子图像g1(y)输入降噪模型f之后,降噪模型f所输出的图像为f(g1(y))。第一损失函数的一个可能的示例如公式1所示。
loss1 = ||f(g1(y)) - g2(y)||_p   公式1
其中,loss1为第一损失函数,f(g1(y))为降噪模型f输入第一子图像g1(y)后所输出的图像,g2(y)为第二子图像,p为次方数,p的取值可以为1或2等数值。应理解,f(g1(y))-g2(y)可以理解为将f(g1(y))图像中的像素的值与g2(y)图像中对应的像素的值相减。一般来说,图像中像素的取值可以为0-255或0-4095,不同的取值用于表示不同的颜色。
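公式1可以用如下numpy代码示意(取p=2;其中f(g1(y))与g2(y)均用示例数组代替,数值为本说明引入的假设):

```python
import numpy as np

def loss1(denoised_sub1, sub2, p=2):
    """第一损失函数: f(g1(y))与g2(y)逐像素之差的绝对值的p次方之和。"""
    diff = denoised_sub1.astype(np.float64) - sub2.astype(np.float64)
    return float(np.sum(np.abs(diff) ** p))

f_g1y = np.array([[10.0, 20.0], [30.0, 40.0]])  # 模拟降噪模型输出f(g1(y))
g2y = np.array([[12.0, 18.0], [30.0, 44.0]])    # 模拟第二子图像g2(y)
l1 = loss1(f_g1y, g2y)  # (2^2 + 2^2 + 0^2 + 4^2) = 24
```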
步骤405,至少根据所述第一损失函数对所述降噪模型进行训练,得到目标降噪模型。
在得到第一损失函数之后,终端可以基于第一损失函数的值对降噪模型进行训练。其中,终端基于第一损失函数对降噪模型进行训练的过程包括:终端基于第一损失函数的值调整降噪模型中的参数,并且重复执行步骤401-405,从而不断地调整降噪模型的参数,直至求得的第一损失函数小于预设阈值,即可确定已满足模型训练条件,得到目标降噪模型。该目标降噪模型即为训练好的降噪模型,能够用于后续的图像降噪。
可选的,在一个可能的实施例中,终端还可以基于第一损失函数和第二损失函数对降噪模型进行训练。
示例性地,在对降噪模型进行训练之前,终端可以将所述待降噪图像样本输入所述降噪模型,得到降噪模型所输出的第二目标图像。基于所述第一次随机下采样处理中像素的采样位置,终端对所述第二目标图像进行下采样处理,得到第一子目标图像。基于所述第二次随机下采样处理中像素的采样位置,终端对所述第二目标图像进行下采样处理,得到第二子目标图像。其中,终端基于所述第一次随机下采样处理中像素的采样位置对所述第二目标图像进行下采样处理,是指终端在对第二目标图像进行下采样处理时,采用与第一次随机下采样处理相同的方式来进行下采样处理。
以图5为例,第一次随机下采样处理的方式为:在第一个图像单元的左上角采集像素、在第二个图像单元的右上角采集像素、在第三个图像单元的右下角采集像素以及在第四个图像单元的左下角采集像素。那么,基于与第一次随机下采样处理相同的方式,终端可以基于相同的方式将第二目标图像划分为四个图像单元,并根据第一次随机下采样处理时所采集的像素在每个图像单元中的位置,在第二目标图像的四个图像单元分别采集对应的像素,从而得到第一子目标图像。也就是说,第一子目标图像与第一子图像中每一组对应的像素在下采样前的源图像中的位置都是相同的。
在实际应用中,终端在对待降噪图像样本执行第一次随机下采样处理之后,可以记录在执行第一次随机下采样处理时所采集的每个第一像素的位置。然后,终端基于每个第一像素所在的位置,确定第二目标图像中对应的位置,并在第二目标图像中对应的位置上采集像素作为第一子目标图像上的像素。此外,终端也可以是分别生成用于执行第一次随机下采样处理的第一采样器和用于执行第二次随机下采样处理的第二采样器,其中第一采样器和第二采样器进行下采样的方式是固定的。这样,终端可以基于第一采样器对第二目标图像进行下采样处理,得到第一子目标图像;以及,终端基于第二采样器对第二目标图像进行下采样处理,得到第二子目标图像。
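上述“生成固定采样器并对y与f(y)复用同一组采样位置”的做法,可以通过记录索引数组来实现,示意如下(假设n=2,随机种子、函数名以及用简单缩放代替降噪模型输出均为示意性假设):

```python
import numpy as np

def make_sampler(h, w, rng):
    """为每个2*2图像单元随机生成一个采样位置, 返回可复用的行/列索引数组。"""
    rows = 2 * np.arange(h // 2)[:, None] + rng.integers(0, 2, (h // 2, w // 2))
    cols = 2 * np.arange(w // 2)[None, :] + rng.integers(0, 2, (h // 2, w // 2))
    return rows, cols

def apply_sampler(img, sampler):
    """按预先记录的索引对图像进行下采样, 采样位置固定不变。"""
    rows, cols = sampler
    return img[rows, cols]

rng = np.random.default_rng(42)
y = rng.normal(size=(4, 4))   # 模拟待降噪图像样本y
f_y = y * 0.5                 # 用简单缩放代替降噪模型输出f(y), 仅作演示
g1 = make_sampler(4, 4, rng)  # 固定的第一采样器
sub_y = apply_sampler(y, g1)      # g1(y)
sub_fy = apply_sampler(f_y, g1)   # g1(f(y)), 采样位置与g1(y)完全一致
```

由于g1(y)与g1(f(y))使用同一组索引,两者逐位置对应的像素在各自源图像中的位置完全相同。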
终端根据所述第一目标图像、所述第二子图像、所述第一子目标图像和所述第二子目标图像,获取第二损失函数。该第二损失函数主要用于指示第一子目标图像和第二子目标图像之间的差异。
示例性地,假设第一子图像为g1(y),第二子图像为g2(y),降噪模型为f。第一子图像g1(y)输入降噪模型f之后,降噪模型f所输出的图像为f(g1(y))。第一子目标图像为g1(f(y)),第二子目标图像为g2(f(y))。第二损失函数的一个可能的示例如公式2所示。
loss2 = ||f(g1(y)) - g2(y) - (g1(f(y)) - g2(f(y)))||_p   公式2
其中,loss2为第二损失函数,f(g1(y))为降噪模型f输入第一子图像g1(y)后所输出的图像,g2(y)为第二子图像,g1(f(y))为第一子目标图像,g2(f(y))为第二子目标图像,p为次方数。
可以理解的是,第二损失函数是对于下采样处理过程中两个子图像对应的位置不一致的修正项,其目的是约束降噪模型,使降噪模型不至于因为第一子图像和第二子图像对应的像素在待降噪图像样本中的位置不一致而生成过度平滑的图像。简单来说,经过两次随机下采样处理得到的第一子图像以及第二子图像所对应的干净图像并非是完全一样的,第一子图像以及第二子图像所对应的干净图像只是近邻相似的关系。在只用这两个子图像来构造损失函数的基础上,降噪模型学习的降噪能力不仅仅会抹掉图像中的噪声,这两个子图像对应的真值图像(即干净图像)之间的差异也会被抹掉。即降噪模型的降噪能力过强,导致噪声图像中高频的细节信息也会被当做是噪声信息给处理掉了。因此,在第一损失函数的基础上,引入第二损失函数,能够保护噪声图像中的高频细节信息不被降噪模型去掉,保证降噪后的图像的分辨率。
最后,在计算得到第一损失函数和第二损失函数后,终端至少根据所述第一损失函数和所述第二损失函数对所述降噪模型进行训练,直至降噪模型满足模型训练条件。示例性地,终端可以将第一损失函数和第二损失函数进行相加,得到总损失函数,然后基于总损失函数对降噪模型进行训练。
可选的,第一损失函数以及第二损失函数还具有对应的权重系数,即终端基于第一损失函数以及第二损失函数对应的权重系数来计算总损失函数。也就是说,终端至少根据所述第一损失函数、第一权重系数、所述第二损失函数和第二权重系数对所述降噪模型进行训练;其中,所述第一权重系数用于指示所述第一损失函数的权重,所述第二权重系数用于指示所述第二损失函数的权重。
示例性地,基于第一损失函数、第一权重系数、第二损失函数和第二权重系数计算总损失函数的过程可以如公式3所示。
loss总 = a*loss1 + b*loss2        公式3
其中,loss总为总损失函数,loss1为第一损失函数,loss2为第二损失函数,a为第一权重系数,b为第二权重系数。可选的,a与b之间的比例可以为1或1/2等数值。在实际应用过程中,可以根据实际的降噪需求来调整第一权重系数和/或第二权重系数。其中,第二权重系数越大,则降噪强度越弱,降噪强度越弱,残留的噪声越多,但图像高频细节的损失也越少;第二权重系数越小,则降噪强度越强,降噪强度越强,残留的噪声越少,但图像高频细节的损失也越多。因此,在实际应用中可以通过调整第一权重系数和/或第二权重系数来达到降噪强度和高频细节损失程度之间的平衡。
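公式3的加权组合可以示意如下(a、b的取值仅为示例数值):

```python
def total_loss(l1, l2, a=1.0, b=2.0):
    """总损失 = 第一权重系数a * 第一损失 + 第二权重系数b * 第二损失。
    增大b会减弱降噪强度、保留更多高频细节; 减小b则相反。"""
    return a * l1 + b * l2

l_total = total_loss(0.8, 0.1)  # 1.0 * 0.8 + 2.0 * 0.1 = 1.0
```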
以上介绍了对降噪模型进行训练的过程。为了便于理解,以下将介绍基于随机下采样处理得到的子图像对,能够实现降噪模型的训练的原理。
在采用两张噪声图像构成样本对时,如果这两张噪声图像对应的干净图像是一致的,则可以基于大量这样的样本对训练得到具有降噪能力的降噪模型。可以理解的是,当用于训练降噪模型的样本很少时,降噪模型实际上学习到的是两种噪声模式的转换关系。当用于训练降噪模型的样本足够多的时候,由于样本中的噪声是随机的并且是在真值附近波动的,那么站在最小化损失函数的角度来看,可以发现降噪模型能够学习到干净无噪声的图像:由于噪声始终是随机的,降噪模型不可能学习到某种噪声转换规律来使损失函数最小化;降噪模型使损失函数最小化的方式是将样本中随机波动的噪声转换为中间值,而大量噪声的中间值实际上恰好为噪声对应的真值。因此,基于由噪声图像所构成的样本对,能够训练得到具有降噪能力的降噪模型。
为便于理解,以下将介绍具体的推导过程。
对于室内温度估计问题:假设,通过一定的方式获得了一系列的观测温度(y1,y2,y3,y4…)。那么,基于一系列的观测温度来求取真实的温度z,就可以建模为公式4。
argmin_z E_y{L(z, y)}   公式4
其中,argmin表示求使损失函数L的期望最小化的z,这个损失函数是关于z的一个函数。对于所有观测温度y,都要最小化损失。因此,可以把y看成服从某一概率分布的变量,最小化所有样本的损失,其实就是最小化所有样本损失的均值。如果距离度量方式是L2范数,那么最优解z实际上就是所有观测温度y的均值。至于每一次的观测温度y(i)具体是什么并不重要,优化的目标考虑的是所有观测的均值。
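上述“L2损失下最优解为观测均值”的结论可以用一个数值小实验验证(观测数据为随机生成的示意数据,真实温度取25.0仅为假设):

```python
import numpy as np

rng = np.random.default_rng(7)
true_temp = 25.0
# 模拟一系列带随机噪声的观测温度y1, y2, y3, ...
observations = true_temp + rng.normal(0.0, 1.0, size=10000)

# L2损失 L(z) = E_y[(z - y)^2], 在一组候选z上数值搜索损失最小值
candidates = np.linspace(20.0, 30.0, 2001)
losses = [np.mean((z - observations) ** 2) for z in candidates]
z_star = candidates[int(np.argmin(losses))]
```

数值搜索得到的z_star与观测均值一致(误差在候选网格间距之内),印证了最优解即为y的均值。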
那么,对于图像降噪问题,假设输入噪声图像为x(i),输出的清晰图像为y(i)。那么,可以将图像降噪问题建模为公式5。
argmin_θ E_{(x,y)}{L(f_θ(x), y)}   公式5
在公式5中,θ实际上是降噪模型的权重参数。并且,x和y不是相互独立的,因此公式5可以转化为公式6。
argmin_θ E_x{E_{y|x}{L(f_θ(x), y)}}   公式6
类似地,如果改变p(y|x)的分布,只要条件期望不变,最终求得的θ也不会变。所以,如果对标签y添加一个均值为0的高斯噪声作为扰动得到y',则y'可以认为是带有噪声的期望值。也就是说,将噪声图像作为降噪模型的输出期望值,同样能够训练得到具有降噪能力的降噪模型。
基于类似的思想,本实施例中对同一张噪声图像执行两次随机下采样处理,得到两张子图像。这两张子图像即可用于模拟相同场景下带有随机噪声的两个图像,因此,基于下采样处理得到的子图像,同样能够实现降噪模型的训练。相较于采用对应于相同的干净图像的两张噪声图像作为样本对来训练降噪模型,本实施例通过对同一张噪声图像进行两次随机下采样处理来得到作为样本对的子图像,更容易获得样本对。因为在一些特别的场景或领域中,对应于相同的干净图像的两张噪声图像实际上是很难获取到的。例如,在终端拍照领域,由于动态场景下的物体不断在运动,因此在动态场景下连续获取两张噪声图像,且要求这两张噪声图像对应的干净图像是相同的,实际上是非常困难的。又例如,在医学图像领域,由于基于医学仪器获取人体部位对应的医学图像会对人体产生一定的负面影响,因此想要通过医学仪器连续获取两张噪声图像也是非常困难的。
为方便理解,以下将结合具体例子介绍本实施例所提供的降噪模型的训练方法。
可以参阅图8和图9,图8为本申请实施例提供的一种降噪模型的训练流程示意图;图9为本申请实施例提供的另一种降噪模型的训练流程示意图。如图8所示,降噪模型的训练流程包括步骤801-809。
步骤801,在样本集合中选取噪声图像y。
首先,终端获取包括有大量噪声图像的样本集合,并在样本集合中选择未用于训练的噪声图像y。
步骤802,构造采样器g1和采样器g2。
本实施例中,终端可以预先构造采样器g1和采样器g2。采样器g1用于执行第一次随机下采样处理,以对噪声图像y进行下采样。采样器g2用于执行第二次随机下采样处理,以对噪声图像y进行第二次下采样。
步骤803,通过采样器g1和采样器g2对噪声图像y执行下采样处理,得到噪声子图g1(y)和噪声子图g2(y)。
具体地,通过采样器g1和采样器g2对噪声图像y进行随机下采样处理的过程可以参考上述步骤402,在此不再赘述。
步骤804,通过降噪模块对噪声子图g1(y)降噪,得到降噪后的图像f(g1(y))。
在得到噪声子图g1(y)之后,可以将噪声子图g1(y)输入降噪模块中,由降噪模块对噪声子图g1(y)进行降噪处理,得到降噪后的图像f(g1(y))。其中,降噪模块可以是上述的基于学习的降噪模型,例如卷积神经网络或基于稀疏特征表达的降噪模型。
步骤805,通过降噪模块对噪声图像y降噪,得到降噪后的图像f(y)。
步骤806,通过采样器g1和采样器g2对降噪后的图像f(y)执行下采样处理,得到降噪子图g1(f(y))和降噪子图g2(f(y))。
在得到降噪后的图像f(y),采用采样器g1和采样器g2对降噪后的图像f(y)执行下采样处理,分别得到降噪子图g1(f(y))和降噪子图g2(f(y))。
步骤807,计算损失函数。
其中,损失函数由第一损失函数、第二损失函数以及第一损失函数和第二损失函数对应的权重系数计算得到的。第一损失函数可以是基于降噪后的图像f(g1(y))和噪声子图g2(y)计算得到的;第二损失函数可以是基于降噪后的图像f(g1(y))、噪声子图g2(y)、降噪子图g1(f(y))和降噪子图g2(f(y))得到的。第一损失函数和第二损失函数的计算过程可以参考上述的步骤404,在此不再赘述。
步骤808,更新降噪模块的参数。
在计算得到损失函数之后,可以基于损失函数的值对降噪模块的参数进行适应性更新。
步骤809,判断降噪模块是否收敛。
基于损失函数的取值,判断降噪模块是否收敛,即降噪模块是否已经满足训练条件。如果降噪模块收敛,则认为降噪模块训练完毕,可以输出降噪模块作为训练好的降噪模块;如果降噪模块没有收敛,则转至继续执行步骤801-808,继续对降噪模块进行训练,直至降噪模块收敛。
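步骤801-809的整体流程可以用一个极简的训练循环示意:其中降噪模块用单参数线性映射f(x)=w*x代替真实网络,采样器g1/g2用固定位置(每个2*2图像单元的左上与右上像素)代替随机采样,且只使用第一损失函数,这些简化均为便于演示的假设:

```python
import numpy as np

rng = np.random.default_rng(3)
w = 0.0    # 降噪模块的唯一可学习参数, f(x) = w * x, 仅用于演示训练流程
lr = 0.05
first_loss = last_loss = None
for step in range(300):
    # 步骤801: 取一张噪声图像(每个2*2图像单元共享同一真值, 并叠加独立噪声)
    clean = np.kron(rng.normal(0.0, 1.0, size=(8, 8)), np.ones((2, 2)))
    noisy = clean + rng.normal(0.0, 0.1, size=(16, 16))
    # 步骤802-803: 两个固定的采样器, 分别取每个单元的左上与右上像素
    g1, g2 = noisy[0::2, 0::2], noisy[0::2, 1::2]
    # 步骤804与807: 前向计算降噪输出, 并按公式1(p=2)构造第一损失函数
    pred = w * g1
    loss = np.mean((pred - g2) ** 2)
    # 步骤808: 沿梯度方向更新降噪模块参数
    w -= lr * 2.0 * np.mean((pred - g2) * g1)
    first_loss = loss if first_loss is None else first_loss
    last_loss = loss
```

由于逐位置对应的两个像素共享同一真值、噪声相互独立,训练收敛后损失明显下降,w趋近于1,对应“把随机波动的噪声回归到真值附近”的效果。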
为验证基于本实施例所提供的降噪模型的训练方法的效果,本实施例提供了多个实验来评估基于该降噪模型的训练方法所训练得到的降噪模型的降噪效果。
可以参阅图10,图10为本申请实施例提供的一种确定降噪模型的降噪效果的实验流程示意图。如图10所示,首先获取一批没有噪声的干净图像,然后在干净图像中随机合成噪声,得到一批噪声图像。然后将噪声图像输入至基于上述的训练方法训练好的降噪网络中,得到降噪后的图像,即去噪图像。最后,基于去噪图像以及去噪图像对应的干净图像,计算两者之间的指标,确定去噪图像与干净图像之间的差距,从而确定降噪网络的降噪效果。
其中,合成噪声是一种常用来衡量降噪方法的手段,通过人工增加高斯噪声、泊松噪声等方式合成出含噪声的图像。本实施例中,采用了ImageNet数据集作为训练数据,包含五万张高清图片,覆盖了日常生活的大部分场景。同时,还选择了柯达图像(Kodak)数据集、非负邻域嵌入的低复杂度单图像超分辨率的(Set14)数据集、伯克利图像分割(BSD300)数据集作为测试集,以便于与其他降噪方法进行横向的比较。对于降噪网络,则是在PyTorch平台上构建了一个U-Net网络作为降噪网络。此外,为了评价输出结果的质量,以未添加噪声的图像作为干净图像,分别计算每一张测试图像的峰值信噪比(Peak signal-to-noise Ratio,PSNR),最后计算整个测试集的平均PSNR。
具体地,具体的实施步骤如下所示。
1、构造训练集,即从ImageNet验证集中提取五万张高清图像。构造测试集,即在Kodak数据集、Set14数据集、BSD300数据集中增加噪声。
2、构造降噪网络,本实施例采用UNet网络作为降噪网络。
3、构造下采样器、第一损失函数和第二损失函数,本实施例使用随机的下采样器。同时,采样的两张子图在同一位置上的像素在原图中对应的位置为相邻关系。其中,第一损失函数的权重系数设置为1,第二损失函数的权重系数设置为2。
4、训练网络:基于构造的训练集、降噪网络、采样器和损失函数,使用本实施例提供的降噪模型的训练方法将降噪网络训练到收敛。
5、使用训练好的降噪网络对测试集进行降噪,生成去噪图像。
6、计算测试集中的去噪图像与干净图像之间的PSNR。
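其中第6步的评价指标PSNR可以按如下方式计算(假设像素取值范围为0-255,图像以展平后的像素列表表示,示例数值为假设):

```python
import math

def psnr(img_a, img_b, max_val=255.0):
    """峰值信噪比: 10 * log10(MAX^2 / MSE), 其中MSE为两图逐像素之差的均方。"""
    assert len(img_a) == len(img_b)
    mse = sum((a - b) ** 2 for a, b in zip(img_a, img_b)) / len(img_a)
    if mse == 0:
        return float("inf")  # 两图完全相同
    return 10.0 * math.log10(max_val ** 2 / mse)

value = psnr([0.0] * 4, [10.0] * 4)  # MSE = 100, PSNR约为28.13 dB
```

PSNR越高,去噪图像与干净图像越接近,对应越好的降噪效果。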
为了验证本申请实施例所提供的降噪模型的训练方法的有效性,将本方法与当前主流的几个图像降噪方法进行了横向的比较。基于训练集训练了多个模型,之后在Kodak、BSD300和Set14数据集上进行测试,计算其PSNR。各种方法的对比结果如表1和图11所示,PSNR越高则表明降噪效果越好。图11为本申请实施例提供的多种降噪方法的降噪效果对比示意图。
表1
(表1以图像形式给出:各降噪方法在Kodak、BSD300和Set14测试集上的PSNR对比结果。)
由表1可知,本申请实施例所提供的降噪模型的训练方法的降噪效果除了略微低于Noise2Noise方法的降噪效果之外,均要比其他方法的降噪效果要高。其中,Noise2Noise方法即为上文所述的需要基于多张具有相同干净图像的噪声图像来进行训练的方法。然而,在实际应用中,获取多张具有相同干净图像的噪声图像实际上是非常困难的。因此,相比于本申请实施例所提供的降噪模型的训练方法,Noise2Noise方法在大部分场景下都很难实现。
以上实验是通过在干净图像上合成噪声,来得到噪声图像以及噪声图像对应的干净图像。以下将以终端拍照的场景为例,对本申请实施例所提供的降噪模型的训练方法进行实验。
可以参阅图12,图12为本申请实施例提供的另一种确定降噪模型的降噪效果的实验流程示意图。如图12所示,首先通过相机传感器获取同一场景下的多张噪声图像,即图12中所示的噪声图像1、噪声图像2…噪声图像N。这多张噪声图像可以是相机在光线较差的场景下所采集的,因此这多张噪声图像都包括有明显的噪声。然后,从这多张噪声图像中随机选择一个噪声图像输入至基于上述的训练方法训练好的降噪网络中,得到降噪后的图像,即去噪图像。最后,基于去噪图像以及去噪图像对应的干净图像,计算两者之间的指标,确定去噪图像与干净图像之间的差距,从而确定降噪网络的降噪效果。其中,干净图像可以是通过对这多张噪声图像进行平均而求得的。由于相机传感器无法获取到干净图像,因此通过对多张噪声图像进行平均,可以获取到尽可能接近真实干净图像的图像。
具体地,具体的实施步骤如下所示。
1、构造训练集和测试集。本实施例中,使用基于真实手机的传感器构造的智能手机图像去噪数据集(Smartphone Image Denoising Dataset,SIDD)数据集。SIDD数据集是一个使用手机的相机传感器在暗光场景下收集得到的真实场景数据集,其对应的干净图像由多张含噪声图像加权平均得到。
2、构造降噪网络,本实施例采用UNet网络模型作为降噪网络。
3、构造下采样器、第一损失函数和第二损失函数,本实施例使用随机的下采样器。同时,采样的两张子图在同一位置上的像素在原图中对应的位置为相邻关系。其中,第一损失函数和第二损失函数的权重系数均设置为1。
4、训练网络:基于构造的训练集、降噪网络、采样器和损失函数,使用本实施例提供的降噪模型的训练方法将降噪网络训练到收敛。
5、使用训练好的降噪网络对测试集进行降噪,生成去噪图像。
6、计算测试集中的去噪图像与干净图像之间的PSNR。
为了验证本申请实施例所提供的降噪模型的训练方法的有效性,将本方法与当前主流的几个图像降噪方法进行了横向的比较。基于SIDD数据集训练了多个模型,计算了SIDD验证数据和基准数据两个数据集上的平均PSNR和结构相似度(structural similarity index measurement,SSIM)。最后,进行了指标的比较和真实降噪效果的比较。各种方法的对比结果如图13和图14所示,PSNR和SSIM越高则表明降噪效果越好。其中,图13为本申请实施例提供的多种降噪方法的降噪指标对比示意图;图14为本申请实施例提供的多种降噪方法的降噪效果对比示意图。由图13和图14可知,本申请实施例所提供的降噪模型的训练方法的PSNR和SSIM均高于其他的方法,即降噪效果比其他的方法好。
可以参阅图15,图15为本申请实施例提供的一种图像降噪方法的流程示意图。如图15所示,该图像降噪方法包括步骤1501-1502。
步骤1501,获取待降噪图像。
其中,该待降噪图像为实际应用过程中包括有噪声且需要进行降噪的噪声图像。例如,该待降噪图像可以为终端拍照得到的图像、医学图像或者监控图像。本实施例并不对待降噪图像的类型做具体限定。
步骤1502,将所述待降噪图像输入目标降噪模型,得到降噪后的图像。
其中,所述目标降噪模型是至少基于第一损失函数对降噪模型进行训练得到的,所述第一损失函数是基于第一目标图像和第二子图像获取到的,所述第一损失函数用于指示所述第一目标图像和所述第二子图像之间的差异,所述第一目标图像是将第一子图像输入所述降噪模型后得到的,所述第一子图像和所述第二子图像是对待降噪图像样本分别执行第一次随机下采样处理以及第二次随机下采样处理后得到的,所述第一子图像与所述第二子图像的分辨率相同。简单来说,该目标降噪模型是基于上述实施例所述的降噪模型的训练方法训练得到的,具体训练过程可以参考步骤401-405的描述,此处不再赘述。
可选的,所述第一子图像是根据M个第一像素得到的,所述M个第一像素是在将所述待降噪图像样本划分为M个图像单元后,在所述M个图像单元中的每个图像单元中执行像素的第一次随机选择得到的;
所述第二子图像是根据M个第二像素得到的,所述M个第二像素是在将所述待降噪图像样本划分为所述M个图像单元后,在所述M个图像单元中的每个图像单元中执行像素的第二次随机选择得到的。
可选的,所述M个第二像素是通过在所述M个图像单元中的每个图像单元中的n*n-1个目标像素中执行像素的随机选择得到的,所述n*n-1个目标像素为每个图像单元中在执行像素的第一次随机选择时没有被选中的像素,所述M个第二像素与所述M个第一像素均不相同。
可选的,所述M个第二像素是通过在所述M个图像单元中的每个图像单元中的n*n-1个目标像素中,随机选择一个与第一次随机选择时所选中的像素相邻的第二像素得到的,所述M个第二像素中的每个第二像素均与对应的第一像素相邻。
可选的,所述目标降噪模型是至少基于所述第一损失函数和第二损失函数对所述降噪模型进行训练得到的,所述第二损失函数根据所述第一目标图像、所述第二子图像、第一子目标图像和第二子目标图像得到的,所述第一子目标图像是基于所述第一次随机下采样处理中像素的采样位置对第二目标图像进行下采样处理得到的,所述第二子目标图像是基于所述第二次随机下采样处理中像素的采样位置对所述第二目标图像进行下采样处理得到的,所述第二目标图像是将所述待降噪图像样本输入所述降噪模型得到的。
可选的,所述目标降噪模型是至少基于所述第一损失函数、第一权重系数、所述第二损失函数和第二权重系数对所述降噪模型进行训练得到的;其中,所述第一权重系数用于指示所述第一损失函数的权重,所述第二权重系数用于指示所述第二损失函数的权重。
可选的,所述降噪模型包括卷积神经网络或基于稀疏特征表达的降噪模型。
可以参阅图16,图16为本申请实施例提供的一种模型训练装置的结构示意图。如图16所示,该模型训练装置包括:获取单元1601和处理单元1602。所述获取单元1601,用于从样本集合中获取待降噪图像样本,所述样本集合包括多个待降噪图像样本;所述处理单元1602,用于对所述待降噪图像样本执行第一次随机下采样处理以及第二次随机下采样处理,分别得到第一子图像和第二子图像,所述第一子图像与所述第二子图像的分辨率相同;所述处理单元1602,还用于将所述第一子图像输入降噪模型,得到第一目标图像;所述处理单元1602,还用于根据所述第一目标图像和所述第二子图像获取第一损失函数,所述第一损失函数用于指示所述第一目标图像和所述第二子图像之间的差异;所述处理单元1602,还用于至少根据所述第一损失函数对所述降噪模型进行训练,得到目标降噪模型。
可选的,在一种可能的实现方式中,所述处理单元1602,还用于:将所述待降噪图像样本划分为M个图像单元,所述M个图像单元中的每个图像单元包括n*n个像素;在所述M个图像单元中的每个图像单元中执行像素的第一次随机选择,获得M个第一像素,并根据所述M个第一像素得到所述第一子图像;在所述M个图像单元中的每个图像单元中执行像素的第二次随机选择,获得M个第二像素,并根据所述M个第二像素得到所述第二子图像。
可选的,在一种可能的实现方式中,所述获取单元1601,还用于获取所述M个图像单元中的每个图像单元中的n*n-1个目标像素,所述n*n-1个目标像素为每个图像单元中在执行像素的第一次随机选择时没有被选中的像素;所述处理单元1602,还用于在所述M个图像单元中的每个图像单元中的n*n-1个目标像素中执行像素的随机选择,获得M个第二像素,所述M个第二像素与所述M个第一像素均不相同。
可选的,在一种可能的实现方式中,所述处理单元1602,还用于在所述M个图像单元中的每个图像单元中的n*n-1个目标像素中,随机选择一个与第一次随机选择时所选中的像素相邻的第二像素,获得M个第二像素,所述M个第二像素中的每个第二像素均与对应的第一像素相邻。
可选的,在一种可能的实现方式中,所述处理单元1602,还用于:将所述待降噪图像样本输入所述降噪模型,得到第二目标图像;基于所述第一次随机下采样处理中像素的采样位置,对所述第二目标图像进行下采样处理,得到第一子目标图像;基于所述第二次随机下采样处理中像素的采样位置,对所述第二目标图像进行下采样处理,得到第二子目标图像;根据所述第一目标图像、所述第二子图像、所述第一子目标图像和所述第二子目标图像,获取第二损失函数;至少根据所述第一损失函数和所述第二损失函数对所述降噪模型进行训练。
可选的,在一种可能的实现方式中,所述处理单元1602,还用于至少根据所述第一损失函数、第一权重系数、所述第二损失函数和第二权重系数对所述降噪模型进行训练;其中,所述第一权重系数用于指示所述第一损失函数的权重,所述第二权重系数用于指示所述第二损失函数的权重。
可选的,在一种可能的实现方式中,所述降噪模型包括卷积神经网络或基于稀疏特征表达的降噪模型。
可以参阅图17,图17为本申请实施例提供的一种图像降噪装置的结构示意图。如图17所示,该图像降噪装置包括:获取单元1701和处理单元1702。所述获取单元1701,用于获取待降噪图像;所述处理单元1702,用于将所述待降噪图像输入目标降噪模型,得到降噪后的图像;其中,所述目标降噪模型是至少基于第一损失函数对降噪模型进行训练得到的,所述第一损失函数是基于第一目标图像和第二子图像获取到的,所述第一损失函数用于指示所述第一目标图像和所述第二子图像之间的差异,所述第一目标图像是将第一子图像输入所述降噪模型后得到的,所述第一子图像和所述第二子图像是对待降噪图像样本分别执行第一次随机下采样处理以及第二次随机下采样处理后得到的,所述第一子图像与所述第二子图像的分辨率相同。
可选的,在一种可能的实现方式中,所述第一子图像是根据M个第一像素得到的,所述M个第一像素是在将所述待降噪图像样本划分为M个图像单元后,在所述M个图像单元中的每个图像单元中执行像素的第一次随机选择得到的;所述第二子图像是根据M个第二像素得到的,所述M个第二像素是在将所述待降噪图像样本划分为所述M个图像单元后,在所述M个图像单元中的每个图像单元中执行像素的第二次随机选择得到的。
可选的,在一种可能的实现方式中,所述M个第二像素是通过在所述M个图像单元中的每个图像单元中的n*n-1个目标像素中执行像素的随机选择得到的,所述n*n-1个目标像素为每个图像单元中在执行像素的第一次随机选择时没有被选中的像素,所述M个第二像素与所述M个第一像素均不相同。
可选的,在一种可能的实现方式中,所述M个第二像素是通过在所述M个图像单元中的每个图像单元中的n*n-1个目标像素中,随机选择一个与第一次随机选择时所选中的像素相邻的第二像素得到的,所述M个第二像素中的每个第二像素均与对应的第一像素相邻。
可选的,在一种可能的实现方式中,所述目标降噪模型是至少基于所述第一损失函数和第二损失函数对所述降噪模型进行训练得到的,所述第二损失函数根据所述第一目标图像、所述第二子图像、第一子目标图像和第二子目标图像得到的,所述第一子目标图像是基于所述第一次随机下采样处理中像素的采样位置对第二目标图像进行下采样处理得到的,所述第二子目标图像是基于所述第二次随机下采样处理中像素的采样位置对所述第二目标图像进行下采样处理得到的,所述第二目标图像是将所述待降噪图像样本输入所述降噪模型得到的。
可选的,在一种可能的实现方式中,所述目标降噪模型是至少基于所述第一损失函数、第一权重系数、所述第二损失函数和第二权重系数对所述降噪模型进行训练得到的;其中,所述第一权重系数用于指示所述第一损失函数的权重,所述第二权重系数用于指示所述第二损失函数的权重。
可选的,在一种可能的实现方式中,所述降噪模型包括卷积神经网络或基于稀疏特征表达的降噪模型。
接下来介绍本申请实施例提供的一种执行设备,请参阅图18,图18为本申请实施例提供的执行设备的一种结构示意图,执行设备1800具体可以表现为手机、平板、笔记本电脑、智能穿戴设备、服务器等,此处不做限定。其中,执行设备1800上可以部署有图18对应实施例中所描述的数据处理装置,用于实现图18对应实施例中数据处理的功能。具体的,执行设备1800包括:接收器1801、发射器1802、处理器1803和存储器1804(其中执行设备1800中的处理器1803的数量可以一个或多个,图18中以一个处理器为例),其中,处理器1803可以包括应用处理器18031和通信处理器18032。在本申请的一些实施例中,接收器1801、发射器1802、处理器1803和存储器1804可通过总线或其它方式连接。
存储器1804可以包括只读存储器和随机存取存储器,并向处理器1803提供指令和数据。存储器1804的一部分还可以包括非易失性随机存取存储器(non-volatile random access memory,NVRAM)。存储器1804存储有处理器和操作指令、可执行模块或者数据结构,或者它们的子集,或者它们的扩展集,其中,操作指令可包括各种操作指令,用于实现各种操作。
处理器1803控制执行设备的操作。具体的应用中,执行设备的各个组件通过总线系统耦合在一起,其中总线系统除包括数据总线之外,还可以包括电源总线、控制总线和状态信号总线等。但是为了清楚说明起见,在图中将各种总线都称为总线系统。
上述本申请实施例揭示的方法可以应用于处理器1803中,或者由处理器1803实现。 处理器1803可以是一种集成电路芯片,具有信号的处理能力。在实现过程中,上述方法的各步骤可以通过处理器1803中的硬件的集成逻辑电路或者软件形式的指令完成。上述的处理器1803可以是通用处理器、数字信号处理器(digital signal processing,DSP)、微处理器或微控制器,还可进一步包括专用集成电路(application specific integrated circuit,ASIC)、现场可编程门阵列(field-programmable gate array,FPGA)或者其他可编程逻辑器件、分立门或者晶体管逻辑器件、分立硬件组件。该处理器1803可以实现或者执行本申请实施例中的公开的各方法、步骤及逻辑框图。通用处理器可以是微处理器或者该处理器也可以是任何常规的处理器等。结合本申请实施例所公开的方法的步骤可以直接体现为硬件译码处理器执行完成,或者用译码处理器中的硬件及软件模块组合执行完成。软件模块可以位于随机存储器,闪存、只读存储器,可编程只读存储器或者电可擦写可编程存储器、寄存器等本领域成熟的存储介质中。该存储介质位于存储器1804,处理器1803读取存储器1804中的信息,结合其硬件完成上述方法的步骤。
接收器1801可用于接收输入的数字或字符信息,以及产生与执行设备的相关设置以及功能控制有关的信号输入。发射器1802可用于通过第一接口输出数字或字符信息;发射器1802还可用于通过第一接口向磁盘组发送指令,以修改磁盘组中的数据;发射器1802还可以包括显示屏等显示设备。
本申请实施例中,在一种情况下,处理器1803,用于执行图4对应实施例中的执行设备执行的降噪模型的训练方法。
本申请实施例还提供了一种训练设备,请参阅图19,图19为本申请实施例提供的训练设备的一种结构示意图,具体的,训练设备1900由一个或多个服务器实现,训练设备1900可因配置或性能不同而产生比较大的差异,可以包括一个或一个以上中央处理器(central processing units,CPU)1919(例如,一个或一个以上处理器)和存储器1932,一个或一个以上存储应用程序1942或数据1944的存储介质1930(例如一个或一个以上海量存储设备)。其中,存储器1932和存储介质1930可以是短暂存储或持久存储。存储在存储介质1930的程序可以包括一个或一个以上模块(图示没标出),每个模块可以包括对训练设备中的一系列指令操作。更进一步地,中央处理器1919可以设置为与存储介质1930通信,在训练设备1900上执行存储介质1930中的一系列指令操作。
训练设备1900还可以包括一个或一个以上电源1926,一个或一个以上有线或无线网络接口1950,一个或一个以上输入输出接口1958;或,一个或一个以上操作系统1941,例如Windows Server TM,Mac OS X TM,Unix TM,Linux TM,FreeBSD TM等等。
具体的,训练设备可以执行图4对应的实施例中的步骤。
本申请实施例中还提供一种计算机程序产品,当其在计算机上运行时,使得计算机执行如前述执行设备所执行的步骤,或者,使得计算机执行如前述训练设备所执行的步骤。
本申请实施例中还提供一种计算机可读存储介质,该计算机可读存储介质中存储有用于进行信号处理的程序,当其在计算机上运行时,使得计算机执行如前述执行设备所执行的步骤,或者,使得计算机执行如前述训练设备所执行的步骤。
本申请实施例提供的执行设备、训练设备或终端设备具体可以为芯片,芯片包括:处理单元和通信单元,所述处理单元例如可以是处理器,所述通信单元例如可以是输入/输出接口、管脚或电路等。该处理单元可执行存储单元存储的计算机执行指令,以使执行设备内的芯片执行上述实施例描述的数据处理方法,或者,以使训练设备内的芯片执行上述实施例描述的数据处理方法。可选地,所述存储单元为所述芯片内的存储单元,如寄存器、缓存等,所述存储单元还可以是所述无线接入设备端内的位于所述芯片外部的存储单元,如只读存储器(read-only memory,ROM)或可存储静态信息和指令的其他类型的静态存储设备,随机存取存储器(random access memory,RAM)等。
具体的,请参阅图20,图20为本申请实施例提供的芯片的一种结构示意图,所述芯片可以表现为神经网络处理器NPU 2000,NPU 2000作为协处理器挂载到主CPU(Host CPU)上,由Host CPU分配任务。NPU的核心部分为运算电路2003,通过控制器2004控制运算电路2003提取存储器中的矩阵数据并进行乘法运算。
在一些实现中,运算电路2003内部包括多个处理单元(Process Engine,PE)。在一些实现中,运算电路2003是二维脉动阵列。运算电路2003还可以是一维脉动阵列或者能够执行例如乘法和加法这样的数学运算的其它电子线路。在一些实现中,运算电路2003是通用的矩阵处理器。
举例来说,假设有输入矩阵A,权重矩阵B,输出矩阵C。运算电路从权重存储器2002中取矩阵B相应的数据,并缓存在运算电路中每一个PE上。运算电路从输入存储器2001中取矩阵A数据与矩阵B进行矩阵运算,得到的矩阵的部分结果或最终结果,保存在累加器(accumulator)2008中。
统一存储器2006用于存放输入数据以及输出数据。权重数据直接通过存储单元访问控制器(Direct Memory Access Controller,DMAC)2005被搬运到权重存储器2002中。输入数据也通过DMAC被搬运到统一存储器2006中。
BIU即总线接口单元(Bus Interface Unit)2013,用于AXI总线与DMAC和取指存储器(Instruction Fetch Buffer,IFB)2009的交互。
总线接口单元2013(Bus Interface Unit,简称BIU),用于取指存储器2009从外部存储器获取指令,还用于存储单元访问控制器2005从外部存储器获取输入矩阵A或者权重矩阵B的原数据。
DMAC主要用于将外部存储器DDR中的输入数据搬运到统一存储器2006或将权重数据搬运到权重存储器2002中或将输入数据搬运到输入存储器2001中。
向量计算单元2007包括多个运算处理单元,在需要的情况下,对运算电路2003的输出做进一步处理,如向量乘,向量加,指数运算,对数运算,大小比较等等。主要用于神经网络中非卷积/全连接层网络计算,如Batch Normalization(批归一化),像素级求和,对特征平面进行上采样等。
在一些实现中,向量计算单元2007能将经处理的输出的向量存储到统一存储器2006。例如,向量计算单元2007可以将线性函数或非线性函数应用到运算电路2003的输出,例如对卷积层提取的特征平面进行线性插值,再例如累加值的向量,用以生成激活值。在一些实现中,向量计算单元2007生成归一化的值、像素级求和的值,或二者均有。在一些实现中,处理过的输出的向量能够用作运算电路2003的激活输入,例如用于在神经网络中的后续层中的使用。
控制器2004连接的取指存储器(instruction fetch buffer)2009,用于存储控制器2004使用的指令;
统一存储器2006,输入存储器2001,权重存储器2002以及取指存储器2009均为On-Chip存储器。外部存储器私有于该NPU硬件架构。
其中,上述任一处提到的处理器,可以是一个通用中央处理器,微处理器,ASIC,或一个或多个用于控制上述程序执行的集成电路。
另外需说明的是,以上所描述的装置实施例仅仅是示意性的,其中所述作为分离部件说明的单元可以是或者也可以不是物理上分开的,作为单元显示的部件可以是或者也可以不是物理单元,即可以位于一个地方,或者也可以分布到多个网络单元上。可以根据实际的需要选择其中的部分或者全部模块来实现本实施例方案的目的。另外,本申请提供的装置实施例附图中,模块之间的连接关系表示它们之间具有通信连接,具体可以实现为一条或多条通信总线或信号线。
通过以上的实施方式的描述,所属领域的技术人员可以清楚地了解到本申请可借助软件加必需的通用硬件的方式来实现,当然也可以通过专用硬件包括专用集成电路、专用CPU、专用存储器、专用元器件等来实现。一般情况下,凡由计算机程序完成的功能都可以很容易地用相应的硬件来实现,而且,用来实现同一功能的具体硬件结构也可以是多种多样的,例如模拟电路、数字电路或专用电路等。但是,对本申请而言更多情况下软件程序实现是更佳的实施方式。基于这样的理解,本申请的技术方案本质上或者说对现有技术做出贡献的部分可以以软件产品的形式体现出来,该计算机软件产品存储在可读取的存储介质中,如计算机的软盘、U盘、移动硬盘、ROM、RAM、磁碟或者光盘等,包括若干指令用以使得一台计算机设备(可以是个人计算机,训练设备,或者网络设备等)执行本申请各个实施例所述的方法。
The foregoing embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented by software, they may be implemented in whole or in part in the form of a computer program product.
The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the processes or functions according to the embodiments of this application are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, training device, or data center to another website, computer, training device, or data center by wired (e.g., coaxial cable, optical fiber, digital subscriber line (DSL)) or wireless (e.g., infrared, radio, microwave) means. The computer-readable storage medium may be any available medium that a computer can store, or a data storage device, such as a training device or a data center, integrating one or more available media. The available medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., a solid state disk (Solid State Disk, SSD)), among others.

Claims (17)

  1. A method for training a noise reduction model, characterized by comprising:
    obtaining a to-be-denoised image sample from a sample set, the sample set comprising a plurality of to-be-denoised image samples;
    performing a first random downsampling process and a second random downsampling process on the to-be-denoised image sample to obtain a first sub-image and a second sub-image respectively, the first sub-image and the second sub-image having the same resolution;
    inputting the first sub-image into a noise reduction model to obtain a first target image;
    obtaining a first loss function according to the first target image and the second sub-image, the first loss function being used to indicate a difference between the first target image and the second sub-image; and
    training the noise reduction model at least according to the first loss function to obtain a target noise reduction model.
  2. The method according to claim 1, characterized in that the performing a first random downsampling process and a second random downsampling process on the to-be-denoised image sample to obtain a first sub-image and a second sub-image respectively comprises:
    dividing the to-be-denoised image sample into M image units, each of the M image units comprising n*n pixels;
    performing a first random selection of pixels in each of the M image units to obtain M first pixels, and obtaining the first sub-image according to the M first pixels; and
    performing a second random selection of pixels in each of the M image units to obtain M second pixels, and obtaining the second sub-image according to the M second pixels.
  3. The method according to claim 2, characterized in that the performing a second random selection of pixels in each of the M image units comprises:
    obtaining n*n-1 target pixels in each of the M image units, the n*n-1 target pixels being the pixels in each image unit that were not selected during the first random selection of pixels; and
    performing a random selection of pixels among the n*n-1 target pixels in each of the M image units to obtain M second pixels, the M second pixels all being different from the M first pixels.
  4. The method according to claim 3, characterized in that the performing a random selection of pixels among the n*n-1 target pixels in each of the M image units comprises:
    randomly selecting, among the n*n-1 target pixels in each of the M image units, a second pixel adjacent to the pixel selected in the first random selection, to obtain M second pixels, each of the M second pixels being adjacent to the corresponding first pixel.
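Claims 2-4 together describe a paired random subsampling: each n*n cell contributes one randomly chosen pixel to the first sub-image and a different, adjacent pixel to the second. The following is a minimal NumPy sketch of that sampling; the function name, the single-channel grayscale assumption, and the restriction to 4-neighbors are our illustrative choices, not requirements of the claims.

```python
import numpy as np

def neighbor_subsample(img, n=2, seed=None):
    """Divide an (H, W) image into n*n cells. From each cell, pick one
    pixel at random for sub-image 1 (first random selection), then pick
    a different pixel adjacent to it for sub-image 2 (second random
    selection, drawn from the n*n-1 pixels not chosen the first time)."""
    rng = np.random.default_rng(seed)
    H, W = img.shape
    h, w = H // n, W // n
    sub1 = np.empty((h, w), dtype=img.dtype)
    sub2 = np.empty((h, w), dtype=img.dtype)
    for i in range(h):
        for j in range(w):
            cell = img[i * n:(i + 1) * n, j * n:(j + 1) * n]
            r1, c1 = rng.integers(n), rng.integers(n)   # first selection
            # candidates: 4-neighbors of (r1, c1) that lie inside the cell
            nbrs = [(r1 + dr, c1 + dc)
                    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1))
                    if 0 <= r1 + dr < n and 0 <= c1 + dc < n]
            r2, c2 = nbrs[rng.integers(len(nbrs))]      # second selection
            sub1[i, j] = cell[r1, c1]
            sub2[i, j] = cell[r2, c2]
    return sub1, sub2
```

Both sub-images come out at resolution (H/n, W/n), consistent with the same-resolution requirement of claim 1.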
  5. The method according to any one of claims 1-4, characterized in that the method further comprises:
    inputting the to-be-denoised image sample into the noise reduction model to obtain a second target image;
    performing downsampling on the second target image based on the pixel sampling positions of the first random downsampling process to obtain a first sub-target image;
    performing downsampling on the second target image based on the pixel sampling positions of the second random downsampling process to obtain a second sub-target image; and
    obtaining a second loss function according to the first target image, the second sub-image, the first sub-target image, and the second sub-target image;
    and the training the noise reduction model at least according to the first loss function comprises:
    training the noise reduction model at least according to the first loss function and the second loss function.
  6. The method according to claim 5, characterized in that the training the noise reduction model at least according to the first loss function and the second loss function comprises:
    training the noise reduction model at least according to the first loss function, a first weight coefficient, the second loss function, and a second weight coefficient;
    wherein the first weight coefficient is used to indicate the weight of the first loss function, and the second weight coefficient is used to indicate the weight of the second loss function.
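Claims 5 and 6 train on a weighted combination of the two losses. The sketch below is one plausible instantiation assuming mean-squared differences; in particular, exactly how the four images enter the second (consistency) loss is our assumption, modeled on self-supervised neighbor-subsampling training schemes, and is not spelled out by the claims themselves.

```python
import numpy as np

def combined_loss(first_target, second_sub,
                  first_sub_target, second_sub_target,
                  w1=1.0, w2=1.0):
    """Return w1 * L1 + w2 * L2.

    L1 penalizes the difference between the denoised first sub-image
    (first target image) and the second sub-image.  L2 penalizes the
    inconsistency between that difference and the same difference
    recomputed from the full-image denoising result (the first and
    second sub-target images)."""
    l1 = np.mean((first_target - second_sub) ** 2)
    l2 = np.mean(((first_target - second_sub)
                  - (first_sub_target - second_sub_target)) ** 2)
    return w1 * l1 + w2 * l2
```

The weight coefficients w1 and w2 play exactly the roles the claims assign them: scaling the relative contribution of each loss during training.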
  7. The method according to any one of claims 1-6, characterized in that the noise reduction model comprises a convolutional neural network or a noise reduction model based on sparse feature representation.
  8. An image noise reduction method, characterized by comprising:
    obtaining a to-be-denoised image; and
    inputting the to-be-denoised image into a target noise reduction model to obtain a denoised image;
    wherein the target noise reduction model is obtained by training a noise reduction model at least based on a first loss function; the first loss function is obtained based on a first target image and a second sub-image, and is used to indicate a difference between the first target image and the second sub-image; the first target image is obtained by inputting a first sub-image into the noise reduction model; the first sub-image and the second sub-image are obtained by performing a first random downsampling process and a second random downsampling process respectively on a to-be-denoised image sample; and the first sub-image and the second sub-image have the same resolution.
  9. The method according to claim 8, characterized in that the first sub-image is obtained according to M first pixels, the M first pixels being obtained by performing a first random selection of pixels in each of M image units after dividing the to-be-denoised image sample into the M image units; and
    the second sub-image is obtained according to M second pixels, the M second pixels being obtained by performing a second random selection of pixels in each of the M image units after dividing the to-be-denoised image sample into the M image units.
  10. The method according to claim 9, characterized in that the M second pixels are obtained by performing a random selection of pixels among n*n-1 target pixels in each of the M image units, the n*n-1 target pixels being the pixels in each image unit that were not selected during the first random selection of pixels, and the M second pixels all being different from the M first pixels.
  11. The method according to claim 10, characterized in that the M second pixels are obtained by randomly selecting, among the n*n-1 target pixels in each of the M image units, a second pixel adjacent to the pixel selected in the first random selection, each of the M second pixels being adjacent to the corresponding first pixel.
  12. The method according to any one of claims 8-11, characterized in that the target noise reduction model is obtained by training the noise reduction model at least based on the first loss function and a second loss function; the second loss function is obtained according to the first target image, the second sub-image, a first sub-target image, and a second sub-target image; the first sub-target image is obtained by downsampling a second target image based on the pixel sampling positions of the first random downsampling process; the second sub-target image is obtained by downsampling the second target image based on the pixel sampling positions of the second random downsampling process; and the second target image is obtained by inputting the to-be-denoised image sample into the noise reduction model.
  13. The method according to claim 12, characterized in that the target noise reduction model is obtained by training the noise reduction model at least based on the first loss function, a first weight coefficient, the second loss function, and a second weight coefficient;
    wherein the first weight coefficient is used to indicate the weight of the first loss function, and the second weight coefficient is used to indicate the weight of the second loss function.
  14. The method according to any one of claims 8-13, characterized in that the noise reduction model comprises a convolutional neural network or a noise reduction model based on sparse feature representation.
  15. A terminal, characterized by comprising a memory and a processor, wherein the memory stores code and the processor is configured to execute the code; when the code is executed, the terminal performs the method according to any one of claims 1 to 14.
  16. A computer-readable storage medium, characterized by comprising computer-readable instructions which, when run on a computer, cause the computer to perform the method according to any one of claims 1 to 14.
  17. A computer program product, characterized by comprising computer-readable instructions which, when run on a computer, cause the computer to perform the method according to any one of claims 1 to 14.
PCT/CN2021/131656 2020-12-25 2021-11-19 Method for training noise reduction model, and related apparatus WO2022134971A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011565423.X 2020-12-25
CN202011565423.XA CN112598597A (zh) 2020-12-25 2020-12-25 Method for training noise reduction model, and related apparatus

Publications (1)

Publication Number Publication Date
WO2022134971A1 true WO2022134971A1 (zh) 2022-06-30

Family

ID=75202186

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/131656 WO2022134971A1 (zh) 2020-12-25 2021-11-19 一种降噪模型的训练方法及相关装置

Country Status (2)

Country Link
CN (1) CN112598597A (zh)
WO (1) WO2022134971A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117788843A (zh) * 2024-02-27 2024-03-29 青岛超瑞纳米新材料科技有限公司 Carbon nanotube image processing method based on neural network algorithm

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112598597A (zh) * 2020-12-25 2021-04-02 华为技术有限公司 一种降噪模型的训练方法及相关装置
CN113177497B (zh) * 2021-05-10 2024-04-12 百度在线网络技术(北京)有限公司 视觉模型的训练方法、车辆识别方法及装置
CN113611318A (zh) * 2021-06-29 2021-11-05 华为技术有限公司 一种音频数据增强方法及相关设备
CN113362259B (zh) * 2021-07-13 2024-01-09 商汤集团有限公司 图像降噪处理方法、装置、电子设备及存储介质
CN113610731B (zh) * 2021-08-06 2023-08-08 北京百度网讯科技有限公司 用于生成画质提升模型的方法、装置及计算机程序产品
CN115565212B (zh) * 2022-01-20 2023-08-04 荣耀终端有限公司 图像处理方法、神经网络模型训练方法及装置
CN114783454B (zh) * 2022-04-27 2024-06-04 北京百度网讯科技有限公司 一种模型训练、音频降噪方法、装置、设备及存储介质
CN117274109B (zh) * 2023-11-14 2024-04-23 荣耀终端有限公司 图像处理方法、降噪模型训练方法及电子设备

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190365341A1 (en) * 2018-05-31 2019-12-05 Canon Medical Systems Corporation Apparatus and method for medical image reconstruction using deep learning to improve image quality in position emission tomography (pet)
CN111598804A (zh) * 2020-05-12 2020-08-28 西安电子科技大学 Multi-stage image denoising method based on deep learning
CN111598808A (zh) * 2020-05-18 2020-08-28 腾讯科技(深圳)有限公司 Image processing method, apparatus, and device, and training method therefor
CN111882503A (zh) * 2020-08-04 2020-11-03 深圳高性能医疗器械国家研究院有限公司 Image noise reduction method and application thereof
CN111968058A (zh) * 2020-08-25 2020-11-20 北京交通大学 Low-dose CT image noise reduction method
CN112598597A (zh) * 2020-12-25 2021-04-02 华为技术有限公司 Method for training noise reduction model, and related apparatus

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110288547A (zh) * 2019-06-27 2019-09-27 北京字节跳动网络技术有限公司 Method and apparatus for generating an image denoising model
CN110782421B (zh) * 2019-09-19 2023-09-26 平安科技(深圳)有限公司 Image processing method and apparatus, computer device, and storage medium
CN111310903B (zh) * 2020-02-24 2023-04-07 清华大学 Three-dimensional single-molecule localization *** based on convolutional neural network
CN111768349A (zh) * 2020-06-09 2020-10-13 山东师范大学 Deep-learning-based ESPI image noise reduction method and ***
CN111951195A (zh) * 2020-07-08 2020-11-17 华为技术有限公司 Image enhancement method and apparatus



Also Published As

Publication number Publication date
CN112598597A (zh) 2021-04-02


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 21908970; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: pct application non-entry in european phase (Ref document number: 21908970; Country of ref document: EP; Kind code of ref document: A1)