WO2021063341A1 - Image enhancement method and device - Google Patents

Image enhancement method and device

Info

Publication number
WO2021063341A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
feature
processed
enhancement
enhanced
Prior art date
Application number
PCT/CN2020/118721
Other languages
English (en)
French (fr)
Inventor
宋风龙
熊志伟
王宪
黄杰
查正军
Original Assignee
华为技术有限公司
中国科学技术大学
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司 and 中国科学技术大学
Publication of WO2021063341A1

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00Image enhancement or restoration
    • G06T5/90Dynamic range modification of images or parts thereof
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20172Image enhancement details
    • G06T2207/20192Edge enhancement; Edge preservation

Definitions

  • This application relates to the field of artificial intelligence, and more specifically, to an image enhancement method and device in the field of computer vision.
  • Computer vision is an inseparable part of various intelligent/autonomous systems in application fields such as manufacturing, inspection, document analysis, medical diagnosis, and the military. It studies how to use cameras/video cameras and computers to obtain the data and information about a photographed subject that we need. Figuratively speaking, it means installing eyes (a camera/video camera) and a brain (algorithms) on a computer to replace the human eye in identifying, tracking, and measuring targets, so that the computer can perceive the environment. Because perception can be regarded as extracting information from sensory signals, computer vision can also be regarded as the science of how to make artificial systems "perceive" from images or multi-dimensional data.
  • computer vision is to use various imaging systems to replace the visual organs to obtain input information, and then the computer replaces the brain to complete the processing and interpretation of the input information.
  • the ultimate research goal of computer vision is to enable computers to observe and understand the world through vision like humans, and have the ability to adapt to the environment autonomously.
  • Image enhancement is an important branch in the field of image processing.
  • Image enhancement technology can improve image quality without re-collecting data to meet more practical application requirements.
  • image enhancement technology can purposefully emphasize the overall or local characteristics of an image (or video), make an unclear image clear or emphasize certain features of interest, expand the differences between the features of different objects in the image, and suppress uninteresting features, thereby improving image quality, enriching the amount of information, strengthening image interpretation and recognition, and meeting the needs of certain special analyses.
  • the present application provides an image enhancement method and device, which can enhance the performance of the image to be processed in terms of details, color, and brightness, thereby improving the effect of image enhancement processing.
  • an image enhancement method, including: acquiring an image to be processed; performing feature enhancement processing on the image to be processed through a neural network to obtain an enhanced image feature of the image to be processed, where the neural network includes N convolutional layers and N is a positive integer; and performing color enhancement processing and brightness enhancement processing on the image to be processed according to the enhanced image feature to obtain an output image.
  • the above-mentioned image to be processed may be an original image with poor image quality; for example, this may mean that the image to be processed is affected by factors such as weather, distance, and the shooting environment, so that the acquired image is blurred, has low image quality, or has low color saturation and brightness, among other issues.
  • the above-mentioned color enhancement processing can be used to improve the color distribution of the image to be processed, and increase the color saturation of the image to be processed;
  • the brightness enhancement processing can refer to adjusting the brightness of the image to be processed;
  • the feature enhancement processing can refer to enhancing the details of the image to be processed so that the image includes more detailed information; for example, feature enhancement processing can refer to performing enhancement on the image features of the image to be processed.
  • the image enhancement method provided by the embodiments of the application obtains the enhanced image feature of the image to be processed by performing feature enhancement processing on the image to be processed, and uses the enhanced image feature to further perform color enhancement processing and brightness enhancement processing on the image to be processed; in this way, while color and brightness are enhanced, the details of the image to be processed are also enhanced, so that the performance of the image to be processed in terms of detail, color, and brightness can all be improved, thereby improving the effect of the image enhancement processing (see the sketch below).
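  • For illustration only, the overall flow described above can be sketched as follows; the function and module names are assumptions chosen for readability, not taken from the patent.

```python
def enhance_image(image_to_process, feature_enhancer, color_brightness_head):
    """Illustrative pipeline: feature enhancement, then color and brightness enhancement.

    feature_enhancer:       a neural network with N convolutional layers that performs
                            feature enhancement processing on the image to be processed.
    color_brightness_head:  a module that performs color enhancement and brightness
                            enhancement according to the enhanced image feature.
    """
    enhanced_feature = feature_enhancer(image_to_process)
    output_image = color_brightness_head(image_to_process, enhanced_feature)
    return output_image
```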
  • image enhancement can also be referred to as image quality enhancement.
  • image quality enhancement can refer to processing the brightness, color, contrast, saturation, and/or dynamic range of the image so that various indicators of the image meet preset conditions.
  • the performing of the feature enhancement processing on the image to be processed through a neural network to obtain the enhanced image feature of the image to be processed includes: performing the feature enhancement processing on the image to be processed through a Laplacian enhancement algorithm to obtain the enhanced image feature of the image to be processed.
  • the feature enhancement of the image to be processed can be realized by the Laplacian enhancement algorithm, and the Laplacian enhancement algorithm can enhance the features of the image to be processed without introducing new textures, so that, to a certain extent, the problem of introducing false textures into the output image after image enhancement processing can be avoided and the effect of the image enhancement processing can be improved.
  • the aforementioned Laplacian enhancement algorithm may enhance the high-frequency features in the image to be processed, so as to obtain the enhanced image features of the image to be processed.
  • the high-frequency features of the image may refer to the details, texture, and other information of the image to be processed.
  • the Laplacian enhancement algorithm is used to perform the feature enhancement processing on the input image feature of the i-th convolutional layer among the N convolutional layers based on the residual feature of the i-th convolutional layer, so as to obtain the enhanced image feature of the i-th convolutional layer, where the residual feature represents the difference between the input image feature of the i-th convolutional layer and the image feature obtained after the convolution operation in the i-th convolutional layer, the enhanced image feature of the i-th convolutional layer is the input image feature of the (i+1)-th convolutional layer, the input image feature is obtained according to the image to be processed, and i is a positive integer.
  • the Laplacian enhancement algorithm may be an improved Laplacian enhancement algorithm
  • the improved Laplacian enhancement algorithm according to the embodiment of the present application may use the image features of a previous convolutional layer to enhance subsequent image features, so as to achieve progressive enhancement of the image features of different convolutional layers, which can improve the effect of the image enhancement processing.
  • the enhanced image feature of the image to be processed is the image feature output by the N-th convolutional layer among the N convolutional layers, which is obtained by the following equation, where:
  • L(F N ) represents the enhanced image feature of the Nth convolutional layer
  • F N represents the input image feature of the Nth convolutional layer
  • represents the convolution kernel of the N-th convolutional layer
  • s l represents the scaling parameter obtained through learning
  • N is a positive integer.
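  • The equation itself appears only as an image in the source publication. Based on the symbol definitions above and the residual-feature description in the preceding paragraphs, a plausible reconstruction (an assumption rather than a verbatim copy of the published formula, with W_N standing in for the omitted kernel symbol and \otimes denoting the convolution operation) is:

```latex
% Hedged reconstruction of the improved Laplacian enhancement at the N-th layer:
% the residual F_N - W_N \otimes F_N is scaled by the learned parameter s_l
% and added back to the layer input.
L(F_N) = F_N + s_l \cdot \left( F_N - W_N \otimes F_N \right)
```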
  • In the embodiments of the present application, the fixed scaling factor s_c of the conventional Laplacian enhancement algorithm can be replaced by the learned parameter s_l; at the same time, the residual feature of each layer is used to enhance the features layer by layer, and the residual feature may be used to indicate any information that needs to be emphasized. Therefore, the Laplacian algorithm of the embodiments of the present application can not only enhance the high-frequency information of the image, but also realize the progressive enhancement of the image features of different convolutional layers, thereby improving the effect of the image enhancement processing; a sketch of this progressive enhancement follows.
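  • A minimal PyTorch-style sketch of how such progressive, layer-by-layer enhancement could be stacked over N convolutional layers; this is an illustrative assumption, and the per-layer convolutions and learned scales are placeholders rather than the patent's exact architecture.

```python
import torch
import torch.nn as nn

class LaplacianEnhancer(nn.Module):
    """Illustrative stack of N convolutional layers with learned-scale residual enhancement."""

    def __init__(self, channels: int = 64, num_layers: int = 4):
        super().__init__()
        self.convs = nn.ModuleList(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1) for _ in range(num_layers)
        )
        # One learned scaling parameter s_l per layer, replacing the fixed factor s_c.
        self.scales = nn.Parameter(torch.ones(num_layers))

    def forward(self, feature: torch.Tensor) -> torch.Tensor:
        for conv, s_l in zip(self.convs, self.scales):
            residual = feature - conv(feature)   # residual feature of the i-th layer
            feature = feature + s_l * residual   # enhanced feature feeds the (i+1)-th layer
        return feature
```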
  • the performing of color enhancement processing and brightness enhancement processing on the image to be processed according to the enhanced image feature to obtain an output image includes: obtaining a confidence image feature and an illumination compensation image feature of the image to be processed according to the enhanced image feature of the image to be processed, where the confidence image feature is used to enhance the color of the image to be processed and the illumination compensation image feature is used to enhance the brightness of the image to be processed; and obtaining the output image according to the image to be processed, the confidence image feature, and the illumination compensation image feature.
  • the above-mentioned confidence image feature may represent a mapping relationship or a mapping function for performing color enhancement processing on the image to be processed.
  • the feature of the confidence image can correspond to the image feature of the image to be processed.
  • an element in the confidence image feature can be used to indicate the scaling degree of the corresponding element in the image feature of the image to be processed; scaling different regions in this way can realize the color enhancement of the image to be processed.
  • the above-mentioned enhanced image features of the image to be processed may include more detailed features and textures; performing color enhancement and brightness enhancement processing on the image to be processed according to the enhanced image features of the image to be processed can enable the output image to achieve detail enhancement while also improving the brightness and color of the output image.
  • the confidence image feature used for the color enhancement processing and the illumination compensation image feature used for the brightness enhancement processing can both be obtained from the enhanced image feature of the image to be processed; compared with traditional scaling methods, the confidence image feature and the illumination compensation image feature in the embodiment of this application can not only perform color enhancement and brightness enhancement, but also enhance the details of the processed image, thereby improving the effect of the image enhancement processing.
  • the method further includes: obtaining the confidence image feature and the illumination compensation image feature by performing a convolution operation on the enhanced image feature of the image to be processed; multiplying the image feature of the image to be processed by the confidence image feature to obtain a color-enhanced image feature of the image to be processed; and fusing the color-enhanced image feature and the illumination compensation image feature to obtain the output image.
  • the above-mentioned confidence image feature and illumination compensation image feature can be obtained from the enhanced image feature of the image to be processed in parallel; for example, a convolution operation can be performed on the enhanced image feature of the image to be processed through a first branch in the network model to obtain the confidence image feature, and a convolution operation can be performed on the enhanced image feature of the image to be processed through a second branch in the network model to obtain the illumination compensation image feature.
  • the fusion of the above-mentioned color-enhanced image feature and the illumination compensation image feature may refer to the addition of the color-enhanced image feature and the illumination compensation image feature.
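  • A minimal sketch of this two-branch structure, assuming a PyTorch-style implementation; the layer shapes and channel counts are illustrative and not taken from the patent.

```python
import torch
import torch.nn as nn

class HybridEnhancement(nn.Module):
    """Illustrative two-branch head: confidence (color) and illumination compensation (brightness)."""

    def __init__(self, channels: int = 64, out_channels: int = 3):
        super().__init__()
        # First branch: confidence image feature used for color enhancement.
        self.confidence_branch = nn.Conv2d(channels, out_channels, kernel_size=3, padding=1)
        # Second branch: illumination compensation feature used for brightness enhancement.
        self.illumination_branch = nn.Conv2d(channels, out_channels, kernel_size=3, padding=1)

    def forward(self, image_to_process: torch.Tensor, enhanced_feature: torch.Tensor) -> torch.Tensor:
        confidence = self.confidence_branch(enhanced_feature)      # per-element scaling map
        illumination = self.illumination_branch(enhanced_feature)  # additive brightness map
        color_enhanced = image_to_process * confidence             # element-wise multiplication
        return color_enhanced + illumination                       # fusion by addition
```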
  • an image enhancement method, including: detecting a first operation of a user to open a camera; in response to the first operation, displaying a shooting interface on the display screen, the shooting interface including a viewfinder frame, and the viewfinder frame including a first image; detecting a second operation instructed by the user to the camera; and, in response to the second operation, displaying a second image in the viewfinder frame or saving a second image in the electronic device, where the second image is obtained by performing color enhancement processing and brightness enhancement processing on the first image according to an enhanced image feature of the first image, the enhanced image feature of the first image is obtained by performing feature enhancement processing on the first image through a neural network, the neural network includes N convolutional layers, and N is a positive integer.
  • the foregoing specific process of performing feature enhancement processing on the first image may be obtained according to the foregoing first aspect and any one of the implementation manners of the first aspect.
  • the image enhancement method provided by the embodiments of this application can be applied to the field of photographing of smart terminals.
  • the image enhancement method of the embodiments of this application can perform image enhancement processing on original images with poor image quality acquired by the smart terminal, so as to obtain an output image with improved image quality.
  • for example, image enhancement processing can be performed on the acquired original image when the smart terminal is taking a real-time photo, and the output image after the image enhancement processing is displayed on the screen of the smart terminal; alternatively, the output image obtained by performing image enhancement processing on the acquired original image can also be saved to the album of the smart terminal.
  • an image enhancement method, including: acquiring a road image to be processed; performing feature enhancement processing on the road image to be processed through a neural network to obtain an enhanced image feature of the road image to be processed, where the neural network includes N convolutional layers and N is a positive integer; performing color enhancement processing and brightness enhancement processing on the road image to be processed according to the enhanced image feature to obtain a processed output road image; and recognizing information in the output road image according to the processed output road image.
  • the foregoing specific process of performing feature enhancement processing on the road image may be obtained according to the foregoing first aspect and any one of the implementation manners of the first aspect.
  • the image enhancement method provided in the embodiment of the present application may be applied to the field of automatic driving.
  • it can be applied to the navigation system of an autonomous vehicle.
  • the image enhancement method in this application enables an autonomous vehicle, during navigation while driving on a road, to perform image enhancement processing on acquired original road images of lower image quality to obtain enhanced road images, thereby improving the driving safety of the autonomous vehicle.
  • an image enhancement method, including: acquiring a street view image; performing feature enhancement processing on the street view image through a neural network to obtain an enhanced image feature of the street view image, where the neural network includes N convolutional layers and N is a positive integer; performing color enhancement processing and brightness enhancement processing on the street view image according to the enhanced image feature to obtain a processed output street view image; and recognizing information in the output street view image according to the processed output street view image.
  • the foregoing specific process of performing feature enhancement processing on the street view image may be obtained according to the foregoing first aspect and any one of the implementation manners of the first aspect.
  • the image enhancement method provided in the embodiment of the present application can be applied to the security field.
  • the image enhancement method of the embodiment of the present application can be applied to the surveillance image enhancement of a safe city.
  • the images (or videos) collected by surveillance equipment in public places are often affected by factors such as weather and distance, and suffer from problems such as blurring and low image quality.
  • the image enhancement method of the present application can perform image enhancement on the collected original images, so that important information such as license plate numbers and clear human faces can be recovered for public security personnel, and important clue information can be provided for case detection.
  • an image enhancement device, which includes modules for executing the image enhancement method in any one of the foregoing first to fourth aspects and the implementations of the first to fourth aspects.
  • an image enhancement device including: a memory for storing a program; a processor for executing the program stored in the memory, and when the program stored in the memory is executed, the processor is configured to execute: Processing an image; performing feature enhancement processing on the image to be processed to obtain an enhanced image feature of the image to be processed; performing color enhancement processing and brightness enhancement processing on the image to be processed according to the enhanced image feature to obtain an output image.
  • the processor included in the above-mentioned image enhancement device is also used to execute the image enhancement method in any one of the above-mentioned first to fourth aspects and the implementations of the first to fourth aspects.
  • a computer-readable medium is provided, which stores program code for execution by a device, and the program code includes instructions for executing the image enhancement method in any one of the first to fourth aspects and the implementations of the first to fourth aspects.
  • a computer program product containing instructions is provided; when the computer program product runs on a computer, the computer executes the image enhancement method in any one of the first to fourth aspects and the implementations of the first to fourth aspects.
  • in a ninth aspect, a chip is provided, which includes a processor and a data interface; the processor reads instructions stored in a memory through the data interface and executes the image enhancement method in any one of the first to fourth aspects and the implementations of the first to fourth aspects.
  • the chip may further include a memory in which instructions are stored, and the processor is configured to execute instructions stored on the memory.
  • the processor is configured to execute the image enhancement method in any one of the foregoing first to fourth aspects and the implementations of the first to fourth aspects.
  • FIG. 1 is a schematic diagram of an artificial intelligence main body framework provided by an embodiment of the present application
  • Figure 2 is a schematic diagram of an application scenario provided by an embodiment of the present application.
  • FIG. 3 is a schematic diagram of another application scenario provided by an embodiment of the present application.
  • FIG. 4 is a schematic diagram of still another application scenario provided by an embodiment of the present application.
  • Figure 5 is a schematic diagram of yet another application scenario provided by an embodiment of the present application.
  • FIG. 6 is a schematic structural diagram of a system architecture provided by an embodiment of the present application.
  • FIG. 7 is a schematic diagram of a convolutional neural network structure provided by an embodiment of the present application.
  • FIG. 8 is a schematic diagram of a chip hardware structure provided by an embodiment of the present application.
  • FIG. 9 is a schematic diagram of a system architecture provided by an embodiment of the present application.
  • FIG. 10 is a schematic flowchart of an image enhancement method provided by an embodiment of the present application.
  • FIG. 11 is a schematic structural diagram of an image enhancement model provided by an embodiment of the present application.
  • FIG. 12 is a schematic diagram of a Laplacian enhancement unit and a hybrid enhancement unit provided by an embodiment of the present application.
  • FIG. 13 is a schematic diagram of feature enhancement processing provided by an embodiment of the present application.
  • FIG. 14 is a schematic diagram of color enhancement processing and brightness enhancement processing provided by an embodiment of the present application.
  • FIG. 15 is a schematic diagram of color enhancement processing and brightness enhancement processing provided by an embodiment of the present application.
  • FIG. 16 is a schematic diagram of a visual quality evaluation result provided by an embodiment of the present application.
  • FIG. 17 is a schematic flowchart of an image enhancement method provided by an embodiment of the present application.
  • FIG. 18 is a schematic diagram of a set of display interfaces provided by an embodiment of the present application.
  • FIG. 19 is a schematic diagram of another set of display interfaces provided by an embodiment of the present application.
  • FIG. 20 is a schematic block diagram of an image enhancement device provided by an embodiment of the present application.
  • FIG. 21 is a schematic diagram of the hardware structure of an image enhancement device provided by an embodiment of the present application.
  • the images in the embodiments of the present application may be static images (or referred to as static pictures) or dynamic images (or referred to as dynamic pictures).
  • the images in the present application may be videos or dynamic pictures, or the images in the present application may also be static pictures or photos.
  • static images or dynamic images are collectively referred to as images.
  • Figure 1 shows a schematic diagram of an artificial intelligence main framework, which describes the overall workflow of the artificial intelligence system and is suitable for general artificial intelligence field requirements.
  • Intelligent Information Chain reflects a series of processes from data acquisition to processing. For example, it can be the general process of intelligent information perception, intelligent information representation and formation, intelligent reasoning, intelligent decision-making, intelligent execution and output. In this process, the data has gone through the condensing process of "data-information-knowledge-wisdom".
  • the "IT value chain” is the industrial ecological process from the underlying infrastructure and information (providing and processing technology realization) of human intelligence to the system, reflecting the value that artificial intelligence brings to the information technology industry.
  • the infrastructure provides computing power support for the artificial intelligence system, realizes communication with the outside world, and realizes support through the basic platform.
  • the infrastructure can communicate with the outside through sensors, and the computing power of the infrastructure can be provided by smart chips.
  • the smart chip here can be a hardware acceleration chip such as a central processing unit (CPU), a neural-network processing unit (NPU), a graphics processing unit (GPU), an application-specific integrated circuit (ASIC), or a field programmable gate array (FPGA).
  • the basic platform of infrastructure can include distributed computing framework and network and other related platform guarantees and support, and can include cloud storage and computing, interconnection networks, etc.
  • data can be obtained through sensors and external communication, and then these data can be provided to the smart chip in the distributed computing system provided by the basic platform for calculation.
  • the data in the upper layer of the infrastructure is used to represent the data source in the field of artificial intelligence.
  • the data involves graphics, images, voice, text, and IoT data of traditional devices, including business data of existing systems and sensory data such as force, displacement, liquid level, temperature, and humidity.
  • the above-mentioned data processing usually includes data training, machine learning, deep learning, search, reasoning, decision-making and other processing methods.
  • machine learning and deep learning can symbolize and formalize data for intelligent information modeling, extraction, preprocessing, training, etc.
  • Reasoning refers to the process of simulating human intelligent reasoning in a computer or intelligent system, using formal information to conduct machine thinking and solving problems based on reasoning control strategies.
  • the typical function is search and matching.
  • Decision-making refers to the process of making decisions based on intelligent information after reasoning, and usually provides functions such as classification, sorting, and prediction.
  • some general capabilities can be formed based on the results of the data processing, such as an algorithm or a general system, for example, translation, text analysis, computer vision processing, speech recognition, image recognition, and so on.
  • Intelligent products and industry applications refer to the products and applications of artificial intelligence systems in various fields; they are an encapsulation of the overall artificial intelligence solution, productizing intelligent information decision-making and realizing practical applications. The application fields mainly include intelligent manufacturing, intelligent transportation, smart home, smart medical care, smart security, autonomous driving, safe city, smart terminals, and so on.
  • Fig. 2 is a schematic diagram of an application scenario of an image enhancement method provided by an embodiment of the present application.
  • the technical solution of the embodiment of the present application can be applied to a smart terminal.
  • the image enhancement method in the embodiment of the present application can perform image enhancement processing on an input image to obtain an output image of the input image after image enhancement.
  • the smart terminal may be mobile or fixed.
  • the smart terminal may be a mobile phone with an image enhancement function, a tablet personal computer (TPC), a media player, a smart TV, a laptop computer (LC), a personal digital assistant (PDA), a personal computer (PC), a camera, a video camera, a smart watch, an augmented reality (AR)/virtual reality (VR) device, a wearable device (WD), a self-driving vehicle, or the like, which is not limited in the embodiment of the present application.
  • image enhancement can also be referred to as image quality enhancement. Specifically, it can refer to processing the brightness, color, contrast, saturation, and/or dynamic range of the image so that various indicators of the image meet preset conditions.
  • image enhancement and image quality enhancement have the same meaning.
  • Application scenario 1: photographing with a smart terminal
  • the image enhancement method of the embodiment of the present application may be applied to the shooting of a smart terminal device (for example, a mobile phone).
  • the image enhancement method of the embodiment of the present application can perform image enhancement processing on the acquired original image of poor quality to obtain an output image with improved image quality.
  • the color image portion is represented by oblique line filling.
  • the image enhancement method of the embodiment of the present application can perform image enhancement processing on the acquired original image when the smart terminal is taking real-time photos, and display the output image after the image enhancement processing on the screen of the smart terminal.
  • the image enhancement method of the embodiment of the present application may be used to perform image enhancement processing on the acquired original image, and the output image after the image enhancement processing can be saved in the album of the smart terminal.
  • this application proposes an image enhancement method, which is applied to an electronic device with a display screen and a camera, and includes: detecting a first operation of a user to turn on the camera; in response to the first operation, displaying a photographing interface on the display screen, the photographing interface including a viewfinder frame, and the viewfinder frame including a first image; detecting a second operation instructed by the user to the camera; and, in response to the second operation, displaying a second image in the viewfinder frame or saving a second image in the electronic device, where the second image is obtained by performing color enhancement processing and brightness enhancement processing on the first image according to the enhanced image feature of the first image
  • the enhanced image feature of the first image is obtained by performing feature enhancement processing on the first image through a neural network, and the neural network includes N convolutional layers, and N is a positive integer.
  • the extensions, limitations, explanations, and descriptions of the related content of the image enhancement method in the related embodiments of FIGS. 6 to 16 below are also applicable to the image enhancement method provided by the embodiment of the present application, and will not be repeated here.
  • the image enhancement method of the embodiment of the present application can be applied to the field of automatic driving.
  • it can be applied to the navigation system of an autonomous vehicle.
  • the image enhancement method in this application enables an autonomous vehicle, during navigation while driving on a road, to perform image enhancement processing on acquired original road images of lower image quality to obtain enhanced road images, thereby improving the driving safety of the autonomous vehicle.
  • this application provides an image enhancement method.
  • the method includes: acquiring a road image to be processed; performing feature enhancement processing on the road image to be processed through a neural network to obtain an enhanced image feature of the road image to be processed, where the neural network includes N convolutional layers and N is a positive integer; performing color enhancement processing and brightness enhancement processing on the road image to be processed according to the enhanced image feature to obtain a processed output road image; and recognizing information in the output road image according to the processed output road image.
  • the extensions, limitations, explanations, and descriptions of the related content of the image enhancement method in the related embodiments of FIGS. 6 to 16 below are also applicable to the image enhancement method provided by the embodiment of the present application, and will not be repeated here.
  • the image enhancement method of the embodiment of the present application can be applied to the security field.
  • the image enhancement method of the embodiment of the present application can be applied to the surveillance image enhancement of a safe city.
  • the images (or videos) collected by surveillance equipment in public places are often affected by factors such as weather and distance, and suffer from problems such as blurring and low image quality.
  • the image enhancement method of the present application can perform image enhancement on the collected pictures, so that important information such as license plate numbers and clear human faces can be recovered for public security personnel, and important clue information can be provided for case detection.
  • the present application provides an image enhancement method, the method including: acquiring a street view image; performing feature enhancement processing on the street view image through a neural network to obtain an enhanced image feature of the street view image, where the neural network includes N convolutional layers and N is a positive integer; performing color enhancement processing and brightness enhancement processing on the street view image according to the enhanced image feature to obtain a processed output street view image; and recognizing information in the output street view image according to the processed output street view image.
  • the extensions, limitations, explanations, and descriptions of the related content of the image enhancement method in the related embodiments of FIGS. 6 to 16 below are also applicable to the image enhancement method provided by the embodiment of the present application, and will not be repeated here.
  • the image enhancement method of the embodiment of the present application may also be applied to a film source enhancement scenario.
  • for example, when a smart terminal (such as a smart TV or a smart screen) is used to play a movie, in order to display better image quality (picture quality), the image enhancement method of the embodiment of the application can be used to perform enhancement processing on the original film source of the movie, so as to improve the image quality of the film source and obtain a better visual impression.
  • for example, the image enhancement method of the embodiment of the application can be used to perform image enhancement processing on the film sources of old movies, so that they can present the visual feel of modern movies.
  • for example, the film source of an old movie can be enhanced through the image enhancement method of the embodiment of the present application into high-quality video of the high-dynamic range (HDR) 10 or Dolby Vision standard.
  • the present application provides an image enhancement method, which includes: obtaining an original image (for example, the original film source of a movie); performing feature enhancement processing on the original image through a neural network to obtain an enhanced image feature of the original image, where the neural network includes N convolutional layers and N is a positive integer; and performing color enhancement processing and brightness enhancement processing on the original image according to the enhanced image feature to obtain a processed output image (for example, a film source with enhanced picture quality).
  • the extensions, limitations, explanations, and descriptions of the related content of the image enhancement method in the related embodiments of FIGS. 6 to 16 below are also applicable to the image enhancement method provided by the embodiment of the present application, and will not be repeated here.
  • a neural network can be composed of neural units.
  • a neural unit can refer to an arithmetic unit that takes x s and intercept 1 as inputs.
  • the output of the arithmetic unit can be:
  • where s = 1, 2, ..., n, and n is a natural number greater than 1
  • W s is the weight of x s
  • b is the bias of the neural unit.
  • f is the activation function of the neural unit, which is used to introduce nonlinear characteristics into the neural network to convert the input signal in the neural unit into an output signal.
  • the output signal of the activation function can be used as the input of the next convolutional layer, and the activation function can be a sigmoid function.
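  • The output formula of the neural unit is rendered as an image in the source and does not appear in the text; the standard form consistent with the definitions above (assumed here rather than quoted from the publication) is:

```latex
% Output of a single neural unit: a weighted sum of the inputs x_s plus the bias b,
% passed through the activation function f.
h_{W,b}(x) = f\left( \sum_{s=1}^{n} W_s x_s + b \right)
```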
  • a neural network is a network formed by connecting multiple above-mentioned single neural units together, that is, the output of one neural unit can be the input of another neural unit.
  • the input of each neural unit can be connected with the local receptive field of the previous layer to extract the characteristics of the local receptive field.
  • the local receptive field can be a region composed of several neural units.
  • A deep neural network (DNN) is also known as a multi-layer neural network.
  • the DNN is divided according to the positions of different layers.
  • the neural network inside the DNN can be divided into three categories: input layer, hidden layer, and output layer.
  • the first layer is the input layer
  • the last layer is the output layer
  • the number of layers in the middle are all hidden layers.
  • the layers are fully connected, that is to say, any neuron in the i-th layer must be connected to any neuron in the i+1th layer.
  • Although a DNN looks complicated, the work of each layer is not complicated. Simply put, each layer is the following linear relationship expression: $\vec{y} = \alpha(W \cdot \vec{x} + \vec{b})$, where $\vec{x}$ is the input vector, $\vec{y}$ is the output vector, $\vec{b}$ is the offset vector, $W$ is the weight matrix (also called coefficients), and $\alpha()$ is the activation function. Each layer simply performs this operation on the input vector $\vec{x}$ to obtain the output vector $\vec{y}$. Because a DNN has a large number of layers, the number of coefficient matrices $W$ and offset vectors $\vec{b}$ is also large.
  • The definition of these parameters in a DNN is as follows, taking the coefficient $W$ as an example: suppose that in a three-layer DNN, the linear coefficient from the fourth neuron in the second layer to the second neuron in the third layer is defined as $W_{24}^{3}$, where the superscript 3 represents the layer in which the coefficient $W$ is located, and the subscripts correspond to the output third-layer index 2 and the input second-layer index 4. In summary, the coefficient from the k-th neuron in the (L-1)-th layer to the j-th neuron in the L-th layer is defined as $W_{jk}^{L}$.
  • Convolutional neural network (convolutional neuron network, CNN) is a deep neural network with a convolutional structure.
  • the convolutional neural network contains a feature extractor composed of a convolutional layer and a sub-sampling layer.
  • the feature extractor can be regarded as a filter.
  • the convolutional layer refers to the neuron layer that performs convolution processing on the input signal in the convolutional neural network.
  • a neuron can only be connected to a part of the neighboring neurons.
  • a convolutional layer usually contains several feature planes, and each feature plane can be composed of some rectangularly arranged neural units. Neural units in the same feature plane share weights, and the shared weights here are the convolution kernels.
  • Weight sharing can be understood as meaning that the way image information is extracted is independent of location.
  • the convolution kernel can be initialized in the form of a matrix of random size, and the convolution kernel can obtain reasonable weights through learning during the training process of the convolutional neural network.
  • the direct benefit of sharing weights is to reduce the connections between the layers of the convolutional neural network, and at the same time reduce the risk of overfitting.
  • Taking the loss function, an important equation, as an example: the higher the output value (loss) of the loss function, the greater the difference, so the training of the deep neural network becomes a process of reducing this loss as much as possible.
  • the neural network can use an error back propagation (BP) algorithm to correct the values of the parameters in the initial neural network model during the training process, so that the reconstruction error loss of the neural network model becomes smaller and smaller. Specifically, the input signal is passed forward until the output produces an error loss, and the parameters in the initial neural network model are updated by back-propagating the error loss information, so that the error loss converges.
  • the backpropagation algorithm is a backpropagation motion dominated by error loss, and aims to obtain the optimal parameters of the neural network model, such as the weight matrix.
  • Fig. 6 shows a system architecture 200 provided by an embodiment of the present application.
  • a data collection device 260 is used to collect training data.
  • the image enhancement model also referred to as an image enhancement network
  • the training data collected by the data collection device 260 can be training images.
  • the training data of the training image enhancement model in the embodiment of the present application may include original images and sample enhanced images.
  • the original image may refer to an image with lower image quality
  • the sample-enhanced image may refer to an image with higher image quality.
  • for example, it may refer to an image in which one or more aspects such as brightness, color, and details have been improved relative to the original image.
  • image enhancement can also be referred to as image quality enhancement, which can specifically refer to processing the brightness, color, contrast, saturation, and/or dynamic range of an image, so that various indicators of the image meet preset conditions .
  • image enhancement and image quality enhancement have the same meaning.
  • the data collection device 260 stores the training data in the database 230, and the training device 220 trains based on the training data maintained in the database 230 to obtain the target model/rule 201 (that is, the image enhancement model in the embodiment of the present application) .
  • the training device 220 inputs the training data to the image enhancement model until the difference between the predicted enhanced image output by the image enhancement model being trained and the sample enhanced image meets a preset condition (for example, the difference between the predicted enhanced image and the sample enhanced image is less than a certain threshold, or the difference between the predicted enhanced image and the sample enhanced image remains unchanged or no longer decreases), thereby completing the training of the target model/rule 201.
  • a preset condition for example, the difference between the predicted enhanced image and the sample enhanced image is less than a certain threshold , Or predict that the difference between the enhanced image and the sample enhanced image remains unchanged or no longer decreases
  • the image enhancement model used to perform the image enhancement method in the embodiment of the present application can realize end-to-end training.
  • for example, the image enhancement model can be trained end-to-end through an input image and a sample (for example, a ground-truth image) corresponding to the input image.
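  • A minimal sketch of such an end-to-end training loop, in a PyTorch style; the loss function, optimizer, and stopping threshold are illustrative assumptions rather than choices stated in the patent.

```python
import torch

def train_enhancement_model(model, dataloader, num_epochs=10, loss_threshold=1e-3):
    """Train until the difference between predicted and sample enhanced images meets a preset condition."""
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
    criterion = torch.nn.L1Loss()  # measures the difference between prediction and sample
    for epoch in range(num_epochs):
        for original_image, sample_enhanced_image in dataloader:
            predicted_enhanced_image = model(original_image)
            loss = criterion(predicted_enhanced_image, sample_enhanced_image)
            optimizer.zero_grad()
            loss.backward()   # error back propagation updates the model parameters
            optimizer.step()
        if loss.item() < loss_threshold:  # preset condition: difference below a threshold
            break
    return model
```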
  • the target model/rule 201 is obtained by training an image enhancement model. It should be noted that in actual applications, the training data maintained in the database 230 may not all come from the collection of the data collection device 260, and may also be received from other devices.
  • the training device 220 does not necessarily perform the training of the target model/rule 201 entirely based on the training data maintained by the database 230; it may also obtain training data from the cloud or elsewhere for model training, and the above description should not be construed as a limitation on the embodiments of this application. It should also be noted that at least part of the training data maintained in the database 230 may also be used in the process in which the execution device 210 processes the image to be processed.
  • the target model/rule 201 trained by the training device 220 can be applied to different systems or devices, such as the execution device 210 shown in FIG. 6; the execution device 210 may be a terminal, such as a mobile phone terminal, a tablet computer, a laptop computer, an AR/VR device, or a vehicle-mounted terminal, or may be a server, the cloud, or the like.
  • the execution device 210 is configured with an input/output (input/output, I/O) interface 212 for data interaction with external devices.
  • the user can input data to the I/O interface 212 through the client device 240.
  • the input data in this embodiment of the present application may include: a to-be-processed image input by the client device.
  • the preprocessing module 213 and the preprocessing module 214 are used to perform preprocessing according to the input data (such as the image to be processed) received by the I/O interface 212.
  • in other cases, the preprocessing module 213 and the preprocessing module 214 may not be provided (or only one of the preprocessing modules may be provided), and the calculation module 211 is directly used to process the input data.
  • when the execution device 210 preprocesses the input data, or when the calculation module 211 of the execution device 210 performs calculation or other related processing, the execution device 210 can call data, codes, and the like in the data storage system 250 for corresponding processing, and the data, instructions, and the like obtained by the corresponding processing may also be stored in the data storage system 250.
  • finally, the I/O interface 212 returns the processing result, that is, the output image obtained by performing image enhancement on the image to be processed as described above, to the client device 240 so as to provide it to the user.
  • the training device 220 can generate corresponding target models/rules 201 based on different training data for different goals or tasks, and the corresponding target models/rules 201 can be used to achieve the above goals or complete The above tasks provide users with the desired results.
  • the user can manually set input data, and the manual setting can be operated through the interface provided by the I/O interface 212.
  • the client device 240 can automatically send input data to the I/O interface 212. If the client device 240 is required to automatically send the input data and the user's authorization is required, the user can set the corresponding authority in the client device 240. The user can view the result output by the execution device 210 on the client device 240, and the specific presentation form may be a specific manner such as display, sound, and action.
  • the client device 240 can also be used as a data collection terminal to collect the input data of the input I/O interface 212 and the output result of the output I/O interface 212 as new sample data, and store it in the database 230 as shown in the figure. Of course, it is also possible not to collect through the client device 240. Instead, the I/O interface 212 directly uses the input data input to the I/O interface 212 and the output result of the output I/O interface 212 as a new sample as shown in the figure. The data is stored in the database 230.
  • FIG. 6 is only a schematic diagram of a system architecture provided by an embodiment of the present application, and the positional relationship between the devices, devices, modules, etc. shown in the figure does not constitute any limitation.
  • for example, in FIG. 6 the data storage system 250 is an external memory relative to the execution device 210; in other cases, the data storage system 250 may also be placed in the execution device 210.
  • the target model/rule 201 is trained according to the training device 220.
  • the target model/rule 201 may be an image enhancement model in the embodiment of the present application.
  • the image enhancement model provided in the embodiment of the present application may be a deep neural network, a convolutional neural network, or a deep convolutional neural network, among others.
  • a convolutional neural network is a deep neural network with a convolutional structure. It is a deep learning architecture.
  • the deep learning architecture refers to performing multiple levels of learning at different abstraction levels of the system through machine learning algorithms.
  • a convolutional neural network is a feed-forward artificial neural network, and each neuron in the feed-forward artificial neural network can respond to the input image.
  • the convolutional neural network 300 may include an input layer 310, a convolutional layer/pooling layer 320 (wherein the pooling layer is optional), a fully connected layer 330 and an output layer 340.
  • the input layer 310 can obtain the image to be processed, and pass the obtained image to be processed to the convolutional layer/pooling layer 320 and the fully connected layer 330 for processing, and the processing result of the image can be obtained.
  • the convolutional layer/pooling layer 320 may include layers 321-326. For example, in one implementation, layer 321 is a convolutional layer, layer 322 is a pooling layer, layer 323 is a convolutional layer, layer 324 is a pooling layer, layer 325 is a convolutional layer, and layer 326 is a pooling layer; in another implementation, layers 321 and 322 are convolutional layers, layer 323 is a pooling layer, layers 324 and 325 are convolutional layers, and layer 326 is a pooling layer. That is, the output of a convolutional layer can be used as the input of a subsequent pooling layer, or as the input of another convolutional layer to continue the convolution operation.
  • the convolution layer 321 can include many convolution operators.
  • the convolution operator is also called a kernel. Its role in image processing is equivalent to a filter that extracts specific information from the input image matrix.
  • the convolution operator can essentially be a weight matrix, and this weight matrix is usually predefined. In the process of performing a convolution operation on the image, the weight matrix is usually moved along the horizontal direction of the input image one pixel at a time (or two pixels at a time, depending on the value of the stride), so as to complete the work of extracting specific features from the image.
  • the size of the weight matrix should be related to the size of the image. It should be noted that the depth dimension of the weight matrix and the depth dimension of the input image are the same.
  • in the process of convolution, the weight matrix extends to the entire depth of the input image. Therefore, convolution with a single weight matrix produces a convolution output with a single depth dimension, but in most cases a single weight matrix is not used; instead, multiple weight matrices of the same size (rows × columns), that is, multiple homogeneous matrices, are applied.
  • the output of each weight matrix is stacked to form the depth dimension of the convolutional image, where the dimension can be understood as determined by the "multiple" mentioned above.
  • different weight matrices can be used to extract different features of the image; for example, one weight matrix is used to extract edge information of the image, another weight matrix is used to extract a specific color of the image, and yet another weight matrix is used to blur unwanted noise in the image, and so on.
  • when the multiple weight matrices have the same size (rows × columns), the convolution feature maps extracted by these weight matrices of the same size also have the same size, and the multiple extracted convolution feature maps of the same size are then merged to form the output of the convolution operation.
  • weight values in these weight matrices need to be obtained through a lot of training in practical applications.
  • Each weight matrix formed by the weight values obtained through training can be used to extract information from the input image, so that the convolutional neural network 300 can make correct predictions.
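  • As an illustrative check (not taken from the patent), applying 16 weight matrices of size 3×3 to a 3-channel input stacks 16 feature maps, so the depth dimension of the convolution output is 16:

```python
import torch
import torch.nn as nn

# 16 kernels, each spanning the full input depth of 3 channels.
conv = nn.Conv2d(in_channels=3, out_channels=16, kernel_size=3, stride=1, padding=1)
image = torch.randn(1, 3, 64, 64)   # one 3-channel input image
features = conv(image)              # 16 kernels -> 16 stacked feature maps
print(features.shape)               # torch.Size([1, 16, 64, 64])
```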
  • the initial convolutional layer (such as 321) often extracts more general features, which can also be called low-level features; as the depth of the convolutional neural network 300 increases, the features extracted by the later convolutional layers (such as 326) become more and more complex, such as features with high-level semantics, and features with higher-level semantics are more suitable for the problem to be solved.
  • a convolutional layer may be followed by one pooling layer, or multiple convolutional layers may be followed by one or more pooling layers.
  • the purpose of the pooling layer is to reduce the size of the image space.
  • the pooling layer may include an average pooling operator and/or a maximum pooling operator for sampling the input image to obtain an image with a smaller size.
  • the average pooling operator can calculate the pixel values in the image within a specific range to generate an average value as the result of the average pooling.
  • the maximum pooling operator can take the pixel with the largest value within a specific range as the result of the maximum pooling.
  • the operators in the pooling layer should also be related to the image size.
  • the size of the image output after processing by the pooling layer can be smaller than the size of the image of the input pooling layer, and each pixel in the image output by the pooling layer represents the average value or the maximum value of the corresponding sub-region of the image input to the pooling layer.
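  • For example (illustrative only), a 2×2 max pooling operator halves the spatial size of a feature map, with each output pixel taking the maximum of the corresponding 2×2 sub-region of the input:

```python
import torch
import torch.nn as nn

pool = nn.MaxPool2d(kernel_size=2, stride=2)
feature_map = torch.randn(1, 16, 64, 64)
pooled = pool(feature_map)
print(pooled.shape)   # torch.Size([1, 16, 32, 32])
```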
  • after processing by the convolutional layer/pooling layer 320, the convolutional neural network 300 is not yet able to output the required output information, because, as mentioned above, the convolutional layer/pooling layer 320 only extracts features and reduces the parameters brought by the input image. However, in order to generate the final output information (the required class information or other related information), the convolutional neural network 300 needs to use the fully connected layer 330 to generate one output or a group of outputs whose number equals the number of required classes. Therefore, the fully connected layer 330 may include multiple hidden layers (331, 332 to 33n as shown in FIG. 7) and an output layer 340, and the parameters contained in the multiple hidden layers can be obtained based on relevant training data of specific task types; for example, the task types can include image enhancement, image recognition, image classification, image detection, image super-resolution reconstruction, and so on.
• After the multiple hidden layers in the fully connected layer 330, the final layer of the entire convolutional neural network 300 is the output layer 340.
  • the output layer 340 has a loss function similar to the classification cross entropy, which is specifically used to calculate the prediction error.
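• As a simple illustration of such a prediction-error loss (a hedged sketch with made-up values, not this application's training procedure), the categorical cross-entropy for a single prediction can be computed in Python as follows.

    import numpy as np

    def cross_entropy(probabilities: np.ndarray, target_class: int) -> float:
        # Categorical cross-entropy for one sample: -log of the probability
        # predicted for the true class (a small epsilon avoids log(0)).
        return float(-np.log(probabilities[target_class] + 1e-12))

    pred = np.array([0.1, 0.7, 0.2])                # example predicted class probabilities
    loss = cross_entropy(pred, target_class=1)      # low loss when the true class gets high probability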
  • the convolutional neural network shown in FIG. 7 is only used as an example of the structure of the image enhancement model of the embodiment of the present application.
  • the convolutional neural network used in the image enhancement method of the embodiment of the present application It can also exist in the form of other network models.
  • the image enhancement device may include the convolutional neural network 300 shown in FIG. 7, and the image enhancement device may perform image enhancement processing on the image to be processed to obtain a processed output image.
  • FIG. 8 is a hardware structure of a chip provided by an embodiment of the present application.
  • the chip includes a neural network processor 400 (neural-network processing unit, NPU).
  • the chip can be set in the execution device 210 as shown in FIG. 6 to complete the calculation work of the calculation module 211.
  • the chip can also be set in the training device 220 shown in FIG. 6 to complete the training work of the training device 220 and output the target model/rule 201.
  • the algorithms of each layer in the convolutional neural network as shown in FIG. 7 can be implemented in the chip as shown in FIG. 8.
  • the NPU 400 is mounted on a main central processing unit (CPU) as a coprocessor, and the main CPU allocates tasks.
  • the core part of the NPU 400 is the arithmetic circuit 403, and the controller 404 controls the arithmetic circuit 403 to extract data from the memory (weight memory or input memory) and perform calculations.
  • the arithmetic circuit 403 includes multiple processing units (process engines, PE). In some implementations, the arithmetic circuit 403 is a two-dimensional systolic array. The arithmetic circuit 403 may also be a one-dimensional systolic array or other electronic circuit capable of performing mathematical operations such as multiplication and addition. In some implementations, the arithmetic circuit 403 is a general-purpose matrix processor.
  • the arithmetic circuit 403 fetches the data corresponding to matrix B from the weight memory 402 and caches it on each PE in the arithmetic circuit 403.
• The arithmetic circuit 403 fetches the data of matrix A from the input memory 401 and performs a matrix operation with matrix B, and the partial result or final result of the obtained matrix is stored in an accumulator 408.
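• As a rough software analogy of this accumulate-as-you-go computation (an illustrative sketch only, not a model of the NPU hardware), the following Python function sums partial products of A × B into an accumulator tile by tile; the tile size is an assumed parameter.

    import numpy as np

    def tiled_matmul(a: np.ndarray, b: np.ndarray, tile: int = 16) -> np.ndarray:
        # Computes A @ B by accumulating partial results over slices of the
        # shared dimension, mirroring partial results stored in an accumulator.
        m, k = a.shape
        k2, n = b.shape
        assert k == k2, "inner dimensions must match"
        acc = np.zeros((m, n), dtype=np.float64)          # accumulator
        for start in range(0, k, tile):
            end = min(start + tile, k)
            acc += a[:, start:end] @ b[start:end, :]      # partial result added to the accumulator
        return acc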
  • the vector calculation unit 407 can perform further processing on the output of the arithmetic circuit 403, such as vector multiplication, vector addition, exponential operation, logarithmic operation, size comparison, and so on.
• The vector calculation unit 407 can be used for network calculations in the non-convolutional/non-FC layers of the neural network, such as pooling, batch normalization, local response normalization, and so on.
  • the vector calculation unit 407 can store the processed output vector to the unified memory 406.
  • the vector calculation unit 407 may apply a nonlinear function to the output of the arithmetic circuit 403, such as a vector of accumulated values, to generate the activation value.
  • the vector calculation unit 407 generates a normalized value, a combined value, or both.
  • the processed output vector can be used as an activation input to the arithmetic circuit 403, for example for use in a subsequent layer in a neural network.
  • the unified memory 406 is used to store input data and output data.
• The storage unit access controller 405 (direct memory access controller, DMAC) is used to transfer the input data in the external memory to the input memory 401 and/or the unified memory 406, to transfer the weight data in the external memory to the weight memory 402, and to store the data in the unified memory 406 into the external memory.
• The bus interface unit 410 (bus interface unit, BIU) is used to implement interaction between the main CPU, the DMAC, and the instruction fetch memory 409 through the bus.
  • An instruction fetch buffer 409 (instruction fetch buffer) connected to the controller 404 is used to store instructions used by the controller 404.
  • the controller 404 is used to call the instructions cached in the instruction fetch memory 409 to control the working process of the computing accelerator.
  • the unified memory 406, the input memory 401, the weight memory 402, and the instruction fetch memory 409 are all on-chip (On-Chip) memories.
  • the external memory is a memory external to the NPU.
• The external memory can be a double data rate synchronous dynamic random access memory (DDR SDRAM), a high bandwidth memory (HBM), or another readable and writable memory.
  • each layer in the convolutional neural network shown in FIG. 7 may be executed by the arithmetic circuit 403 or the vector calculation unit 407.
  • the execution device 210 in FIG. 6 introduced above can execute each step of the image enhancement method of the embodiment of the present application.
• The CNN model shown in FIG. 7 and the chip shown in FIG. 8 can also be used to execute the steps of the image enhancement method of the embodiments of the present application.
  • FIG. 9 shows a system architecture 500 provided by an embodiment of the present application.
  • the system architecture includes a local device 520, a local device 530, an execution device 510, and a data storage system 550.
  • the local device 520 and the local device 530 are connected to the execution device 510 through a communication network.
  • the execution device 510 may be implemented by one or more servers.
• The execution device 510 can be used in conjunction with other computing devices, for example: data storage devices, routers, load balancers, and other equipment.
  • the execution device 510 may be arranged on one physical site or distributed on multiple physical sites.
  • the execution device 510 may use the data in the data storage system 550 or call the program code in the data storage system 550 to implement the image enhancement method of the embodiment of the present application.
  • execution device 510 may also be referred to as a cloud device, and in this case, the execution device 510 may be deployed in the cloud.
• The execution device 510 may perform the following process: obtain an image to be processed; perform feature enhancement processing on the image to be processed through a neural network to obtain enhanced image features of the image to be processed, where the neural network includes N convolutional layers and N is a positive integer; and perform color enhancement processing and brightness enhancement processing on the image to be processed according to the enhanced image features to obtain an output image.
  • the image enhancement method of the embodiment of the present application may be an offline method executed in the cloud.
  • the image enhancement method of the embodiment of the present application may be executed by the execution device 510 described above.
  • the image enhancement method in the embodiment of the present application may be executed by the local device 520 or the local device 530.
  • image enhancement can be performed on the acquired image to be processed with poor image quality, so as to obtain an output image with improved performance of the image to be processed in terms of image details, image color, and image brightness.
  • the user can operate respective user devices (for example, the local device 520 and the local device 530) to interact with the execution device 510.
  • Each local device can represent any computing device, for example, a personal computer, a computer workstation, a smart phone, a tablet computer, a smart camera, a smart car or other types of cellular phones, a media consumption device, a wearable device, a set-top box, a game console, etc.
  • the local device of each user can interact with the execution device 510 through a communication network of any communication mechanism/communication standard.
  • the communication network can be a wide area network, a local area network, a point-to-point connection, etc., or any combination thereof.
  • the local device 520 and the local device 530 can obtain the relevant parameters of the target neural network from the execution device 510, deploy the target neural network on the local device 520 and the local device 530, and use the target neural network to perform image processing. Enhanced processing, etc.
  • the target neural network can be directly deployed on the execution device 510.
  • the execution device 510 obtains the image to be processed from the local device 520 and the local device 530, and performs image enhancement processing on the image to be processed according to the target neural network.
  • the above-mentioned target neural network may be the image enhancement model in the embodiment of the present application.
• Camera imaging and video capture on smart terminals are limited by the hardware performance of the terminals' optical sensors, so the quality of the photos and videos they capture is still not high enough, with problems such as high noise, low resolution, lack of detail, and color cast.
  • Image (or picture) enhancement is the basis of various image processing applications, and computer vision often involves the issue of how to enhance the acquired images.
• Existing image enhancement methods can be divided into two categories. The first category is the scaling method, which learns a mapping function from the original image or video to the target image or video and performs a scaling (Scale) operation on the pixels or features of the input image or video. The second category is the generative method, which uses generative adversarial networks (GAN) to extract features from the input image or video, generate new elements, and reconstruct the output image or video.
• The embodiments of the present application provide an image enhancement method: feature enhancement processing is performed on the image to be processed to obtain enhanced image features, and these enhanced image features are then used to perform color enhancement and brightness enhancement on the image to be processed. In this way, the details of the image to be processed are enhanced at the same time as its color and brightness, so that the performance of the image in terms of detail, color, and brightness is improved, thereby improving the effect of the image enhancement processing.
  • FIG. 10 shows a schematic flowchart of an image enhancement method provided by an embodiment of the present application.
• The image enhancement method shown in FIG. 10 may be executed by an image enhancement device, which specifically may be the execution device 210 in FIG. 6, the execution device 510 in FIG. 9, or a local device.
  • the method shown in FIG. 10 includes steps 610 to 630, and steps 610 to 630 are respectively described in detail below.
  • Step 610 Obtain an image to be processed.
• The image to be processed may be an original image with poor image quality; for example, the image to be processed may be affected by factors such as weather, distance, and the shooting environment, causing it to be blurred, of low quality, or to have problems such as poor color and low brightness.
• The above-mentioned image to be processed may be an image captured by an electronic device through a camera, or may be an image obtained from inside the electronic device (for example, an image stored in an album of the electronic device).
  • the electronic device may be any one of the local device or the execution device shown in FIG. 9.
  • Step 620 Perform feature enhancement processing on the image to be processed through the neural network to obtain the enhanced image feature of the image to be processed.
  • the neural network may include N convolutional layers, and N is a positive integer.
  • the feature enhancement processing mentioned above may refer to the feature enhancement of the details of the image to be processed.
• The aforementioned neural network may refer to the feature extraction part of the image enhancement model shown in FIG. 11; the image enhancement model may include multiple neural networks, where the feature extraction part may be a first neural network and the feature reconstruction part may include a second neural network. For example, the first neural network can be used to perform feature enhancement processing on the image to be processed to obtain the enhanced image features of the image to be processed, and the second neural network can be used to perform color enhancement processing and brightness enhancement processing on the image to be processed according to the enhanced image features to obtain an output image.
• For example, the unit shown in FIG. 13 may correspond to the first neural network, and the unit shown in FIG. 14 may correspond to the second neural network.
• Feature enhancement processing can refer to purposefully emphasizing the overall or local characteristics of an image: making an originally unclear image clear, emphasizing certain features of interest, enlarging the differences between the features of different objects in the image, and suppressing features that are not of interest, thereby improving image quality, enriching the amount of information, strengthening image interpretation and recognition effects, and meeting the needs of recognition and analysis.
  • Step 630 Perform color enhancement processing and brightness enhancement processing on the image to be processed according to the enhanced image feature to obtain an output image.
  • the color enhancement processing can be used to improve the color distribution of the image to be processed and increase the color saturation of the image to be processed.
  • the brightness enhancement processing may refer to adjusting the brightness of the image to be processed.
  • the output image may refer to the image obtained by performing image enhancement on the acquired image to be processed.
• The image enhancement may also be referred to as image quality enhancement, which may specifically refer to processing the brightness, color, contrast, saturation, and/or dynamic range of the image so that one or more indicators of the image meet preset conditions.
  • the feature enhancement processing of the image to be processed may be performed by using the Laplacian enhancement algorithm to obtain the enhanced image feature of the image to be processed.
  • the enhanced image feature of the image to be processed may refer to the enhanced image feature obtained by enhancing details or textures in the image to be processed.
• In the embodiments of the present application, the Laplacian enhancement algorithm can be used to enhance the detail features in the image to be processed; moreover, the Laplacian enhancement algorithm does not introduce new textures when it is used to enhance the features of the image to be processed, which avoids introducing pseudo-textures into the output image obtained after image enhancement processing and can improve the effect of the image enhancement processing.
• The aforementioned Laplacian enhancement algorithm may refer to the traditional Laplacian enhancement algorithm, in which the enhanced image feature is obtained by fusing the original image feature of the image to be processed with the high-frequency feature of the image to be processed, for example: E = I + s_c · (I − h(I)), where I represents the original image feature of the image to be processed, E represents the enhanced image feature of the image to be processed, h represents the blur kernel, s_c represents a scaling factor with a constant value, and I − h(I) represents the high-frequency feature of the image to be processed.
  • the high-frequency features of the image may refer to information such as details and texture of the image; the low-frequency features of the image may refer to the contour information of the image.
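• The following Python sketch illustrates this traditional form of Laplacian (unsharp-mask style) enhancement on a single-channel image; the uniform blur kernel and the value of s_c are assumptions chosen for the example rather than values given in this application.

    import numpy as np
    from scipy.ndimage import uniform_filter

    def laplacian_enhance(image: np.ndarray, s_c: float = 1.0, blur_size: int = 5) -> np.ndarray:
        # Traditional Laplacian-style enhancement: E = I + s_c * (I - h(I)),
        # where h is a blur kernel (an assumed uniform blur here) and
        # I - h(I) is the high-frequency (detail/texture) component.
        img = image.astype(np.float64)
        blurred = uniform_filter(img, size=blur_size)     # h(I)
        high_freq = img - blurred                         # I - h(I)
        return img + s_c * high_freq                      # enhanced image E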
• Further, an improved Laplacian enhancement algorithm is proposed in the embodiments of the present application. The improved Laplacian enhancement algorithm uses the image features of a previous convolutional layer to enhance subsequent image features, so as to progressively enhance the image features of different convolutional layers.
• For example, the Laplacian enhancement algorithm proposed in the embodiments of the present application can be used to perform feature enhancement processing on the input image feature of the i-th convolutional layer among the N convolutional layers according to the residual feature of the i-th convolutional layer, so as to obtain the enhanced image feature of the i-th convolutional layer, where the residual feature can represent the difference between the input image feature of the i-th convolutional layer and the image feature processed by the convolution operation in the i-th convolutional layer, the enhanced image feature of the i-th convolutional layer is the input image feature of the (i+1)-th convolutional layer, the input image feature is obtained from the image to be processed, and i is a positive integer.
• The enhanced image feature of the image to be processed may be the image feature output by the N-th convolutional layer among the N convolutional layers, and the enhanced image feature of the image to be processed can be obtained by the following equation: L(F_N) = φ(F_N) + s_l · (F_N − φ(F_N)), where L(F_N) represents the enhanced image feature of the N-th convolutional layer, F_N represents the input image feature of the N-th convolutional layer, φ represents convolution with the convolution kernel of the N-th convolutional layer, s_l represents a parameter obtained through learning, and N is a positive integer.
• In other words, the Laplacian enhancement algorithm proposed in the embodiments of the present application replaces the fixed scaling factor of the traditional Laplacian enhancement algorithm with a learnable parameter; at the same time, the residual feature between adjacent layers is used for feature enhancement, and the residual feature can represent any information that needs to be emphasized. Therefore, the Laplacian algorithm of the embodiments of the present application can not only enhance the high-frequency information of the image, but can also progressively enhance the image features of different convolutional layers, thereby improving the effect of image feature enhancement.
• Further, color enhancement and brightness enhancement of the image to be processed can be performed using the acquired enhanced image features. Since the enhanced image features of the image to be processed include more detail features and textures, performing color enhancement and brightness enhancement processing on the image to be processed according to these enhanced image features can enhance the details of the output image while also improving its brightness and color.
  • performing color enhancement processing and brightness enhancement processing on the image to be processed according to the enhanced image feature to obtain the output image may include: obtaining the image of the image to be processed according to the enhanced image feature of the image to be processed Confidence image feature and illumination compensation image feature, wherein the confidence image feature is used for color enhancement of the image to be processed, and the illumination compensation image feature is used for brightness enhancement of the image to be processed; The image, the confidence image feature, and the illumination compensation image feature are used to obtain the output image.
  • the above-mentioned confidence image feature may represent a mapping relationship or a mapping function for performing color enhancement processing on the image to be processed.
  • the feature of the confidence image can correspond to the image feature of the image to be processed.
• For example, an element in the confidence image feature can be used to indicate the scaling degree of the corresponding element in the image feature of the image to be processed; by scaling different regions of the image to be processed, color enhancement of the image to be processed can be realized.
• In the embodiments of the present application, the confidence image feature used for color enhancement processing and the illumination compensation image feature used for brightness enhancement processing are obtained from the enhanced image feature of the image to be processed. Compared with the traditional scaling method, the confidence image feature and the illumination compensation image feature in the embodiments of the present application can not only perform color enhancement and brightness enhancement, but also enhance the details of the image to be processed, thereby improving the effect of the image enhancement processing.
• Specifically, the confidence image feature and the illumination compensation image feature can be obtained from the enhanced image feature of the image to be processed; the color-enhanced image feature of the image to be processed is obtained by multiplying the image feature of the image to be processed with the confidence image feature; and the color-enhanced image feature and the illumination compensation image feature are fused to obtain the output image.
• The above-mentioned confidence image feature and illumination compensation image feature are obtained based on the enhanced image feature of the image to be processed. Therefore, when the confidence image feature is used to enhance the color of the image to be processed, it also enhances the image features of the image to be processed to a certain extent; similarly, when the illumination compensation image feature is used to enhance the brightness of the image to be processed, it also enhances the image features of the image to be processed, so that the details, color, and brightness of the image to be processed are all enhanced.
• In the embodiments of the present application, the confidence image feature used for color enhancement processing and the illumination compensation image feature used for brightness enhancement processing are both obtained from the enhanced image feature of the image to be processed.
  • the confidence image feature and the illumination compensation image feature in the embodiments of the present application can not only perform color enhancement and brightness enhancement, but also achieve detail enhancement of the image to be processed.
• It should be noted that the above-mentioned confidence image feature and illumination compensation image feature can be obtained in parallel from the enhanced image feature of the image to be processed; for example, a convolution operation can be performed on the enhanced image feature of the image to be processed through a first branch of the network model to obtain the confidence image feature, and a convolution operation can be performed on the enhanced image feature of the image to be processed through a second branch of the network model to obtain the illumination compensation image feature.
  • the output image can be obtained from the image to be processed, the feature of the confidence image, and the feature of the illumination compensation image.
• In one example, the image feature of the image to be processed (for example, the original image feature of the image to be processed) can be multiplied by the confidence image feature to realize color enhancement of different regions in the image to be processed and obtain the color-enhanced image feature; the color-enhanced image feature and the illumination compensation image feature are then fused to obtain the output image after image enhancement processing.
  • the fusion of the aforementioned color-enhanced image feature and the illumination compensation image feature may refer to the addition of the color-enhanced image feature and the illumination compensation image feature to obtain an output image.
  • the output image after image enhancement processing can be obtained through the enhanced image feature, the confidence image feature, and the illumination compensation feature of the image to be processed.
• For example, the enhanced image feature of the image to be processed can be multiplied with the confidence image feature to realize color enhancement of different regions in the image to be processed and obtain the color-enhanced image feature; the color-enhanced image feature and the illumination compensation image feature are then fused to obtain the output image after image enhancement processing.
  • Fig. 11 is a schematic diagram of a model structure for image enhancement provided by an embodiment of the present application.
  • the model shown in FIG. 11 can be deployed in an image enhancement device that executes the above-mentioned image enhancement method.
  • the model shown in FIG. 11 may include four parts, namely an input part, a feature extraction part, a feature reconstruction part, and an output part.
  • the feature extraction part may include a Laplacian enhancing unit (LEU); the feature reconstruction part may include a hybrid enhancing module (HEM).
  • the image enhancement model shown in FIG. 11 may include multiple neural networks, the feature extraction part may be a first neural network, and the feature reconstruction part may include a second neural network.
  • the first neural network can be used to perform feature enhancement processing on the image to be processed to obtain the enhanced image feature of the image to be processed; the second neural network can be used to color the image to be processed according to the enhanced image feature Enhance processing and brightness enhancement processing to get the output image.
• For example, the unit shown in FIG. 13 may correspond to the first neural network, and the unit shown in FIG. 14 may correspond to the second neural network.
• The Laplacian enhancement unit can be embedded in the convolutional layers, and the extracted features can be enhanced by the Laplacian enhancement unit; specifically, the Laplacian enhancement unit can be used to perform feature enhancement processing on the image to be processed through the Laplacian enhancement algorithm.
• The extracted features are gradually enhanced through several layers of Laplacian enhancement units: the image features of the previous layer can be used to enhance the image features of the next layer, and the residuals of the image features of the previous layer can be superimposed, so that the image features of different convolutional layers are gradually enhanced and the performance of image enhancement is improved.
• Further, the hybrid enhancement unit can be used to combine the advantages of the scaling method and the generative method of image enhancement.
• The hybrid enhancement unit may use the image features of the image to be processed that have been processed by the Laplacian enhancement unit as its input data; that is, the hybrid enhancement unit takes the enhanced image features output by the Laplacian enhancement unit as input. In other words, the Laplacian enhancement unit can be used to perform feature enhancement processing on the image to be processed through the Laplacian algorithm, and the hybrid enhancement unit can be used to perform color enhancement processing and brightness enhancement processing on the image to be processed.
• The image enhancement model for performing the image enhancement method provided by the embodiments of the present application shown in FIG. 11 may be called a hybrid progressive enhancing U-Net (HPEU), which uses the LEU and the HEM so that the receptive field increases layer by layer; among them, the LEU can perform feature enhancement processing based on more image information.
  • the Laplacian enhancement unit can be used to perform different levels of enhancement processing on image features of different levels.
• In the shallow convolutional layers of the image enhancement model, the LEU can mainly be used to perform feature enhancement processing on local regions based on the edge information of the input image; in the deep convolutional layers of the image enhancement model, the receptive field is larger, so the LEU can be used to enhance global features.
  • the above-mentioned receptive field is a term in the field of deep neural network in the field of computer vision, and is used to indicate the size of the receptive range of neurons in different positions within the neural network to the original image.
• Fig. 12 is a schematic diagram of a Laplacian enhancement unit and a hybrid enhancement unit provided by an embodiment of the present application. As shown in FIG. 12, the image enhancement model may include one or more Laplacian enhancement units and a hybrid enhancement unit.
• Features of the input image are extracted by the convolutional layers, and the extracted features are enhanced by the Laplacian enhancement unit to obtain enhanced image features; the enhanced image features are then used as the input data of the subsequent convolutional layer or the subsequent Laplacian enhancement unit, until the enhanced image features are input into the hybrid enhancement unit, where the feature channels are divided into two parts, the scaling component and the generation component are calculated separately, and the two components are merged to obtain the final enhanced image.
• The above-mentioned scaling component can be used to achieve color enhancement, where the degree of color enhancement in different regions of the input image can be constrained by the confidence map (also referred to as the confidence image feature); the above-mentioned generation component can be used for illumination compensation, thereby achieving contrast and brightness enhancement.
  • FIG. 13 is a schematic diagram of the processing flow of the Laplacian enhancement unit proposed by an embodiment of the application (ie, a schematic diagram of feature enhancement processing).
  • the processing procedure of the Laplace enhancement unit includes the following steps:
• Step 1: Suppose the current network (the first neural network) includes N convolutional layers, and the input image feature of the N-th convolutional layer is F_N; that is, F_N can serve as the input data of the feature enhancement processing.
• Step 2: Extract the features of F_N through the N-th convolutional layer, denoted as φ(F_N).
• Step 3: Calculate the residual between φ(F_N) and F_N. For example, the residual can be recorded as φ(F_N) − F_N or F_N − φ(F_N), and the residual is enhanced by the learnable parameter s_l.
• Step 4: Superimpose the enhanced residual with φ(F_N) to obtain the enhanced image feature output after the N-th convolutional layer has been subjected to feature enhancement processing by the Laplacian enhancement unit.
• That is, the image feature output by the N-th convolutional layer can be obtained by the following equation: L(F_N) = φ(F_N) + s_l · (F_N − φ(F_N)), where L(F_N) represents the enhanced image feature of the N-th convolutional layer, F_N represents the input image feature of the N-th convolutional layer, φ represents convolution with the convolution kernel of the N-th convolutional layer, and s_l represents a parameter obtained through learning, which can represent the degree of enhancement that the LEU applies to the image features each time.
  • the first neural network may include N convolutional layers, and the image feature output by the Nth convolutional layer may be the enhanced image feature of the image to be processed by performing feature enhancement processing on the image to be processed.
• In addition, s_l can be understood as a scaling parameter obtained through learning, and the scaling parameter can perform scaling operations on different regions of the image, thereby achieving color enhancement of different regions of the image.
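• The following PyTorch-style sketch shows one possible way to implement the Laplacian enhancement unit described in Steps 1 to 4 above; the channel count, the kernel size, and the per-channel form of the learnable parameter s_l are assumptions made for illustration, not the exact configuration of this application.

    import torch
    import torch.nn as nn

    class LaplacianEnhancementUnit(nn.Module):
        # L(F_N) = phi(F_N) + s_l * (F_N - phi(F_N)), with s_l learned.
        def __init__(self, channels: int = 64):
            super().__init__()
            self.phi = nn.Conv2d(channels, channels, kernel_size=3, padding=1)  # convolution of this layer
            self.s_l = nn.Parameter(torch.zeros(1, channels, 1, 1))             # learnable enhancement strength

        def forward(self, f_n: torch.Tensor) -> torch.Tensor:
            conv_out = self.phi(f_n)               # phi(F_N): features extracted by the convolution
            residual = f_n - conv_out              # residual between the input and the conv output
            return conv_out + self.s_l * residual  # superimpose the enhanced residual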
  • FIG. 14 is a schematic diagram of a processing flow of a hybrid enhancement unit proposed in an embodiment of the application (ie, a schematic diagram of color enhancement processing and brightness enhancement processing).
  • the processing procedure of the hybrid enhancement unit includes the following steps:
  • Step 1 Divide the multiple feature channels of the enhanced image features output by the Laplacian enhancement unit (that is, the enhanced image features output by the first neural network) into two branches, denoted as the first branch and the second branch .
  • Step 2 Perform a convolution operation on the first branch of the hybrid enhancement unit to obtain a confidence map, and then apply the confidence map to the input image to perform pixel-level scaling to obtain a scaling component.
• The confidence map, that is, the confidence image feature in the enhancement processing method shown in FIG. 10, is used to enhance the color of the input image.
  • the confidence map can represent a mapping relationship or a mapping function for performing color enhancement processing on an input image.
  • the zoom component is the color enhancement image feature in the enhancement processing method shown in FIG. 10.
  • the confidence map can correspond to the input image.
• For example, an element in the confidence map can be used to indicate the scaling degree of the corresponding element in the image feature of the input image; by scaling different regions of the input image, color enhancement of the input image can be realized.
  • the confidence map may include several channels, which correspond to the channels of the input image.
  • applying the confidence map to the input image may refer to multiplying the confidence map and the input image, for example, multiplying the input image and the confidence map pixel by pixel.
  • Step 3 Perform a convolution operation through the second branch in the hybrid enhancement unit to obtain a generation component for illumination compensation.
  • the generating component is used to enhance the brightness of the input image; the generating component is the illumination compensation image feature in the enhancement processing method shown in FIG. 10.
  • Step 4 Perform fusion processing on the image features of the two branches to obtain an output image after image enhancement processing.
• For example, the N feature channels can be divided into two branches: a convolution operation is performed on M of the channel features to obtain the confidence map, which can be used to calculate the scaling component, and the M channels can correspond to the R, G, and B channels of the image; a convolution operation is performed on the remaining N−M channels to obtain the generation component, which is used for illumination compensation to achieve contrast and brightness enhancement.
• In other words, a convolution operation can be performed on the remaining channel features to obtain the illumination compensation image feature used for brightness enhancement processing.
• That is, the confidence image feature and the illumination compensation image feature of the input image can be obtained by the hybrid enhancement module according to the enhanced image feature and the input image.
  • the hybrid enhancement module includes a first branch and a second branch.
• The first branch is used to obtain the confidence image feature based on the enhanced image feature.
  • the second branch is used to obtain the illumination compensation image feature based on the enhanced image feature
• The confidence image feature can be used to represent the mapping relationship (for example, a mapping function) for color enhancement of the image to be processed.
  • the illumination compensation image feature can be used to enhance the brightness of the input image
  • the output image is obtained according to the input image, the confidence image feature, and the illumination compensation image feature.
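• The following PyTorch-style sketch shows one possible way to implement the hybrid enhancement module described above: the feature channels are split into two branches, one branch produces the confidence map (scaling component) and the other produces the generation component for illumination compensation. The channel split, the sigmoid used to bound the confidence map, and the layer sizes are assumptions made for illustration, not the exact configuration of this application.

    import torch
    import torch.nn as nn

    class HybridEnhancementModule(nn.Module):
        def __init__(self, feat_channels: int = 64, image_channels: int = 3):
            super().__init__()
            self.split = feat_channels // 2
            # first branch: confidence map with one channel per image channel (R, G, B)
            self.confidence_conv = nn.Conv2d(self.split, image_channels, 3, padding=1)
            # second branch: generation component for illumination compensation
            self.generation_conv = nn.Conv2d(feat_channels - self.split, image_channels, 3, padding=1)

        def forward(self, enhanced_feat: torch.Tensor, image: torch.Tensor) -> torch.Tensor:
            f1, f2 = enhanced_feat[:, :self.split], enhanced_feat[:, self.split:]
            confidence = torch.sigmoid(self.confidence_conv(f1))   # confidence map (assumed bounded to [0, 1])
            scaling = image * confidence                           # pixel-wise scaling: color enhancement
            generation = self.generation_conv(f2)                  # illumination compensation
            return scaling + generation                            # fuse the two components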
  • FIG. 15 is a schematic diagram of a processing flow of a hybrid enhancement unit provided by an embodiment of the present application.
• As shown in FIG. 15, the image to be processed and the confidence map can have the same scale, and different regions of the input image can be scaled to different degrees through the confidence map to achieve color enhancement; the generation component is then superimposed to achieve illumination compensation, thereby enhancing contrast and brightness. Finally, an output image is obtained after feature enhancement, color enhancement, and brightness enhancement of the input image.
  • the above confidence map may represent a mapping relationship or a mapping function for performing color enhancement processing on the image to be processed.
  • the confidence map can correspond to the image to be processed.
• For example, an element in the confidence map can be used to indicate the scaling degree of the corresponding element in the image feature of the image to be processed; by scaling different regions of the image to be processed, color enhancement of the image to be processed can be realized.
• In the embodiments of the present application, the confidence map used for color enhancement processing and the illumination compensation image feature used for brightness enhancement processing are obtained from the enhanced image feature of the image to be processed, whereas the traditional scaling method directly enhances the color and brightness of each pixel in the image to be processed through a mapping function. Therefore, the confidence image feature and the illumination compensation image feature in the embodiments of the present application can not only perform color enhancement and brightness enhancement, but also enhance the details of the image to be processed.
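• To show how the two sketches above could fit together, the following rough Python sketch stacks a convolutional head, several Laplacian enhancement units, and a hybrid enhancement module (it reuses the LaplacianEnhancementUnit and HybridEnhancementModule classes sketched earlier). It is a simplified illustration and not the exact HPEU architecture; for example, it omits the U-Net style downsampling, upsampling, and skip connections.

    import torch
    import torch.nn as nn

    class TinyEnhancer(nn.Module):
        # Feature extraction (conv + LEU stages) followed by the HEM.
        def __init__(self, channels: int = 64, n_stages: int = 4):
            super().__init__()
            self.head = nn.Conv2d(3, channels, 3, padding=1)
            self.leus = nn.ModuleList(LaplacianEnhancementUnit(channels) for _ in range(n_stages))
            self.hem = HybridEnhancementModule(channels, 3)

        def forward(self, image: torch.Tensor) -> torch.Tensor:
            feat = self.head(image)
            for leu in self.leus:              # progressive feature enhancement
                feat = leu(feat)
            return self.hem(feat, image)       # color and brightness enhancement

    # usage: out = TinyEnhancer()(torch.rand(1, 3, 64, 64))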
  • Table 1 is the quantitative performance evaluation result on the MIT-Adobe FiveK data set provided by the embodiment of the application.
• Baseline represents the basic model architecture, that is, a Unet model that does not include the above-mentioned hybrid enhancement unit (HEM) or the above-mentioned Laplacian enhancement unit (LEU); +LEU represents the image enhancement Unet model that includes the above-mentioned Laplacian enhancement unit; +HEM represents the image enhancement Unet model that includes the above-mentioned hybrid enhancement unit; peak signal-to-noise ratio (PSNR) is usually used as a measure of signal reconstruction quality in image processing and other fields, and it is defined based on the mean squared error between two images.
• The multi-scale structural similarity index (MS-SSIM) can be used to measure the similarity of two images and to evaluate the quality of the output image processed by the algorithm.
  • the structural similarity index defines structural information as an attribute that reflects the structure of objects in the scene independent of brightness and contrast, and models distortion as a combination of three different factors: brightness, contrast, and structure. For example, use the mean as an estimate of brightness, standard deviation as an estimate of contrast, and covariance as a measure of structural similarity.
  • Time is used to represent the time for the model to perform image enhancement on the input image. Parameters can be used to describe the parameters included in the neural network and to evaluate the size of the model.
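• As a reference for how these two objective indicators can be computed (a minimal sketch; the simplified SSIM below is evaluated globally with the mean, standard deviation, and covariance, whereas MS-SSIM additionally uses sliding windows and a multi-scale pyramid), consider the following Python functions.

    import numpy as np

    def psnr(reference: np.ndarray, result: np.ndarray, max_val: float = 255.0) -> float:
        # PSNR = 10 * log10(MAX^2 / MSE), defined from the mean squared error.
        mse = np.mean((reference.astype(np.float64) - result.astype(np.float64)) ** 2)
        return float("inf") if mse == 0 else 10.0 * np.log10(max_val ** 2 / mse)

    def global_ssim(x: np.ndarray, y: np.ndarray, max_val: float = 255.0) -> float:
        # Simplified global SSIM: mean as the brightness estimate, variance as the
        # contrast estimate, covariance as the structure measure.
        x, y = x.astype(np.float64), y.astype(np.float64)
        c1, c2 = (0.01 * max_val) ** 2, (0.03 * max_val) ** 2
        mu_x, mu_y = x.mean(), y.mean()
        var_x, var_y = x.var(), y.var()
        cov_xy = ((x - mu_x) * (y - mu_y)).mean()
        return ((2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)) / (
            (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2))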
  • Table 2 is the quantitative performance evaluation results of different image enhancement models provided by the embodiments of the present application on the data set.
  • data set 1 can represent the MIT-Adobe FiveK data set
  • data set 2 can represent the DPED-iPhone data set
• The tested models include: the weakly supervised photo enhancer for digital cameras (WESPE), the context aggregation network (CAN), and the range scaling global U-Net (RSGUnet).
  • the PSNR and SSIM of the image enhancement model proposed in this application on the DPED-iPhone data set are not as good as the WESPE model, which is mainly caused by the non-pixel-level alignment of the DPED-iPhone data set.
• The DPED-iPhone data set consists of pairs of images obtained by shooting the same scene from the same angle at the same time with a SLR camera and a mobile phone. Because the sensors of the SLR camera and the mobile phone are different, the images captured by the two devices are not aligned pixel by pixel, which leads to relative deviations in HPEU during downsampling.
• Overall, the image enhancement model (HPEU) proposed in the embodiments of this application achieves the highest PSNR and SSIM, and its visual effect is closer to the ground truth (Ground Truth).
• Table 3 shows the quantitative performance evaluation results on the MIT-Adobe FiveK data set after adding a guided filter (Guided Filter) to the models provided by the embodiments of the present application.
  • the tested models include: a trainable guided filter model (GDF), a high dynamic range network model (high dynamic range, HDR), and an image enhancement model (HPEU) provided in an embodiment of the present application with guided filtering added.
• Table 4 shows the evaluation results of the image enhancement processing running time provided by the embodiments of the present application. It can be seen from the running time evaluation results shown in Table 4 that, under the same processing conditions, the image enhancement model (HPEU) proposed in the embodiments of the present application has the shortest running time, and the computational efficiency of HPEU+Guided Filter is higher: after the guided filter is added to the HPEU model, HPEU+Guided Filter is faster than the GDF model and the HDR model while the objective quantitative indicators are similar.
• Table 5 shows the quantitative evaluation results on the DPED-iPhone data set after introducing the visual perception loss provided by the embodiments of the present application, including the objective evaluation indicators PSNR and SSIM and the perceptual evaluation indicator (Perceptual Index).
• Table 6 shows the results of training each network model on the MIT-Adobe FiveK data set provided by the embodiments of this application and then testing on data set A (DPED-iPhone), data set B (DPED-Sony), and data set C (DPED-Blackberry), which are used to verify the generalization ability of the models.
  • FIG. 16 is a schematic diagram of a visual quality evaluation result provided by an embodiment of the present application. It should be noted that in FIG. 16, in order to be distinguished from the gray-scale image portion, the color image portion is indicated by hatching.
  • Fig. 16(a) represents the input image (for example, the image to be processed);
• Fig. 16(b) represents the predicted output image obtained by using the basic model, where the basic model may be a Unet model that does not include the aforementioned hybrid enhancement unit (HEM) or the aforementioned Laplacian enhancement unit (LEU);
• Figure 16(c) shows the predicted output image obtained using the Unet model that includes the above-mentioned Laplacian enhancement unit;
• Figure 16(d) shows the predicted output image obtained using the Unet model that includes the above-mentioned hybrid enhancement unit;
  • Figure 16(e) represents the predicted output image obtained by using the model of the embodiment of the application (for example, the model shown in Figure 11 or Figure 12);
• Figure 16(f) represents the ground truth image (Ground Truth) corresponding to the input image, where the ground truth image can represent the sample enhanced image corresponding to the input image;
• Figure 16(g) represents error map 1, that is, the residual between the predicted output image output by the basic model and the ground truth image;
• Figure 16(j) represents error map 4, that is, the residual between the predicted output image output by the model of the embodiments of the present application and the ground truth image.
  • FIG. 17 is a schematic flowchart of an image enhancement method provided by an embodiment of the present application.
  • the method 700 shown in FIG. 17 includes steps 710 to 740, and steps 710 to 740 are described in detail below.
  • Step 710 Detect the first operation used by the user to turn on the camera.
• Step 720: In response to the first operation, display a shooting interface on the display screen.
  • the shooting interface includes a viewfinder frame, and the viewfinder frame includes a first image.
  • the user's shooting behavior may include a first operation of the user to turn on the camera; in response to the first operation, displaying a shooting interface on the display screen.
  • FIG. 18 shows a graphical user interface (GUI) of the mobile phone, and the GUI is the desktop 810 of the mobile phone.
• When the electronic device detects that the user clicks the icon 820 of the camera application (application, APP) on the desktop 810, it can start the camera application and display another GUI as shown in (b) of FIG. 18, which can be called the shooting interface 830.
  • the shooting interface 830 may include a viewing frame 840. In the preview state, the preview image can be displayed in the viewfinder frame 840 in real time.
  • the color image portion is represented by oblique line filling.
  • a first image may be displayed in the view frame 840, and the first image is a color image.
  • the shooting interface may also include a control 850 for indicating the shooting mode, and other shooting controls.
  • the user's shooting behavior may include a first operation of the user to turn on the camera; in response to the first operation, displaying a shooting interface on the display screen.
  • the shooting interface may include a viewfinder frame. It is understandable that the size of the viewfinder frame may be different in the photo mode and the video mode.
  • the viewfinder frame may be the viewfinder frame in the photographing mode. In the video mode, the viewfinder frame can be the entire display screen.
• In the preview state, that is, after the user turns on the camera but before the user presses the photo/video button, the preview image can be displayed in the viewfinder frame in real time.
  • the preview image may be a color image
  • the preview image may be an image displayed when the camera is set to an automatic photographing mode.
• Step 730: Detect a second operation used by the user to instruct the camera.
  • the second operation instructing the first processing mode by the user may be detected.
  • the first processing mode may be a professional shooting mode (for example, an image enhancement shooting mode).
  • the shooting interface includes a shooting option 860.
  • the electronic device displays a shooting mode interface.
  • the mobile phone enters the professional shooting mode, for example, the mobile phone performs image enhancement shooting mode.
• In one example, a second operation used by the user to instruct shooting may be detected, where the second operation is an operation used to instruct shooting when photographing a distant object, photographing a tiny object, or shooting in a poor environment.
• In another example, the second operation for instructing shooting may be detected directly; that is, the operations shown in (a) and (b) of FIG. 19 may be skipped, and the second operation 870 for instructing shooting shown in (c) of FIG. 19 is performed.
• The second operation used by the user to instruct the shooting behavior may include pressing a shooting button in the camera application of the electronic device, or may include the user instructing the electronic device to perform the shooting behavior through voice, or may include other ways in which the user instructs the electronic device to perform the shooting behavior.
• Step 740: In response to the second operation, display a second image in the viewfinder frame, or save the second image in the electronic device, where the second image is obtained by performing color enhancement processing and brightness enhancement processing on the first image captured by the camera according to the enhanced image feature of the first image; the enhanced image feature of the first image is obtained by performing feature enhancement processing on the first image through a neural network, the neural network includes N convolutional layers, and N is a positive integer.
  • the process of performing the image enhancement method on the first image may refer to the image enhancement method shown in FIG. 10, and the image enhancement model for executing the image enhancement method may adopt the model shown in FIG. 11.
• For example, the second image is displayed in the viewfinder frame in (d) of FIG. 19, and the first image is displayed in the viewfinder frame in (c) of FIG. 19; the content of the second image and the content of the first image are the same or substantially the same, but the quality of the second image is better than that of the first image.
• For example, the detail presentation of the second image is better than that of the first image; or the brightness of the second image is better than that of the first image; or the color of the second image is better than that of the first image.
  • the second image shown in (d) of FIG. 19 may not be displayed in the viewfinder, but the second image may be saved in the photo album of the electronic device.
  • the enhanced image feature of the first image is obtained by performing the feature enhancement processing on the first image according to a Laplacian enhancement algorithm.
• Optionally, the Laplacian enhancement algorithm is used to perform the feature enhancement processing on the input image feature of the i-th convolutional layer among the N convolutional layers according to the residual feature of the i-th convolutional layer, so as to obtain the enhanced image feature of the i-th convolutional layer, where the residual feature represents the difference between the input image feature of the i-th convolutional layer and the image feature processed by the convolution operation in the i-th convolutional layer, the enhanced image feature of the i-th convolutional layer is the input image feature of the (i+1)-th convolutional layer, the input image feature is obtained based on the first image, and i is a positive integer.
• Optionally, the enhanced image feature of the first image is the image feature output by the N-th convolutional layer among the N convolutional layers, and the enhanced image feature of the first image is obtained by the following equation: L(F_N) = φ(F_N) + s_l · (F_N − φ(F_N)), where L(F_N) represents the enhanced image feature of the N-th convolutional layer, F_N represents the input image feature of the N-th convolutional layer, φ represents convolution with the convolution kernel of the N-th convolutional layer, and s_l represents the scaling parameter obtained through learning.
• Optionally, the output image is obtained according to the first image, the confidence image feature, and the illumination compensation image feature, where the confidence image feature and the illumination compensation image feature are obtained according to the enhanced image feature of the first image; the confidence image feature is used for color enhancement of the first image, and the illumination compensation image feature is used for brightness enhancement of the first image.
• Optionally, the output image is obtained by fusing the color-enhanced image feature with the illumination compensation image feature, where the color-enhanced image feature is obtained by multiplying the image feature of the first image with the confidence image feature, the confidence image feature is obtained by performing a convolution operation on the enhanced image feature of the first image, and the illumination compensation image feature is obtained by performing a convolution operation on the enhanced image feature of the first image.
  • FIG. 20 is a schematic block diagram of an image enhancement device provided by an embodiment of the present application. It should be understood that the image enhancement device 900 may execute the image enhancement method shown in FIG. 10.
  • the image enhancement device 900 includes: an acquisition unit 910 and a processing unit 920.
  • the acquisition unit 910 is configured to acquire an image to be processed;
  • the processing unit 920 is configured to perform feature enhancement processing on the image to be processed through a neural network to obtain enhanced image features of the image to be processed, and
  • the neural network includes N convolutional layers, where N is a positive integer; performing color enhancement processing and brightness enhancement processing on the image to be processed according to the enhanced image feature to obtain an output image.
  • the processing unit 920 is specifically configured to:
  • the feature enhancement processing is performed on the image to be processed by the Laplacian enhancement algorithm to obtain the enhanced image feature of the image to be processed.
• Optionally, the Laplacian enhancement algorithm is used to perform the feature enhancement processing on the input image feature of the i-th convolutional layer among the N convolutional layers according to the residual feature of the i-th convolutional layer, so as to obtain the enhanced image feature of the i-th convolutional layer, where the residual feature represents the difference between the input image feature of the i-th convolutional layer and the image feature processed by the convolution operation in the i-th convolutional layer, the enhanced image feature of the i-th convolutional layer is the input image feature of the (i+1)-th convolutional layer, the input image feature is obtained from the image to be processed, and i is a positive integer less than or equal to N.
• Optionally, the enhanced image feature of the image to be processed is the image feature output by the N-th convolutional layer among the N convolutional layers, and the processing unit 920 is specifically configured to obtain the enhanced image feature of the image to be processed by the following equation: L(F_N) = φ(F_N) + s_l · (F_N − φ(F_N)), where L(F_N) represents the enhanced image feature of the N-th convolutional layer, F_N represents the input image feature of the N-th convolutional layer, φ represents convolution with the convolution kernel of the N-th convolutional layer, and s_l represents the parameter obtained through learning.
• Optionally, the processing unit 920 is specifically configured to: obtain, according to the enhanced image feature of the image to be processed, the confidence image feature and the illumination compensation image feature of the image to be processed, where the confidence image feature is used for color enhancement of the image to be processed and the illumination compensation image feature is used for brightness enhancement of the image to be processed; and obtain the output image according to the image to be processed, the confidence image feature, and the illumination compensation image feature.
• Optionally, the processing unit 920 is further configured to: multiply the image feature of the image to be processed with the confidence image feature to obtain the color-enhanced image feature, and fuse the color-enhanced image feature with the illumination compensation image feature to obtain the output image.
  • the enhancement device 900 shown in FIG. 20 may also be used to perform the image enhancement methods shown in FIG. 17 to FIG. 19.
  • image enhancement device 900 is embodied in the form of a functional unit.
  • unit herein can be implemented in the form of software and/or hardware, which is not specifically limited.
  • a "unit” can be a software program, a hardware circuit, or a combination of the two that realizes the above-mentioned functions.
• The hardware circuit may include an application specific integrated circuit (ASIC), an electronic circuit, a processor (such as a shared processor, a dedicated processor, or a group processor) and memory for executing one or more software or firmware programs, a merged logic circuit, and/or other suitable components that support the described functions.
  • the units of the examples described in the embodiments of the present application can be implemented by electronic hardware or by a combination of computer software and electronic hardware. Whether these functions are performed by hardware or software depends on the specific application and design constraints of the technical solution. Those skilled in the art may use different methods to implement the described functions for each specific application, but such implementation should not be considered beyond the scope of this application.
  • FIG. 21 is a schematic diagram of the hardware structure of an image enhancement device provided by an embodiment of the present application.
  • the image enhancement apparatus 1000 shown in FIG. 21 includes a memory 1001, a processor 1002, a communication interface 1003, and a bus 1004.
  • the memory 1001, the processor 1002, and the communication interface 1003 implement communication connections between each other through the bus 1004.
  • the memory 1001 may be a read only memory (ROM), a static storage device, a dynamic storage device, or a random access memory (RAM).
  • the memory 1001 may store a program.
  • when the program stored in the memory 1001 is executed by the processor 1002, the processor 1002 is configured to execute each step of the image enhancement method of the embodiments of the present application, for example, each step shown in FIG. 10 to FIG. 15, or each step shown in FIG. 17 to FIG. 19.
  • the image enhancement device shown in the embodiments of the present application may be a server, for example, a server in the cloud, or a chip configured in a server in the cloud; alternatively, the image enhancement device shown in the embodiments of the present application may be a smart terminal or a chip configured in the smart terminal.
  • the image enhancement method disclosed in the foregoing embodiments of the present application may be applied to the processor 1002 or implemented by the processor 1002.
  • the processor 1002 may be an integrated circuit chip with signal processing capabilities.
  • the steps of the above-mentioned image enhancement method can be completed by an integrated logic circuit of hardware in the processor 1002 or instructions in the form of software.
  • the processor 1002 may be a chip including the NPU shown in FIG. 8.
  • the aforementioned processor 1002 may be a central processing unit (CPU), a graphics processing unit (GPU), a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, and can implement or execute the methods, steps, and logical block diagrams disclosed in the embodiments of the present application.
  • the general-purpose processor may be a microprocessor, or the processor may also be any conventional processor or the like.
  • the steps of the method disclosed in the embodiments of the present application can be directly embodied as being executed and completed by a hardware decoding processor, or executed and completed by a combination of hardware and software modules in the decoding processor.
  • the software module may be located in a storage medium mature in the field, such as a random access memory (RAM), a flash memory, a read-only memory (ROM), a programmable read-only memory, an electrically erasable programmable memory, or a register.
  • the storage medium is located in the memory 1001; the processor 1002 reads the instructions in the memory 1001 and, in combination with its hardware, completes the functions to be performed by the units included in the image enhancement device shown in FIG. 20 in the embodiments of this application, or performs the image enhancement method shown in FIG. 10 to FIG. 15 of the method embodiments of this application, or performs each step shown in FIG. 17 to FIG. 19 of the method embodiments of this application.
  • the communication interface 1003 uses a transceiver device such as but not limited to a transceiver to implement communication between the device 1000 and other devices or a communication network.
  • the bus 1004 may include a path for transferring information between various components of the image enhancement device 1000 (for example, the memory 1001, the processor 1002, and the communication interface 1003).
  • it should be noted that although the above image enhancement device 1000 only shows a memory, a processor, and a communication interface, in a specific implementation process, those skilled in the art should understand that the image enhancement device 1000 may also include other devices necessary for normal operation. At the same time, according to specific needs, those skilled in the art should understand that the above image enhancement device 1000 may further include hardware devices that implement other additional functions. In addition, those skilled in the art should understand that the above image enhancement device 1000 may also include only the devices necessary to implement the embodiments of the present application, and does not necessarily include all the devices shown in FIG. 21.
  • An embodiment of the present application also provides a chip, which includes a transceiver unit and a processing unit.
  • the transceiver unit may be an input/output circuit or a communication interface;
  • the processing unit is a processor, microprocessor, or integrated circuit integrated on the chip.
  • the chip can execute the image enhancement method in the above method embodiment.
  • the embodiment of the present application also provides a computer-readable storage medium on which an instruction is stored, and the image enhancement method in the foregoing method embodiment is executed when the instruction is executed.
  • the embodiments of the present application also provide a computer program product containing instructions that, when executed, execute the image enhancement method in the foregoing method embodiments.
  • the memory may include a read-only memory and a random access memory, and provide instructions and data to the processor.
  • Part of the processor may also include non-volatile random access memory.
  • the processor can also store device type information.
  • it should be understood that, in the various embodiments of the present application, the size of the sequence numbers of the above processes does not mean the order of execution; the execution order of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation processes of the embodiments of the present application.
  • in the several embodiments provided in this application, it should be understood that the disclosed system, device, and method may be implemented in other ways.
  • the device embodiments described above are merely illustrative; for example, the division of the units is only a logical function division, and there may be other division manners in actual implementation, for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
  • the displayed or discussed mutual coupling or direct coupling or communication connection may be indirect coupling or communication connection through some interfaces, devices or units, and may be in electrical, mechanical or other forms.
  • the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, they may be located in one place, or they may be distributed on multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
  • the functional modules in the various embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.
  • if the function is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in a computer-readable storage medium.
  • based on such an understanding, the technical solution of the present application essentially, or the part that contributes to the existing technology, or a part of the technical solution, can be embodied in the form of a software product; the computer software product is stored in a storage medium and includes several instructions used to enable a computer device (which may be a personal computer, a server, or a network device, etc.) to execute all or part of the steps of the methods described in the various embodiments of the present application.
  • the aforementioned storage media include various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.


Abstract

本申请公开了人工智能领域中计算机视觉领域的一种图像增强方法以及装置,该图像增强方法包括:获取待处理图像;通过神经网络对该待处理图像进行特征增强处理,得到该待处理图像的增强图像特征,该神经网络包括N个卷积层,N为正整数;根据该增强图像特征对该待处理图像进行颜色增强处理与亮度增强处理,得到输出图像。本申请的技术方案使得待处理图像在细节、颜色以及亮度方面的性能均得到提升,从而提高了图像增强处理的效果。

Description

图像增强方法以及装置
本申请要求于2019年09月30日提交中国专利局、申请号为201910943355.7、申请名称为“图像增强方法以及装置”的中国专利申请的优先权,其全部内容通过引用结合在本申请中。
技术领域
本申请涉及人工智能领域，更具体地，涉及计算机视觉领域中的一种图像增强方法以及装置。
背景技术
计算机视觉是各个应用领域,如制造业、检验、文档分析、医疗诊断,和军事等领域中各种智能/自主***中不可分割的一部分,它是一门关于如何运用照相机/摄像机和计算机来获取我们所需的,被拍摄对象的数据与信息的学问。形象地说,就是给计算机安装上眼睛(照相机/摄像机)和大脑(算法)用来代替人眼对目标进行识别、跟踪和测量等,从而使计算机能够感知环境。因为感知可以看作是从感官信号中提取信息,所以计算机视觉也可以看作是研究如何使人工***从图像或多维数据中“感知”的科学。总的来说,计算机视觉就是用各种成像***代替视觉器官获取输入信息,再由计算机来代替大脑对这些输入信息完成处理和解释。计算机视觉的最终研究目标就是使计算机能像人那样通过视觉观察和理解世界,具有自主适应环境的能力。
图像增强是图像处理领域重要的一个分支,通过图像增强技术可以在不重新采集数据的情况下改善图像质量,以满足更多实际应用需求。例如,图像增强技术可以通过有目的地强调图像(或视频)的整体或局部特性,将原来不清晰的图像变得清晰或强调某些感兴趣的特征,扩大图像中不同物体特征之间的差别,抑制不感兴趣的特征,使之改善图像质量、丰富信息量,加强图像判读和识别效果,满足某些特殊分析的需要。
在计算机视觉领域内,常常需要利用采集设备获取图像(或视频),并对图像进行识别或分析。通常情况,受采集设备的拍摄环境或其它未知因素等的影响,获取到的图像会出现模糊、对比度低等现象,成像质量较低会影响图像的显示效果、以及图像的分析和识别等,可以先对图像或视频进行图像增强处理,再对增强后的图像或视频进行识别或分析。但是,目前在很多情况下,图像增强处理的效果并不理想。因此,如何提升图像增强处理的效果,成为一个亟需解决的问题。
发明内容
本申请提供一种图像增强方法以及装置,能够使得待处理图像在细节方面、颜色方面以及亮度方面的性能得以增强,从而能够提升图像增强处理的效果。
第一方面,提供了一种图像增强方法,包括:获取待处理图像;通过神经网络对所述 待处理图像进行特征增强处理,得到所述待处理图像的增强图像特征,所述神经网络包括N个卷积层,N为正整数;根据所述增强图像特征对所述待处理图像进行颜色增强处理与亮度增强处理,得到输出图像。
示例性地,上述待处理图像可以是画质较差的原始图像;例如,可以是指受到天气、距离、拍摄环境等因素的影响,获取的待处理图像存在图像模糊、或者图像画质较低、或者图像颜色与亮度较低等问题。
应理解,上述颜色增强处理可以用于改善待处理图像的色彩分布,提高待处理图像的色彩饱和度;亮度增强处理可以是指调整待处理图像的亮度;特征增强处理可以是指增强图像中的细节,使得图像中包括更多的细节信息;比如,特征增强处理可以是指对待处理图像进行细节特征增强。
本申请实施例提供的图像增强方法,通过对待处理图像进行特征增强处理,得到待处理图像的图像增强特征,并利用图像增强特征进一步对待处理图像进行颜色增强处理与亮度增强处理,从而在进行颜色增强与亮度增强时还能增强待处理图像的细节特征,使得待处理图像在细节方面、颜色方面以及亮度方面的性能均得以增强,从而提升图像增强处理的效果。
需要说明的是,图像增强也可以称为图像质量增强,比如可以是指图像的亮度、颜色、对比度、饱和度和/或动态范围等进行处理,以使得该图像的各项指标满足预设的条件。
结合第一方面,在第一方面的某些实现方式中,所述通过神经网络对所述待处理图像进行所述特征增强处理,得到所述待处理图像的增强图像特征,包括:通过拉普拉斯增强算法对所述待处理图像进行所述特征增强处理,得到所述待处理图像的增强图像特征。
在本申请的实施例中,可以通过拉普拉斯增强算法实现待处理图像的特征增强,其中,拉普拉斯增强算法可以实现对待处理图像进行特征增强时不会引入新的纹理,从而在一定程度上避免图像增强处理后的输出图像中引入伪纹理的问题,能够提升图像增强处理的效果。
在一种可能的实现方式中,上述拉普拉斯增强算法可以是通过对待处理图像中的高频特征进行增强,从而得到待处理图像的增强图像特征。
其中,上述图像的高频特征可以是指待处理图像的指图像的细节、纹理等信息。
结合第一方面,在第一方面的某些实现方式中,所述拉普拉斯增强算法用于根据所述N个卷积层中的第i个卷积层的残差特征对第i个卷积层的输入图像特征进行所述特征增强处理,得到所述第i个卷积层的增强图像特征,其中,所述残差特征表示所述第i个卷积层的输入图像特征与所述第i个卷积层中通过卷积操作处理后的图像特征之间的差值,所述第i个卷积层的增强图像特征为所述第i+1个卷积层的输入图像特征,所述输入图像特征是根据所述待处理图像得到的,i为正整数。
在本申请的实施例中,拉普拉斯增强算法可以是改进的拉普拉斯增强算法,通过本申请实施例的改进的拉普拉斯增强算法可以通过将前一卷积层中的图像特征用于后续图像特征的增强,实现对不同卷积层的图像特征的渐进式逐步增强,能够提升图像增强处理的效果。
结合第一方面,在第一方面的某些实现方式中,所述待处理图像的增强图像特征为所述N个卷积层中第N个卷积层输出的图像特征,通过以下等式得到所述待处理图像的增 强图像特征:
L(F_N)=Φ(F_N)+s_l·(Φ(F_N)-F_N)；
其中，L(F_N)表示所述第N个卷积层的增强图像特征，F_N表示所述第N个卷积层的输入图像特征，Φ表示所述第N个卷积层的卷积核，s_l表示通过学习得到的缩放参数，N为正整数。
本申请的实施例中，通过可学习的参数s_l替换传统拉普拉斯增强算法中的固定缩放系数s_c；同时，采用相邻层的残差特征进行特征增强，残差特征可以用于表示任何需要强调的信息。因此，本申请实施例的拉普拉斯算法不仅仅可以对图像的高频信息进行增强，还能够实现不同卷积层的图像特征的渐进式逐步增强，从而提升图像增强处理的效果。
结合第一方面,在第一方面的某些实现方式中,所述根据所述增强图像特征对所述待处理图像进行颜色增强处理与亮度增强处理,得到输出图像,包括:根据所述待处理图像的增强图像特征,得到所述待处理图像的置信图像特征与光照补偿图像特征,其中,所述置信图像特征用于所述待处理图像进行颜色增强,所述光照补偿图像特征用于对所述待处理图像进行亮度增强;根据所述待处理图像、所述置信图像特征以及所述光照补偿图像特征得到所述输出图像。
需要说明的是,上述置信图像特征可以表示对所述待处理图像进行颜色增强处理的映射关系,或者映射函数。
示例性地,置信图像特征与待处理图像的图像特征可以相对应,比如,置信图像特征中的一个元素可以用于表示待处理图像的图像特征中对应元素的缩放程度;通过对待处理图像中不同区域的缩放可以实现待处理图像的颜色增强。
应理解,上述待处理图像的增强图像特征中可以包括更多的细节特征以及纹理;根据待处理图像的增强图像特征对待处理图像进行颜色增强处理以及亮度增强处理可以使得输出图像实现细节增强,同时也能提升输出图像的亮度与颜色。
在本申请的实施例中,用于颜色增强处理的置信图像特征以及用于亮度增强处理的光照补偿图像特征可以是通过待处理图像的增强图像特征得到的,相比于传统的缩放式方法即直接通过映射函数将待处理图像中各像素点的颜色与亮度进行增强相比,本申请实施例中的置信图像特征与光照补偿图像特征不仅能够进行颜色增强与亮度增强,同时还能实现对待处理图像的细节增强,从而提升图像增强处理的效果。
结合第一方面,在第一方面的某些实现方式中,还包括:通过对所述待处理图像的增强图像特征进行卷积操作,得到所述置信图像特征与所述光照补偿图像特征;通过所述待处理图像的图像特征与所述置信图像特征相乘得到所述待处理图像的颜色增强图像特征;将所述颜色增强图像特征与所述光照补偿图像特征进行融合,得到所述输出图像。
在一种可能的实现方式,可以并行地通过待处理图像的增强图像特征得到上述置信图像特征以及上述光照补偿图像特征;比如,可以通过网络模型中的第一分支对待处理图像的增强图像特征进行卷积操作,得到上述置信图像特征;通过网络模型中的第二分支对所述待处理图像的增强图像特征进行卷积操作,得到上述光照补偿图像特征。
在一种可能的实现方式中,上述颜色增强图像特征与光照补偿图像特征进行融合可以是指颜色增强图像特征与光照补偿图像特征相加。
第二方面,提供一种图像增强方法,包括:检测到用户用于打开相机的第一操作;响 应于所述第一操作,在所述显示屏上显示拍摄界面,所述拍摄界面上包括取景框,所述取景框内包括第一图像;检测到所述用户指示相机的第二操作;响应于所述第二操作,在所述取景框内显示第二图像,或者在所述电子设备中保存第二图像,所述第二图像是根据所述第一图像的增强图像特征对所述第一图像进行颜色增强处理与亮度增强处理得到的,所述第一图像的增强图像特征是通过神经网络对所述第一图像进行特征增强处理得到的,所述神经网络包括N个卷积层,N为正整数。
其中,上述对第一图像进行特征增强处理的具体流程可以根据上述第一方面以及第一方面的任意一种实现方式得到。
在一种可能的实现方式中,本申请实施例提供的图像增强方法可以应用于智能终端的拍照领域,通过本申请实施例的图像增强方法可以对智能终端获取的画质较差的原始图像进行图像增强处理得到画质提升的输出图像,例如,可以是在智能终端进行实时拍照时,对获取的原始图像进行图像增强处理,将图像增强处理后的输出图像显示在智能终端的屏幕上,或者,还可以通过对获取的原始图像进行图像增强处理,将图像增强处理后的输出图像保存至智能终端的相册中。
第三方面,提供一种图像增强方法,包括:获取待处理的道路画面;通过神经网络对所述待处理的道路画面进行特征增强处理,得到所述待处理的道路画面的增强图像特征,所述神经网络包括N个卷积层,N为正整数;根据所述增强图像特征对所述待处理的道路画面进行颜色增强处理与亮度增强处理,得到处理后的输出道路画面;根据所述处理后的输出道路画面,识别所述输出道路画面中的信息。
其中,上述对道路画面进行特征增强处理的具体流程可以根据上述第一方面以及第一方面的任意一种实现方式得到。
在一种可能的实现方式中,本申请实施例提供的图像增强方法可以应用于自动驾驶领域。例如,可以应用于自动驾驶车辆的导航***中,通过本申请中的图像增强方法可以使得自动驾驶车辆在道路行驶的导航过程中,通过获取的画质较低的原始道路画面进行图像增强处理,得到增强处理后的道路画面,从而实现自动驾驶车辆的安全性。
第四方面,提供一种图像增强方法,包括:获取街景画面;通过神经网络对所述街景画面进行特征增强处理,得到所述街景画面的增强图像特征,所述神经网络包括N个卷积层,N为正整数;根据所述增强图像特征对所述街景画面进行颜色增强处理与亮度增强处理,得到处理后的输出街景画面;根据所述处理后的输出街景画面,识别所述输出街景画面中的信息。
其中,上述对街景画面进行特征增强处理的具体流程可以根据上述第一方面以及第一方面的任意一种实现方式得到。
在一种可能的实现方式中,本申请实施例提供的图像增强方法可以应用于安防领域。例如,本申请实施例的图像增强方法可以应用于平安城市的监控图像增强,比如,公共场合的监控设备采集到的图像(或者,视频)往往受到天气、距离等因素的影响,存在图像模糊,图像画质较低等问题。通过本申请的图像增强方法可以对采集到的原始图像进行图像增强,从而可以为公安人员恢复出车牌号码、清晰人脸等重要信息,为案件侦破提供重要的线索信息。
应理解,在上述第一方面中对相关内容的扩展、限定、解释和说明也适用于第二方面、 第三方面以及第四方面中相同的内容。
第五方面,提供一种图像增强装置,包括用于执行上述第一方面至第四方面以及第一方面至第四方面中的任意一种实现方式中的图像增强方法的模块。
第六方面,提供一种图像增强装置,包括:存储器,用于存储程序;处理器,用于执行该存储器存储的程序,当该存储器存储的程序被执行时,该处理器用于执行:获取待处理图像;对所述待处理图像进行特征增强处理,得到所述待处理图像的增强图像特征;根据所述增强图像特征对所述待处理图像进行颜色增强处理与亮度增强处理,得到输出图像。
在一种可能的实现方式中,上述图像增强装置中包括的处理器还用于上述第一方面至第四方面以及第一方面至第四方面中的任意一种实现方式中的图像增强方法。
第七方面,提供了一种计算机可读介质,该计算机可读介质存储用于设备执行的程序代码,该程序代码包括用于执行上述第一方面至第四方面以及第一方面至第四方面中的任意一种实现方式中的图像增强方法。
第八方面,提供了一种包含指令的计算机程序产品,当该计算机程序产品在计算机上运行时,使得计算机执行上述第一方面至第四方面以及第一方面至第四方面中的任意一种实现方式中的图像增强方法。
第九方面,提供了一种芯片,所述芯片包括处理器与数据接口,所述处理器通过所述数据接口读取存储器上存储的指令,执行上述第一方面至第四方面以及第一方面至第四方面中的任意一种实现方式中的图像增强方法。
可选地,作为一种实现方式,所述芯片还可以包括存储器,所述存储器中存储有指令,所述处理器用于执行所述存储器上存储的指令,当所述指令被执行时,所述处理器用于执行上述第一方面至第四方面以及第一方面至第四方面中的任意一种实现方式中的图像增强方法。
附图说明
图1是本申请实施例提供的一种人工智能主体框架示意图;
图2是本申请实施例提供的一种应用场景的示意图;
图3是本申请实施例提供的另一种应用场景的示意图;
图4是本申请实施例提供的再一种应用场景的示意图;
图5是本申请实施例提供的再一种应用场景的示意图；
图6是本申请实施例提供的***架构的结构示意图;
图7是本申请实施例提供的一种卷积神经网络结构示意图;
图8是本申请实施例提供的一种芯片硬件结构示意图;
图9是本申请实施例提供了一种***架构的示意图;
图10是本申请实施例提供的图像增强方法的示意性流程图;
图11是本申请实施例提供的图像增强模型的结构示意图;
图12是本申请实施例提供的拉普拉斯增强单元以及混合增强单元的示意图;
图13是本申请实施例提供的特征增强处理的示意图;
图14是本申请实施例提供的颜色增强处理与亮度增强处理的示意图;
图15是本申请实施例提供的颜色增强处理与亮度增强处理的示意图;
图16是本申请实施例提供的视觉质量测评结果的示意图;
图17是本申请实施例提供的图像增强方法的示意性流程图;
图18是本申请实施例提供的一组显示界面示意图;
图19是本申请实施例提供的另一组显示界面示意图;
图20是本申请实施例提供的图像增强装置的示意性框图;
图21是本申请实施例提供的图像增强装置的硬件结构示意图。
具体实施方式
下面将结合本申请实施例中的附图,对本申请实施例中的技术方案进行描述,显然,所描述的实施例仅仅是本申请一部分实施例,而不是全部的实施例。基于本申请中的实施例,本领域普通技术人员在没有做出创造性劳动前提下所获得的所有其他实施例,都属于本申请保护的范围。
应理解,本申请实施例中的图像可以为静态图像(或称为静态画面)或动态图像(或称为动态画面),例如,本申请中的图像可以为视频或动态图片,或者,本申请中的图像也可以为静态图片或照片。为了便于描述,本申请在下述实施例中将静态图像或动态图像统一称为图像。
图1示出一种人工智能主体框架示意图,该主体框架描述了人工智能***总体工作流程,适用于通用的人工智能领域需求。
下面从“智能信息链”(水平轴)和“信息技术(information technology,IT)价值链”(垂直轴)两个维度对上述人工智能主题框架100进行详细的阐述。
“智能信息链”反映从数据的获取到处理的一列过程。举例来说,可以是智能信息感知、智能信息表示与形成、智能推理、智能决策、智能执行与输出的一般过程。在这个过程中,数据经历了“数据—信息—知识—智慧”的凝练过程。
“IT价值链”从人工智能的底层基础设施、信息（提供和处理技术实现）到系统的产业生态过程，反映人工智能为信息技术产业带来的价值。
(1)基础设施110
基础设施为人工智能***提供计算能力支持,实现与外部世界的沟通,并通过基础平台实现支撑。
基础设施可以通过传感器与外部沟通,基础设施的计算能力可以由智能芯片提供。
这里的智能芯片可以是中央处理器(central processing unit,CPU)、神经网络处理器(neural-network processing unit,NPU)、图形处理器(graphics processing unit,GPU)、专门应用的集成电路(application specific integrated circuit,ASIC)以及现场可编程门阵列(field programmable gate array,FPGA)等硬件加速芯片。
基础设施的基础平台可以包括分布式计算框架及网络等相关的平台保障和支持,可以包括云存储和计算、互联互通网络等。
例如,对于基础设施来说,可以通过传感器和外部沟通获取数据,然后将这些数据提供给基础平台提供的分布式计算***中的智能芯片进行计算。
(2)数据120
基础设施的上一层的数据用于表示人工智能领域的数据来源。该数据涉及到图形、图像、语音、文本,还涉及到传统设备的物联网数据,包括已有***的业务数据以及力、位移、液位、温度、湿度等感知数据。
(3)数据处理130
上述数据处理通常包括数据训练,机器学习,深度学习,搜索,推理,决策等处理方式。
其中,机器学习和深度学习可以对数据进行符号化和形式化的智能信息建模、抽取、预处理、训练等。
推理是指在计算机或智能***中,模拟人类的智能推理方式,依据推理控制策略,利用形式化的信息进行机器思维和求解问题的过程,典型的功能是搜索与匹配。
决策是指智能信息经过推理后进行决策的过程,通常提供分类、排序、预测等功能。
(4)通用能力140
对数据经过上面提到的数据处理后,进一步基于数据处理的结果可以形成一些通用的能力,比如可以是算法或者一个通用***,例如,翻译,文本的分析,计算机视觉的处理,语音识别,图像的识别等等。
(5)智能产品及行业应用150
智能产品及行业应用指人工智能***在各领域的产品和应用,是对人工智能整体解决方案的封装,将智能信息决策产品化、实现落地应用,其应用领域主要包括:智能制造、智能交通、智能家居、智能医疗、智能安防、自动驾驶,平安城市,智能终端等。
图2是本申请实施例提供的图像增强方法的应用场景的示意图。
如图2所示,本申请实施例的技术方案可以应用于智能终端,本申请实施例中的图像增强方法可以对输入图像进行图像增强处理,得到该输入图像经图像增强后的输出图像。该智能终端可以为移动的或固定的,例如,该智能终端可以是具有图像增强功能的移动电话、平板个人电脑(tablet personal computer,TPC)、媒体播放器、智能电视、笔记本电脑(laptop computer,LC)、个人数字助理(personal digital assistant,PDA)、个人计算机(personal computer,PC)、照相机、摄像机、智能手表、增强现实(augmented reality,AR)/虚拟现实(virtual reality,VR),可穿戴式设备(wearable device,WD)或者自动驾驶的车辆等,本申请实施例对此不作限定。
需要说明的是,在本申请的实施例中图像增强也可以称为图像质量增强,具体可以是指对图像的亮度、颜色、对比度、饱和度和/或动态范围等进行处理,以使得该图像的各项指标满足预设的条件。在本申请实施例中,图像增强和图像质量增强具有相同的涵义。
下面对本申请实施例的具体应用场景进行举例说明。
应用场景一:智能终端拍照领域
在一个实施例中,如图3所示,本申请实施例的图像增强方法可以应用于智能终端设备(例如,手机)的拍摄。通过本申请实施例的图像增强方法可以对获取的质量较差的原始图像进行图像增强处理得到画质提升的输出图像。
需要说明的是,在图3中为了区别于灰度图像部分,彩色图像部分通过斜线填充来表示。
示例性地,可以通过本申请实施例的图像增强方法在智能终端进行实时拍照时,对获 取的原始图像进行图像增强处理,将图像增强处理后的输出图像显示在智能终端的屏幕上。
示例性地,可以通过本申请实施例的图像增强方法对获取的原始图像进行图像增强处理,将图像增强处理后的输出图像保存至智能终端的相册中。
示例性地,本申请提出了一种图像增强方法,应用于具有显示屏和摄像头的电子设备,包括:检测到用户用于打开相机的第一操作;响应于所述第一操作,在所述显示屏上显示拍摄界面,所述拍摄界面上包括取景框,所述取景框内包括第一图像;检测到所述用户指示相机的第二操作;响应于所述第二操作,在所述取景框内显示第二图像,或者在所述电子设备中保存第二图像,所述第二图像是根据所述第一图像的增强图像特征对所述第一图像进行颜色增强处理与亮度增强处理得到的,所述第一图像的增强图像特征是通过神经网络对所述第一图像进行特征增强处理得到的,所述神经网络包括N个卷积层,N为正整数。
需要说明的是,本申请实施例提供的图像增强方法同样适用于后面图6至图16中相关实施例中对图像增强方法相关内容的扩展、限定、解释和说明,此处不再赘述。
应用场景二:自动驾驶领域
在一个实施例中,如图4所示,本申请实施例的图像增强方法可以应用于自动驾驶领域。例如,可以应用于自动驾驶车辆的导航***中,通过本申请中的图像增强方法可以使得自动驾驶车辆在道路行驶的导航过程中,通过获取的画质较低的原始道路画面进行图像增强处理,得到增强处理后的道路画面,从而实现自动驾驶车辆的安全性。
示例性地,本申请提供了一种图像增强方法,该方法包括:获取待处理的道路画面;通过神经网络对所述待处理的道路画面进行特征增强处理,得到所述待处理的道路画面的增强图像特征,所述神经网络包括N个卷积层,N为正整数;根据所述增强图像特征对所述待处理的道路画面进行颜色增强处理与亮度增强处理,得到处理后的输出道路画面;根据所述处理后的输出道路画面,识别所述输出道路画面中的信息。
需要说明的是,本申请实施例提供的图像增强方法同样适用于后面图6至图16中相关实施例中对图像增强方法相关内容的扩展、限定、解释和说明,此处不再赘述。
应用场景三:安防领域
在一个实施例中,如图5所示,本申请实施例的图像增强方法可以应用于安防领域。例如,本申请实施例的图像增强方法可以应用于平安城市的监控图像增强,比如,公共场合的监控设备采集到的图像(或者,视频)往往受到天气、距离等因素的影响,存在图像模糊,图像画质较低等问题。通过本申请的图像增强方法可以对采集到的图片进行图像增强,从而可以为公安人员恢复出车牌号码、清晰人脸等重要信息,为案件侦破提供重要的线索信息。
示例性地,本申请提供了一种图像增强方法,该方法包括:获取街景画面;通过神经网络对所述街景画面进行特征增强处理,得到所述街景画面的增强图像特征,所述神经网络包括N个卷积层,N为正整数;根据所述增强图像特征对所述街景画面进行颜色增强处理与亮度增强处理,得到处理后的输出街景画面;根据所述处理后的输出街景画面,识别所述输出街景画面中的信息。
需要说明的是,本申请实施例提供的图像增强方法同样适用于后面图6至图16中相关实施例中对图像增强方法相关内容的扩展、限定、解释和说明,此处不再赘述。
应用场景四:片源增强
在一个实施例中,本申请实施例的图像增强方法还可以应用于片源增强场景。例如,在利用智能终端(例如智能电视、智慧屏等)播放电影时,为了显示更好的图像质量(画质),可以对电影的原始片源采用本申请实施例的图像增强方法进行图像增强处理,以提升片源的画质,获得更好的视觉观感。
示例性地,在使用智能电视或智慧屏播放老电影(老电影的片源的时间比较早、片源的画质较差)时,可以使用本申请实施例的图像增强方法对老电影的片源进行图像增强处理,能显示现代电影的视觉感官。比如,可以通过本申请实施例的图像增强方法将老电影的片源增强为高动态范围图像(high-dynamic range,HDR)10,或者杜比视界(Dolby vision)标准的高质量视频。
示例性地,本申请提供了一种图像增强方法,该方法包括:获取原始图像(例如,电影的原始片源);通过神经网络对所述原始图像进行特征增强处理,得到所述原始图像的增强图像特征,所述神经网络包括N个卷积层,N为正整数;根据所述增强图像特征对所述原始图像进行颜色增强处理与亮度增强处理,得到处理后的输出图像(例如,提升画质的片源)。
需要说明的是,本申请实施例提供的图像增强方法同样适用于后面图6至图16中相关实施例中对图像增强方法相关内容的扩展、限定、解释和说明,此处不再赘述。
应理解,上述为对应用场景的举例说明,并不对本申请的应用场景作任何限定。
由于本申请实施例涉及大量神经网络的应用,为了便于理解,下面先对本申请实施例可能涉及的神经网络的相关术语和概念进行介绍。
(1)神经网络
神经网络可以是由神经单元组成的，神经单元可以是指以x_s和截距1为输入的运算单元，该运算单元的输出可以为：
h_{W,b}(x)=f(W^T·x)=f(∑_{s=1}^{n} W_s·x_s+b)；
其中，s=1、2、……n，n为大于1的自然数，W_s为x_s的权重，b为神经单元的偏置。f为神经单元的激活函数（activation functions），用于将非线性特性引入神经网络中，来将神经单元中的输入信号转换为输出信号。该激活函数的输出信号可以作为下一层卷积层的输入，激活函数可以是sigmoid函数。神经网络是将多个上述单一的神经单元联结在一起形成的网络，即一个神经单元的输出可以是另一个神经单元的输入。每个神经单元的输入可以与前一层的局部接受域相连，来提取局部接受域的特征，局部接受域可以是由若干个神经单元组成的区域。
(2)深度神经网络
深度神经网络(deep neural network,DNN),也称多层神经网络,可以理解为具有多层隐含层的神经网络。按照不同层的位置对DNN进行划分,DNN内部的神经网络可以分为三类:输入层,隐含层,输出层。一般来说第一层是输入层,最后一层是输出层,中间的层数都是隐含层。层与层之间是全连接的,也就是说,第i层的任意一个神经元一定与第i+1层的任意一个神经元相连。
虽然DNN看起来很复杂，但是就每一层的工作来说，其实并不复杂，简单来说就是如下线性关系表达式：
y=α(W·x+b)；
其中，x是输入向量，y是输出向量，b是偏移向量，W是权重矩阵（也称系数），α()是激活函数。每一层仅仅是对输入向量x经过如此简单的操作得到输出向量y。由于DNN层数多，系数W和偏移向量b的数量也比较多。这些参数在DNN中的定义如下所述：以系数W为例：假设在一个三层的DNN中，第二层的第4个神经元到第三层的第2个神经元的线性系数定义为W_24^3，上标3代表系数W所在的层数，而下标对应的是输出的第三层索引2和输入的第二层索引4。
综上，第L-1层的第k个神经元到第L层的第j个神经元的系数定义为W_jk^L。
需要注意的是,输入层是没有W参数的。在深度神经网络中,更多的隐含层让网络更能够刻画现实世界中的复杂情形。理论上而言,参数越多的模型复杂度越高,“容量”也就越大,也就意味着它能完成更复杂的学习任务。训练深度神经网络的也就是学习权重矩阵的过程,其最终目的是得到训练好的深度神经网络的所有层的权重矩阵(由很多层的向量W形成的权重矩阵)。
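示例性地，下面用一段Python代码演示上述线性关系y=α(W·x+b)以及系数W下标的含义，其中层的神经元个数与激活函数的选取均为示例性假设，仅用于帮助理解，并非本申请实施例的具体实现：

```python
import numpy as np

def alpha(z):
    # 激活函数：此处以ReLU为例（示例性假设）
    return np.maximum(z, 0.0)

# 假设第L-1层有4个神经元，第L层有3个神经元
x = np.array([0.5, -1.0, 2.0, 0.3])   # 输入向量（第L-1层的输出）
W = np.random.randn(3, 4)             # 权重矩阵：W[j, k]即第L-1层第k个神经元到第L层第j个神经元的系数
b = np.zeros(3)                        # 偏移向量
y = alpha(W @ x + b)                   # 输出向量：y = α(W·x + b)
print(y.shape)                         # (3,)
```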
(3)卷积神经网络
卷积神经网络(convolutional neuron network,CNN)是一种带有卷积结构的深度神经网络。卷积神经网络包含了一个由卷积层和子采样层构成的特征抽取器,该特征抽取器可以看作是滤波器。卷积层是指卷积神经网络中对输入信号进行卷积处理的神经元层。在卷积神经网络的卷积层中,一个神经元可以只与部分邻层神经元连接。一个卷积层中,通常包含若干个特征平面,每个特征平面可以由一些矩形排列的神经单元组成。同一特征平面的神经单元共享权重,这里共享的权重就是卷积核。共享权重可以理解为提取图像信息的方式与位置无关。卷积核可以以随机大小的矩阵的形式初始化,在卷积神经网络的训练过程中卷积核可以通过学习得到合理的权重。另外,共享权重带来的直接好处是减少卷积神经网络各层之间的连接,同时又降低了过拟合的风险。
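示例性地，下面用一个简短的PyTorch示例说明卷积核在输入图像的不同位置上共享同一组权重、且参数量与图像尺寸无关，其中通道数与卷积核尺寸为示例性假设：

```python
import torch
import torch.nn as nn

conv = nn.Conv2d(in_channels=3, out_channels=16, kernel_size=3, padding=1)
img = torch.randn(1, 3, 224, 224)                  # 一幅3通道输入图像
feat = conv(img)                                   # 同一组卷积核在整幅图像上滑动（共享权重）
print(feat.shape)                                  # torch.Size([1, 16, 224, 224])
print(sum(p.numel() for p in conv.parameters()))   # 3*3*3*16 + 16 = 448，与图像尺寸无关
```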
(4)损失函数
在训练深度神经网络的过程中,因为希望深度神经网络的输出尽可能的接近真正想要预测的值,所以可以通过比较当前网络的预测值和真正想要的目标值,再根据两者之间的差异情况来更新每一层神经网络的权重向量(当然,在第一次更新之前通常会有初始化的过程,即为深度神经网络中的各层预先配置参数),比如,如果网络的预测值高了,就调整权重向量让它预测低一些,不断地调整,直到深度神经网络能够预测出真正想要的目标值或与真正想要的目标值非常接近的值。因此,就需要预先定义“如何比较预测值和目标值之间的差异”,这便是损失函数(loss function)或目标函数(objective function),它们是用于衡量预测值和目标值的差异的重要方程。其中,以损失函数举例,损失函数的输出值(loss)越高表示差异越大,那么深度神经网络的训练就变成了尽可能缩小这个loss的过程。
(5)反向传播算法
神经网络可以采用误差反向传播(back propagation,BP)算法在训练过程中修正初始的神经网络模型中参数的大小,使得神经网络模型的重建误差损失越来越小。具体地,前向传递输入信号直至输出会产生误差损失,通过反向传播误差损失信息来更新初始的神经网络模型中参数,从而使误差损失收敛。反向传播算法是以误差损失为主导的反向传播运动,旨在得到最优的神经网络模型的参数,例如权重矩阵。
图6示出了本申请实施例提供的一种***架构200。
在图6中,数据采集设备260用于采集训练数据。针对本申请实施例的图像增强方法来说,可以通过训练数据对图像增强模型(又称为图像增强网络)进行进一步训练,即数据采集设备260采集的训练数据可以是训练图像。
示例性地,在本申请实施例中训练图像增强模型的训练数据可以包括原始图像、样本增强图像。
例如,原始图像可以是指图像画质较低的图像,样本增强图像可以是指图像画质较高的图像,比如,可以是指相对于样本图像而言在亮度、颜色、细节等一个或多个方面均得到提升后的图像。
应理解,图像增强也可以称为图像质量增强,具体可以是指对图像的亮度、颜色、对比度、饱和度和/或动态范围等进行处理,以使得该图像的各项指标满足预设的条件。在本申请实施例中,图像增强和图像质量增强具有相同的涵义。
在采集到训练数据之后,数据采集设备260将这些训练数据存入数据库230,训练设备220基于数据库230中维护的训练数据训练得到目标模型/规则201(即本申请实施例中的图像增强模型)。训练设备220将训练数据输入至图像增强模型,直到训练图像增强模型输出的预测增强图像与样本增强图像之间的差值满足预设条件(例如,预测增强图像与样本增强图像差值小于一定阈值,或者预测增强图像与样本增强图像的差值保持不变或不再减少),从而完成目标模型/规则201的训练。
示例性地,本申请实施例中用于执行图像增强方法的图像增强模型可以实现端到端的训练,比如,对于图12所示的图像增强模型可以通过输入图像与输入图像对应的样本增强图像(例如,真值图像)实现端到端的训练。
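示例性地，下面给出一个端到端训练图像增强模型的简化训练循环（PyTorch）草图，其中损失函数、优化器以及停止阈值均为示例性假设，并非对训练设备220的限定：

```python
import torch
import torch.nn as nn

def train(model, dataloader, threshold=1e-3, max_epochs=100, lr=1e-4):
    """用（原始图像，样本增强图像）数据对进行端到端训练，直到预测增强图像与样本增强图像的差值满足预设条件。"""
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.MSELoss()                        # 以均方误差衡量预测增强图像与样本增强图像之间的差值（示例）
    for epoch in range(max_epochs):
        total = 0.0
        for raw, target in dataloader:            # raw：原始图像；target：样本增强图像（真值图像）
            pred = model(raw)                     # 预测增强图像
            loss = loss_fn(pred, target)
            optimizer.zero_grad()
            loss.backward()                       # 反向传播，更新各层权重
            optimizer.step()
            total += loss.item()
        if total / len(dataloader) < threshold:   # 差值小于一定阈值即停止训练
            break
    return model
```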
在本申请提供的实施例中,该目标模型/规则201是通过训练图像增强模型得到的。需要说明的是,在实际的应用中,所述数据库230中维护的训练数据不一定都来自于数据采集设备260的采集,也有可能是从其他设备接收得到的。
另外需要说明的是,训练设备220也不一定完全基于数据库230维护的训练数据进行目标模型/规则201的训练,也有可能从云端或其他地方获取训练数据进行模型训练,上述描述不应该作为对本申请实施例的限定。还需要说明的是,数据库230中维护的训练数据中的至少部分数据也可以用于执行设备210对待处理处理进行处理的过程。
根据训练设备220训练得到的目标模型/规则201可以应用于不同的***或设备中,如应用于图6所示的执行设备210,所述执行设备210可以是终端,如手机终端,平板电脑,笔记本电脑,AR/VR,车载终端等,还可以是服务器或者云端等。
在图6中,执行设备210配置输入/输出(input/output,I/O)接口212,用于与外部设备进行数据交互,用户可以通过客户设备240向I/O接口212输入数据,所述输入数据在本申请实施例中可以包括:客户设备输入的待处理图像。
预处理模块213和预处理模块214用于根据I/O接口212接收到的输入数据(如待处理图像)进行预处理,在本申请实施例中,也可以没有预处理模块213和预处理模块214(也可以只有其中的一个预处理模块),而直接采用计算模块211对输入数据进行处理。
在执行设备210对输入数据进行预处理,或者在执行设备210的计算模块211执行计算等相关的处理过程中,执行设备210可以调用数据存储***250中的数据、代码等以用于相应的处理,也可以将相应处理得到的数据、指令等存入数据存储***250中。
最后,I/O接口212将处理结果,如上述得到待处理图像增强图像,即将得到的输出图像返回给客户设备240,从而提供给用户。
值得说明的是,训练设备220可以针对不同的目标或称不同的任务,基于不同的训练数据生成相应的目标模型/规则201,该相应的目标模型/规则201即可以用于实现上述目标或完成上述任务,从而为用户提供所需的结果。
在图6中所示情况下,在一种情况下,用户可以手动给定输入数据,该手动给定可以通过I/O接口212提供的界面进行操作。
另一种情况下,客户设备240可以自动地向I/O接口212发送输入数据,如果要求客户设备240自动发送输入数据需要获得用户的授权,则用户可以在客户设备240中设置相应权限。用户可以在客户设备240查看执行设备210输出的结果,具体的呈现形式可以是显示、声音、动作等具体方式。客户设备240也可以作为数据采集端,采集如图所示输入I/O接口212的输入数据及输出I/O接口212的输出结果作为新的样本数据,并存入数据库230。当然,也可以不经过客户设备240进行采集,而是由I/O接口212直接将如图所示输入I/O接口212的输入数据及输出I/O接口212的输出结果,作为新的样本数据存入数据库230。
值得注意的是,图6仅是本申请实施例提供的一种***架构的示意图,图中所示设备、器件、模块等之间的位置关系不构成任何限制,例如,在图6中,数据存储***250相对执行设备210是外部存储器,在其它情况下,也可以将数据存储***250置于执行设备210中。
如图6所示,根据训练设备220训练得到目标模型/规则201,该目标模型/规则201在本申请实施例中可以是图像增强模型,具体的,本申请实施例提供的图像增强模型可以是深度神经网络,卷积神经网络,或者,可以是深度卷积神经网络等。
下面结合图7重点对卷积神经网络的结构进行详细的介绍。如上文的基础概念介绍所述,卷积神经网络是一种带有卷积结构的深度神经网络,是一种深度学习(deep learning)架构,深度学习架构是指通过机器学习的算法,在不同的抽象层级上进行多个层次的学习。作为一种深度学习架构,卷积神经网络是一种前馈(feed-forward)人工神经网络,该前馈人工神经网络中的各个神经元可以对输入其中的图像作出响应。
本申请实施例中图像增强模型的结构可以如图7所示。在图7中,卷积神经网络300可以包括输入层310,卷积层/池化层320(其中,池化层为可选的),全连接层330以及输出层340。其中,输入层310可以获取待处理图像,并将获取到的待处理图像交由卷积层/池化层320以及全连接层330进行处理,可以得到图像的处理结果。下面对图7中的CNN 300中内部的层结构进行详细的介绍。
卷积层/池化层320:
如图7所示卷积层/池化层320可以包括如示例321-326层,举例来说:在一种实现中,321层为卷积层,322层为池化层,323层为卷积层,324层为池化层,325为卷积层,326为池化层;在另一种实现方式中,321、322为卷积层,323为池化层,324、325为卷积层,326为池化层,即卷积层的输出可以作为随后的池化层的输入,也可以作为另一个卷积层的输入以继续进行卷积操作。
下面将以卷积层321为例,介绍一层卷积层的内部工作原理。
卷积层321可以包括很多个卷积算子,卷积算子也称为核,其在图像处理中的作用相当于一个从输入图像矩阵中提取特定信息的过滤器,卷积算子本质上可以是一个权重矩阵,这个权重矩阵通常被预先定义,在对图像进行卷积操作的过程中,权重矩阵通常在输入图像上沿着水平方向一个像素接着一个像素(或两个像素接着两个像素等,这取决于步长stride的取值)的进行处理,从而完成从图像中提取特定特征的工作。该权重矩阵的大小应该与图像的大小相关,需要注意的是,权重矩阵的纵深维度(depth dimension)和输入图像的纵深维度是相同的,在进行卷积运算的过程中,权重矩阵会延伸到输入图像的整个深度。因此,和一个单一的权重矩阵进行卷积会产生一个单一纵深维度的卷积化输出,但是大多数情况下不使用单一权重矩阵,而是应用多个尺寸(行×列)相同的权重矩阵,即多个同型矩阵。每个权重矩阵的输出被堆叠起来形成卷积图像的纵深维度,这里的维度可以理解为由上面所述的“多个”来决定。
不同的权重矩阵可以用来提取图像中不同的特征,例如,一个权重矩阵用来提取图像边缘信息,另一个权重矩阵用来提取图像的特定颜色,又一个权重矩阵用来对图像中不需要的噪点进行模糊化等。该多个权重矩阵尺寸(行×列)相同,经过该多个尺寸相同的权重矩阵提取后的卷积特征图的尺寸也相同,再将提取到的多个尺寸相同的卷积特征图合并形成卷积运算的输出。
这些权重矩阵中的权重值在实际应用中需要经过大量的训练得到,通过训练得到的权重值形成的各个权重矩阵可以用来从输入图像中提取信息,从而使得卷积神经网络300进行正确的预测。
当卷积神经网络300有多个卷积层的时候,初始的卷积层(例如321)往往提取较多的一般特征,一般特征也可以称之为低级别的特征;随着卷积神经网络300深度的加深,越往后的卷积层(例如326)提取到的特征越来越复杂,比如,高级别的语义之类的特征,语义越高的特征越适用于待解决的问题。
池化层:
由于常常需要减少训练参数的数量,因此卷积层之后常常需要周期性的引入池化层,在如图7中320所示例的321-326各层,可以是一层卷积层后面跟一层池化层,也可以是多层卷积层后面接一层或多层池化层。在图像处理过程中,池化层的目的就是减少图像的空间大小。池化层可以包括平均池化算子和/或最大池化算子,以用于对输入图像进行采样得到较小尺寸的图像。平均池化算子可以在特定范围内对图像中的像素值进行计算产生平均值作为平均池化的结果。最大池化算子可以在特定范围内取该范围内值最大的像素作为最大池化的结果。
另外,就像卷积层中用权重矩阵的大小应该与图像尺寸相关一样,池化层中的运算符也应该与图像的大小相关。通过池化层处理后输出的图像尺寸可以小于输入池化层的图像的尺寸,池化层输出的图像中每个像素点表示输入池化层的图像的对应子区域的平均值或最大值。
全连接层330:
在经过卷积层/池化层320的处理后,卷积神经网络300还不足以输出所需要的输出信息。因为如前所述,卷积层/池化层320只会提取特征,并减少输入图像带来的参数。然而为了生成最终的输出信息(所需要的类信息或其他相关信息),卷积神经网络300需 要利用全连接层330来生成一个或者一组所需要的类的数量的输出。因此,在全连接层330中可以包括多层隐含层(如图7所示的331、332至33n)以及输出层340,该多层隐含层中所包含的参数可以根据具体的任务类型的相关训练数据进行预先训练得到,例如该任务类型可以包括图像增强,图像识别,图像分类,图像检测以及图像超分辨率重建等等。
在全连接层330中的多层隐含层之后,也就是整个卷积神经网络300的最后层为输出层340,该输出层340具有类似分类交叉熵的损失函数,具体用于计算预测误差,一旦整个卷积神经网络300的前向传播(如图7由310至340方向的传播为前向传播)完成,反向传播(如图7由340至310方向的传播为反向传播)就会开始更新前面提到的各层的权重值以及偏差,以减少卷积神经网络300的损失,及卷积神经网络300通过输出层输出的结果和理想结果之间的误差。
需要说明的是,图7所示的卷积神经网络仅作为一种本申请实施例图像增强模型的结构示例,在具体的应用中,本申请实施例的图像增强方法所采用的卷积神经网络还可以以其他网络模型的形式存在。
本申请的实施例中,图像增强装置可以包括图7所示的卷积神经网络300,该图像增强装置可以对待处理图像进行图像增强处理,得到处理后的输出图像。
图8是本申请实施例提供的一种芯片的硬件结构,该芯片包括神经网络处理器400(neural-network processing unit,NPU)。该芯片可以被设置在如图6所示的执行设备210中,用以完成计算模块211的计算工作。该芯片也可以被设置在如图6所示的训练设备220中,用以完成训练设备220的训练工作并输出目标模型/规则201。如图7所示的卷积神经网络中各层的算法均可在如图8所示的芯片中得以实现。
NPU 400作为协处理器挂载到主中央处理器(central processing unit,CPU)上,由主CPU分配任务。NPU 400的核心部分为运算电路403,控制器404控制运算电路403提取存储器(权重存储器或输入存储器)中的数据并进行运算。
在一些实现中,运算电路403内部包括多个处理单元(process engine,PE)。在一些实现中,运算电路403是二维脉动阵列。运算电路403还可以是一维脉动阵列或者能够执行例如乘法和加法这样的数学运算的其它电子线路。在一些实现中,运算电路403是通用的矩阵处理器。
举例来说,假设有输入矩阵A,权重矩阵B,输出矩阵C。运算电路403从权重存储器402中取矩阵B相应的数据,并缓存在运算电路403中每一个PE上。运算电路403从输入存储器401中取矩阵A数据与矩阵B进行矩阵运算,得到的矩阵的部分结果或最终结果,保存在累加器408(accumulator)中。
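示例性地，下面用NumPy演示将矩阵运算拆分为若干部分、并把部分结果累加到累加器中的过程，分块大小为示例性假设，仅用于说明运算电路与累加器408的配合方式：

```python
import numpy as np

A = np.random.randn(8, 16)       # 输入矩阵A
B = np.random.randn(16, 8)       # 权重矩阵B
acc = np.zeros((8, 8))           # 累加器：保存矩阵运算的部分结果

for k in range(0, 16, 4):        # 每次取矩阵A、B中对应的一部分数据进行运算
    acc += A[:, k:k + 4] @ B[k:k + 4, :]

assert np.allclose(acc, A @ B)   # 累加完成后得到最终结果
```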
向量计算单元407可以对运算电路403的输出做进一步处理,如向量乘,向量加,指数运算,对数运算,大小比较等等。例如,向量计算单元407可以用于神经网络中非卷积/非FC层的网络计算,如池化(pooling),批归一化(batch normalization),局部响应归一化(local response normalization)等。
在一些实现中，向量计算单元407能将经处理的输出的向量存储到统一存储器406。例如，向量计算单元407可以将非线性函数应用到运算电路403的输出，例如累加值的向量，用以生成激活值。在一些实现中，向量计算单元407生成归一化的值、合并值，或二者均有。
在一些实现中,处理过的输出的向量能够用作到运算电路403的激活输入,例如用于在神经网络中的后续层中的使用。
统一存储器406用于存放输入数据以及输出数据。权重数据直接通过存储单元访问控制器405(direct memory access controller,DMAC)将外部存储器中的输入数据存入到输入存储器401和/或统一存储器406、将外部存储器中的权重数据存入权重存储器402,以及将统一存储器406中的数据存入外部存储器。
总线接口单元410(bus interface unit,BIU),用于通过总线实现主CPU、DMAC和取指存储器409之间进行交互。
与控制器404连接的取指存储器409(instruction fetch buffer),用于存储控制器404使用的指令。控制器404,用于调用取指存储器409中缓存的指令,实现控制该运算加速器的工作过程。
一般地,统一存储器406,输入存储器401,权重存储器402以及取指存储器409均为片上(On-Chip)存储器,外部存储器为该NPU外部的存储器,该外部存储器可以为双倍数据率同步动态随机存储器(double data rate synchronous dynamic random access memory,DDR SDRAM)、高带宽存储器(high bandwidth memory,HBM)或其他可读可写的存储器。
其中,图7所示的卷积神经网络中各层的运算可以由运算电路403或向量计算单元407执行。
上文中介绍的图6中的执行设备210能够执行本申请实施例的图像增强方法的各个步骤,图7所示的CNN模型和图8所示的芯片也可以用于执行本申请实施例的图像增强方法的各个步骤。
图9所示是本申请实施例提供了一种***架构500。该***架构包括本地设备520、本地设备530以及执行设备510和数据存储***550,其中,本地设备520和本地设备530通过通信网络与执行设备510连接。
示例性地,执行设备510可以由一个或多个服务器实现。
可选的,执行设备510可以与其它计算设备配合使用。例如:数据存储器、路由器、负载均衡器等设备。执行设备510可以布置在一个物理站点上,或者分布在多个物理站点上。执行设备510可以使用数据存储***550中的数据,或者调用数据存储***550中的程序代码来实现本申请实施例的图像增强方法。
需要说明的是,上述执行设备510也可以称为云端设备,此时执行设备510可以部署在云端。
具体地,执行设备510可以执行以下过程:获取待处理图像;通过神经网络对所述待处理图像进行特征增强处理,得到所述待处理图像的增强图像特征,所述神经网络包括N个卷积层,N为正整数;根据所述增强图像特征对所述待处理图像进行颜色增强处理与亮度增强处理,得到输出图像。
在一种可能的实现方式中,本申请实施例的图像增强方法可以是在云端执行的离线方法,比如,可以由上述执行设备510中执行本申请实施例的图像增强方法。
在一种可能的实现方式中,本申请实施例的图像增强方法可以是由本地设备520或者本地设备530执行。
在本申请的实施例中,可以对获取的画质较差的待处理图像进行图像增强,从而得到待处理图像在图像细节、图像颜色以及图像亮度等方面性能均得到提升的输出图像。
例如,用户可以操作各自的用户设备(例如,本地设备520和本地设备530)与执行设备510进行交互。每个本地设备可以表示任何计算设备,例如,个人计算机、计算机工作站、智能手机、平板电脑、智能摄像头、智能汽车或其他类型蜂窝电话、媒体消费设备、可穿戴设备、机顶盒、游戏机等。
每个用户的本地设备可以通过任何通信机制/通信标准的通信网络与执行设备510进行交互,通信网络可以是广域网、局域网、点对点连接等方式,或它们的任意组合。
在一种实现方式中,本地设备520、本地设备530可以从执行设备510获取到目标神经网络的相关参数,将目标神经网络部署在本地设备520、本地设备530上,利用该目标神经网络进行图像增强处理等。
在另一种实现中,执行设备510上可以直接部署目标神经网络,执行设备510通过从本地设备520和本地设备530获取待处理图像,并根据目标神经网络对待处理图像进行图像增强处理等。
例如,上述目标神经网络可以是本申请实施例中的图像增强模型。
目前,智能终端的拍照成像和视频受限于智能终端光学传感器的硬件性能,其拍摄的照片和视频质量仍然不够高,存在高噪声、解析力较低、细节缺失、偏色等问题。图像(或图片)增强是各类图像处理应用的基础,计算机视觉常常会涉及到如何对获取到的图像进行图像增强的问题。当前已有的图像增强方法可以分为两类:第一类,缩放式方法(Scaling Method),即通过学习原图像或视频到目标图像或视频转换关系的映射函数,对输入图像或视频的像素或特征进行缩放(Scale)操作;第二类是生成式方法(Generative Method),即通过生成式对抗网络(generative adversarial networks,GAN)对输入图像或视频提取特征,生成新元素,重构输出图像或视频。但是,对于缩放式方法若原输入图像的细节不清晰,则通过缩放式方法得到的图像增强处理后的图像无法实现增强细节;对于生成式方法在进行图像增强时会引入伪纹理。
有鉴于此,本申请实施例提供了一种图像增强方法,通过对待处理图像进行特征增强处理,得到待处理图像的图像增强特征,并利用图像增强特征进一步对待处理图像进行颜色增强处理与亮度增强处理,从而在进行颜色增强与亮度增强时还能增强待处理图像的细节特征,使得待处理图像在细节方面、颜色方面以及亮度方面的性能均得以增强,从而提升图像增强处理的效果。
图10示出了本申请实施例提供的图像增强方法的示意性流程图,图10所示图像增强方法可以由图像增强装置执行,该图像增强装置具体可以是图6中的执行设备210,也可以是图9中的执行设备510或者本地设备。图10所示的方法包括步骤610至630,下面分别对步骤610至630进行详细的介绍。
步骤610、获取待处理图像。
其中,待处理图像可以是画质较差的原始图像;例如,可以是指受到天气、距离、拍摄环境等因素的影响,获取的待处理图像存在图像模糊、或者图像画质较低、或者图像颜色与亮度较低等问题。
示例性地,上述待处理图像可以是电子设备通过摄像头拍摄到的图像,或者,上述待 处理图像还可以是从电子设备内部获得的图像(例如,电子设备的相册中存储的图像,或者,电子设备从云端获取的图片)。例如,电子设备可以是图9所示的本地设备或者执行设备中的任意一个。
步骤620、通过神经网络对待处理图像进行特征增强处理,得到待处理图像的增强图像特征,神经网络可以包括N个卷积层,N为正整数。
其中,上述特征增强处理可以是指对待处理图像进行细节特征增强。
应理解,上述神经网络可以是指图11所示的图像增强模型中的特征提取部分;图像增强模型中可以包括多个神经网络,特征提取部分可以为第一神经网络,特征重建部分可以包括第二神经网络。比如,第一神经网络可以用于对所述待处理图像进行特征增强处理,得到所述待处理图像的增强图像特征;第二神经网络可以用于根据所述增强图像特征对所述待处理图像进行颜色增强处理与亮度增强处理,得到输出图像。其中,图13所示可以为第一神经网络,图14所示可以为第二神经网络。
需要说明的是,特征增强处理可以是指有目的地强调图像的整体或局部特性,将原来不清晰的图像变得清晰或强调某些感兴趣的特征,扩大图像中不同物体特征之间的差别,抑制不感兴趣的特征,使之改善图像质量、丰富信息量,加强图像判读和识别效果,满足识别与分析的需要。
步骤630、根据增强图像特征对待处理图像进行颜色增强处理与亮度增强处理,得到输出图像。
其中,颜色增强处理可以用于改善待处理图像的色彩分布,提高待处理图像的色彩饱和度。亮度增强处理可以是指调整待处理图像的亮度。通过上述特征增强处理、颜色增强处理以及强度增强处理可以提高了图像的清晰度、图像的质量,使图像中的物体的轮廓更加清晰,细节更加明显。
需要说明的是,在本申请的实施例中,输出图像可以是指对获取的待处理图像进行图像增强得到的图像,图像增强可以称为图像质量增强,具体可以是指对图像的亮度、颜色、对比度、饱和度和/或动态范围等进行处理,以使得该图像的一项或者多项指标满足预设的条件。
示例性地,可以通过采用拉普拉斯增强算法对待处理图像进行特征增强处理,得到待处理图像的增强图像特征。
应理解,待处理图像的增强图像特征可以是指对待处理图像中的细节或者纹理等进行增强后得到的增强图像特征。
在本申请的实施例中,可以通过拉普拉斯增强算法实现待处理图像中的细节特征的增强,其中,拉普拉斯增强算法在用于对待处理图像进行特征增强时不会引入新的纹理,从而避免了图像增强处理后得到的输出图像中引入伪纹理的问题,能够提升图像增强处理的效果。
示例性地,上述拉普拉斯增强算法可以是指传统的拉普拉斯增强算法,通过待处理图像的原始图像特征与待图像的高频特征进行融合处理后得到的增强图像特征。
例如,通过以下等式对待处理图像进行特征增强处理:
E=I+s_c·[I-h(I)]；
其中，I可以表示待处理图像的原始图像特征，E可以表示待处理图像的增强图像特征，h可以表示模糊核函数，s_c可以表示常数值的缩放系数，I-h(I)可以表示得到待处理图像的高频特征。
需要说明的是,图像的高频特征可以是指图像的细节、纹理等信息;图像的低频特征可以是指图像的轮廓信息。
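示例性地，下面给出传统拉普拉斯增强的一个NumPy实现草图，其中模糊核函数h用简单的均值模糊代替、缩放系数s_c取固定常数，均为示例性假设：

```python
import numpy as np

def box_blur(img, k=5):
    """h(I)：简单的均值模糊，作为模糊核函数的一个示例。"""
    pad = k // 2
    padded = np.pad(img, pad, mode="edge")
    out = np.zeros_like(img, dtype=np.float64)
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (k * k)

def laplacian_enhance(img, s_c=1.5):
    """E = I + s_c·[I - h(I)]：利用高频特征增强图像的细节与纹理。"""
    high_freq = img - box_blur(img)              # I - h(I)：待处理图像的高频特征
    return np.clip(img + s_c * high_freq, 0.0, 255.0)

gray = np.random.rand(64, 64) * 255              # 示例输入（单通道图像）
enhanced = laplacian_enhance(gray)
```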
进一步地,为了提升待处理图像的特征增强处理效果,本申请实施例中提出了改进的拉普拉斯增强算法,通过本申请实施例的改进的拉普拉斯增强算法可以通过将前一卷积层中的图像特征用于后续图像特征的增强,实现对不同卷积层的图像特征的渐进式逐步增强。
可选地,本申请实施例提出的拉普拉斯增强算法可以用于根据所述N个卷积层中的第i个卷积层的残差特征对第i个卷积层的输入图像特征进行特征增强处理,得到第i个卷积层的增强图像特征,其中,残差特征可以表示第i个卷积层的输入图像特征与第i个卷积层中卷积操作处理后的图像特征之间的差异,第i个卷积层的增强图像特征为第i+1个卷积层的输入图像特征,输入图像特征是根据待处理图像得到的,i为正整数。
示例性地,待处理图像的增强图像特征可以为所述N个卷积层中第N个卷积层输出的图像特征,通过以下等式得到待处理图像的增强图像特征:
L(F_N)=Φ(F_N)+s_l·(Φ(F_N)-F_N)；
其中，L(F_N)可以表示第N个卷积层的增强图像特征，F_N可以表示第N个卷积层的输入图像特征，Φ可以表示第N个卷积层的卷积核，s_l可以表示通过学习得到的参数，N为正整数。
本申请实施例中提出的拉普拉斯增强算法,通过可学习的参数替换传统拉普拉斯增强算法中的固定缩放系数;同时,采用相邻层的残差特征进行特征增强,残差特征可以用于表示任何需要强调的信息。因此,本申请实施例的拉普拉斯算法不仅仅可以对图像的高频信息进行增强,还能够对不同卷积层的图像特征的渐进式逐步增强,从而提高图像特征增强的效果。
进一步地,在本申请的实施例中,可以通过获取的增强图像特征对待处理图像进行颜色增强与亮度增强,由于待处理图像的增强图像特征中包括更多的细节特征以及纹理,因此,根据待处理图像的增强图像特征对待处理图像进行颜色增强处理以及亮度增强处理可以使得输出图像实现细节增强,同时也能提升输出图像的亮度与颜色。
示例性地,根据所述增强图像特征对所述待处理图像进行颜色增强处理与亮度增强处理,得到输出图像,可以包括:根据所述待处理图像的增强图像特征,得到所述待处理图像的置信图像特征与光照补偿图像特征,其中,所述置信图像特征用于所述待处理图像进行颜色增强,所述光照补偿图像特征用于对所述待处理图像进行亮度增强;根据所述待处理图像、所述置信图像特征以及所述光照补偿图像特征得到所述输出图像。
需要说明的是,上述置信图像特征可以表示对所述待处理图像进行颜色增强处理的映射关系,或者映射函数。
示例性地,置信图像特征与待处理图像的图像特征可以相对应,比如,置信图像特征中的一个元素可以用于表示待处理图像的图像特征中对应元素的缩放程度;通过对待处理图像中不同区域的缩放可以实现待处理图像的颜色增强。
在本申请的实施例中,用于颜色增强处理的置信图像特征以及用于亮度增强处理的光 照补偿图像特征可以是通过待处理图像的增强图像特征得到的,相比于传统的缩放式方法即直接通过映射函数将待处理图像中各像素点的颜色与亮度进行增强相比,本申请实施例中的置信图像特征与光照补偿图像特征不仅能够进行颜色增强与亮度增强,同时还能实现对待处理图像的细节增强,从而提升图像增强处理的效果。
例如,可以通过对所述待处理图像的增强图像特征进行卷积操作,得到所述置信图像特征与所述光照补偿图像特征;通过所述待处理图像的图像特征与所述置信图像特征相乘得到所述待处理图像的颜色增强图像特征;将所述颜色增强图像特征与所述光照补偿图像特征进行融合,得到所述输出图像。
需要说明的是,上述置信图像特征与光照补偿图像特征在根据待处理图像的增强图像特征得到的,因此,置信图像特征在用于对待处理图像进行颜色增强时在一定程度上也会增强待处理图像的图像特征;同理,光照补偿图像特征在用于对待处理图像进行亮度增强时也会增强待处理图像的图像特征,从而实现待处理图像的细节、颜色以及亮度的增强。
本申请实施例中,用于颜色增强处理的置信图像特征以及用于亮度增强处理的图像特征是通过待处理图像的增强图像特征得到的,相比于传统的缩放式方法即直接通过映射函数将待处理图像中各像素点的颜色与亮度进行增强相比,本申请实施例中的置信图像特征与光照补偿图像特征不仅能够进行颜色增强与亮度增强,同时还能实现对待处理图像的细节增强。
在一种可能的实现方式,可以并行地通过待处理图像的增强图像特征得到上述置信图像特征以及上述光照补偿图像特征;比如,可以通过网络模型中的第一分支对待处理图像的增强图像特征进行卷积操作,得到上述置信图像特征;通过网络模型中的第二分支对所述待处理图像的增强图像特征进行卷积操作,得到上述光照补偿图像特征。
在一种可能的实现方式中,可以通过待处理图像、置信图像特征以及光照补偿图像特征得到输出图像。
例如,可以通过待处理图像的图像特征(例如,待处理图像的原始图像特征)与置信图像特征相乘,从而实现待处理图像中不同区域的颜色增强,得到颜色增强图像特征;再将颜色增强图像特征与光照补偿图像特征进行融合从而得到图像增强处理后的输出图像。
示例性地,上述颜色增强图像特征与光照补偿图像特征进行融合可以是指颜色增强图像特征与光照补偿图像特征相加得到输出图像。
在另一种可能的实现方式中,可以通过待处理图像的增强图像特征、置信图像特征以及光照补偿特征得到图像增强处理后的输出图像。
例如,可以通过待处理图像的增强图像特征与置信图像特征相乘,从而实现待处理图像中不同区域的颜色增强,得到颜色增强图像特征;再将颜色增强图像特征与光照补偿图像特征进行融合从而得到图像增强处理后的输出图像。
图11是本申请实施例提供的用于图像增强的模型结构示意图。图11所示的模型可以部署于执行上述图像增强方法的图像增强装置中。
图11所示的模型可以包括四个部分,即输入部分、特征提取部分、特征重建部分以及输出部分。其中,特征提取部分可以包括拉普拉斯增强单元(laplacian enhancing unit,LEU);特征重建部分可以包括混合增强单元(hybrid enhancing module,HEM)。
应理解,图11所示的图像增强模型中可以包括多个神经网络,特征提取部分可以为 第一神经网络,特征重建部分可以包括第二神经网络。第一神经网络可以用于对所述待处理图像进行特征增强处理,得到所述待处理图像的增强图像特征;第二神经网络可以用于根据所述增强图像特征对所述待处理图像进行颜色增强处理与亮度增强处理,得到输出图像。例如,图13所示可以为第一神经网络,图14所示可以为第二神经网络。
其中,在特征提取部分可以将拉普拉斯增强单元嵌入到卷积层,通过使用拉普拉斯增强单元对提取的特征进行增强;具体地,拉普拉斯增强单元可以用于通过拉普拉斯增强算法对待处理图像进行特征增强处理。
例如,通过若干层拉普拉斯增强单元对所提取特征进行渐进地增强,将前一层的图像特征可以用于增强后一层图像特征,通过叠加前一层的图像特征的残差,实现对不同卷积层的图像特征进行逐步增强,从而能够提高图像增强的性能。对于特征重建部分,可以使用混合增强单元实现缩放式方法和生成式方法两种图像增强方法的优点。
需要说明的是,混合增强单元可以是以通过拉普拉斯增强单元处理后的输入图像的图像特征作为输入数据,即混合增强单元是以拉普拉斯单元输出的输出图像的增强图像特征为输入数据。其中,拉普拉斯增强单元可以用于通过拉普拉斯算法对输出图像进行特征增强处理,混合增强单元可以用于对输出图像进行颜色增强处理与亮度增强处理。
示例性地,图11所示的本申请实施例提供的用于执行图像增强方法的图像增强模型可以称为混合渐进式增强U型网络(hybrid progressive enhancing u-net,HPEU),通过采用LEU与HEM,从而使得感受野逐层增大;其中,LEU可以实现基于更多的图像信息对输出图像进行特征增强处理。
此外,拉普拉斯增强单元可以用于对不同级别的图像特征进行不同程度的增强处理。比如,在图像增强模型较浅的卷积层,LEU可以主要用于根据输入图像的边缘信息进行局部区域特征增强处理;在图像增强模型较深的卷积层,由于感受野更大,因此LEU可以用于全局特征的增强处理。
需要说明的是,上述感受野是计算机视觉领域的深度神经网络领域的一个术语,用来表示神经网络内部的不同位置的神经元对原始图像的感受范围的大小。神经元感受野的值越大,表示其能接触到的原始图像的范围就越大,也意味着该神经元可能蕴含更为全局、语义层次更高的特征;而值越小则表示神经元包含的特征越趋向于局部和细节,感受野的值可以大致用来判断每一层的抽象层次。
图12是本申请实施例提供的拉普拉斯增强单元以及混合增强单元的示意图。如图12所示的可以包括一个或者多个拉普拉斯增强单元以及混合增强单元,输入图像经过卷积层提取特征,所提取的特征经过拉普拉斯增强单元进行增强后,得到增强后的图像特征;进一步,将增强后的图像特征作为后续卷积层,或者后续拉普拉斯增强单元层的输入数据;直至将增强后的图像特征输入给混合增强单元,可以通过将特征通道划分为两部分,分别计算缩放组件和生成组件,进而融合两种组件得到最终的增强图像。
其中,上述缩放组件可以用于实现色彩增强,通过置信图(又称为置信图像特征)可以约束输入图像不同区域的色彩增强程度;上述生成组件可以用于光照补偿,从而实现对比度和亮度的增强。
例如,图13为本申请实施例提出的拉普拉斯增强单元处理流程的示意图(即特征增强处理的示意图)。如图13所示,拉普拉斯增强单元的处理过程包括以下步骤:
步骤一：假设当前网络（第一神经网络）包括N个卷积层，第N个卷积层的输出数据为图像特征F_N，即可以将F_N作为特征增强处理的输入数据。
步骤二：通过第N个卷积层提取F_N的特征，记为Φ(F_N)。
步骤三：计算Φ(F_N)和F_N之间的残差，残差可以记为：Φ(F_N)-F_N，或者，F_N-Φ(F_N)，并通过可学习的参数对上述残差进行增强处理。
步骤四：将上述增强处理后的残差与Φ(F_N)进行叠加，得到第N个卷积层通过拉普拉斯增强单元进行特征增强处理后得到的增强图像特征。
示例性地，通过以下等式得到第N个卷积层输出的图像特征：
L(F_N)=Φ(F_N)+s_l·(Φ(F_N)-F_N)；
其中，L(F_N)可以表示第N个卷积层的增强图像特征，F_N可以表示第N个卷积层的输入图像特征，Φ可以表示第N个卷积层的卷积核，s_l可以表示通过学习得到的参数，可以用于表示LEU每次对图像特征进行增强的增强程度。
应理解，第一神经网络可以包括N个卷积层，上述第N个卷积层输出的图像特征可以是对待处理图像进行特征增强处理，得到的待处理图像的增强图像特征。
例如，s_l可以表示通过学习得到的缩放参数，缩放参数可以对图像中的不同区域执行缩放操作，从而实现图像中不同区域的颜色增强。
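示例性地，下面给出拉普拉斯增强单元（LEU）一种可能实现的PyTorch草图，对应上述步骤一至步骤四；其中通道数、卷积核尺寸均为示例性假设，并非对本申请实施例的限定：

```python
import torch
import torch.nn as nn

class LEU(nn.Module):
    """单个拉普拉斯增强单元：L(F_N) = Φ(F_N) + s_l·(Φ(F_N) - F_N)。"""
    def __init__(self, channels=64):
        super().__init__()
        self.conv = nn.Conv2d(channels, channels, kernel_size=3, padding=1)  # Φ：本层卷积核
        self.s_l = nn.Parameter(torch.zeros(1))                              # s_l：可学习的缩放参数

    def forward(self, f_n):
        phi = self.conv(f_n)                 # 步骤二：提取F_N的特征，记为Φ(F_N)
        residual = phi - f_n                 # 步骤三：计算Φ(F_N)与F_N之间的残差
        return phi + self.s_l * residual     # 步骤四：叠加经s_l增强后的残差，得到增强图像特征

f_n = torch.randn(1, 64, 128, 128)           # 第N个卷积层的输入图像特征（示例）
enhanced_feat = LEU(64)(f_n)
```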
例如,图14为本申请实施例提出的混合增强单元处理流程的示意图(即颜色增强处理与亮度增强处理的示意图)。如图14所示,混合增强单元的处理过程包括以下步骤:
步骤一:将拉普拉斯增强单元输出的增强图像特征(即可以是将第一神经网络输出的增强图像特征)的多个特征通道划分为两个分支,记为第一分支与第二分支。
步骤二:通过混合增强单元中的第一分支进行卷积操作得到置信图,进而将置信图作用到输入图像进行像素级的缩放,得到缩放组件。
需要说明的是,置信图即上述图10所示的增强处理方法中的置信图像特征,用于对输入图像进行颜色增强。比如,置信图可以表示对输入图像进行颜色增强处理的映射关系,或者映射函数。缩放组件即图10所示的增强处理方法中的颜色增强图像特征。
示例性地,置信图与输入图像可以相对应,比如,置信图的图像特征中的一个元素可以用于表示输入图像的图像特征中对应元素的缩放程度;通过对输入图像中不同区域的缩放可以实现待输入图像的颜色增强。
其中,置信图可以包括若干个通道,与输入图像的通道相对应。
示例性地,将置信图作用到输入图像可以是指将置信图与输入图像相乘,比如,将输入图像与置信图进行逐像素的相乘。
步骤三:通过混合增强单元中的第二分支进行卷积操作得到用于光照补偿的生成组件。
需要说明的是,生成组件用于对输入图像进行亮度增强;生成组件即上述图10所示的增强处理方法中的光照补偿图像特征。
步骤四:将两个分支的图像特征进行融合处理,得到图像增强处理后的输出图像。
例如,缩放组件为s[i,j]=x[i,j]·c[i,j],其中x表示输入图像、c表示置信图,i和j表示像素坐标,则最终得到的输出图像为y[i,j]=x[i,j]·c[i,j]+g[i,j]。
在一个可能的实现方式中,假设拉普拉斯增强单元输出的输入图像的增强图像特征有N个通道,则可以将N个通道划分为两个分支,其中,M个通道特征通过卷积得到置信 图,置信图可以用于计算缩放组件,M个通道可以分别对应图像的R、G、B三个通道;其他的N-M个通道可以通过卷积操作得到生成组件,生成组件用于光照补偿实现对比度与亮度的增强。
应理解,由于生成组件用于光照补偿,因此可以通过一个通道特征进行卷积操作得到用于光照增强处理的光照补偿图像特征。
示例性地,在本申请的实施例中,可以根据混合增强模块与输入图像的增强图像特征,得到输入图像的置信图像特征与光照补偿图像特征,其中,混合增强模块包括第一分支与第二分支,第一分支用于根据增强图像得到置信图像特征,第二分支用于根据增强图像特征得到光照补偿图像特征,置信图像特征可以用于表示对待处理图像进行颜色增强的映射关系(例如,映射函数),光照补偿图像特征可以用于对输入图像进行亮度增强;根据输入图像、置信图像特征以及光照补偿图像特征得到输出图像。
例如,图15是本申请实施例提供的混合增强单元处理流程的示意图。其中,待处理图像与置信图的尺度大小可以相同,可以通过置信图对输入图像的不同区域进行不同程度的缩放,从而实现颜色增强;再通过叠加生成组件用于实现光照补偿,实现对比度与亮度的增强,最终得到输入图像的特征增强、颜色增强以及亮度增强后的输出图像。
需要说明的是,上述置信图可以表示对所述待处理图像进行颜色增强处理的映射关系,或者映射函数。
示例性地,置信图与待处理图像可以相对应,比如,置信图的图像特征中的一个元素可以用于表示待处理图像的图像特征中对应元素的缩放程度;通过对待处理图像中不同区域的缩放可以实现待处理图像的颜色增强。
在本申请的实施例中,用于颜色增强处理的置信图以及用于亮度增强处理的光照补偿图像特征可以是通过待处理图像的增强图像特征得到的,相比于传统的缩放式方法即直接通过映射函数将待处理图像中各像素点的颜色与亮度进行增强相比,本申请实施例中的置信图像特征与光照补偿图像特征不仅能够进行颜色增强与亮度增强,同时还能实现对待处理图像的细节增强。
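示例性地，下面给出混合增强单元（HEM）一种可能实现的PyTorch草图：将增强图像特征的通道划分为两个分支，分别经卷积得到置信图（缩放组件）与光照补偿（生成组件），再与输入图像融合得到输出图像；其中通道划分数量与卷积结构均为示例性假设：

```python
import torch
import torch.nn as nn

class HEM(nn.Module):
    """混合增强单元草图：y[i, j] = x[i, j]·c[i, j] + g[i, j]。"""
    def __init__(self, channels=64, split=32):
        super().__init__()
        self.split = split
        # 第一分支：由M个通道的特征卷积得到置信图c（3通道，对应输入图像的R、G、B）
        self.conf_conv = nn.Conv2d(split, 3, kernel_size=3, padding=1)
        # 第二分支：由其余N-M个通道的特征卷积得到用于光照补偿的生成组件g
        self.gen_conv = nn.Conv2d(channels - split, 3, kernel_size=3, padding=1)

    def forward(self, x, enhanced_feat):
        f1, f2 = enhanced_feat[:, :self.split], enhanced_feat[:, self.split:]
        c = self.conf_conv(f1)       # 置信图：约束输入图像不同区域的颜色增强程度
        g = self.gen_conv(f2)        # 光照补偿图像特征
        return x * c + g             # 逐像素缩放（颜色增强）后叠加光照补偿（亮度增强）

x = torch.randn(1, 3, 128, 128)      # 输入图像
feat = torch.randn(1, 64, 128, 128)  # 拉普拉斯增强单元输出的增强图像特征
out = HEM()(x, feat)
```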
表1
模型 PSNR MS-SSIM Time(ms) Parameters
Baseline 22.67 0.9355 371 3758879
+LEU 22.94 0.9352 383 3758903
+HEM 22.79 0.9405 390 3758828
本申请 23.29 0.9431 408 3758852
表1是本申请实施例提供的在MIT-Adobe FiveK数据集上的量化性能测评结果。其中,Baseline表示基础模型架构,即可以是不包括上述混合增强单元(HEM)与上述拉普拉斯增强单元(LEU)的Unet模型;+LEU表示包括上述拉普拉斯增强单元的图像增强Unet模型;+HEM表示包括上述混合增强单元的图像增强Unet模型;峰值信噪比(peak signal-to-noise ratio,PSNR)通常情况下用作图像处理等领域中信号重建质量的测量方法,可以通过均方误差进行定义。多尺度结构相似性(multi-scale structural similarity index,MS-SSIM)可以用于衡量两幅图像相似度的指标,用于评价算法处理的输出图像的质量。结构相似度指数从图像组成的角度将结构信息定义为独立于亮度、对比度的反映场景中物 体结构的属性,并将失真建模为亮度、对比度和结构三个不同因素的组合。比如,用均值作为亮度的估计,标准差作为对比度的估计,协方差作为结构相似程度的度量。时间用于表示模型对输入图像进行图像增强的时间。参数量(parameters)可以用于描述神经网络包含的参数量,用于评价模型的大小。
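示例性地，表中的PSNR可以按如下方式通过均方误差计算（假设像素取值范围为0~255）：

```python
import numpy as np

def psnr(pred, target, max_val=255.0):
    """峰值信噪比：由均方误差（MSE）定义，数值越高表示重建质量越好。"""
    mse = np.mean((pred.astype(np.float64) - target.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(max_val ** 2 / mse)
```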
从表1所示的性能测评结果可以看出,本申请实施例提出的图像增强模型(HPEU)在PSNR方面提升了0.62dB,此外并没有增加过多的参数量,处理时延增加37ms;即通过本申请实施例提出的图像增强方法可以有效提升图像增强的效果。
表2
（表2原文为图表：WESPE、CAN、RSGUnet与本申请模型（HPEU）在数据集1（MIT-Adobe FiveK）与数据集2（DPED-iPhone）上的PSNR、SSIM量化性能对比，具体数值未随文本保留。）
表2是本申请实施例提供的不同图像增强模型在数据集上的量化性能测评结果。其中,数据集1可以表示MIT-Adobe FiveK数据集;数据集2可以表示DPED-iPhone数据集;测试的模型包括:数码相机照片增强器(Weakly Supervised Photo Enhancer for Digital camera,WESPE)、上下文融合网络(context aggregation network,CAN)、区域尺度缩放的全局U型网络(range scaling global Unet,RSGUnet)。
从表2所示的性能测评结果可以看出,在DPED-iPhone数据集上本申请提出的图像增强模型的PSNR和SSIM不如WESPE模型,主要由于DPED-iPhone数据集的非像素级对齐导致的,对于DPED-iPhone数据集是通过采用单反相机与手机在同一场景,以相同的角度同时拍摄得到的图像对,由于单反相机与手机的传感器不同,因此单反相机拍摄的图像与相机拍摄的图形并非逐像素对齐,从而导致HPEU在下采样的过程中会产生相对偏差;在MIT-Adobe FiveK数据集上,本申请实施例提出的图像增强模型(HPEU)的PSNR和SSIM最高,且视觉效果与真值(Ground Truth)更加接近。
表3
模型 PSNR MS-SSIM
GDF 22.64 0.9353
HDR 22.27 0.9391
HPEU+引导滤波 22.59 0.9383
目前，已有针对小分辨率图片进行图像增强的快速处理方法，比如以原始图像作为引导（Guidance）对输出图像进行上采样，得到增强后的图像。
表3是本申请实施例提供的在MIT-Adobe FiveK数据集通过在模型中添加引导滤波(Guided Filter)的量化性能评测结果。测试的模型包括:可训练引导滤波模型(trainable guided filter,GDF)、高动态范围网络模型(high dynamic range,HDR)以及本申请实施例提供的图像增强模型(HPEU)中添加引导滤波。
从表3所示的性能测评结果可以看出,在本申请的实施例提出的HPEU模型中添加引导滤波后网络的性能与高动态范围网络模型接近。
表4
模型 WESPE CAN RSGUnet HPEU HDR GDF HPEU+引导滤波
运行时间(ms) 1895 2043 417 408 210 57 51
表4是本申请实施例提供的图像增强处理运行时间的测评结果。从表4所示的运行时间测评结果可以看出,在相同的处理情况下本申请实施例提出的图像增强模型(HPEU)模型的运行时间最短,HPEU+Guided Filter的计算效率更高。在HPEU模型中增加Guided Filter后,HPEU+Guided Filter的速度比GDF模型和HDR模型更快,且客观量化指标相接近。
表5
模型 PSNR MS-SSIM Perceptual Index
Input 17.08 0.8372 4.10
WESPE 22.57 0.9186 3.67
RSGUnet 22.83 0.9254 3.77
本申请 23.02 0.9255 3.50
表5是本申请实施例提供的基于DPED-iphone数据集引入视觉感知损失后的量化测评结果。其中,包括客观评价指标PSNR、SSIM以及感知评价指标(Perceptual Index)。
从表5所示的性能测评结果可以看出,本申请实施例提出的图像增强模型(HPEU)在细节增强和噪声抑制方面均有较好的表现。
表6
模型 数据集A 数据集B 数据集C
DPED 17.22 16.58 16.32
CAN 16.33 15.02 15.72
RSGUnet 16.17 14.97 15.17
本申请 18.32 16.91 17.39
表6是本申请实施例提供的基于MIT-Adobe FiveK数据集训练各个网络模型,并用数据集A(DPED-iphone)、数据集B(DPED-Sony)和数据集C(DPED-Blackberry)作为测试数据集,用于验证模型的泛化能力。
从表6所示的性能测评结果可以看出,本申请实施例提供的图像增强模型(HPEU)的PSNR超过其它对比的模型,其泛化能力相对最好。
图16是本申请实施例提供的视觉质量测评结果的示意图。需要说明的是,在图16中为了区别于灰度图像部分,彩色图像部分通过斜线填充来表示。
其中,图16的(a)表示输入图像(例如,待处理图像);图16的(b)表示采用基础模型得到的预测输出图像,基础模型即可以不包括上述混合增强单元(HEM)与上述拉普拉斯增强单元(LEU)的Unet模型;图16的(c)表示采用上述拉普拉斯增强单元的Unet模型得到的预测输出图像;图16的(d)表示采用上述混合增强单元的Unet模型得到的预测输出图像;图16的(e)表示采用本申请实施例的模型(例如,图11或图12所示的模型)得到的预测输出图像;图16的(f)表示输入图像对应的真值图像(Ground Truth),真值图像可以表示输入图像对应的样本增强图像;图16的(e)表示误差图1,即基础模 型输出的预测输出图像与真值图像之间的残差;图16的(h)表示误差图2,即采用上述拉普拉斯增强单元的Unet模型输出的预测输出图像与真值图像之间的残差;图16的(i)表示误差图3,即采用上述混合增强单元的Unet模型输出的预测输出图像与真值图像之间的残差;图16的(j)表示误差图4,即采用本申请实施例的模型输出的预测输出图像与真值图像之间的残差。
从图16所示的视觉质量测评结果的示意图可以看出,采用拉普拉斯增强单元与混合增强单元即本申请实施例提出的图像增强模型得到的输出图像在视觉上与真值图像最接近。
图17是本申请实施例提供的图像增强方法的示意性流程图。图17所示的方法700包括步骤710至步骤740,下面对步骤710至步骤740进行详细的说明。
步骤710、检测到用户用于打开相机的第一操作。
步骤720、响应于所述第一操作,在所述显示屏上显示拍摄界面,在所述显示屏上显示拍摄界面,所述拍摄界面上包括取景框,所述取景框内包括第一图像。
在一个示例中,用户的拍摄行为可以包括用户打开相机的第一操作;响应于所述第一操作,在显示屏上显示拍摄界面。
图18中的(a)示出了手机的一种图形用户界面(graphical user interface,GUI),该GUI为手机的桌面810。当电子设备检测到用户点击桌面810上的相机应用(application,APP)的图标820的操作后,可以启动相机应用,显示如图18中的(b)所示的另一GUI,该GUI可以称为拍摄界面830。该拍摄界面830上可以包括取景框840。在预览状态下,该取景框840内可以实时显示预览图像。
需要说明的是,在本申请实施例中为了区别于灰度图像部分,彩色图像部分通过斜线填充来表示。
示例性的,参见图18中的(b),电子设备在启动相机后,取景框840内可以显示有第一图像,该第一图像为彩色图像。拍摄界面上还可以包括用于指示拍照模式的控件850,以及其它拍摄控件。
在一个示例中,用户的拍摄行为可以包括用户打开相机的第一操作;响应于所述第一操作,在显示屏上显示拍摄界面。例如,电子设备可以检测到用户点击桌面上的相机应用(application,APP)的图标的第一操作后,可以启动相机应用,显示拍摄界面。在拍摄界面上可以包括取景框,可以理解的是,在拍照模式和录像模式下,取景框的大小可以不同。例如,取景框可以为拍照模式下的取景框。在录像模式下,取景框可以为整个显示屏。在预览状态下即可以是用户打开相机且未按下拍照/录像按钮之前,该取景框内可以实时显示预览图像。
在一个示例中,预览图像可以为彩色图像,预览图像可以是在相机设置为自动拍照模式的情况下显示的图像。
步骤730、检测到所述用户指示相机的第二操作。
在一种可能的实现方式中,可以是检测到用户指示第一处理模式的第二操作。其中,第一处理模式可以是专业拍摄模式(例如,图像增强拍摄模式)。参见图19中的(a),拍摄界面上包括拍摄选项860,在电子设备检测到用户点击拍摄选项860后,参见图19中的(b),电子设备显示拍摄模式界面。在电子设备检测到用户点击拍摄模式界面上用 于指示专业拍摄模式861后,手机进入专业拍摄模式,例如,手机进行图像增强拍摄模式。
在一种可能的实现方式中,可以是检测到用户用于指示拍摄的第二操作,该第二操作为在拍摄远距离的物体,或者拍摄微小的物体,或者在拍摄环境较差的情况下用于指示拍摄的操作。
示例性地,参见图19中的(c)中,检测到用户用于指示拍摄的第二操作870。
在另一种可能的实现方式中,可以是检测到用于指示拍摄的第二操作,即用于可以不执行图19中的(a)与图19中的(b)所示的操作,直接执行图19中的(c)所示的指示拍摄的第二操作870。
应理解,用户用于指示拍摄行为的第二操作可以包括按下电子设备的相机中的拍摄按钮,也可以包括用户设备通过语音指示电子设备进行拍摄行为,或者,还可以包括用户其它的指示电子设备进行拍摄行为。上述为举例说明,并不对本申请作任何限定。
步骤740、响应于所述第二操作,在所述取景框内显示第二图像,或者在所述电子设备中保存第二图像,所述第二图像为针对所述摄像头采集到的所述第一图像的增强图像特征对所述第一图像进行颜色增强处理与亮度增强处理得到的,所述第一图像的增强图像特征是通过神经网络对所述第一图像进行特征增强处理得到的,所述神经网络包括N个卷积层,N为正整数。
应理解,对第一图像进行图像增强方法的流程可以参见图10所示的图像增强方法,执行图像增强方法的图像增强模型可以采用图11所示的模型。
在一种可能的实现方式中,参见图19,图19的(d)中取景框内显示的是第二图像,图19的(c)中取景框内显示的是第一图像,第二图像和第一图像的内容相同或者实质上相同,但是第二图像的画质优于第一图像。例如,第二图像的细节显示优于第一图像;或者,第二图像的亮度优于第一图像;或者,第二图像的亮度优于第一图像。
在另一种可能的实现方式中,取景框中可以不显示图19的(d)所示的第二图像,而是将第二图像保存至电子设备的相册中。
可选地,在一种可能的实现方式中,所述第一图像的增强图像特征是根据拉普拉斯增强算法对所述第一图像进行所述特征增强处理得到的。
可选地,在一种可能的实现方式中,所述拉普拉斯增强算法用于根据所述N个卷积层中的第i个卷积层的残差特征对所述第i个卷积层的输入图像特征进行所述特征增强处理得到所述第i个卷积层的增强图像特征,其中,所述残差特征表示所述第i个卷积层的输入图像特征与所述第i个卷积层中卷积操作处理后的图像特征之间的差异,所述第i个卷积层的增强图像特征为所述第i+1个卷积层的输入图像特征,所述输入图像特征是根据所述第一图像得到的,i为正整数。
可选地,在一种可能的实现方式中,所述第一图像的增强图像特征为所述N个卷积层中的第N个卷积层输出的图像特征,所述第一图像的增强图像特征是通过以下等式得到的:
L(F_N)=Φ(F_N)+s_l·(Φ(F_N)-F_N)；
其中，L(F_N)表示所述第N个卷积层的增强图像特征，F_N表示所述第N个卷积层的输入图像特征，Φ表示所述第N个卷积层的卷积核，s_l表示通过学习得到的缩放参数。
可选地,在一种可能的实现方式中,所述输出图像是根据第一图像、置信图像特征以 及光照补偿图像特征得到的,所述置信图像特征与所述光照补偿图像特征是根据所述第一图像的增强图像特征得到的,所述置信图像特征用于对所述第一图像进行颜色增强,所述光照补偿图像特征用于对所述第一图像进行亮度增强。
可选地,在一种可能的实现方式中,所述输出图像是根据颜色增强图像特征与所述光照补偿图像特征进行融合得到的,所述颜色增强图像特征是根据所述第一图像的图像特征与所述置信图像特征相乘得到,所述置信图像特征是通过对所述第一图像的增强图像特征进行卷积操作得到的,所述光照补偿图像特征是通过对所述第一图像的增强图像特征进行卷积操作得到的。
应理解,上述举例说明是为了帮助本领域技术人员理解本申请实施例,而非要将本申请实施例限于所例示的具体数值或具体场景。本领域技术人员根据所给出的上述举例说明,显然可以进行各种等价的修改或变化,这样的修改或变化也落入本申请实施例的范围内。
上文结合图1至图19,详细描述了本申请实施例提供的图像增强方法,下面将结合图20和图21,详细描述本申请的装置实施例。应理解,本申请实施例中的图像增强装置可以执行前述本申请实施例的各种图像增强方法,即以下各种产品的具体工作过程,可以参考前述方法实施例中的对应过程。
图20是本申请实施例提供的图像增强装置的示意性框图。应理解,图像增强装置900可以执行图10所示的图像增强方法。该图像增强装置900包括:获取单元910和处理单元920。
其中,所述获取单元910,用于获取待处理图像;所述处理单元920,用于通过神经网络对所述待处理图像进行特征增强处理,得到所述待处理图像的增强图像特征,所述神经网络包括N个卷积层,N为正整数;根据所述增强图像特征对所述待处理图像进行颜色增强处理与亮度增强处理,得到输出图像。
可选地,作为一个实施例,所述处理单元920具体用于:
通过拉普拉斯增强算法对所述待处理图像进行所述特征增强处理,得到所述待处理图像的增强图像特征。
可选地,作为一个实施例,所述拉普拉斯增强算法用于根据所述N个卷积层中的第i个卷积层的残差特征对所述第i个卷积层的输入图像特征进行所述特征增强处理得到所述第i个卷积层的增强图像特征,其中,所述残差特征表示所述第i个卷积层的输入图像特征与所述第i个卷积层中通过卷积操作处理后的图像特征之间的差值,所述第i个卷积层的增强图像特征为所述第i+1个卷积层的输入图像特征,所述输入图像特征是根据所述待处理图像得到的,i为小于或等于N的正整数。
可选地,作为一个实施例,所述待处理图像的增强图像特征为所述N个卷积层中的第N个卷积层输出的图像特征,所述处理单元920具体用于:通过以下等式得到所述待处理图像的增强图像特征:
L(F_N)=Φ(F_N)+s_l·(Φ(F_N)-F_N)；
其中，L(F_N)表示所述第N个卷积层的增强图像特征，F_N表示所述第N个卷积层的输入图像特征，Φ表示所述第N个卷积层的卷积核，s_l表示通过学习得到的参数。
可选地,作为一个实施例,所述处理单元920具体用于:
根据所述待处理图像的增强图像特征,得到所述待处理图像的置信图像特征与光照补偿图像特征,其中,所述置信图像特征用于对所述待处理图像进行颜色增强,所述光照补偿图像特征用于对所述待处理图像进行亮度增强;
根据所述待处理图像、所述置信图像特征以及所述光照补偿图像特征得到所述输出图像。
可选地,作为一个实施例,所述处理单元920还用于:
通过对所述待处理图像的增强图像特征进行卷积操作,得到所述置信图像特征与所述光照补偿图像特征;
通过所述待处理图像的图像特征与所述置信图像特征相乘得到所述待处理图像的颜色增强图像特征;
将所述颜色增强图像特征与所述光照补偿图像特征进行融合,得到所述输出图像。
在一个实施例中,图20所示的增强装置900也可以用于执行图17至图19所示的图像增强方法。
需要说明的是,上述图像增强装置900以功能单元的形式体现。这里的术语“单元”可以通过软件和/或硬件形式实现,对此不作具体限定。
例如,“单元”可以是实现上述功能的软件程序、硬件电路或二者结合。所述硬件电路可能包括应用特有集成电路(application specific integrated circuit,ASIC)、电子电路、用于执行一个或多个软件或固件程序的处理器(例如共享处理器、专有处理器或组处理器等)和存储器、合并逻辑电路和/或其它支持所描述的功能的合适组件。
因此,在本申请的实施例中描述的各示例的单元,能够以电子硬件、或者计算机软件和电子硬件的结合来实现。这些功能究竟以硬件还是软件方式来执行,取决于技术方案的特定应用和设计约束条件。专业技术人员可以对每个特定的应用来使用不同方法来实现所描述的功能,但是这种实现不应认为超出本申请的范围。
图21是本申请实施例提供的图像增强装置的硬件结构示意图。如图21所示的图像增强装置1000(该装置1000具体可以是一种计算机设备)包括存储器1001、处理器1002、通信接口1003以及总线1004。其中,存储器1001、处理器1002、通信接口1003通过总线1004实现彼此之间的通信连接。
存储器1001可以是只读存储器(read only memory,ROM),静态存储设备,动态存储设备或者随机存取存储器(random access memory,RAM)。存储器1001可以存储程序,当存储器1001中存储的程序被处理器1002执行时,处理器1002用于执行本申请实施例的图像增强方法的各个步骤,例如,执行图10至图15所示的各个步骤,或者,执行图17至图19所示的各个步骤。
应理解,本申请实施例所示的图像增强装置可以是服务器,例如,可以是云端的服务器,或者,也可以是配置于云端的服务器中的芯片;或者,本申请实施例所示的图像增强装置可以是智能终端,也可以是配置于智能终端中的芯片。
上述本申请实施例揭示的图像增强方法可以应用于处理器1002中,或者由处理器1002实现。处理器1002可能是一种集成电路芯片,具有信号的处理能力。在实现过程中,上述图像增强方法的各步骤可以通过处理器1002中的硬件的集成逻辑电路或者软件形式的指令完成。例如,处理器1002可以是包含图8所示的NPU的芯片。
上述的处理器1002可以是中央处理器(central processing unit,CPU)、图形处理器(graphics processing unit,GPU)、通用处理器、数字信号处理器(digital signal processor,DSP)、专用集成电路(application specific integrated circuit,ASIC)、现成可编程门阵列(field programmable gate array,FPGA)或者其他可编程逻辑器件、分立门或者晶体管逻辑器件、分立硬件组件。可以实现或者执行本申请实施例中的公开的各方法、步骤及逻辑框图。通用处理器可以是微处理器或者该处理器也可以是任何常规的处理器等。结合本申请实施例所公开的方法的步骤可以直接体现为硬件译码处理器执行完成,或者用译码处理器中的硬件及软件模块组合执行完成。软件模块可以位于随机存取存储器(random access memory,RAM)、闪存、只读存储器(read-only memory,ROM)、可编程只读存储器或者电可擦写可编程存储器、寄存器等本领域成熟的存储介质中。该存储介质位于存储器1001,处理器1002读取存储器1001中的指令,结合其硬件完成本申请实施中图20所示的图像增强装置中包括的单元所需执行的功能,或者,执行本申请方法实施例的图10至图15所示的图像增强方法,或者,执行本申请方法实施例的图17至图19所示的各个步骤。
通信接口1003使用例如但不限于收发器一类的收发装置,来实现装置1000与其他设备或通信网络之间的通信。
总线1004可包括在图像增强装置1000各个部件(例如,存储器1001、处理器1002、通信接口1003)之间传送信息的通路。
应注意,尽管上述图像增强装置1000仅仅示出了存储器、处理器、通信接口,但是在具体实现过程中,本领域的技术人员应当理解,图像增强装置1000还可以包括实现正常运行所必须的其他器件。同时,根据具体需要本领域的技术人员应当理解,上述图像增强装置1000还可包括实现其他附加功能的硬件器件。此外,本领域的技术人员应当理解,上述图像增强装置1000也可仅仅包括实现本申请实施例所必须的器件,而不必包括图21中所示的全部器件。
本申请实施例还提供一种芯片,该芯片包括收发单元和处理单元。其中,收发单元可以是输入输出电路、通信接口;处理单元为该芯片上集成的处理器或者微处理器或者集成电路。该芯片可以执行上述方法实施例中的图像增强方法。
本申请实施例还提供一种计算机可读存储介质,其上存储有指令,该指令被执行时执行上述方法实施例中的图像增强方法。
本申请实施例还提供一种包含指令的计算机程序产品,该指令被执行时执行上述方法实施例中的图像增强方法。
还应理解,本申请实施例中,该存储器可以包括只读存储器和随机存取存储器,并向处理器提供指令和数据。处理器的一部分还可以包括非易失性随机存取存储器。例如,处理器还可以存储设备类型的信息。
还应理解,本申请实施例中,该存储器可以包括只读存储器和随机存取存储器,并向处理器提供指令和数据。处理器的一部分还可以包括非易失性随机存取存储器。例如,处理器还可以存储设备类型的信息。
应理解,本文中术语“和/或”,仅仅是一种描述关联对象的关联关系,表示可以存在三种关系,例如,A和/或B,可以表示:单独存在A,同时存在A和B,单独存在B这三 种情况。另外,本文中字符“/”,一般表示前后关联对象是一种“或”的关系。
应理解,在本申请的各种实施例中,上述各过程的序号的大小并不意味着执行顺序的先后,各过程的执行顺序应以其功能和内在逻辑确定,而不应对本申请实施例的实施过程构成任何限定。
本领域普通技术人员可以意识到,结合本文中所公开的实施例描述的各示例的单元及算法步骤,能够以电子硬件、或者计算机软件和电子硬件的结合来实现。这些功能究竟以硬件还是软件方式来执行,取决于技术方案的特定应用和设计约束条件。专业技术人员可以对每个特定的应用来使用不同方法来实现所描述的功能,但是这种实现不应认为超出本申请的范围。
所属领域的技术人员可以清楚地了解到,为描述的方便和简洁,上述描述的***、装置和单元的具体工作过程,可以参考前述方法实施例中的对应过程,在此不再赘述。
在本申请所提供的几个实施例中,应该理解到,所揭露的***、装置和方法,可以通过其它的方式实现。例如,以上所描述的装置实施例仅仅是示意性的,例如,所述单元的划分,仅仅为一种逻辑功能划分,实际实现时可以有另外的划分方式,例如多个单元或组件可以结合或者可以集成到另一个***,或一些特征可以忽略,或不执行。另一点,所显示或讨论的相互之间的耦合或直接耦合或通信连接可以是通过一些接口,装置或单元的间接耦合或通信连接,可以是电性,机械或其它的形式。
所述作为分离部件说明的单元可以是或者也可以不是物理上分开的,作为单元显示的部件可以是或者也可以不是物理单元,即可以位于一个地方,或者也可以分布到多个网络单元上。可以根据实际的需要选择其中的部分或者全部单元来实现本实施例方案的目的。
另外,在本申请各个实施例中的各功能模块可以集成在一个处理单元中,也可以是各个单元单独物理存在,也可以两个或两个以上单元集成在一个单元中。
所述功能如果以软件功能单元的形式实现并作为独立的产品销售或使用时,可以存储在一个计算机可读取存储介质中。基于这样的理解,本申请的技术方案本质上或者说对现有技术做出贡献的部分或者该技术方案的部分可以以软件产品的形式体现出来,该计算机软件产品存储在一个存储介质中,包括若干指令用以使得一台计算机设备(可以是个人计算机,服务器,或者网络设备等)执行本申请各个实施例所述方法的全部或部分步骤。而前述的存储介质包括:U盘、移动硬盘、只读存储器(read-only memory,ROM)、随机存取存储器(random access memory,RAM)、磁碟或者光盘等各种可以存储程序代码的介质。
以上所述,仅为本申请的具体实施方式,但本申请的保护范围并不局限于此,任何熟悉本技术领域的技术人员在本申请揭露的技术范围内,可轻易想到变化或替换,都应涵盖在本申请的保护范围之内。因此,本申请的保护范围应以所述权利要求的保护范围为准。

Claims (15)

  1. 一种图像增强方法,其特征在于,包括:
    获取待处理图像;
    通过神经网络对所述待处理图像进行特征增强处理,得到所述待处理图像的增强图像特征,所述神经网络包括N个卷积层,N为正整数;
    根据所述增强图像特征对所述待处理图像进行颜色增强处理与亮度增强处理,得到输出图像。
  2. 如权利要求1所述的图像增强方法,其特征在于,所述通过神经网络对所述待处理图像进行特征增强处理,得到所述待处理图像的增强图像特征,包括:
    通过拉普拉斯增强算法对所述待处理图像进行所述特征增强处理,得到所述待处理图像的增强图像特征。
  3. 如权利要求2所述的图像增强方法,其特征在于,所述拉普拉斯增强算法用于根据所述N个卷积层中的第i个卷积层的残差特征对所述第i个卷积层的输入图像特征进行所述特征增强处理,得到所述第i个卷积层的增强图像特征,其中,所述残差特征表示所述第i个卷积层的输入图像特征与所述第i个卷积层中通过卷积操作处理后的图像特征之间的差值,所述第i个卷积层的增强图像特征为所述第i+1个卷积层的输入图像特征,所述输入图像特征是根据所述待处理图像得到的,i为小于或等于N的正整数。
  4. 如权利要求1至3中任一项所述的图像增强方法,其特征在于,所述待处理图像的增强图像特征为所述N个卷积层中第N个卷积层的增强图像特征,根据以下等式得到所述待处理图像的增强图像特征:
    L(F_N)=Φ(F_N)+s_l·(Φ(F_N)-F_N)；
    其中，L(F_N)表示所述第N个卷积层的增强图像特征，F_N表示所述第N个卷积层的输入图像特征，Φ表示所述第N个卷积层的卷积核，s_l表示通过学习得到的参数。
  5. 如权利要求1至4中任一项所述的图像增强方法,其特征在于,所述根据所述增强图像特征对所述待处理图像进行颜色增强处理与亮度增强处理,得到输出图像,包括:
    根据所述待处理图像的增强图像特征,得到所述待处理图像的置信图像特征与光照补偿图像特征,其中,所述置信图像特征用于对所述待处理图像进行颜色增强,所述光照补偿图像特征用于对所述待处理图像进行亮度增强;
    根据所述待处理图像、所述置信图像特征以及所述光照补偿图像特征得到所述输出图像。
  6. 如权利要求5所述的图像增强方法,其特征在于,还包括:
    通过对所述待处理图像的增强图像特征进行卷积操作,得到所述置信图像特征与所述光照补偿图像特征;
    通过所述待处理图像的图像特征与所述置信图像特征相乘得到所述待处理图像的颜色增强图像特征;
    将所述颜色增强图像特征与所述光照补偿图像特征进行融合,得到所述输出图像。
  7. 一种图像增强装置,其特征在于,包括:
    获取单元,用于获取待处理图像;
    处理单元,用于通过神经网络对所述待处理图像进行特征增强处理,得到所述待处理图像的增强图像特征,所述神经网络包括N个卷积层,N为正整数;根据所述增强图像特征对所述待处理图像进行颜色增强处理与亮度增强处理,得到输出图像。
  8. 如权利要求7所述的图像增强装置,其特征在于,所述处理单元具体用于:
    通过拉普拉斯增强算法对所述待处理图像进行所述特征增强处理,得到所述待处理图像的增强图像特征。
  9. 如权利要求8所述的图像增强装置,其特征在于,所述拉普拉斯增强算法用于根据所述N个卷积层中的第i个卷积层的残差特征对所述第i个卷积层的输入图像特征进行所述特征增强处理,得到所述第i个卷积层的增强图像特征,其中,所述残差特征表示所述第i个卷积层的输入图像特征与所述第i个卷积层中通过卷积操作处理后的图像特征之间的差值,所述第i个卷积层的增强图像特征为所述第i+1个卷积层的输入图像特征,所述输入图像特征是根据所述待处理图像得到的,i为小于或等于N的正整数。
  10. 如权利要求7至9中任一项所述的图像增强装置,其特征在于,所述待处理图像的增强图像特征为所述N个卷积层中的第N个卷积层输出的图像特征,所述处理单元具体用于:
    根据以下等式得到所述待处理图像的增强图像特征:
    L(F_N)=Φ(F_N)+s_l·(Φ(F_N)-F_N)；
    其中，L(F_N)表示所述第N个卷积层的增强图像特征，F_N表示所述第N个卷积层的输入图像特征，Φ表示所述第N个卷积层的卷积核，s_l表示通过学习得到的缩放参数。
  11. 如权利要求7至10中任一项所述的图像增强装置,其特征在于,所述处理单元具体用于:
    根据所述待处理图像的增强图像特征,得到所述待处理图像的置信图像特征与光照补偿图像特征,其中,所述置信图像特征用于对所述待处理图像进行颜色增强,所述光照补偿图像特征用于对所述待处理图像进行亮度增强;
    根据所述待处理图像、所述置信图像特征以及所述光照补偿图像特征得到所述输出图像。
  12. 如权利要求11所述的图像增强装置,其特征在于,所述处理单元还用于:
    通过对所述待处理图像的增强图像特征进行卷积操作,得到所述置信图像特征与所述光照补偿图像特征;
    通过所述待处理图像的图像特征与所述置信图像特征相乘得到所述待处理图像的颜色增强图像特征;
    将所述颜色增强图像特征与所述光照补偿图像特征进行融合,得到所述输出图像。
  13. 一种图像增强装置,其特征在于,包括:
    存储器,用于存储程序;
    处理器,用于执行所述存储器存储的程序,当所述处理器执行所述存储器存储的程序时,所述处理器用于执行权利要求1至6中任一项所述的图像增强方法。
  14. 一种计算机可读介质,其特征在于,所述计算机可读介质存储有程序代码,当所述计算机程序代码在计算机上运行时,使得计算机执行如权利要求1至6中任一项所述的 图像增强方法。
  15. 一种芯片,其特征在于,包括:处理器与数据接口,所述处理器通过所述数据接口读取存储器上存储的指令,以执行如权利要求1至6中任一项所述的图像增强方法。
PCT/CN2020/118721 2019-09-30 2020-09-29 图像增强方法以及装置 WO2021063341A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910943355.7 2019-09-30
CN201910943355.7A CN112581379A (zh) 2019-09-30 2019-09-30 图像增强方法以及装置

Publications (1)

Publication Number Publication Date
WO2021063341A1 true WO2021063341A1 (zh) 2021-04-08

Family

ID=75117268

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/118721 WO2021063341A1 (zh) 2019-09-30 2020-09-29 图像增强方法以及装置

Country Status (2)

Country Link
CN (1) CN112581379A (zh)
WO (1) WO2021063341A1 (zh)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114708250A (zh) * 2022-04-24 2022-07-05 上海人工智能创新中心 一种图像处理方法、装置及存储介质
US11468543B1 (en) 2021-08-27 2022-10-11 Hong Kong Applied Science and Technology Research Institute Company Limited Neural-network for raw low-light image enhancement
CN117612115A (zh) * 2024-01-24 2024-02-27 山东高速信息集团有限公司 一种基于高速公路的车辆识别方法

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115222606A (zh) * 2021-04-16 2022-10-21 腾讯科技(深圳)有限公司 图像处理方法、装置、计算机可读介质及电子设备
CN113034407B (zh) * 2021-04-27 2022-07-05 深圳市慧鲤科技有限公司 图像处理方法及装置、电子设备和存储介质
CN113392804B (zh) * 2021-07-02 2022-08-16 昆明理工大学 一种基于多角度的交警目标数据集的场景化构建方法及***
CN115601244B (zh) * 2021-07-07 2023-12-12 荣耀终端有限公司 图像处理方法、装置和电子设备
CN113852759B (zh) * 2021-09-24 2023-04-18 豪威科技(武汉)有限公司 图像增强方法及拍摄装置
CN116433800B (zh) * 2023-06-14 2023-10-20 中国科学技术大学 基于社交场景用户偏好与文本联合指导的图像生成方法
CN117575969B (zh) * 2023-10-31 2024-05-07 广州成至智能机器科技有限公司 一种红外图像画质增强方法、装置、电子设备及存储介质

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120155759A1 (en) * 2010-12-21 2012-06-21 Microsoft Corporation Establishing clusters of user preferences for image enhancement
CN109658345A (zh) * 2018-11-13 2019-04-19 建湖云飞数据科技有限公司 一种图像处理方法
CN109753978A (zh) * 2017-11-01 2019-05-14 腾讯科技(深圳)有限公司 图像分类方法、装置以及计算机可读存储介质
CN110163235A (zh) * 2018-10-11 2019-08-23 腾讯科技(深圳)有限公司 图像增强模型的训练、图像增强方法、装置和存储介质
CN110378848A (zh) * 2019-07-08 2019-10-25 中南大学 一种基于衍生图融合策略的图像去雾方法

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11030722B2 (en) * 2017-10-04 2021-06-08 Fotonation Limited System and method for estimating optimal parameters

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120155759A1 (en) * 2010-12-21 2012-06-21 Microsoft Corporation Establishing clusters of user preferences for image enhancement
CN109753978A (zh) * 2017-11-01 2019-05-14 腾讯科技(深圳)有限公司 图像分类方法、装置以及计算机可读存储介质
CN110163235A (zh) * 2018-10-11 2019-08-23 腾讯科技(深圳)有限公司 图像增强模型的训练、图像增强方法、装置和存储介质
CN109658345A (zh) * 2018-11-13 2019-04-19 建湖云飞数据科技有限公司 一种图像处理方法
CN110378848A (zh) * 2019-07-08 2019-10-25 中南大学 一种基于衍生图融合策略的图像去雾方法

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11468543B1 (en) 2021-08-27 2022-10-11 Hong Kong Applied Science and Technology Research Institute Company Limited Neural-network for raw low-light image enhancement
CN114708250A (zh) * 2022-04-24 2022-07-05 上海人工智能创新中心 一种图像处理方法、装置及存储介质
CN114708250B (zh) * 2022-04-24 2024-06-07 上海人工智能创新中心 一种图像处理方法、装置及存储介质
CN117612115A (zh) * 2024-01-24 2024-02-27 山东高速信息集团有限公司 一种基于高速公路的车辆识别方法
CN117612115B (zh) * 2024-01-24 2024-05-03 山东高速信息集团有限公司 一种基于高速公路的车辆识别方法

Also Published As

Publication number Publication date
CN112581379A (zh) 2021-03-30

Similar Documents

Publication Publication Date Title
WO2021063341A1 (zh) 图像增强方法以及装置
WO2021164234A1 (zh) 图像处理方法以及图像处理装置
WO2021164731A1 (zh) 图像增强方法以及图像增强装置
US12008797B2 (en) Image segmentation method and image processing apparatus
WO2020253416A1 (zh) 物体检测方法、装置和计算机存储介质
WO2021043273A1 (zh) 图像增强方法和装置
US20230214976A1 (en) Image fusion method and apparatus and training method and apparatus for image fusion model
CN110532871B (zh) 图像处理的方法和装置
WO2021043168A1 (zh) 行人再识别网络的训练方法、行人再识别方法和装置
WO2020192483A1 (zh) 图像显示方法和设备
WO2020177607A1 (zh) 图像去噪方法和装置
CN112446380A (zh) 图像处理方法和装置
WO2021018106A1 (zh) 行人检测方法、装置、计算机可读存储介质和芯片
WO2022134971A1 (zh) 一种降噪模型的训练方法及相关装置
US20230177641A1 (en) Neural network training method, image processing method, and apparatus
CN111667399A (zh) 风格迁移模型的训练方法、视频风格迁移的方法以及装置
CN113066017B (zh) 一种图像增强方法、模型训练方法及设备
EP4006776A1 (en) Image classification method and apparatus
WO2021103731A1 (zh) 一种语义分割方法、模型训练方法及装置
CN113076685A (zh) 图像重建模型的训练方法、图像重建方法及其装置
WO2021042774A1 (zh) 图像恢复方法、图像恢复网络训练方法、装置和存储介质
CN115861380B (zh) 雾天低照度场景下端到端无人机视觉目标跟踪方法及装置
CN112529904A (zh) 图像语义分割方法、装置、计算机可读存储介质和芯片
WO2024002211A1 (zh) 一种图像处理方法及相关装置
CN114627034A (zh) 一种图像增强方法、图像增强模型的训练方法及相关设备

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20872333

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20872333

Country of ref document: EP

Kind code of ref document: A1