CN110399881B - End-to-end quality enhancement method and device based on binocular stereo image - Google Patents

Info

Publication number
CN110399881B
CN110399881B
Authority
CN
China
Prior art keywords
network
quality
quality image
image
low
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910624300.XA
Other languages
Chinese (zh)
Other versions
CN110399881A (en)
Inventor
邹文斌
金枝
彭映青
唐毅
李霞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen University
Original Assignee
Shenzhen University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen University filed Critical Shenzhen University
Priority to CN201910624300.XA priority Critical patent/CN110399881B/en
Publication of CN110399881A publication Critical patent/CN110399881A/en
Application granted granted Critical
Publication of CN110399881B publication Critical patent/CN110399881B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/25Fusion techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/25Fusion techniques
    • G06F18/253Fusion techniques of extracted features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/34Smoothing or thinning of the pattern; Morphological operations; Skeletonisation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Multimedia (AREA)
  • Image Processing (AREA)
  • Image Analysis (AREA)

Abstract

According to the end-to-end quality enhancement method and device based on binocular stereo images disclosed by the embodiments of the invention, the low-quality image and the high-quality image of a binocular stereo pair are each input to a feature extraction network for feature extraction; the resulting shallow feature maps of the low-quality and high-quality images are each input to an information distillation network for information distillation; the resulting high-level feature maps of the two images are input to an information fusion network for information fusion; and finally the fused feature map is input to a long short-term memory network that learns the visual disparity of the stereo pair, so that a quality-enhanced image of the low-quality image is reconstructed. In this end-to-end image quality enhancement algorithm, the high-quality image guides the reconstruction of the low-quality image, and the visual disparity of the stereo pair is learned by a long short-term memory network built on information fusion, which increases the running speed, avoids error propagation, guarantees the rate and accuracy of image reconstruction, and reduces memory and computing-resource consumption.

Description

End-to-end quality enhancement method and device based on binocular stereo image
Technical Field
The invention relates to the technical field of image processing, in particular to an end-to-end quality enhancement method and device based on binocular stereo images.
Background
In recent years, as the additional information carried by visual disparity has been increasingly exploited, quality enhancement of stereoscopic images has become an active research area.
Since the pioneering work of the Super-Resolution Convolutional Neural Network (SRCNN), learning-based methods have been widely adopted to enhance image quality. Currently, commonly used stereo image enhancement methods rely on stereo matching to learn the correspondence between the two views of a stereo pair and use matching costs to model long-range dependencies in the network; however, learning an accurate correspondence between the views is very difficult because of the large differences between the two viewpoints of a stereo pair. In addition, methods based mainly on Convolutional Neural Networks (CNN) resize the input image before feeding it into the network and adopt deeper recursive networks to obtain better reconstruction performance, but such methods require a large amount of computing resources and memory.
Disclosure of Invention
The embodiments of the present invention mainly aim to provide an end-to-end quality enhancement method and apparatus based on binocular stereo images, which can at least solve the problems in the related art that, when enhancing stereo image quality, it is difficult to learn an accurate correspondence between the two views of a stereo pair and a large amount of computing resources and memory is consumed.
In order to achieve the above object, a first aspect of embodiments of the present invention provides an end-to-end quality enhancement method based on binocular stereo images, which is applied to an overall neural network including a first quality enhancement branch network, a second quality enhancement branch network, an information fusion network, and a long short-term memory network, where the first quality enhancement branch network and the second quality enhancement branch network each include a feature extraction network and an information distillation network, and the method includes:
inputting a low-quality image and a high-quality image of a binocular stereo pair into the first quality enhancement branch network and the second quality enhancement branch network, respectively, and performing feature extraction processing on the input images through the respective feature extraction networks to obtain shallow feature maps corresponding to the low-quality image and the high-quality image, respectively;
inputting the shallow feature maps of the low-quality image and the high-quality image into the respective information distillation networks for information distillation processing to obtain high-level feature maps corresponding to the low-quality image and the high-quality image, respectively; wherein the high-level feature maps carry a larger amount of useful information than the shallow feature maps;
inputting both the high-level feature map of the low-quality image and the high-level feature map of the high-quality image into the information fusion network, and performing information fusion processing on the low-quality and high-quality image features to obtain a fused feature map;
inputting the fused feature map into the long short-term memory network, learning position information under the fields of view corresponding to the low-quality image and the high-quality image, respectively, and reconstructing a quality-enhanced image of the low-quality image according to the obtained position difference information between the low-quality image and the high-quality image.
In order to achieve the above object, a second aspect of the embodiments of the present invention provides an end-to-end quality enhancement apparatus based on binocular stereo images, applied to an overall neural network including a first quality enhancement branch network, a second quality enhancement branch network, an information fusion network, and a long short-term memory network, where the first quality enhancement branch network and the second quality enhancement branch network each include a feature extraction network and an information distillation network, the apparatus including:
an extraction module, configured to input a low-quality image and a high-quality image of a binocular stereo pair into the first quality enhancement branch network and the second quality enhancement branch network, respectively, and perform feature extraction processing on the input images through the respective feature extraction networks to obtain shallow feature maps corresponding to the low-quality image and the high-quality image, respectively;
a distillation module, configured to input the shallow feature maps of the low-quality image and the high-quality image into the respective information distillation networks for information distillation processing to obtain high-level feature maps corresponding to the low-quality image and the high-quality image, respectively; wherein the high-level feature maps carry a larger amount of useful information than the shallow feature maps;
a fusion module, configured to input both the high-level feature map of the low-quality image and the high-level feature map of the high-quality image into the information fusion network and perform information fusion processing on the low-quality and high-quality image features to obtain a fused feature map;
and a reconstruction module, configured to input the fused feature map into the long short-term memory network, learn position information under the fields of view corresponding to the low-quality image and the high-quality image, respectively, and reconstruct a quality-enhanced image of the low-quality image according to the obtained position difference information between the low-quality image and the high-quality image.
According to the end-to-end quality enhancement method and apparatus based on binocular stereo images provided by the embodiments of the invention, the low-quality image and the high-quality image of a binocular stereo pair are each input to a feature extraction network for feature extraction; the resulting shallow feature maps of the two images are each input to an information distillation network for information distillation; the resulting high-level feature maps of the two images are input to the information fusion network for information fusion; and the resulting fused feature map is input to the long short-term memory network to learn the visual disparity of the stereo pair, so that a quality-enhanced image of the input low-quality image is reconstructed. In this end-to-end image quality enhancement algorithm, the reconstruction of the low-quality image is guided by the high-quality image, and the visual disparity of the stereo pair is learned by a long short-term memory network built on information fusion, which greatly increases the running speed, avoids the error propagation of multi-stage processing, guarantees the rate and accuracy of image reconstruction, and reduces memory and computing-resource consumption.
Other features and corresponding effects of the present invention are set forth in the following portions of the specification, and it should be understood that at least some of the effects are apparent from the description of the present invention.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art are briefly described below. It is obvious that the drawings in the following description show only some embodiments of the present invention, and that those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a schematic diagram of a network framework of an overall neural network according to a first embodiment of the present invention;
fig. 2 is a schematic basic flow chart of a quality enhancement method according to a first embodiment of the present invention;
FIG. 3 is a schematic view of a basic flow chart of an information distillation processing method according to a first embodiment of the present invention;
fig. 4 is a schematic diagram of a network architecture of a DBlock according to a first embodiment of the present invention;
fig. 5 is a schematic diagram of a network architecture of an LDBlock according to a first embodiment of the present invention;
FIG. 6 is a schematic structural diagram of a quality enhancing apparatus according to a second embodiment of the present invention;
fig. 7 is a schematic structural diagram of an electronic device according to a third embodiment of the invention.
Detailed Description
In order to make the objects, features and advantages of the present invention more obvious and understandable, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The first embodiment:
In order to solve the technical problems in the related art that, when enhancing stereo image quality, it is difficult to learn an accurate correspondence between the two views of a stereo pair and a large amount of computing resources and memory is consumed, the present embodiment provides an end-to-end quality enhancement method based on binocular stereo images. The method is applied to an overall neural network comprising a first quality enhancement branch network, a second quality enhancement branch network, an information fusion network, and a long short-term memory network, where the first quality enhancement branch network and the second quality enhancement branch network each comprise a feature extraction network and an information distillation network. Fig. 1 shows the network framework of the overall neural network provided by this embodiment, in which A is the first quality enhancement branch network and B is the second quality enhancement branch network. The framework adopts a two-stream design with separate feature extraction and information distillation networks for the high-quality and low-quality images, so that the abundant additional information carried by the binocular stereo pair is used efficiently.
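The two-stream dataflow described above can be summarized in the following minimal PyTorch sketch. This is an illustration only: the class and attribute names (StereoQualityEnhancer, low_branch, high_branch, fusion, reconstructor) are assumptions and are not taken from the patent.

```python
import torch.nn as nn

class StereoQualityEnhancer(nn.Module):
    """Sketch of the overall framework: two branches, fusion, and LSTM-based reconstruction."""
    def __init__(self, low_branch, high_branch, fusion, reconstructor):
        super().__init__()
        self.low_branch = low_branch        # branch A: feature extraction + long information distillation
        self.high_branch = high_branch      # branch B: feature extraction + information distillation
        self.fusion = fusion                # information fusion network
        self.reconstructor = reconstructor  # LSTM stage plus residual reconstruction

    def forward(self, img_low, img_high):
        feat_low = self.low_branch(img_low)      # high-level features of the low-quality view
        feat_high = self.high_branch(img_high)   # high-level features of the high-quality view
        fused = self.fusion(feat_low, feat_high)
        return self.reconstructor(fused)         # quality-enhanced image of the low-quality view
```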
As shown in fig. 2, a basic flow diagram of the quality enhancement method provided in this embodiment is shown, and the quality enhancement method provided in this embodiment includes the following steps:
step 201, respectively inputting a low-quality image and a high-quality image in a binocular stereo image to a first quality enhancement branch network and a second quality enhancement branch network, and respectively performing feature extraction processing on the input images through a feature extraction network to obtain shallow feature maps respectively corresponding to the low-quality image and the high-quality image.
Specifically, in the binocular stereo image of the present embodiment, the image quality of the two views differs: one view corresponds to a low-quality image and the other to a high-quality image. The low-quality image and the high-quality image are each used as the input of the feature extraction network of one of the quality enhancement branch networks for feature extraction. In this embodiment, the feature extraction network may consist of two convolutional layers with 3 × 3 kernels, and its output is a 64-channel feature map.
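A minimal sketch of such a feature extraction stage is given below. The ReLU activations and zero padding are assumptions; the text only specifies two 3 × 3 convolutions and a 64-channel output.

```python
import torch.nn as nn

class FeatureExtractor(nn.Module):
    """Two stacked 3x3 convolutions producing a 64-channel shallow feature map."""
    def __init__(self, in_channels=3, channels=64):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(in_channels, channels, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return self.body(x)  # shallow feature map fed to the information distillation network
```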
Step 202, inputting the shallow characteristic diagrams of the low-quality image and the high-quality image into an information distillation network respectively for information distillation processing to obtain high-level characteristic diagrams corresponding to the low-quality image and the high-quality image respectively; the high-level feature map has a larger amount of useful information than the shallow feature map.
Specifically, in this embodiment, the feature map output by the feature extraction network is further input to the information distillation network, and the extracted feature map has more useful information than the feature map output by the feature extraction network. In practical applications, in order to further ensure that the extracted feature map has as much useful information as possible, the information distillation network in this embodiment may be composed of a plurality of identical information distillation sub-networks in cascade.
As shown in fig. 3, which is a schematic flow chart of the information distillation processing method provided in this embodiment, optionally, the information distillation network includes: an image quality enhancement network and a compression network; the method specifically comprises the following steps of inputting the shallow characteristic diagrams of the low-quality image and the high-quality image into an information distillation network respectively for information distillation processing, and obtaining the high-level characteristic diagrams respectively corresponding to the low-quality image and the high-quality image:
step 301, inputting shallow feature maps of a low-quality image and a high-quality image into an image quality enhancement network respectively to perform information enhancement processing, so as to obtain quality-enhanced feature maps;
and step 302, inputting the quality-enhanced feature map into a compression network for information compression processing to obtain a high-level feature map corresponding to the low-quality image and the high-quality image respectively.
Specifically, in this embodiment, the information distillation network consists of two modules, an image quality enhancement network and a compression network. The image quality enhancement network enhances the information of the image, thereby increasing the useful information in the feature map; information compression is then performed by the compression network, which can use a 1 × 1 convolution kernel to reduce dimensionality and the number of parameters, and which can also greatly increase the nonlinearity of the features while keeping the size of the feature map unchanged.
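A sketch of one such information distillation sub-network is given below. Only the 1 × 1 compression convolution is stated explicitly in the text, so the depth and activations of the enhancement stage are assumptions; as noted above, several identical units may be cascaded to form the full information distillation network.

```python
import torch.nn as nn

class DistillUnit(nn.Module):
    """One information distillation sub-network: enhancement stage + 1x1 compression."""
    def __init__(self, channels=64, enhance_layers=3):
        super().__init__()
        layers = []
        for _ in range(enhance_layers):  # image quality enhancement network (depth assumed)
            layers += [nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(inplace=True)]
        self.enhance = nn.Sequential(*layers)
        self.compress = nn.Conv2d(channels, channels, 1)  # compression network (1x1 kernel)

    def forward(self, x):
        return self.compress(self.enhance(x))  # high-level feature map with distilled information
```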
Optionally, the first quality enhancement branch network and the second quality enhancement branch network each further include an attention mechanism network. Before the high-level feature maps of the low-quality image and the high-quality image are input into the information fusion network, the method further comprises: inputting the high-level feature maps of the low-quality image and the high-quality image into the attention mechanism network respectively, and selectively learning the features in the high-level feature maps of the two images to obtain high-level feature maps of the low-quality image and the high-quality image from which more useful information has been extracted.
Specifically, in this embodiment, after the high-level feature map is output by the information distillation network, an attention mechanism is further added on top of the information distillation network in order to extract more key and useful information: larger weights are assigned to the useful key features in the extracted feature map so that more detailed information about the target of interest is obtained and the network concentrates on learning more useful features. It should be noted that the attention mechanism network can be configured either inside or outside the information distillation network, and in both cases more representative features can be learned. The information distillation network is formed by cascading one or more information distillation sub-networks, and an attention mechanism network can be placed after each sub-network or on the image quality enhancement unit inside each sub-network. In this embodiment, the placement of the attention mechanism network is not uniquely limited and may be determined according to the specific requirements of the application.
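A common realization of such channel-wise attention is a squeeze-and-excitation style module, sketched below. This specific form is an assumption; the text only states that larger weights are assigned to useful features.

```python
import torch.nn as nn

class ChannelAttention(nn.Module):
    """Channel-wise attention: re-weight feature channels by learned importance."""
    def __init__(self, channels=64, reduction=16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)           # per-channel global descriptor
        self.fc = nn.Sequential(
            nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
            nn.Sigmoid(),                             # per-channel weights in (0, 1)
        )

    def forward(self, x):
        return x * self.fc(self.pool(x))              # emphasise useful channels
```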
It should be noted that, in the present embodiment, the process comprising the feature extraction operation, the information distillation operation and the attention mechanism operation can be expressed as D_i = C(D_{i-1}(f(x))), i = 1, …, n, and P_i = P(D_i), where x denotes an input image (the low-quality image or the high-quality image fed to the corresponding feature extraction network), f denotes the feature extraction operation, D_i denotes the function of the i-th information distillation network, C denotes the attention mechanism operation, and P denotes the compression operation.
Optionally, the network depth and the network width of the information distillation network in the first quality enhancement branch network are greater than those of the information distillation network in the second quality enhancement branch network.
Specifically, since a low-quality image requires a deeper network than a high-quality image to learn more features, in the present embodiment a long information distillation module (LDBlock) is used in the first quality enhancement branch network, which processes the low-quality image, to extract useful information, while an information distillation module (DBlock) is used in the second quality enhancement branch network, which processes the high-quality image. The LDBlock is deeper and wider than the DBlock, so the information it can extract is at a higher level of abstraction.
Fig. 4 is a schematic diagram of the network architecture of the DBlock provided in this embodiment, and fig. 5 is a schematic diagram of the network architecture of the LDBlock, where S denotes a split (branch) operation, C denotes concatenation across channels, and Channel-wise denotes an attention mechanism network; the LDBlock and the DBlock in this embodiment both comprise a plurality of sub-networks. Compared with the DBlock, the deeper and wider LDBlock splits the feature map and then applies stacked convolutions to extract more information. In order to reduce the number of network parameters, this embodiment uses grouped convolution (with 4 groups) in the second convolutional layer of each sub-network. In addition, the LDBlock and the DBlock both use 3 × 3 filters, which increases representation capability while reducing the number of parameters and thus improves computational efficiency. Furthermore, considering the interrelation of features across channels, this embodiment adaptively learns the feature map using an attention mechanism embedded in the information distillation module.
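One sub-network of this kind might look like the sketch below. The grouped 3 × 3 convolution with 4 groups, the channel split/concatenation, and the embedded channel-wise attention follow the description; the split ratio, layer ordering, and attention bottleneck size are assumptions.

```python
import torch
import torch.nn as nn

class DBlockUnit(nn.Module):
    """Sketch of one distillation sub-network: conv, grouped conv, split/concat, channel attention."""
    def __init__(self, channels=64, groups=4):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, groups=groups)  # grouped convolution
        self.conv3 = nn.Conv2d(channels // 2, channels // 2, 3, padding=1)
        self.attn = nn.Sequential(                      # embedded channel-wise attention
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // 4, 1), nn.ReLU(inplace=True),
            nn.Conv2d(channels // 4, channels, 1), nn.Sigmoid(),
        )

    def forward(self, x):
        y = torch.relu(self.conv1(x))
        y = torch.relu(self.conv2(y))
        kept, refined = torch.split(y, y.size(1) // 2, dim=1)  # S: split the channels
        refined = torch.relu(self.conv3(refined))
        y = torch.cat([kept, refined], dim=1)                  # C: concatenate across channels
        return y * self.attn(y)                                # re-weight channels
```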
And 203, inputting the high-level feature maps of the low-quality image and the high-level feature map of the high-quality image into the information fusion network, and performing information fusion processing on the low-quality image features and the high-quality image features to obtain a fusion feature map.
Specifically, in this embodiment, after feature extraction is performed on the low-quality image and the high-quality image through the two-stream network formed by the first quality enhancement branch network and the second quality enhancement branch network, the extracted features are fused through the information fusion network. The information fusion process in the present embodiment can be expressed as F_0 = F(I_low + I_high), where I_low denotes the output of the information distillation network of the first quality enhancement branch network, I_high denotes the output of the information distillation network of the second quality enhancement branch network, F denotes the information fusion operation on the high-quality and low-quality image features, and F_0 denotes the output of the information fusion network. Furthermore, it should be understood that if the attention mechanism operation is performed after the information distillation operation, the high-level feature maps of the low-quality image and the high-quality image output by the attention mechanism network, from which more useful information has been extracted, should instead be input to the information fusion network.
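Read literally, F_0 = F(I_low + I_high) suggests element-wise addition of the two feature maps followed by a learned fusion operator. The convolutional form of F sketched below is an assumption (channel concatenation would be an equally plausible reading):

```python
import torch.nn as nn

class FusionNet(nn.Module):
    """Sketch of the information fusion network: F0 = F(I_low + I_high)."""
    def __init__(self, channels=64):
        super().__init__()
        self.fuse = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
        )

    def forward(self, feat_low, feat_high):
        return self.fuse(feat_low + feat_high)  # fused feature map F0
```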
And step 204, inputting the fusion feature map into a long-term and short-term memory network, learning position information under the visual fields respectively corresponding to the low-quality image and the high-quality image, and reconstructing to obtain a quality enhanced image of the low-quality image according to the acquired position difference information of the low-quality image and the high-quality image.
Specifically, in this embodiment, exploiting the similarity of the features, a long short-term memory network (LSTM) is used to learn the correspondence between positions in the two fields of view and to handle the disparity change between the two viewpoints of the stereo pair, so that the left and right images of the binocular stereo pair remain consistent in position. In order to improve feature utilization, this embodiment also combines the feature map from before the LSTM, and the high-quality image guides the quality enhancement reconstruction of the low-quality image, so that a high-quality result can be reconstructed more effectively. In the present embodiment, the image reconstruction operation can be expressed as y = L(F_0) + F_0, where y is the final quality-enhanced image of the low-quality image, L denotes the LSTM function, and F_0 denotes the output of the information fusion network.
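A heavily hedged sketch of this stage is given below. Scanning each row of the fused feature map with a bidirectional LSTM (disparity in a rectified stereo pair lies along the width axis) and the final 3 × 3 projection to an image are assumptions; the text only fixes the residual form y = L(F_0) + F_0 and the use of an LSTM to model the positional difference between the views.

```python
import torch.nn as nn

class DisparityLSTM(nn.Module):
    """Scan the fused features along the width axis and reconstruct residually."""
    def __init__(self, channels=64, out_channels=3):
        super().__init__()
        self.lstm = nn.LSTM(channels, channels, batch_first=True, bidirectional=True)
        self.proj = nn.Conv2d(2 * channels, channels, 1)
        self.to_image = nn.Conv2d(channels, out_channels, 3, padding=1)

    def forward(self, f0):
        b, c, h, w = f0.shape
        seq = f0.permute(0, 2, 3, 1).reshape(b * h, w, c)   # one sequence per image row
        out, _ = self.lstm(seq)                              # learn positional correspondence
        out = out.reshape(b, h, w, 2 * c).permute(0, 3, 1, 2)
        features = self.proj(out) + f0                       # residual combination L(F0) + F0
        return self.to_image(features)                       # quality-enhanced low-quality view
```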
Furthermore, in an alternative implementation of this embodiment, the loss function of the overall neural network is expressed as Loss = n1(L) + n2(L_lstm); the explicit forms of the main term (denoted L here) and of L_lstm are given as formula images in the original publication and are not reproduced. In these expressions, Loss is the loss function of the overall neural network, L_lstm is the loss function of the long short-term memory network, n1 and n2 are the weight hyperparameters of the corresponding terms, α is the parameter set of the overall neural network, F is the predicted image generated by the overall neural network, I_low is the low-quality image, I_high is the high-quality image, I_GT is the high-quality image of the low-quality view before compression, and I_lstm is the quality-enhanced image of the low-quality image.
Specifically, the network of the present embodiment is optimized by a loss function, and the present embodiment designs two loss functions, including a total loss function of the whole network and a loss function of the LSTM network, wherein the total loss function is used to measure a difference between a predicted image of a low-quality image and a corresponding high-quality image before compression of the low-quality image, and the loss function of the LSTM network is used to optimize a difference between positions of the low-quality image and the high-quality image.
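As a hedged illustration only, the two-term objective could be computed as below. The per-term formulas are not recoverable from this text, so L1 distances against I_GT, the default weights, and which reference I_lstm is compared against are all assumptions.

```python
import torch.nn.functional as F

def total_loss(pred, lstm_out, target_gt, n1=1.0, n2=0.5):
    # pred: final predicted image of the whole network (F in the text)
    # lstm_out: quality-enhanced image produced by the LSTM stage (I_lstm)
    # target_gt: high-quality image of the low-quality view before compression (I_GT)
    # n1, n2: weight hyperparameters of the two loss terms (values assumed)
    loss_main = F.l1_loss(pred, target_gt)      # overall reconstruction term (form assumed)
    loss_lstm = F.l1_loss(lstm_out, target_gt)  # LSTM term; reference image assumed to be I_GT
    return n1 * loss_main + n2 * loss_lstm
```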
According to the end-to-end quality enhancement method based on binocular stereo images provided by this embodiment of the invention, the low-quality image and the high-quality image of a binocular stereo pair are each input to a feature extraction network for feature extraction; the resulting shallow feature maps of the two images are each input to an information distillation network for information distillation; the resulting high-level feature maps of the two images are input to the information fusion network for information fusion; and the resulting fused feature map is input to the long short-term memory network to learn the visual disparity of the stereo pair, so that a quality-enhanced image of the input low-quality image is reconstructed. In this end-to-end image quality enhancement algorithm, the reconstruction of the low-quality image is guided by the high-quality image, and the visual disparity of the stereo pair is learned by a long short-term memory network built on information fusion, which greatly increases the running speed, avoids the error propagation of multi-stage processing, guarantees the rate and accuracy of image reconstruction, and reduces memory and computing-resource consumption.
Second embodiment:
In order to solve the technical problems in the related art that, when enhancing stereo image quality, it is difficult to learn an accurate correspondence between the two views of a stereo pair and a large amount of computing resources and memory is consumed, this embodiment provides an end-to-end quality enhancement device based on binocular stereo images. The device is applied to an overall neural network comprising a first quality enhancement branch network, a second quality enhancement branch network, an information fusion network, and a long short-term memory network, where the first quality enhancement branch network and the second quality enhancement branch network each comprise a feature extraction network and an information distillation network. Referring specifically to fig. 6, the quality enhancement device of this embodiment includes:
the extracting module 601 is configured to input a low-quality image and a high-quality image in a binocular stereo image to the first quality enhancement branch network and the second quality enhancement branch network, and perform feature extraction processing on the input images through the feature extraction networks, so as to obtain shallow feature maps corresponding to the low-quality image and the high-quality image;
a distillation module 602, configured to input the shallow feature maps of the low-quality image and the high-quality image into the respective information distillation networks for information distillation processing to obtain high-level feature maps corresponding to the low-quality image and the high-quality image, respectively, wherein the high-level feature maps carry a larger amount of useful information than the shallow feature maps;
a fusion module 603, configured to input both the high-level feature map of the low-quality image and the high-level feature map of the high-quality image into the information fusion network and perform information fusion processing on the low-quality and high-quality image features to obtain a fused feature map;
and a reconstruction module 604, configured to input the fused feature map into the long short-term memory network, learn position information under the fields of view corresponding to the low-quality image and the high-quality image, respectively, and reconstruct a quality-enhanced image of the low-quality image according to the obtained position difference information between the low-quality image and the high-quality image.
In an alternative embodiment of this embodiment, the information distillation network comprises: image quality enhancement networks and compression networks. Correspondingly, the distillation module 602 is specifically configured to input the shallow feature maps of the low-quality image and the high-quality image to an image quality enhancement network for information enhancement processing, so as to obtain quality-enhanced feature maps; and inputting the feature map with enhanced quality into a compression network for information compression processing to obtain high-level feature maps respectively corresponding to the low-quality image and the high-quality image.
In an optional implementation of this embodiment, the first quality enhancement branch network and the second quality enhancement branch network each further include an attention mechanism network, and the quality enhancement device further comprises a learning module. The learning module is configured to, before the high-level feature maps of the low-quality image and the high-quality image are input into the information fusion network, input these high-level feature maps into the attention mechanism network and selectively learn the features therein, obtaining high-level feature maps of the low-quality image and the high-quality image from which more useful information has been extracted. Correspondingly, the fusion module 603 is specifically configured to input both of these high-level feature maps, from which more useful information has been extracted, into the information fusion network and perform information fusion processing on the low-quality and high-quality image features to obtain a fused feature map.
In an optional implementation manner of this embodiment, the network depth and the network width of the information distillation network in the first quality enhancement branch network are greater than those of the information distillation network in the second quality enhancement branch network.
Further, in an alternative implementation of this embodiment, the loss function of the overall neural network is expressed as Loss = n1(L) + n2(L_lstm); as above, the explicit forms of the main term (denoted L here) and of L_lstm are given as formula images in the original publication. Loss is the loss function of the overall neural network, L_lstm is the loss function of the long short-term memory network, n1 and n2 are the weight hyperparameters of the corresponding terms, α is the parameter set of the overall neural network, F is the predicted image generated by the overall neural network, I_low is the low-quality image, I_high is the high-quality image, I_GT is the high-quality image of the low-quality view before compression, and I_lstm is the quality-enhanced image of the low-quality image.
It should be noted that, the end-to-end quality enhancement method based on the binocular stereo image in the foregoing embodiment can be implemented based on the end-to-end quality enhancement device based on the binocular stereo image provided in this embodiment, and it can be clearly understood by those skilled in the art that, for convenience and simplicity of description, the specific working process of the end-to-end quality enhancement device based on the binocular stereo image described in this embodiment may refer to the corresponding process in the foregoing method embodiment, and is not described herein again.
With the end-to-end quality enhancement device based on binocular stereo images provided by this embodiment, the low-quality image and the high-quality image of a binocular stereo pair are each input to a feature extraction network for feature extraction; the resulting shallow feature maps of the two images are each input to an information distillation network for information distillation; the resulting high-level feature maps of the two images are input to the information fusion network for information fusion; and the resulting fused feature map is input to the long short-term memory network to learn the visual disparity of the stereo pair, so that a quality-enhanced image of the input low-quality image is reconstructed. In this end-to-end image quality enhancement algorithm, the reconstruction of the low-quality image is guided by the high-quality image, and the visual disparity of the stereo pair is learned by a long short-term memory network built on information fusion, which greatly increases the running speed, avoids the error propagation of multi-stage processing, guarantees the rate and accuracy of image reconstruction, and reduces memory and computing-resource consumption.
The third embodiment:
the present embodiment provides an electronic apparatus, as shown in fig. 7, which includes a processor 701, a memory 702, and a communication bus 703, wherein: the communication bus 703 is used for realizing connection communication between the processor 701 and the memory 702; the processor 701 is configured to execute one or more computer programs stored in the memory 702 to implement at least one step of the end-to-end binocular stereo image based quality enhancement method in the first embodiment.
The present embodiments also provide a computer-readable storage medium, including volatile or non-volatile, removable or non-removable media implemented in any method or technology for storage of information such as computer-readable instructions, data structures, computer program modules or other data. Computer-readable storage media include, but are not limited to, RAM (Random Access Memory), ROM (Read-Only Memory), EEPROM (Electrically Erasable Programmable Read-Only Memory), flash memory or other memory technology, CD-ROM (Compact Disc Read-Only Memory), Digital Versatile Disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by a computer.
The computer-readable storage medium in this embodiment may be used for storing one or more computer programs, and the stored one or more computer programs may be executed by a processor to implement at least one step of the method in the first embodiment.
The present embodiment also provides a computer program, which can be distributed on a computer readable medium and executed by a computing device to implement at least one step of the method in the first embodiment; and in some cases at least one of the steps shown or described may be performed in an order different than that described in the embodiments above.
The present embodiments also provide a computer program product comprising a computer readable means on which a computer program as shown above is stored. The computer readable means in this embodiment may include a computer readable storage medium as shown above.
It will be apparent to those skilled in the art that all or some of the steps of the methods, systems, functional modules/units in the devices disclosed above may be implemented as software (which may be implemented in computer program code executable by a computing device), firmware, hardware, and suitable combinations thereof. In a hardware implementation, the division between functional modules/units mentioned in the above description does not necessarily correspond to the division of physical components; for example, one physical component may have multiple functions, or one function or step may be performed by several physical components in cooperation. Some or all of the physical components may be implemented as software executed by a processor, such as a central processing unit, digital signal processor, or microprocessor, or as hardware, or as an integrated circuit, such as an application specific integrated circuit.
In addition, communication media typically embodies computer readable instructions, data structures, computer program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism and includes any information delivery media as known to one of ordinary skill in the art. Thus, the present invention is not limited to any specific combination of hardware and software.
The foregoing is a more detailed description of embodiments of the present invention, and the present invention is not to be considered limited to such descriptions. For those skilled in the art to which the invention pertains, several simple deductions or substitutions can be made without departing from the spirit of the invention, and all shall be considered as belonging to the protection scope of the invention.

Claims (8)

1. An end-to-end quality enhancement method based on binocular stereo images, applied to an overall neural network comprising a first quality enhancement branch network, a second quality enhancement branch network, an information fusion network and a long short-term memory network, wherein the first quality enhancement branch network and the second quality enhancement branch network each comprise a feature extraction network and an information distillation network, the quality enhancement method being characterized by comprising the following steps:
inputting a low-quality image and a high-quality image of a binocular stereo pair into the first quality enhancement branch network and the second quality enhancement branch network, respectively, and performing feature extraction processing on the input images through the respective feature extraction networks to obtain shallow feature maps corresponding to the low-quality image and the high-quality image, respectively;
inputting the shallow feature maps of the low-quality image and the high-quality image into the respective information distillation networks for information distillation processing to obtain high-level feature maps corresponding to the low-quality image and the high-quality image, respectively, wherein the high-level feature maps carry a larger amount of useful information than the shallow feature maps;
inputting both the high-level feature map of the low-quality image and the high-level feature map of the high-quality image into the information fusion network, and performing information fusion processing on the low-quality and high-quality image features to obtain a fused feature map;
inputting the fused feature map into the long short-term memory network, learning position information under the fields of view corresponding to the low-quality image and the high-quality image, respectively, and reconstructing a quality-enhanced image of the low-quality image according to the obtained position difference information between the low-quality image and the high-quality image.
2. The quality enhancement method of claim 1, wherein the information distillation network comprises: an image quality enhancement network and a compression network;
the step of inputting the shallow feature maps of the low-quality image and the high-quality image into the respective information distillation networks for information distillation processing to obtain the high-level feature maps corresponding to the low-quality image and the high-quality image, respectively, comprises:
inputting the shallow feature maps of the low-quality image and the high-quality image into the image quality enhancement network for information enhancement processing to obtain quality-enhanced feature maps;
and inputting the quality-enhanced feature maps into the compression network for information compression processing to obtain the high-level feature maps corresponding to the low-quality image and the high-quality image, respectively.
3. The quality enhancement method of claim 1, wherein the first quality enhancement branch network and the second quality enhancement branch network further comprise: an attention mechanism network;
before the high-level feature maps of the low-quality image and the high-level feature maps of the high-quality image are input into the information fusion network, the method further comprises the following steps:
respectively inputting the high-level feature maps of the low-quality image and the high-level feature maps of the high-quality image into the attention mechanism network, selectively learning the features in the high-level feature maps of the low-quality image and the high-quality image, and obtaining the high-level feature maps of the low-quality image and the high-level feature maps of the high-quality image, which extract more useful information;
the inputting the high-level feature maps of the low-quality image and the high-quality image into the information fusion network comprises:
and inputting the high-level feature maps of the low-quality image and the high-quality image, from which more useful information has been extracted, into the information fusion network.
4. The quality enhancement method of claim 1, wherein a network depth and a network width of the information distillation network in the first quality enhancement branch network are greater than those of the information distillation network in the second quality enhancement branch network.
5. An end-to-end quality enhancement device based on binocular stereo images, applied to an overall neural network comprising a first quality enhancement branch network, a second quality enhancement branch network, an information fusion network and a long short-term memory network, wherein the first quality enhancement branch network and the second quality enhancement branch network each comprise a feature extraction network and an information distillation network, the quality enhancement device being characterized by comprising:
an extraction module, configured to input a low-quality image and a high-quality image of a binocular stereo pair into the first quality enhancement branch network and the second quality enhancement branch network, respectively, and perform feature extraction processing on the input images through the respective feature extraction networks to obtain shallow feature maps corresponding to the low-quality image and the high-quality image, respectively;
a distillation module, configured to input the shallow feature maps of the low-quality image and the high-quality image into the respective information distillation networks for information distillation processing to obtain high-level feature maps corresponding to the low-quality image and the high-quality image, respectively, wherein the high-level feature maps carry a larger amount of useful information than the shallow feature maps;
a fusion module, configured to input both the high-level feature map of the low-quality image and the high-level feature map of the high-quality image into the information fusion network and perform information fusion processing on the low-quality and high-quality image features to obtain a fused feature map;
and a reconstruction module, configured to input the fused feature map into the long short-term memory network, learn position information under the fields of view corresponding to the low-quality image and the high-quality image, respectively, and reconstruct a quality-enhanced image of the low-quality image according to the obtained position difference information between the low-quality image and the high-quality image.
6. The quality enhancement device of claim 5, wherein the information distillation network comprises: an image quality enhancement network and a compression network;
the distillation module is specifically configured to input the shallow feature maps of the low-quality image and the high-quality image into the image quality enhancement network for information enhancement processing to obtain quality-enhanced feature maps, and to input the quality-enhanced feature maps into the compression network for information compression processing to obtain the high-level feature maps corresponding to the low-quality image and the high-quality image, respectively.
7. The quality enhancement device of claim 5, wherein the first quality enhancement branch network and the second quality enhancement branch network each further comprise an attention mechanism network, and the quality enhancement device further comprises a learning module;
the learning module is configured to, before the high-level feature maps of the low-quality image and the high-quality image are input into the information fusion network, input the high-level feature maps of the low-quality image and the high-quality image into the attention mechanism network respectively and selectively learn the features therein, obtaining high-level feature maps of the low-quality image and the high-quality image from which more useful information has been extracted;
the fusion module is specifically configured to input the high-level feature maps of the low-quality image and the high-quality image, from which more useful information has been extracted, into the information fusion network and perform information fusion processing on the low-quality and high-quality image features to obtain a fused feature map.
8. The quality enhancement device of claim 5, wherein a network depth and a network width of the information distillation network in the first quality enhancement branch network are greater than those of the information distillation network in the second quality enhancement branch network.
CN201910624300.XA 2019-07-11 2019-07-11 End-to-end quality enhancement method and device based on binocular stereo image Active CN110399881B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910624300.XA CN110399881B (en) 2019-07-11 2019-07-11 End-to-end quality enhancement method and device based on binocular stereo image

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910624300.XA CN110399881B (en) 2019-07-11 2019-07-11 End-to-end quality enhancement method and device based on binocular stereo image

Publications (2)

Publication Number Publication Date
CN110399881A CN110399881A (en) 2019-11-01
CN110399881B true CN110399881B (en) 2021-06-01

Family

ID=68325378

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910624300.XA Active CN110399881B (en) 2019-07-11 2019-07-11 End-to-end quality enhancement method and device based on binocular stereo image

Country Status (1)

Country Link
CN (1) CN110399881B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111862034B (en) * 2020-07-15 2023-06-30 平安科技(深圳)有限公司 Image detection method, device, electronic equipment and medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107403415A (en) * 2017-07-21 2017-11-28 深圳大学 Compression depth plot quality Enhancement Method and device based on full convolutional neural networks
CN108769671A (en) * 2018-06-13 2018-11-06 天津大学 Stereo image quality evaluation method based on adaptive blending image
CN109360178A (en) * 2018-10-17 2019-02-19 天津大学 Based on blending image without reference stereo image quality evaluation method

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107392868A (en) * 2017-07-21 2017-11-24 深圳大学 Compression binocular image quality enhancement method and device based on full convolutional neural networks
CN107578404B (en) * 2017-08-22 2019-11-15 浙江大学 View-based access control model notable feature is extracted complete with reference to objective evaluation method for quality of stereo images
CN109523506B (en) * 2018-09-21 2021-03-26 浙江大学 Full-reference stereo image quality objective evaluation method based on visual salient image feature enhancement
CN109714592A (en) * 2019-01-31 2019-05-03 天津大学 Stereo image quality evaluation method based on binocular fusion network

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107403415A (en) * 2017-07-21 2017-11-28 深圳大学 Compression depth plot quality Enhancement Method and device based on full convolutional neural networks
CN108769671A (en) * 2018-06-13 2018-11-06 天津大学 Stereo image quality evaluation method based on adaptive blending image
CN109360178A (en) * 2018-10-17 2019-02-19 天津大学 Based on blending image without reference stereo image quality evaluation method

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
No-reference stereo image quality assessment based on deep feature learning; Fu Zhenqi et al.; Journal of Optoelectronics·Laser; 2018-05-31; Vol. 29, No. 5; pp. 545-552 *

Also Published As

Publication number Publication date
CN110399881A (en) 2019-11-01

Similar Documents

Publication Publication Date Title
CN109101975B (en) Image semantic segmentation method based on full convolution neural network
Pang et al. Hierarchical dynamic filtering network for RGB-D salient object detection
CN108416327B (en) Target detection method and device, computer equipment and readable storage medium
Sun et al. Deep pixel‐to‐pixel network for underwater image enhancement and restoration
Wang et al. Multi-scale dilated convolution of convolutional neural network for image denoising
CN111402170B (en) Image enhancement method, device, terminal and computer readable storage medium
CN111340077B (en) Attention mechanism-based disparity map acquisition method and device
CN110852961A (en) Real-time video denoising method and system based on convolutional neural network
CN112819157B (en) Neural network training method and device, intelligent driving control method and device
CN111639230B (en) Similar video screening method, device, equipment and storage medium
Ma et al. Modal complementary fusion network for RGB-T salient object detection
CN111833360A (en) Image processing method, device, equipment and computer readable storage medium
CN115115540A (en) Unsupervised low-light image enhancement method and unsupervised low-light image enhancement device based on illumination information guidance
CN115660955A (en) Super-resolution reconstruction model, method, equipment and storage medium for efficient multi-attention feature fusion
CN114780768A (en) Visual question-answering task processing method and system, electronic equipment and storage medium
CN114627035A (en) Multi-focus image fusion method, system, device and storage medium
CN116452810A (en) Multi-level semantic segmentation method and device, electronic equipment and storage medium
Luo et al. Multi-scale receptive field fusion network for lightweight image super-resolution
CN110399881B (en) End-to-end quality enhancement method and device based on binocular stereo image
CN115082306A (en) Image super-resolution method based on blueprint separable residual error network
CN108986210B (en) Method and device for reconstructing three-dimensional scene
CN113033448B (en) Remote sensing image cloud-removing residual error neural network system, method and equipment based on multi-scale convolution and attention and storage medium
Shen et al. RSHAN: Image super-resolution network based on residual separation hybrid attention module
CN110490876B (en) Image segmentation method based on lightweight neural network
CN115984934A (en) Training method of face pose estimation model, face pose estimation method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant