WO2023070447A1 - Model training method, image processing method, computing processing device and non-transitory computer-readable medium - Google Patents

Model training method, image processing method, computing processing device and non-transitory computer-readable medium

Info

Publication number: WO2023070447A1
Authority: WO (WIPO, PCT)
Prior art keywords: image, feature map, decoding, level, processing
Application number: PCT/CN2021/127078
Other languages: English (en), French (fr)
Inventors: 段然, 陈冠男
Original Assignee: 京东方科技集团股份有限公司
Application filed by 京东方科技集团股份有限公司
Priority to CN202180003171.8A (CN116368500A)
Priority to PCT/CN2021/127078 (WO2023070447A1)
Publication of WO2023070447A1


Classifications

    • G06T 5/73: Image enhancement or restoration; Deblurring; Sharpening
    • G06N 3/02: Neural networks
    • G06N 3/08: Learning methods
    • G06T 5/60: Image enhancement or restoration using machine learning, e.g. neural networks
    • G06V 10/44: Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G06V 10/82: Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G06V 40/1318: Fingerprint or palmprint sensors using electro-optical elements or layers, e.g. electroluminescent sensing
    • G06V 40/1359: Extracting features related to ridge properties; Determining the fingerprint type, e.g. whorl or loop
    • G06T 2207/20081: Training; Learning
    • G06T 2207/20084: Artificial neural networks [ANN]

Definitions

  • the present disclosure relates to the field of computer technology, in particular to a model training method, an image processing method, a computing processing device and a non-transitory computer-readable medium.
  • optical under-screen fingerprint recognition uses a point light source under the screen to illuminate the finger. The light is reflected by the finger and then received by the optical sensor under the screen. Because the light intensities reflected by fingerprint valleys and fingerprint ridges differ, a fingerprint image can be generated. Since the optical under-screen fingerprint collection system can collect fingerprints over a large area at low hardware cost, it has high production value.
  • the present disclosure provides a model training method, including:
  • the samples in the sample set include blurred images and clear images of the same fingerprint
  • the blurred image is input into the convolutional neural network; the encoding network in the convolutional neural network performs down-sampling and feature extraction on the blurred image and outputs a plurality of feature maps, and the decoding network in the convolutional neural network performs up-sampling and feature extraction on the feature maps and outputs a predicted image corresponding to the blurred image; wherein the encoding network includes a plurality of encoding levels and the decoding network includes a plurality of decoding levels; the feature map obtained at the F-th encoding level in the encoding network is fused with the feature map obtained at the G-th decoding level in the decoding network, and the fusion result is used as the input of the (G+1)-th decoding level in the decoding network; the feature map obtained at the F-th encoding level has the same resolution as the feature map obtained at the G-th decoding level, and F and G are both positive integers;
  • the convolutional neural network with parameter adjustment is determined as the image processing model.
  • each of the encoding levels includes a first convolutional block and/or a downsampling block
  • each of the decoding levels includes a second convolutional block and/or an upsampling block
  • At least one of the first convolution block, the down-sampling block, the second convolution block and the up-sampling block includes at least one set of asymmetric convolution kernels.
  • the encoding network includes N encoding modules, each encoding module includes M encoding levels, and M and N are both positive integers; the step in which the encoding network in the convolutional neural network performs down-sampling and feature extraction on the blurred image and outputs a plurality of feature maps includes:
  • the first encoding level of the first encoding module in the N encoding modules performs feature extraction on the blurred image
  • the i-th encoding level of the first encoding module sequentially performs downsampling and feature extraction on the feature map obtained by processing the i-1th encoding level of the first encoding module; wherein, the i is greater than or equal to 2, and less than or equal to M;
  • the first coding level of the j-th coding module among the N coding modules performs feature extraction on the feature map obtained by processing the first coding level of the j-1-th coding module; wherein, the j is greater than or equal to 2 , and less than or equal to N;
  • the i-th encoding level of the j-th encoding module down-samples the feature map obtained at the (i-1)-th encoding level of the j-th encoding module, fuses the down-sampled feature map with the feature map obtained at the i-th encoding level of the (j-1)-th encoding module, and performs feature extraction on the fusion result;
  • the plurality of feature maps include feature maps obtained by processing at each coding level of the Nth coding module among the N coding modules.
  • the decoding network includes M decoding levels, and the step in which the decoding network in the convolutional neural network performs up-sampling and feature extraction on the feature maps and outputs the predicted image corresponding to the blurred image includes:
  • the first decoding level of the M decoding levels performs feature extraction on the feature map obtained by processing the Mth encoding level of the Nth encoding module, and performs upsampling on the extracted feature map;
  • the step of fusing the down-sampled feature map with the feature map obtained at the i-th encoding level of the (j-1)-th encoding module and performing feature extraction on the fusion result includes:
  • the step of fusing the feature map obtained at the (u-1)-th decoding level among the M decoding levels with the feature map obtained at the (M-u+1)-th encoding level of the N-th encoding module to obtain a first fused feature map includes:
  • the step of fusing the feature map obtained at the (M-1)-th decoding level among the M decoding levels with the feature map obtained at the first encoding level of the N-th encoding module to obtain a second fused feature map includes:
  • both the first convolution block and the second convolution block include a first convolutional layer and a second convolutional layer; the first convolutional layer includes the asymmetric convolution kernel, and the second convolutional layer includes a 1×1 convolution kernel;
  • the downsampling block includes a maximum pooling layer and a minimum pooling layer, the maximum pooling layer and the minimum pooling layer both include the asymmetric convolution kernel;
  • the asymmetric convolution kernel includes a 1 ⁇ k convolution kernel and a k ⁇ 1 convolution kernel, and the k is greater than or equal to 2.
  • the convolution kernels in the encoding layer and the decoding layer are both symmetric convolution kernels.
  • the encoding network includes P encoding levels; the step in which the encoding network in the convolutional neural network performs down-sampling and feature extraction on the blurred image and outputs a plurality of feature maps includes:
  • the qth coding level in the P coding levels sequentially performs feature extraction and downsampling on the feature map obtained by processing the q-1th coding level;
  • the q is greater than or equal to 2 and less than or equal to P
  • the plurality of feature maps include feature maps obtained by processing the P coding levels.
  • the decoding network includes P decoding levels, and the step in which the decoding network in the convolutional neural network performs up-sampling and feature extraction on the feature maps and outputs the predicted image corresponding to the blurred image includes:
  • the r is greater than or equal to 2 and less than or equal to P
  • the predicted image is a feature map obtained through processing at a P-th decoding level among the P decoding levels.
  • the step of merging the calculated feature map with the feature map obtained from the Pth encoding level processing to obtain a third fused feature map includes:
  • the step of fusing the feature map obtained at the (r-1)-th decoding level among the P decoding levels with the feature map obtained at the (P-r+1)-th encoding level among the P encoding levels to obtain a fourth fused feature map includes:
  • the step of calculating the loss value of the convolutional neural network according to the predicted image, the clear image and a preset loss function includes:
  • the loss value is calculated according to the following formula:
  • loss = (1 / (W × H × C)) × Σ_{x=1..W} Σ_{y=1..H} Σ_{z=1..C} [ (1 − λ) · |Y(x, y, z) − X(x, y, z)| + λ · |E(Y)(x, y, z) − E(X)(x, y, z)| ]
  • wherein loss is the loss value, Y is the predicted image, X is the clear image, W is the width of the predicted image, H is the height of the predicted image, C is the number of channels of the predicted image, E(Y) is the edge map of the predicted image, E(X) is the edge map of the clear image, λ is greater than or equal to 0 and less than or equal to 1, x is a positive integer greater than or equal to 1 and less than or equal to W, y is a positive integer greater than or equal to 1 and less than or equal to H, and z is a positive integer greater than or equal to 1 and less than or equal to C.
  • the step of acquiring a sample set includes:
  • preprocessing the original image to obtain the blurred image, wherein the preprocessing includes at least one of the following: image segmentation, size cropping, flipping, brightness enhancement, noise processing and normalization processing.
  • the step of preprocessing the original image to obtain the blurred image includes:
  • the blurred image includes the normalized first image, the second image and the third image.
  • the original image includes a first pixel value of a first pixel
  • the step of performing image segmentation on the original image to obtain a first image, a second image and a third image includes:
  • if the first pixel is located outside the preset area and the first pixel value is greater than or equal to a first threshold and less than or equal to a second threshold, the pixel value of the first pixel in the first image is determined to be the first pixel value;
  • if the first pixel is located outside the preset area and the first pixel value is less than the first threshold or greater than the second threshold, the pixel value of the first pixel in the first image is 0;
  • if the first pixel is located outside the preset area and the first pixel value is greater than or equal to a third threshold and less than or equal to a fourth threshold, the pixel value of the first pixel in the second image is determined to be the first pixel value;
  • if the first pixel is located outside the preset area and the first pixel value is less than the third threshold or greater than the fourth threshold, the pixel value of the first pixel in the second image is 0;
  • if the first pixel is located within the preset area, the pixel value of the first pixel in the third image is determined to be the first pixel value;
  • the third threshold is greater than the second threshold.
  • the step of performing image segmentation on the original image to obtain a first image, a second image and a third image includes:
  • Edge detection is performed on the original image, and the original image is divided into the first image, the second image and the third image according to the position and length of the detected edge.
  • the step of performing normalization processing on the first image, the second image and the third image respectively includes:
  • the image to be processed is any one of the first image, the second image and the third image, and the image to be processed includes a second pixel value of a second pixel;
  • the present disclosure provides an image processing method, including:
  • the step of obtaining the blurred fingerprint image includes:
  • preprocessing the original fingerprint image to obtain the blurred fingerprint image, wherein the preprocessing includes at least one of the following: image segmentation, size cropping, flipping, brightness enhancement, noise processing and normalization processing.
  • the present disclosure provides a computing processing device, including:
  • one or more processors; when the computer-readable code is executed by the one or more processors, the computing processing device performs the method described in any one of the above.
  • the present disclosure provides a non-transitory computer-readable medium storing computer-readable code which, when run on a computing processing device, causes the computing processing device to execute the method according to any one of the above.
  • Fig. 1 schematically shows a schematic diagram of fingerprint image acquisition under an optical screen
  • Figure 2 schematically shows a schematic diagram of a multi-point light source imaging solution
  • Fig. 3 schematically shows a schematic flow chart of a model training method
  • Fig. 4 schematically shows a group of original images and clear images
  • Fig. 5 schematically shows a schematic flow chart of acquiring a clear image
  • Fig. 6 schematically shows a schematic diagram of a blurred image
  • Fig. 7 schematically shows a schematic diagram of a first image, a second image and a third image
  • Fig. 8 schematically shows a schematic structural diagram of the first convolutional neural network
  • Fig. 9 schematically shows a schematic diagram of the structure of the first convolution block
  • Fig. 10 schematically shows a schematic diagram of the structure of a downsampling block
  • Fig. 11 schematically shows a schematic structural diagram of the second convolutional neural network
  • Fig. 12 schematically shows a schematic flowchart of an image processing method
  • Fig. 13 schematically shows a structural block diagram of a model training device
  • Fig. 14 schematically shows a structural block diagram of an image processing device
  • Figure 15 schematically illustrates a block diagram of a computing processing device for performing a method according to the present disclosure
  • Fig. 16 schematically shows a storage unit for holding or carrying program codes for realizing the method according to the present disclosure.
  • optical under-screen fingerprint recognition uses a point light source under the screen to illuminate the finger; the light is reflected by the finger and then received by the photosensitive element under the screen, and because the light intensities reflected by fingerprint valleys and ridges differ, a fingerprint image can be generated.
  • a fingerprint image with a larger area and higher intensity is generally obtained by turning on multiple point light sources under the screen at the same time.
  • an ideal fingerprint image cannot be obtained.
  • the fingerprint images corresponding to each point light source are too discrete to be spliced into a complete fingerprint image; in order to obtain a complete fingerprint image, multiple point light sources need to be densely arranged. This will cause the fingerprint images corresponding to the point light sources to overlap with each other.
  • FIG. 3 schematically shows a flow chart of a model training method. As shown in FIG. 3 , the method may include the following steps.
  • Step S31 Obtain a sample set, the samples in the sample set include blurred images and clear images of the same fingerprint.
  • the execution subject of this embodiment may be a computer device, the computer device has a model training device, and the model training method provided in this embodiment is executed by the model training device.
  • the computer device may be, for example, a smart phone, a tablet computer, a personal computer, etc., which is not limited in this embodiment.
  • the execution body of this embodiment can obtain the sample set in various ways.
  • the execution subject may obtain the sample stored therein from another server (such as a database server) for storing data through a wired connection or a wireless connection.
  • the execution subject may obtain samples collected by an off-screen fingerprint collection device, etc., and store these samples locally, thereby generating a sample set.
  • multiple point light sources can be turned on at the same time on the under-screen fingerprint collection device to collect different fingers of the fingerprint collection personnel multiple times, and the original images can be generated by the imaging module inside the device, as shown on the left of Figure 4.
  • the original image may be, for example, an image in 16-bit png format.
  • the blurred image may be an original image directly generated by the under-screen fingerprint collection device, or may be an image obtained by preprocessing the original image, which is not limited in the present disclosure.
  • the left and right images in Figure 4 show the original image and clear image of the same fingerprint, respectively.
  • both the original image and the clear image may be grayscale images of a single channel.
  • the original image and the clear image may also be multi-channel color images, such as RGB images.
  • FIG. 5 a schematic flowchart of obtaining a clear image is shown.
  • for a certain fingerprint (such as a finger), the point light sources can be turned on one at a time to obtain a fingerprint image corresponding to each single point light source, and a clear image of the fingerprint is finally obtained by cropping and splicing the fingerprint images corresponding to the multiple single point light sources.
  • the number of point light sources shown in Figure 5 is 4, which are respectively point light source 1, point light source 2, point light source 3 and point light source 4, and turn on the four point light sources in turn to obtain fingerprint images corresponding to four single point light sources.
  • a clear image is obtained after cropping and stitching the fingerprint images corresponding to the four single-point light sources.
  • step S31 may specifically include: first obtaining the original image of the same fingerprint; then performing preprocessing on the original image to obtain a blurred image; wherein the preprocessing includes at least one of the following: image segmentation, size Cropping, flipping, brightness enhancement, noise handling and normalization.
  • the blurred image is an image obtained by preprocessing the original image.
  • the original image obtained by turning on multiple point light sources at the same time not only contains the fingerprint information generated by the photosensitive element receiving the light reflected by the fingerprint (located in area a in Figure 6), but also contains the noise information introduced by ambient light (located in area b in Figure 6) and the light information near the point light sources (located in area c in Figure 6).
  • area a contains the main fingerprint information
  • area b contains a lot of ambient light noise and a small amount of weak fingerprint information
  • area c contains strong light source signals and a small amount of fingerprint information.
  • before training the convolutional neural network, the original image can first be segmented to obtain a first image corresponding to area a (as shown in a in Figure 7), a second image corresponding to area b (as shown in b in Figure 7) and a third image corresponding to area c (as shown in c in Figure 7).
  • the first image, the second image and the third image respectively contain image information of different regions in the original image.
  • the separation of primary data and secondary data can be realized, and the influence of ambient light and point light source on the fingerprint image can be reduced.
  • the inventors also found that, when the pixel values range from 0 to 65535, most of the pixel values in area a, which contains the main fingerprint information, are distributed below 10000; that is, the pixel values in area a mainly lie in the low value range, while the pixel values in area b, and especially area c, lie in higher value ranges. Therefore, to obtain more fingerprint information from area a and prevent the loss of the main fingerprint information, the first image, the second image and the third image obtained by image segmentation can be normalized separately. In Figure 7, a shows the first image after normalization, b shows the second image after normalization, and c shows the third image after normalization.
  • the blurred image includes the normalized first image, second image and third image.
  • the blurred image may be a three-channel image obtained by splicing the normalized first image, second image, and third image in the channel dimension.
  • the image segmentation of the original image may use a threshold segmentation method, or an edge detection method, etc., which is not limited in the present disclosure.
  • the original image includes a first pixel value for a first pixel.
  • the steps of performing image segmentation on the original image to obtain the first image, the second image and the third image may include:
  • if the first pixel is located outside the preset area and the first pixel value is greater than or equal to the first threshold and less than or equal to the second threshold, the pixel value of the first pixel in the first image is determined to be the first pixel value; if the first pixel is located outside the preset area and the first pixel value is less than the first threshold or greater than the second threshold, the pixel value of the first pixel in the first image is determined to be 0.
  • if the first pixel is located outside the preset area and the first pixel value is greater than or equal to the third threshold and less than or equal to the fourth threshold, the pixel value of the first pixel in the second image is determined to be the first pixel value; if the first pixel is located outside the preset area and the first pixel value is less than the third threshold or greater than the fourth threshold, the pixel value of the first pixel in the second image is determined to be 0.
  • the third threshold is greater than the second threshold. That is, the pixel values of the area b are generally higher than the pixel values of the area a.
  • the first pixel is within the range of the preset area, then determine the pixel value of the first pixel in the third image as the first pixel value.
  • the first image corresponds to area a
  • area a can be segmented from the original image according to the following formula:
  • I_a(x, y) = I(x, y), if min_a ≤ I(x, y) ≤ max_a; otherwise I_a(x, y) = 0
  • wherein I(x, y) represents the pixel value at coordinates (x, y) in the original image, I_a(x, y) represents the pixel value at coordinates (x, y) in the first image, min_a is the first threshold, and max_a is the second threshold.
  • the specific values of the first threshold and the second threshold can be determined by manually selecting a relatively smooth area within area a and performing statistics on the pixel values of the original image in that area, so as to determine the minimum value and the maximum value of area a.
  • the first threshold may be the average value of the minimum values of the a region of multiple original images
  • the second threshold may be the average value of the maximum values of the a region of the multiple original images.
  • region a can be separated from the original image by the above formula, similar to image matting.
  • the pixel values of region b and region c are both 0.
  • the second image corresponds to area b
  • area b can be segmented from the original image according to the following formula:
  • I_b(x, y) = I(x, y), if min_b ≤ I(x, y) ≤ max_b; otherwise I_b(x, y) = 0
  • wherein I(x, y) represents the pixel value at coordinates (x, y) in the original image, I_b(x, y) represents the pixel value at coordinates (x, y) in the second image, min_b is the third threshold, and max_b is the fourth threshold.
  • the specific values of the third threshold and the fourth threshold can be determined by manually selecting a relatively smooth area within area b and performing statistics on the pixel values of the original image in that area, so as to determine the minimum value and the maximum value of area b.
  • the third threshold may be the average value of the minimum values of the b region of multiple original images
  • the fourth threshold may be the average value of the maximum values of the b region of multiple original images.
  • region b can be separated from the original image by the above formula, which is similar to matting.
  • the pixel values of region a and region c are both 0.
  • the segmentation of area c, which corresponds to the third image, can be performed according to the positions of the point light sources in the fingerprint image.
  • since the positions of the point light sources are fixed, the coordinates of the preset area are also fixed; the coordinates of the point light sources and the radius of the light spot can be measured directly to determine the preset area, thereby realizing the segmentation of area c.
  • the pixel values of the a area and the b area are both 0.
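  • As an illustration of the threshold-based segmentation described above, the following is a minimal NumPy sketch. It assumes the preset point-light-source area is available as a boolean mask and that the four thresholds have already been estimated from smooth reference regions; the function and argument names are illustrative and not taken from the disclosure.

```python
import numpy as np

def segment_fingerprint(original, light_mask, min_a, max_a, min_b, max_b):
    """Split a raw under-screen fingerprint image into three sub-images.

    original   : 2-D array of raw pixel values (e.g. 16-bit)
    light_mask : boolean array, True inside the preset light-spot area (region c)
    min_a/max_a: first/second thresholds bounding region-a pixel values
    min_b/max_b: third/fourth thresholds bounding region-b pixel values
    """
    outside = ~light_mask

    # First image (region a): keep values inside [min_a, max_a] outside the light spots.
    img_a = np.where(outside & (original >= min_a) & (original <= max_a), original, 0)

    # Second image (region b): keep values inside [min_b, max_b] outside the light spots.
    img_b = np.where(outside & (original >= min_b) & (original <= max_b), original, 0)

    # Third image (region c): keep everything inside the preset light-spot area.
    img_c = np.where(light_mask, original, 0)

    return img_a, img_b, img_c
```

  • After normalization, the three sub-images can be stacked along the channel dimension (for example with np.stack) to form the three-channel blurred input described above.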
  • the step of performing image segmentation on the original image to obtain the first image, the second image and the third image may include: performing edge detection on the original image, and according to the position and length of the detected edge, Segment the original image into a first image, a second image, and a third image.
  • the Laplacian edge detection algorithm can be used to detect the edge of the original image, filter the length and position of the detected edge, and use the finally extracted edge as the boundary of each region for segmentation.
  • the Laplacian edge detection algorithm can detect the boundary between area a and area b and the boundary between area a and area c, but it may also detect boundaries caused by noise and the border of the effective recognition area. The boundaries caused by noise can be screened out according to boundary length, and the border of the effective recognition area can be screened out according to boundary position. Because edge detection is relatively fast, using the edge detection method for image segmentation can improve segmentation efficiency.
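  • A possible OpenCV sketch of this edge-detection route is shown below; the binarization threshold and the minimum edge length are illustrative values, and screening by boundary position is omitted for brevity.

```python
import cv2
import numpy as np

def detect_region_boundaries(original, min_edge_len=200):
    """Return contours long enough to be region borders rather than noise."""
    # Rescale to 8-bit for OpenCV; the raw fingerprint image may be 16-bit.
    img8 = cv2.normalize(original, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)

    # The Laplacian response highlights sharp transitions between regions a, b and c.
    lap = cv2.Laplacian(img8, cv2.CV_16S, ksize=3)
    edges = (cv2.convertScaleAbs(lap) > 20).astype(np.uint8)  # binarize the response

    contours, _ = cv2.findContours(edges, cv2.RETR_LIST, cv2.CHAIN_APPROX_NONE)
    # Short boundaries are assumed to be noise and are screened out by length.
    return [c for c in contours if cv2.arcLength(c, closed=False) >= min_edge_len]
```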
  • the image to be processed includes the second pixel value of the second pixel.
  • the step of normalizing the first image, the second image, and the third image may include: first determining the maximum and minimum values of all pixel values contained in the image to be processed; and then according to The maximum value, the minimum value and the second pixel value determine the pixel value of the second pixel in the image to be processed after normalization processing.
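  • A minimal sketch of this per-image min-max normalization is given below; mapping to the range [0, 1] is an assumption, since the extracted text does not state the target range.

```python
import numpy as np

def normalize_minmax(image):
    """Min-max normalize one sub-image using its own minimum and maximum."""
    lo, hi = image.min(), image.max()
    if hi == lo:  # constant image: avoid division by zero
        return np.zeros_like(image, dtype=np.float32)
    return ((image - lo) / (hi - lo)).astype(np.float32)
```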
  • Step S32 Input the blurred image into the convolutional neural network, the encoding network in the convolutional neural network performs down-sampling and feature extraction on the blurred image, outputs multiple feature maps, and the decoding network in the convolutional neural network performs up-sampling on the feature maps Sampling and feature extraction, outputting a predicted image corresponding to the blurred image.
  • the encoding network includes multiple encoding levels
  • the decoding network includes multiple decoding levels. After the feature map processed by the F-th encoding level in the encoding network is fused with the feature map obtained by the G-th decoding level in the decoding network, it is used as a decoding The input of the G+1th decoding level in the network.
  • the feature map processed at the F-th encoding level has the same resolution as the feature map processed at the G-th decoding level, and both F and G are positive integers.
  • a convolutional neural network is a neural network structure that uses, for example, images as input and output, and replaces scalar weights with filters (convolution kernels).
  • the convolution process can be seen as using a trainable filter to convolve an input image or feature map, and output a convolution feature plane, which can also be called a feature map.
  • the convolutional layer refers to the neuron layer that performs convolution processing on the input signal in the convolutional neural network. In the convolutional layers of a convolutional neural network, a neuron is only connected to some neurons in adjacent layers.
  • a convolution layer can apply several convolution kernels to an input image to extract various types of features of the input image. Each convolution kernel can extract one type of feature.
  • the convolution kernel is generally initialized in the form of a matrix of random size. During the training process of the convolutional neural network, the convolution kernel will obtain reasonable weights through learning. In the same convolution layer, multiple convolution kernels can be used to extract different image information.
  • by fusing the feature map obtained at the F-th encoding level in the encoding network with the feature map obtained at the G-th decoding level in the decoding network and inputting the result to the (G+1)-th decoding level in the decoding network, skip connections can be implemented between the encoding network and the decoding network.
  • through the skip connections between the encoding network and the decoding network, the retention of image details by the decoding network can be increased, and the image details and information lost by the encoding network during down-sampling can be transferred to the decoding network; the decoding network can use this information when recovering the spatial resolution by up-sampling to generate a more accurate image, thereby improving the accuracy of extracting clear images from blurred images.
  • down-sampling operations may include: max pooling, mean pooling, stochastic pooling, undersampling (such as selecting fixed pixels), demultiplexing output (such as splitting the input image into multiple smaller images), and the like, which are not limited in the present disclosure.
  • the up-sampling operation may include: max unpooling, strided transposed convolution, interpolation, and the like, which are not limited in the present disclosure.
  • the spatial dimensions of the feature maps can be gradually reduced through multiple down-sampling operations, and the receptive field can be expanded, so that the encoding network can better extract local and global features at different scales; down-sampling also compresses the extracted feature maps, thereby saving computation and memory and improving processing speed.
  • the spatial resolution of multiple feature maps output by the encoding network can be restored to be consistent with the blurred image by multiple upsampling.
  • Step S33 Calculate the loss value of the convolutional neural network according to the predicted image, the clear image and the preset loss function, and adjust the parameters of the convolutional neural network with the goal of minimizing the loss value.
  • the loss function is an important equation used to measure the difference between the predicted image and the clear image. For example, the higher the output value (loss) of the loss function, the greater the difference.
  • the loss value of the convolutional neural network can be calculated according to the following formula:
  • loss = (1 / (W × H × C)) × Σ_{x=1..W} Σ_{y=1..H} Σ_{z=1..C} [ (1 − λ) · |Y(x, y, z) − X(x, y, z)| + λ · |E(Y)(x, y, z) − E(X)(x, y, z)| ]
  • wherein λ is greater than or equal to 0 and less than or equal to 1, x is a positive integer greater than or equal to 1 and less than or equal to W, y is a positive integer greater than or equal to 1 and less than or equal to H, and z is a positive integer greater than or equal to 1 and less than or equal to C.
  • E(Y) can be the edge map of the predicted image obtained according to the Sobel edge extraction algorithm, and E(X) can be the edge map of the clear image obtained according to the Sobel edge extraction algorithm.
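  • The exact normalization of the formula is not visible in the extracted text, so the following PyTorch sketch is one plausible implementation of the weighted pixel-L1 plus edge-L1 loss, with Sobel filtering standing in for E(·); λ = 0.1 is an illustrative value.

```python
import torch
import torch.nn.functional as F

# Sobel kernels used to build the edge maps E(.).
_SOBEL_X = torch.tensor([[-1., 0., 1.], [-2., 0., 2.], [-1., 0., 1.]]).view(1, 1, 3, 3)
_SOBEL_Y = _SOBEL_X.transpose(2, 3)

def sobel_edge_map(img):
    """Per-channel Sobel gradient magnitude of a (B, C, H, W) tensor."""
    c = img.shape[1]
    gx = F.conv2d(img, _SOBEL_X.repeat(c, 1, 1, 1).to(img), padding=1, groups=c)
    gy = F.conv2d(img, _SOBEL_Y.repeat(c, 1, 1, 1).to(img), padding=1, groups=c)
    return torch.sqrt(gx ** 2 + gy ** 2 + 1e-12)

def fingerprint_loss(pred, target, lam=0.1):
    """(1 - lam) * L1(pred, target) + lam * L1(E(pred), E(target)), averaged over W, H and C."""
    pixel_term = F.l1_loss(pred, target)
    edge_term = F.l1_loss(sobel_edge_map(pred), sobel_edge_map(target))
    return (1.0 - lam) * pixel_term + lam * edge_term
```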
  • the AdamW optimizer can be used to optimize the parameters of the convolutional neural network according to the loss value.
  • the initial learning rate can be set to 10⁻⁴
  • the batch size of the training data can be set to 48.
  • the iteration threshold may be a preset number of iterations, for example, if the number of times of updating the parameters of the convolutional neural network is greater than the iteration threshold, the training ends.
  • the loss threshold may be preset. For example, if the loss value calculated by the loss function is less than the loss threshold, the training ends.
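  • Putting these settings together, a hedged training-loop sketch might look as follows; the iteration and loss thresholds are illustrative, and the data loader is assumed to yield (blurred, clear) pairs with the batch size mentioned above.

```python
import torch

def train(model, loader, max_iters=200_000, loss_threshold=1e-3):
    """Train until either stopping criterion (iteration count or loss value) is met."""
    opt = torch.optim.AdamW(model.parameters(), lr=1e-4)  # initial learning rate 1e-4
    step = 0
    while step < max_iters:
        for blurred, clear in loader:  # e.g. a DataLoader with batch_size=48
            pred = model(blurred)
            loss = fingerprint_loss(pred, clear)  # loss from the sketch above
            opt.zero_grad()
            loss.backward()
            opt.step()
            step += 1
            if loss.item() < loss_threshold or step >= max_iters:
                return model
    return model
```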
  • Step S34 The convolutional neural network whose parameters have been adjusted is determined as an image processing model.
  • in response to determining that the training of the convolutional neural network is completed, the trained convolutional neural network may be determined as the image processing model.
  • the image processing model can be used to extract clear fingerprint images from blurred fingerprint images.
  • a model that can be used to extract clear fingerprint images can be obtained by training the convolutional neural network.
  • the convolutional neural network provided by this embodiment includes an encoding network and a decoding network with skip connections; through the skip connections between the encoding network and the decoding network, the retention of image details by the decoding network can be increased, thereby improving the accuracy of extracting clear images from blurred images and improving the image processing effect.
  • the specific structure of the convolutional neural network can be set according to actual requirements.
  • each coding level may include a first convolution block and/or a downsampling block.
  • the first convolution block is used to perform feature extraction on the input feature matrix.
  • the downsampling block is used to downsample the input feature map.
  • Each decoding level may include a second convolutional block and/or an upsampling block.
  • the second convolution block is used to perform feature extraction on the input feature matrix.
  • the upsampling block is used to upsample the input feature map.
  • At least one of the first convolution block, the down-sampling block, the second convolution block and the up-sampling block includes at least one set of asymmetric convolution kernels.
  • the asymmetric convolution kernel may include, for example, a 1 ⁇ k convolution kernel and a k ⁇ 1 convolution kernel.
  • the value of k is greater than or equal to 2, and the value of k can be set according to requirements, for example, it can be 5.
  • by using a set of asymmetric convolution kernels in place of a k×k convolution kernel, the amount of computation can be greatly reduced, thereby improving processing speed.
  • using the asymmetric convolution kernels to perform horizontal convolution and vertical convolution separately also allows the horizontal and vertical gradients in the image to be learned, which helps extract the information changes in the fingerprint image.
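  • The saving can be illustrated by comparing parameter counts in PyTorch; the channel count of 64 and k = 5 are illustrative values.

```python
import torch.nn as nn

k, channels = 5, 64

full = nn.Conv2d(channels, channels, kernel_size=k, padding=k // 2)                  # k x k kernel
horizontal = nn.Conv2d(channels, channels, kernel_size=(1, k), padding=(0, k // 2))  # 1 x k kernel
vertical = nn.Conv2d(channels, channels, kernel_size=(k, 1), padding=(k // 2, 0))    # k x 1 kernel

count = lambda m: sum(p.numel() for p in m.parameters())
print(count(full))                          # 64*64*5*5 + 64 = 102464 parameters
print(count(horizontal) + count(vertical))  # 2*(64*64*5 + 64) = 41088 parameters
```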
  • the coding network may include N coding modules, such as CM-1, CM-2, ..., CM-N shown in FIG. 8 .
  • N may be a positive integer, or N may be greater than or equal to 2 and less than or equal to 20, for example, N may take a value of 8, 10, 12, 15, etc., and the present disclosure does not limit the specific value of N.
  • Each coding module may include M coding levels.
  • M can be a positive integer, or M can be greater than or equal to 2 and less than or equal to 8, as shown in Figure 8, the value of M is 3, that is, each coding module includes 3 coding levels, respectively the first coding level a1 , the second coding level a2 and the third coding level a3.
  • the present disclosure does not limit the specific value of M.
  • the first coding level a1 of any coding module may include one or more first convolutional blocks.
  • the i-th coding level of any one coding module may include one or more first convolutional blocks and one downsampling block. Wherein, i is greater than or equal to 2 and less than or equal to M.
  • the decoding network may include M decoding levels, that is, the number of decoding levels in the decoding network is the same as the number of encoding levels in each encoding module.
  • the decoding network shown in FIG. 8 includes three decoding levels, which are the first decoding level b1, the second decoding level b2 and the third decoding level b3.
  • each of the first decoding level to the M-1th decoding level may include one or more second convolutional blocks and one upsampling block.
  • the Mth decoding level may include one or more second convolutional blocks.
  • each encoding module shown in Figure 8 includes two down-sampling blocks, and each down-sampling block down-samples the input feature map by a factor of 2; the decoding network includes two up-sampling blocks, and each up-sampling block up-samples the input feature map by a factor of 2. In this way, the image output by the convolutional neural network has the same resolution as the image input to the convolutional neural network.
  • the encoding network in the convolutional neural network performs down-sampling and feature extraction on the blurred image, and the steps of outputting multiple feature maps may include:
  • the first coding level a1 of the first coding module CM-1 in the N coding modules performs feature extraction on blurred images
  • the i-th encoding level of the first encoding module CM-1 performs down-sampling and feature extraction in turn on the feature map obtained by processing the i-1th encoding level of the first encoding module;
  • the first coding level of the j-th coding module in the N coding modules performs feature extraction on the feature map processed by the first coding level of the j-1-th coding module; wherein, j is greater than or equal to 2, and less than or is equal to N;
  • the i-th encoding level of the j-th encoding module down-samples the feature map obtained at the (i-1)-th encoding level of the j-th encoding module, fuses the down-sampled feature map with the feature map obtained at the i-th encoding level of the (j-1)-th encoding module, and performs feature extraction on the fusion result.
  • the step of fusing the down-sampled feature map with the feature map obtained at the i-th encoding level of the (j-1)-th encoding module and performing feature extraction on the fusion result may include: splicing the down-sampled feature map and the feature map obtained at the i-th encoding level of the (j-1)-th encoding module in the channel dimension, and performing feature extraction on the splicing result.
  • the blurred image may be an image obtained by splicing the first image, the second image, and the third image in the channel dimension.
  • the matrix size of the blurred image can be B ⁇ 3 ⁇ H ⁇ W, where B is the number of original images in a training batch, H is the height of the original image, and W is the width of the original image.
  • the output clear image is a matrix of B ⁇ 1 ⁇ H ⁇ W.
  • in the first encoding module CM-1, feature extraction can be performed on the blurred image by the first convolution block in the first encoding level a1 to obtain a first feature map; the down-sampling block in the second encoding level a2 performs a first down-sampling on the first feature map, and the first convolution block in the second encoding level a2 performs feature extraction on the feature map obtained by the first down-sampling to obtain a second feature map; the down-sampling block in the third encoding level a3 performs a second down-sampling on the second feature map, and the first convolution block in the third encoding level a3 performs feature extraction on the feature map obtained by the second down-sampling to obtain a third feature map.
  • in the second encoding module CM-2, the first convolution block in the first encoding level a1 performs feature extraction on the first feature map output by the first encoding module CM-1; the down-sampling block in the second encoding level a2 performs a first down-sampling on the feature map output by the first encoding level a1, and the first convolution block in the second encoding level a2 fuses the feature map obtained by the first down-sampling with the second feature map output by the first encoding module CM-1 and performs feature extraction on the fusion result; the down-sampling block in the third encoding level a3 performs a second down-sampling on the feature map output by the second encoding level a2, and the first convolution block in the third encoding level a3 fuses the feature map obtained by the second down-sampling with the third feature map output by the first encoding module CM-1 and performs feature extraction on the fusion result.
  • in the (N-1)-th encoding module CM-N-1, the feature map output by the first encoding level a1 is the fourth feature map, the feature map output by the second encoding level a2 is the fifth feature map, and the feature map output by the third encoding level a3 is the sixth feature map.
  • in the N-th encoding module CM-N, the first convolution block in the first encoding level a1 performs feature extraction on the fourth feature map output by the encoding module CM-N-1 to obtain the seventh feature map; the down-sampling block in the second encoding level a2 performs a first down-sampling on the feature map output by the first encoding level a1, and the first convolution block in the second encoding level a2 fuses the feature map obtained by the first down-sampling with the fifth feature map output by the encoding module CM-N-1 and performs feature extraction on the fusion result to obtain the eighth feature map; the down-sampling block in the third encoding level a3 performs a second down-sampling on the feature map output by the second encoding level a2, and the first convolution block in the third encoding level a3 fuses the feature map obtained by the second down-sampling with the sixth feature map output by the encoding module CM-N-1 and performs feature extraction on the fusion result to obtain the ninth feature map.
  • the plurality of feature maps output by the encoding network include feature maps obtained by processing at each encoding level of the Nth encoding module among the N encoding modules.
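  • The grid-like flow of feature maps through the N encoding modules can be sketched as follows. This is a hedged reading of the description above, with a plain 3×3 convolution standing in for the first convolution block, max pooling standing in for the down-sampling block, and illustrative module, level and channel counts.

```python
import torch
import torch.nn as nn

def conv_block(in_ch, out_ch):
    """Stand-in feature-extraction block (the asymmetric first convolution block could be used instead)."""
    return nn.Sequential(nn.Conv2d(in_ch, out_ch, 3, padding=1),
                         nn.InstanceNorm2d(out_ch), nn.PReLU())

class GridEncoder(nn.Module):
    """Sketch of an encoding network with n_modules encoding modules of m_levels levels each."""
    def __init__(self, n_modules=4, m_levels=3, in_ch=3, ch=32):
        super().__init__()
        self.n, self.m = n_modules, m_levels
        self.blocks = nn.ModuleList()
        for j in range(n_modules):
            levels = nn.ModuleList()
            for i in range(m_levels):
                if j == 0:
                    src = in_ch if i == 0 else ch   # first module: no lateral fusion
                else:
                    src = ch if i == 0 else 2 * ch  # later modules: concat with previous module
                levels.append(conv_block(src, ch))
            self.blocks.append(levels)
        self.down = nn.MaxPool2d(2)                 # stand-in for the down-sampling block

    def forward(self, x):
        prev = None                                 # per-level feature maps of module j-1
        for j in range(self.n):
            feats = []
            for i in range(self.m):
                if i == 0:
                    inp = x if j == 0 else prev[0]
                else:
                    inp = self.down(feats[i - 1])
                    if j > 0:                       # fuse with the same-level map of module j-1
                        inp = torch.cat([inp, prev[i]], dim=1)
                feats.append(self.blocks[j][i](inp))
            prev = feats
        return prev                                 # the M feature maps of the N-th module
```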
  • the decoding network in the convolutional neural network performs up-sampling and feature extraction on the feature map, and the step of outputting a predicted image corresponding to the blurred image may include:
  • the first decoding level among the M decoding levels performs feature extraction on the feature map obtained by processing the Mth encoding level of the Nth encoding module, and upsamples the extracted feature map;
  • the decoding network fuses the feature map obtained by processing the u-1th decoding level of the M decoding levels with the feature map obtained by processing the M-u+1th encoding level of the Nth encoding module to obtain the first fused feature map ;
  • u is greater than or equal to 2 and less than or equal to M-1; the value of M can be greater than or equal to 3;
  • the decoding network inputs the first fusion feature map into the u-th decoding level among the M decoding levels, and the u-th decoding level sequentially performs feature extraction and upsampling on the first fusion feature map;
  • the decoding network fuses the feature map obtained by processing the M-1th decoding level among the M decoding levels with the feature map obtained by processing the first encoding level of the Nth encoding module to obtain a second fusion feature map;
  • the decoding network inputs the second fused feature map into the M-th decoding level among the M decoding levels, and the M-th decoding level performs feature extraction on the second fused feature map to obtain the predicted image.
  • the step of fusing the feature map obtained at the (u-1)-th decoding level among the M decoding levels with the feature map obtained at the (M-u+1)-th encoding level of the N-th encoding module to obtain the first fused feature map may include: splicing the feature map obtained at the (u-1)-th decoding level among the M decoding levels and the feature map obtained at the (M-u+1)-th encoding level of the N-th encoding module in the channel dimension to obtain the first fused feature map.
  • the step of fusing the feature map obtained at the (M-1)-th decoding level among the M decoding levels with the feature map obtained at the first encoding level of the N-th encoding module to obtain the second fused feature map may include: splicing the feature map obtained at the (M-1)-th decoding level among the M decoding levels and the feature map obtained at the first encoding level of the N-th encoding module in the channel dimension to obtain the second fused feature map.
  • in the N-th encoding module CM-N, the first encoding level a1 outputs the seventh feature map, the second encoding level a2 outputs the eighth feature map, and the third encoding level a3 outputs the ninth feature map.
  • the second convolution block in the first decoding level b1 performs feature extraction on the ninth feature map, and the up-sampling block in the first decoding level b1 performs a first up-sampling on the feature extraction result; the decoding network performs a first fusion of the feature map output by the first decoding level b1 with the eighth feature map, and inputs the feature map obtained by the first fusion into the second decoding level b2; the second convolution block in the second decoding level b2 performs feature extraction on the feature map obtained by the first fusion, and the up-sampling block in the second decoding level b2 performs a second up-sampling on the feature extraction result; the decoding network performs a second fusion of the feature map output by the second decoding level b2 with the seventh feature map, and inputs the feature map obtained by the second fusion into the third decoding level b3; the second convolution block in the third decoding level b3 performs feature extraction on the feature map obtained by the second fusion to obtain the predicted image.
  • the first convolution block may include a first convolutional layer and a second convolutional layer; the first convolutional layer may include an asymmetric convolution kernel, and the second convolutional layer may include a 1×1 convolution kernel.
  • the feature maps obtained by the pair of asymmetric convolution kernels in the first convolutional layer can be fused through a splicing layer (cat, as shown in Figure 9), and the second convolutional layer then compresses the number of channels of the fusion result to reduce the amount of computation.
  • the InstanceNorm layer normalizes the output of the second convolutional layer using the InstanceNorm method, and the PRelu layer then processes the result with the PRelu activation function; the output of the activation function is the output of the first convolution block.
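  • A sketch of such a convolution block, assuming k = 5 and illustrative channel sizes, is given below.

```python
import torch
import torch.nn as nn

class AsymmetricConvBlock(nn.Module):
    """First/second convolution block: parallel 1xk and kx1 convolutions, channel
    concatenation (the "cat" splicing layer), a 1x1 convolution to compress channels,
    then InstanceNorm and PReLU."""
    def __init__(self, in_ch, out_ch, k=5):
        super().__init__()
        self.conv_1xk = nn.Conv2d(in_ch, out_ch, (1, k), padding=(0, k // 2))
        self.conv_kx1 = nn.Conv2d(in_ch, out_ch, (k, 1), padding=(k // 2, 0))
        self.compress = nn.Conv2d(2 * out_ch, out_ch, kernel_size=1)
        self.norm = nn.InstanceNorm2d(out_ch)
        self.act = nn.PReLU()

    def forward(self, x):
        fused = torch.cat([self.conv_1xk(x), self.conv_kx1(x)], dim=1)
        return self.act(self.norm(self.compress(fused)))
```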
  • the structure of the second convolution block may be the same as that of the first convolution block, of course, it may also be different.
  • the asymmetric convolution kernels contained in the maximum pooling layer and the minimum pooling layer can be the same or different.
  • the feature map output by the max pooling layer and the feature map output by the min pooling layer can be fused by a splicing layer (cat, as shown in Figure 10), and the fusion result is the output of the down-sampling block.
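  • One possible reading of the down-sampling block is sketched below. How exactly the asymmetric kernels sit inside the pooling layers is not spelled out in the extracted text, so the ordering used here (asymmetric convolutions followed by 2× pooling, with min pooling realized as a negated max pooling) is an assumption.

```python
import torch
import torch.nn as nn

class DownsampleBlock(nn.Module):
    """Max-pooling branch and min-pooling branch, each with a 1xk / kx1 pair,
    concatenated by the splicing layer ("cat" in Figure 10)."""
    def __init__(self, channels, k=5):
        super().__init__()
        def asym_pair():
            return nn.Sequential(
                nn.Conv2d(channels, channels, (1, k), padding=(0, k // 2)),
                nn.Conv2d(channels, channels, (k, 1), padding=(k // 2, 0)))
        self.conv_max = asym_pair()
        self.conv_min = asym_pair()
        self.pool = nn.MaxPool2d(kernel_size=2)  # 2x spatial down-sampling

    def forward(self, x):
        max_branch = self.pool(self.conv_max(x))    # max-pooling branch
        min_branch = -self.pool(-self.conv_min(x))  # min-pooling branch
        return torch.cat([max_branch, min_branch], dim=1)  # fused output (2x channels)
```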
  • the up-sampling block is used to perform an up-sampling operation, and the up-sampling operation may specifically include: PixelShuffle, max unpooling, strided transposed convolution, interpolation (for example, bilinear interpolation, bicubic interpolation), and the like.
  • the present disclosure is not limited thereto.
  • as shown in Figure 8, the structure of the convolutional neural network resembles a criss-cross grid, which can deepen the fusion between deep features and shallow features, make full use of the limited fingerprint information in the original image, and improve the accuracy of the image extracted from the original image.
  • the convolutional neural network uses spatially separable convolution to implement most of the convolution operations.
  • by using spatially separable convolution for feature extraction or sampling, the amount of computation can be greatly reduced, thereby improving processing speed and helping to realize real-time processing of input images.
  • spatially separable convolution can learn the horizontal and vertical gradients in blurred images, which helps to extract information changes in fingerprints and improve the accuracy of extracting clear images from blurred images.
  • the convolution kernels in the coding level and the decoding level are both symmetrical convolution kernels.
  • the encoding network may include P encoding levels.
  • the encoding network shown in Figure 11 includes 3 encoding levels, namely a first encoding level, a second encoding level and a third encoding level.
  • the dotted line box on the left side of the second coding level in Figure 11 shows the specific structure of the second coding level.
  • the second encoding level can specifically include: an InstanceNorm layer, a PRelu layer, a third convolutional layer, an InstanceNorm layer, a PRelu layer and a down-sampling layer.
  • the InstanceNorm layer uses the InstanceNorm method to normalize the input feature map.
  • the PRelu layer uses the activation function PRelu to process the input feature map.
  • the third convolution layer may include a 5 ⁇ 5 convolution kernel for feature extraction on the input feature map.
  • the down-sampling layer can include a convolutional layer with a 4×4 convolution kernel, and the stride of the convolutional layer can be 2, so that the width and height of the feature map output by the second encoding level are each half those of the input feature map.
  • the specific structures of the first coding level, the second coding level and the third coding level may be the same.
  • the decoding network may include P decoding layers, that is, the number of decoding layers is the same as the number of encoding layers.
  • the decoding network shown in Figure 11 comprises 3 decoding levels, namely a first decoding level, a second decoding level and a third decoding level.
  • the dotted line box on the right side of the second decoding level in Figure 11 shows the specific structure of the second decoding level; the second decoding level can specifically include: an InstanceNorm layer, a PRelu layer, an up-sampling layer, an InstanceNorm layer, a PRelu layer and a fourth convolutional layer.
  • the InstanceNorm layer uses the InstanceNorm method to normalize the input feature map.
  • the PRelu layer uses the activation function PRelu to process the input feature map.
  • the up-sampling layer can include a convolutional layer with a 4×4 transposed convolution kernel, which can have a stride of 2, so that the width and height of the feature map output by the second decoding level are each twice those of the input feature map.
  • the fourth convolution layer may include a 5 ⁇ 5 convolution kernel for feature extraction on the input feature map.
  • the specific structures of the first decoding level, the second decoding level and the third decoding level may be the same.
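  • Based on the layer orders listed above, one encoding level and one decoding level of this network could be sketched as follows; the padding values are assumptions chosen so that the 4×4, stride-2 layers exactly halve or double the width and height.

```python
import torch.nn as nn

def encoding_level(in_ch, out_ch):
    """InstanceNorm, PReLU, 5x5 conv, InstanceNorm, PReLU, then a 4x4 stride-2 conv (down-sampling)."""
    return nn.Sequential(
        nn.InstanceNorm2d(in_ch), nn.PReLU(),
        nn.Conv2d(in_ch, out_ch, kernel_size=5, padding=2),
        nn.InstanceNorm2d(out_ch), nn.PReLU(),
        nn.Conv2d(out_ch, out_ch, kernel_size=4, stride=2, padding=1))

def decoding_level(in_ch, out_ch):
    """InstanceNorm, PReLU, 4x4 stride-2 transposed conv (up-sampling), InstanceNorm, PReLU, 5x5 conv."""
    return nn.Sequential(
        nn.InstanceNorm2d(in_ch), nn.PReLU(),
        nn.ConvTranspose2d(in_ch, out_ch, kernel_size=4, stride=2, padding=1),
        nn.InstanceNorm2d(out_ch), nn.PReLU(),
        nn.Conv2d(out_ch, out_ch, kernel_size=5, padding=2))
```

  • With the channel sizes mentioned below (3, 64, 128, 256), the three encoding levels could be instantiated as encoding_level(3, 64), encoding_level(64, 128) and encoding_level(128, 256).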
  • the encoding network in the convolutional neural network performs down-sampling and feature extraction on the blurred image, and the steps of outputting multiple feature maps may include:
  • the first coding level in the P coding levels performs feature extraction and downsampling on the blurred image in sequence
  • the qth encoding level in the P encoding levels performs feature extraction and downsampling in sequence on the feature map obtained from the processing of the q-1th encoding level.
  • q is greater than or equal to 2 and less than or equal to P
  • the multiple feature maps output by the encoding network include feature maps obtained by P encoding level processing.
  • the first coding level performs feature extraction and downsampling on the blurred image in sequence to obtain the tenth feature map; the second coding level sequentially performs feature extraction and downsampling on the tenth feature map to obtain the eleventh feature Figure; the third encoding level sequentially performs feature extraction and downsampling on the eleventh feature map to obtain the twelfth feature map.
  • the matrix size corresponding to the blurred image is B ⁇ 3 ⁇ H ⁇ W, where B is the number of original images in a training batch, H is the height of the original image, and W is the width of the original image.
  • the matrix size corresponding to the tenth feature map is B ⁇ 64 ⁇ H/2 ⁇ W/2
  • the matrix size corresponding to the eleventh feature map is B ⁇ 128 ⁇ H/4 ⁇ W/4
  • the matrix size corresponding to the twelfth feature map is B×256×H/8×W/8.
  • the decoding network may also include a third convolutional block, which includes an InstanceNorm layer, a PRelu layer, a convolutional layer with a 5×5 convolution kernel, an InstanceNorm layer, a PRelu layer and a convolutional layer with a 5×5 convolution kernel; the width and height of the feature matrix input to and output from the third convolutional block remain unchanged.
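  • a minimal sketch of such a third convolutional block, under the same PyTorch assumption, might be (the 256-channel width is taken from the twelfth feature map above):

```python
import torch.nn as nn

def third_conv_block(ch=256):
    """InstanceNorm -> PReLU -> 5x5 conv -> InstanceNorm -> PReLU -> 5x5 conv;
    the padding keeps the width and height of the feature matrix unchanged."""
    return nn.Sequential(
        nn.InstanceNorm2d(ch),
        nn.PReLU(),
        nn.Conv2d(ch, ch, kernel_size=5, padding=2),
        nn.InstanceNorm2d(ch),
        nn.PReLU(),
        nn.Conv2d(ch, ch, kernel_size=5, padding=2),
    )
```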
  • the step in which the decoding network in the convolutional neural network performs upsampling and feature extraction on the feature maps and outputs a predicted image corresponding to the blurred image may include:
  • feature extraction is performed on the feature map obtained from the P-th encoding level to obtain a calculated feature map;
  • the decoding network fuses the calculated feature map with the feature map obtained from the P-th encoding level to obtain a third fused feature map;
  • the decoding network inputs the third fusion feature map into the first decoding level among the P decoding levels, and the first decoding level sequentially performs upsampling and feature extraction on the third fusion feature map;
  • the decoding network fuses the feature map obtained by processing the r-1th decoding level in the P decoding levels with the feature map obtained by processing the P-r+1th coding level in the P coding levels to obtain a fourth fused feature map;
  • the decoding network inputs the fourth fusion feature map to the r-th decoding level among the P decoding levels, and the r-th decoding level sequentially performs upsampling and feature extraction on the fourth fusion feature map.
  • r is greater than or equal to 2 and less than or equal to P
  • the predicted image is a feature map obtained by processing at the Pth decoding level among the P decoding levels.
  • the step of fusing the calculated feature map with the feature map obtained from the P-th encoding level to obtain the third fused feature map may include: concatenating the calculated feature map and the feature map obtained from the P-th encoding level along the channel dimension to obtain the third fused feature map.
  • the step of fusing the feature map obtained from the (r-1)-th decoding level among the P decoding levels with the feature map obtained from the (P-r+1)-th encoding level among the P encoding levels to obtain a fourth fused feature map may include: concatenating these two feature maps along the channel dimension to obtain the fourth fused feature map.
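  • the "splicing" above is plain concatenation along the channel dimension; a short illustration (PyTorch assumed, shapes chosen arbitrarily):

```python
import torch

decoder_feat = torch.randn(2, 128, 16, 16)   # e.g. output of the (r-1)-th decoding level
encoder_feat = torch.randn(2, 128, 16, 16)   # e.g. output of the (P-r+1)-th encoding level
fused = torch.cat([decoder_feat, encoder_feat], dim=1)   # splice on the channel dimension
print(fused.shape)   # torch.Size([2, 256, 16, 16]): channels add up, spatial size unchanged
```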
  • the third convolutional block performs feature extraction on the twelfth feature map to obtain the calculated feature map, and the decoding network fuses the calculated feature map with the twelfth feature map to obtain the third fused feature map.
  • the third fused feature map is used as the input of the first decoding level, which sequentially performs upsampling and feature extraction on it to obtain the thirteenth feature map; the decoding network fuses the thirteenth feature map with the eleventh feature map to obtain the fourteenth feature map and inputs it into the second decoding level; the second decoding level sequentially performs upsampling and feature extraction on the fourteenth feature map to obtain the fifteenth feature map; the decoding network fuses the fifteenth feature map with the tenth feature map to obtain the sixteenth feature map and inputs it into the third decoding level; the third decoding level sequentially performs upsampling and feature extraction on the sixteenth feature map to obtain the predicted image.
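  • putting these steps together, the decoding flow just described could be wired up roughly as below; this reuses the illustrative encoding_level, DecodingLevel and third_conv_block helpers sketched earlier in this section, and the decoder channel widths and the single-channel output are assumptions chosen so the concatenations line up, not values fixed by the disclosure.

```python
import torch

# reusing the illustrative helpers from the sketches above
enc1, enc2, enc3 = encoding_level(3, 64), encoding_level(64, 128), encoding_level(128, 256)
mid = third_conv_block(256)
dec1 = DecodingLevel(512, 128)   # input: cat(calculated map, twelfth map)
dec2 = DecodingLevel(256, 64)    # input: cat(thirteenth map, eleventh map)
dec3 = DecodingLevel(128, 1)     # input: cat(fifteenth map, tenth map); 1 output channel is an assumption

blurred = torch.randn(1, 3, 64, 64)
f10 = enc1(blurred)
f11 = enc2(f10)
f12 = enc3(f11)
calc = mid(f12)                                     # calculated feature map
f13 = dec1(torch.cat([calc, f12], dim=1))           # third fused map  -> thirteenth map
f15 = dec2(torch.cat([f13, f11], dim=1))            # fourth fused map -> fifteenth map
pred = dec3(torch.cat([f15, f10], dim=1))           # sixteenth map    -> predicted image
print(pred.shape)                                   # torch.Size([1, 1, 64, 64])
```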
  • Fig. 12 schematically shows a flowchart of an image processing method. As shown in Fig. 12, the method may include the following steps.
  • Step S1201 Obtain a fuzzy fingerprint image.
  • the step of obtaining the fuzzy fingerprint image includes: obtaining the original fingerprint image, and preprocessing the original fingerprint image to obtain the fuzzy fingerprint image; wherein the preprocessing includes at least one of the following: image segmentation, size cropping, flipping, brightness enhancement, noise processing and normalization processing.
  • the process of obtaining the original fingerprint image is the same as the process of obtaining the original image, and the process of preprocessing the original fingerprint image is the same as the process of preprocessing the original image, and will not be repeated here.
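  • as an illustration of the normalization and channel-stacking part of such preprocessing, a sketch is given below; NumPy is assumed, and img_a, img_b and img_c stand for the already-segmented first, second and third images (their generation is outside this snippet and the toy values are placeholders).

```python
import numpy as np

def min_max_normalize(img):
    """Normalize one segmented sub-image to the range [0, 1]:
    I_norm = (I - min) / (max - min)."""
    lo, hi = img.min(), img.max()
    return (img - lo) / (hi - lo + 1e-8)  # small epsilon guards against a flat image

def build_fuzzy_input(img_a, img_b, img_c):
    """Stack the three normalized sub-images along the channel dimension,
    giving the 3-channel blurred/fuzzy input described above."""
    channels = [min_max_normalize(x).astype(np.float32) for x in (img_a, img_b, img_c)]
    return np.stack(channels, axis=0)  # shape: 3 x H x W

# hypothetical 16-bit raw sub-images, already segmented from the original capture
img_a = np.random.randint(0, 10000, (64, 64), dtype=np.uint16)
img_b = np.random.randint(8000, 30000, (64, 64), dtype=np.uint16)
img_c = np.random.randint(20000, 65535, (64, 64), dtype=np.uint16)
print(build_fuzzy_input(img_a, img_b, img_c).shape)  # (3, 64, 64)
```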
  • the execution subject of this embodiment may be a computer device; the computer device has an image processing device, and the image processing method provided by this embodiment is executed by that image processing device.
  • the computer device may be, for example, a smart phone, a tablet computer, a personal computer, etc., which is not limited in this embodiment.
  • the execution subject of this embodiment can obtain the fuzzy fingerprint image in various ways.
  • the execution subject can obtain the original fingerprint image collected by the multi-point light source under-screen fingerprint collection device, and then preprocess the obtained original fingerprint image to obtain a fuzzy fingerprint image.
  • Step S1202 Input the fuzzy fingerprint image into the image processing model trained by the model training method provided by any embodiment, and obtain a clear fingerprint image corresponding to the fuzzy fingerprint image.
  • the image processing model may be pre-trained, or may be obtained through training during image processing, which is not limited in this embodiment.
  • by inputting the fuzzy fingerprint image into the image processing model, the image processing method provided by this embodiment can extract a high-quality clear fingerprint image in which the fingerprint ridge and fingerprint valley information has been extracted and enhanced, and the clear fingerprint image can be directly applied to fingerprint recognition.
  • this embodiment can improve the acquisition efficiency of a clear fingerprint image.
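  • at inference time the trained model is simply applied to a preprocessed capture; a hypothetical usage sketch follows (PyTorch assumed; the file name model.pt, saving the whole module, and the shapes are placeholders of this example, not artifacts of the disclosure).

```python
import torch

# load an already-trained image processing model (path and serialization format are assumptions)
model = torch.load("model.pt", map_location="cpu")
model.eval()

# fuzzy_input: 3 x H x W float tensor produced by the preprocessing sketched earlier
fuzzy_input = torch.rand(3, 128, 128)

with torch.no_grad():
    clear = model(fuzzy_input.unsqueeze(0))  # add the batch dimension
clear_fingerprint = clear.squeeze(0)          # 1 x H x W clear fingerprint image
```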
  • Fig. 13 schematically shows a block diagram of a model training device. Referring to Fig. 13, the device may include:
  • the acquisition module 1301 is configured to acquire a sample set, where the samples in the sample set include blurred images and clear images of the same fingerprint;
  • the prediction module 1302 is configured to input the blurred image into a convolutional neural network; the encoding network in the convolutional neural network performs downsampling and feature extraction on the blurred image and outputs a plurality of feature maps, and the decoding network in the convolutional neural network performs upsampling and feature extraction on the feature maps and outputs a predicted image corresponding to the blurred image; wherein the encoding network includes multiple encoding levels and the decoding network includes multiple decoding levels; after the feature map obtained from the F-th encoding level in the encoding network is fused with the feature map obtained from the G-th decoding level in the decoding network, the result is used as the input of the (G+1)-th decoding level in the decoding network; the feature map obtained from the F-th encoding level has the same resolution as the feature map obtained from the G-th decoding level, and both F and G are positive integers;
  • the training module 1303 is configured to calculate the loss value of the convolutional neural network according to the predicted image, the clear image and a preset loss function, and to adjust the parameters of the convolutional neural network with the goal of minimizing the loss value;
  • the determination module 1304 is configured to determine the convolutional neural network whose parameters have been adjusted as an image processing model.
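  • a hedged sketch of what the training module's inner loop might look like is given below; PyTorch is assumed, the Sobel-based edge map, the AdamW optimizer and the 10^-4 learning rate follow choices mentioned in the description, while the λ value, the stand-in model and the toy batch are placeholders of this example only.

```python
import torch
import torch.nn.functional as F

def sobel_edges(img):
    """Edge map via Sobel filtering, applied per channel."""
    kx = torch.tensor([[-1., 0., 1.], [-2., 0., 2.], [-1., 0., 1.]]).view(1, 1, 3, 3)
    ky = kx.transpose(2, 3)
    c = img.shape[1]
    gx = F.conv2d(img, kx.repeat(c, 1, 1, 1), padding=1, groups=c)
    gy = F.conv2d(img, ky.repeat(c, 1, 1, 1), padding=1, groups=c)
    return torch.sqrt(gx ** 2 + gy ** 2 + 1e-8)

def loss_fn(pred, clear, lam=0.5):
    """Pixel-wise L1 loss plus a weighted L1 loss on Sobel edge maps
    (lam is a placeholder value in [0, 1], not taken from the disclosure)."""
    l1 = F.l1_loss(pred, clear)
    l_edge = F.l1_loss(sobel_edges(pred), sobel_edges(clear))
    return l1 + lam * l_edge

# model: the convolutional neural network; a single conv acts here as a runnable stand-in
model = torch.nn.Conv2d(3, 1, 5, padding=2)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

blurred = torch.rand(4, 3, 64, 64)   # toy batch of blurred images
clear = torch.rand(4, 1, 64, 64)     # matching clear images

for step in range(10):               # toy loop; real training iterates over a dataset
    pred = model(blurred)
    loss = loss_fn(pred, clear)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```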
  • Fig. 14 schematically shows a block diagram of an image processing device. Referring to Fig. 14, the device may include:
  • the acquisition module 1401 is configured to acquire a fuzzy fingerprint image;
  • the extraction module 1402 is configured to input the fuzzy fingerprint image into the image processing model trained by the model training method provided in any embodiment, to obtain a clear fingerprint image corresponding to the fuzzy fingerprint image.
  • the device embodiments described above are only illustrative; the units described as separate components may or may not be physically separated, and the components shown as units may or may not be physical units, that is, they may be located in one place or distributed across multiple network elements. Some or all of the modules can be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those skilled in the art can understand and implement this without creative effort.
  • the various component embodiments of the present disclosure may be implemented in hardware, or in software modules running on one or more processors, or in a combination thereof.
  • a microprocessor or a digital signal processor (DSP) may be used in practice to implement some or all functions of some or all components in the computing processing device according to the embodiments of the present disclosure.
  • the present disclosure can also be implemented as a device or apparatus program (e.g., a computer program and a computer program product) for performing part or all of the methods described herein.
  • Such a program realizing the present disclosure may be stored on a computer-readable medium, or may have the form of one or more signals.
  • Such a signal may be downloaded from an Internet site, or provided on a carrier signal, or provided in any other form.
  • FIG. 15 illustrates a computing processing device that may implement methods according to the present disclosure.
  • the computing processing device conventionally includes a processor 1010 and a computer program product in the form of memory 1020 or non-transitory computer readable media.
  • Memory 1020 may be electronic memory such as flash memory, EEPROM (Electrically Erasable Programmable Read Only Memory), EPROM, hard disk, or ROM.
  • the memory 1020 has a storage space 1030 for program code 1031 for performing any method steps in the methods described above.
  • the storage space 1030 for program codes may include respective program codes 1031 for respectively implementing various steps in the above methods. These program codes can be read from or written into one or more computer program products.
  • These computer program products comprise program code carriers such as hard disks, compact disks (CDs), memory cards or floppy disks.
  • Such a computer program product is typically a portable or fixed storage unit as described with reference to FIG. 16 .
  • the storage unit may have storage segments, storage spaces, etc. arranged similarly to the memory 1020 in the computing processing device of FIG. 15 .
  • the program code may, for example, be compressed in a suitable form.
  • the storage unit includes computer readable code 1031', i.e. code readable by a processor such as 1010, which, when executed by a computing processing device, causes the computing processing device to perform each step of the methods described above.
  • references herein to "one embodiment," "an embodiment," or "one or more embodiments" mean that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment of the present disclosure. Additionally, note that instances of the phrase "in one embodiment" herein do not necessarily all refer to the same embodiment.
  • any reference signs placed between parentheses shall not be construed as limiting the claim.
  • the word “comprising” does not exclude the presence of elements or steps not listed in a claim.
  • the word “a” or “an” preceding an element does not exclude the presence of a plurality of such elements.
  • the disclosure can be implemented by means of hardware comprising several distinct elements, and by means of a suitably programmed computer. In a unit claim enumerating several means, several of these means can be embodied by one and the same item of hardware.
  • the use of the words first, second, and third, etc. does not indicate any order. These words can be interpreted as names.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Human Computer Interaction (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Image Processing (AREA)

Abstract

模型训练方法、图像处理方法、计算处理设备及非瞬态计算机可读介质。模型训练方法包括:获取样本集,样本集中的样本包括同一指纹的模糊图像和清晰图像;将模糊图像输入卷积神经网络中,卷积神经网络中的编码网络对模糊图像进行下采样和特征提取,输出多个特征图,卷积神经网络中的解码网络对特征图进行上采样和特征提取,输出与模糊图像对应的预测图像;其中,编码网络中第F个编码层级处理得到的特征图与解码网络中第G个解码层级处理得到的特征图融合后,作为解码网络中第G+1个解码层级的输入;根据预测图像、清晰图像以及预设的损失函数,调整卷积神经网络的参数;将完成参数调整的卷积神经网络确定为图像处理模型。

Description

模型训练方法、图像处理方法、计算处理设备及非瞬态计算机可读介质 技术领域
本公开涉及计算机技术领域,特别是涉及一种模型训练方法、图像处理方法、计算处理设备及非瞬态计算机可读介质。
背景技术
光学屏下指纹是利用屏下点光源照亮手指,光线经手指反射后被屏幕下方的光学传感器接收,由于指纹谷和指纹脊对光线进行反射的光线强度存在差异,因此可以生成指纹图像。由于光学屏下指纹采集***对指纹的采集范围较大,并且硬件成本较低,因此具有较高的生产价值。
概述
本公开提供了一种模型训练方法,包括:
获取样本集,所述样本集中的样本包括同一指纹的模糊图像和清晰图像;
将所述模糊图像输入卷积神经网络中,所述卷积神经网络中的编码网络对所述模糊图像进行下采样和特征提取,输出多个特征图,所述卷积神经网络中的解码网络对所述特征图进行上采样和特征提取,输出与所述模糊图像对应的预测图像;其中,所述编码网络包括多个编码层级,所述解码网络包括多个解码层级,所述编码网络中第F个编码层级处理得到的特征图与所述解码网络中第G个解码层级处理得到的特征图融合后,作为所述解码网络中第G+1个解码层级的输入,所述第F个编码层级处理得到的特征图与所述第G个解码层级处理得到的特征图具有相同的分辨率,所述F和所述G均为正整数;
根据所述预测图像、所述清晰图像以及预设的损失函数,计算所述卷积神经网络的损失值,以最小化所述损失值为目标,调整所述卷积神经网络的参数;
将完成参数调整的卷积神经网络确定为图像处理模型。
在一种可选的实现方式中,各所述编码层级包括第一卷积块和/或下采 样块,各所述解码层级包括第二卷积块和/或上采样块;
其中,所述第一卷积块、所述下采样块、所述第二卷积块以及所述上采样块中的至少一个包括至少一组非对称卷积核。
在一种可选的实现方式中,所述编码网络包括N个编码模块,各所述编码模块包括M个所述编码层级,所述M和所述N均为正整数,所述卷积神经网络中的编码网络对所述模糊图像进行下采样和特征提取,输出多个特征图的步骤,包括:
所述N个编码模块中第一个编码模块的第一个编码层级对所述模糊图像进行特征提取;
所述第一个编码模块的第i个编码层级对所述第一个编码模块的第i-1个编码层级处理得到的特征图依次进行下采样和特征提取;其中,所述i大于或等于2,且小于或等于M;
所述N个编码模块中第j个编码模块的第一个编码层级对第j-1个编码模块的第一个编码层级处理得到的特征图进行特征提取;其中,所述j大于或等于2,且小于或等于N;
所述第j个编码模块的第i个编码层级对所述第j个编码模块的第i-1个编码层级处理得到的特征图进行下采样,并将下采样得到的特征图与所述第j-1个编码模块的第i个编码层级处理得到的特征图进行融合,并对融合的结果进行特征提取;
其中,所述多个特征图包括所述N个编码模块中第N个编码模块的各个编码层级处理得到的特征图。
在一种可选的实现方式中,所述解码网络包括所述M个解码层级,所述卷积神经网络中的解码网络对所述特征图进行上采样和特征提取,输出与所述模糊图像对应的预测图像的步骤,包括:
所述M个解码层级中第一个解码层级对所述第N个编码模块的第M个编码层级处理得到的特征图进行特征提取,并对提取得到的特征图进行上采样;
将所述M个解码层级中第u-1个解码层级处理得到的特征图与所述第N个编码模块的第M-u+1个编码层级处理得到的特征图进行融合,得到第一融合特征图;其中,所述u大于或等于2,且小于或等于M-1;
将所述第一融合特征图输入所述M个解码层级中第u个解码层级,所述第u个解码层级对所述第一融合特征图依次进行特征提取和上采样;
将所述M个解码层级中第M-1个解码层级处理得到的特征图与所述第N个编码模块的第一个编码层级处理得到的特征图进行融合,得到第二融合特征图;
将所述第二融合特征图输入所述M个解码层级中第M个解码层级,所述第M个解码层级对所述第二融合特征图进行特征提取,得到所述预测图像。
在一种可选的实现方式中,所述将下采样得到的特征图与所述第j-1个编码模块的第i个编码层级处理得到的特征图进行融合,并对融合的结果进行特征提取的步骤,包括:
将下采样得到的特征图与所述第j-1个编码模块的第i个编码层级处理得到的特征图在通道维度上进行拼接,并对拼接的结果进行特征提取;
所述将所述M个解码层级中第u-1个解码层级处理得到的特征图与所述第N个编码模块的第M-u+1个编码层级处理得到的特征图进行融合,得到第一融合特征图的步骤,包括:
将所述M个解码层级中第u-1个解码层级处理得到的特征图与所述第N个编码模块的第M-u+1个编码层级处理得到的特征图在通道维度上进行拼接,得到第一融合特征图;
所述将所述M个解码层级中第M-1个解码层级处理得到的特征图与所述第N个编码模块的第一个编码层级处理得到的特征图进行融合,得到第二融合特征图的步骤,包括:
将所述M个解码层级中第M-1个解码层级处理得到的特征图与所述第N个编码模块的第一个编码层级处理得到的特征图在通道维度上进行拼接,得到第二融合特征图。
在一种可选的实现方式中,所述第一卷积块和所述第二卷积块均包括第一卷积层和第二卷积层,所述第一卷积层包括所述非对称卷积核,所述第二卷积层包括1×1卷积核;
所述下采样块包括最大池化层和最小池化层,所述最大池化层和所述最小池化层均包括所述非对称卷积核;
其中,所述非对称卷积核包括1×k卷积核和k×1卷积核,所述k大于或 等于2。
在一种可选的实现方式中,所述编码层级和所述解码层级中的卷积核均为对称卷积核。
在一种可选的实现方式中,所述编码网络包括P个编码层级,所述卷积神经网络中的编码网络对所述模糊图像进行下采样和特征提取,输出多个特征图的步骤,包括:
所述P个编码层级中第一个编码层级对所述模糊图像依次进行特征提取和下采样;
所述P个编码层级中第q个编码层级对第q-1个编码层级处理得到的特征图依次进行特征提取和下采样;
其中,所述q大于或等于2,且小于或等于P,所述多个特征图包括所述P个编码层级处理得到的特征图。
在一种可选的实现方式中,所述解码网络包括所述P个解码层级,所述卷积神经网络中的解码网络对所述特征图进行上采样和特征提取,输出与所述模糊图像对应的预测图像的步骤,包括:
对所述P个编码层级中第P个编码层级处理得到的特征图进行特征提取,得到计算特征图;
将所述计算特征图与所述第P个编码层级处理得到的特征图进行融合,得到第三融合特征图;
将所述第三融合特征图输入所述P个解码层级中的第一个解码层级,所述第一个解码层级对所述第三融合特征图依次进行上采样和特征提取;
将所述P个解码层级中第r-1个解码层级处理得到的特征图与所述P个编码层级中第P-r+1个编码层级处理得到的特征图进行融合,得到第四融合特征图;
将所述第四融合特征图输入至所述P个解码层级中的第r个解码层级,所述第r个解码层级对所述第四融合特征图依次进行上采样和特征提取;
其中,所述r大于或等于2,且小于或等于P,所述预测图像为所述P个解码层级中的第P个解码层级处理得到的特征图。
在一种可选的实现方式中,所述将所述计算特征图与所述第P个编码层级处理得到的特征图进行融合,得到第三融合特征图的步骤,包括:
将所述计算特征图与所述第P个编码层级处理得到的特征图在通道维度上进行拼接,得到第三融合特征图;
所述将所述P个解码层级中第r-1个解码层级处理得到的特征图与所述P个编码层级中第P-r+1个编码层级处理得到的特征图进行融合,得到第四融合特征图的步骤,包括:
将所述P个解码层级中第r-1个解码层级处理得到的特征图与所述P个编码层级中第P-r+1个编码层级处理得到的特征图在通道维度上进行拼接,得到第四融合特征图。
在一种可选的实现方式中,所述根据所述预测图像、所述清晰图像以及预设的损失函数,计算所述卷积神经网络的损失值的步骤,包括:
按照以下公式计算所述损失值:
$$\mathcal{L}=\mathcal{L}_{1}+\lambda\,\mathcal{L}_{E}$$
$$\mathcal{L}_{1}=\frac{1}{W\times H\times C}\sum_{x=1}^{W}\sum_{y=1}^{H}\sum_{z=1}^{C}\left|Y_{(x,y,z)}-\hat{Y}_{(x,y,z)}\right|$$
$$\mathcal{L}_{E}=\frac{1}{W\times H\times C}\sum_{x=1}^{W}\sum_{y=1}^{H}\sum_{z=1}^{C}\left|E(Y)_{(x,y,z)}-E(\hat{Y})_{(x,y,z)}\right|$$
其中，所述$\mathcal{L}$为所述损失值，所述Y为所述预测图像，所述$\hat{Y}$为所述清晰图像，所述W为所述预测图像的宽度，所述H为所述预测图像的高度，所述C为所述预测图像的通道数，所述E(Y)为所述预测图像的边缘图，所述$E(\hat{Y})$为所述清晰图像的边缘图，所述λ大于或等于0，且小于或等于1，所述x为大于或等于1且小于或等于W的正整数，所述y为大于或等于1且小于或等于H的正整数，所述z为大于或等于1且小于或等于C的正整数。
在一种可选的实现方式中,所述获取样本集的步骤,包括:
获取所述同一指纹的原始图像;
对所述原始图像进行预处理,获得所述模糊图像;其中,所述预处理包括以下至少之一:图像分割、尺寸裁剪、翻转、亮度增强、噪声处理和归一化处理。
在一种可选的实现方式中,所述对所述原始图像进行预处理,获得所述模糊图像的步骤,包括:
对所述原始图像进行图像分割,获得第一图像、第二图像和第三图像,所述第一图像、所述第二图像和所述第三图像分别包含所述原始图像不同区 域的信息;
对所述第一图像、所述第二图像和所述第三图像分别进行归一化处理,所述模糊图像包括归一化处理后的所述第一图像、所述第二图像和所述第三图像。
在一种可选的实现方式中,所述原始图像包括第一像素的第一像素值,所述对所述原始图像进行图像分割,获得第一图像、第二图像和第三图像的步骤,包括:
若所述第一像素位于预设区域范围外,所述第一像素值大于或等于第一阈值,且小于或等于第二阈值,则确定所述第一图像中所述第一像素的像素值为所述第一像素值;
若所述第一像素位于预设区域范围外,所述第一像素值小于所述第一阈值,且大于所述第二阈值,则确定所述第一图像中所述第一像素的像素值为0;
若所述第一像素位于预设区域范围外,所述第一像素值大于或等于第三阈值,且小于或等于第四阈值,则确定所述第二图像中所述第一像素的像素值为所述第一像素值;
若所述第一像素位于预设区域范围外,所述第一像素值小于所述第三阈值,且大于所述第四阈值,则确定所述第二图像中所述第一像素的像素值为0;
若所述第一像素位于预设区域范围内,则确定所述第三图像中所述第一像素的像素值为所述第一像素值;
其中,所述第三阈值大于所述第二阈值。
在一种可选的实现方式中,所述对所述原始图像进行图像分割,获得第一图像、第二图像和第三图像的步骤,包括:
对所述原始图像进行边缘检测,根据检测到的边缘的位置和长度,将所述原始图像分割为所述第一图像、所述第二图像和所述第三图像。
在一种可选的实现方式中,所述对所述第一图像、所述第二图像和所述第三图像分别进行归一化处理的步骤,包括:
确定待处理图像所包含的所有像素值中的最大值和最小值,所述待处理图像为所述第一图像、所述第二图像和所述第三图像中的任意一个,所述 待处理图像包括第二像素的第二像素值;
根据所述最大值、最小值以及所述第二像素值,确定归一化处理后的所述待处理图像中的所述第二像素的像素值。
本公开提供了一种图像处理方法,包括:
获取模糊指纹图像;
将所述模糊指纹图像输入至如任一项所述的模型训练方法训练得到的图像处理模型,得到所述模糊指纹图像对应的清晰指纹图像。
在一种可选的实现方式中,当所述模糊图像为对所述原始图像进行预处理获得的结果时,所述获取模糊指纹图像的步骤,包括:
获取原始指纹图像;
对所述原始指纹图像进行预处理,获得所述模糊指纹图像;其中,所述预处理包括以下至少之一:图像分割、尺寸裁剪、翻转、亮度增强、噪声处理和归一化处理。
本公开提供了一种计算处理设备,包括:
存储器,其中存储有计算机可读代码;
一个或多个处理器,当所述计算机可读代码被所述一个或多个处理器执行时,所述计算处理设备执行如任一项所述的方法。
本公开提供了一种非瞬态计算机可读介质,存储有计算机可读代码,当所述计算机可读代码在计算处理设备上运行时,导致所述计算处理设备执行根据如任一项所述的方法。
上述说明仅是本公开技术方案的概述,为了能够更清楚了解本公开的技术手段,而可依照说明书的内容予以实施,并且为了让本公开的上述和其它目的、特征和优点能够更明显易懂,以下特举本公开的具体实施方式。
附图简述
为了更清楚地说明本公开实施例或相关技术中的技术方案,下面将对实施例或相关技术描述中所需要使用的附图作一简单地介绍,显而易见地,下面描述中的附图是本公开的一些实施例,对于本领域普通技术人员来讲,在不付出创造性劳动的前提下,还可以根据这些附图获得其他的附图。需要说明的是,附图中的比例仅作为示意并不代表实际比例。
图1示意性地示出了一种光学屏下指纹图像采集示意图;
图2示意性地示出了多点光源成像方案示意图;
图3示意性地示出了一种模型训练方法的流程示意图;
图4示意性地示出了一组原始图像和清晰图像;
图5示意性地示出了采集清晰图像的流程示意图;
图6示意性地示出了一种模糊图像的示意图;
图7示意性地示出了第一图像、第二图像和第三图像的示意图;
图8示意性地示出了第一种卷积神经网络的结构示意图;
图9示意性地示出了第一卷积块的结构示意图;
图10示意性地示出了下采样块的结构示意图;
图11示意性地示出了第二种卷积神经网络的结构示意图;
图12示意性地示出了一种图像处理方法的流程示意图;
图13示意性地示出了一种模型训练装置的结构框图;
图14示意性地示出了一种图像处理装置的结构框图;
图15示意性地示出了用于执行根据本公开的方法的计算处理设备的框图;
图16示意性地示出了用于保持或者携带实现根据本公开的方法的程序代码的存储单元。
详细描述
为使本公开实施例的目的、技术方案和优点更加清楚,下面将结合本公开实施例中的附图,对本公开实施例中的技术方案进行清楚、完整地描述,显然,所描述的实施例是本公开一部分实施例,而不是全部的实施例。基于本公开中的实施例,本领域普通技术人员在没有作出创造性劳动前提下所获得的所有其他实施例,都属于本公开保护的范围。
如图1所示,光学屏下指纹是利用屏下点光源照亮手指,光线经手指反射后被屏幕下方的光敏元件接收,由于指纹谷和指纹脊反射的光线强度存在差异,因此可以生成指纹图像。
在相关技术中,一般通过同时点亮屏幕下方的多个点光源,以获得更大面积和更高强度的指纹图像。然而,受限于点光源的发光和成像原理,无论如何排布各点光源之间的位置,均无法获得理想的指纹图像。如图2所示, 当多个点光源排布稀疏时,各点光源对应的指纹图像过于离散,无法拼接成完整的指纹图像;为了获得完整的指纹图像,多个点光源需要密集排布,这样又会导致各点光源对应的指纹图像之间存在相互重叠。
在相关技术中,为了获得清晰的指纹图像,还可以通过依次点亮各点光源,分别采集单点光源对应的指纹图像,然后再对多个单点光源对应的指纹图像进行裁剪以及对齐拼接等处理,获得完整清晰的指纹图像。然而,该方案需要采集各单点光源对应的指纹图像,采集时间较长,可行性较差。
为了解决上述问题,图3示意性地示出了一种模型训练方法的流程图,如图3所示,该方法可以包括以下步骤。
步骤S31:获取样本集,样本集中的样本包括同一指纹的模糊图像和清晰图像。
本实施例的执行主体可以为计算机设备,该计算机设备具有模型训练装置,通过该模型训练装置来执行本实施例提供的模型训练方法。其中,计算机设备例如可以为智能手机、平板电脑、个人计算机等,本实施例对此不作限定。
本实施例的执行主体可以通过多种方式来获取样本集。例如,执行主体可以通过有线连接方式或无线连接的方式,从用于存储数据的另一服务器(例如数据库服务器)中获取存储于其中的样本。再例如,执行主体可以获取由屏下指纹采集设备等采集的样本,并将这些样本存储在本地,从而生成样本集。
在具体实现中,可以在屏下指纹采集设备上,同时点亮多个点光源,对参与指纹采集人员的不同手指进行多次采集,通过设备内部的成像模块,生成原始图像,如图4中左图所示。该原始图像例如可以为16位png格式的图像。
模糊图像可以为屏下指纹采集设备直接生成的原始图像,还可以是对原始图像进行预处理得到的图像,本公开对此不作限定。
图4中的左图和右图分别示出的是同一指纹的原始图像和清晰图像。本实施例中,原始图像与清晰图像可以均为单一通道的灰度图像。在实际应用中,原始图像与清晰图像还可以为多通道的彩色图像,比如RGB图像等。
参照图5示出了获得清晰图像的流程示意图。如图5所示,可以在多个 点光源同时点亮并采集完某一指纹(如手指指纹)的初始图像之后,确保被采集人员的手指在屏幕上保持不动,然后依次点亮各点光源,获取单点光源对应的指纹图像,通过对多个单点光源对应的指纹图像进行裁剪和拼接,最终得到该指纹的清晰图像。图5中示出的点光源数量为4个,分别为点光源1、点光源2、点光源3和点光源4,依次点亮4个点光源,获得4张单点光源对应的指纹图像,对4张单点光源对应的指纹图像进行裁剪拼接后得到一张清晰图像。
在一种可选的实现方式中,步骤S31具体可以包括:首先获取同一指纹的原始图像;然后对原始图像进行预处理,获得模糊图像;其中,预处理包括以下至少之一:图像分割、尺寸裁剪、翻转、亮度增强、噪声处理和归一化处理。
本实现方式中,模糊图像为对原始图像进行预处理得到的图像。
如图6所示,通过同时点亮多个点光源获取的原始图像中,不仅包含了光敏元件接收指纹的反射光而生成的指纹信息(位于图6中的a区域);还存在周围环境光引入的噪声信息(位于图6中的b区域)以及点光源附近的光信息(位于图6中的c区域)。其中,a区域内包含了主要的指纹信息,b区域内存在大量环境光噪声以及少量微弱的指纹信息,c区域内存在较强的光源信号和少量的指纹信息。在训练卷积神经网络之前,可以首先对原始图像进行图像分割,获得a区域对应的第一图像(如图7中的a所示)、b区域对应的第二图像(如图7中的b所示)以及c区域对应的第三图像(如图7中的c所示)。
其中,第一图像、第二图像和第三图像分别包含原始图像中不同区域的图像信息。通过将原始图像按区域划分为第一图像、第二图像和第三图像,可以实现主要数据与次要数据的剥离,降低环境光和点光源对指纹图像的影响。
另外,发明人还发现,当像素值的取值范围为0~65535时,包含主要指纹信息的a区域内各像素的像素值大部分分布在10000以下,也就是a区域内的像素值主要位于低数值范围内,而b区域尤其是c区域内的像素值则位于较高数值范围内。因此,为了更多地获取a区域内的指纹信息,防止主要指纹信息丢失,可以对图像分割得到的第一图像、第二图像和第三 图像分别进行归一化处理,图7中的a示出的是归一化处理后的第一图像,图7中的b示出的是归一化处理后的第二图像,图7中的c示出的是归一化处理后的第三图像。
本实施例中,模糊图像包括归一化处理后的第一图像、第二图像和第三图像。具体地,模糊图像可以由归一化处理后的第一图像、第二图像和第三图像在通道维度上拼接得到的三通道图像。通过在通道维度上对不同区域的图像进行拼接,可以在多个维度上提取到更多有效的指纹信息,提高后续指纹识别的准确度。
在具体实现中,对原始图像进行图像分割可以采用阈值分割法,或者边缘检测法等,本公开对此不作限定。
原始图像包括第一像素的第一像素值。在第一种实现方式中,对原始图像进行图像分割,获得第一图像、第二图像和第三图像的步骤,可以包括:
若第一像素位于预设区域范围外,第一像素值大于或等于第一阈值,且小于或等于第二阈值,则确定第一图像中第一像素的像素值为第一像素值;若第一像素位于预设区域范围外,第一像素值小于第一阈值,且大于第二阈值,则确定第一图像中第一像素的像素值为0。
若第一像素位于预设区域范围外,第一像素值大于或等于第三阈值,且小于或等于第四阈值,则确定第二图像中第一像素的像素值为第一像素值;若第一像素位于预设区域范围外,第一像素值小于第三阈值,且大于第四阈值,则确定第二图像中第一像素的像素值为0。
其中,第三阈值大于第二阈值。即区域b的像素值整体高于区域a的像素值。
若第一像素位于预设区域范围内,则确定第三图像中第一像素的像素值为第一像素值。
具体地,第一图像对应a区域,可以按照以下公式将a区域从原始图像中分割出来。
$$I^{a}_{(x,y)}=\begin{cases}I_{(x,y)}, & \min_a\le I_{(x,y)}\le \max_a\\ 0, & \text{其他}\end{cases}$$
其中，$I^{a}_{(x,y)}$代表第一图像中坐标(x,y)处的像素值，$I_{(x,y)}$代表原始图像中坐标(x,y)处的像素值，$\min_a$为第一阈值，$\max_a$为第二阈值。
其中,第一阈值和第二阈值的具体数值可以通过在a区域中人为选中一块较平滑的区域,对原始图像在该区域内的像素值进行统计和计算,确定a区域最小值和a区域最大值,第一阈值可以为多个原始图像的a区域最小值的平均值,第二阈值可以为多个原始图像的a区域最大值的平均值。
需要说明的是,通过上述公式可以将a区域从原始图像中分割出来,类似于抠图,在第一图像中,b区域和c区域的像素值均为0。
具体地,第二图像对应b区域,可以按照以下公式将b区域从原始图像中分割出来。
$$I^{b}_{(x,y)}=\begin{cases}I_{(x,y)}, & \min_b\le I_{(x,y)}\le \max_b\\ 0, & \text{其他}\end{cases}$$
其中，$I^{b}_{(x,y)}$代表第二图像中坐标(x,y)处的像素值，$I_{(x,y)}$代表原始图像中坐标(x,y)处的像素值，$\min_b$为第三阈值，$\max_b$为第四阈值。
其中,第三阈值和第四阈值的具体数值可以通过在b区域中人为选中一块较平滑的区域,对原始图像在该区域内的像素值进行统计和计算,确定b区域最小值和b区域最大值,第三阈值可以为多个原始图像的b区域最小值的平均值,第四阈值可以为多个原始图像的b区域最大值的平均值。
需要说明的是,通过上述公式可以将b区域从原始图像中分割出来,类似于抠图,在第二图像中,a区域和c区域的像素值均为0。
第三图像对应c区域的分割,可以根据指纹图像中点光源的位置进行。在点光源固定的情况下,预设区域的坐标也是固定的,可以直接测量点光源的坐标和光点半径来确定预设区域,进而实现c区域的分割。在第三图像中,a区域和b区域的像素值均为0。
在第二种实现方式中,对原始图像进行图像分割,获得第一图像、第二图像和第三图像的步骤,可以包括:对原始图像进行边缘检测,根据检测到的边缘的位置和长度,将原始图像分割为第一图像、第二图像和第三图像。
在具体实现中,可以采用拉普拉斯边缘检测算法,对原始图像进行边缘检测,对检测到的边缘的长度和位置进行筛选,将最终提取到的边缘作为各区域的边界进行分割。
在亮态采集环境下,拉普拉斯边缘检测算法可以检测到区域a和区域b之间的边界,区域a与区域c之间的边界,还可能会检测到噪声引起的边界以及有效识别区域的边界等。进一步地,可以根据边界长度筛选掉噪声引起的边界,根据边界位置可以筛选掉有效识别区域的边界。由于边缘检测速度比较快,因此采用边缘检测法进行图像分割,可以提高分割效率。
假设待处理图像为第一图像、第二图像和第三图像中的任意一个,待处理图像包括第二像素的第二像素值。在具体实现中,对第一图像、第二图像和第三图像分别进行归一化处理的步骤,可以包括:首先确定待处理图像所包含的所有像素值中的最大值和最小值;然后根据最大值、最小值以及第二像素值,确定归一化处理后的待处理图像中的第二像素的像素值。
具体地,可以分别计算待处理图像所包含的所有像素值的最大值和最小值,假设最大值为max,最小值为min,待处理图像中第二像素的第二像素值I,归一化后第二像素的像素值为I norm=(I-min)/(max-min),这样可以将待处理图像中的像素值都归一化到0~1的数值范围内。
步骤S32:将模糊图像输入卷积神经网络中,卷积神经网络中的编码网络对模糊图像进行下采样和特征提取,输出多个特征图,卷积神经网络中的解码网络对特征图进行上采样和特征提取,输出与模糊图像对应的预测图像。
其中,编码网络包括多个编码层级,解码网络包括多个解码层级,编码网络中第F个编码层级处理得到的特征图与解码网络中第G个解码层级处理得到的特征图融合后,作为解码网络中第G+1个解码层级的输入。
第F个编码层级处理得到的特征图与第G个解码层级处理得到的特征图具有相同的分辨率,F和G均为正整数。
卷积神经网络(Convolutional Neural Networks,CNN)是一种使用例如图像作为输入和输出,并通过滤波器(卷积核)来替代标量权重的神经网络结构。卷积过程可以看作是使用一个可训练的滤波器对一个输入的图像或卷积特征平面(feature map)做卷积,输出一个卷积特征平面,卷积特征平面还可以称为特征图。卷积层是指卷积神经网络中对输入信号进行卷积处理的神经元层。在卷积神经网络的卷积层中,一个神经元只与部分相邻层的神经元连接。卷积层可以对输入图像应用若干个卷积核,以提取输入图像的多种类型的特征。每个卷积核可以提取一种类型的特征。卷积核一般以随机大小的矩阵的形式 初始化,在卷积神经网络的训练过程中,卷积核将通过学习以得到合理的权值。在同一卷积层中,可以使用多个卷积核来提取不同的图像信息。
通过将编码网络中第F个编码层级处理得到的特征图与解码网络中第G个解码层级处理得到的特征图融合后,输入到解码网络中第G+1个解码层级,可以在编码网络和解码网络之间实现跳跃连接。通过编码网络和解码网络之间的跳跃连接可以增加解码网络对图像细节的保留,可以将编码网络在下采样过程中损失的图像细节和信息传递至解码网络,使得解码网络在上采样恢复空间分辨率的过程中,可以利用这些信息生成出更准确的图像,从而提高从模糊图像中提取清晰图像的准确性。
下采样操作可以包括:最大值合并、平均值合并、随机合并、欠采样例如选择固定的像素、解复用输出如将输入图像拆分为多个更小的图像等,本公开对此并不限定。
上采样操作可以包括:最大值合并、跨度转置卷积(strides transposed convolutions)、插值等,本公开对此并不限定。
在编码网络中,可以通过多次下采样可以逐步缩小特征图的空间维度可以扩大感受野,使得编码网络可以更好地提取不同尺度的局部和全局特征,而且下采样可以对提取的特征图进行压缩,从而节省计算量和内存的占用,并提高处理速度。
在解码网络中,可以通过多次上采样将编码网络输出的多个特征图的空间分辨率恢复到与模糊图像一致。
步骤S33:根据预测图像、清晰图像以及预设的损失函数,计算卷积神经网络的损失值,以最小化损失值为目标,调整卷积神经网络的参数。
其中,损失函数(loss function)是用于衡量预测图像和清晰图像的差异的重要方程。比如,损失函数的输出值(loss)越高表示差异越大。
在一种可选的实现方式中,可以按照以下公式计算卷积神经网络的损失值:
$$\mathcal{L}=\mathcal{L}_{1}+\lambda\,\mathcal{L}_{E}$$
$$\mathcal{L}_{1}=\frac{1}{W\times H\times C}\sum_{x=1}^{W}\sum_{y=1}^{H}\sum_{z=1}^{C}\left|Y_{(x,y,z)}-\hat{Y}_{(x,y,z)}\right|$$
$$\mathcal{L}_{E}=\frac{1}{W\times H\times C}\sum_{x=1}^{W}\sum_{y=1}^{H}\sum_{z=1}^{C}\left|E(Y)_{(x,y,z)}-E(\hat{Y})_{(x,y,z)}\right|$$
其中，$\mathcal{L}$为损失值，Y为预测图像，$\hat{Y}$为清晰图像，W为预测图像的宽度，H为预测图像的高度，C为预测图像的通道数，E(Y)为预测图像的边缘图，$E(\hat{Y})$为清晰图像的边缘图，λ大于或等于0且小于或等于1，x为大于或等于1且小于或等于W的正整数，y为大于或等于1且小于或等于H的正整数，z为大于或等于1且小于或等于C的正整数。
其中，E(Y)可以是根据索贝尔边缘提取算法获取的预测图像的边缘图，$E(\hat{Y})$可以是根据索贝尔边缘提取算法获取的清晰图像的边缘图。
由于$\mathcal{L}_{1}$可以引导网络恢复清晰图像的低频信息，$\mathcal{L}_{E}$有利于增强原始图像的边缘信息，因此在本实现方式中，使用$\mathcal{L}_{1}$与$\mathcal{L}_{E}$的加权和作为损失函数，可以提高图像提取的效果。
在具体实现中，可以利用AdamW优化器，根据损失值优化卷积神经网络的参数。初始学习率可以设置为$10^{-4}$，训练数据的batch size可以设置为48。
在具体实现中,可以通过判断卷积神经网络是否收敛来确定是否结束训练,其中,判断卷积神经网络是否收敛可以通过以下方式中的任意一种:判断更新卷积神经网络的参数的次数是否达到迭代阈值;判断卷积神经网络的损失值是否低于损失阈值。其中,迭代阈值可以是预先设置的迭代次数,比如,更新卷积神经网络的参数的次数大于迭代阈值,则结束训练。其中,损失阈值可以是预先设置的,比如,若损失函数计算得到的损失值小于损失阈值,则结束训练。
步骤S34:将完成参数调整的卷积神经网络确定为图像处理模型。
在本实施例中,响应于确定卷积神经网络训练完成,可以将训练后的卷积神经网络确定为图像处理模型。该图像处理模型可以用于从模糊指纹图像中提取清晰指纹图像。
本实施例提供的模型训练方法,通过对卷积神经网络进行训练,能够得到一种可以用于提取清晰指纹图像的模型。本实施例提供的卷积神经网络包括带有跳跃连接的编码网络和解码网络,通过编码网络和解码网络之间的跳跃连接可以增加解码网络对图像细节的保留,从而提高从模糊图像中提取清晰图像的准确性,提高图像处理效果。
本实施例中,卷积神经网络的具体结构可以根据实际需求进行设定。
在一种可选的实现方式中,各编码层级可以包括第一卷积块和/或下采样块。第一卷积块用于对输入的特征矩阵进行特征提取。下采样块用于对输入的特征图进行下采样。
各解码层级可以包括第二卷积块和/或上采样块。第二卷积块用于对输入的特征矩阵进行特征提取。上采样块用于对输入的特征图进行上采样。
其中,第一卷积块、下采样块、第二卷积块以及上采样块中的至少一个包括至少一组非对称卷积核。
非对称卷积核例如可以包括1×k卷积核和k×1卷积核。其中k值大于或等于2,k值可以根据需求设定,例如可以为5。
本实现方式中,通过采用非对称卷积核进行特征提取或采样处理,可以大幅减少计算量,从而提高处理速度。通过采用非对称卷积核分别进行横向卷积和纵向卷积,可以学习到图像中的横向梯度和纵向梯度,有助于提取指纹图像中信息的变化。
如图8所示,编码网络可以包括N个编码模块,如图8所示的CM-1、CM-2、……、CM-N。其中,N可以为正整数,或者N可以大于或等于2且小于或等于20,例如N可以取值为8、10、12、15等等,本公开对N的具体数值不作限定。
各编码模块可以包括M个编码层级。其中,M可以为正整数,或者M可以大于或等于2且小于或等于8,如图8所示的M值为3,即各编码模块包括3个编码层级,分别为第一个编码层级a1、第二个编码层级a2和第三个编码层级a3。本公开对M的具体数值不作限定。
具体地,任意一个编码模块的第一个编码层级a1可以包括一个或多个第一卷积块。任意一个编码模块的第i个编码层级可以包括一个或多个第一卷积块和一个下采样块。其中,i大于或等于2,且小于或等于M。
解码网络可以包括M个解码层级,即解码网络中解码层级的数量与各编码模块中编码层级的数量相同。如图8所示的解码网络中包括三个解码层级,分别为第一个解码层级b1,第二个解码层级b2以及第三个解码层级b3。
在解码网络中,第一个解码层级至第M-1个解码层级均可以包括一个或多个第二卷积块和一个上采样块。第M个解码层级可以包括一个或多个第二卷积块。
图8所示各编码模块中包括两个下采样块,各下采样块可以对输入的特征图实现2倍下采样,解码网络中包括两个上采样块,各上采样块可以对输入的特征图实现2倍上采样。这样,可以确保卷积神经网络输出的图像与输入卷积神经网络的图像具有相同的分辨率。
本实现方式中,卷积神经网络中的编码网络对模糊图像进行下采样和特征提取,输出多个特征图的步骤,可以包括:
N个编码模块中第一个编码模块CM-1的第一个编码层级a1对模糊图像进行特征提取;
第一个编码模块CM-1的第i个编码层级对第一个编码模块的第i-1个编码层级处理得到的特征图依次进行下采样和特征提取;
N个编码模块中第j个编码模块的第一个编码层级对第j-1个编码模块的第一个编码层级处理得到的特征图进行特征提取;其中,j大于或等于2,且小于或等于N;
第j个编码模块的第i个编码层级对第j个编码模块的第i-1个编码层级处理得到的特征图进行下采样,并将下采样得到的特征图与第j-1个编码模块的第i个编码层级处理得到的特征图进行融合,并对融合的结果进行特征提取。
其中,将下采样得到的特征图与第j-1个编码模块的第i个编码层级处理得到的特征图进行融合,并对融合的结果进行特征提取的步骤,可以包括:将下采样得到的特征图与第j-1个编码模块的第i个编码层级处理得到的特征图在通道维度上进行拼接,并对拼接的结果进行特征提取。
模糊图像可以为在通道维度上对第一图像、第二图像和第三图像进行拼接处理后得到的图像。在具体实现中,模糊图像的矩阵尺寸可以为B×3×H×W,其中B为一个训练batch中原始图像的个数,H为原始图像的高,W为原始图像的宽。输出的清晰图像为B×1×H×W的矩阵。
在第一个编码模块CM-1内,可以通过第一个编码层级a1中的第一卷积块对模糊图像进行特征提取,得到第一特征图;第二个编码层级a2中的下采样块对第一特征图进行第一次下采样,第二个编码层级a2中的第一卷积块对第一次下采样得到的特征图进行特征提取,得到第二特征图;第三个编码层级a3中的下采样块对第二特征图进行第二次下采样,第三个编码层级a3中 的第一卷积块对第二次下采样得到的特征图进行特征提取,得到第三特征图。
在第二个编码模块CM-2内,第一个编码层级a1中的第一卷积块对第一个编码模块CM-1输出的第一特征图进行特征提取;第二个编码层级a2中的下采样块对第一个编码层级a1输出的特征图进行第一次下采样,第二个编码层级a2中的第一卷积块对第一次下采样得到的特征图与第一个编码模块CM-1输出的第二特征图进行特征融合,并对融合的结果进行特征提取;第三个编码层级a3中的下采样块对第二个编码层级a2输出的特征图进行第二次下采样,第三个编码层级a3中的第一卷积块对第二次下采样得到的特征图与第一个编码模块CM-1输出的第三特征图进行特征融合,并对融合的结果进行特征提取。
假设第N-1个编码模块CM-N-1内,第一个编码层级a1输出的特征图为第四特征图,第二个编码层级a2输出的特征图为第五特征图,第三个编码层级a3输出的特征图为第六特征图。
在第N个编码模块CM-N内,第一个编码层级a1中的第一卷积块对编码模块CM-N-1输出的第四特征图进行特征提取,得到第七特征图;第二个编码层级a2中的下采样块对第一个编码层级a1输出的特征图进行第一次下采样,第二个编码层级a2中的第一卷积块对第一次下采样得到的特征图与编码模块CM-N-1输出的第五特征图进行特征融合,并对融合的结果进行特征提取,得到第八特征图;第三个编码层级a3中的下采样块对第二个编码层级a2输出的特征图进行第二次下采样,第三个编码层级a3中的第一卷积块对第二次下采样得到的特征图与编码模块CM-N-1输出的第六特征图进行特征融合,并对融合的结果进行特征提取,得到第九特征图。
其中,编码网络输出的多个特征图包括N个编码模块中第N个编码模块的各个编码层级处理得到的特征图。
相应地,卷积神经网络中的解码网络对特征图进行上采样和特征提取,输出与模糊图像对应的预测图像的步骤,可以包括:
M个解码层级中第一个解码层级对第N个编码模块的第M个编码层级处理得到的特征图进行特征提取,并对提取得到的特征图进行上采样;
解码网络将M个解码层级中第u-1个解码层级处理得到的特征图与第N个编码模块的第M-u+1个编码层级处理得到的特征图进行融合,得到第一融 合特征图;其中,u大于或等于2,且小于或等于M-1;M的取值可以大于或等于3;
解码网络将第一融合特征图输入M个解码层级中第u个解码层级,第u个解码层级对第一融合特征图依次进行特征提取和上采样;
解码网络将M个解码层级中第M-1个解码层级处理得到的特征图与第N个编码模块的第一个编码层级处理得到的特征图进行融合,得到第二融合特征图;
解码网络将第二融合特征图输入M个解码层级中第M个解码层级,第M个解码层级对第二融合特征图进行特征提取,得到预测图像。
在具体实现中,将M个解码层级中第u-1个解码层级处理得到的特征图与第N个编码模块的第M-u+1个编码层级处理得到的特征图进行融合,得到第一融合特征图的步骤,可以包括:将M个解码层级中第u-1个解码层级处理得到的特征图与第N个编码模块的第M-u+1个编码层级处理得到的特征图在通道维度上进行拼接,得到第一融合特征图。
将M个解码层级中第M-1个解码层级处理得到的特征图与第N个编码模块的第一个编码层级处理得到的特征图进行融合,得到第二融合特征图的步骤,可以包括:将M个解码层级中第M-1个解码层级处理得到的特征图与第N个编码模块的第一个编码层级处理得到的特征图在通道维度上进行拼接,得到第二融合特征图。
如前所述,在第N个编码模块CM-N内,第一个编码层级a1输出第七特征图;第二个编码层级a2输出第八特征图;第三个编码层级a3输出第九特征图。
如图8所示,在解码网络内,第一个解码层级b1中的第二卷积块对第九特征图进行特征提取,第一个解码层级b1中的上采样块对特征提取的结果进行第一次上采样;解码网络将第一个解码层级b1输出的特征图与第八特征图进行第一次融合,并将第一次融合得到特征图输入第二个解码层级b2;第二个解码层级b2中的第二卷积块对第一次融合得到特征图进行特征提取,第二个解码层级b2中的上采样块对特征提取的结果进行第二次上次样;解码网络将第二个解码层级b2输出的特征图与第七特征图进行第二次融合,并将第二次融合得到的特征图输入第三个解码层级b3;第三个解码层级b3 中的第二卷积块对第二次融合得到的特征图进行特征提取,输出预测图像。
参照图9示出了一种第一卷积块的结构示意图。如图9所示,第一卷积块可以包括第一卷积层和第二卷积层,第一卷积层可以包括非对称卷积核,第二卷积层可以包括1×1卷积核。
在第一卷积块中,可以通过拼接层(如图9中所示的cat)对第一卷积层内的一对非对称卷积核分别处理得到的特征图进行融合,之后通过第二卷积层对融合结果进行通道数压缩,减少计算量。之后,再由InstanceNorm层采用InstanceNorm方法对第二卷积层输出的结果进行归一化处理,之后再由PRelu层采用激活函数PRelu对输入的特征图进行处理,经过激活函数处理后输出第一卷积块。
第二卷积块可以与第一卷积块的结构相同,当然也可以不同。
参照图10示出了一种下采样块的结构示意图。如图10所示,下采样块可以包括最大池化层和最小池化层,最大池化层和最小池化层均包括一组非对称卷积核。即下采样块包括空间可分离的最大池化层和最小池化层,可以设置最大池化层和最小池化层的滤波核尺寸k=5。最大池化层和最小池化层所包含的非对称卷积核可以相同或不同。
在下采样块中,可以通过拼接层(如图10中所示的cat)对最大池化层输出的特征图和最小池化层输出的特征图进行融合,融合后输出下采样块。
上采样块用于执行上采样操作,上采样操作具体可以包括:PixelShuffle、最大值合并、跨度转置卷积(strides transposed convolutions)、插值(例如,内插值、两次立方插值等)等。然而,本公开对此并不限定。
如图8所示,卷积神经网络的结构如交叉网格状,可以加深深层特征与浅层特征之间的融合,充分利用了原始图像中有限的指纹信息,提高从原始图像中提取到清晰图像的准确性。
本实现方式中,卷积神经网络使用空间可分离卷积实现大部分卷积操作,通过使用空间可分离卷积进行特征提取或采样处理,可以大幅减少计算量,从而提高处理速度,有助于实现对输入图像的实时处理。并且,空间可分离卷积可以学习到模糊图像中的横向梯度和纵向梯度,有助于提取指纹中信息的变化,提高从模糊图像中提取到清晰图像的准确度。
在另一种可选的实现方式中,编码层级和解码层级中的卷积核均为对称 卷积核。
本实现方式中，编码网络可以包括P个编码层级。如图11示出的编码网络包括3个编码层级，即第一个编码层级、第二个编码层级和第三个编码层级。
图11中第二个编码层级的左侧虚线框内为第二个编码层级的具体结构,第二个编码层级具体可以包括:InstanceNorm层、PRelu层、第三卷积层、InstanceNorm层、PRelu层和下采样层。
其中,InstanceNorm层是采用InstanceNorm方法对输入的特征图进行归一化处理。
PRelu层是采用激活函数PRelu对输入的特征图进行处理。
第三卷积层可以包括5×5卷积核,用于对输入的特征图进行特征提取。
下采样层可以包括4×4卷积核的卷积层,该卷积层的步长(stride)可以为2,因此第二个编码层级输出的特征图的宽高比输入的特征图各缩小2倍。
第一个编码层级、第二个编码层级和第三个编码层级的具体结构可以相同。
本实现方式中，解码网络可以包括P个解码层级，即解码层级数量与编码层级的数量相同。如图11示出的解码网络包括3个解码层级，即第一个解码层级、第二个解码层级和第三个解码层级。
图11中第二个解码层级的右侧的虚线框内为第二个解码层级的具体结构,第二个解码层级具体可以包括:InstanceNorm层、PRelu层、上采样层、InstanceNorm层、PRelu层、第四卷积层。
其中,InstanceNorm层是采用InstanceNorm方法对输入的特征图进行归一化处理。PRelu层是采用激活函数PRelu对输入的特征图进行处理。
上采样层可以包括4×4转置卷积核的卷积层,该卷积层的步长(stride)可以为2,因此第二个解码层级输出的特征图的宽高比输入的特征图各扩大2倍。
第四卷积层可以包括5×5卷积核,用于对输入的特征图进行特征提取。
第一个解码层级、第二个解码层级和第三个解码层级的具体结构可以相同。
本实现方式中,卷积神经网络中的编码网络对模糊图像进行下采样和特 征提取,输出多个特征图的步骤,可以包括:
P个编码层级中第一个编码层级对模糊图像依次进行特征提取和下采样;
P个编码层级中第q个编码层级对第q-1个编码层级处理得到的特征图依次进行特征提取和下采样。
其中,q大于或等于2,且小于或等于P,编码网络输出的多个特征图包括P个编码层级处理得到的特征图。
在具体实现中，第一个编码层级对模糊图像依次进行特征提取和下采样，得到第十特征图；第二个编码层级对第十特征图依次进行特征提取和下采样，得到第十一特征图；第三个编码层级对第十一特征图依次进行特征提取和下采样，得到第十二特征图。
其中,模糊图像对应的矩阵尺寸为B×3×H×W,其中B为一个训练batch中原始图像的个数,H为原始图像的高,W为原始图像的宽。第十特征图对应的矩阵尺寸为B×64×H/2×W/2,第十一特征图对应的矩阵尺寸为B×128×H/4×W/4,第十二特征图对应的矩阵尺寸为B×256×H/8×W/8。
解码网络还可以包括第三卷积块,包括InstanceNorm层、PRelu层、5×5卷积核的卷积层、InstanceNorm层、PRelu层、5×5卷积核的卷积层,输入和输出第三卷积块的特征矩阵的宽高尺寸保持不变。
本实现方式中,卷积神经网络中的解码网络对特征图进行上采样和特征提取,输出与模糊图像对应的预测图像的步骤,可以包括:
通过第三卷积块对P个编码层级中第P个编码层级处理得到的特征图进行特征提取,得到计算特征图;
解码网络将计算特征图与第P个编码层级处理得到的特征图进行融合,得到第三融合特征图;
解码网络将第三融合特征图输入P个解码层级中的第一个解码层级,第一个解码层级对第三融合特征图依次进行上采样和特征提取;
解码网络将P个解码层级中第r-1个解码层级处理得到的特征图与P个编码层级中第P-r+1个编码层级处理得到的特征图进行融合,得到第四融合特征图;
解码网络将第四融合特征图输入至P个解码层级中的第r个解码层级,第r个解码层级对第四融合特征图依次进行上采样和特征提取。
其中,r大于或等于2,且小于或等于P,预测图像为P个解码层级中的第P个解码层级处理得到的特征图。
其中,将计算特征图与第P个编码层级处理得到的特征图进行融合,得到第三融合特征图的步骤,可以包括:将计算特征图与第P个编码层级处理得到的特征图在通道维度上进行拼接,得到第三融合特征图。
将P个解码层级中第r-1个解码层级处理得到的特征图与P个编码层级中第P-r+1个编码层级处理得到的特征图进行融合,得到第四融合特征图的步骤,可以包括:将P个解码层级中第r-1个解码层级处理得到的特征图与P个编码层级中第P-r+1个编码层级处理得到的特征图在通道维度上进行拼接,得到第四融合特征图。
在具体实现中,结合图8,第三卷积块对第十二特征图进行特征提取,得到计算特征图,解码网络将计算特征图与第十二特征图进行融合,得到第三融合特征图。第三融合特征图作为第一个解码层级的输入,第一个解码层级对第三融合特征图依次进行上采样和特征提取,得到第十三特征图;解码网络将第十三特征图和第十一特征图进行融合,得到第十四特征图,并将第十四特征图输入第二个解码层级;第二个解码层级对第十四特征图依次进行上采样和特征提取,得到第十五特征图;解码网络将第十五特征图和第十特征图进行融合,得到第十六特征图,并将第十六特征图输入第三个解码层级;第三个解码层级对第十六特征图依次进行上采样和特征提取,得到预测图像。
图12示意性地示出了一种图像处理方法的流程图,如图12所示,该方法可以包括以下步骤。
步骤S1201:获取模糊指纹图像。
当模型训练过程中采用的模糊图像为对原始图像进行预处理获得的结果时,获取模糊指纹图像的步骤,包括:获取原始指纹图像;对原始指纹图像进行预处理,获得模糊指纹图像;其中,预处理包括以下至少之一:图像分割、尺寸裁剪、翻转、亮度增强、噪声处理和归一化处理。
获取原始指纹图像的过程与获取原始图像的过程相同,对原始指纹图像进行预处理的过程与对原始图像进行预处理的过程相同,这里不再赘述。
本实施例的执行主体可以为计算机设备,该计算机设备具有图像处理装 置,通过该图像处理装置来执行本实施例提供的图像处理方法。其中,计算机设备例如可以为智能手机、平板电脑、个人计算机等,本实施例对此不作限定。
本实施例的执行主体可以通过多种方式来获取模糊指纹图像。例如,执行主体可以获取由多点光源屏下指纹采集设备所采集的原始指纹图像,然后对获取的原始指纹图像进行预处理,获得模糊指纹图像。
步骤S1202:将模糊指纹图像输入至任一实施例提供的模型训练方法训练得到的图像处理模型,得到模糊指纹图像对应的清晰指纹图像。
其中,图像处理模型可以是预先训练好的,也可以是在图像处理的过程中训练得到的,本实施例对此不作限定。
本实施例提供的图像处理方法,通过将模糊指纹图像输入图像处理模型中,可以提取到高质量的清晰指纹图像,将指纹的指纹脊和指纹谷信息进行提取和增强,该清晰指纹图像可以直接应用于指纹识别。与依次点亮点光源获取清晰指纹图像的相关技术相比,本实施例可以提高清晰指纹图像的获取效率。
图13示意性地示出了一种模型训练装置框图。参照图13,可以包括:
获取模块1301,被配置为获取样本集,所述样本集中的样本包括同一指纹的模糊图像和清晰图像;
预测模块1302,被配置为将所述模糊图像输入卷积神经网络中,所述卷积神经网络中的编码网络对所述模糊图像进行下采样和特征提取,输出多个特征图,所述卷积神经网络中的解码网络对所述特征图进行上采样和特征提取,输出与所述模糊图像对应的预测图像;其中,所述编码网络包括多个编码层级,所述解码网络包括多个解码层级,所述编码网络中第F个编码层级处理得到的特征图与所述解码网络中第G个解码层级处理得到的特征图融合后,作为所述解码网络中第G+1个解码层级的输入,所述第F个编码层级处理得到的特征图与所述第G个解码层级处理得到的特征图具有相同的分辨率,所述F和所述G均为正整数;
训练模块1303,被配置为根据所述预测图像、所述清晰图像以及预设的损失函数,计算所述卷积神经网络的损失值,以最小化所述损失值为目标, 调整所述卷积神经网络的参数;
确定模块1304,被配置为将完成参数调整的卷积神经网络确定为图像处理模型。
关于上述实施例中的装置,其中各个模块执行操作的具体方式已经在有关模型训练方法的实施例中进行了详细描述,例如,使用软件、硬件、固件等方式实现,此处将不做详细阐述说明。
图14示意性地示出了一种图像处理装置框图。参照图14,可以包括:
获取模块1401,被配置为获取模糊指纹图像;
提取模块1402,被配置为将所述模糊指纹图像输入至如任一实施例提供的模型训练方法训练得到的图像处理模型,得到所述模糊指纹图像对应的清晰指纹图像。
关于上述实施例中的装置,其中各个模块执行操作的具体方式已经在有关图像处理方法的实施例中进行了详细描述,例如,使用软件、硬件、固件等方式实现,此处将不做详细阐述说明。
以上所描述的装置实施例仅仅是示意性的,其中所述作为分离部件说明的单元可以是或者也可以不是物理上分开的,作为单元显示的部件可以是或者也可以不是物理单元,即可以位于一个地方,或者也可以分布到多个网络单元上。可以根据实际的需要选择其中的部分或者全部模块来实现本实施例方案的目的。本领域普通技术人员在不付出创造性的劳动的情况下,即可以理解并实施。
本公开的各个部件实施例可以以硬件实现,或者以在一个或者多个处理器上运行的软件模块实现,或者以它们的组合实现。本领域的技术人员应当理解,可以在实践中使用微处理器或者数字信号处理器(DSP)来实现根据本公开实施例的计算处理设备中的一些或者全部部件的一些或者全部功能。本公开还可以实现为用于执行这里所描述的方法的一部分或者全部的设备或者装置程序(例如,计算机程序和计算机程序产品)。这样的实现本公开的程序可以存储在计算机可读介质上,或者可以具有一个或者多个信号的形式。这样的信号可以从因特网网站上下载得到,或者在载体信号上提供,或者以任何其他形式提供。
例如,图15示出了可以实现根据本公开的方法的计算处理设备。该计算处理设备传统上包括处理器1010和以存储器1020形式的计算机程序产品或者非瞬态计算机可读介质。存储器1020可以是诸如闪存、EEPROM(电可擦除可编程只读存储器)、EPROM、硬盘或者ROM之类的电子存储器。存储器1020具有用于执行上述方法中的任何方法步骤的程序代码1031的存储空间1030。例如,用于程序代码的存储空间1030可以包括分别用于实现上面的方法中的各种步骤的各个程序代码1031。这些程序代码可以从一个或者多个计算机程序产品中读出或者写入到这一个或者多个计算机程序产品中。这些计算机程序产品包括诸如硬盘,紧致盘(CD)、存储卡或者软盘之类的程序代码载体。这样的计算机程序产品通常为如参考图16所述的便携式或者固定存储单元。该存储单元可以具有与图15的计算处理设备中的存储器1020类似布置的存储段、存储空间等。程序代码可以例如以适当形式进行压缩。通常,存储单元包括计算机可读代码1031’,即可以由例如诸如1010之类的处理器读取的代码,这些代码当由计算处理设备运行时,导致该计算处理设备执行上面所描述的方法中的各个步骤。
本说明书中的各个实施例均采用递进的方式描述,每个实施例重点说明的都是与其他实施例的不同之处,各个实施例之间相同相似的部分互相参见即可。
最后,还需要说明的是,在本文中,诸如第一和第二等之类的关系术语仅仅用来将一个实体或者操作与另一个实体或操作区分开来,而不一定要求或者暗示这些实体或操作之间存在任何这种实际的关系或者顺序。而且,术语“包括”、“包含”或者其任何其他变体意在涵盖非排他性的包含,从而使得包括一系列要素的过程、方法、商品或者设备不仅包括那些要素,而且还包括没有明确列出的其他要素,或者是还包括为这种过程、方法、商品或者设备所固有的要素。在没有更多限制的情况下,由语句“包括一个……”限定的要素,并不排除在包括所述要素的过程、方法、商品或者设备中还存在另外的相同要素。
以上对本公开所提供的一种模型训练方法、图像处理方法、计算处理设备及非瞬态计算机可读介质进行了详细介绍,本文中应用了具体个例对本公开的原理及实施方式进行了阐述,以上实施例的说明只是用于帮助理解本公 开的方法及其核心思想;同时,对于本领域的一般技术人员,依据本公开的思想,在具体实施方式及应用范围上均会有改变之处,综上所述,本说明书内容不应理解为对本公开的限制。
应该理解的是,虽然附图的流程图中的各个步骤按照箭头的指示依次显示,但是这些步骤并不是必然按照箭头指示的顺序依次执行。除非本文中有明确的说明,这些步骤的执行并没有严格的顺序限制,其可以以其他的顺序执行。而且,附图的流程图中的至少一部分步骤可以包括多个子步骤或者多个阶段,这些子步骤或者阶段并不必然是在同一时刻执行完成,而是可以在不同的时刻执行,其执行顺序也不必然是依次进行,而是可以与其他步骤或者其他步骤的子步骤或者阶段的至少一部分轮流或者交替地执行。
本领域技术人员在考虑说明书及实践这里公开的发明后,将容易想到本公开的其它实施方案。本公开旨在涵盖本公开的任何变型、用途或者适应性变化,这些变型、用途或者适应性变化遵循本公开的一般性原理并包括本公开未公开的本技术领域中的公知常识或惯用技术手段。说明书和实施例仅被视为示例性的,本公开的真正范围和精神由下面的权利要求指出。
应当理解的是,本公开并不局限于上面已经描述并在附图中示出的精确结构,并且可以在不脱离其范围进行各种修改和改变。本公开的范围仅由所附的权利要求来限制。
本文中所称的“一个实施例”、“实施例”或者“一个或者多个实施例”意味着,结合实施例描述的特定特征、结构或者特性包括在本公开的至少一个实施例中。此外,请注意,这里“在一个实施例中”的词语例子不一定全指同一个实施例。
在此处所提供的说明书中,说明了大量具体细节。然而,能够理解,本公开的实施例可以在没有这些具体细节的情况下被实践。在一些实例中,并未详细示出公知的方法、结构和技术,以便不模糊对本说明书的理解。
在权利要求中,不应将位于括号之间的任何参考符号构造成对权利要求的限制。单词“包含”不排除存在未列在权利要求中的元件或步骤。位于元件之前的单词“一”或“一个”不排除存在多个这样的元件。本公开可以借助于包括有若干不同元件的硬件以及借助于适当编程的计算机来实现。在列举了若干装置的单元权利要求中,这些装置中的若干个可以是通过同一个硬件项来 具体体现。单词第一、第二、以及第三等的使用不表示任何顺序。可将这些单词解释为名称。
最后应说明的是:以上实施例仅用以说明本公开的技术方案,而非对其限制;尽管参照前述实施例对本公开进行了详细的说明,本领域的普通技术人员应当理解:其依然可以对前述各实施例所记载的技术方案进行修改,或者对其中部分技术特征进行等同替换;而这些修改或者替换,并不使相应技术方案的本质脱离本公开各实施例技术方案的精神和范围。

Claims (20)

  1. 一种模型训练方法,其中,包括:
    获取样本集,所述样本集中的样本包括同一指纹的模糊图像和清晰图像;
    将所述模糊图像输入卷积神经网络中,所述卷积神经网络中的编码网络对所述模糊图像进行下采样和特征提取,输出多个特征图,所述卷积神经网络中的解码网络对所述特征图进行上采样和特征提取,输出与所述模糊图像对应的预测图像;其中,所述编码网络包括多个编码层级,所述解码网络包括多个解码层级,所述编码网络中第F个编码层级处理得到的特征图与所述解码网络中第G个解码层级处理得到的特征图融合后,作为所述解码网络中第G+1个解码层级的输入,所述第F个编码层级处理得到的特征图与所述第G个解码层级处理得到的特征图具有相同的分辨率,所述F和所述G均为正整数;
    根据所述预测图像、所述清晰图像以及预设的损失函数,计算所述卷积神经网络的损失值,以最小化所述损失值为目标,调整所述卷积神经网络的参数;
    将完成参数调整的卷积神经网络确定为图像处理模型。
  2. 根据权利要求1所述的模型训练方法,其中,各所述编码层级包括第一卷积块和/或下采样块,各所述解码层级包括第二卷积块和/或上采样块;
    其中,所述第一卷积块、所述下采样块、所述第二卷积块以及所述上采样块中的至少一个包括至少一组非对称卷积核。
  3. 根据权利要求2所述的模型训练方法,其中,所述编码网络包括N个编码模块,各所述编码模块包括M个所述编码层级,所述M和所述N均为正整数,所述卷积神经网络中的编码网络对所述模糊图像进行下采样和特征提取,输出多个特征图的步骤,包括:
    所述N个编码模块中第一个编码模块的第一个编码层级对所述模糊图像进行特征提取;
    所述第一个编码模块的第i个编码层级对所述第一个编码模块的第i-1个编码层级处理得到的特征图依次进行下采样和特征提取;其中,所述i大于或等于2,且小于或等于M;
    所述N个编码模块中第j个编码模块的第一个编码层级对第j-1个编码模块的第一个编码层级处理得到的特征图进行特征提取;其中,所述j大于或等于2,且小于或等于N;
    所述第j个编码模块的第i个编码层级对所述第j个编码模块的第i-1个编码层级处理得到的特征图进行下采样,并将下采样得到的特征图与所述第j-1个编码模块的第i个编码层级处理得到的特征图进行融合,并对融合的结果进行特征提取;
    其中,所述多个特征图包括所述N个编码模块中第N个编码模块的各个编码层级处理得到的特征图。
  4. 根据权利要求3所述的模型训练方法,其中,所述解码网络包括所述M个解码层级,所述卷积神经网络中的解码网络对所述特征图进行上采样和特征提取,输出与所述模糊图像对应的预测图像的步骤,包括:
    所述M个解码层级中第一个解码层级对所述第N个编码模块的第M个编码层级处理得到的特征图进行特征提取,并对提取得到的特征图进行上采样;
    将所述M个解码层级中第u-1个解码层级处理得到的特征图与所述第N个编码模块的第M-u+1个编码层级处理得到的特征图进行融合,得到第一融合特征图;其中,所述u大于或等于2,且小于或等于M-1;
    将所述第一融合特征图输入所述M个解码层级中第u个解码层级,所述第u个解码层级对所述第一融合特征图依次进行特征提取和上采样;
    将所述M个解码层级中第M-1个解码层级处理得到的特征图与所述第N个编码模块的第一个编码层级处理得到的特征图进行融合,得到第二融合特征图;
    将所述第二融合特征图输入所述M个解码层级中第M个解码层级,所述第M个解码层级对所述第二融合特征图进行特征提取,得到所述预测图像。
  5. 根据权利要求4所述的模型训练方法,其中,所述将下采样得到的特征图与所述第j-1个编码模块的第i个编码层级处理得到的特征图进行融合,并对融合的结果进行特征提取的步骤,包括:
    将下采样得到的特征图与所述第j-1个编码模块的第i个编码层级处理得到的特征图在通道维度上进行拼接,并对拼接的结果进行特征提取;
    所述将所述M个解码层级中第u-1个解码层级处理得到的特征图与所述第N个编码模块的第M-u+1个编码层级处理得到的特征图进行融合,得到第一融合特征图的步骤,包括:
    将所述M个解码层级中第u-1个解码层级处理得到的特征图与所述第N个编码模块的第M-u+1个编码层级处理得到的特征图在通道维度上进行拼接,得到第一融合特征图;
    所述将所述M个解码层级中第M-1个解码层级处理得到的特征图与所述第N个编码模块的第一个编码层级处理得到的特征图进行融合,得到第二融合特征图的步骤,包括:
    将所述M个解码层级中第M-1个解码层级处理得到的特征图与所述第N个编码模块的第一个编码层级处理得到的特征图在通道维度上进行拼接,得到第二融合特征图。
  6. 根据权利要求2所述的模型训练方法,其中,所述第一卷积块和所述第二卷积块均包括第一卷积层和第二卷积层,所述第一卷积层包括所述非对称卷积核,所述第二卷积层包括1×1卷积核;
    所述下采样块包括最大池化层和最小池化层,所述最大池化层和所述最小池化层均包括所述非对称卷积核;
    其中,所述非对称卷积核包括1×k卷积核和k×1卷积核,所述k大于或等于2。
  7. 根据权利要求1所述的模型训练方法,其中,所述编码层级和所述解码层级中的卷积核均为对称卷积核。
  8. 根据权利要求7所述的模型训练方法,其中,所述编码网络包括P个编码层级,所述卷积神经网络中的编码网络对所述模糊图像进行下采样和特征提取,输出多个特征图的步骤,包括:
    所述P个编码层级中第一个编码层级对所述模糊图像依次进行特征提取和下采样;
    所述P个编码层级中第q个编码层级对第q-1个编码层级处理得到的特征图依次进行特征提取和下采样;
    其中,所述q大于或等于2,且小于或等于P,所述多个特征图包括所述P个编码层级处理得到的特征图。
  9. 根据权利要求8所述的模型训练方法,其中,所述解码网络包括所述P个解码层级,所述卷积神经网络中的解码网络对所述特征图进行上采样和特征提取,输出与所述模糊图像对应的预测图像的步骤,包括:
    对所述P个编码层级中第P个编码层级处理得到的特征图进行特征提取,得到计算特征图;
    将所述计算特征图与所述第P个编码层级处理得到的特征图进行融合,得到第三融合特征图;
    将所述第三融合特征图输入所述P个解码层级中的第一个解码层级,所述第一个解码层级对所述第三融合特征图依次进行上采样和特征提取;
    将所述P个解码层级中第r-1个解码层级处理得到的特征图与所述P个编码层级中第P-r+1个编码层级处理得到的特征图进行融合,得到第四融合特征图;
    将所述第四融合特征图输入至所述P个解码层级中的第r个解码层级,所述第r个解码层级对所述第四融合特征图依次进行上采样和特征提取;
    其中,所述r大于或等于2,且小于或等于P,所述预测图像为所述P个解码层级中的第P个解码层级处理得到的特征图。
  10. 根据权利要求9所述的模型训练方法,其中,所述将所述计算特征图与所述第P个编码层级处理得到的特征图进行融合,得到第三融合特征图的步骤,包括:
    将所述计算特征图与所述第P个编码层级处理得到的特征图在通道维度上进行拼接,得到第三融合特征图;
    所述将所述P个解码层级中第r-1个解码层级处理得到的特征图与所述P个编码层级中第P-r+1个编码层级处理得到的特征图进行融合,得到第四融合特征图的步骤,包括:
    将所述P个解码层级中第r-1个解码层级处理得到的特征图与所述P个编码层级中第P-r+1个编码层级处理得到的特征图在通道维度上进行拼接,得到第四融合特征图。
  11. 根据权利要求1至10任一项所述的模型训练方法,其中,所述根据所述预测图像、所述清晰图像以及预设的损失函数,计算所述卷积神经网络的损失值的步骤,包括:
    按照以下公式计算所述损失值:
    $$\mathcal{L}=\mathcal{L}_{1}+\lambda\,\mathcal{L}_{E}$$
    $$\mathcal{L}_{1}=\frac{1}{W\times H\times C}\sum_{x=1}^{W}\sum_{y=1}^{H}\sum_{z=1}^{C}\left|Y_{(x,y,z)}-\hat{Y}_{(x,y,z)}\right|$$
    $$\mathcal{L}_{E}=\frac{1}{W\times H\times C}\sum_{x=1}^{W}\sum_{y=1}^{H}\sum_{z=1}^{C}\left|E(Y)_{(x,y,z)}-E(\hat{Y})_{(x,y,z)}\right|$$
    其中，所述$\mathcal{L}$为所述损失值，所述Y为所述预测图像，所述$\hat{Y}$为所述清晰图像，所述W为所述预测图像的宽度，所述H为所述预测图像的高度，所述C为所述预测图像的通道数，所述E(Y)为所述预测图像的边缘图，所述$E(\hat{Y})$为所述清晰图像的边缘图，所述λ大于或等于0，且小于或等于1，所述x为大于或等于1且小于或等于W的正整数，所述y为大于或等于1且小于或等于H的正整数，所述z为大于或等于1且小于或等于C的正整数。
  12. 根据权利要求1至10任一项所述的模型训练方法,其中,所述获取样本集的步骤,包括:
    获取所述同一指纹的原始图像;
    对所述原始图像进行预处理,获得所述模糊图像;其中,所述预处理包括以下至少之一:图像分割、尺寸裁剪、翻转、亮度增强、噪声处理和归一化处理。
  13. 根据权利要求12所述的模型训练方法,其中,所述对所述原始图像进行预处理,获得所述模糊图像的步骤,包括:
    对所述原始图像进行图像分割,获得第一图像、第二图像和第三图像,所述第一图像、所述第二图像和所述第三图像分别包含所述原始图像不同区域的信息;
    对所述第一图像、所述第二图像和所述第三图像分别进行归一化处理,所述模糊图像包括归一化处理后的所述第一图像、所述第二图像和所述第三图像。
  14. 根据权利要求13所述的模型训练方法,其中,所述原始图像包括第一像素的第一像素值,所述对所述原始图像进行图像分割,获得第一图像、第二图像和第三图像的步骤,包括:
    若所述第一像素位于预设区域范围外,所述第一像素值大于或等于第一阈值,且小于或等于第二阈值,则确定所述第一图像中所述第一像素的像素 值为所述第一像素值;
    若所述第一像素位于预设区域范围外,所述第一像素值小于所述第一阈值,且大于所述第二阈值,则确定所述第一图像中所述第一像素的像素值为0;
    若所述第一像素位于预设区域范围外,所述第一像素值大于或等于第三阈值,且小于或等于第四阈值,则确定所述第二图像中所述第一像素的像素值为所述第一像素值;
    若所述第一像素位于预设区域范围外,所述第一像素值小于所述第三阈值,且大于所述第四阈值,则确定所述第二图像中所述第一像素的像素值为0;
    若所述第一像素位于预设区域范围内,则确定所述第三图像中所述第一像素的像素值为所述第一像素值;
    其中,所述第三阈值大于所述第二阈值。
  15. 根据权利要求13所述的模型训练方法,其中,所述对所述原始图像进行图像分割,获得第一图像、第二图像和第三图像的步骤,包括:
    对所述原始图像进行边缘检测,根据检测到的边缘的位置和长度,将所述原始图像分割为所述第一图像、所述第二图像和所述第三图像。
  16. 根据权利要求13所述的模型训练方法,其中,所述对所述第一图像、所述第二图像和所述第三图像分别进行归一化处理的步骤,包括:
    确定待处理图像所包含的所有像素值中的最大值和最小值,所述待处理图像为所述第一图像、所述第二图像和所述第三图像中的任意一个,所述待处理图像包括第二像素的第二像素值;
    根据所述最大值、最小值以及所述第二像素值,确定归一化处理后的所述待处理图像中的所述第二像素的像素值。
  17. 一种图像处理方法,其中,包括:
    获取模糊指纹图像;
    将所述模糊指纹图像输入至如权利要求1至16任一项所述的模型训练方法训练得到的图像处理模型,得到所述模糊指纹图像对应的清晰指纹图像。
  18. 根据权利要求17所述的图像处理方法,其中,当所述模糊图像为对所述原始图像进行预处理获得的结果时,所述获取模糊指纹图像的步骤, 包括:
    获取原始指纹图像;
    对所述原始指纹图像进行预处理,获得所述模糊指纹图像;其中,所述预处理包括以下至少之一:图像分割、尺寸裁剪、翻转、亮度增强、噪声处理和归一化处理。
  19. 一种计算处理设备,其中,包括:
    存储器,其中存储有计算机可读代码;
    一个或多个处理器,当所述计算机可读代码被所述一个或多个处理器执行时,所述计算处理设备执行如权利要求1至18中任一项所述的方法。
  20. 一种非瞬态计算机可读介质,存储有计算机可读代码,当所述计算机可读代码在计算处理设备上运行时,导致所述计算处理设备执行根据如权利要求1至18中任一项所述的方法。
PCT/CN2021/127078 2021-10-28 2021-10-28 模型训练方法、图像处理方法、计算处理设备及非瞬态计算机可读介质 WO2023070447A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202180003171.8A CN116368500A (zh) 2021-10-28 2021-10-28 模型训练方法、图像处理方法、计算处理设备及非瞬态计算机可读介质
PCT/CN2021/127078 WO2023070447A1 (zh) 2021-10-28 2021-10-28 模型训练方法、图像处理方法、计算处理设备及非瞬态计算机可读介质

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2021/127078 WO2023070447A1 (zh) 2021-10-28 2021-10-28 模型训练方法、图像处理方法、计算处理设备及非瞬态计算机可读介质

Publications (1)

Publication Number Publication Date
WO2023070447A1 true WO2023070447A1 (zh) 2023-05-04

Family

ID=86160363

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/127078 WO2023070447A1 (zh) 2021-10-28 2021-10-28 模型训练方法、图像处理方法、计算处理设备及非瞬态计算机可读介质

Country Status (2)

Country Link
CN (1) CN116368500A (zh)
WO (1) WO2023070447A1 (zh)

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116342984A (zh) * 2023-05-31 2023-06-27 之江实验室 一种模型训练的方法以及图像处理的方法及装置
CN116363123A (zh) * 2023-05-23 2023-06-30 杭州华得森生物技术有限公司 对循环肿瘤细胞检测的荧光显微成像***及其方法
CN116542884A (zh) * 2023-07-07 2023-08-04 合肥市正茂科技有限公司 模糊图像清晰化模型的训练方法、装置、设备及介质
CN117152124A (zh) * 2023-10-24 2023-12-01 万里云医疗信息科技(北京)有限公司 针对血管分支处的微血管检测方法、装置及存储介质
CN117197487A (zh) * 2023-09-05 2023-12-08 东莞常安医院有限公司 一种免疫胶体金诊断试纸条自动识别***
CN117437131A (zh) * 2023-12-21 2024-01-23 珠海视新医用科技有限公司 内窥镜图像电子染色方法及装置、设备、存储介质
CN117593639A (zh) * 2023-11-21 2024-02-23 北京天鼎殊同科技有限公司 公路及其附属物的提取方法、装置、设备及介质
CN117853507A (zh) * 2024-03-06 2024-04-09 阿里巴巴(中国)有限公司 交互式图像分割方法、设备、存储介质和程序产品

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108376387A (zh) * 2018-01-04 2018-08-07 复旦大学 基于聚合膨胀卷积网络的图像去模糊方法
CN112102177A (zh) * 2020-07-27 2020-12-18 中山大学 基于压缩与激励机制神经网络的图像去模糊方法
US20210158490A1 (en) * 2019-11-22 2021-05-27 Nec Laboratories America, Inc. Joint rolling shutter correction and image deblurring
CN113538359A (zh) * 2021-07-12 2021-10-22 北京曙光易通技术有限公司 一种用于指静脉图像分割的***以及方法

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108376387A (zh) * 2018-01-04 2018-08-07 复旦大学 基于聚合膨胀卷积网络的图像去模糊方法
US20210158490A1 (en) * 2019-11-22 2021-05-27 Nec Laboratories America, Inc. Joint rolling shutter correction and image deblurring
CN112102177A (zh) * 2020-07-27 2020-12-18 中山大学 基于压缩与激励机制神经网络的图像去模糊方法
CN113538359A (zh) * 2021-07-12 2021-10-22 北京曙光易通技术有限公司 一种用于指静脉图像分割的***以及方法

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
WU JIAQI; LI QIA; LIANG SAI; KUANG SHEN-FEN: "Convolutional Neural Network with Squeeze and Excitation Modules for Image Blind Deblurring", 2020 INFORMATION COMMUNICATION TECHNOLOGIES CONFERENCE (ICTC), IEEE, 29 May 2020 (2020-05-29), pages 338 - 345, XP033783897, DOI: 10.1109/ICTC49638.2020.9123259 *

Cited By (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116363123A (zh) * 2023-05-23 2023-06-30 杭州华得森生物技术有限公司 对循环肿瘤细胞检测的荧光显微成像***及其方法
CN116363123B (zh) * 2023-05-23 2023-12-22 杭州华得森生物技术有限公司 对循环肿瘤细胞检测的荧光显微成像***及其方法
CN116342984A (zh) * 2023-05-31 2023-06-27 之江实验室 一种模型训练的方法以及图像处理的方法及装置
CN116342984B (zh) * 2023-05-31 2023-08-08 之江实验室 一种模型训练的方法以及图像处理的方法及装置
CN116542884A (zh) * 2023-07-07 2023-08-04 合肥市正茂科技有限公司 模糊图像清晰化模型的训练方法、装置、设备及介质
CN116542884B (zh) * 2023-07-07 2023-10-13 合肥市正茂科技有限公司 模糊图像清晰化模型的训练方法、装置、设备及介质
CN117197487A (zh) * 2023-09-05 2023-12-08 东莞常安医院有限公司 一种免疫胶体金诊断试纸条自动识别***
CN117197487B (zh) * 2023-09-05 2024-04-12 东莞常安医院有限公司 一种免疫胶体金诊断试纸条自动识别***
CN117152124A (zh) * 2023-10-24 2023-12-01 万里云医疗信息科技(北京)有限公司 针对血管分支处的微血管检测方法、装置及存储介质
CN117152124B (zh) * 2023-10-24 2024-01-19 万里云医疗信息科技(北京)有限公司 针对血管分支处的微血管检测方法、装置及存储介质
CN117593639A (zh) * 2023-11-21 2024-02-23 北京天鼎殊同科技有限公司 公路及其附属物的提取方法、装置、设备及介质
CN117593639B (zh) * 2023-11-21 2024-05-28 北京天鼎殊同科技有限公司 公路及其附属物的提取方法、装置、设备及介质
CN117437131A (zh) * 2023-12-21 2024-01-23 珠海视新医用科技有限公司 内窥镜图像电子染色方法及装置、设备、存储介质
CN117437131B (zh) * 2023-12-21 2024-03-26 珠海视新医用科技有限公司 内窥镜图像电子染色方法及装置、设备、存储介质
CN117853507A (zh) * 2024-03-06 2024-04-09 阿里巴巴(中国)有限公司 交互式图像分割方法、设备、存储介质和程序产品

Also Published As

Publication number Publication date
CN116368500A (zh) 2023-06-30

Similar Documents

Publication Publication Date Title
WO2023070447A1 (zh) 模型训练方法、图像处理方法、计算处理设备及非瞬态计算机可读介质
CN110399929B (zh) 眼底图像分类方法、装置以及计算机可读存储介质
RU2661750C1 (ru) Распознавание символов с использованием искусственного интеллекта
CN112132959B (zh) 数字岩心图像处理方法、装置、计算机设备及存储介质
CN112800964B (zh) 基于多模块融合的遥感影像目标检测方法及***
CN111311629A (zh) 图像处理方法、图像处理装置及设备
CN111369581A (zh) 图像处理方法、装置、设备及存储介质
CN111915627A (zh) 语义分割方法、网络、设备及计算机存储介质
CN112150493A (zh) 一种基于语义指导的自然场景下屏幕区域检测方法
CN111626295B (zh) 车牌检测模型的训练方法和装置
CN109815931B (zh) 一种视频物体识别的方法、装置、设备以及存储介质
US20240161304A1 (en) Systems and methods for processing images
CN110991374B (zh) 一种基于rcnn的指纹奇异点检测方法
CN115761258A (zh) 一种基于多尺度融合与注意力机制的图像方向预测方法
CN112270366A (zh) 基于自适应多特征融合的微小目标检测方法
CN115829942A (zh) 基于非负性约束稀疏自编码器的电子电路缺陷检测方法
CN113284122B (zh) 基于深度学习的卷纸包装缺陷检测方法、装置及存储介质
Patel et al. A novel approach for semantic segmentation of automatic road network extractions from remote sensing images by modified UNet
CN112686896B (zh) 基于分割网络的频域空间结合的玻璃缺陷检测方法
Nguyen et al. On the use of attention in deep learning based denoising method for ancient Cham inscription images
US11893784B2 (en) Assessment of image quality for optical character recognition using machine learning
WO2022121858A1 (zh) 图像处理方法、指纹信息提取方法、装置、设备、产品及介质
CN116543246A (zh) 图像去噪模型的训练方法、图像去噪方法、装置及设备
CN114663421A (zh) 基于信息迁移和有序分类的视网膜图像智能分析***及方法
CN114140381A (zh) 一种基于MDP-net的玻璃体混浊分级筛查方法及装置

Legal Events

Date Code Title Description
WWE Wipo information: entry into national phase

Ref document number: 17928087

Country of ref document: US

121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21961813

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE