CN117876716B - Image processing method using grouping wavelet packet transformation - Google Patents

Publication number: CN117876716B (granted publication of application CN202410282726.2A; earlier published as CN117876716A)
Authority: CN (China)
Legal status: Active
Prior art keywords: frequency domain, feature map, wavelet packet, grouping, convolution
Inventors: 郭锴凌 (Guo Kailing), 龚楷钧 (Gong Kaijun), 徐向民 (Xu Xiangmin)
Original language: Chinese (zh)
Assignee (original and current): South China University of Technology (SCUT)
Application filed by South China University of Technology (SCUT)


Abstract

The invention discloses an image processing method using a grouped wavelet packet transform, relating to general image data processing or generation. A grouped wavelet packet transform module replaces convolution operations in a convolutional neural network model and transforms the input feature map from the spatial domain to the frequency domain to obtain a frequency-domain feature map. Depthwise separable convolution is performed on the frequency-domain feature map with convolution kernels in the frequency domain, and the outputs are added and concatenated to obtain a target frequency-domain feature map. A grouped inverse wavelet packet transform module restores the target frequency-domain feature map to the spatial domain to obtain an output feature map. The output feature map is forward-propagated through the convolutional neural network model, which is trained iteratively by gradient descent to obtain an image classification model for classifying images. The method can markedly improve image classification accuracy while effectively reducing the parameter count and computation of the convolutional neural network, enhancing channel information interaction, and improving the expressive capacity of the network.

Description

Image processing method using grouping wavelet packet transformation
Technical Field
The present invention relates to image data processing or generation in general, and more particularly to an image processing method using a grouped wavelet packet transform.
Background
In recent years, as large-scale deep neural networks keep growing, lightweight neural network research has attracted attention and has been applied across computer vision. Network lightweighting removes low-importance parts of a network while retaining the high-importance parts, increasing computation speed and saving computational resources while maintaining inference accuracy; it is widely studied and applied to improve the performance of modern deep neural networks.
Some techniques construct lightweight neural networks with hand-designed structures [1][2], building networks from lightweight convolutions such as depthwise separable convolution and 1×1 convolution to reduce parameter and computational redundancy, but they still suffer from the limited channel-information extraction capability of convolution. Other techniques construct lightweight networks by weight pruning [3][4], cutting low-importance weights in the neural network to reduce computational cost while preserving performance; however, the irregular network structure after pruning hurts processing speed, and neither the compression rate nor the inference accuracy is strictly guaranteed.
The wavelet packet transform [5] decomposes a signal into information at different scales. Some techniques use the two-dimensional wavelet transform to extract image spatial information at different frequencies [6][7]: preprocessing the feature map with a wavelet transform lets the network extract feature information more easily, so the same expressive capacity is achieved with fewer convolution operations, making the whole network lighter. Although the wavelet transform excels at processing spatial information, it is not effective for processing channel information.
[1]Wang X, Chu X, Han C, et al. SCSC: Spatial Cross-scale Convolution Module to Strengthen both CNNs and Transformers[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision. 2023: 731-741.
[2]Han K, Wang Y, Tian Q, et al. Ghostnet: More features from cheap operations[C]//Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. 2020: 1580-1589.
[3]Ye Y, You G, Fwu J K, et al. Channel pruning via optimal thresholding[C]//Neural Information Processing: 27th International Conference, ICONIP 2020, Bangkok, Thailand, November 18–22, 2020, Proceedings, Part V 27. Springer International Publishing, 2020: 508-516.
[4]Guo Y, Yuan H, Tan J, et al. Gdp: Stabilized neural network pruning via gates with differentiable polarization[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision. 2021: 5239-5250.
[5]Shensa M J. The discrete wavelet transform: wedding the a trous and Mallat algorithms[J]. IEEE Transactions on signal processing, 1992, 40(10): 2464-2482.
[6]Ramamonjisoa M, Firman M, Watson J, et al. Single image depth prediction with wavelet decomposition[C]//Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. 2021: 11089-11098.
[7]Liu P, Zhang H, Lian W, et al. Multi-level wavelet convolutional neural networks[J]. IEEE Access, 2019, 7: 74973-74985.
Disclosure of Invention
To address the shortcomings of the prior art, the invention provides an image processing method using a grouped wavelet packet transform, which effectively reduces the parameter count and computation of a convolutional neural network, enhances channel information interaction, and improves the expressive capacity of the network.
The image processing method using a grouped wavelet packet transform according to the invention comprises the following steps:
Step one, constructing an image data set, determining a channel compression rate, and determining the group length r of a grouped wavelet packet transform module according to the channel compression rate;
Step two, inputting an input image into a convolutional neural network in which some convolution operations of the convolutional neural network model are replaced by the grouped wavelet packet transform module, wherein in the grouped wavelet packet transform module, the input feature map of the replaced convolution operation is denoted X, and the input feature map X is transformed from the spatial domain to the frequency domain to obtain a frequency-domain feature map Y;
Step three, performing depthwise separable convolution on the frequency-domain feature map Y with convolution kernels in the frequency domain, and adding and concatenating the outputs to obtain a target frequency-domain feature map M aggregating channel and spatial information;
Step four, restoring the target frequency-domain feature map M to the spatial domain with a grouped inverse wavelet packet transform module to obtain an output feature map Q;
Step five, forward-propagating the output feature map Q through the convolutional neural network model and training iteratively by gradient descent to obtain an image classification model for classifying images.
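The five steps can be sketched end to end for the smallest case r = 2, where one Haar level is the whole transform (a toy NumPy sketch with 1×1 depthwise kernels; the helper name `gwpt_module` and the `kernels` container are illustrative, not the patent's):

```python
import numpy as np

def gwpt_module(X, kernels, r=2):
    """Toy forward pass: grouped DWT -> frequency-domain depthwise + aggregation
    -> grouped inverse DWT. X is (Cin, H, W); `kernels` holds Cout/r lists of
    Cin/r per-channel 1x1 weights (a simplification of the k x k case)."""
    c_in, H, W = X.shape
    Xg = X.reshape(c_in // r, r, H, W)
    # Step two: one Haar level per pair of channels (mean, half-difference)
    Y = np.stack([(Xg[:, 0] + Xg[:, 1]) / 2, (Xg[:, 0] - Xg[:, 1]) / 2], axis=1)
    # Step three: depthwise (here 1x1) conv per group, sum over groups, concat sets
    parts = [sum(Y[i] * k[i][:, None, None] for i in range(c_in // r))
             for k in kernels]
    M = np.stack(parts)                      # (Cout/r, r, H, W)
    # Step four: inverse Haar level restores the spatial domain
    Q = np.stack([M[:, 0] + M[:, 1], M[:, 0] - M[:, 1]], axis=1)
    return Q.reshape(-1, H, W)               # (Cout, H, W)
```

With identity-like all-ones 1×1 kernels, the module reduces to DWT followed by its inverse and returns the input unchanged, which is a quick sanity check of the round trip.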
In a further improvement, the channel compression rate in step one is determined according to the parameter-count target of the model.
In a further improvement, transforming the input feature map X from the spatial domain to the frequency domain in step two specifically comprises:
setting the input dimension of the replaced convolution operation to Cin; dividing the input feature map X into Cin/r groups along the channel dimension according to the group length r of the grouped wavelet packet transform module to obtain a grouped input feature map X'; transforming the vectors of the grouped input feature map X' to the frequency domain by the grouped wavelet packet transform; and implementing the grouped wavelet packet transform of the input feature map X as a convolution operation.
Further, the vectors of the grouped input feature map X' are transformed to the frequency domain by the grouped wavelet packet transform as follows:
taking the vector X'_i^(h,w) at each spatial position (h, w) of the grouped input feature map X' as the input vector of the grouped wavelet packet transform module, the vectors of X' are transformed to the frequency domain by the following formula:

Y_i^(h,w) = DWT(X'_i^(h,w))

where Y_i^(h,w) denotes the wavelet packet transform result of the h-th row, w-th column vector of the i-th grouped input feature map X'_i; i = 1, ..., Cin/r; H and W denote the two spatial dimensions; DWT(·) denotes the wavelet packet transform.
Further, the convolution operation is specifically as follows:
the feature-extraction operation of each level of the grouped wavelet packet transform module is represented by a matrix, and the matrices of all levels are multiplied in level order to obtain a transform matrix A; the transform matrix is shared by the Cin/r grouped input feature maps X', and the shared copies are concatenated along the first dimension to obtain a 1×1 group-convolution kernel W_1;
the input feature map X is group-convolved with the kernel W_1:

Y = Concat( A * X_(1:r), ..., A * X_(Cin-r+1:Cin) ) = W_1 *_(Cin/r) X

where *_(Cin/r) denotes group convolution with Cin/r groups; Concat(·) denotes concatenation along the channel dimension; * denotes the convolution operation; X_(r·i+1 : r·i+r) denotes the channels of the input feature map X with subscripts from r·i+1 to r·i+r, taken along the channel dimension; and Y denotes the frequency-domain feature map.
Further, step three specifically comprises:
sub-step one: grouping the frequency-domain feature map Y into Cin/r groups, and performing depthwise separable convolution between each group Y_i and a convolution kernel W_i by the following formula:

Z_i = Depthwise(Y_i, W_i)

where Z_i denotes the output frequency-domain feature map of the depthwise separable convolution between the i-th group of the frequency-domain feature map Y_i and the convolution kernel W_i; Depthwise(·) denotes depthwise separable convolution of the feature map with the convolution kernel;
sub-step two: adding all output frequency-domain feature maps Z_i to obtain a frequency-domain feature map Z aggregating the channel information;
sub-step three: repeating sub-steps one and two Cout/r times to obtain Cout/r channel-aggregated frequency-domain feature maps Z, and concatenating the Cout/r feature maps Z along the channel dimension to obtain the target frequency-domain feature map M;
where Cout denotes the output dimension of the replaced convolution operation.
Further, step four specifically comprises:
dividing the target frequency-domain feature map M into Cout/r groups along the first dimension according to the convolution group length r of the grouped wavelet packet transform module, to obtain a grouped target frequency-domain feature map M';
taking the vector M'_j^(h,w) at each spatial position (h, w) of the grouped target frequency-domain feature map M' as the input vector of the grouped inverse wavelet packet transform module, and transforming the vectors of M' back to the spatial domain by the inverse wavelet packet transform:

P_j^(h,w) = IDWT(M'_j^(h,w))

where IDWT(·) denotes the inverse Haar wavelet packet transform, and P_j^(h,w) denotes the spatial-domain result of the inverse wavelet packet transform of the h-th row, w-th column vector of the j-th group of the target frequency-domain feature map M'_j;
representing the feature-extraction operation of each level of the grouped inverse wavelet packet transform module by a matrix and multiplying the matrices of all levels in level order to obtain an inverse transform matrix B; sharing the inverse transform matrix B among the Cout/r groups of the target frequency-domain feature map M', and concatenating the shared copies along the first dimension to obtain a 1×1 group-convolution kernel W_2;
and group-convolving the target frequency-domain feature map M with the kernel W_2:

Q = Concat( B * M_(1:r), ..., B * M_(Cout-r+1:Cout) ) = W_2 *_(Cout/r) M

where M_(r·i+1 : r·i+r) denotes the channels of the frequency-domain feature map M with subscripts from r·i+1 to r·i+r, taken along the channel dimension.
Further, the group length r is set according to the parameter-count target of the image classification model; the parameter count of the grouped wavelet packet transform module is 1/r of that of the replaced convolution operation.
Still further, the group length r is set to a common divisor of the input and output dimensions of the convolution operation replaced by the grouped wavelet packet transform module.
In a further improvement, the input image is a preprocessed image; the preprocessing includes padding, cropping, flipping, and normalizing the image.
Advantageous effects
The invention has the following advantages:
(1) The invention replaces convolution operations in the network with the grouped wavelet packet transform module; the parameter count and computation of the module are 1/r of those of the replaced convolution operation, reducing network redundancy and yielding a lightweight network.
(2) Existing weight-pruning methods produce irregular weights while constructing a lightweight network, which hinders deployment in a hardware environment; the grouped wavelet packet transform module proposed by the invention is a structured operation that produces no irregular weights and is convenient for hardware use.
(3) The invention extracts channel information with the wavelet packet transform, improving channel-information extraction over convolution operations. The module can be easily inserted into classical deep neural networks.
Drawings
FIG. 1 is a schematic diagram of the integration of the grouped wavelet packet transform module with the convolutional neural network ResNet according to the present invention;
FIG. 2 is a schematic diagram of the group-convolution kernel construction flow in the grouped wavelet packet transform according to the present invention;
FIG. 3 is a schematic flow diagram of channel and spatial information aggregation in the frequency domain according to the present invention;
FIG. 4 is a schematic diagram of the group-convolution kernel construction flow in the grouped inverse wavelet packet transform according to the present invention.
Detailed Description
The invention is further described below in connection with the examples, which are not to be construed as limiting the invention in any way; the scope of protection is defined by the claims.
The wavelet packet transform has spatial feature-extraction capability, but its channel-information extraction capability remains to be explored. The image processing method using a grouped wavelet packet transform according to the invention transforms the input feature map to the frequency domain by a channel-wise grouped wavelet packet transform, performs depthwise separable convolution on the frequency-domain feature map with convolution kernels in the frequency domain to aggregate spatial and channel information, restores the frequency-domain feature map to the spatial domain by the grouped inverse wavelet packet transform, and inserts the grouped wavelet packet transform module into a convolutional neural network for joint training, finally obtaining a lightweight convolutional neural network.
Referring specifically to FIGS. 1-4, the image processing method using a grouped wavelet packet transform according to the present invention includes the following steps.
S1: construct a data set, determine the channel compression rate, and determine the group length r of the grouped wavelet packet transform module from the channel compression rate.
In the invention, a public image classification data set is divided into a training set and a test set, and the images in the data set are preprocessed by padding, cropping, flipping, normalization, and the like. The training set is used to train the model weights and structural parameters, and the effect of the method is evaluated on the test set. Meanwhile, the group length r of the grouped wavelet packet transform module is set according to the parameter-count target of the image classification model to be built. The group length r must be set to a common divisor of the input and output dimensions of the convolution operation replaced by the module; correspondingly, the parameter count and computation of the module are compressed to 1/r of those of the replaced convolution operation. The smaller the group length r, the lower the compression rate, the larger the parameter count, and the higher the accuracy.
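The 1/r compression can be checked with a short parameter-count calculation (an illustrative sketch; it assumes, as described below, that only the frequency-domain depthwise kernels are learned while the fixed Haar 1×1 transform kernels carry no learned parameters):

```python
def param_counts(c_in, c_out, k, r):
    """Parameters of a standard conv vs. the grouped wavelet packet module."""
    conv = c_in * c_out * k * k  # replaced standard k x k convolution
    # Frequency-domain depthwise kernels: Cout/r output groups, each holding
    # Cin/r kernels of shape (r, k, k); the DWT/IDWT 1x1 kernels are fixed Haar.
    module = (c_out // r) * (c_in // r) * r * k * k
    return conv, module

conv, module = param_counts(64, 64, 3, 4)
print(conv, module, conv // module)  # -> 36864 9216 4, i.e. 1/r of the parameters
```

For a 3×3 convolution with 64 input and 64 output channels and r = 4, the module uses 9216 instead of 36864 parameters, exactly the 1/r ratio stated above.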
S2: build a convolutional neural network model and replace some of its convolution operations with the grouped wavelet packet transform module. In the module, denote the input feature map of the replaced convolution operation by X; transform X from the spatial domain to the frequency domain by the grouped wavelet packet transform module; perform grouped depthwise separable convolution on the feature map with convolution kernels in the frequency domain to aggregate channel and spatial information; and restore the frequency-domain feature map to the spatial domain by the grouped inverse wavelet packet transform. The specific flow is shown in FIGS. 2, 3 and 4.
In the present invention, taking the classical convolutional neural network ResNet as an example, the grouped wavelet packet transform module replaces the convolution operation of each basic block or bottleneck block. The basic block and bottleneck block structures are shown in FIG. 1.
In S2, transforming the input feature map X from the spatial domain to the frequency domain by the channel-wise grouped wavelet packet transform module specifically comprises: given the input dimension Cin and output dimension Cout of the replaced convolution operation, the input feature map is X ∈ R^(Cin×H×W), where H and W denote the two spatial dimensions. According to the group length r of the grouped wavelet packet transform module, divide X into Cin/r groups along the channel dimension to obtain the grouped input feature map X' ∈ R^((Cin/r)×r×H×W), and transform X' to the frequency domain along the channel dimension with the grouped wavelet packet transform module.
In this embodiment, the vector X'_i^(h,w) corresponding to each spatial position (h, w) of the channel-grouped input feature map X' is taken as the input vector of the grouped wavelet packet transform module, where h = 1, ..., H and w = 1, ..., W. The wavelet packet transform result Y_i^(h,w) of the h-th row, w-th column vector of the i-th grouped input feature map X'_i can be expressed as:

Y_i^(h,w) = DWT(X'_i^(h,w))

where DWT(·) denotes the grouped wavelet packet transform.
The wavelet packet transform used in the present invention is the Haar wavelet packet transform, defined as follows: divide a vector of length N into its odd-indexed and even-indexed parts, and compute the element-wise mean and difference of the two parts to obtain two new vectors of length N/2. Recursively repeat this step on the new vectors until each sub-vector has length 1, then arrange the sub-vectors to obtain the frequency-domain wavelet coefficients.
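The recursive mean/difference definition above can be sketched directly (a minimal illustrative implementation; the 1/2 scaling of the difference mirrors the mean and is an assumption about the exact normalization):

```python
import numpy as np

def haar_wpt(v):
    """Full Haar wavelet packet transform of a vector whose length is a power of 2."""
    v = np.asarray(v, dtype=float)
    if v.size == 1:
        return v
    even, odd = v[0::2], v[1::2]
    mean = (even + odd) / 2.0   # low-pass part: pairwise mean
    diff = (even - odd) / 2.0   # high-pass part: pairwise half-difference
    # recurse on both parts until every sub-vector has length 1
    return np.concatenate([haar_wpt(mean), haar_wpt(diff)])

def haar_iwpt(c):
    """Inverse transform: restore each level from its means and differences."""
    c = np.asarray(c, dtype=float)
    if c.size == 1:
        return c
    half = c.size // 2
    mean, diff = haar_iwpt(c[:half]), haar_iwpt(c[half:])
    v = np.empty(c.size)
    v[0::2] = mean + diff  # even-indexed samples
    v[1::2] = mean - diff  # odd-indexed samples
    return v
```

For example, `haar_wpt([1, 2, 3, 4])` gives `[2.5, -1.0, -0.5, 0.0]`, and `haar_iwpt` recovers the original vector exactly.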
In the present invention, the grouped wavelet packet transform of the input feature map X is implemented with a 1×1 group convolution. The convolution kernel is constructed as shown in FIG. 2: the feature-extraction operation of each level of the grouped wavelet packet transform module is represented by a matrix, and the Haar wavelet packet transform matrices of all levels are multiplied in level order to obtain the transform matrix A ∈ R^(r×r). The transform matrix is shared by the Cin/r grouped input feature maps X', and the shared copies are concatenated along the first dimension to obtain the 1×1 group-convolution kernel W_1:

W_1 = Concat_1(A, ..., A)  (Cin/r copies)

where Concat_1(·) denotes concatenation along the first dimension.
Finally, the constructed kernel W_1 performs a 1×1 group convolution on the input feature map, with group length equal to the group length r of the grouped wavelet packet transform module and Cin/r groups, yielding the frequency-domain feature map Y:

Y = Concat( A * X_(1:r), ..., A * X_(Cin-r+1:Cin) ) = W_1 *_(Cin/r) X

where *_(Cin/r) denotes group convolution with Cin/r groups; Concat(·) denotes concatenation along the channel dimension; * denotes the convolution operation; X_(r·i+1 : r·i+r) denotes the channels of X with subscripts from r·i+1 to r·i+r, taken along the channel dimension; and the frequency-domain feature map Y ∈ R^(Cin×H×W).
The depthwise separable convolution of the frequency-domain feature map Y with convolution kernels in the frequency domain, aggregating channel and spatial information, is specifically as follows: using the property that spatial-domain convolution equals frequency-domain multiplication, each group of the frequency-domain feature map Y_i is depthwise-separably convolved with a convolution kernel in the frequency domain, and the results are added and concatenated, aggregating spatial and channel information. Compared with spatial-domain convolution, the parameter count and computation are reduced.
In the invention, the mechanism for aggregating spatial and channel information is shown in FIG. 3. The input frequency-domain feature map Y ∈ R^(Cin×H×W) is divided into Cin/r groups along the channel dimension according to the group length r of the grouped wavelet packet transform module, yielding the grouped frequency-domain feature maps Y_i ∈ R^(r×H×W). Each Y_i is depthwise-separably convolved with a convolution kernel W_i to aggregate spatial information; the output frequency-domain feature map of the depthwise separable convolution between the i-th group Y_i and the kernel W_i can be expressed as:

Z_i = Depthwise(Y_i, W_i)

where Depthwise(·) denotes depthwise separable convolution of the feature map with the convolution kernel, in which the spatial size, stride, and zero padding of the kernel W_i are the same as those of the convolution operation replaced by the grouped wavelet packet transform module; Z_i ∈ R^(r×H'×W'), where H' and W' denote the two spatial dimensions after the depthwise separable convolution.
Each group of output frequency-domain feature maps is then added to aggregate the channel information, yielding the channel-aggregated frequency-domain feature map Z = Σ_i Z_i.
The above channel-information processing and aggregation is repeated Cout/r times to obtain Cout/r channel-aggregated frequency-domain feature maps. All Cout/r frequency-domain feature maps are concatenated along the channel dimension to obtain the target frequency-domain feature map M, with the same dimensions as the output feature map of the replaced convolution:

M = Concat(Z^(1), Z^(2), ..., Z^(Cout/r)), M ∈ R^(Cout×H'×W')

where Concat(·) denotes concatenation along the first dimension.
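The aggregation in FIG. 3 can be sketched as follows (a naive NumPy sketch with stride 1 and zero padding that keeps the spatial size; the `kernels` container, a list of Cout/r kernel sets, is a naming assumption):

```python
import numpy as np

def depthwise_conv(Y, K):
    """Depthwise conv of Y (r, H, W) with per-channel kernels K (r, k, k),
    stride 1, zero padding to keep the spatial size."""
    r, H, W = Y.shape
    k = K.shape[1]
    p = k // 2
    Yp = np.pad(Y, ((0, 0), (p, p), (p, p)))
    Z = np.zeros_like(Y)
    for c in range(r):
        for h in range(H):
            for w in range(W):
                Z[c, h, w] = np.sum(Yp[c, h:h + k, w:w + k] * K[c])
    return Z

def target_map(Y, kernels, r):
    """Target frequency-domain map M: for each of the Cout/r kernel sets,
    depthwise-convolve every input group and sum (channel aggregation),
    then concatenate the Cout/r results along the channel dimension."""
    groups = Y.reshape(Y.shape[0] // r, r, *Y.shape[1:])
    parts = [sum(depthwise_conv(g, Ki) for g, Ki in zip(groups, kset))
             for kset in kernels]
    return np.concatenate(parts, axis=0)
```

With 1×1 all-ones kernels the depthwise step is the identity, so the output group is just the sum of the input groups, which makes the channel-aggregation step easy to verify.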
Restoring the frequency-domain feature map to the spatial domain by the grouped inverse wavelet packet transform is specifically as follows: divide the target frequency-domain feature map M ∈ R^(Cout×H'×W') into Cout/r groups along the first dimension according to the convolution group length r of the grouped wavelet packet transform, obtaining the grouped target frequency-domain feature map M' ∈ R^((Cout/r)×r×H'×W'), and apply the inverse wavelet packet transform to M' along the channel dimension to restore the spatial domain. In this embodiment, the vector M'_j^(h,w) corresponding to each spatial position (h, w) of the grouped target frequency-domain feature map M' is taken as the input vector of the grouped inverse wavelet packet transform module. The spatial-domain result P_j^(h,w) of the inverse wavelet packet transform of the h-th row, w-th column vector of the j-th group M'_j can be expressed as:

P_j^(h,w) = IDWT(M'_j^(h,w))

where IDWT(·) denotes the inverse Haar wavelet packet transform. That is, the inverse wavelet packet transform in the invention is the inverse Haar wavelet packet transform, defined as follows: mirroring the way the Haar wavelet transform decomposes a vector level by level from the top level down, the inverse Haar wavelet packet transform uses the means and differences of the lower level to restore the upper-level vector, level by level from the bottom up.
Finally, the spatial-domain results are concatenated in the first dimension to obtain the output feature map Q.
In the present invention, the grouped inverse wavelet packet transform of the target frequency-domain feature map M is implemented with a 1×1 group convolution. The convolution kernel is constructed as shown in FIG. 4: the feature-extraction operation of each level of the grouped inverse wavelet packet transform module is represented by a matrix, and the inverse Haar wavelet packet transform matrices of all levels are multiplied in level order to obtain the inverse transform matrix B ∈ R^(r×r). The inverse transform matrix is shared by all Cout/r feature maps, and the shared copies are concatenated along the first dimension to obtain the 1×1 group-convolution kernel W_2:

W_2 = Concat_1(B, ..., B)  (Cout/r copies)

where Concat_1(·) denotes concatenation along the first dimension. Finally, the constructed kernel W_2 performs a 1×1 group convolution on the target frequency-domain feature map M, with group length equal to the group length r of the grouped wavelet transform and Cout/r groups:

Q = Concat( B * M_(1:r), ..., B * M_(Cout-r+1:Cout) ) = W_2 *_(Cout/r) M

where *_(Cout/r) denotes group convolution with Cout/r groups; Concat(·) denotes concatenation along the channel dimension; * denotes the convolution operation; M_(r·i+1 : r·i+r) denotes the channels of M with subscripts from r·i+1 to r·i+r, taken along the channel dimension; and the output spatial-domain feature map Q ∈ R^(Cout×H'×W').
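The inverse kernel construction mirrors FIG. 2: per-level synthesis matrices restore each level from its means and differences, and their product inverts the forward transform matrix exactly (an illustrative sketch consistent with the mean/half-difference Haar scaling assumed above):

```python
import numpy as np

def haar_level(size, r, inverse=False):
    """One Haar level within each length-`size` band: analysis (mean/half-diff)
    or synthesis (even = mean + diff, odd = mean - diff)."""
    half = size // 2
    L = np.zeros((size, size))
    for i in range(half):
        if inverse:
            L[2 * i, i], L[2 * i, half + i] = 1.0, 1.0
            L[2 * i + 1, i], L[2 * i + 1, half + i] = 1.0, -1.0
        else:
            L[i, 2 * i] = L[i, 2 * i + 1] = 0.5
            L[half + i, 2 * i], L[half + i, 2 * i + 1] = 0.5, -0.5
    return np.kron(np.eye(r // size), L)

def transform_matrices(r):
    """Forward matrix A (per-level products in level order) and inverse matrix B
    (synthesis levels applied in the reverse order)."""
    A, B, size = np.eye(r), np.eye(r), r
    while size > 1:
        A = haar_level(size, r) @ A
        B = B @ haar_level(size, r, inverse=True)
        size //= 2
    return A, B
```

Since each synthesis level undoes the matching analysis level, B @ A is the identity, so stacking shared copies of B into the 1×1 group-convolution kernel W_2 exactly undoes the forward kernel W_1 built from A.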
Forward propagation of the model continues from the output feature map Q, and the model parameters are updated by back-propagation. After training the model with gradient descent, a general and efficient image classification network is obtained. With the grouped wavelet packet transform module added to the reference model, image classification accuracy can be improved while greatly reducing the model's parameter count and computation.
The present invention also provides a storage medium, such as a ROM, RAM, magnetic disk, or optical disk, storing one or more programs that, when executed by a processor, implement the image processing method using the grouped wavelet packet transform provided in the above embodiment.
The invention also provides a device, which may be a desktop computer, notebook computer, smartphone, PDA handheld terminal, tablet computer, or other terminal device with a display function. The computing device comprises a processor and a memory storing one or more programs; when the processor executes the programs stored in the memory, the image processing method using the grouped wavelet packet transform provided by the embodiment is implemented.
While only the preferred embodiments of the present invention have been described above, it should be noted that those skilled in the art can make modifications and improvements without departing from the structure of the present invention, and these do not affect the effect of implementing the invention or the applicability of the patent.

Claims (4)

1. An image processing method using a grouped wavelet packet transform, comprising the steps of:
Step one, constructing an image data set, determining a channel compression rate, and determining the group length r of a grouped wavelet packet transform module according to the channel compression rate;
Step two, inputting an input image into a convolutional neural network, and replacing a convolution operation in the convolutional neural network model with the grouped wavelet packet transform module, wherein in the grouped wavelet packet transform module, the input feature map of the replaced convolution operation is denoted X, and the input feature map X is transformed from the spatial domain to the frequency domain to obtain a frequency-domain feature map Y;
Step three, performing depthwise separable convolution on the frequency-domain feature map Y with convolution kernels in the frequency domain, and adding and concatenating the outputs to obtain a target frequency-domain feature map M aggregating channel and spatial information;
Step four, restoring the target frequency-domain feature map M to the spatial domain with a grouped inverse wavelet packet transform module to obtain an output feature map Q;
Step five, performing forward propagation of the convolutional neural network model on the output feature map Q and iterative training by gradient descent to obtain an image classification model for classifying images;
In step two, the input feature map X is transformed from the spatial domain to the frequency domain, specifically comprising,
setting the input dimension of the replaced convolution operation as Cin; dividing the input feature map X into Cin/r groups along the channel dimension according to the group length r of the grouping wavelet packet transformation module to obtain a grouped input feature map X′; transforming the vectors of the grouped input feature map X′ to the frequency domain through the grouping wavelet packet transformation, which is implemented as a convolution operation on the input feature map X;
the vectors of the grouped input feature map X′ are transformed to the frequency domain by the grouping wavelet packet transformation, specifically,

taking the channel vector X′_i^{(h,w)} at each spatial position (h, w) of the grouped input feature map X′ as an input vector of the grouping wavelet packet transformation module, the vectors of the grouped input feature map X′ are transformed into the frequency domain by the following formula:

Y_i^{(h,w)} = DWT(X′_i^{(h,w)})

wherein Y_i^{(h,w)} represents the wavelet packet transformation result of the h-th row, w-th column vector of the i-th group input feature map X′_i; i = 1, ..., Cin/r; h = 1, ..., H; w = 1, ..., W; H and W represent the two spatial dimensions; DWT(·) represents the wavelet packet transform;
the convolution operation is specifically described as,

the feature extraction operation of each level of the grouping wavelet packet transformation module is represented by a matrix, and the matrices of the levels are multiplied in level order to obtain a transformation matrix T; the transformation matrix T is shared across the Cin/r grouped input feature maps X′, and the shared transformation matrices are concatenated in the first dimension to obtain a 1×1 group convolution kernel W_1;

the input feature map X is subjected to a group convolution with the convolution kernel W_1:

Y = GConv_{Cin/r}(X, W_1) = Concat(T * X_{[r×i+1 : r×i+r]}), i = 0, 1, ..., Cin/r − 1

wherein GConv_{Cin/r}(·) represents a group convolution with Cin/r groups; Concat(·) represents a concatenation operation along the channel dimension; * represents a convolution operation; X_{[r×i+1 : r×i+r]} represents the portion of the input feature map X taken along the channel dimension from subscript (r×i+1) to (r×i+r); Y represents the frequency domain feature map;
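As a rough illustration (not part of the patent text), the level-by-level matrix construction described above can be sketched in NumPy for the Haar basis; the function name `haar_packet_matrix`, the choices r = 4, Cin = 8, and all shapes are assumptions made for the sketch:

```python
import numpy as np

def haar_packet_matrix(r):
    """Full wavelet packet transform matrix for a length-r vector (Haar
    basis), built by composing one pairwise average/difference stage per
    level, i.e. multiplying the level matrices in level order."""
    assert r >= 1 and (r & (r - 1)) == 0, "r must be a power of two"
    T = np.eye(r)
    size = r
    while size > 1:
        level = np.zeros((r, r))
        for b in range(r // size):            # every band at this level
            s = b * size
            for k in range(size // 2):
                # low-pass row: pairwise average (orthonormal scaling)
                level[s + k, s + 2 * k] = level[s + k, s + 2 * k + 1] = 1 / np.sqrt(2)
                # high-pass row: pairwise difference
                level[s + size // 2 + k, s + 2 * k] = 1 / np.sqrt(2)
                level[s + size // 2 + k, s + 2 * k + 1] = -1 / np.sqrt(2)
        T = level @ T                          # compose in level order
        size //= 2
    return T

r, Cin = 4, 8
T = haar_packet_matrix(r)
# sharing T across the Cin/r groups gives a 1x1 grouped-conv kernel W1
W1 = np.stack([T] * (Cin // r))                # (Cin/r, r, r), one block per group
X = np.random.randn(Cin, 5, 5)                 # input feature map, (C, H, W)
Xg = X.reshape(Cin // r, r, 5, 5)
Y = np.einsum('gab,gbhw->gahw', W1, Xg).reshape(Cin, 5, 5)  # frequency map
```

Because each level matrix is orthogonal, the composed transformation matrix is orthogonal as well, which is what later makes the inverse step a simple transpose.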
the third step specifically comprises,

sub-step one, grouping the frequency domain feature map Y into Cin/r groups, and performing a depth separable convolution on the frequency domain feature map Y_i with the convolution kernel W_i by the following formula:

Z_i = Depthwise(Y_i, W_i)

wherein Z_i represents the output frequency domain feature map obtained by the depth separable convolution of the i-th group frequency domain feature map Y_i with the convolution kernel W_i; Depthwise(·) represents the depth separable convolution of the feature map with the convolution kernel;

sub-step two, adding all the output frequency domain feature maps Z_i to obtain a frequency domain feature map Z with the channel information aggregated;

sub-step three, repeating sub-steps one and two Cout/r times to obtain Cout/r frequency domain feature maps Z with the channel information aggregated, and concatenating the Cout/r frequency domain feature maps Z along the channel dimension to obtain the target frequency domain feature map M,

wherein Cout represents the output dimension of the replaced convolution operation;
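A minimal sketch of the third step, assuming small illustrative shapes (Cin = Cout = 8, r = 4, 3×3 kernels) and a naive same-padded depthwise convolution; none of the names or sizes come from the patent:

```python
import numpy as np

def depthwise_conv(x, k):
    """'Same'-padded depthwise convolution: one kernel per channel.
    x: (C, H, W), k: (C, kh, kw)."""
    C, H, W = x.shape
    kh, kw = k.shape[1:]
    xp = np.pad(x, ((0, 0), (kh // 2, kh // 2), (kw // 2, kw // 2)))
    out = np.zeros_like(x)
    for c in range(C):
        for i in range(H):
            for j in range(W):
                out[c, i, j] = np.sum(xp[c, i:i + kh, j:j + kw] * k[c])
    return out

Cin, Cout, r, H, W, ks = 8, 8, 4, 5, 5, 3
Y = np.random.randn(Cin, H, W)                  # frequency domain feature map
# one depthwise kernel per input group, repeated Cout/r times
kernels = np.random.randn(Cout // r, Cin // r, r, ks, ks)

blocks = []
for o in range(Cout // r):                      # sub-step three: Cout/r repeats
    Z = np.zeros((r, H, W))
    for i in range(Cin // r):                   # sub-steps one and two
        Yi = Y[i * r:(i + 1) * r]               # i-th group of Y
        Z += depthwise_conv(Yi, kernels[o, i])  # sum Z_i: channel aggregation
    blocks.append(Z)
M = np.concatenate(blocks, axis=0)              # target frequency map, (Cout, H, W)
```

The summation over groups is what mixes channel information across groups, while the depthwise kernels mix spatial information within each channel.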
the fourth step specifically comprises,

dividing the target frequency domain feature map M into Cout/r groups in the first dimension according to the group length r of the grouping wavelet packet transformation module, to obtain a grouped target frequency domain feature map M′;

taking the channel vector M′_j^{(h,w)} at each spatial position (h, w) of the grouped target frequency domain feature map M′ as an input vector of the grouping wavelet packet inverse transformation module, the vectors of the grouped target frequency domain feature map M′ are transformed back to the spatial domain by the following formula:

Q_j^{(h,w)} = IDWT(M′_j^{(h,w)})

wherein IDWT(·) represents the inverse Haar wavelet packet transform; Q_j^{(h,w)} represents the spatial domain result of the wavelet packet inverse transform of the h-th row, w-th column vector of the j-th group target frequency domain feature map M′_j; h = 1, ..., H′; w = 1, ..., W′; H′ and W′ represent the two spatial dimensions after the depth separable convolution;

the feature extraction operation of each level of the grouping wavelet packet inverse transformation module is represented by a matrix, and the matrices of the levels are multiplied in level order to obtain an inverse transformation matrix T′; the inverse transformation matrix T′ is shared across the Cout/r grouped target frequency domain feature maps M′, and the shared matrices are concatenated in the first dimension to obtain a 1×1 group convolution kernel W_2;

the target frequency domain feature map M is subjected to a group convolution with the convolution kernel W_2:

Q = GConv_{Cout/r}(M, W_2) = Concat(T′ * M_{[r×i+1 : r×i+r]}), i = 0, 1, ..., Cout/r − 1

wherein M_{[r×i+1 : r×i+r]} represents the portion of the target frequency domain feature map M taken along the channel dimension from subscript (r×i+1) to (r×i+r);
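The per-position inverse transform of the fourth step can be illustrated with a recursive Haar packet pair; `dwt_packet`/`idwt_packet` and the shapes are illustrative assumptions, not the patent's implementation:

```python
import numpy as np

def dwt_packet(v):
    """Full Haar wavelet packet transform of a length-2^k vector."""
    if len(v) == 1:
        return v
    lo = (v[0::2] + v[1::2]) / np.sqrt(2)      # pairwise averages
    hi = (v[0::2] - v[1::2]) / np.sqrt(2)      # pairwise differences
    return np.concatenate([dwt_packet(lo), dwt_packet(hi)])

def idwt_packet(y):
    """Inverse Haar wavelet packet transform (perfect reconstruction)."""
    n = len(y)
    if n == 1:
        return y
    lo = idwt_packet(y[:n // 2])
    hi = idwt_packet(y[n // 2:])
    v = np.empty(n)
    v[0::2] = (lo + hi) / np.sqrt(2)           # recover even samples
    v[1::2] = (lo - hi) / np.sqrt(2)           # recover odd samples
    return v

# per spatial position (h, w), the channel vector of one group of the
# target frequency domain feature map M is restored to the spatial domain:
r, H, W = 4, 3, 3
M = np.random.randn(r, H, W)                   # one group of the target map
Q = np.zeros_like(M)
for h in range(H):
    for w in range(W):
        Q[:, h, w] = idwt_packet(M[:, h, w])
```

Perfect reconstruction (IDWT(DWT(v)) = v) is what allows the module to return to the spatial domain without losing information.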
the group length r is set according to the parameter amount target of the image classification model, and the parameter amount of the grouping wavelet packet transformation module is 1/r of that of the replaced convolution operation.
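The claimed 1/r parameter ratio can be checked with a quick count of the learned depthwise kernels (the wavelet transform matrices themselves are fixed rather than learned, which this count assumes); the sizes below are arbitrary examples:

```python
Cin, Cout, k, r = 64, 128, 3, 4

# replaced standard convolution: one k x k filter per (in, out) channel pair
standard = Cin * Cout * k * k

# grouping module: Cout/r repeats x Cin/r groups x r depthwise k x k filters
module = (Cout // r) * (Cin // r) * r * k * k

assert standard // module == r  # module uses 1/r of the parameters
```

The same factor of r also applies to the multiply-accumulate count, since each depthwise kernel touches one channel instead of all Cin.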
2. The image processing method using grouping wavelet packet transformation according to claim 1, wherein the channel compression rate in step one is determined based on a parameter amount target of the model.
3. The image processing method using grouping wavelet packet transformation according to claim 1, wherein the group length r is set to be a common divisor of the input and output dimensions of the convolution operation replaced by the grouping wavelet packet transformation module.
4. The image processing method using grouping wavelet packet transformation according to claim 1, wherein the input image is a preprocessed image; the preprocessing includes filling, cropping, flipping, and normalizing the image.
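A hypothetical version of the preprocessing named in claim 4 (filling, cropping, flipping, normalizing), sketched in NumPy; the pad size, crop size, and per-channel normalization are assumptions, since the patent does not specify them:

```python
import numpy as np

def preprocess(img, pad=4, crop=32, rng=None):
    """Pad, random-crop, random horizontal flip, then normalize.
    img: (H, W, C) uint8 array; all parameters are illustrative."""
    rng = rng or np.random.default_rng(0)
    x = np.pad(img, ((pad, pad), (pad, pad), (0, 0)), mode='reflect')  # filling
    top = rng.integers(0, x.shape[0] - crop + 1)
    left = rng.integers(0, x.shape[1] - crop + 1)
    x = x[top:top + crop, left:left + crop]                            # cropping
    if rng.random() < 0.5:
        x = x[:, ::-1]                                                 # flipping
    x = x.astype(np.float32) / 255.0
    # normalizing: zero mean, unit variance per channel
    return (x - x.mean(axis=(0, 1))) / (x.std(axis=(0, 1)) + 1e-8)

out = preprocess(np.random.randint(0, 256, (32, 32, 3), dtype=np.uint8))
```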
CN202410282726.2A 2024-03-13 2024-03-13 Image processing method using grouping wavelet packet transformation Active CN117876716B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410282726.2A CN117876716B (en) 2024-03-13 2024-03-13 Image processing method using grouping wavelet packet transformation


Publications (2)

Publication Number Publication Date
CN117876716A CN117876716A (en) 2024-04-12
CN117876716B true CN117876716B (en) 2024-06-18





Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant