CN112037157A - Data processing method and device, computer readable medium and electronic equipment - Google Patents

Data processing method and device, computer readable medium and electronic equipment Download PDF

Info

Publication number
CN112037157A
Authority
CN
China
Prior art keywords
expansion
expansion rate
feature map
rate table
convolution
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010963275.0A
Other languages
Chinese (zh)
Inventor
张弓
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong Oppo Mobile Telecommunications Corp Ltd
Original Assignee
Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong Oppo Mobile Telecommunications Corp Ltd filed Critical Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority to CN202010963275.0A priority Critical patent/CN112037157A/en
Publication of CN112037157A publication Critical patent/CN112037157A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00Image enhancement or restoration
    • G06T5/20Image enhancement or restoration using local operators
    • G06T5/30Erosion or dilatation, e.g. thinning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/06Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N3/061Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using biological neurons, e.g. biological neurons connected to an integrated circuit
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20004Adaptive image processing
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20024Filtering details
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Theoretical Computer Science (AREA)
  • Molecular Biology (AREA)
  • Neurology (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Microelectronics & Electronic Packaging (AREA)
  • Image Processing (AREA)

Abstract

The present disclosure relates to the field of data processing technologies, and in particular, to a data processing method, a data processing apparatus, a computer readable medium, and an electronic device. The method comprises the following steps: acquiring a feature map to be processed, and inputting the feature map into an expansion rate calculation layer to acquire an expansion rate table corresponding to the feature map, wherein the expansion rate table comprises expansion rate values corresponding to pixels in the feature map; and inputting the expansion rate table into an expansion convolution layer, so that the expansion convolution layer performs expansion convolution processing on the feature map in combination with the expansion rate table to obtain feature data corresponding to the feature map. The method enables adaptive, fine-grained adjustment of the receptive field of each pixel in the feature map, avoiding the omission of features during feature extraction; sampling density can also be increased and detail features preserved.

Description

Data processing method and device, computer readable medium and electronic equipment
Technical Field
The present disclosure relates to the field of data processing technologies, and in particular, to a data processing method, a data processing apparatus, a computer readable medium, and an electronic device.
Background
When a deep learning algorithm is used for image processing, expanding the receptive field helps the neural network receive information over a larger range and extract more complex information. One conventional way to expand the receptive field is to increase the number of layers of the neural network, since deeper network layers receive broader information. The disadvantage is that the complexity of the network increases greatly, and convergence becomes more difficult. Another way is dilated (expansion) convolution, which enlarges the receptive field by enlarging the effective filter size. Its drawback is that some pixels are regularly skipped during the convolution operation, which creates holes in the resulting features and causes features to be missed.
It is to be noted that the information disclosed in the above background section is only for enhancement of understanding of the background of the present disclosure, and thus may include information that does not constitute prior art known to those of ordinary skill in the art.
Disclosure of Invention
The present disclosure provides a data processing method, a data processing apparatus, a computer readable medium, and a terminal device, which can adaptively adjust a receptive field in a calculation process of a dilation convolution, and avoid feature omission.
Additional features and advantages of the disclosure will be set forth in the detailed description which follows, or in part will be obvious from the description, or may be learned by practice of the disclosure.
According to a first aspect of the present disclosure, there is provided a data processing method comprising:
acquiring a feature map to be processed, and inputting the feature map into an expansion rate calculation layer to acquire an expansion rate table corresponding to the feature map; wherein the expansion rate table comprises expansion rate values corresponding to pixels in the feature map;
and inputting the expansion rate table into an expansion convolution layer, so that the expansion convolution layer performs expansion convolution processing on the feature map by combining the expansion rate table to obtain feature data corresponding to the feature map.
According to a second aspect of the present disclosure, there is provided a data processing apparatus comprising:
the self-adaptive expansion rate table calculation module is used for acquiring a feature map to be processed and inputting the feature map into an expansion rate calculation layer to acquire an expansion rate table corresponding to the feature map; wherein the expansion rate table comprises expansion rate values corresponding to pixels in the feature map;
and the expansion convolution processing module is used for inputting the expansion rate table into the expansion convolution layer so as to enable the expansion convolution layer to carry out expansion convolution processing on the feature map by combining the expansion rate table to obtain feature data corresponding to the feature map.
According to a third aspect of the present disclosure, there is provided a computer readable medium having stored thereon a computer program which, when executed by a processor, implements the data processing method described above.
According to a fourth aspect of the present disclosure, there is provided a terminal device comprising:
one or more processors;
a storage device for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the data processing method described above.
In the data processing method provided by an embodiment of the present disclosure, an expansion rate calculation layer is configured, and the expansion rate table corresponding to a feature map is calculated by that layer; when the expansion convolution calculation is performed on the feature map, the expansion rate table can be queried to obtain the expansion rate value corresponding to each pixel, and sampling is performed according to that value. This realizes adaptive, fine-grained adjustment of the receptive field of each pixel in the feature map and avoids the omission of features in the feature extraction process; it also increases the sampling density and preserves detail features.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and together with the description, serve to explain the principles of the disclosure. It is to be understood that the drawings in the following description are merely exemplary of the disclosure, and that other drawings may be derived from those drawings by one of ordinary skill in the art without the exercise of inventive faculty.
FIG. 1-1 schematically illustrates a sample schematic of a dilated convolution with a dilation rate of 1 in an exemplary embodiment of the disclosure;
FIG. 1-2 schematically illustrates a sample schematic of a dilated convolution with a dilation rate of 2 in an exemplary embodiment of the disclosure;
FIG. 1-3 schematically illustrates a sample schematic of a dilated convolution with a dilation rate of 4 in an exemplary embodiment of the disclosure;
FIG. 2 schematically illustrates a flow chart of a data processing method in an exemplary embodiment of the disclosure;
FIG. 3 schematically illustrates a flow diagram of a method of expanding a convolutional layer in an exemplary embodiment of the present disclosure;
FIG. 4 schematically illustrates a sampling result of an expanded convolutional layer in an exemplary embodiment of the present disclosure;
FIG. 5 schematically illustrates a composition diagram of a data processing apparatus in an exemplary embodiment of the present disclosure;
fig. 6 schematically illustrates an electronic device structure diagram of a terminal device in an exemplary embodiment of the present disclosure.
Detailed Description
Example embodiments will now be described more fully with reference to the accompanying drawings. Example embodiments may, however, be embodied in many different forms and should not be construed as limited to the examples set forth herein; rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the concept of example embodiments to those skilled in the art. The described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments.
Furthermore, the drawings are merely schematic illustrations of the present disclosure and are not necessarily drawn to scale. The same reference numerals in the drawings denote the same or similar parts, and thus their repetitive description will be omitted. Some of the block diagrams shown in the figures are functional entities and do not necessarily correspond to physically or logically separate entities. These functional entities may be implemented in the form of software, or in one or more hardware modules or integrated circuits, or in different networks and/or processor devices and/or microcontroller devices.
When a deep learning algorithm is used to process an image, for example when a convolutional neural network performs feature extraction in application scenarios such as image noise reduction and image recognition, expanding the receptive field helps the neural network receive information over a wider range and extract more complex information. One conventional way to expand the receptive field is to increase the number of layers of the neural network, since deeper network layers receive broader information; the drawback is that the complexity of the network increases greatly, and convergence becomes correspondingly more difficult. Another common way is dilated (expansion) convolution. Its principle is to inject zeros into the convolution kernel at fixed intervals, which enlarges the effective filter size and thus the sampling range of the convolution operation while leaving the operation complexity unchanged; the factor by which the size is enlarged is the dilation rate. For example, FIG. 1-1 through FIG. 1-3 show schematic diagrams of dilated-convolution sampling: in FIG. 1-1 the dilation rate is 1, the convolution kernel is in its original form, and the sampling points are contiguous; in FIG. 1-2 the dilation rate is 2, zeros are injected between kernel entries, and the valid sampling points are spaced apart; in FIG. 1-3 the dilation rate is 4, and the valid sampling points are spaced 4 apart. The advantage of dilated convolution is that it enlarges the receptive field; the drawback is that some pixels are regularly skipped during the convolution operation, which tends to create holes in the resulting features and to miss subtle features.
In view of the above-described drawbacks and deficiencies of the prior art, the exemplary embodiment provides a data processing method. Referring to fig. 2, the data processing method described above may include the steps of:
s11, acquiring a feature map to be processed, inputting the feature map into an expansion rate calculation layer to acquire an expansion rate table corresponding to the feature map; wherein the expansion rate table comprises expansion rate values corresponding to pixels in the feature map;
and S12, inputting the expansion rate table into the expansion convolutional layer, so that the expansion convolutional layer combines the expansion rate table to perform expansion convolution processing on the feature map, and feature data corresponding to the feature map is obtained.
In the data processing method provided by the present exemplary embodiment, an expansion rate calculation layer is configured, and the expansion rate table corresponding to a feature map is calculated by that layer; when the expansion convolution calculation is performed on the feature map, the expansion rate table can be queried to obtain the expansion rate value corresponding to each pixel, and sampling is performed according to that value. On one hand, this realizes adaptive, fine-grained adjustment of the receptive field of each pixel in the feature map and avoids the omission of features in the feature extraction process; on the other hand, the sampling density can be increased and detail features preserved.
Hereinafter, each step of the data processing method in the present exemplary embodiment will be described in more detail with reference to the drawings and examples.
In step S11, acquiring a feature map to be processed, and inputting the feature map into an expansion ratio calculation layer to acquire an expansion ratio table corresponding to the feature map; wherein the expansion rate table comprises expansion rate values corresponding to pixels in the feature map.
In this exemplary embodiment, the feature map to be processed may be a feature map obtained by performing feature extraction on an original image once. For example, in image noise reduction, image recognition, and image registration, a feature map is obtained by extracting features of an input original image.
The data processing method may provide an expansion rate calculation layer and an expansion convolution layer. Specifically, after the feature map to be processed is obtained, the feature map may be input into the expansion rate calculation layer, and the expansion rate value corresponding to each pixel in the feature map may be calculated to obtain the expansion rate table corresponding to the feature map.
Specifically, the expansion rate calculation layer may be a single convolution layer or a plurality of successive convolution layers, which perform one or several successive convolution operations on the feature map to obtain the expansion rate table. In this way, the expansion rate is set adaptively for each pixel based on the pixel features in the feature map. Alternatively, in other exemplary embodiments of the present disclosure, besides a plurality of convolution layers, the expansion rate calculation layer may also include a pooling layer, a residual network module, and other basic neural network components such as an image scaling module that performs dimension-reduction processing on the feature map.
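As a minimal sketch (assuming a PyTorch-style implementation; the class name, rate range, and sigmoid squashing below are hypothetical choices, not specified by the disclosure), such an expansion rate calculation layer could be a single convolution that predicts one expansion rate value per pixel and channel:

```python
import torch
import torch.nn as nn

class ExpansionRateLayer(nn.Module):
    """Sketch of an expansion rate calculation layer: a single 3x3
    convolution that outputs a per-pixel, per-channel expansion rate table
    with the same Width x Height x channel_in size as the input."""
    def __init__(self, channels: int, min_rate: float = 1.0, max_rate: float = 4.0):
        super().__init__()
        self.conv = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.min_rate, self.max_rate = min_rate, max_rate

    def forward(self, feature_map: torch.Tensor) -> torch.Tensor:
        # Squash raw responses into [min_rate, max_rate]; the disclosure does
        # not fix a range, so this clamping is an assumption. Rates may be
        # fractional, which the expansion convolution handles by interpolation.
        raw = torch.sigmoid(self.conv(feature_map))
        return self.min_rate + (self.max_rate - self.min_rate) * raw
```

Here the rate table has the same Width × Height × channel_in shape as the input, matching the first sizing option described below.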
Further, the image size of the expansion rate table may be configured according to the resolution or the image size of the feature map. Alternatively, the image size of the expansion rate table may be configured by combining the feature map and the information of the expansion convolution layer. For example, the image size of the inflation rate table may be configured according to the image size of the feature map; or, combining the image size of the feature map and the number of output channels of the expansion convolution layer to configure the image size of the expansion rate table; or combining the image size of the feature map and the filter grouping information of the output channel of the expansion convolutional layer to configure the image size of the expansion rate table.
Specifically, the image size (or resolution) of the expansion rate table may be set to be the same as that of the feature map; for example, if the input feature map has size Width × Height × channel_in (where channel_in is the number of input channels), the expansion rate table also has size Width × Height × channel_in. Alternatively, the image size of the expansion rate table may be set to the image size of the feature map multiplied by the number of output channels of the expansion convolution layer: for example, if the input feature map has size Width × Height × channel_in and the expansion convolution layer has channel_out output channels (so the output feature map has size Width × Height × channel_out), the expansion rate table has size Width × Height × (channel_in × channel_out). Or the image size of the expansion rate table may be set to the image size of the feature map multiplied by the number of filter groups of the expansion convolution layer's output channels: for example, if the input feature map has size Width × Height × channel_in and the filters corresponding to the output channels of the expansion convolution layer are divided into group groups, the expansion rate table has size Width × Height × (channel_in × group).
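For concreteness, the three sizing options can be summarized in a small helper (a sketch; the (Width, Height, channels) tuple layout and function name are assumptions):

```python
def expansion_rate_table_shape(width, height, channel_in,
                               channel_out=None, groups=None):
    """Expansion rate table shape under the three options described above."""
    if groups is not None:         # one rate sub-table per filter group
        return (width, height, channel_in * groups)
    if channel_out is not None:    # one rate sub-table per output channel
        return (width, height, channel_in * channel_out)
    return (width, height, channel_in)   # same size as the feature map
```

For example, `expansion_rate_table_shape(w, h, 2, channel_out=3)` returns (w, h, 6), matching the channel_in = 2, channel_out = 3 example discussed later.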
In step S12, the inflation rate table is input into the inflation convolution layer, so that the inflation convolution layer performs inflation convolution processing on the feature map by combining the inflation rate table, so as to obtain feature data corresponding to the feature map.
In this exemplary embodiment, referring to fig. 3, the step S12 may include:
step S121, querying the expansion rate table to obtain an expansion rate value corresponding to each pixel in the feature map.
Specifically, after the expansion rate table corresponding to the feature map is calculated, both the feature map and the expansion rate table may be input to the expansion convolution layer. The expansion convolution layer queries the expansion rate table to obtain the expansion rate value corresponding to each pixel, adjusts the sampling range of the convolution operation according to that value, and then performs the convolution operation to process the features. Sampling uses the same interval in the horizontal and vertical directions of a pixel, based on the expansion rate value corresponding to that pixel: when the expansion rate value corresponding to a pixel is d, the sampling points of the expansion convolution operation are spaced d apart both horizontally and vertically.
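As a minimal sketch of this sampling rule (assuming a 3 × 3 kernel; the helper name is hypothetical), the sampling grid for a pixel at (row, col) with expansion rate d is:

```python
def sample_coords(row, col, d):
    """3x3 dilated sampling grid centered at (row, col) with spacing d.
    Because d may be fractional, the returned coordinates may be fractional."""
    return [(row + i * d, col + j * d)
            for i in (-1, 0, 1) for j in (-1, 0, 1)]
```

With d = 1 this reduces to an ordinary 3 × 3 convolution window; with d = 2 or d = 4 it reproduces the spacings of FIG. 1-2 and FIG. 1-3.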
In some exemplary embodiments of the present disclosure, the expansion rate value d in the expansion rate table may be an integer or a non-integer (decimal). When the expansion rate value is an integer, it may be used directly.
If the expansion rate value d corresponding to a pixel in the feature map is a decimal, the coordinates of the sampling points may be decimals. Therefore, the expansion convolution layer may further be provided with an interpolation operation unit, which, when the expansion rate value is a non-integer, estimates the feature value of each sampling point by interpolating neighborhood pixels before the convolution operation of the expansion convolution layer is performed. Specifically, the method may include:
step S21, obtaining the neighborhood pixel values of the sampling point corresponding to the pixel, and calculating the corresponding neighborhood pixel interpolation based on those neighborhood pixel values;
and step S22, estimating the feature value of the sampling point according to the neighborhood pixel interpolation, and performing the expansion convolution based on the feature value of the sampling point.
For example, the feature value at a sampling coordinate can be estimated by bilinear interpolation, using the integer-coordinate pixels in a specified neighborhood and computing the weight of each pixel according to its distance from the sampling point. Specifically, the formula may be:
$$y(P_0) = \sum_{P_n \in R} w(P_n) \cdot \sum_{q \in N} G(q,\ P_0 + d \cdot P_n) \cdot x(q)$$
where x is the input feature map, y is the output feature map, $P_0$ is a coordinate position on the output feature map y, w is the convolution kernel of the expansion convolution layer at expansion rate 1, R is the set of coordinate positions within w, $P_n$ enumerates the coordinate positions in R, $G(\cdot)$ is an interpolation weight function, d is the calculated expansion rate, $(P_0 + d \cdot P_n)$ enumerates the sampling-point coordinates of the expansion convolution layer at expansion rate d, N denotes the integer coordinate positions in the neighborhood of a sampling point, and q enumerates the integer coordinates in N.
In the above formula, $G(q, P_0 + d \cdot P_n)$ calculates the interpolation weight from q and $(P_0 + d \cdot P_n)$, and the interpolation weight is inversely proportional to the spatial distance. Optionally, $G(\cdot)$ uses a bilinear or a bicubic interpolation scheme. N denotes the integer coordinate positions in the neighborhood of the sampling point; optionally, N is the set of integer coordinate positions in a 4 × 4 neighborhood of the sampling point. That is, if the coordinates of the sampling point are (row, col), and row, col are approximated by the integers row′, col′, then N covers four rows and four columns: the 16 integer coordinate positions from row (row′ − 1) to row (row′ + 2) and from column (col′ − 1) to column (col′ + 2). Referring to the embodiment shown in FIG. 4, the expansion rate d may be a decimal, so the coordinates of the sampling points may be decimals; the feature value of each sampling point is therefore estimated by interpolating neighborhood pixels, and the convolution operation of the expansion convolution layer is then performed.
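A minimal NumPy sketch of the formula above follows, assuming a single-channel feature map, a 3 × 3 kernel, and a bilinear G (so only the 2 × 2 integer neighbors of each sampling point receive nonzero weight; the 4 × 4 neighborhood variant would enumerate more integer coordinates with, e.g., bicubic weights). The function names are hypothetical:

```python
import numpy as np

def G(q, p):
    """Bilinear interpolation weight of integer coordinate q for fractional
    sampling position p; zero once q is more than one pixel from p."""
    return max(0.0, 1.0 - abs(q[0] - p[0])) * max(0.0, 1.0 - abs(q[1] - p[1]))

def adaptive_dilated_conv_pixel(x, w, p0, d):
    """y(P0) = sum over Pn in R of w(Pn) * sum over q in N of G(q, P0+d*Pn)*x(q).
    x: (H, W) input feature map; w: 3x3 kernel; p0: output coordinate (row, col);
    d: expansion rate value looked up for this pixel (may be fractional)."""
    H, W = x.shape
    y = 0.0
    for i in (-1, 0, 1):              # Pn enumerates R, the 3x3 kernel grid
        for j in (-1, 0, 1):
            pr, pc = p0[0] + i * d, p0[1] + j * d   # sampling point P0 + d*Pn
            r0, c0 = int(np.floor(pr)), int(np.floor(pc))
            val = 0.0
            for qr in (r0, r0 + 1):                 # q enumerates N
                for qc in (c0, c0 + 1):
                    if 0 <= qr < H and 0 <= qc < W:
                        val += G((qr, qc), (pr, pc)) * x[qr, qc]
            y += w[i + 1, j + 1] * val
    return y
```

When d is an integer, every sampling point lands on an integer coordinate, G collapses to a weight of 1 on that pixel, and the sketch reduces to an ordinary dilated convolution.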
For reference comparison, a conventional convolution calculation formula is typically:
$$y(P_0) = \sum_{P_n \in R} w(P_n) \cdot x(P_0 + P_n)$$
where w is the weight of the convolution kernel, $P_0$ is the coordinate of the central pixel to be computed, and $P_n$ enumerates the pixel coordinate offsets within the neighborhood. When the convolution kernel size is 3 × 3, $P_n$ ranges over the 9 pixels in the 3 × 3 neighborhood of $P_0$. The conventional dilated-convolution calculation is typically:
$$y(P_0) = \sum_{P_n \in R} w(P_n) \cdot x(P_0 + d \cdot P_n)$$
the expansion coefficient d is increased compared to the above conventional convolution. The physical meaning is that the distance between sampling points is changed from 1 to d.
Compared with the conventional dilated convolution, the embodiment of the present disclosure adds the interpolation function $G(\cdot)$ and the integer coordinates q. The physical meaning is that the pixel value at a sampling point is estimated by interpolation from the pixel values x(q) at the integer coordinates q in its neighborhood. This sampling method ensures that, even when an expansion rate value in the expansion rate table is a decimal, the corresponding sample can still be obtained accurately.
Referring to fig. 3, the step S12 may further include:
and step S122, determining the sampling range of the corresponding pixel according to the expansion rate value so as to perform expansion convolution processing.
In some exemplary embodiments of the present disclosure, when the expansion rate table has the same image size as the input feature map, pixels at the same coordinates are processed with the same expansion rate value across all output channels of the expansion convolution layer.
For example, suppose the feature map and the expansion rate table have the same image size, both Width × Height × channel_in with channel_in = 2. The expansion rate table then contains two expansion rate sub-tables, corresponding to the two channel dimensions of the feature map, and the two sub-tables are identical. Suppose the expansion convolution layer has 3 output channels. A pixel of the first channel of the feature map queries the expansion rate table to obtain its expansion rate value, the sampling interval is determined from that value, sampling is performed, and convolution then yields a first feature value; a pixel of the second channel likewise queries the expansion rate table, determines its interval, samples, and convolution yields a second feature value; the feature value of each pixel of the first output channel of the expansion convolution layer is determined from the first and second feature values. By analogy, a feature map with 3 output channels of the expansion convolution layer is obtained.
In some exemplary embodiments of the present disclosure, different expansion rates may also be calculated separately for the filter banks corresponding to the output channels of the expansion convolution layer. In this case, the image size of the expansion rate table is the product of the feature map image size and the number of output channels of the expansion convolution layer. For example, if the input feature map has size Width × Height × channel_in and the expansion convolution layer has channel_out output channels (so the output feature map has size Width × Height × channel_out), then the expansion rate table has size Width × Height × (channel_in × channel_out). Suppose the feature map has size Width × Height × channel_in with channel_in = 2, and the expansion convolution layer has 3 output channels; the expansion rate table then has size Width × Height × 6. Pixels of the first channel dimension of the feature map query the expansion rate values in the first layer dimension of the table, sample, and yield a first feature value after convolution; pixels of the second channel dimension query the second layer dimension, sample, and yield a second feature value; the feature value of the corresponding pixel of the first output channel of the expansion convolution layer is obtained from the first and second feature values. Likewise, the first channel dimension queries the third layer dimension and the second channel dimension queries the fourth, yielding third and fourth feature values that give the second output channel; and the first channel dimension queries the fifth layer dimension while the second queries the sixth, yielding fifth and sixth feature values that give the third output channel. In this way, a feature map with 3 output channels of the expansion convolution layer is obtained.
In some exemplary embodiments of the present disclosure, the filters of the output channels of the expansion convolution layer may further be grouped according to a preset rule; the expansion rate table corresponding to each group is queried according to the grouping result, so that expansion convolution processing is performed on the feature map according to the expansion rate table corresponding to each group; within each group, the same-coordinate pixels corresponding to each output channel use the same expansion rate value.
For example, the filter banks corresponding to the output channels of the expansion convolution layer may be divided into several groups, and the same-coordinate pixels of the different output channels within a group use the same expansion rate. If the input feature map has size Width × Height × channel_in and the number of groups is group, the expansion rate table has size Width × Height × (channel_in × group). In the expansion convolution layer, the filters may be divided into groups according to actual requirements, for example into groups of 4, 6, or 8. For instance, if channel_in = 2, group = 3, and the number of output channels channel_out = 3, the expansion rate table has size Width × Height × 6. As another example, when the expansion convolution layer has channel_out = 32 and group = 4, output channels 1 to 4 use the same expansion rate table, channels 5 to 8 use another, and so on. Dividing the filters into groups reduces network complexity and saves computing resources while preserving the accuracy of the expansion convolution operation.
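As a sketch of the lookup that grouping implies (assuming output channels are assigned to sub-tables in contiguous blocks, as in the channel_out = 32 example; the helper and its block-size parameter are hypothetical):

```python
def rate_subtable_for_channel(out_channel: int, block_size: int) -> int:
    """Map a 0-based output channel index to its shared expansion rate
    sub-table. With block_size = 4, channels 0-3 share sub-table 0,
    channels 4-7 share sub-table 1, and so on."""
    return out_channel // block_size
```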
In this exemplary embodiment, an image feature extraction model is provided, which includes an expansion rate calculation layer and an expansion convolution layer. The number of input channels of the model is 2 and the number of output channels is 3. The expansion rate calculation layer contains one convolution layer with a 3 × 3 filter and 2 output channels. This layer performs a convolution operation on the input feature map and outputs an expansion rate table of the same image size. The expansion convolution layer first looks up the expansion rate table to determine the expansion rate value corresponding to each pixel, expands the sampling range accordingly, and then performs the calculation.
For example, for the pixel at row 100, column 100 in the first channel of the input feature map, the same coordinate in the expansion rate table is looked up, and the corresponding expansion rate is found to be 1.5. If the filter size of the expansion convolution layer is 3 × 3, the conventional sampling coordinates of the pixel are as follows, where the first number in each bracket is the row coordinate and the second is the column coordinate:
(99,99) (99,100) (99,101)
(100,99) (100,100) (100,101)
(101,99) (101,100) (101,101)
After the expansion rate is set to 1.5, the sampling coordinates become:
(98.5,98.5) (98.5,100) (98.5,101.5)
(100,98.5) (100,100) (100,101.5)
(101.5,98.5) (101.5,100) (101.5,101.5)
because the sampling coordinate is decimal, the weight of each pixel is calculated according to the distance by using integer coordinate pixels in a 4 x 4 neighborhood by adopting a bilinear interpolation method, and the characteristic value at the sampling coordinate is estimated. Taking the coordinate point (98.5,101.5) as an example, the integer coordinates of the reference pixels are from 97 th row to 100 th row and from 100 th column to 103 th column, specifically:
(97,100) (97,101) (97,102) (97,103)
(98,100) (98,101) (98,102) (98,103)
(99,100) (99,101) (99,102) (99,103)
(100,100) (100,101) (100,102) (100,103)
and then after the characteristic value at the sampling point is estimated, the result is calculated by a convolution mode.
The data processing method provided by the embodiments of the present disclosure can be applied in deep learning neural networks for image noise reduction, image recognition, image registration, semantic segmentation, face recognition, image reconstruction, and other fields. By computing a corresponding expansion rate table for the feature map with the expansion rate calculation layer, the receptive field can be adjusted adaptively and finely at the pixel level in the expansion convolution layer according to the image features, with relatively little computation, which helps improve performance in fields such as semantic segmentation and image reconstruction. Taking image noise reduction as an example: for block noise caused by image compression, the receptive field needs to be enlarged as much as possible to smooth out the large-block noise; for texture regions rich in detail, sampling needs to be as dense as possible to preserve detail. These two goals are contradictory and hard to balance in practice; either large noise remains or high-frequency details are erased. The present scheme computes the expansion rate at the pixel level according to the image features and adaptively adjusts the receptive field, which helps resolve this contradiction.
It is to be noted that the above-mentioned figures are only schematic illustrations of the processes involved in the method according to an exemplary embodiment of the invention, and are not intended to be limiting. It will be readily understood that the processes shown in the above figures are not intended to indicate or limit the chronological order of the processes. In addition, it is also readily understood that these processes may be performed synchronously or asynchronously, e.g., in multiple modules.
Further, referring to fig. 5, in the present exemplary embodiment, a data processing apparatus 50 is further provided, including: an adaptive expansion rate table calculation module 501 and an expansion convolution processing module 502.
Wherein the content of the first and second substances,
the adaptive expansion rate table calculating module 501 may be configured to obtain a feature map to be processed, and input the feature map into an expansion rate calculating layer to obtain an expansion rate table corresponding to the feature map; wherein the expansion rate table comprises expansion rate values corresponding to pixels in the feature map.
The expansion convolution processing module 502 may be configured to input the expansion rate table into the expansion convolution layer, so that the expansion convolution layer performs expansion convolution processing on the feature map in combination with the expansion rate table to obtain the feature data corresponding to the feature map.
In one example of the present disclosure, the expansion rate calculation layer includes at least one convolution layer; the adaptive expansion rate table calculation module 501 may be configured to perform at least one convolution process on the feature map to obtain the expansion rate table.
In one example of the present disclosure, the apparatus further includes an expansion rate table image size configuration module (not shown in the figures).
The expansion rate table image size configuration module may be configured to configure the image size of the expansion rate table according to the image size of the feature map; or to configure the image size of the expansion rate table by combining the image size of the feature map with the number of output channels of the expansion convolution layer; or to configure the image size of the expansion rate table by combining the image size of the feature map with the filter grouping information of the output channels of the expansion convolution layer.
In one example of the present disclosure, the image size of the expansion rate table is the same as the image size of the feature map; or the image size of the expansion rate table is the product of the image size of the feature map and the number of output channels of the expansion convolution layer; or the image size of the expansion rate table is the product of the image size of the feature map and the number of filter groups of the expansion convolution layer's output channels.
In an example of the present disclosure, the expansion convolution processing module 502 may include a query unit and a sampling unit (not shown in the figure).
The query unit may be configured to query the expansion rate table to obtain an expansion rate value corresponding to each pixel in the feature map.
The sampling unit may be configured to determine a sampling range of a corresponding pixel according to the expansion rate value, so as to perform the expansion convolution processing.
In one example of the present disclosure, the query unit may include a first query unit (not shown in the figure).
The first query unit may be configured to perform, for each output channel of the expansion convolution layer, expansion convolution processing on pixels at the same coordinates using the same expansion rate value in every output channel; here the expansion rate table has the same image size as the feature map.
In one example of the present disclosure, the query unit may include a second query unit (not shown in the figure).
The second query unit may be configured to query the expansion rate tables corresponding to the output channels of the expansion convolution layer, so that expansion convolution processing is performed on the feature map in combination with the expansion rate table corresponding to each output channel; here the image size of the expansion rate table is the product of the image size of the feature map and the number of output channels of the expansion convolution layer.
In one example of the present disclosure, the query unit may include a third query unit (not shown in the figure).
The third query unit may be configured to group the filters of the output channels of the expansion convolution layer according to a preset rule; and to query the expansion rate table corresponding to each group according to the grouping result, so as to perform expansion convolution processing on the feature map according to the expansion rate table corresponding to each group; within each group, the same-coordinate pixels corresponding to each output channel use the same expansion rate value.
In one example of the present disclosure, the sampling unit may be configured to sample using the same interval in the horizontal and vertical directions of a pixel, based on the expansion rate value corresponding to that pixel.
In one example of the present disclosure, the expansion rate value is an integer or a non-integer.
In an example of the present disclosure, the expansion convolution processing module 502 may further include a neighborhood interpolation calculation unit (not shown in the figure).
The neighborhood interpolation calculation unit may be configured to, when the expansion rate value is a non-integer, obtain the neighborhood pixel values of the sampling point corresponding to the pixel and calculate the corresponding neighborhood pixel interpolation based on those values; and to estimate the feature value of the sampling point according to the neighborhood pixel interpolation, so as to perform the expansion convolution based on the feature value of the sampling point.
The specific details of each module in the data processing apparatus have been described in detail in the corresponding data processing method, and therefore are not described herein again.
It should be noted that although in the above detailed description several modules or units of the device for action execution are mentioned, such a division is not mandatory. Indeed, the features and functionality of two or more modules or units described above may be embodied in one module or unit, according to embodiments of the present disclosure. Conversely, the features and functions of one module or unit described above may be further divided into embodiments by a plurality of modules or units.
Fig. 6 shows a schematic diagram of an electronic device suitable for implementing an embodiment of the present disclosure.
It should be noted that the electronic device 600 shown in fig. 6 is only an example, and should not bring any limitation to the functions and the scope of the application of the embodiments of the present disclosure.
As shown in fig. 6, the electronic device 600 may specifically include: a processor 610, an internal memory 621, an external memory interface 622, a Universal Serial Bus (USB) interface 630, a charging management module 640, a power management module 641, a battery 642, an antenna 1, an antenna 2, a mobile communication module 650, a wireless communication module 660, an audio module 670, a speaker 671, a receiver 672, a microphone 673, an earphone interface 674, a sensor module 680, a display screen 690, a camera module 691, an indicator 692, a motor 693, buttons 694, and a Subscriber Identity Module (SIM) card interface 695. The sensor module 680 may include a depth sensor 6801, a pressure sensor 6802, a gyroscope sensor 6803, an air pressure sensor 6804, a magnetic sensor 6805, an acceleration sensor 6806, a distance sensor 6807, a proximity light sensor 6808, a fingerprint sensor 6809, a temperature sensor 6810, a touch sensor 6811, an ambient light sensor 6812, and a bone conduction sensor 6813.
It is to be understood that the illustrated structure of the embodiment of the present application does not constitute a specific limitation to the electronic device 600. In other embodiments of the present application, the electronic device 600 may include more or fewer components than illustrated, or combine certain components, or split certain components, or a different arrangement of components. The illustrated components may be implemented in hardware, software, or a combination of software and hardware.
Processor 610 may include one or more processing units, such as: the Processor 610 may include an Application Processor (AP), a modem Processor, a Graphics Processing Unit (GPU), an Image Signal Processor (ISP), a controller, a video codec, a Digital Signal Processor (DSP), a baseband Processor, and/or a Neural Network Processor (NPU), and the like. The different processing units may be separate devices or may be integrated into one or more processors.
The controller can generate an operation control signal according to the instruction operation code and the timing signal to complete the control of instruction fetching and instruction execution.
A memory may also be provided in the processor 610 for storing instructions and data. The memory may store instructions for implementing six modular functions: detection instructions, connection instructions, information management instructions, analysis instructions, data transmission instructions, and notification instructions, and execution is controlled by the processor 610. In some embodiments, the memory in the processor 610 is a cache memory. The memory may hold instructions or data that have just been used or recycled by the processor 610. If the processor 610 needs to use the instruction or data again, it can be called directly from the memory. Avoiding repeated accesses reduces the latency of the processor 610, thereby increasing the efficiency of the system.
In some embodiments, processor 610 may include one or more interfaces. The interfaces may include an Inter-Integrated Circuit (I2C) interface, an Inter-IC Sound (I2S) interface, a Pulse Code Modulation (PCM) interface, a Universal Asynchronous Receiver/Transmitter (UART) interface, a Mobile Industry Processor Interface (MIPI), a General-Purpose Input/Output (GPIO) interface, a Subscriber Identity Module (SIM) interface, and/or a Universal Serial Bus (USB) interface, etc.
The I2C interface is a bi-directional synchronous Serial bus including a Serial Data line (SDA) and a Serial Clock Line (SCL). In some embodiments, processor 610 may include multiple sets of I2C buses. The processor 610 may be coupled to the touch sensor 6811, the charger, the flash, the camera module 691, etc., through different I2C bus interfaces, respectively. For example: the processor 610 may be coupled to the touch sensor 6811 via an I2C interface, such that the processor 610 and the touch sensor 6811 communicate via an I2C bus interface to implement touch functionality of the electronic device 600.
The I2S interface may be used for audio communication. In some embodiments, processor 610 may include multiple sets of I2S buses. The processor 610 may be coupled to the audio module 670 via an I2S bus to enable communication between the processor 610 and the audio module 670. In some embodiments, the audio module 670 may communicate audio signals to the wireless communication module 660 via an I2S interface to enable answering a call via a bluetooth headset.
The PCM interface may also be used for audio communication, sampling, quantizing and encoding analog signals. In some embodiments, the audio module 670 and the wireless communication module 660 may be coupled by a PCM bus interface. In some embodiments, the audio module 670 may also transmit audio signals to the wireless communication module 660 through the PCM interface, so as to implement a function of answering a call through a bluetooth headset. Both the I2S interface and the PCM interface may be used for audio communication.
The UART interface is a universal serial data bus used for asynchronous communications. The bus may be a bidirectional communication bus. It converts the data to be transmitted between serial communication and parallel communication. In some embodiments, a UART interface is generally used to connect the processor 610 and the wireless communication module 660. For example: the processor 610 communicates with the bluetooth module in the wireless communication module 660 through the UART interface to implement the bluetooth function. In some embodiments, the audio module 670 may transmit the audio signal to the wireless communication module 660 through the UART interface, so as to realize the function of playing music through the bluetooth headset.
The MIPI interface may be used to connect the processor 610 with the display screen 690, the camera module 691, and other peripheral devices. The MIPI Interface includes a Camera Serial Interface (CSI), a Display Serial Interface (DSI), and the like. In some embodiments, the processor 610 and the camera module 691 communicate via a CSI interface to implement the camera function of the electronic device 600. The processor 610 and the display screen 690 communicate via the DSI interface to implement the display function of the electronic device 600.
The GPIO interface may be configured by software. The GPIO interface may be configured as a control signal and may also be configured as a data signal. In some embodiments, a GPIO interface may be used to connect the processor 610 with the camera module 691, the display screen 690, the wireless communication module 660, the audio module 670, the sensor module 680, and the like. The GPIO interface may also be configured as an I2C interface, an I2S interface, a UART interface, a MIPI interface, and the like.
The USB interface 630 is an interface conforming to the USB standard specification, and may specifically be a Mini-USB interface, a Micro-USB interface, a USB Type-C interface, or the like. The USB interface 630 may be used to connect a charger to charge the electronic device 600, and may also be used to transmit data between the electronic device 600 and peripheral devices. It can also be used to connect earphones and play audio through them. The interface may also be used to connect other electronic devices, such as AR devices.
It should be understood that the connection relationship between the modules according to the embodiment of the present invention is only illustrative, and is not limited to the structure of the electronic device 600. In other embodiments of the present application, the electronic device 600 may also adopt different interface connection manners or a combination of multiple interface connection manners in the above embodiments.
The charging management module 640 is configured to receive charging input from a charger. The charger may be a wireless charger or a wired charger. In some wired charging embodiments, the charging management module 640 may receive charging input from a wired charger via the USB interface 630. In some wireless charging embodiments, the charging management module 640 may receive a wireless charging input through a wireless charging coil of the electronic device 600. The charging management module 640 may also supply power to the electronic device through the power management module 641 while charging the battery 642.
The power management module 641 is configured to connect the battery 642, the charging management module 640 and the processor 610. The power management module 641 receives the input from the battery 642 and/or the charging management module 640, and supplies power to the processor 610, the internal memory 621, the display screen 690, the camera module 691, the wireless communication module 660, and the like. The power management module 641 may also be configured to monitor battery capacity, battery cycle count, battery state of health (leakage, impedance), and other parameters. In some other embodiments, the power management module 641 may be disposed in the processor 610. In other embodiments, the power management module 641 and the charging management module 640 may be disposed in the same device.
The wireless communication function of the electronic device 600 may be implemented by the antenna 1, the antenna 2, the mobile communication module 650, the wireless communication module 660, the modem processor, the baseband processor, and the like.
The antennas 1 and 2 are used for transmitting and receiving electromagnetic wave signals. Each antenna in the electronic device 600 may be used to cover a single or multiple communication bands. Different antennas can also be multiplexed to improve the utilization of the antennas. For example: the antenna 1 may be multiplexed as a diversity antenna of a wireless local area network. In other embodiments, the antenna may be used in conjunction with a tuning switch.
The mobile communication module 650 may provide a solution including 2G/3G/4G/5G wireless communication applied to the electronic device 600. The mobile communication module 650 may include at least one filter, a switch, a power Amplifier, a Low Noise Amplifier (LNA), and the like. The mobile communication module 650 may receive the electromagnetic wave from the antenna 1, filter, amplify, etc. the received electromagnetic wave, and transmit the filtered electromagnetic wave to the modem processor for demodulation. The mobile communication module 650 may also amplify the signal modulated by the modem processor, and convert the signal into electromagnetic wave through the antenna 1 to radiate the electromagnetic wave. In some embodiments, at least some of the functional modules of the mobile communication module 650 may be disposed in the processor 610. In some embodiments, at least some of the functional blocks of the mobile communication module 650 may be disposed in the same device as at least some of the blocks of the processor 610.
The modem processor may include a modulator and a demodulator. The modulator is used for modulating a low-frequency baseband signal to be transmitted into a medium-high frequency signal. The demodulator is used for demodulating the received electromagnetic wave signal into a low-frequency baseband signal. The demodulator then passes the demodulated low frequency baseband signal to a baseband processor for processing. The low frequency baseband signal is processed by the baseband processor and then transferred to the application processor. The application processor outputs a sound signal through an audio device (not limited to the speaker 671, the receiver 672, etc.) or displays an image or video through the display screen 690. In some embodiments, the modem processor may be a stand-alone device. In other embodiments, the modem processor may be separate from the processor 610, and may be located in the same device as the mobile communication module 650 or other functional modules.
The Wireless Communication module 660 may provide a solution for Wireless Communication applied to the electronic device 600, including Wireless Local Area Networks (WLANs) (e.g., Wireless Fidelity (Wi-Fi) network), Bluetooth (BT), Global Navigation Satellite System (GNSS), Frequency Modulation (FM), Near Field Communication (NFC), Infrared (IR), and the like. The wireless communication module 660 may be one or more devices integrating at least one communication processing module. The wireless communication module 660 receives electromagnetic waves via the antenna 2, performs frequency modulation and filtering on electromagnetic wave signals, and transmits the processed signals to the processor 610. The wireless communication module 660 may also receive a signal to be transmitted from the processor 610, perform frequency modulation and amplification on the signal, and convert the signal into electromagnetic waves through the antenna 2 to radiate the electromagnetic waves.
In some embodiments, antenna 1 of the electronic device 600 is coupled to the mobile communication module 650 and antenna 2 is coupled to the wireless communication module 660, so that the electronic device 600 can communicate with networks and other devices via wireless communication technologies. The wireless communication technologies may include Global System for Mobile communications (GSM), General Packet Radio Service (GPRS), Code Division Multiple Access (CDMA), Wideband Code Division Multiple Access (WCDMA), Time-Division Synchronous Code Division Multiple Access (TD-SCDMA), Long Term Evolution (LTE), BT, GNSS, WLAN, NFC, FM, and/or IR technologies. The GNSS may include the Global Positioning System (GPS), the Global Navigation Satellite System (GLONASS), the BeiDou Navigation Satellite System (BDS), the Quasi-Zenith Satellite System (QZSS), and/or Satellite Based Augmentation Systems (SBAS).
The electronic device 600 implements display functions via the GPU, the display screen 690, the application processor, and the like. The GPU is a microprocessor for image processing, and is connected to the display screen 690 and an application processor. The GPU is used to perform mathematical and geometric calculations for graphics rendering. Processor 610 may include one or more GPUs that execute program instructions to generate or alter display information.
The display screen 690 is used to display images, video, and the like. The display screen 690 includes a display panel. The display panel may be a Liquid Crystal Display (LCD), an Organic Light-Emitting Diode (OLED), an Active-Matrix Organic Light-Emitting Diode (AMOLED), a Flexible Light-Emitting Diode (FLED), a Mini-LED, a Micro-LED, a Micro-OLED, a Quantum dot Light-Emitting Diode (QLED), or the like. In some embodiments, the electronic device 600 may include 1 or N display screens 690, N being a positive integer greater than 1.
The electronic device 600 may implement a shooting function through the ISP, the camera module 691, the video codec, the GPU, the display screen 690, the application processor, and the like.
The ISP is used to process the data fed back by the camera module 691. For example, when a photo is taken, the shutter is opened and light is transmitted through the lens to the photosensitive element of the camera, which converts the optical signal into an electrical signal and transmits it to the ISP to be processed and converted into an image visible to the naked eye. The ISP can also perform algorithm optimization on the noise, brightness, and skin tone of the image, and can optimize parameters such as the exposure and color temperature of the shooting scene. In some embodiments, the ISP may be disposed in the camera module 691.
The camera module 691 is used to capture still images or video. An object generates an optical image through the lens, which is projected onto the photosensitive element. The photosensitive element may be a Charge Coupled Device (CCD) or a Complementary Metal-Oxide-Semiconductor (CMOS) phototransistor. The photosensitive element converts the optical signal into an electrical signal and passes it to the ISP, which converts it into a digital image signal. The ISP outputs the digital image signal to the DSP for processing, and the DSP converts the digital image signal into an image signal in a standard format such as RGB or YUV. In some embodiments, the electronic device 600 may include 1 or N camera modules 691, where N is a positive integer greater than 1; if the electronic device 600 includes N cameras, one of the N cameras is the main camera.
The digital signal processor is used to process digital signals; in addition to digital image signals, it can process other digital signals. For example, when the electronic device 600 performs frequency-point selection, the digital signal processor is used to perform a Fourier transform or the like on the frequency-point energy.
Video codecs are used to compress or decompress digital video. The electronic device 600 may support one or more video codecs, so that it can play or record video in multiple encoding formats, such as Moving Picture Experts Group (MPEG)-1, MPEG-2, MPEG-3, and MPEG-4.
The NPU is a Neural-Network (NN) computing processor. By drawing on the structure of biological neural networks, for example the transfer mode between neurons in the human brain, it processes input information quickly and can also learn continuously by itself. Applications such as intelligent recognition of the electronic device 600, for example image recognition, face recognition, speech recognition, and text understanding, can be realized through the NPU.
The external memory interface 622 may be used to connect an external memory card, such as a Micro SD card, to extend the memory capability of the electronic device 600. The external memory card communicates with the processor 610 through the external memory interface 622 to implement data storage functions. For example, files such as music, video, etc. are saved in an external memory card.
The internal memory 621 may be used to store computer-executable program code, which includes instructions. The internal memory 621 may include a program storage area and a data storage area. The program storage area may store an operating system, an application program required by at least one function (such as a sound playing function or an image playing function), and the like. The data storage area may store data created during use of the electronic device 600 (such as audio data and a phone book). In addition, the internal memory 621 may include a high-speed random access memory and may further include a nonvolatile memory, such as at least one magnetic disk storage device, a flash memory device, or a Universal Flash Storage (UFS). The processor 610 performs various functional applications and data processing of the electronic device 600 by executing instructions stored in the internal memory 621 and/or instructions stored in a memory provided in the processor.
The electronic device 600 may implement audio functions through the audio module 670, the speaker 671, the receiver 672, the microphone 673, the headset interface 674, an application processor, and the like. Such as music playing, recording, etc.
The audio module 670 is used to convert digital audio information into an analog audio signal output and also used to convert an analog audio input into a digital audio signal. The audio module 670 may also be used to encode and decode audio signals. In some embodiments, the audio module 670 may be disposed in the processor 610, or some functional modules of the audio module 670 may be disposed in the processor 610.
The speaker 671, also called a "horn", is used to convert audio electrical signals into sound signals. The electronic device 600 can play music or conduct a hands-free call through the speaker 671.
The receiver 672, also called an "earpiece", is used to convert audio electrical signals into sound signals. When the electronic device 600 receives a call or voice information, the voice can be heard by holding the receiver 672 close to the ear.
The microphone 673, also called a "mouthpiece", is used to convert sound signals into electrical signals. When making a call or sending voice information, the user can input a sound signal by speaking with the mouth close to the microphone 673. The electronic device 600 may be provided with at least one microphone 673. In other embodiments, the electronic device 600 may be provided with two microphones 673 to implement a noise reduction function in addition to collecting sound signals. In still other embodiments, the electronic device 600 may be provided with three, four, or more microphones 673 to collect sound signals, reduce noise, identify sound sources, perform directional recording, and so on.
The headset interface 674 is used to connect wired headsets. The headset interface 674 may be the USB interface 630, or may be a 3.5 mm Open Mobile Terminal Platform (OMTP) standard interface or a Cellular Telecommunications Industry Association of the USA (CTIA) standard interface.
The depth sensor 6801 is used to obtain depth information of the scene. In some embodiments, the depth sensor may be disposed in the camera module 691.
The pressure sensor 6802 is used to sense pressure signals and convert them into electrical signals. In some embodiments, the pressure sensor 6802 may be disposed on the display screen 690. There are many types of pressure sensors 6802, such as resistive pressure sensors, inductive pressure sensors, and capacitive pressure sensors. A capacitive pressure sensor may comprise at least two parallel plates of electrically conductive material. When a force acts on the pressure sensor 6802, the capacitance between the electrodes changes, and the electronic device 600 determines the strength of the pressure from the change in capacitance. When a touch operation is applied to the display screen 690, the electronic device 600 detects the intensity of the touch operation through the pressure sensor 6802, and can also calculate the position of the touch from the detection signal of the pressure sensor 6802. In some embodiments, touch operations applied to the same touch position but with different touch intensities may correspond to different operation instructions. For example, when a touch operation whose intensity is smaller than a first pressure threshold acts on the short message application icon, an instruction to view the short message is executed; when a touch operation whose intensity is greater than or equal to the first pressure threshold acts on the short message application icon, an instruction to create a new short message is executed.
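For implementers, the intensity-threshold dispatch described above reduces to a single comparison. A minimal sketch in Python; the function and threshold names are illustrative, not taken from the patent:

```python
def dispatch_touch(intensity, first_pressure_threshold):
    """Map touch intensity at the short-message icon to an instruction,
    following the two-branch rule described above (names assumed)."""
    if intensity < first_pressure_threshold:
        return "view_short_message"       # light press
    return "create_new_short_message"     # firm press
```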
The gyro sensor 6803 may be used to determine the motion posture of the electronic device 600. In some embodiments, the angular velocity of the electronic device 600 about three axes (i.e., the x, y, and z axes) may be determined by the gyro sensor 6803. The gyro sensor 6803 can be used for image stabilization when shooting. For example, when the shutter is pressed, the gyro sensor 6803 detects the shake angle of the electronic device 600 and calculates the distance that the lens module needs to compensate, allowing the lens to counteract the shake of the electronic device 600 through reverse movement, thereby achieving stabilization. The gyro sensor 6803 can also be used in navigation and motion-sensing game scenarios.
The air pressure sensor 6804 is used to measure air pressure. In some embodiments, the electronic device 600 calculates the altitude from the barometric pressure value measured by the air pressure sensor 6804 to assist in positioning and navigation.
The magnetic sensor 6805 comprises a Hall sensor. The electronic device 600 may use the magnetic sensor 6805 to detect the opening and closing of a flip holster. In some embodiments, when the electronic device 600 is a flip phone, it can detect the opening and closing of the flip cover according to the magnetic sensor 6805, and features such as automatic unlocking upon flipping open can then be set according to the detected open or closed state of the holster or flip cover.
The acceleration sensor 6806 can detect the magnitude of acceleration of the electronic device 600 in various directions (typically along three axes), and can detect the magnitude and direction of gravity when the electronic device 600 is stationary. It can also be used to recognize the posture of the electronic device, for applications such as landscape/portrait switching and step counting.
The distance sensor 6807 is used to measure distance. The electronic device 600 may measure distance by infrared or laser. In some embodiments, such as in a shooting scenario, the electronic device 600 may use the distance sensor 6807 to measure distance to achieve fast focusing.
The proximity light sensor 6808 may include, for example, a Light Emitting Diode (LED) and a light detector such as a photodiode. The light emitting diode may be an infrared light emitting diode. The electronic device 600 emits infrared light outward through the light emitting diode and uses the photodiode to detect infrared light reflected from nearby objects. When sufficient reflected light is detected, the electronic device 600 can determine that there is an object nearby; when insufficient reflected light is detected, it can determine that there is no object nearby. The electronic device 600 can use the proximity light sensor 6808 to detect that the user is holding the electronic device 600 close to the ear during a call, so as to automatically turn off the screen to save power. The proximity light sensor 6808 can also be used in holster mode and pocket mode to automatically unlock and lock the screen.
The fingerprint sensor 6809 is for collecting a fingerprint. The electronic device 600 can utilize the collected fingerprint characteristics to achieve fingerprint unlocking, access an application lock, fingerprint photographing, fingerprint incoming call answering, and the like.
The temperature sensor 6810 is used to detect temperature. In some embodiments, the electronic device 600 implements a temperature handling strategy using the temperature detected by the temperature sensor 6810. For example, when the temperature reported by the temperature sensor 6810 exceeds a threshold, the electronic device 600 reduces the performance of a processor located near the temperature sensor 6810, so as to reduce power consumption and implement thermal protection. In other embodiments, the electronic device 600 heats the battery 642 when the temperature is below another threshold, to avoid an abnormal shutdown caused by low temperature. In still other embodiments, when the temperature is below a further threshold, the electronic device 600 boosts the output voltage of the battery 642 to avoid an abnormal shutdown caused by low temperature.
The touch sensor 6811 is also referred to as a "touch device". The touch sensor 6811 may be disposed on the display screen 690; together they form a touch screen, also called a "touch-controlled screen". The touch sensor 6811 is used to detect a touch operation applied on or near it, and can pass the detected touch operation to the application processor to determine the touch event type. Visual output associated with the touch operation may be provided through the display screen 690. In other embodiments, the touch sensor 6811 may be disposed on the surface of the electronic device 600 at a location different from that of the display screen 690.
The ambient light sensor 6812 is used to sense the ambient light level. Electronic device 600 may adaptively adjust the brightness of display 690 based on the perceived ambient light level. The ambient light sensor 6812 can also be used to automatically adjust the white balance when taking a picture. The ambient light sensor 6812 can also cooperate with the proximity light sensor 6808 to detect whether the electronic device 600 is in a pocket for protection against accidental touches.
The bone conduction sensor 6813 can acquire vibration signals. In some embodiments, the bone conduction sensor 6813 can acquire the vibration signal of the bone mass vibrated by the human voice, and can also contact the human pulse to receive a blood-pressure beating signal. In some embodiments, the bone conduction sensor 6813 may also be disposed in a headset to form a bone conduction headset. The audio module 670 may parse out a voice signal based on the vibration signal of the vocal-part bone mass acquired by the bone conduction sensor 6813, so as to implement a voice function. The application processor may parse out heart rate information based on the blood-pressure beating signal acquired by the bone conduction sensor 6813, so as to implement a heart rate detection function.
The keys 694 include a power key, volume keys, and the like, and may be mechanical keys or touch keys. The electronic device 600 may receive key input and generate key signal input related to user settings and function control of the electronic device 600.
The motor 693 may generate vibration prompts. The motor 693 can be used for incoming call vibration prompts as well as touch vibration feedback. For example, touch operations applied to different applications (e.g., photographing, audio playing) may correspond to different vibration feedback effects, and touch operations applied to different areas of the display screen 690 may also correspond to different vibration feedback effects. Different application scenarios (such as time reminders, received messages, alarm clocks, and games) may likewise correspond to different vibration feedback effects. The touch vibration feedback effect may also be customized.
Indicator 692 may be an indicator light that may be used to indicate a state of charge, a change in charge, or may be used to indicate a message, a missed call, a notification, etc.
The SIM card interface 695 is used to connect a SIM card. A SIM card can be attached to or detached from the electronic device 600 by being inserted into or pulled out of the SIM card interface 695. The electronic device 600 may support 1 or N SIM card interfaces, N being a positive integer greater than 1. The SIM card interface 695 can support a Nano SIM card, a Micro SIM card, a SIM card, and the like. Multiple cards, of the same or different types, can be inserted into the same SIM card interface 695 at the same time. The SIM card interface 695 is also compatible with different types of SIM cards and with external memory cards. The electronic device 600 interacts with the network through the SIM card to implement functions such as calling and data communication. In some embodiments, the electronic device 600 employs an eSIM, i.e., an embedded SIM card, which can be embedded in the electronic device 600 and cannot be separated from it.
In particular, according to an embodiment of the present invention, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the invention include a computer program product comprising a computer program embodied on a computer-readable medium, the computer program comprising program code for performing the method illustrated in the flowchart. In such an embodiment, the computer program may be downloaded and installed from a network via the communication section, and/or installed from a removable medium. When executed by a Central Processing Unit (CPU), the computer program performs the various functions defined in the system of the present application.
It should be noted that the computer readable medium shown in the embodiment of the present invention may be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a Read-Only Memory (ROM), an Erasable Programmable Read-Only Memory (EPROM), a flash Memory, an optical fiber, a portable Compact Disc Read-Only Memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present invention, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In the present invention, however, a computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wired, etc., or any suitable combination of the foregoing.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units described in the embodiments of the present invention may be implemented by software or by hardware, and the described units may also be disposed in a processor. The names of these units do not, in certain cases, constitute a limitation on the units themselves.
It should be noted that, as another aspect, the present application also provides a computer-readable medium, which may be included in the electronic device described in the above embodiments, or may exist separately without being assembled into the electronic device. The computer readable medium carries one or more programs which, when executed by an electronic device, cause the electronic device to implement the methods described in the above embodiments. For example, the electronic device may implement the steps shown in fig. 2.
Furthermore, the above-described figures are merely schematic illustrations of processes involved in methods according to exemplary embodiments of the invention, and are not intended to be limiting. It will be readily understood that the processes shown in the above figures are not intended to indicate or limit the chronological order of the processes. In addition, it is also readily understood that these processes may be performed synchronously or asynchronously, e.g., in multiple modules.
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This application is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.
It will be understood that the present disclosure is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the present disclosure is to be limited only by the terms of the appended claims.

Claims (14)

1. A data processing method, comprising:
acquiring a feature map to be processed, and inputting the feature map into an expansion rate calculation layer to acquire an expansion rate table corresponding to the feature map; wherein the expansion rate table comprises expansion rate values corresponding to pixels in the feature map;
and inputting the expansion rate table into an expansion convolution layer, so that the expansion convolution layer performs expansion convolution processing on the feature map in combination with the expansion rate table to obtain feature data corresponding to the feature map.
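For readers approaching claim 1 from an implementation angle, the following is a minimal NumPy sketch of the claimed two-stage flow: look up a per-pixel dilation rate from the table, then gather a dilated neighborhood at that rate. It assumes a 3×3 kernel, zero padding, integer rate values, and a single rate table shared by all output channels; every function and variable name is illustrative rather than taken from the patent.

```python
import numpy as np

def adaptive_dilated_conv(feature_map, weights, rate_table):
    """Sketch of claim 1: each output pixel samples its 3x3 neighborhood
    at the dilation rate read from a per-pixel rate table.
    feature_map: (C_in, H, W); weights: (C_out, C_in, 3, 3);
    rate_table: (H, W) of integer dilation rates."""
    c_out, c_in, kh, kw = weights.shape
    _, h, w = feature_map.shape
    out = np.zeros((c_out, h, w), dtype=feature_map.dtype)
    for y in range(h):
        for x in range(w):
            d = int(rate_table[y, x])   # per-pixel rate lookup (claim 5)
            patch = np.zeros((c_in, kh, kw), dtype=feature_map.dtype)
            for i in range(kh):
                for j in range(kw):
                    # equal spacing horizontally and vertically (cf. claim 9)
                    sy = y + (i - kh // 2) * d
                    sx = x + (j - kw // 2) * d
                    if 0 <= sy < h and 0 <= sx < w:   # zero padding outside
                        patch[:, i, j] = feature_map[:, sy, sx]
            out[:, y, x] = np.tensordot(weights, patch, axes=3)
    return out
```

Under this reading, a rate table of all ones reduces to an ordinary 3×3 convolution, which is a handy sanity check; the per-pixel Python loop is only for clarity, and a practical implementation would vectorize the gather.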
2. The data processing method of claim 1, wherein the expansion ratio calculation layer comprises at least one convolutional layer;
the inputting the feature map into an expansion rate calculation layer to obtain an expansion rate table corresponding to the feature map includes:
and performing convolution processing on the feature map at least once to obtain the expansion rate table.
3. A data processing method according to claim 1 or 2, characterized in that the method further comprises:
configuring the image size of the expansion rate table according to the image size of the feature map; or
combining the image size of the feature map and the number of output channels of the expansion convolutional layer to configure the image size of the expansion rate table; or
combining the image size of the feature map with the filter grouping information of the output channels of the expansion convolutional layer to configure the image size of the expansion rate table.
4. A data processing method according to claim 3, wherein the image size of the expansion rate table is the same as the image size of the feature map; or
the image size of the expansion rate table is the product of the image size of the feature map and the number of output channels of the expansion convolutional layer; or
the image size of the expansion rate table is the product of the image size of the feature map and the number of filter groups of the output channels of the expansion convolutional layer.
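The three size options of claims 3-4 amount to three table shapes. A small helper summarizing them; the mode names are assumptions for illustration:

```python
def rate_table_shape(h, w, mode, c_out=None, n_groups=None):
    """Sketch of the table-size options in claims 3-4 (names assumed).
    'shared'  : one table for all output channels  -> (H, W)
    'per_out' : one table per output channel       -> (C_out, H, W)
    'grouped' : one table per filter group         -> (G, H, W)"""
    if mode == "shared":
        return (h, w)
    if mode == "per_out":
        return (c_out, h, w)
    if mode == "grouped":
        return (n_groups, h, w)
    raise ValueError(mode)
```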
5. The data processing method of claim 1, wherein the inputting the expansion rate table into an expansion convolution layer to enable the expansion convolution layer to perform expansion convolution processing on the feature map in combination with the expansion rate table comprises:
querying the expansion rate table to obtain the expansion rate value corresponding to each pixel in the feature map;
and determining the sampling range of the corresponding pixel according to the expansion rate value, so as to perform the expansion convolution processing.
6. The data processing method according to claim 5, wherein said querying the expansion rate table to obtain an expansion rate value corresponding to each pixel in the feature map comprises:
performing, in each output channel of the expansion convolution layer, the expansion convolution processing with the same expansion rate value for pixels at the same coordinates; wherein the image size of the expansion rate table is the same as that of the feature map.
7. The data processing method according to claim 5, wherein said querying the expansion rate table to obtain an expansion rate value corresponding to each pixel in the feature map comprises:
querying the expansion rate table corresponding to each output channel of the expansion convolution layer, so as to perform the expansion convolution processing on the feature map in combination with the expansion rate table corresponding to each output channel; wherein the image size of the expansion rate table is the product of the image size of the feature map and the number of output channels of the expansion convolutional layer.
8. The data processing method according to claim 5, wherein said querying the expansion rate table to obtain an expansion rate value corresponding to each pixel in the feature map comprises:
grouping filters of each output channel of the expansion convolutional layer according to a preset rule;
querying the expansion rate table corresponding to each group according to the grouping result, so as to perform the expansion convolution processing on the feature map according to the expansion rate table corresponding to each group; wherein, in each group, pixels at the same coordinates in all output channels adopt the same expansion rate value.
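Claims 6-8 differ only in how a pixel's rate is looked up. A sketch of the three lookup schemes, assuming NumPy arrays and a channel-to-group index map for the grouped case of claim 8; all names are illustrative:

```python
def rate_for(rate_tables, y, x, out_ch, group_of=None):
    """Sketch of the lookup schemes in claims 6-8 (names assumed).
    rate_tables: (H, W) shared table, (C_out, H, W) per-channel tables,
    or (G, H, W) grouped tables plus a channel->group map."""
    if rate_tables.ndim == 2:          # claim 6: shared across channels
        return rate_tables[y, x]
    if group_of is None:               # claim 7: one table per channel
        return rate_tables[out_ch, y, x]
    return rate_tables[group_of[out_ch], y, x]   # claim 8: per group
```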
9. The data processing method according to any one of claims 5 to 8, wherein said determining a sampling range of corresponding pixels according to said expansion rate value comprises:
sampling, based on the expansion rate value corresponding to the pixel, with the same interval in the horizontal direction and the vertical direction of the pixel.
10. A data processing method according to any of claims 5 to 8, wherein the expansion rate value is an integer or a non-integer.
11. The data processing method according to claim 10, wherein when the expansion rate value is a non-integer, the performing expansion convolution according to the expansion rate value corresponding to each pixel comprises:
acquiring neighborhood pixel values of sampling points corresponding to the pixels, and calculating corresponding neighborhood pixel interpolation values based on the neighborhood pixel values;
and estimating the characteristic value of the sampling point according to the neighborhood pixel interpolation so as to carry out expansion convolution based on the characteristic value of the sampling point.
12. A data processing apparatus, comprising:
an adaptive expansion rate table calculation module, configured to acquire a feature map to be processed and input the feature map into an expansion rate calculation layer to acquire an expansion rate table corresponding to the feature map; wherein the expansion rate table comprises expansion rate values corresponding to pixels in the feature map;
and an expansion convolution processing module, configured to input the expansion rate table into the expansion convolution layer, so that the expansion convolution layer performs expansion convolution processing on the feature map in combination with the expansion rate table to obtain feature data corresponding to the feature map.
13. A computer-readable medium, on which a computer program is stored which, when being executed by a processor, carries out the data processing method of any one of claims 1 to 11.
14. An electronic device, comprising:
one or more processors;
storage means for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to carry out a data processing method as claimed in any one of claims 1 to 11.
CN202010963275.0A 2020-09-14 2020-09-14 Data processing method and device, computer readable medium and electronic equipment Pending CN112037157A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010963275.0A CN112037157A (en) 2020-09-14 2020-09-14 Data processing method and device, computer readable medium and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010963275.0A CN112037157A (en) 2020-09-14 2020-09-14 Data processing method and device, computer readable medium and electronic equipment

Publications (1)

Publication Number Publication Date
CN112037157A true CN112037157A (en) 2020-12-04

Family

ID=73589857

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010963275.0A Pending CN112037157A (en) 2020-09-14 2020-09-14 Data processing method and device, computer readable medium and electronic equipment

Country Status (1)

Country Link
CN (1) CN112037157A (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20040065806A (en) * 2003-01-16 2004-07-23 엘지전자 주식회사 Cubic convolution interpolation apparatus and method
CN108961253A (en) * 2018-06-19 2018-12-07 深动科技(北京)有限公司 A kind of image partition method and device
CN109284782A (en) * 2018-09-13 2019-01-29 北京地平线机器人技术研发有限公司 Method and apparatus for detecting feature
CN110543849A (en) * 2019-08-30 2019-12-06 北京市商汤科技开发有限公司 detector configuration method and device, electronic equipment and storage medium
CN111311629A (en) * 2020-02-21 2020-06-19 京东方科技集团股份有限公司 Image processing method, image processing device and equipment

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112734015A (en) * 2021-01-14 2021-04-30 北京市商汤科技开发有限公司 Network generation method and device, electronic equipment and storage medium
CN112734015B (en) * 2021-01-14 2023-04-07 北京市商汤科技开发有限公司 Network generation method and device, electronic equipment and storage medium
CN113570859A (en) * 2021-07-23 2021-10-29 江南大学 Traffic flow prediction method based on asynchronous space-time expansion graph convolution network
CN113570859B (en) * 2021-07-23 2022-07-22 江南大学 Traffic flow prediction method based on asynchronous space-time expansion graph convolution network

Similar Documents

Publication Publication Date Title
CN113810601B (en) Terminal image processing method and device and terminal equipment
CN111552451B (en) Display control method and device, computer readable medium and terminal equipment
CN111770282B (en) Image processing method and device, computer readable medium and terminal equipment
CN112954251B (en) Video processing method, video processing device, storage medium and electronic equipment
WO2022100685A1 (en) Drawing command processing method and related device therefor
CN110248037B (en) Identity document scanning method and device
CN112700377A (en) Image floodlight processing method and device and storage medium
CN114422340A (en) Log reporting method, electronic device and storage medium
CN111741303A (en) Deep video processing method and device, storage medium and electronic equipment
CN112037157A (en) Data processing method and device, computer readable medium and electronic equipment
CN114880251A (en) Access method and access device of storage unit and terminal equipment
CN112188094B (en) Image processing method and device, computer readable medium and terminal equipment
CN113852755A (en) Photographing method, photographing apparatus, computer-readable storage medium, and program product
CN113467735A (en) Image adjusting method, electronic device and storage medium
CN112637481A (en) Image scaling method and device
CN114005016A (en) Image processing method, electronic equipment, image processing system and chip system
CN113518189A (en) Shooting method, shooting system, electronic equipment and storage medium
CN113674258B (en) Image processing method and related equipment
CN113923351B (en) Method, device and storage medium for exiting multi-channel video shooting
CN113596320B (en) Video shooting variable speed recording method, device and storage medium
CN114466238B (en) Frame demultiplexing method, electronic device and storage medium
WO2022033344A1 (en) Video stabilization method, and terminal device and computer-readable storage medium
CN115412678A (en) Exposure processing method and device and electronic equipment
CN115393676A (en) Gesture control optimization method and device, terminal and storage medium
CN111460942A (en) Proximity detection method and device, computer readable medium and terminal equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination