CN112037157B - Data processing method and device, computer readable medium and electronic equipment - Google Patents

Data processing method and device, computer readable medium and electronic equipment

Info

Publication number
CN112037157B
Authority
CN
China
Prior art keywords
expansion
expansion rate
feature map
rate table
convolution
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010963275.0A
Other languages
Chinese (zh)
Other versions
CN112037157A (en)
Inventor
张弓
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong Oppo Mobile Telecommunications Corp Ltd
Original Assignee
Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong Oppo Mobile Telecommunications Corp Ltd filed Critical Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority to CN202010963275.0A priority Critical patent/CN112037157B/en
Publication of CN112037157A publication Critical patent/CN112037157A/en
Application granted granted Critical
Publication of CN112037157B publication Critical patent/CN112037157B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 Image enhancement or restoration
    • G06T5/20 Image enhancement or restoration using local operators
    • G06T5/30 Erosion or dilatation, e.g. thinning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/06 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N3/061 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using biological neurons, e.g. biological neurons connected to an integrated circuit
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20004 Adaptive image processing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20024 Filtering details
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20084 Artificial neural networks [ANN]

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Theoretical Computer Science (AREA)
  • Molecular Biology (AREA)
  • Neurology (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Microelectronics & Electronic Packaging (AREA)
  • Image Processing (AREA)

Abstract

The present disclosure relates to the field of data processing technologies, and in particular to a data processing method, a data processing apparatus, a computer readable medium, and an electronic device. The method comprises: acquiring a feature map to be processed, and inputting the feature map into an expansion rate calculation layer to obtain an expansion rate table corresponding to the feature map, wherein the expansion rate table comprises expansion rate values corresponding to pixels in the feature map; and inputting the expansion rate table into an expansion convolution layer, so that the expansion convolution layer performs expansion convolution processing on the feature map in combination with the expansion rate table, thereby obtaining feature data corresponding to the feature map. The method adaptively and finely adjusts the receptive field for each pixel in the feature map, which avoids missing features during feature extraction; it can also increase the sampling density so that detail features are preserved.

Description

Data processing method and device, computer readable medium and electronic equipment
Technical Field
The present disclosure relates to the field of data processing technologies, and in particular, to a data processing method, a data processing apparatus, a computer readable medium, and an electronic device.
Background
When a deep learning algorithm is used for image processing, enlarging the receptive field helps the neural network receive information from a larger range and extract more complex information. One conventional way to enlarge the receptive field is to increase the number of layers of the neural network, since deeper layers receive more information. Its disadvantage is that it greatly increases network complexity and makes convergence more difficult. Another way is expansion (dilated) convolution, which enlarges the receptive field by enlarging the filter size. Its disadvantage is that the convolution regularly skips some pixels, which can create holes in the resulting features and cause features to be omitted.
It should be noted that the information disclosed in the above background section is only for enhancing understanding of the background of the present disclosure and thus may include information that does not constitute prior art known to those of ordinary skill in the art.
Disclosure of Invention
The present disclosure provides a data processing method, a data processing apparatus, a computer readable medium, and a terminal device, capable of adaptively adjusting a receptive field in a calculation process of an expansion convolution, and avoiding feature omission.
Other features and advantages of the present disclosure will be apparent from the following detailed description, or may be learned in part by the practice of the disclosure.
According to a first aspect of the present disclosure, there is provided a data processing method comprising:
acquiring a feature map to be processed, and inputting the feature map into an expansion rate calculation layer to acquire an expansion rate table corresponding to the feature map; wherein the expansion rate table comprises expansion rate values corresponding to pixels in the feature map;
and inputting the expansion rate table into an expansion convolution layer, so that the expansion convolution layer performs expansion convolution processing on the feature map in combination with the expansion rate table, thereby obtaining feature data corresponding to the feature map.
According to a second aspect of the present disclosure, there is provided a data processing apparatus comprising:
an adaptive expansion rate table calculation module, configured to acquire a feature map to be processed and input the feature map into an expansion rate calculation layer to obtain an expansion rate table corresponding to the feature map; wherein the expansion rate table comprises expansion rate values corresponding to pixels in the feature map;
and an expansion convolution processing module, configured to input the expansion rate table into an expansion convolution layer, so that the expansion convolution layer performs expansion convolution processing on the feature map in combination with the expansion rate table to obtain feature data corresponding to the feature map.
According to a third aspect of the present disclosure, there is provided a computer readable medium having stored thereon a computer program which, when executed by a processor, implements the data processing method described above.
According to a fourth aspect of the present disclosure, there is provided a terminal device comprising:
one or more processors;
and a storage device for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the data processing method described above.
According to the data processing method provided by the embodiments of the present disclosure, an expansion rate calculation layer is configured, and an expansion rate table corresponding to a feature map is calculated by the expansion rate calculation layer. When the expansion convolution is computed on the feature map, the expansion rate table can be queried to obtain the expansion rate value corresponding to each pixel, and sampling is performed according to that value. The receptive field is thereby adjusted adaptively and finely for each pixel in the feature map, which avoids omitting features during feature extraction; the sampling density can also be increased so that detail features are preserved.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the disclosure and together with the description, serve to explain the principles of the disclosure. It will be apparent to those of ordinary skill in the art that the drawings in the following description are merely examples of the disclosure and that other drawings may be derived from them without undue effort.
FIG. 1-1 schematically illustrates an expansion convolution sampling diagram with an expansion rate of 1 in an exemplary embodiment of the present disclosure;
FIG. 1-2 schematically illustrates an expansion convolution sampling diagram with an expansion rate of 2 in an exemplary embodiment of the present disclosure;
FIG. 1-3 schematically illustrates an expansion convolution sampling diagram with an expansion rate of 4 in an exemplary embodiment of the present disclosure;
FIG. 2 schematically illustrates a flow diagram of a data processing method in an exemplary embodiment of the present disclosure;
FIG. 3 schematically illustrates a flow diagram of the processing performed by the expansion convolution layer in an exemplary embodiment of the present disclosure;
FIG. 4 schematically illustrates a sampling result diagram of the expansion convolution layer in an exemplary embodiment of the present disclosure;
FIG. 5 schematically illustrates a composition diagram of a data processing apparatus in an exemplary embodiment of the present disclosure;
FIG. 6 schematically illustrates the electronic device structure of a terminal device in an exemplary embodiment of the present disclosure.
Detailed Description
Example embodiments will now be described more fully with reference to the accompanying drawings. However, the exemplary embodiments may be embodied in many forms and should not be construed as limited to the examples set forth herein; rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the concept of the example embodiments to those skilled in the art. The described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments.
Furthermore, the drawings are merely schematic illustrations of the present disclosure and are not necessarily drawn to scale. The same reference numerals in the drawings denote the same or similar parts, and thus a repetitive description thereof will be omitted. Some of the block diagrams shown in the figures are functional entities and do not necessarily correspond to physically or logically separate entities. These functional entities may be implemented in software or in one or more hardware modules or integrated circuits or in different networks and/or processor devices and/or microcontroller devices.
When a deep learning algorithm is used to process images, for example in application scenarios such as image noise reduction and image recognition where a convolutional neural network extracts image features, enlarging the receptive field helps the neural network receive information from a larger range and extract more complex information. One conventional way to enlarge the receptive field is to increase the number of layers of the neural network, since deeper layers receive wider information. Its disadvantage is that network complexity is greatly increased and convergence becomes more difficult. Another common way is to employ expansion (dilated) convolution. The idea of expansion convolution is to enlarge the filter by filling it with zeros at certain intervals; the ratio by which the size is enlarged is the expansion rate. Because zeros are injected into the convolution kernel, the sampling range of the convolution operation is enlarged while the computational complexity is unchanged. For example, refer to the expansion convolution sampling diagrams shown in FIG. 1-1 to FIG. 1-3. In FIG. 1-1 the expansion rate is 1, the convolution kernel keeps its original form, and the sampling points are adjacent to one another. In FIG. 1-2 the expansion rate is 2, zeros are injected between the kernel elements, and the effective sampling points are spaced apart. In FIG. 1-3 the expansion rate is 4 and the spacing of the effective sampling points is 4. The advantage of expansion convolution is that the receptive field is enlarged; the disadvantage is that the convolution regularly skips some pixels, which tends to create holes in the resulting features, so subtle features are missed.
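The sampling patterns of FIG. 1-1 to FIG. 1-3 can be illustrated with a minimal sketch. The function names `dilated_offsets` and `effective_kernel_size` are hypothetical and chosen only for this illustration, and the sketch assumes a square kernel with an odd side length and a single integer expansion rate shared by the whole feature map, as in conventional expansion convolution.

```python
def dilated_offsets(kernel_size, rate):
    """Return the (row, col) offsets, relative to the center pixel, that a
    square kernel samples at the given integer expansion rate."""
    half = kernel_size // 2
    return [(rate * i, rate * j)
            for i in range(-half, half + 1)
            for j in range(-half, half + 1)]


def effective_kernel_size(kernel_size, rate):
    """Side length of the window covered once zeros are injected between
    neighboring kernel elements."""
    return rate * (kernel_size - 1) + 1


# Expansion rates 1, 2 and 4 for a 3*3 kernel, as in FIG. 1-1 to FIG. 1-3.
for rate in (1, 2, 4):
    print(rate, effective_kernel_size(3, rate), dilated_offsets(3, rate)[:3])
```

The number of sampled points stays at 9 for every rate, which is why the amount of computation does not change, while the covered window grows and the pixels between sampling points are skipped.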
In view of the foregoing disadvantages and shortcomings of the prior art, a data processing method is provided in the present exemplary embodiment. Referring to fig. 2, the data processing method described above may include the steps of:
S11, acquiring a feature map to be processed, and inputting the feature map into an expansion rate calculation layer to acquire an expansion rate table corresponding to the feature map; wherein the expansion rate table comprises expansion rate values corresponding to pixels in the feature map;
S12, inputting the expansion rate table into an expansion convolution layer, so that the expansion convolution layer carries out expansion convolution processing on the feature map by combining the expansion rate table to obtain feature data corresponding to the feature map.
In the data processing method provided by the present exemplary embodiment, an expansion rate calculation layer is configured, and an expansion rate table corresponding to a feature map is calculated by the expansion rate calculation layer. When the expansion convolution is computed on the feature map, the expansion rate table can be queried to obtain the expansion rate value corresponding to each pixel, and sampling is performed according to that value. On the one hand, the receptive field is adjusted adaptively and finely for each pixel in the feature map, which avoids omitting features during feature extraction; on the other hand, the sampling density can be increased so that detail features are preserved.
Hereinafter, each step of the data processing method in the present exemplary embodiment will be described in more detail with reference to the accompanying drawings and examples.
In step S11, a feature map to be processed is obtained, and the feature map is input into an expansion rate calculation layer to obtain an expansion rate table corresponding to the feature map; the expansion rate table comprises expansion rate values corresponding to pixels in the feature map.
In this exemplary embodiment, the feature map to be processed may be a feature map obtained by performing feature extraction once on an original image. For example, in tasks such as image noise reduction, image recognition, and image alignment, it is the feature map obtained by extracting features from the input original image.
The data processing method provides an expansion rate calculation layer and an expansion convolution layer. Specifically, after the feature map to be processed is obtained, it can be input into the expansion rate calculation layer, which calculates the expansion rate value corresponding to each pixel in the feature map to obtain the expansion rate table corresponding to the feature map.
Specifically, the expansion rate calculation layer may be a single convolution layer or several convolution layers arranged in succession, so that the feature map undergoes one convolution or several successive convolutions to produce the expansion rate table. In this way the expansion rate is set adaptively for each pixel from the pixel features of that pixel in the feature map. Alternatively, in other exemplary embodiments of the present disclosure, when the expansion rate calculation layer includes a plurality of convolution layers, it may also include a pooling layer and a residual network module; it may further include other basic neural network components, such as an image scaling module for performing dimension reduction on the feature map.
Further, the image size of the expansion rate table may be configured according to the resolution or image size of the feature map. Alternatively, it may be configured by combining the feature map with information about the expansion convolution layer. For example, the image size of the expansion rate table may be configured according to the image size of the feature map; or by combining the image size of the feature map with the number of output channels of the expansion convolution layer; or by combining the image size of the feature map with the filter grouping information of the output channels of the expansion convolution layer.
Specifically, the image size (or resolution) of the expansion rate table may be set to be the same as that of the feature map; for example, if the image size of the input feature map is Width*Height*channel_in, where channel_in is the number of input channels, the image size of the expansion rate table is also Width*Height*channel_in. Alternatively, the image size of the expansion rate table may be set to the product of the image size of the feature map and the number of output channels of the expansion convolution layer; for example, if the image size of the input feature map is Width*Height*channel_in and the number of output channels of the expansion convolution layer is channel_out, that is, the image size of the output feature map is Width*Height*channel_out, then the image size of the expansion rate table is Width*Height*channel_in*channel_out. Alternatively, the image size of the expansion rate table may be set to the product of the image size of the feature map and the number of filter groups of the output channels of the expansion convolution layer; for example, if the image size of the input feature map is Width*Height*channel_in and the filters corresponding to the output channels of the expansion convolution layer are divided into several groups, the number of groups being group, then the image size of the expansion rate table is Width*Height*channel_in*group.
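The three sizing options can be summarized in a short sketch. The function name `rate_table_shape` and the mode strings are hypothetical labels introduced here for illustration; the per-output-channel case follows the Width*Height*channel_in*channel_out sizing used in the worked example later in the description.

```python
def rate_table_shape(width, height, channel_in, mode="shared",
                     channel_out=None, group=None):
    """Return the (width, height, layers) size of the expansion rate table.

    mode is a hypothetical label for the three options in the text:
      "shared"             -> same size as the input feature map
      "per_output_channel" -> one set of layers per output channel
      "grouped"            -> one set of layers per filter group
    """
    if mode == "shared":
        layers = channel_in
    elif mode == "per_output_channel":
        if channel_out is None:
            raise ValueError("channel_out is required for this mode")
        layers = channel_in * channel_out
    elif mode == "grouped":
        if group is None:
            raise ValueError("group is required for this mode")
        layers = channel_in * group
    else:
        raise ValueError("unknown mode: " + mode)
    return (width, height, layers)


print(rate_table_shape(64, 48, 2))
print(rate_table_shape(64, 48, 2, "per_output_channel", channel_out=3))
print(rate_table_shape(64, 48, 2, "grouped", group=3))
```

The shared option is the cheapest to compute, while the per-output-channel option lets every filter choose its own receptive field at every pixel; grouping sits between the two.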
In step S12, the expansion rate table is input into an expansion convolution layer, so that the expansion convolution layer performs expansion convolution processing on the feature map in combination with the expansion rate table, so as to obtain feature data corresponding to the feature map.
In this example embodiment, referring to fig. 3, the step S12 may include:
step S121, querying the expansion rate table to obtain expansion rate values corresponding to each pixel in the feature map.
Specifically, after the expansion rate table corresponding to the feature map is calculated, both the feature map and the expansion rate table may be input into the expansion convolution layer. The expansion convolution layer queries the expansion rate table to obtain the expansion rate value corresponding to each pixel, adjusts the sampling range of the convolution operation according to that value, and then performs the convolution operation to process the features. Sampling is performed at the same interval in the horizontal and vertical directions of the pixel, based on the expansion rate value of that pixel. That is, when the expansion rate value corresponding to a pixel is d, the interval between sampling points in both the horizontal and vertical directions of the expansion convolution operation is d.
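The per-pixel lookup described above can be sketched as follows for integer expansion rate values and a single channel. The names `sample_coords` and `adaptive_conv_at` are hypothetical, and treating sampling points that fall outside the feature map as zero is an assumption made here, since the text does not specify boundary handling.

```python
def sample_coords(rate_table, row, col, kernel_size=3):
    """Look up the expansion rate d of pixel (row, col) and return the
    sampling coordinates, spaced d apart horizontally and vertically."""
    d = rate_table[row][col]
    half = kernel_size // 2
    return [(row + d * i, col + d * j)
            for i in range(-half, half + 1)
            for j in range(-half, half + 1)]


def adaptive_conv_at(feature, rate_table, kernel, row, col):
    """Expansion convolution at one pixel using its own integer rate.
    Samples outside the feature map contribute zero (assumption)."""
    height, width = len(feature), len(feature[0])
    half = len(kernel) // 2
    d = rate_table[row][col]
    total = 0.0
    for i, kernel_row in enumerate(kernel):
        for j, weight in enumerate(kernel_row):
            r = row + d * (i - half)
            c = col + d * (j - half)
            if 0 <= r < height and 0 <= c < width:
                total += weight * feature[r][c]
    return total


# A 5*5 single-channel feature map; only the center pixel uses rate 2.
feature = [[r * 5 + c for c in range(5)] for r in range(5)]
rate_table = [[1] * 5 for _ in range(5)]
rate_table[2][2] = 2
kernel = [[1.0] * 3 for _ in range(3)]
print(sample_coords(rate_table, 2, 2))
print(adaptive_conv_at(feature, rate_table, kernel, 2, 2))
```

Because the rate is read per pixel, neighboring output pixels can use different receptive fields within the same layer.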
In some exemplary embodiments of the present disclosure, the expansion ratio value d in the expansion ratio table may be an integer, or a non-integer (fraction). When the expansion ratio value is an integer, the corresponding expansion ratio value may be directly used.
If the expansion rate value d corresponding to a pixel in the feature map is a non-integer, the coordinates of the sampling points may also be non-integers. Therefore, an interpolation operation unit can further be arranged in the expansion convolution layer. When the expansion rate value is a non-integer, the interpolation operation unit estimates the feature value of each sampling point by interpolating the neighborhood pixels, and the convolution operation of the expansion convolution layer is then performed. Specifically, this may include:
step S21, obtaining the neighborhood pixel values of a sampling point corresponding to the pixel, and calculating the corresponding neighborhood pixel interpolation based on the neighborhood pixel values;
step S22, estimating the feature value of the sampling point according to the neighborhood pixel interpolation, so as to perform expansion convolution based on the feature value of the sampling point.
For example, bilinear interpolation may be used: the weights of the integer-coordinate pixels in a specified neighborhood are calculated according to their distance from the sampling coordinate, and the feature value at the sampling coordinate is estimated from them. Specifically, the formula may be:
$$ y(P_0) = \sum_{P_n \in R} w(P_n) \cdot \sum_{q \in N} G(q, P_0 + d \cdot P_n) \cdot x(q) $$
where x is the input feature map, y is the output feature map, P_0 is a coordinate position on the output feature map y, w is the convolution kernel of the expansion convolution layer when the expansion rate is 1, R is the set of all coordinate positions in w, P_n enumerates the coordinate positions in R, G() is the interpolation weight function, d is the calculated expansion rate, (P_0 + d·P_n) enumerates the sampling point coordinate positions of the expansion convolution layer when the expansion rate is d, N is the set of neighborhood integer coordinate positions of the sampling point, and q enumerates the integer coordinates in N.
In the above formula, G(q, P_0 + d·P_n) calculates an interpolation weight from the spatial distance between q and (P_0 + d·P_n); the smaller the spatial distance, the larger the weight. Optionally, G() uses bilinear interpolation or bicubic interpolation. N represents the neighborhood integer coordinate positions of the sampling point; optionally, N is the set of integer coordinate positions in the 4*4 neighborhood of the sampling point. That is, if the sampling point coordinates are (row, col), and row and col are approximated by the integers row' and col', then N covers the four rows from row (row'-1) to row (row'+2) and the four columns from column (col'-1) to column (col'+2), 16 integer coordinate positions in total. Referring to the embodiment shown in FIG. 4, the expansion rate d may be a fraction, so the coordinates of the sampling points may also be fractions; the feature values of the sampling points are therefore estimated by interpolating the neighborhood pixels, and the convolution operation of the expansion convolution layer is then performed.
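A minimal sketch of this interpolated sampling for one channel and one output pixel is given below. It assumes the bilinear choice for G(), written as a product of per-axis weights max(0, 1 - distance), which is one common form and not necessarily the exact function used in the embodiments. It visits only the 2*2 integer neighbors that can receive a non-zero bilinear weight rather than the optional 4*4 neighborhood, and treats positions outside the feature map as zero. The function names are hypothetical.

```python
import math


def bilinear_weight(q, p):
    """G(q, p): product of per-axis weights max(0, 1 - |q - p|)."""
    return (max(0.0, 1.0 - abs(q[0] - p[0]))
            * max(0.0, 1.0 - abs(q[1] - p[1])))


def interpolate(feature, p):
    """Estimate the feature value at a possibly fractional coordinate p
    from its integer-coordinate neighbors q."""
    height, width = len(feature), len(feature[0])
    r0, c0 = math.floor(p[0]), math.floor(p[1])
    value = 0.0
    for r in (r0, r0 + 1):
        for c in (c0, c0 + 1):
            if 0 <= r < height and 0 <= c < width:
                value += bilinear_weight((r, c), p) * feature[r][c]
    return value


def adaptive_conv_fractional(feature, kernel, p0, d):
    """y(P_0): sum over kernel positions P_n of w(P_n) times the
    interpolated value at the sampling point P_0 + d * P_n."""
    half = len(kernel) // 2
    total = 0.0
    for i, kernel_row in enumerate(kernel):
        for j, weight in enumerate(kernel_row):
            p = (p0[0] + d * (i - half), p0[1] + d * (j - half))
            total += weight * interpolate(feature, p)
    return total


feature = [[r * 10 + c for c in range(7)] for r in range(7)]
kernel = [[1.0] * 3 for _ in range(3)]
print(interpolate(feature, (1.5, 2.5)))
print(adaptive_conv_fractional(feature, kernel, (3, 3), 1.5))
```

When d is an integer every sampling point lands on an integer coordinate, the weight of that pixel is 1 and all others are 0, so the sketch reduces to ordinary expansion convolution.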
For comparison, a conventional convolution layer is typically calculated as:
$$ y(P_0) = \sum_{P_n \in R} w(P_n) \cdot x(P_0 + P_n) $$
where w is the weight of the convolution kernel, P_0 is the center pixel coordinate to be calculated, and P_n enumerates the pixel coordinate positions within the neighborhood. When the convolution kernel size is 3*3, P_n ranges over the 9 pixels within the 3*3 neighborhood of P_0. A conventional expansion convolution is typically calculated as:
$$ y(P_0) = \sum_{P_n \in R} w(P_n) \cdot x(P_0 + d \cdot P_n) $$
Compared with the conventional convolution above, the expansion coefficient d is added. Its physical meaning is that the spacing of the sampling points changes from 1 to d.
Compared with the prior art described above, the embodiments of the present disclosure add the interpolation function G() and the integer coordinate q. The physical meaning is that the pixel value at a sampling point is estimated by interpolation from the pixel values x(q) at the neighborhood integer coordinates q. This allows the expansion rate values in the expansion rate table to be non-integers while sampling still follows the corresponding expansion rate value accurately.
As shown in fig. 3, the step S12 may further include:
and step S122, determining the sampling range of the corresponding pixel according to the expansion rate value so as to carry out expansion convolution processing.
In some exemplary embodiments of the present disclosure, when the expansion rate table has the same image size as the input feature map, a pixel at a given coordinate uses the same expansion rate value for the expansion convolution processing in every output channel of the expansion convolution layer.
For example, when the feature map and the expansion rate table have the same image size Width*Height*channel_in and channel_in=2, the expansion rate table includes two expansion rate sub-tables, one for each of the two channels of the feature map, and the two expansion rate sub-tables are the same. When the number of output channels of the expansion convolution layer is channel_out=3, the pixels of the first channel of the feature map query the expansion rate table to obtain the corresponding expansion rate value, the sampling interval is determined according to that value, sampling is performed, and a first feature value is obtained by convolution calculation. The pixels of the second channel of the feature map query the expansion rate table to obtain the corresponding expansion rate value, the sampling interval is determined according to that value, sampling is performed, and a second feature value is obtained by convolution calculation. The feature value of each pixel of the first output channel of the expansion convolution layer is then determined from the first feature value and the second feature value. The other output channels are computed in the same way, yielding a feature map with 3 output channels from the expansion convolution layer.
In some exemplary embodiments of the present disclosure, different expansion rates may also be calculated for the filter banks corresponding to the respective output channels of the expansion convolution layer. That is, the image size of the expansion rate table is the product of the image size of the feature map and the number of output channels of the expansion convolution layer. For example, if the image size of the input feature map is Width*Height*channel_in and the number of output channels of the expansion convolution layer is channel_out, that is, the image size of the output feature map is Width*Height*channel_out, then the image size of the expansion rate table is Width*Height*channel_in*channel_out. For example, if the image size of the input feature map is Width*Height*channel_in with channel_in=2, and the number of output channels of the expansion convolution layer is channel_out=3, then the image size of the expansion rate table is Width*Height*6. In this case, the pixels of the first channel of the feature map query the expansion rate values in the first layer of the expansion rate table for sampling, and a first feature value is obtained after convolution calculation; the pixels of the second channel query the expansion rate values in the second layer of the expansion rate table for sampling, and a second feature value is obtained after convolution calculation; the feature value of the pixel in the first output channel of the expansion convolution layer is obtained from the first feature value and the second feature value. The pixels of the first channel of the feature map then query the expansion rate values in the third layer of the expansion rate table for sampling, and a third feature value is obtained after convolution calculation; the pixels of the second channel query the expansion rate values in the fourth layer of the expansion rate table for sampling, and a fourth feature value is obtained after convolution calculation; the feature value of the pixel in the second output channel of the expansion convolution layer is obtained from the third feature value and the fourth feature value. Finally, the pixels of the first channel of the feature map query the expansion rate values in the fifth layer of the expansion rate table for sampling, and a fifth feature value is obtained after convolution calculation; the pixels of the second channel query the expansion rate values in the sixth layer of the expansion rate table for sampling, and a sixth feature value is obtained after convolution calculation; the feature value of the pixel in the third output channel of the expansion convolution layer is obtained from the fifth feature value and the sixth feature value. In this way, a feature map with 3 output channels is obtained from the expansion convolution layer.
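The layer-by-layer lookup in this example follows a regular pattern, sketched below with zero-based indices. The function name `rate_layer_index` is hypothetical; the mapping itself is read off the example above, where output channel 1 uses table layers 1 and 2, output channel 2 uses layers 3 and 4, and output channel 3 uses layers 5 and 6.

```python
def rate_layer_index(in_channel, out_channel, channel_in):
    """Zero-based layer of the expansion rate table that input channel
    in_channel consults when computing output channel out_channel."""
    return out_channel * channel_in + in_channel


channel_in, channel_out = 2, 3
for out_channel in range(channel_out):
    layers = [rate_layer_index(c, out_channel, channel_in)
              for c in range(channel_in)]
    # Print one-based numbers to match the wording of the example.
    print("output channel", out_channel + 1,
          "uses table layers", [layer + 1 for layer in layers])
```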
In some exemplary embodiments of the present disclosure, the filters of the output channels of the expansion convolution layer may also be grouped according to a preset rule; the expansion rate table corresponding to each group is queried according to the grouping result, and expansion convolution processing is performed on the feature map according to the expansion rate table corresponding to each group; within each group, pixels at the same coordinates in the respective output channels use the same expansion rate value.
For example, the filter banks corresponding to the output channels of the expansion convolution layer may be divided into several groups, and within each group the same expansion rate is used for pixels at the same coordinates in different output channels. For example, when the image size of the input feature map is Width*Height*channel_in and the number of groups is group, the image size of the expansion rate table is Width*Height*(channel_in*group). In the expansion convolution layer, the filters can be divided into groups according to actual requirements, for example 4, 6 or 8 groups. For example, when channel_in=2, group=3, and the number of output channels of the expansion convolution layer is channel_out=3, the image size of the expansion rate table is Width*Height*6. For example, when channel_out=32 and group=4 for the expansion convolution layer, layers 1 to 4 use the same expansion rate table, layers 5 to 8 use the same expansion rate table, and so on. Dividing the filters into groups reduces the complexity of the network and saves computing resources while preserving the accuracy of the expansion convolution operation.
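As a minimal sketch of the grouping, the following code maps an output channel to the group whose expansion rate table it shares. The preset grouping rule is not specified in the text, so the sketch assumes contiguous blocks of output channels, and it takes the block size as an explicit, hypothetical parameter `channels_per_group`; a value of 4 reproduces the "layers 1 to 4, layers 5 to 8" example.

```python
def group_index(out_channel, channels_per_group):
    """Return the 0-based group of a 0-based output channel.

    Assumption: groups are contiguous blocks of `channels_per_group` channels."""
    return out_channel // channels_per_group


def grouped_table_layer(out_channel, in_channel, channel_in, channels_per_group):
    """Return the 0-based table layer shared by every output channel in the group."""
    return group_index(out_channel, channels_per_group) * channel_in + in_channel


# Output channels 1 to 4 (0-based 0..3) share one table, 5 to 8 share the next.
print([group_index(o, 4) for o in range(8)])
```

Because every output channel in a group resolves to the same table layers, the table only needs one set of layers per group rather than one per output channel.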
In this exemplary embodiment, an image feature extraction model is provided, which includes an expansion rate calculation layer and an expansion convolution layer. The number of input channels of the model is 2, and the number of output channels is 3. The expansion rate calculation layer comprises 1 convolution layer with a filter size of 3*3 and 2 output channels. This layer performs a convolution operation on the input feature map and outputs an expansion rate table with the same image size. The expansion convolution layer first looks up the expansion rate table to determine the expansion rate value corresponding to each pixel, expands the sampling range accordingly, and then performs the calculation.
For example, for the pixel at row 100, column 100 of the first channel of the input feature map, the same coordinates are looked up in the expansion rate table, and the corresponding expansion rate is found to be 1.5. If the filter size of the expansion convolution layer is 3*3, the normal sampling coordinates of this pixel are as follows, where the first number in each pair of parentheses is the row coordinate and the second number is the column coordinate:
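The following sketch illustrates how a single 3*3 convolution with 2 input channels and 2 output channels can produce an expansion rate table of the same image size as the feature map. It is illustrative only: the zero padding at the borders, the weights, the bias values, and the function name `conv3x3_same` are all assumptions, since the text does not specify them and the real weights would be learned.

```python
def conv3x3_same(feature, weights, bias):
    """Apply a 3*3 convolution with zero padding (assumed).

    feature: [channel_in][height][width]
    weights: [channel_out][channel_in][3][3]
    Returns a table of shape [channel_out][height][width]."""
    c_in, h, w = len(feature), len(feature[0]), len(feature[0][0])
    out = []
    for o, w_o in enumerate(weights):
        plane = [[0.0] * w for _ in range(h)]
        for r in range(h):
            for c in range(w):
                acc = bias[o]
                for i in range(c_in):
                    for dr in (-1, 0, 1):
                        for dc in (-1, 0, 1):
                            rr, cc = r + dr, c + dc
                            if 0 <= rr < h and 0 <= cc < w:
                                acc += w_o[i][dr + 1][dc + 1] * feature[i][rr][cc]
                plane[r][c] = acc
        out.append(plane)
    return out


# Hypothetical input: 2 channels, 4 rows, 5 columns, all ones.
feature = [[[1.0] * 5 for _ in range(4)] for _ in range(2)]
# Hypothetical weights: only the centre tap of the matching channel is non-zero.
weights = [[[[0.0] * 3 for _ in range(3)] for _ in range(2)] for _ in range(2)]
for o in range(2):
    weights[o][o][1][1] = 0.5
bias = [1.0, 1.0]

rate_table = conv3x3_same(feature, weights, bias)
print(len(rate_table), len(rate_table[0]), len(rate_table[0][0]))
print(rate_table[0][2][2])
```

The resulting table has one expansion rate value per pixel and per channel of the feature map, so it can be looked up at the same coordinates as the pixel being processed.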
(99,99) (99,100) (99,101)
(100,99) (100,100) (100,101)
(101,99) (101,100) (101,101)
After the expansion rate is set to 1.5, the sampling coordinates become:
(98.5,98.5) (98.5,100) (98.5,101.5)
(100,98.5) (100,100) (100,101.5)
(101.5,98.5) (101.5,100) (101.5,101.5)
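The two coordinate grids above can be reproduced by scaling the filter offsets by the expansion rate. The sketch below is illustrative; the function name `sampling_coords` is hypothetical.

```python
def sampling_coords(row, col, rate, kernel_size=3):
    """Return the kernel_size*kernel_size grid of (row, column) sampling
    coordinates around (row, col) for the given expansion rate."""
    half = kernel_size // 2
    offsets = range(-half, half + 1)
    return [[(row + dr * rate, col + dc * rate) for dc in offsets]
            for dr in offsets]


for line in sampling_coords(100, 100, 1.5):
    print(line)
```

An expansion rate of 1 yields the normal 3*3 neighborhood, while a rate of 1.5 moves the outer samples to rows and columns 98.5 and 101.5.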
Because the sampling coordinates are non-integer, a bilinear interpolation method is adopted: using the integer-coordinate pixels in the 4*4 neighborhood, the weight of each pixel is calculated according to its distance, and the feature value at the sampling coordinates is estimated. Taking the coordinate point (98.5,101.5) as an example, the integer coordinates of the reference pixels run from row 97 to row 100 and from column 100 to column 103, specifically:
(97,100) (97,101) (97,102) (97,103)
(98,100) (98,101) (98,102) (98,103)
(99,100) (99,101) (99,102) (99,103)
(100,100) (100,101) (100,102) (100,103)
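The reference pixels listed above can be enumerated as follows. The rule used here (start one pixel before the floor of each coordinate and take four pixels) is inferred from the single example in the text and is an assumption; the function name is hypothetical.

```python
import math


def neighborhood_4x4(row, col):
    """Return the 4*4 grid of integer (row, column) coordinates around a
    fractional sampling point, starting at floor(coordinate) - 1 (assumed)."""
    r0 = math.floor(row) - 1
    c0 = math.floor(col) - 1
    return [[(r, c) for c in range(c0, c0 + 4)] for r in range(r0, r0 + 4)]


for line in neighborhood_4x4(98.5, 101.5):
    print(line)
```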
Once the feature values at the sampling points have been estimated, the result can be calculated by convolution.
The data processing method provided by the embodiments of the present disclosure can be applied to deep learning neural networks and used in fields such as image noise reduction, image recognition, image registration, semantic segmentation, face recognition, and image reconstruction. By using the expansion rate calculation layer to configure a corresponding expansion rate table for the feature map, the receptive field can be adjusted adaptively and finely at the pixel level in the expansion convolution layer according to the image features. The amount of calculation is also relatively small, which helps improve performance in fields such as semantic segmentation and image reconstruction. Taking image noise reduction as an example, for block noise caused by image compression, the receptive field needs to be expanded as much as possible so as to reduce the block noise; for texture regions rich in detail, sampling needs to be as dense as possible so as to preserve the details. These two tasks conflict with each other and are difficult to balance in practice: either large blocks of noise remain, or high-frequency details are smeared out. The present scheme calculates the expansion rate at the pixel level according to the image features and adaptively adjusts the receptive field, which helps to solve these problems.
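A minimal sketch of this final step is given below. Note the simplification: the text describes distance-based weights over a 4*4 neighborhood, whereas the sketch uses the textbook form of bilinear interpolation over the four nearest integer pixels as a stand-in. The function names, the test image, and the all-ones filter are hypothetical, and all sampling points are assumed to lie inside the image.

```python
import math


def bilinear(plane, row, col):
    """Estimate the value at a fractional (row, col) from the four nearest
    integer pixels (textbook bilinear interpolation, used as a stand-in)."""
    r0, c0 = math.floor(row), math.floor(col)
    fr, fc = row - r0, col - c0
    r1 = min(r0 + 1, len(plane) - 1)
    c1 = min(c0 + 1, len(plane[0]) - 1)
    top = (1 - fc) * plane[r0][c0] + fc * plane[r0][c1]
    bottom = (1 - fc) * plane[r1][c0] + fc * plane[r1][c1]
    return (1 - fr) * top + fr * bottom


def dilated_response(plane, row, col, rate, kernel):
    """Weighted sum of the 3*3 samples taken around (row, col) at `rate`."""
    total = 0.0
    for i, dr in enumerate((-1, 0, 1)):
        for j, dc in enumerate((-1, 0, 1)):
            total += kernel[i][j] * bilinear(plane, row + dr * rate, col + dc * rate)
    return total


# Hypothetical 8*8 single-channel image whose value is row + column.
plane = [[float(r + c) for c in range(8)] for r in range(8)]
kernel = [[1.0] * 3 for _ in range(3)]  # hypothetical filter weights

print(bilinear(plane, 1.5, 4.5))
print(dilated_response(plane, 3, 3, 1.5, kernel))
```

Because the test image is linear in the coordinates, the interpolated value at (1.5, 4.5) equals 6.0, which makes the sketch easy to check by hand.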
It is noted that the above-described figures are only schematic illustrations of processes involved in a method according to an exemplary embodiment of the invention, and are not intended to be limiting. It will be readily appreciated that the processes shown in the above figures do not indicate or limit the temporal order of these processes. In addition, it is also readily understood that these processes may be performed synchronously or asynchronously, for example, among a plurality of modules.
Further, referring to fig. 5, in the embodiment of the present example, there is further provided a data processing apparatus 50, including: an adaptive expansion rate table calculation module 501 and an expansion convolution processing module 502.
The adaptive expansion rate table calculation module 501 may be configured to obtain a feature map to be processed, and input the feature map to an expansion rate calculation layer to obtain an expansion rate table corresponding to the feature map; the expansion rate table comprises expansion rate values corresponding to the pixels in the feature map.
The expansion convolution processing module 502 may be configured to input the expansion rate table into an expansion convolution layer, so that the expansion convolution layer performs expansion convolution processing on the feature map in combination with the expansion rate table to obtain feature data corresponding to the feature map.
In one example of the present disclosure, the expansion rate calculation layer includes at least one convolution layer; the adaptive expansion rate table calculation module 501 may be configured to perform at least one convolution process on the feature map to obtain the expansion rate table.
In one example of the present disclosure, the apparatus may further include an expansion rate table image size configuration module (not shown in the figure).
The expansion rate table image size configuration module may be configured to configure the image size of the expansion rate table according to the image size of the feature map; or to configure the image size of the expansion rate table by combining the image size of the feature map with the number of output channels of the expansion convolution layer; or to configure the image size of the expansion rate table by combining the image size of the feature map with the filter grouping information of the output channels of the expansion convolution layer.
In one example of the present disclosure, the image size of the expansion rate table is the same as the image size of the feature map; or the image size of the expansion rate table is the product of the image size of the feature map and the number of output channels of the expansion convolution layer; or the image size of the expansion rate table is the product of the image size of the feature map and the number of filter groups of the output channels of the expansion convolution layer.
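The three alternatives can be summarized in a short sketch that computes the image size of the expansion rate table. The function name and the `mode` strings are hypothetical; the sizes follow the Width*Height*channel_in convention used in the method description.

```python
def rate_table_shape(width, height, channel_in, mode="shared",
                     channel_out=1, groups=1):
    """Return (width, height, layers) of the expansion rate table.

    mode "shared": same size as the feature map.
    mode "per_output_channel": multiplied by the number of output channels.
    mode "grouped": multiplied by the number of filter groups."""
    if mode == "shared":
        layers = channel_in
    elif mode == "per_output_channel":
        layers = channel_in * channel_out
    elif mode == "grouped":
        layers = channel_in * groups
    else:
        raise ValueError("unknown mode: " + mode)
    return (width, height, layers)


print(rate_table_shape(640, 480, 2))
print(rate_table_shape(640, 480, 2, "per_output_channel", channel_out=3))
print(rate_table_shape(640, 480, 2, "grouped", groups=3))
```

With channel_in=2, the per-output-channel case with channel_out=3 and the grouped case with 3 groups both give 6 layers, as in the examples of the method description.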
In one example of the present disclosure, the expansion convolution processing module 502 may include a query unit and a sampling unit (not shown in the figure).
The query unit may be configured to query the expansion rate table to obtain the expansion rate value corresponding to each pixel in the feature map.
The sampling unit may be configured to determine the sampling range of the corresponding pixel according to the expansion rate value, so as to perform expansion convolution processing.
In one example of the present disclosure, the query unit may include a first query unit (not shown in the figure).
The first query unit may be configured to perform expansion convolution processing on each output channel of the expansion convolution layer using the same expansion rate value for pixels at the same coordinates in each output channel; in this case the image size of the expansion rate table is the same as the image size of the feature map.
In one example of the present disclosure, the query unit may include a second query unit (not shown in the figure).
The second query unit may be configured to query the expansion rate table corresponding to each output channel of the expansion convolution layer, so that expansion convolution processing is performed on the feature map using the expansion rate table corresponding to each output channel; in this case the image size of the expansion rate table is the product of the image size of the feature map and the number of output channels of the expansion convolution layer.
In one example of the present disclosure, the query unit may include a third query unit (not shown in the figure).
The third query unit may be configured to group the filters of the output channels of the expansion convolution layer according to a preset rule, query the expansion rate table corresponding to each group according to the grouping result, and perform expansion convolution processing on the feature map according to the expansion rate table corresponding to each group; within each group, pixels at the same coordinates in the respective output channels use the same expansion rate value.
In one example of the present disclosure, the sampling unit may be configured to sample at the same interval in the horizontal direction and the vertical direction of the pixel, based on the expansion rate value corresponding to the pixel.
In one example of the present disclosure, the expansion rate value is an integer, or a non-integer.
In one example of the present disclosure, the expansion convolution processing module 502 may further include a neighborhood interpolation calculation unit (not shown in the figure).
The neighborhood interpolation calculation unit may be configured to obtain a neighborhood pixel value of a sampling point corresponding to the pixel when the expansion rate value is a non-integer, and calculate a corresponding neighborhood pixel interpolation based on the neighborhood pixel value; and estimating the characteristic value of the sampling point according to the neighborhood pixel interpolation so as to carry out expansion convolution based on the characteristic value of the sampling point.
The specific details of each module in the above data processing apparatus have been described in detail in the corresponding data processing method, so that the details are not repeated here.
It should be noted that although in the above detailed description several modules or units of a device for action execution are mentioned, such a division is not mandatory. Indeed, the features and functionality of two or more modules or units described above may be embodied in one module or unit in accordance with embodiments of the present disclosure. Conversely, the features and functions of one module or unit described above may be further divided into a plurality of modules or units to be embodied.
Fig. 6 shows a schematic diagram of an electronic device suitable for use in implementing embodiments of the present invention.
It should be noted that the electronic device 600 shown in fig. 6 is only an example, and should not impose any limitation on the functions and the application scope of the embodiments of the present disclosure.
As shown in fig. 6, the electronic device 600 may specifically include: processor 610, internal memory 621, external memory interface 622, universal serial bus (Universal Serial Bus, USB) interface 630, charge management module 640, power management module 641, battery 642, antenna 1, antenna 2, mobile communication module 650, wireless communication module 660, audio module 670, speaker 671, receiver 672, microphone 673, earphone interface 674, sensor module 680, display 690, camera module 691, indicator 692, motor 693, keys 694, and subscriber identification module (Subscriber Identification Module, SIM) card interface 695, among others. The sensor module 680 may include a depth sensor 6801, a pressure sensor 6802, a gyroscope sensor 6803, a barometric sensor 6804, a magnetic sensor 6805, an acceleration sensor 6806, a distance sensor 6807, a proximity light sensor 6808, a fingerprint sensor 6809, a temperature sensor 6810, a touch sensor 6811, an ambient light sensor 6812, and a bone conduction sensor 6813, among others.
It should be understood that the illustrated structure of the embodiment of the present application does not constitute a specific limitation on the electronic device 600. In other embodiments of the application, electronic device 600 may include more or fewer components than shown, or certain components may be combined, or certain components may be split, or different arrangements of components. The illustrated components may be implemented in hardware, software, or a combination of software and hardware.
The processor 610 may include one or more processing units. For example, the processor 610 may include an application processor (Application Processor, AP), a modem processor, a graphics processor (Graphics Processing Unit, GPU), an image signal processor (Image Signal Processor, ISP), a controller, a video codec, a digital signal processor (Digital Signal Processor, DSP), a baseband processor and/or a neural network processor (Neural-network Processing Unit, NPU), and the like. The different processing units may be separate devices or may be integrated in one or more processors.
The controller can generate operation control signals according to the instruction operation codes and the time sequence signals to finish the control of instruction fetching and instruction execution.
A memory may also be provided in the processor 610 for storing instructions and data. The memory may store instructions for implementing six modular functions: detection instructions, connection instructions, information management instructions, analysis instructions, data transfer instructions, and notification instructions, whose execution is controlled by the processor 610. In some embodiments, the memory in the processor 610 is a cache memory. The memory may hold instructions or data that the processor 610 has just used or uses repeatedly. If the processor 610 needs to use the instructions or data again, it can call them directly from the memory. This avoids repeated accesses and reduces the waiting time of the processor 610, thereby improving the efficiency of the system.
In some embodiments, the processor 610 may include one or more interfaces. The interfaces may include an inter-integrated circuit (Inter-Integrated Circuit, I2C) interface, an inter-integrated circuit sound (Inter-Integrated Circuit Sound, I2S) interface, a pulse code modulation (Pulse Code Modulation, PCM) interface, a universal asynchronous receiver/transmitter (Universal Asynchronous Receiver/Transmitter, UART) interface, a mobile industry processor interface (Mobile Industry Processor Interface, MIPI), a general-purpose input/output (General-Purpose Input/Output, GPIO) interface, a subscriber identity module (Subscriber Identity Module, SIM) interface, and/or a universal serial bus (Universal Serial Bus, USB) interface, among others.
The I2C interface is a bi-directional synchronous serial bus comprising a serial data line (Serial Data Line, SDA) and a serial clock line (Serial Clock Line, SCL). In some embodiments, the processor 610 may contain multiple sets of I2C buses. The processor 610 may be coupled to the touch sensor 6811, the charger, the flash, the camera module 691, etc. through different I2C bus interfaces, respectively. For example, the processor 610 may be coupled to the touch sensor 6811 through an I2C interface, so that the processor 610 communicates with the touch sensor 6811 through the I2C bus interface to implement the touch function of the electronic device 600.
The I2S interface may be used for audio communication. In some embodiments, the processor 610 may contain multiple sets of I2S buses. The processor 610 may be coupled to the audio module 670 via an I2S bus to enable communication between the processor 610 and the audio module 670. In some embodiments, the audio module 670 may communicate audio signals to the wireless communication module 660 via the I2S interface to enable phone answering via a bluetooth headset.
PCM interfaces may also be used for audio communication to sample, quantize and encode analog signals. In some embodiments, the audio module 670 and the wireless communication module 660 may be coupled by a PCM bus interface. In some embodiments, the audio module 670 may also transmit audio signals to the wireless communication module 660 via the PCM interface to enable phone answering via the bluetooth headset. Both the I2S interface and the PCM interface may be used for audio communication.
The UART interface is a universal serial data bus for asynchronous communications. The bus may be a bi-directional communication bus. It converts the data to be transmitted between serial communication and parallel communication. In some embodiments, a UART interface is typically used to connect the processor 610 with the wireless communication module 660. For example: the processor 610 communicates with a bluetooth module in the wireless communication module 660 through a UART interface to implement a bluetooth function. In some embodiments, the audio module 670 may transmit audio signals to the wireless communication module 660 through a UART interface to implement a function of playing music through a bluetooth headset.
The MIPI interface may be used to connect the processor 610 to peripheral devices such as the display 690, camera module 691, etc. The MIPI interface includes a camera serial interface (Camera Serial Interface, CSI), a display serial interface (Display Serial Interface, DSI), and the like. In some embodiments, the processor 610 and the camera module 691 communicate through a CSI interface to implement the shooting function of the electronic device 600. The processor 610 and the display 690 communicate via a DSI interface to implement the display functions of the electronic device 600.
The GPIO interface may be configured by software. The GPIO interface may be configured as a control signal or as a data signal. In some embodiments, a GPIO interface may be used to connect the processor 610 with the camera module 691, the display 690, the wireless communication module 660, the audio module 670, the sensor module 680, and the like. The GPIO interface may also be configured as an I2C interface, an I2S interface, a UART interface, an MIPI interface, etc.
The USB interface 630 is an interface conforming to the USB standard specification, and may specifically be a Mini USB interface, a Micro USB interface, a USB Type-C interface, or the like. The USB interface 630 may be used to connect a charger to charge the electronic device 600, or to transfer data between the electronic device 600 and a peripheral device. It may also be used to connect a headset and play audio through the headset. The interface may also be used to connect other electronic devices, such as AR devices.
It should be understood that the interfacing relationship between the modules illustrated in the embodiments of the present application is only illustrative, and is not meant to limit the structure of the electronic device 600. In other embodiments of the present application, the electronic device 600 may also use different interfacing manners, or a combination of multiple interfacing manners, as in the above embodiments.
The charge management module 640 is used to receive a charge input from a charger. The charger can be a wireless charger or a wired charger. In some wired charging embodiments, the charge management module 640 may receive a charging input of a wired charger through the USB interface 630. In some wireless charging embodiments, the charge management module 640 may receive wireless charging input through a wireless charging coil of the electronic device 600. The charging management module 640 may also provide power to the electronic device through the power management module 641 while charging the battery 642.
The power management module 641 is used for connecting the battery 642, the charge management module 640 and the processor 610. The power management module 641 receives input from the battery 642 and/or the charge management module 640 to power the processor 610, the internal memory 621, the display 690, the camera module 691, the wireless communication module 660, and the like. The power management module 641 may also be configured to monitor battery capacity, battery cycle times, battery health (leakage, impedance), and other parameters. In other embodiments, the power management module 641 may also be disposed in the processor 610. In other embodiments, the power management module 641 and the charge management module 640 may be disposed in the same device.
The wireless communication function of the electronic device 600 may be implemented by the antenna 1, the antenna 2, the mobile communication module 650, the wireless communication module 660, a modem processor, a baseband processor, and the like.
The antennas 1 and 2 are used for transmitting and receiving electromagnetic wave signals. Each antenna in the electronic device 600 may be used to cover a single or multiple communication bands. Different antennas may also be multiplexed to improve the utilization of the antennas. For example: the antenna 1 may be multiplexed into a diversity antenna of a wireless local area network. In other embodiments, the antenna may be used in conjunction with a tuning switch.
The mobile communication module 650 may provide a solution for wireless communication, including 2G/3G/4G/5G, as applied to the electronic device 600. The mobile communication module 650 may include at least one filter, switch, power amplifier, low noise amplifier (Low Noise Amplifier, LNA), or the like. The mobile communication module 650 may receive electromagnetic waves from the antenna 1, perform processes such as filtering and amplifying the received electromagnetic waves, and transmit the electromagnetic waves to the modem processor for demodulation. The mobile communication module 650 may amplify the signal modulated by the modem processor, and convert the signal into electromagnetic waves through the antenna 1 to radiate the electromagnetic waves. In some embodiments, at least some of the functional modules of the mobile communication module 650 may be disposed in the processor 610. In some embodiments, at least some of the functional modules of the mobile communication module 650 may be disposed in the same device as at least some of the modules of the processor 610.
The modem processor may include a modulator and a demodulator. The modulator is used for modulating the low-frequency baseband signal to be transmitted into a medium-high frequency signal. The demodulator is used for demodulating the received electromagnetic wave signal into a low-frequency baseband signal. The demodulator then transmits the demodulated low frequency baseband signal to the baseband processor for processing. The low frequency baseband signal is processed by the baseband processor and then transferred to the application processor. The application processor outputs sound signals through an audio device (not limited to the speaker 671, the receiver 672, etc.), or displays images or videos through the display 690. In some embodiments, the modem processor may be a stand-alone device. In other embodiments, the modem processor may be provided in the same device as the mobile communication module 650 or other functional module, independent of the processor 610.
The wireless communication module 660 may provide solutions for wireless communication applied to the electronic device 600, including wireless local area network (Wireless Local Area Networks, WLAN) (e.g., wireless fidelity (Wireless Fidelity, Wi-Fi) network), Bluetooth (BT), global navigation satellite system (Global Navigation Satellite System, GNSS), frequency modulation (Frequency Modulation, FM), near field communication (Near Field Communication, NFC), infrared (IR), etc. The wireless communication module 660 may be one or more devices that integrate at least one communication processing module. The wireless communication module 660 receives electromagnetic waves via the antenna 2, performs frequency modulation and filtering on the electromagnetic wave signals, and transmits the processed signals to the processor 610. The wireless communication module 660 may also receive signals to be transmitted from the processor 610, frequency-modulate and amplify them, and convert them to electromagnetic waves for radiation via the antenna 2.
In some embodiments, antenna 1 and mobile communication module 650 of electronic device 600 are coupled, and antenna 2 and wireless communication module 660 are coupled, such that electronic device 600 may communicate with a network and other devices via wireless communication techniques. The wireless communication techniques may include the global system for mobile communications (Global System for Mobile Communications, GSM), general packet radio service (General Packet Radio Service, GPRS), code division multiple access (Code Division Multiple Access, CDMA), wideband code division multiple access (Wideband Code Division Multiple Access, WCDMA), time-division code division multiple access (Time-Division Code Division Multiple Access, TD-SCDMA), long term evolution (Long Term Evolution, LTE), BT, GNSS, WLAN, NFC, FM, and/or IR techniques, among others. The GNSS may include a global positioning system (Global Positioning System, GPS), a global navigation satellite system (Global Navigation Satellite System, GLONASS), a BeiDou navigation satellite system (BeiDou Navigation Satellite System, BDS), a quasi-zenith satellite system (Quasi-Zenith Satellite System, QZSS) and/or a satellite based augmentation system (Satellite Based Augmentation Systems, SBAS).
The electronic device 600 implements display functions through a GPU, a display 690, an application processor, and the like. The GPU is a microprocessor for image processing, and is connected to the display 690 and an application processor. The GPU is used to perform mathematical and geometric calculations for graphics rendering. Processor 610 may include one or more GPUs that execute program instructions to generate or change display information.
The display 690 is used to display images, videos, and the like. The display 690 includes a display panel. The display panel may employ a liquid crystal display (Liquid Crystal Display, LCD), an organic light-emitting diode (Organic Light-Emitting Diode, OLED), an active-matrix organic light-emitting diode (Active-Matrix Organic Light-Emitting Diode, AMOLED), a flexible light-emitting diode (Flexible Light-Emitting Diode, FLED), a Mini LED, a Micro LED, a Micro OLED, quantum dot light-emitting diodes (Quantum Dot Light-Emitting Diodes, QLED), or the like. In some embodiments, the electronic device 600 may include 1 or N displays 690, N being a positive integer greater than 1.
The electronic device 600 may implement a photographing function through an ISP, a camera module 691, a video codec, a GPU, a display 690, an application processor, and the like.
The ISP is used to process the data fed back by the camera module 691. For example, when photographing, the shutter is opened, light is transmitted to the camera photosensitive element through the lens, the optical signal is converted into an electric signal, and the camera photosensitive element transmits the electric signal to the ISP for processing and is converted into an image visible to naked eyes. ISP can also optimize the noise, brightness and skin color of the image. The ISP can also optimize parameters such as exposure, color temperature and the like of a shooting scene. In some embodiments, an ISP may be provided in camera module 691.
The camera module 691 is used to capture still images or video. The object generates an optical image through the lens and projects the optical image onto the photosensitive element. The photosensitive element may be a charge coupled device (Charge Coupled Device, CCD) or a Complementary Metal Oxide Semiconductor (CMOS) phototransistor. The photosensitive element converts the optical signal into an electrical signal, which is then transferred to the ISP to be converted into a digital image signal. The ISP outputs the digital image signal to the DSP for processing. The DSP converts the digital image signal into an image signal in a standard RGB, YUV, or the like format. In some embodiments, the electronic device 600 may include 1 or N camera modules 691, where N is a positive integer greater than 1, and if the electronic device 600 includes N cameras, one of the N cameras is a master camera.
The digital signal processor is used for processing digital signals, and can process other digital signals besides digital image signals. For example, when the electronic device 600 selects a frequency bin, the digital signal processor is used to perform a Fourier transform on the frequency bin energy, and the like.
Video codecs are used to compress or decompress digital video. The electronic device 600 may support one or more video codecs. In this way, the electronic device 600 may play or record video in a variety of encoding formats, such as: moving picture experts group (Moving Picture Experts Group, MPEG) 1, MPEG2, MPEG3, MPEG4, etc.
The NPU is a neural-network (Neural-Network, NN) computing processor. By drawing on the structure of biological neural networks, for example the way signals are transmitted between neurons of the human brain, it processes input information rapidly and can also learn continuously. Applications such as intelligent awareness of the electronic device 600 may be implemented through the NPU, for example: image recognition, face recognition, speech recognition, text understanding, etc.
The external memory interface 622 may be used to connect an external memory card, such as a Micro SD card, to enable expansion of the memory capabilities of the electronic device 600. The external memory card communicates with the processor 610 through an external memory interface 622 to implement data storage functions. For example, files such as music, video, etc. are stored in an external memory card.
The internal memory 621 may be used to store computer-executable program code that includes instructions. The internal memory 621 may include a storage program area and a storage data area. The storage program area may store an application program (such as a sound playing function, an image playing function, etc.) required for at least one function of the operating system, etc. The storage data area may store data created during use of the electronic device 600 (e.g., audio data, phonebook, etc.), and so forth. In addition, the internal memory 621 may include a high-speed random access memory, and may also include a nonvolatile memory such as at least one magnetic disk storage device, a flash memory device, a universal flash memory (Universal Flash Storage, UFS), and the like. The processor 610 performs various functional applications of the electronic device 600 and data processing by executing instructions stored in the internal memory 621 and/or instructions stored in a memory provided in the processor.
The electronic device 600 may implement audio functions, such as music playing and recording, through an audio module 670, a speaker 671, a receiver 672, a microphone 673, an earphone interface 674, an application processor, and the like.
The audio module 670 is used to convert digital audio information to an analog audio signal output and also to convert an analog audio input to a digital audio signal. The audio module 670 may also be used to encode and decode audio signals. In some embodiments, the audio module 670 may be disposed in the processor 610, or some of the functional modules of the audio module 670 may be disposed in the processor 610.
The speaker 671, also referred to as a "loudspeaker", is used to convert audio electrical signals into sound signals. The user may listen to music or to a hands-free call through the speaker 671 of the electronic device 600.
The receiver 672, also called an "earpiece", is used to convert audio electrical signals into sound signals. When the electronic device 600 is used to answer a telephone call or listen to a voice message, the voice can be heard by placing the receiver 672 close to the ear.
The microphone 673, also referred to as a "mike" or "mic", is used to convert sound signals into electrical signals. When making a call or sending voice information, the user may speak with the mouth close to the microphone 673 to input a sound signal to the microphone 673. The electronic device 600 may be provided with at least one microphone 673. In other embodiments, the electronic device 600 may be provided with two microphones 673, which implement a noise reduction function in addition to collecting sound signals. In still other embodiments, the electronic device 600 may be provided with three, four, or more microphones 673 to collect sound signals, reduce noise, identify sound sources, implement directional recording, etc.
The earphone interface 674 is used to connect a wired earphone. The earphone interface 674 may be the USB interface 630, a 3.5 mm Open Mobile Terminal Platform (OMTP) standard interface, or a Cellular Telecommunications Industry Association of the USA (CTIA) standard interface.
The depth sensor 6801 is used to acquire depth information of a scene. In some embodiments, a depth sensor may be provided at the camera module 691.
The pressure sensor 6802 is configured to sense a pressure signal and convert the pressure signal into an electrical signal. In some embodiments, the pressure sensor 6802 may be provided on the display 690. The pressure sensor 6802 may be of various types, such as a resistive pressure sensor, an inductive pressure sensor, a capacitive pressure sensor, and the like. A capacitive pressure sensor may comprise at least two parallel plates of conductive material. When a force is applied to the pressure sensor 6802, the capacitance between the electrodes changes. The electronic device 600 determines the strength of the pressure from the change in capacitance. When a touch operation is applied to the display 690, the electronic device 600 detects the intensity of the touch operation by means of the pressure sensor 6802. The electronic device 600 may also calculate the location of the touch based on the detection signal of the pressure sensor 6802. In some embodiments, touch operations that act on the same touch location but with different touch operation intensities may correspond to different operation instructions. For example, when a touch operation whose intensity is smaller than a first pressure threshold acts on the short message application icon, an instruction for viewing the short message is executed. When a touch operation whose intensity is greater than or equal to the first pressure threshold acts on the short message application icon, an instruction for creating a new short message is executed.
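The threshold rule in the short message example can be sketched as follows. This is a minimal illustration only: the function name `select_sms_instruction`, the instruction labels, and the numeric threshold value are hypothetical and are not given in the source, which specifies only that intensities below the first pressure threshold map to viewing and intensities greater than or equal to it map to creating a new message.

```python
# Hypothetical sketch: the threshold value and the instruction labels are
# assumptions for illustration; the source names no concrete values.
FIRST_PRESSURE_THRESHOLD = 0.5  # assumed normalized touch intensity


def select_sms_instruction(touch_intensity: float,
                           threshold: float = FIRST_PRESSURE_THRESHOLD) -> str:
    """Map the touch intensity on the short message icon to an instruction."""
    if touch_intensity < threshold:
        # Intensity smaller than the first pressure threshold: view.
        return "view_short_message"
    # Intensity greater than or equal to the first pressure threshold: create.
    return "create_short_message"
```

The comparison is deliberately asymmetric, so that an intensity exactly equal to the threshold selects the "create" instruction, as in the description.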
The gyro sensor 6803 may be used to determine the motion posture of the electronic device 600. In some embodiments, the angular velocity of the electronic device 600 about three axes (i.e., the x, y, and z axes) may be determined by the gyro sensor 6803. The gyro sensor 6803 may be used for anti-shake during photographing. For example, when the shutter is pressed, the gyro sensor 6803 detects the shake angle of the electronic device 600, calculates from that angle the distance to be compensated by the lens module, and causes the lens to counteract the shake of the electronic device 600 through a reverse motion, thereby achieving anti-shake. The gyro sensor 6803 may also be used in navigation and motion-sensing game scenarios.
The air pressure sensor 6804 is used to measure air pressure. In some embodiments, the electronic device 600 calculates altitude from the air pressure value measured by the air pressure sensor 6804 to assist positioning and navigation.
The magnetic sensor 6805 includes a Hall sensor. The electronic device 600 may use the magnetic sensor 6805 to detect the opening and closing of a flip leather case. In some embodiments, when the electronic device 600 is a flip phone, the electronic device 600 may detect the opening and closing of the flip cover using the magnetic sensor 6805. Features such as automatic unlocking upon flip opening are then set according to the detected open or closed state of the leather case or of the flip cover.
The acceleration sensor 6806 may detect the magnitude of acceleration of the electronic device 600 in various directions (typically along three axes). The magnitude and direction of gravity may be detected when the electronic device 600 is stationary. The acceleration sensor 6806 may also be used to recognize the posture of the electronic device, and is applied to landscape/portrait switching, pedometers, and other applications.
The distance sensor 6807 is used to measure distance. The electronic device 600 may measure distance by infrared or laser. In some embodiments, the electronic device 600 may use the distance sensor 6807 for ranging to achieve fast focusing.
The proximity light sensor 6808 may include, for example, a light emitting diode (LED) and a light detector, such as a photodiode. The light emitting diode may be an infrared light emitting diode. The electronic device 600 emits infrared light outward through the light emitting diode. The electronic device 600 detects infrared light reflected from nearby objects using the photodiode. When sufficient reflected light is detected, it may be determined that an object is in the vicinity of the electronic device 600. When insufficient reflected light is detected, the electronic device 600 may determine that there is no object in its vicinity. The electronic device 600 may use the proximity light sensor 6808 to detect that the user is holding the electronic device 600 close to the ear, so as to automatically turn off the screen to save power. The proximity light sensor 6808 may also be used in leather case mode and pocket mode to automatically unlock and lock the screen.
The fingerprint sensor 6809 is used to collect a fingerprint. The electronic device 600 may use the collected fingerprint characteristics to implement fingerprint unlocking, application lock access, fingerprint photographing, fingerprint-based call answering, etc.
The temperature sensor 6810 is used to detect temperature. In some embodiments, the electronic device 600 executes a temperature processing strategy using the temperature detected by the temperature sensor 6810. For example, when the temperature reported by the temperature sensor 6810 exceeds a threshold, the electronic device 600 reduces the performance of a processor located near the temperature sensor 6810 in order to reduce power consumption and implement thermal protection. In other embodiments, when the temperature is below another threshold, the electronic device 600 heats the battery 642 to prevent the low temperature from causing an abnormal shutdown of the electronic device 600. In still other embodiments, when the temperature is below yet another threshold, the electronic device 600 boosts the output voltage of the battery 642 to avoid an abnormal shutdown caused by the low temperature.
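The temperature processing strategy can be illustrated with a small sketch. The source describes the three reactions as separate embodiments and gives no threshold values; combining them in one function, the function name `temperature_actions`, the action labels, and the three Celsius thresholds below are all assumptions made purely for illustration.

```python
from typing import List

# Assumed thresholds in degrees Celsius; the source names no values.
HIGH_TEMP_C = 45.0        # above this: throttle the nearby processor
LOW_TEMP_HEAT_C = 0.0     # below this: heat the battery
LOW_TEMP_BOOST_C = -10.0  # below this: boost the battery output voltage


def temperature_actions(temp_c: float,
                        high: float = HIGH_TEMP_C,
                        low_heat: float = LOW_TEMP_HEAT_C,
                        low_boost: float = LOW_TEMP_BOOST_C) -> List[str]:
    """Return the (hypothetical) actions triggered by a reported temperature."""
    actions: List[str] = []
    if temp_c > high:
        # Thermal protection: lower performance to reduce power consumption.
        actions.append("reduce_processor_performance")
    if temp_c < low_heat:
        # Avoid abnormal shutdown at low temperature by heating the battery.
        actions.append("heat_battery")
    if temp_c < low_boost:
        # Avoid abnormal shutdown by boosting the battery output voltage.
        actions.append("boost_battery_output_voltage")
    return actions
```

A device implementing only one of the described embodiments would simply use the corresponding single branch.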
The touch sensor 6811 is also referred to as a "touch device". The touch sensor 6811 may be disposed on the display 690, and the touch sensor 6811 and the display 690 together form a touchscreen, also called a "touch-control screen". The touch sensor 6811 is used to detect a touch operation acting on or near it. The touch sensor may pass the detected touch operation to the application processor to determine the touch event type. Visual output related to the touch operation may be provided through the display 690. In other embodiments, the touch sensor 6811 may also be disposed on the surface of the electronic device 600 at a location different from that of the display 690.
The ambient light sensor 6812 is used to sense ambient light. The electronic device 600 may adaptively adjust the brightness of the display 690 based on the perceived ambient light level. The ambient light sensor 6812 may also be used to automatically adjust white balance when taking a photograph. The ambient light sensor 6812 may also cooperate with the proximity light sensor 6808 to detect if the electronic device 600 is in a pocket to prevent false touches.
The bone conduction sensor 6813 may acquire a vibration signal. In some embodiments, the bone conduction sensor 6813 may acquire the vibration signal of the bone mass vibrated by the human vocal part. The bone conduction sensor 6813 may also contact the human pulse to receive a blood pressure pulsation signal. In some embodiments, the bone conduction sensor 6813 may also be provided in a headset to form a bone conduction headset. The audio module 670 may parse out a voice signal based on the vibration signal of the vocal-part bone mass acquired by the bone conduction sensor 6813, so as to implement a voice function. The application processor may parse heart rate information based on the blood pressure pulsation signal acquired by the bone conduction sensor 6813, so as to implement a heart rate detection function.
The keys 694 include a power key, a volume key, etc. The keys 694 may be mechanical keys or touch keys. The electronic device 600 may receive key inputs and generate key signal inputs related to user settings and function control of the electronic device 600.
The motor 693 may generate a vibration alert. The motor 693 may be used for incoming call vibration alerts as well as for touch vibration feedback. For example, touch operations acting on different applications (e.g., photographing, audio playing, etc.) may correspond to different vibration feedback effects. Touch operations acting on different areas of the display 690 may also correspond to different vibration feedback effects of the motor 693. Different application scenarios (such as time reminders, receiving information, alarm clocks, games, etc.) may also correspond to different vibration feedback effects. The touch vibration feedback effect may also be customized.
The indicator 692 may be an indicator light, which may be used to indicate a state of charge, a change in power, a message, a missed call, a notification, or the like.
The SIM card interface 695 is used to connect a SIM card. The SIM card may be inserted into the SIM card interface 695 or removed from the SIM card interface 695 to achieve contact with and separation from the electronic device 600. The electronic device 600 may support one or N SIM card interfaces, N being a positive integer greater than 1. The SIM card interface 695 may support Nano SIM cards, Micro SIM cards, and the like. Multiple cards may be inserted into the same SIM card interface 695 simultaneously. The types of the multiple cards may be the same or different. The SIM card interface 695 may also be compatible with different types of SIM cards. The SIM card interface 695 may also be compatible with external memory cards. The electronic device 600 interacts with the network through the SIM card to perform functions such as calls and data communication. In some embodiments, the electronic device 600 employs an eSIM, that is, an embedded SIM card. The eSIM card can be embedded in the electronic device 600 and cannot be separated from the electronic device 600.
In particular, according to embodiments of the present application, the processes described below with reference to flowcharts may be implemented as computer software programs. For example, embodiments of the present application include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method shown in the flowcharts. In such embodiments, the computer program may be downloaded and installed from a network via a communication portion, and/or installed from a removable medium. When executed by a central processing unit (CPU), the computer program performs the various functions defined in the system of the present application.
It should be noted that the computer readable medium shown in the embodiments of the present invention may be a computer readable signal medium or a computer readable storage medium, or any combination of the two. The computer readable storage medium can be, for example, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM), a flash memory, an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device. A computer readable signal medium, by contrast, may include a data signal propagated in baseband or as part of a carrier wave, with computer readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electromagnetic, optical, or any suitable combination of the foregoing. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wired, etc., or any suitable combination of the foregoing.
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units involved in the embodiments of the present invention may be implemented by software or by hardware, and the described units may also be provided in a processor. In some cases, the names of the units do not constitute a limitation on the units themselves.
It should be noted that, as another aspect, the present application also provides a computer-readable medium, which may be included in the electronic device described in the above embodiment, or may exist alone without being incorporated into the electronic device. The computer-readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to implement the methods described in the embodiments below. For example, the electronic device may implement the steps shown in fig. 2.
Furthermore, the above-described drawings are only schematic illustrations of processes included in the method according to the exemplary embodiment of the present invention, and are not intended to be limiting. It will be readily appreciated that the processes shown in the above figures do not indicate or limit the temporal order of these processes. In addition, it is also readily understood that these processes may be performed synchronously or asynchronously, for example, among a plurality of modules.
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure herein. This application is intended to cover any variations, uses, or adaptations of the disclosure that follow the general principles of the disclosure and include such departures from the present disclosure as come within known or customary practice in the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with the true scope and spirit of the disclosure being indicated by the following claims.
It is to be understood that the present disclosure is not limited to the precise arrangements and instrumentalities shown in the drawings, and that various modifications and changes may be effected without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.

Claims (12)

1. A method of data processing, comprising:
acquiring a feature map to be processed, and inputting the feature map into an expansion rate calculation layer to acquire an expansion rate table corresponding to the feature map; wherein the expansion rate table comprises expansion rate values corresponding to pixels in the feature map;
inputting the expansion rate table into an expansion convolution layer, so that the expansion convolution layer carries out expansion convolution processing on the feature map in combination with the expansion rate table to obtain feature data corresponding to the feature map;
wherein the expansion rate calculation layer comprises at least one convolution layer;
the inputting the feature map into an expansion rate calculation layer to acquire an expansion rate table corresponding to the feature map comprises:
performing convolution processing on the feature map at least once to obtain the expansion rate table;
the inputting the expansion rate table into an expansion convolution layer, so that the expansion convolution layer carries out expansion convolution processing on the feature map in combination with the expansion rate table, comprises:
querying the expansion rate table to obtain the expansion rate values corresponding to the pixels in the feature map; and
determining the sampling range of the corresponding pixel according to the expansion rate value, so as to carry out the expansion convolution processing.
2. The data processing method of claim 1, wherein the method further comprises:
configuring the image size of the expansion rate table according to the image size of the feature map; or
configuring the image size of the expansion rate table by combining the image size of the feature map and the number of output channels of the expansion convolution layer; or
configuring the image size of the expansion rate table by combining the image size of the feature map and the filter grouping information of the output channels of the expansion convolution layer.
3. The data processing method according to claim 2, wherein the image size of the expansion rate table is the same as the image size of the feature map; or
the image size of the expansion rate table is the product of the image size of the feature map and the number of output channels of the expansion convolution layer; or
the image size of the expansion rate table is the product of the image size of the feature map and the number of output channel filter groups of the expansion convolution layer.
4. The method of claim 1, wherein the querying the expansion rate table to obtain the expansion rate value corresponding to each pixel in the feature map comprises:
for the output channels of the expansion convolution layer, performing the expansion convolution processing on pixels at the same coordinates with the same expansion rate value in every output channel; wherein the image size of the expansion rate table is the same as the image size of the feature map.
5. The method of claim 1, wherein the querying the expansion rate table to obtain the expansion rate value corresponding to each pixel in the feature map comprises:
querying the expansion rate table corresponding to each output channel of the expansion convolution layer, so that the expansion convolution processing is carried out on the feature map in combination with the expansion rate table corresponding to each output channel respectively; wherein the image size of the expansion rate table is the product of the image size of the feature map and the number of output channels of the expansion convolution layer.
6. The method of claim 1, wherein the querying the expansion rate table to obtain the expansion rate value corresponding to each pixel in the feature map comprises:
grouping the filters of the output channels of the expansion convolution layer according to a preset rule; and
querying the expansion rate table corresponding to each group according to the grouping result, and performing the expansion convolution processing on the feature map according to the expansion rate table corresponding to each group; wherein, within each group, pixels at the same coordinates in each output channel adopt the same expansion rate value.
7. A data processing method according to any of claims 4-6, wherein said determining the sampling range of the corresponding pixel according to the expansion rate value comprises:
sampling at the same interval in the horizontal direction and the vertical direction of the pixel, based on the expansion rate value corresponding to the pixel.
8. A data processing method according to any one of claims 4 to 6, wherein the expansion rate value is an integer or a non-integer.
9. The data processing method according to claim 8, wherein when the expansion rate value is a non-integer, the determining the sampling range of the corresponding pixel according to the expansion rate value to perform the expansion convolution processing includes:
obtaining neighborhood pixel values of a sampling point corresponding to the pixel, and calculating a corresponding neighborhood pixel interpolation based on the neighborhood pixel values; and
estimating the feature value of the sampling point according to the neighborhood pixel interpolation, so as to carry out the expansion convolution based on the feature value of the sampling point.
10. A data processing apparatus, comprising:
an adaptive expansion rate table calculation module, used for acquiring a feature map to be processed and inputting the feature map into an expansion rate calculation layer to acquire an expansion rate table corresponding to the feature map; wherein the expansion rate table comprises expansion rate values corresponding to pixels in the feature map;
an expansion convolution processing module, used for inputting the expansion rate table into an expansion convolution layer, so that the expansion convolution layer carries out expansion convolution processing on the feature map in combination with the expansion rate table to obtain feature data corresponding to the feature map;
wherein the expansion rate calculation layer comprises at least one convolution layer;
the adaptive expansion rate table calculation module is configured to:
perform convolution processing on the feature map at least once to obtain the expansion rate table;
the expansion convolution processing module is configured to:
query the expansion rate table to obtain the expansion rate values corresponding to the pixels in the feature map; and
determine the sampling range of the corresponding pixel according to the expansion rate value, so as to carry out the expansion convolution processing.
11. A computer readable medium on which a computer program is stored, characterized in that the computer program, when being executed by a processor, implements the data processing method according to any one of claims 1 to 9.
12. An electronic device, comprising:
one or more processors;
storage means for storing one or more programs which when executed by the one or more processors cause the one or more processors to implement the data processing method of any of claims 1 to 9.
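For illustration only, the per-pixel lookup and sampling recited in claims 1, 7 and 9 can be sketched in plain Python as below. The sketch makes several assumptions that are not stated in the source: a single-channel feature map, a single expansion rate table of the same size as the feature map (the case of claim 4), an odd-sized square kernel, zero values outside the feature map, and bilinear interpolation as the neighborhood pixel interpolation. The function names `sample_bilinear` and `adaptive_expansion_conv` are hypothetical, and the expansion rate table is passed in directly rather than produced by the convolution layers of the expansion rate calculation layer.

```python
import math
from typing import List

Grid = List[List[float]]


def sample_bilinear(fmap: Grid, y: float, x: float) -> float:
    """Estimate the feature value at (y, x) from its four neighbors.

    Positions outside the feature map contribute zero (assumed padding).
    """
    h, w = len(fmap), len(fmap[0])
    y0, x0 = math.floor(y), math.floor(x)
    dy, dx = y - y0, x - x0
    value = 0.0
    for yy, wy in ((y0, 1.0 - dy), (y0 + 1, dy)):
        for xx, wx in ((x0, 1.0 - dx), (x0 + 1, dx)):
            if 0 <= yy < h and 0 <= xx < w:
                value += wy * wx * fmap[yy][xx]
    return value


def adaptive_expansion_conv(fmap: Grid, rate_table: Grid, kernel: Grid) -> Grid:
    """Expansion (dilated) convolution with one expansion rate per pixel."""
    h, w = len(fmap), len(fmap[0])
    k = len(kernel)   # assumed odd and square
    half = k // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            rate = rate_table[i][j]  # query the table for this pixel
            acc = 0.0
            for a in range(k):
                for b in range(k):
                    # Same sampling interval horizontally and vertically.
                    sy = i + (a - half) * rate
                    sx = j + (b - half) * rate
                    acc += kernel[a][b] * sample_bilinear(fmap, sy, sx)
            out[i][j] = acc
    return out
```

With an integer rate the sampling points fall exactly on pixels and the interpolation reduces to a direct read; with a non-integer rate such as 1.5 each sampling point is estimated from its neighboring pixels. Per-channel or per-group tables (claims 5 and 6) would repeat the same loop with a different table for each output channel or filter group.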
CN202010963275.0A 2020-09-14 2020-09-14 Data processing method and device, computer readable medium and electronic equipment Active CN112037157B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010963275.0A CN112037157B (en) 2020-09-14 2020-09-14 Data processing method and device, computer readable medium and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010963275.0A CN112037157B (en) 2020-09-14 2020-09-14 Data processing method and device, computer readable medium and electronic equipment

Publications (2)

Publication Number Publication Date
CN112037157A CN112037157A (en) 2020-12-04
CN112037157B true CN112037157B (en) 2024-07-02

Family

ID=73589857

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010963275.0A Active CN112037157B (en) 2020-09-14 2020-09-14 Data processing method and device, computer readable medium and electronic equipment

Country Status (1)

Country Link
CN (1) CN112037157B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112734015B (en) * 2021-01-14 2023-04-07 北京市商汤科技开发有限公司 Network generation method and device, electronic equipment and storage medium
CN113570859B (en) * 2021-07-23 2022-07-22 江南大学 Traffic flow prediction method based on asynchronous space-time expansion graph convolution network

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108961253A (en) * 2018-06-19 2018-12-07 深动科技(北京)有限公司 A kind of image partition method and device
CN109284782A (en) * 2018-09-13 2019-01-29 北京地平线机器人技术研发有限公司 Method and apparatus for detecting feature

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR100463552B1 (en) * 2003-01-16 2004-12-29 엘지전자 주식회사 Cubic convolution interpolation apparatus and method
CN110543849B (en) * 2019-08-30 2022-10-04 北京市商汤科技开发有限公司 Detector configuration method and device, electronic equipment and storage medium
CN111311629B (en) * 2020-02-21 2023-12-01 京东方科技集团股份有限公司 Image processing method, image processing device and equipment

Also Published As

Publication number Publication date
CN112037157A (en) 2020-12-04

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant