CN109726803B

CN109726803B - Pooling method, image processing method and device

Info

Publication number: CN109726803B
Application number: CN201910022794.4A
Authority: CN
Inventors: 谷爱国; 蔡炀
Original assignee: Guangzhou Puppy Robot Technology Co ltd
Current assignee: Guangzhou Puppy Robot Technology Co ltd
Priority date: 2019-01-10
Filing date: 2019-01-10
Publication date: 2021-06-29
Anticipated expiration: 2039-01-10
Also published as: CN109726803A

Abstract

The embodiments of the present application disclose a pooling method, an image processing method, and corresponding devices. During image processing, the pooling operation considers not only spatial-domain features, namely the features in the sliding window at its current position, but also time-domain features, namely the features in the sliding window at the same position in adjacent feature maps. This increases the utilization of information and thereby improves processing accuracy.

Description

Pooling method, image processing method and device

Technical Field

The present application relates to the field of computer vision technologies, and in particular, to a pooling method, an image processing method, and an image processing apparatus.

Background

A Convolutional Neural Network (CNN) is a feed-forward neural network that is well suited to machine learning problems involving images, especially large images. A convolutional neural network comprises a feature extractor consisting of a convolutional layer and a pooling layer: the convolutional layer extracts features from the input image to obtain a feature map sequence, and the pooling layer reduces the dimensionality of each frame of feature map output by the convolutional layer.

Currently, for each frame of feature map, the pooling layer pools as follows: a sliding window is slid over the feature map, and each time the window reaches a position, a pooling operation is performed on the feature values in the window to obtain the pooled feature value at that position, thereby reducing the dimensionality of the feature map.

The inventor finds that, based on the current pooling method, the accuracy of the processing result is low after the convolutional neural network processes the image. For example, when the convolutional neural network is used for target recognition, the recognition result has low accuracy.

Disclosure of Invention

It is an object of the present application to provide a pooling method, an image processing method and apparatus, which at least partially overcome the technical problems of the prior art.

In order to achieve the purpose, the application provides the following technical scheme:

a pooling method comprising:

sliding the sliding window on the first feature map;

when the sliding window slides to a first position, performing a pooling operation according to the feature values in the sliding window at the first position in the first feature map and in a second feature map adjacent to the first feature map, to obtain a pooled feature value at the first position in the first feature map.

In the above method, preferably, performing the pooling operation according to the feature values in the sliding window at the first position in the first feature map and in a second feature map adjacent to the first feature map includes:

performing the pooling operation according to the feature values in the sliding window at the first position in the first feature map and the feature values in a preset area of the sliding window at the first position in the second feature map.

Preferably, performing the pooling operation according to the feature values in the sliding window at the first position in the first feature map and the feature values in the preset area of the sliding window at the first position in the second feature map includes:

acquiring the feature values in the preset area of the sliding window at the first position in the second feature map;

processing the feature values in the preset area to obtain a region feature value;

and performing the pooling operation using the feature values in the sliding window at the first position in the first feature map and the region feature value.

Alternatively, the pooling operation may be performed directly using the feature values in the sliding window at the first position in the first feature map and the feature values in the preset area of the sliding window at the first position in the second feature map.

Alternatively, the pooling operation may be performed according to the feature values in the sliding window at the first position in the first feature map and all the feature values in the sliding window at the first position in the second feature map.

In the above method, preferably, performing the pooling operation includes:

performing a mean pooling operation or a maximum pooling operation.

An image processing method comprising:

processing the image through a pre-trained convolutional neural network to obtain an image processing result; wherein the pooling layer in the convolutional neural network applies the above pooling method to pool the feature map sequence output by the convolutional layer in the convolutional neural network.

A pooling device comprising:

a sliding module, configured to slide the sliding window on a first feature map;

and a pooling module, configured to, when the sliding window slides to a first position, perform a pooling operation according to the feature values in the sliding window at the first position in the first feature map and in a second feature map adjacent to the first feature map, so as to obtain a pooled feature value at the first position in the first feature map.

Preferably, in the above apparatus, the pooling module is specifically configured to:

Preferably, when performing the pooling operation, the pooling module is specifically configured to perform a mean pooling operation or a maximum pooling operation.

An image processing apparatus comprising:

a processing module, configured to process the image through a pre-trained convolutional neural network to obtain an image processing result; wherein the pooling layer in the convolutional neural network applies the pooling method described above to pool the feature map sequence output by the convolutional layer in the convolutional neural network.

With the above scheme, during image processing the pooling operation considers not only spatial-domain features, namely the features in the sliding window at its current position, but also time-domain features, namely the features in the sliding window at the same position in adjacent feature maps. This increases the utilization of information and thereby improves processing accuracy.

Drawings

In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to the drawings without creative efforts.

FIG. 1 is a flow chart of one implementation of a pooling method provided by an embodiment of the present application;

FIG. 2 is an exemplary diagram of three frames of feature maps that are sequentially and continuously output by the convolutional layer according to an embodiment of the present disclosure;

FIG. 3 is an exemplary graph of feature values within a sliding window at position No. 2 in the three-frame feature map shown in FIG. 2 according to an embodiment of the present application;

FIG. 4 is a schematic structural diagram of a pooling device according to an embodiment of the present application.

The terms first, second, third, fourth and the like in the description and in the claims, and in the drawings described above, if any, are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It should be understood that the data so used may be interchanged under appropriate circumstances such that embodiments of the application described herein may be practiced otherwise than as specifically illustrated.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be obtained by a person skilled in the art without inventive effort based on the embodiments of the present invention, are within the scope of the present invention.

Referring to fig. 1, fig. 1 is a flowchart of an implementation of a pooling method according to an embodiment of the present application, which may include:

Step S11: sliding the sliding window on the first feature map.

The pooling method provided by the present application is applied in the pooling layer of a convolutional neural network. The convolutional neural network comprises a convolutional layer and a pooling layer; the convolutional layer extracts features from the image input into the network to obtain a feature map sequence, and the first feature map is any one feature map in the feature map sequence output by the convolutional layer.

Step S12: when the sliding window slides to the first position, performing a pooling operation according to the feature values in the sliding window at the first position in the first feature map and in the second feature map adjacent to the first feature map, to obtain a pooled feature value at the first position in the first feature map.

The second feature map adjacent to the first feature map may be the feature map one frame before the first feature map in the feature map sequence output by the convolutional layer (hereinafter, the feature map of the previous frame of the first feature map), the feature map one frame after the first feature map in that sequence (hereinafter, the feature map of the next frame of the first feature map), or both the feature map of the previous frame and the feature map of the next frame.

In the embodiment of the present application, each time the sliding window slides to a position (the first position), the pooling operation is performed according to the feature values in the sliding window at the first position in the first feature map and the feature values in the sliding window at the first position in the second feature map adjacent to the first feature map. Specifically,

the pooling operation may be performed according to the feature values in the sliding window at the first position in the first feature map and the feature values in the sliding window at the first position in the feature map of the previous frame of the first feature map.

Alternatively,

the pooling operation may be performed according to the feature values in the sliding window at the first position in the first feature map and the feature values in the sliding window at the first position in the feature map of the next frame of the first feature map.

Alternatively,

the pooling operation may be performed according to the feature values in the sliding window at the first position in the first feature map, the feature values in the sliding window at the first position in the feature map of the previous frame of the first feature map, and the feature values in the sliding window at the first position in the feature map of the next frame of the first feature map.

The sliding window is usually slid on the first feature map according to a preset sliding rule, and the first position is any position of the sliding window during sliding on the first feature map according to the preset sliding rule.
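As a rough illustration of a preset sliding rule, the sketch below enumerates window positions for a row-by-row scan. The source does not specify the rule; the function name `window_positions` and the `stride` parameter are assumptions made only for this example.

```python
def window_positions(height, width, size, stride):
    """Top-left corners visited by a size x size window that scans the
    feature map row by row, moving `stride` cells at a time (assumed rule)."""
    return [(top, left)
            for top in range(0, height - size + 1, stride)
            for left in range(0, width - size + 1, stride)]

# A 5 x 5 feature map scanned by a 3 x 3 window with a stride of 2.
positions = window_positions(5, 5, 3, 2)
print(positions)
```

Any entry of `positions` can play the role of the first position in the description.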

The pooling operation may be a mean pooling operation, a maximum pooling operation, or another pooling operation; the possibilities are not enumerated here.

The pooling method provided by the present application considers not only spatial-domain features, namely the features in the sliding window at its current position, but also time-domain features, namely the features in the sliding window at the same position in adjacent feature maps. This increases the utilization of information and thereby improves processing accuracy.

In an alternative embodiment, if the first feature map is the first frame feature map in the feature sequence output by the convolutional layer, then: when the sliding window slides to a position, the pooling operation can be performed according to the feature value in the sliding window at the first position in the first feature map and the feature value in the sliding window at the first position in the feature map of the next frame of the first feature map.

If the first feature map is the last frame feature map in the feature sequence output by the convolutional layer, then: when the sliding window slides to a position, the pooling operation can be performed according to the feature value in the sliding window at the first position in the first feature map and the feature value in the sliding window at the first position in the feature map of the previous frame of the first feature map.

If the first feature map is neither the first frame feature map in the feature sequence output by the convolutional layer nor the last frame feature map in the feature sequence, then: when the sliding window slides to a position, the pooling operation may be performed according to the feature value in the sliding window at the first position in the first feature map, the feature value in the sliding window at the first position in the feature map of the previous frame of the first feature map, and the feature value in the sliding window at the first position in the feature map of the next frame of the first feature map.
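The three cases above (first frame, last frame, any other frame) can be sketched as follows. This is a minimal illustration, assuming feature maps stored as nested lists and full neighbouring windows taking part in the pooling; the helper names `window_values`, `adjacent_indices` and `st_pool_at` are hypothetical and do not come from the source.

```python
from statistics import mean

def window_values(fmap, top, left, size):
    """Feature values inside the size x size window whose top-left corner is (top, left)."""
    return [fmap[r][c] for r in range(top, top + size) for c in range(left, left + size)]

def adjacent_indices(t, num_frames):
    """Indices of the previous and next frames of frame t that exist in the sequence."""
    return [i for i in (t - 1, t + 1) if 0 <= i < num_frames]

def st_pool_at(frames, t, top, left, size, reduce_fn=max):
    """Pool the window of frame t together with the same window in its adjacent frames."""
    values = window_values(frames[t], top, left, size)
    for i in adjacent_indices(t, len(frames)):
        values += window_values(frames[i], top, left, size)
    return reduce_fn(values)

# Three tiny 2 x 2 feature maps output in the order X, Y, Z (made-up values).
X = [[1, 2], [3, 4]]
Y = [[5, 6], [7, 8]]
Z = [[9, 10], [11, 12]]
frames = [X, Y, Z]

first = st_pool_at(frames, 0, 0, 0, 2)         # first frame: X and Y only, max pooling
middle = st_pool_at(frames, 1, 0, 0, 2, mean)  # middle frame: X, Y and Z, mean pooling
```

Passing a different `reduce_fn` switches between maximum and mean pooling without changing how the neighbouring frames are selected.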

In an alternative embodiment, if the first feature map is neither the first frame feature map in the feature sequence output by the convolutional layer nor the last frame feature map in the feature sequence, the pooling operation may be performed as follows:

when the sliding window slides to the first position, performing the pooling operation according to the feature values in the sliding window at the first position in the first feature map and the feature values in the sliding window at the first position in the feature map of the next frame of the first feature map; alternatively,

when the sliding window slides to a position, the pooling operation can be performed according to the feature value in the sliding window at the first position in the first feature map and the feature value in the sliding window at the first position in the feature map of the previous frame of the first feature map.

In an optional embodiment, performing the pooling operation according to the feature values in the sliding window at the first position in the first feature map and in the second feature map adjacent to the first feature map may specifically include:

performing the pooling operation according to the feature values in the sliding window at the first position in the first feature map and the feature values in a preset area of the sliding window at the first position in the second feature map.

That is, in this embodiment, only the feature values in a partial region of the sliding window at the first position in the second feature map (i.e., part of the feature values in the sliding window) take part in the pooling operation. The partial region may contain a single feature value or several feature values; the several feature values may be sequentially adjacent to one another or not adjacent to one another (i.e., the partial region consists of several sub-regions, and different sub-regions do not adjoin one another).

In an optional embodiment, when performing the pooling operation according to the feature value in the sliding window at the first position in the first feature map and the feature value in the preset area in the sliding window at the first position in the second feature map, the pooling operation may be performed by directly using the feature value in the sliding window at the first position in the first feature map and the feature value in the preset area in the sliding window at the first position in the second feature map.

Fig. 2 is an exemplary diagram of three frames of feature maps that are sequentially and continuously output by the convolutional layer. The output sequence of the three frames of feature maps is X, Y, Z, namely, feature map X is output first, then feature map Y is output, and then feature map Z is output. That is, the feature map X is a feature map of a frame preceding the feature map Y, and the feature map Z is a feature map of a frame succeeding the feature map Y. In this example, it is assumed that the feature map X is the first frame feature map in the feature map sequence, and the sliding window is a window of 3 × 3.

When the current sliding window slides on the feature map X, then: in the prior art, when the sliding window slides to position 1, the pooled feature values at position 1 in the feature map X are calculated by using only the feature values in window 3 × 3 at position 1, and similarly, when the sliding window slides to position 2, the pooled feature values at position 2 in the feature map X are calculated by using only the feature values in window 3 × 3 at position 2. In the embodiment of the present application, when the sliding window slides to the position 1, the pooled feature values at the position 1 on the feature map X are calculated according to the feature values in the window 3 × 3 at the position 1 on the feature map X and the feature values in the window 3 × 3 at the position 1 on the feature map Y, and similarly, when the sliding window slides to the position 2, the pooled feature values at the position 2 on the feature map X are calculated according to the feature values in the window 3 × 3 at the position 2 on the feature map X and the feature values in the window 3 × 3 at the position 2 on the feature map Y.

Assuming that the current sliding window slides on the feature map Y: in the prior art, when the sliding window slides to position 1, the pooled feature value at position 1 in the feature map Y is calculated by using only the feature values in window 3 × 3 at position 1, and similarly, when the sliding window slides to position 2, the pooled feature value at position 2 in the feature map Y is calculated by using only the feature values in window 3 × 3 at position 2. In the embodiment of the present application, when the sliding window slides to the position 1, the pooled feature value at the position 1 on the feature map Y is calculated according to the feature value in the window 3 × 3 at the position 1 on the feature map X, the feature value in the window 3 × 3 at the position 1 on the feature map Y, and the feature value in the window 3 × 3 at the position 1 on the feature map Z, and similarly, when the sliding window slides to the position 2, the pooled feature value at the position 2 on the feature map Y is calculated according to the feature value in the window 3 × 3 at the position 2 on the feature map X, the feature value in the window 3 × 3 at the position 2 on the feature map Y, and the feature value in the window 3 × 3 at the position 2 on the feature map Z.

Specifically, assume that the feature values in the sliding window at position No. 2 in the three frames of feature maps shown in FIG. 2 are as shown in FIG. 3: the feature values in the sliding window at position No. 2 in the feature map X are x1 to x9, those in the feature map Y are y1 to y9, and those in the feature map Z are z1 to z9. Then, when the sliding window slides to position No. 2 on the feature map Y:

In the prior art, the pooled feature value y at position No. 2 on the feature map Y is calculated as follows:

Mean pooling: y = (y1 + y2 + y3 + y4 + y5 + y6 + y7 + y8 + y9)/9;

Maximum pooling: y = max{y1, y2, y3, y4, y5, y6, y7, y8, y9}.

In the present application, assuming that the preset region in the sliding window is region No. 5, that is, the position in the second row and second column of the sliding window, the feature value in the preset region at position No. 2 in the feature map X is x5 and the feature value in the preset region at position No. 2 in the feature map Z is z5. Accordingly, the pooled feature value y at position No. 2 on the feature map Y may be calculated as follows:

Mean pooling: y = (y1 + y2 + y3 + y4 + y5 + y6 + y7 + y8 + y9 + x5 + z5)/11;

Maximum pooling: y = max{y1, y2, y3, y4, y5, y6, y7, y8, y9, x5, z5}.

For another example, assuming that the preset region in the sliding window is regions No. 1 and No. 2, that is, the positions in the first and second columns of the first row of the sliding window, the feature values in the preset region at position No. 2 in the feature map X are x1 and x2, and the feature values in the preset region at position No. 2 in the feature map Z are z1 and z2. Accordingly, the pooled feature value y at position No. 2 on the feature map Y may be calculated as follows:

Mean pooling: y = (y1 + y2 + y3 + y4 + y5 + y6 + y7 + y8 + y9 + x1 + x2 + z1 + z2)/13;

Maximum pooling: y = max{y1, y2, y3, y4, y5, y6, y7, y8, y9, x1, x2, z1, z2}.
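The two preset-region examples above can be checked numerically with a short sketch. The numeric window contents are made up for illustration, windows are flattened row by row so that region No. k has 0-based index k - 1, and the function name `pool_with_preset_region` is hypothetical.

```python
def mean(values):
    return sum(values) / len(values)

def pool_with_preset_region(current, neighbours, region, reduce_fn):
    """Pool all values of the current window together with the values found
    at the preset-region indices of each neighbouring window."""
    values = list(current)
    for window in neighbours:
        values += [window[i] for i in region]
    return reduce_fn(values)

# Flattened 3 x 3 windows at position No. 2 (illustrative numbers only).
x = [1, 2, 3, 4, 5, 6, 7, 8, 9]           # x1..x9 from feature map X
y = [9, 8, 7, 6, 5, 4, 3, 2, 1]           # y1..y9 from feature map Y
z = [2, 4, 6, 8, 10, 12, 14, 16, 18]      # z1..z9 from feature map Z

centre = [4]     # region No. 5
pair = [0, 1]    # regions No. 1 and No. 2

mean_centre = pool_with_preset_region(y, [x, z], centre, mean)  # divides by 11
max_centre = pool_with_preset_region(y, [x, z], centre, max)
mean_pair = pool_with_preset_region(y, [x, z], pair, mean)      # divides by 13
max_pair = pool_with_preset_region(y, [x, z], pair, max)
```

The divisor of the mean grows with the number of neighbouring values added, which matches the 11 and 13 in the formulas above.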

In another optional implementation, performing the pooling operation according to the feature values in the sliding window at the first position in the first feature map and the feature values in the preset area of the sliding window at the first position in the second feature map may specifically include:

Acquiring the feature values in the preset area of the sliding window at the first position in the second feature map.

Processing the feature values in the preset area to obtain a region feature value. If there are two frames of second feature maps, the region feature value of the preset area in the sliding window at the first position is calculated separately for each frame of second feature map, that is, each frame of second feature map corresponds to its own region feature value.

The feature values in the preset area may be processed in ways including, but not limited to, any one of the following: taking the mean, the maximum, the sum, the median, and the like.

Performing the pooling operation according to the feature values in the sliding window at the first position in the first feature map and the region feature value corresponding to the second feature map.

Different from the foregoing embodiment, in this embodiment, the feature value in the preset region in the sliding window at the first position in the second feature map is processed to obtain the region feature value of the preset region, and then the feature value in the sliding window at the first position in the first feature map and the region feature value are used to perform pooling operation.

Taking FIG. 3 as an example, assume that the preset region in the sliding window is regions No. 1 and No. 2, that is, the positions in the first and second columns of the first row of the sliding window, and that the region feature value is obtained by averaging the feature values in the preset region. Then, when the sliding window slides to position No. 2 on the feature map Y, the pooled feature value y at position No. 2 on the feature map Y may be calculated as follows:

Mean pooling: y = (y1 + y2 + y3 + y4 + y5 + y6 + y7 + y8 + y9 + (x1 + x2)/2 + (z1 + z2)/2)/11;

Maximum pooling: y = max{y1, y2, y3, y4, y5, y6, y7, y8, y9, (x1 + x2)/2, (z1 + z2)/2}.

Here, (x1 + x2)/2 is the region feature value corresponding to the feature map X, and (z1 + z2)/2 is the region feature value corresponding to the feature map Z.
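A minimal sketch of this region-feature-value variant follows, using made-up window contents. The function name `pool_with_region_value` and its `summarise` parameter are assumptions introduced for the example; other summaries named in the text (maximum, sum, median) could be passed in place of the mean.

```python
def mean(values):
    return sum(values) / len(values)

def pool_with_region_value(current, neighbours, region, summarise, reduce_fn):
    """Reduce each neighbouring window's preset region to one region feature
    value, then pool it together with all values of the current window."""
    values = list(current)
    for window in neighbours:
        values.append(summarise([window[i] for i in region]))
    return reduce_fn(values)

# Flattened 3 x 3 windows at position No. 2 (illustrative numbers only).
x = [1, 2, 3, 4, 5, 6, 7, 8, 9]
y = [9, 8, 7, 6, 5, 4, 3, 2, 1]
z = [2, 4, 6, 8, 10, 12, 14, 16, 18]
pair = [0, 1]    # regions No. 1 and No. 2, 0-based indices

# Region feature values: (x1 + x2)/2 = 1.5 and (z1 + z2)/2 = 3.0
mean_pooled = pool_with_region_value(y, [x, z], pair, mean, mean)  # divides by 11
max_pooled = pool_with_region_value(y, [x, z], pair, mean, max)
```

Because each neighbouring frame contributes exactly one value, the mean is always taken over the window size plus the number of neighbouring frames, regardless of how large the preset region is.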

In an optional embodiment, the pooling operation may be performed according to the feature values in the sliding window at the first position in the first feature map and all the feature values in the sliding window at the first position in the second feature map.

Specifically, during the calculation, the feature values in the sliding window at the first position in the first feature map and all the feature values in the sliding window at the first position in the second feature map may be used directly for the pooling operation. Alternatively,

all the feature values in the sliding window at the first position in the second feature map may first be processed to obtain a region feature value for the region covered by the sliding window at the first position in the second feature map, and the pooling operation is then performed using the feature values in the sliding window at the first position in the first feature map and the region feature value.

The manner of calculating the region feature value may refer to the foregoing embodiments, and is not described herein again.
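To make the difference between the two whole-window options concrete, the sketch below computes both on made-up values. The function names `pool_direct` and `pool_summarised` are hypothetical, and the mean is used as the region feature value purely as an example.

```python
def mean(values):
    return sum(values) / len(values)

def pool_direct(current, neighbours, reduce_fn):
    """Option 1: every value of every neighbouring window joins the pooling."""
    values = list(current)
    for window in neighbours:
        values += window
    return reduce_fn(values)

def pool_summarised(current, neighbours, summarise, reduce_fn):
    """Option 2: each neighbouring window is first reduced to one region feature value."""
    values = list(current) + [summarise(window) for window in neighbours]
    return reduce_fn(values)

# Flattened 3 x 3 windows at the same position in X, Y and Z (illustrative numbers).
x = [1, 2, 3, 4, 5, 6, 7, 8, 9]
y = [9, 8, 7, 6, 5, 4, 3, 2, 1]
z = [2, 4, 6, 8, 10, 12, 14, 16, 18]

direct_mean = pool_direct(y, [x, z], mean)            # mean over 27 values
direct_max = pool_direct(y, [x, z], max)
summarised_mean = pool_summarised(y, [x, z], mean, mean)  # mean over 11 values
summarised_max = pool_summarised(y, [x, z], mean, max)
```

With direct use, neighbouring frames carry as much weight as the current frame; summarising first keeps the current frame dominant.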

Based on the pooling method, the present application also provides an image processing method, which may include: processing the image through a pre-trained convolutional neural network to obtain an image processing result, wherein the pooling layer in the convolutional neural network applies the pooling method disclosed above to pool the feature map sequence output by the convolutional layer in the convolutional neural network. Of course, besides the convolutional layer and the pooling layer, the convolutional neural network includes other layers, such as an activation layer and a fully-connected layer. The logical relationships between these layers are well established and are not the focus of the present application, so they are not described in detail here.

In the present application, the convolutional neural network may be a convolutional neural network for target recognition, and the pooling method can improve the target recognition accuracy.

Alternatively,

the convolutional neural network may be a convolutional neural network for target detection, and the pooling method can improve the target detection accuracy.

Alternatively,

the convolutional neural network may be a convolutional neural network for target segmentation, and the pooling method can improve the target segmentation accuracy.

The following takes the AlexNet network, one kind of convolutional neural network, as an example to illustrate a specific application scenario of the present application.

Step 1: inputting a picture A and a label;

Step 2: sequentially performing a convolution operation, a ReLU activation function operation and a local response normalization (LRN) operation on the picture A to obtain a feature map sequence B;

Step 3: performing the time-space domain pooling operation on B (namely, the pooling method disclosed by the present application) to obtain C;

Step 4: sequentially performing a convolution operation, a ReLU operation and an LRN operation on C to obtain D;

Step 5: performing the time-space domain pooling operation on D to obtain E;

Step 6: sequentially performing a convolution operation and a ReLU operation on E to obtain F;

Step 7: sequentially performing a convolution operation and a ReLU operation on F to obtain G;

Step 8: sequentially performing a convolution operation and a ReLU operation on G to obtain H;

Step 9: performing the time-space domain pooling operation on H to obtain I;

Step 10: sequentially performing a fully-connected operation, a ReLU operation and a Dropout operation on I to obtain J;

Step 11: sequentially performing a fully-connected operation, a ReLU operation and a Dropout operation on J to obtain K;

Step 12: performing a fully-connected operation on K to obtain L;

Step 13: inputting L and the label into the loss function for training.

Here, the convolution operation (Convolution), the activation function operation (ReLU), the Dropout operation, the fully-connected operation (InnerProduct) and the local response normalization operation (LRN) are all common operations in convolutional neural networks, and the loss function (loss) may be a cross-entropy loss function.
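The order of operations in the AlexNet example can be written down as a small data structure, which makes it easy to see where the time-space domain pooling sits. This is only a structural sketch: the operation labels such as `st_pool` are illustrative shorthand, not identifiers from any framework, and no actual computation is performed.

```python
# Each stage: (input name, output name, operations applied in order).
STAGES = [
    ("A", "B", ["conv", "relu", "lrn"]),
    ("B", "C", ["st_pool"]),
    ("C", "D", ["conv", "relu", "lrn"]),
    ("D", "E", ["st_pool"]),
    ("E", "F", ["conv", "relu"]),
    ("F", "G", ["conv", "relu"]),
    ("G", "H", ["conv", "relu"]),
    ("H", "I", ["st_pool"]),
    ("I", "J", ["fc", "relu", "dropout"]),
    ("J", "K", ["fc", "relu", "dropout"]),
    ("K", "L", ["fc"]),
]

def is_chained(stages):
    """True if every stage consumes exactly what the previous stage produced."""
    return all(prev[1] == cur[0] for prev, cur in zip(stages, stages[1:]))

# Outputs produced by the time-space domain pooling stages.
st_pool_outputs = [out for _, out, ops in STAGES if ops == ["st_pool"]]
```

The final output L, together with the label, would then be fed to the loss function during training.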

It should be noted that the pooling method disclosed herein is applicable not only to the AlexNet network, but also to other networks that include pooling operations.

Corresponding to the embodiment of the method, the present application further provides a pooling device, and a schematic structural diagram of the pooling device provided by the present application is shown in fig. 4, and the pooling device may include:

a sliding module 41 and a pooling module 42, wherein:

the sliding module 41 is configured to slide the sliding window on the first feature map.

The pooling module 42 is configured to, when the sliding window slides to the first position, perform a pooling operation according to the feature values in the sliding window at the first position in the first feature map and in the second feature map adjacent to the first feature map, so as to obtain a pooled feature value at the first position in the first feature map.

During pooling, the pooling device provided by the present application considers not only spatial-domain features, namely the features in the sliding window at its current position, but also time-domain features, namely the features in the sliding window at the same position in adjacent feature maps. This increases the utilization of information and thereby improves processing accuracy.

In an alternative embodiment, the pooling module 42 may be specifically configured to:

acquire the feature values in the preset area of the sliding window at the first position in the second feature map;

process the feature values in the preset area to obtain a region feature value;

and perform the pooling operation according to the feature values in the sliding window at the first position in the first feature map and the region feature value.

In an alternative embodiment, the pooling module 42 may be specifically configured to: perform the pooling operation according to the feature values in the sliding window at the first position in the first feature map and all the feature values in the sliding window at the first position in the second feature map.

In an optional embodiment, the pooling module 42 is specifically configured to perform a mean pooling operation or a maximum pooling operation when performing a pooling operation.
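The difference between the two operations can be shown on one window position. The following is a sketch only; the helper `pool_values`, the lookup table `POOLING_OPS` and the sample values are assumptions, and all feature values of both windows are used.

```python
import numpy as np

# Hypothetical lookup of the two pooling operations named above.
POOLING_OPS = {"mean": np.mean, "max": np.max}

def pool_values(spatial_window, temporal_window, mode):
    """Apply mean or maximum pooling to the combined feature values of both windows."""
    values = np.concatenate([np.ravel(spatial_window), np.ravel(temporal_window)])
    return float(POOLING_OPS[mode](values))

spatial = [[1, 2], [3, 4]]   # window in the first feature map (illustrative values)
temporal = [[5, 6], [7, 8]]  # same window in the adjacent feature map (illustrative values)

mean_value = pool_values(spatial, temporal, "mean")  # (1 + 2 + ... + 8) / 8
max_value = pool_values(spatial, temporal, "max")
print(mean_value, max_value)  # 4.5 8.0
```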

The present application further provides an image processing apparatus, which includes a processing module specifically configured to:

process the image through a pre-trained convolutional neural network to obtain an image processing result, wherein the pooling layer in the convolutional neural network applies the pooling method described above to pool the feature map sequence output by the convolutional layer in the convolutional neural network.
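Only the pooling layer of such a network is sketched below; the convolutional layers and the training are omitted. The sketch makes several assumptions that this passage does not state: the adjacent feature map is taken to be the previous one in the sequence, the first feature map (which has no predecessor) is paired with itself, the windows do not overlap, and maximum pooling is used. The function name `pool_sequence` is hypothetical.

```python
import numpy as np

def pool_sequence(feature_maps, window=2):
    """Max-pool every map of a (T, H, W) sequence together with its adjacent map."""
    maps = np.asarray(feature_maps, dtype=float)
    t, h, w = maps.shape
    # Assumption: the adjacent map is the previous one; map 0 is paired with itself.
    neighbors = np.concatenate([maps[:1], maps[:-1]], axis=0)
    stacked = np.stack([maps, neighbors], axis=1)  # shape (T, 2, H, W)
    # Split H and W into non-overlapping window blocks.
    blocks = stacked.reshape(t, 2, h // window, window, w // window, window)
    # Reduce over the map pair and over both in-window axes.
    return blocks.max(axis=(1, 3, 5))              # shape (T, H // window, W // window)

# Illustrative sequence of three 2x2 feature maps.
sequence = np.array([
    [[1, 0], [0, 0]],
    [[0, 2], [0, 0]],
    [[0, 0], [0, 1]],
])
pooled_sequence = pool_sequence(sequence)
print(pooled_sequence.ravel())  # [1. 2. 2.]
```

The third pooled value is 2 although the third feature map contains no value above 1, since the larger value comes from the adjacent feature map.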

Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.

In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus and method may be implemented in other ways. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.

The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.

In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit.

It should be understood that the features of the embodiments and of the claims may be combined with one another to solve the technical problems described above.

If the functions are implemented in the form of software functional units and sold or used as a stand-alone product, they may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, and various other media capable of storing program code.

The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims

1. A pooling method, comprising:

processing the image through a pre-trained convolutional neural network to obtain an image processing result; wherein the pooling, by the pooling layer in the convolutional neural network, of the feature map sequence output by the convolutional layer in the convolutional neural network comprises the following steps:

sliding the sliding window on the first feature map;

when the sliding window slides to a first position, performing mean pooling operation or maximum pooling operation according to the first feature map and the feature value in the sliding window at the first position in a second feature map adjacent to the first feature map to obtain a pooled feature value at the first position in the first feature map.

2. The method of claim 1, wherein performing a pooling operation based on feature values within the sliding window at the first location in the first feature map and a second feature map adjacent to the first feature map comprises:

3. The method of claim 2, wherein the performing a pooling operation according to the feature values in the sliding window at the first position in the first feature map and the feature values in a predetermined area in the sliding window at the first position in the second feature map comprises:

4. The method of claim 2, wherein the performing a pooling operation according to the feature values in the sliding window at the first position in the first feature map and the feature values in a predetermined area in the sliding window at the first position in the second feature map comprises:

5. The method of claim 1, wherein performing a pooling operation based on feature values within the sliding window at the first location in the first feature map and a second feature map adjacent to the first feature map comprises:

6. A pooling device, comprising:

a processing module, configured to process the image through a pre-trained convolutional neural network to obtain an image processing result; wherein the pooling layer in the convolutional neural network performs pooling processing on a feature map sequence output by the convolutional layer in the convolutional neural network; the pooling layer includes:

a sliding module, configured to slide the sliding window on the first feature map;

and the pooling module is used for performing mean pooling operation or maximum pooling operation according to the first feature map and the feature value in the sliding window at the first position in a second feature map adjacent to the first feature map when the sliding window slides to the first position, so as to obtain the pooled feature value at the first position in the first feature map.

7. The apparatus of claim 6, wherein the pooling module is specifically configured to:

8. The apparatus of claim 7, wherein the pooling module is specifically configured to:

9. The apparatus of claim 7, wherein the pooling module is specifically configured to:

10. The apparatus of claim 6, wherein the pooling module is specifically configured to: