CN111684489B - Image processing method and device - Google Patents


Info

Publication number
CN111684489B
CN111684489B (granted from application CN201880088083.0A)
Authority
CN
China
Prior art keywords
rectangular window
rectangular
value
average pixel
window
Legal status
Active
Application number
CN201880088083.0A
Other languages
Chinese (zh)
Other versions
CN111684489A
Inventor
庄哲綸
潘积桂
Current Assignee
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Application filed by Huawei Technologies Co Ltd filed Critical Huawei Technologies Co Ltd
Publication of CN111684489A publication Critical patent/CN111684489A/en
Application granted granted Critical
Publication of CN111684489B publication Critical patent/CN111684489B/en


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/20Analysis of motion

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

An image processing method and apparatus are disclosed. The method comprises: acquiring a first frame image and a second frame image of a moving object being photographed; dividing the first frame image into MxN first rectangular windows and the second frame image into MxN second rectangular windows, and calculating a first average pixel value of each first rectangular window and a second average pixel value of each second rectangular window, wherein each of the first average pixel value and the second average pixel value comprises m average pixel values in the horizontal direction and n average pixel values in the vertical direction; and determining at least one motion window among the MxN second rectangular windows from the first average pixel values of the MxN first rectangular windows and the second average pixel values of the MxN second rectangular windows. By using a pixel-value averaging algorithm, the method reduces all pixel values of the original image to averages and stores only those averages, so that it occupies few storage resources and saves hardware cost.

Description

Image processing method and device
Technical Field
The present disclosure relates to the field of image processing technologies, and in particular, to an image processing method and apparatus.
Background
When shooting a moving object, motion blur can occur, which lowers the rate of usable shots and degrades shooting quality. To improve the capture rate for snapshots of moving objects and reduce the blur, the speed and direction of the moving object must be detected in real time; the exposure parameters can then be controlled according to the detected speed and direction, or other algorithms applied. For example, during autofocus, controlling the focusing parameters according to the detected speed and direction of the moving object enables real-time continuous focusing and produces sharp images.
In general, one class of moving-object detection methods is implemented based on image processing algorithms. In such methods, a set of feature points is typically selected from the captured images, and the motion vector of the object is determined by detecting and analyzing these feature points. Specifically, the motion of the object can be judged by comparing the differences between the current frame and the previous frame. However, when the device stores the previous frame image, it must store all pixel information of that frame, which occupies a large amount of storage space and increases hardware cost.
Disclosure of Invention
The present application provides an image processing method and device that can determine the motion of an object with low memory access and reduced hardware cost.
In a first aspect, the present application provides an image processing method, which may be performed by a terminal device. Specifically, the method includes: acquiring a first frame image and a second frame image of a moving object being photographed; dividing the first frame image into MxN first rectangular windows and the second frame image into MxN second rectangular windows, wherein M represents the number of rectangular windows in the horizontal direction, N represents the number of rectangular windows in the vertical direction, and M and N are positive integers; each of the MxN first rectangular windows and the MxN second rectangular windows comprises m×n pixels, where m is the number of pixels in the horizontal direction within a window, n is the number of pixels in the vertical direction, and m and n are positive integers; calculating a first average pixel value of each first rectangular window and a second average pixel value of each second rectangular window, each of the first average pixel value and the second average pixel value comprising m average pixel values in the horizontal direction and n average pixel values in the vertical direction; and determining at least one motion window among the MxN second rectangular windows from the first average pixel values of the MxN first rectangular windows and the second average pixel values of the MxN second rectangular windows.
In this method, an averaging algorithm over pixel values reduces all pixel values of the original image to averages, and only these averages need to be cached or stored rather than all pixel values, so that few storage resources are occupied and hardware cost is saved. In addition, the method can quickly determine at least one motion window using a motion detection algorithm with low computation and memory access, improving calculation efficiency.
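As a rough illustration of the first aspect, the window division and projection-average comparison can be sketched as follows. This is a minimal sketch with assumed frame and window sizes, and the motion test (thresholding the mean absolute difference of the histograms) is a simplified stand-in, not the patent's correlation-coefficient formula:

```python
import numpy as np

def window_histograms(frame, M, N):
    """Split a frame into MxN rectangular windows and keep, per window, only
    the m horizontal and n vertical average pixel values (its projection
    histograms)."""
    H, W = frame.shape
    wh, ww = H // N, W // M            # each window is n = wh rows by m = ww columns
    hists = {}
    for r in range(N):
        for c in range(M):
            win = frame[r*wh:(r+1)*wh, c*ww:(c+1)*ww]
            x = win.mean(axis=0)       # m horizontal-direction averages
            y = win.mean(axis=1)       # n vertical-direction averages
            hists[(r, c)] = (x, y)
    return hists

def motion_windows(prev, curr, M, N, thresh=10.0):
    """Flag windows whose projection histograms changed between two frames."""
    h1, h2 = window_histograms(prev, M, N), window_histograms(curr, M, N)
    moving = []
    for key, (x2, y2) in h2.items():
        x1, y1 = h1[key]
        diff = (np.abs(x1 - x2).mean() + np.abs(y1 - y2).mean()) / 2
        if diff > thresh:
            moving.append(key)
    return moving
```

Only the m + n averages per window need to be kept between frames, not the m × n raw pixel values.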
With reference to the first aspect, in a possible implementation manner of the first aspect, the calculating a first average pixel value of each first rectangular window and a second average pixel value of each second rectangular window includes: determining first and second pixel regions of each first rectangular window, and determining third and fourth pixel regions of each second rectangular window, the first pixel region including the each first rectangular window and at least one first rectangular window horizontally adjacent to the each first rectangular window, the second pixel region including the each first rectangular window and at least one first rectangular window vertically adjacent to the each first rectangular window, the third pixel region including the each second rectangular window and at least one second rectangular window horizontally adjacent to the each second rectangular window, the fourth pixel region including the each second rectangular window and at least one second rectangular window vertically adjacent to the each second rectangular window; calculating m average pixel values in the horizontal direction in a first pixel region, and obtaining m average pixel values in the horizontal direction in the first average pixel value; calculating average pixel values of n vertical directions in a second pixel region to obtain average pixel values of n vertical directions in the first average pixel value; calculating m average pixel values in the horizontal direction in the third pixel region to obtain m average pixel values in the horizontal direction in the second average pixel value; and calculating average pixel values in n vertical directions in the fourth pixel region, and obtaining average pixel values in n vertical directions in the second average pixel value.
In this implementation, when the first pixel region and the second pixel region of each first rectangular window are determined, the averages are computed over the pixels of the current window together with those of its adjacent windows, which improves calculation accuracy; that is, accumulating the projection histograms of adjacent windows achieves a noise reduction effect, making the calculation result more accurate.
With reference to the first aspect, in another possible implementation manner of the first aspect, the determining at least one motion window of the MxN rectangular windows according to the first average pixel value of the MxN first rectangular windows and the second average pixel value of the MxN second rectangular windows includes: calculating a correlation coefficient of a first average pixel value of each first rectangular window and a second average pixel value of a second rectangular window corresponding to each first rectangular window; judging whether the correlation coefficient is smaller than a first threshold value or not; and if so, determining the second rectangular window corresponding to the correlation coefficient as a motion window. Optionally, the correlation coefficient includes a confidence level.
With reference to the first aspect, in a further possible implementation manner of the first aspect, the method further includes: calculating, using the sum of absolute differences (SAD) algorithm, the SAD value at every adjacent pixel position within the search range; selecting the minimum SAD value from all the SAD values, and determining the speed of each motion window according to the minimum SAD value; wherein the speed is determined by at least one of the number of horizontally displaced pixels or the number of vertically displaced pixels within each motion window.
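The SAD search over the stored projection histograms can be sketched as follows. This is a minimal illustration: the ±3-pixel search range and the handling of the non-overlapping ends of the shifted histograms are assumptions, not taken from the patent:

```python
import numpy as np

def best_shift(prev_hist, curr_hist, max_shift=3):
    """Return (shift, minimum SAD): the displacement in pixels whose SAD
    between the previous-frame and current-frame projection histograms is
    minimal, scanning shifts in [-max_shift, max_shift]."""
    prev_hist = np.asarray(prev_hist, dtype=np.int64)
    curr_hist = np.asarray(curr_hist, dtype=np.int64)
    n = len(prev_hist)
    best_s, best_sad = 0, None
    for s in range(-max_shift, max_shift + 1):
        # An object moved by s pixels means curr[j] should match prev[j - s];
        # only the overlapping part of the two histograms is compared.
        if s >= 0:
            a, b = curr_hist[s:], prev_hist[:n - s]
        else:
            a, b = curr_hist[:n + s], prev_hist[-s:]
        sad = int(np.abs(a - b).sum())
        if best_sad is None or sad < best_sad:
            best_s, best_sad = s, sad
    return best_s, best_sad
```

The shift with the minimum SAD gives the per-window displacement in pixels, from which the speed and direction of the motion window follow.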
With reference to the first aspect, in a further possible implementation manner of the first aspect, after calculating the correlation coefficient, the method further includes: calculating a gradient value of each second rectangular window, wherein the gradient value comprises a sum of a horizontal gradient value and a vertical gradient value in each second rectangular window, the horizontal gradient value is a sum of differences between every two adjacent pixel values in m horizontal direction pixel values, and the vertical gradient value is a sum of differences between every two adjacent pixel values in n vertical direction pixel values; determining a probability value of each second rectangular window according to the gradient value of each second rectangular window; judging whether the probability value is larger than a second threshold value or not; and if so, reducing the correlation coefficient of the second rectangular window corresponding to the probability value.
In this implementation, since the accuracy of motion information in low-texture regions is low, all low-texture regions of the current frame image can be identified by calculating gradient values, and the confidence of those regions reduced, so that the accuracy of the overall motion information is improved.
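A texture measure of this kind can be sketched on the stored projection values. This assumes the "sum of differences between every two adjacent pixel values" is taken as a sum of absolute differences, and the example values are hypothetical:

```python
import numpy as np

def gradient_value(x_hist, y_hist):
    """Sum of absolute differences between adjacent projection values,
    horizontal plus vertical, as a per-window texture measure."""
    gx = int(np.abs(np.diff(np.asarray(x_hist, dtype=np.int64))).sum())
    gy = int(np.abs(np.diff(np.asarray(y_hist, dtype=np.int64))).sum())
    return gx + gy

# A flat (low-texture) window scores 0; a textured window scores high.
flat = gradient_value([80, 80, 80, 80], [80, 80, 80])
textured = gradient_value([10, 90, 15, 85], [20, 95, 30])
```

A window whose gradient value maps to a high low-texture probability (as in the correspondence of fig. 10) would then have its confidence reduced.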
With reference to the first aspect, in a further possible implementation manner of the first aspect, the method further includes: and performing spatial domain filtering processing on the MxN second rectangular windows in the second frame image to obtain M 'x N' third rectangular windows and the speed and the confidence of each third rectangular window, wherein M 'and N' are positive integers, M 'is smaller than M, and N' is smaller than N.
In this implementation, spatial-domain filtering of the motion information yields stable displacement and confidence values, which improves the accuracy of the calculated motion speed and direction and reduces noise interference.
In a second aspect, the present application also provides an image processing apparatus comprising functional units for performing the method of the first aspect and various implementations of the first aspect.
Optionally, the functional unit includes an acquisition unit and a processing unit. Further, a transmitting unit, a storage unit, and the like may be included.
In a third aspect, embodiments of the present application also provide a communication device including a processor coupled with a memory for storing instructions; the processor is configured to execute the instructions in the memory, so that the communication device performs the image processing method in the foregoing first aspect and various implementation manners of the first aspect.
Optionally, the communication means comprises a hardware device, such as a terminal device.
In a fourth aspect, embodiments of the present application further provide a computer readable storage medium having stored therein instructions for performing the foregoing first aspect and the image processing methods in the various implementations of the first aspect when the instructions are run on a computer or a processor.
In a fifth aspect, embodiments of the present application further provide a computer program product comprising computer instructions which, when executed by a computer or processor, implement the foregoing first aspect and the image processing methods in various implementations of the first aspect.
In a sixth aspect, embodiments of the present application further provide a chip system, the chip system including a processor and an interface circuit, the interface circuit being coupled to the processor, the processor being configured to execute a computer program or instructions to implement the foregoing first aspect and the methods in various implementations of the first aspect; the interface circuit is used for communicating with other modules outside the chip system.
The present application provides an image processing method and device. The method reduces all pixel values of an original image to averages using a pixel-value averaging algorithm, and only the averages need to be cached or stored rather than all pixel values, so that few storage resources are occupied and hardware cost is saved. The method can also rapidly determine at least one motion window using a motion detection algorithm with low computation and memory access, improving calculation efficiency.
Drawings
Fig. 1 is an overall flowchart of an image processing method provided in an embodiment of the present application;
fig. 2 is a flowchart of an image processing method according to an embodiment of the present application;
fig. 3 is a schematic diagram of a correspondence relationship between a pixel value and a weight value according to an embodiment of the present application;
fig. 4 is a schematic diagram of dividing a pixel area according to an embodiment of the present application;
FIG. 5 is a schematic diagram of calculating an average pixel value of a rectangular window using pixel values according to an embodiment of the present application;
FIG. 6 is a schematic diagram of a low-memory stored projection histogram according to an embodiment of the present application;
FIG. 7 is a schematic diagram of a scanning search for a minimum SAD value using an SAD algorithm according to an embodiment of the present application;
FIG. 8 is a schematic diagram of a calculation to determine a minimum SAD value according to an embodiment of the present application;
FIG. 9 is a flowchart of a texture region determination method according to an embodiment of the present application;
fig. 10 is a schematic diagram of a correspondence relationship between a gradient value and a probability value according to an embodiment of the present application;
FIG. 11 is a schematic diagram of a relationship between displacement and confidence according to an embodiment of the present disclosure;
FIG. 12 is a schematic diagram of a multi-dimensional filtered image according to an embodiment of the present application;
fig. 13 is a schematic structural diagram of an image processing apparatus according to an embodiment of the present application;
Fig. 14 is a schematic structural diagram of a hardware device according to an embodiment of the present application.
Detailed Description
In order to better understand the technical solution in the embodiments of the present application and make the above objects, features and advantages of the embodiments of the present application more obvious, the technical solution in the embodiments of the present application is described in further detail below with reference to the accompanying drawings.
Before the technical scheme of the embodiment of the present application is described, first, a technical scenario of the embodiment of the present application is briefly described.
The present application belongs to the field of image processing and in particular relates to processing images in which a moving object is photographed: by acquiring and computing information about the moving object, the motion of the object is determined, while the amount of calculation, the memory required to store the information, and the hardware cost are all reduced.
The method provided by this embodiment processes a series of images, including two adjacent frames: a first frame image captured at the previous moment and a second frame image captured at the current moment. The two frames photograph the same object, but their contents differ because of the object's motion.
Specifically, referring to fig. 1, an overall flowchart of an image processing method according to an embodiment of the present application is shown, where the flowchart mainly includes three processing procedures, namely, a first portion (S1), a second portion (S2), and a third portion (S3).
Taking the second frame image acquired at the current moment as an example, in the whole processing flow:
s1: image preprocessing process. Mainly comprises the following steps: the acquired second frame image is subjected to a low-pass filtering process, and a process of cutting into a plurality of (e.g., m×n) rectangular windows is performed.
S2: projection histogram calculation. This mainly comprises: calculating the projection histograms of the M×N rectangular windows obtained by cutting, to obtain the average pixel value of each rectangular window, and storing the calculated average pixel values in a random access memory (RAM).
S3: motion information estimation. Mainly comprises the following steps: the motion information estimation can determine the displacement and direction of an object in the front and back frames of images, and the multidimensional filtering processing is to further process the calculated displacement and direction so as to improve the stability of calculation.
The processing flow of S1 to S3 above also includes storing the calculated projection histogram and motion information of the second frame image in the RAM, and acquiring the related information of the first frame image, such as its projection histogram and motion information, from the RAM for use in the S3 "motion information estimation" flow.
The flow of each part in S1 to S3 related to the present embodiment is described in detail below.
The image processing method provided by the embodiment of the present application addresses the problem of generating high-precision, high-stability motion information with a low-hardware-cost design.
As shown in fig. 2, an image processing method is provided. The execution subject of the method may be a terminal including a camera, such as a user equipment (UE), whose form includes but is not limited to a mobile phone, a computer, or a wearable device. Alternatively, the execution subject may be another device including a camera, such as a network monitoring device or a server. The method comprises the following steps:
step 201: a first frame image and a second frame image of a moving object are acquired.
The first frame image is an image shot at the moment t-1 or a previous frame image; the second frame image is an image shot at the moment t or a current frame image.
In this embodiment, the motion of the object is determined using the first frame image and the second frame image. After the second frame image is acquired, the method further includes performing low-pass filtering on the second frame image to reduce noise and improve its quality.
S1 specifically comprises: processing the second frame image with a two-dimensional low-pass filter. As shown in FIG. 3, a neighborhood of the second frame image is represented by the 9 pixels P00 to P22, where each pixel corresponds to a pixel value. (A pixel point is also referred to simply as a pixel.) For example, P00 = P01 = P02 = P10 = P12 = P20 = P21 = P22 = 100 and P11 = 150; since P11 is far greater than the surrounding pixel values, P11 can be considered a noise point.
This noise point is removed by the low-pass filter. The denoising process is: a target pixel value (denoted P11') is calculated to replace the noise point, by multiplying each pixel value around the noise point by its corresponding convolution weight and averaging. The original equation image is not reproduced here; with the 3×3 weights {1, 2, 1; 2, 4, 2; 1, 2, 1} (sum 16), which are consistent with the stated result:

P11' = int((1×100 + 2×100 + 1×100 + 2×100 + 4×150 + 2×100 + 1×100 + 2×100 + 1×100) / 16) = int(1800 / 16) = 112

The calculated target pixel value P11' = 112 replaces the original P11 = 150, so that the target pixel value becomes similar to the surrounding pixel values; through this denoising operation, accurate motion-related information can be obtained in subsequent calculation. It can be understood that other noise points can be denoised by the same method, which is not described in detail in this embodiment.
The convolution weight corresponding to each pixel can be set according to the ambient brightness or a camera parameter, such as the ISO value, to obtain the best denoising effect. In addition, the low-pass filter can be implemented using multipliers, adders and shift operators, which helps reduce hardware cost.
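The denoising step above can be checked numerically. This is a minimal sketch; the 3×3 Gaussian-like kernel is an assumption consistent with the stated result of 112, not necessarily the patent's exact weights:

```python
import numpy as np

# 3x3 neighborhood from the text: all pixels 100 except the center P11 = 150.
patch = np.full((3, 3), 100)
patch[1, 1] = 150

# Assumed Gaussian-like convolution weights; their sum is 16.
weights = np.array([[1, 2, 1],
                    [2, 4, 2],
                    [1, 2, 1]])

# Weighted average with truncation toward zero, matching "int" in the text.
p11_new = int((patch * weights).sum() / weights.sum())
```

Here (patch * weights).sum() is 1800, and 1800 / 16 = 112.5 truncates to the target pixel value 112.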
Step 202: the first frame image is divided into MxN first rectangular windows, and the second frame image is divided into MxN second rectangular windows.
M represents the number of rectangular windows in the horizontal direction, N represents the number of rectangular windows in the vertical direction, and M and N are positive integers. Each of the MxN first rectangular windows and the MxN second rectangular windows comprises m×n pixels, where m is the number of pixels in the horizontal direction within a window, n is the number of pixels in the vertical direction, and m and n are positive integers.
Step 203: a first average pixel value for each first rectangular window and a second average pixel value for each second rectangular window are calculated, each of the first average pixel value and the second average pixel value comprising m average pixel values in a horizontal direction and n average pixel values in a vertical direction.
Further, step 202 specifically includes: determining a first pixel region and a second pixel region of each first rectangular window, and determining a third pixel region and a fourth pixel region of each second rectangular window, the first pixel region including the each first rectangular window and at least one first rectangular window horizontally adjacent to the each first rectangular window, the second pixel region including the each first rectangular window and at least one first rectangular window vertically adjacent to the each first rectangular window, the third pixel region including the each second rectangular window and at least one second rectangular window horizontally adjacent to the each second rectangular window, the fourth pixel region including the each second rectangular window and at least one second rectangular window vertically adjacent to the each second rectangular window.
Step 203 comprises: and calculating the average pixel values of m horizontal directions in the first pixel region, and obtaining the average pixel values of m horizontal directions in the first average pixel value.
And calculating average pixel values of n vertical directions in the second pixel region, and obtaining average pixel values of n vertical directions in the first average pixel value.
And calculating the average pixel values in the m horizontal directions in the third pixel area to obtain the average pixel values in the m horizontal directions in the second average pixel value.
And calculating average pixel values in n vertical directions in the fourth pixel region, and obtaining average pixel values in n vertical directions in the second average pixel value.
Specifically, as shown in fig. 4, the first frame image, which is composed of a number of pixel values, is divided into 3×3 (9 in total) rectangular windows, i.e., M = 3, N = 3. The first frame image thus contains 9 first rectangular windows: P1, P2, P3, P4, P5, P6, P7, P8 and P9. A first pixel region and a second pixel region are determined for each of the 9 rectangular windows. Taking the first rectangular window P5 as an example, the two first rectangular windows horizontally adjacent to P5 are P4 and P6, so the first pixel region includes P5, P4 and P6. Similarly, the first rectangular windows vertically adjacent to P5 are P2 and P8, so the second pixel region includes P5, P2 and P8.
For a first rectangular window at a corner, such as P1, the first pixel regions are P1 and P2, and the second pixel regions are P1 and P4; for a first rectangular window located in the edge area, such as P4, the first pixel areas are P4 and P5, and the second pixel areas are P4, P1 and P7.
Similarly, the second frame image is divided into the same number of 3×3 second rectangular windows, and the third pixel region and the fourth pixel region of each rectangular window are determined in the same way as the first pixel region and the second pixel region; this is not described in detail again in this embodiment.
As shown in fig. 5, within each divided first rectangular window, m times n pixels are further included, each pixel corresponding to a pixel value, for example, m=4, n=3. The average pixel value of the first pixel region of each first rectangular window is calculated according to the following formula (1), and the average pixel value of the second pixel region of each first rectangular window is calculated according to the formula (2).
X_j = int( ( Σ_{w ∈ R1} Σ_{i=0}^{n-1} P_w(i, j) ) / (K1 × n) ),  j ∈ [0, m-1]    (1)

Y_i = int( ( Σ_{w ∈ R2} Σ_{j=0}^{m-1} P_w(i, j) ) / (K2 × m) ),  i ∈ [0, n-1]    (2)

wherein X_j represents the average pixel value in the horizontal direction, Y_i represents the average pixel value in the vertical direction, int represents rounding, P represents a pixel value, and i and j are indices, j having the value range [0, m-1] and i having the value range [0, n-1], each including the end values. (The original equation images are not reproduced; formulas (1) and (2) are reconstructed here so that each window's projection is accumulated over the K1 windows of its first pixel region R1 and the K2 windows of its second pixel region R2, consistent with the examples below.)
Optionally, the X j May also be denoted as "PorjHist_X", said Y i May also be denoted as "PorjHist_Y".
For example, for the first rectangular window P5, average pixel values in 4 horizontal directions in the first pixel region thereof are calculated, and average pixel values (X2, X3, and X4) in 4 horizontal directions in the first average pixel value are obtained, specifically:
Figure SMS_4
Figure SMS_5
Figure SMS_6
Figure SMS_7
Likewise, for the first rectangular window P5, the average pixel values in the 3 vertical directions of its second pixel region are calculated, giving the 3 vertical-direction average pixel values (Y1, Y2 and Y3) of the first average pixel value. [The equation images for Y1 to Y3, which evaluate to 68, 69 and 70, are not reproduced here.]
The 4 horizontal-direction average pixel values {54, 64, 74, 84} and the 3 vertical-direction average pixel values {68, 69, 70} of the first rectangular window P5 are thus calculated, 7 average pixel values in total. The method further includes saving the 7 average pixel values corresponding to the first rectangular window P5.
Similarly, with the same method, 7 average pixel values can be calculated for each of the other 8 first rectangular windows of the first frame image, including 4 average pixel values in the horizontal direction and 3 average pixel values in the vertical direction, and these average pixel values are saved in preparation for subsequent calculation.
It will be appreciated that for the second frame image, since the same MxN second rectangular windows are also divided and each second rectangular window includes m times n pixels, m=4, n=3, the corresponding average pixel value can be calculated according to the above formulas (1) and (2).
In this embodiment, the pixel-value averaging algorithm reduces the original pixel values to a small set of averages, so that only the averages need to be cached or stored rather than all pixel values. For example, for the first rectangular window P5, only 7 average pixel values need to be stored, compared with the original pixel values of 12 pixels; this saves storage space and also reduces the corresponding amount of calculation, thereby reducing storage resources and saving hardware cost.
In addition, when the first pixel region and the second pixel region of each first rectangular window are determined, the averages are computed over the pixels of the current window together with those of its adjacent windows, which improves calculation accuracy; that is, accumulating the projection histograms of adjacent windows achieves a noise reduction effect.
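The adjacent-window accumulation can be sketched numerically. The 4×3-pixel window contents below are hypothetical (the actual pixel values of FIG. 5 are not reproduced in this text); the point is that the horizontal projection of P5 is averaged together with those of its horizontal neighbours P4 and P6:

```python
import numpy as np

# Hypothetical 4x3-pixel windows (m = 4, n = 3).
P4 = np.array([[50, 60, 70, 80]] * 3)
P5 = np.array([[52, 62, 72, 82]] * 3)
P6 = np.array([[54, 64, 74, 84]] * 3)

# First pixel region of P5: the window itself plus its horizontal neighbours.
# Accumulating their column projections and averaging gives the m = 4
# horizontal-direction averages X_j, with the neighbours acting as noise
# reduction on the projection histogram.
region = np.vstack([P4, P5, P6])             # 9 rows x 4 columns of pixels
X = region.sum(axis=0) // region.shape[0]    # integer average per column
```

Per window only the m + n = 7 averages are stored instead of the m × n = 12 raw pixel values.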
Alternatively, as shown in fig. 6, the method replaces the many pixel values of each row or column with the calculated average pixel values, which is equivalent to storing the horizontal and vertical projection values of the pixel values; that is, the pixel information of a frame image can be represented by projection histograms. Referring to fig. 6, each rectangular window is projected in the x-axis direction and the y-axis direction respectively.
Using the method provided by the embodiment of the present application, the number of average pixel values to be stored for one frame image is 63 (7 × 9), compared with the original 108 (12 × 9) pixel values, which saves storage space and reduces hardware cost.
Step 204: at least one motion window of the MxN second rectangular windows is determined from the first average pixel value of the MxN first rectangular windows and the second average pixel value of the MxN second rectangular windows.
Step 204 specifically includes: calculating a correlation coefficient between the first average pixel value of each first rectangular window and the second average pixel value of the second rectangular window corresponding to that first rectangular window; judging whether the correlation coefficient is smaller than a first threshold; if so, determining that the second rectangular window corresponding to the correlation coefficient is a motion window; if not, that is, if the correlation coefficient is greater than or equal to the first threshold, determining that it is not a motion window.
The motion window may be understood as an area where the displacement of the object in the current image changes compared to the previous frame image.
Specifically, calculating a correlation coefficient of a first average pixel value of each first rectangular window and a second average pixel value of a second rectangular window corresponding to each first rectangular window includes:
The average pixel value of each first rectangular window and the average pixel value of each second rectangular window are obtained by calculation according to formula (1) and formula (2); when calculating the correlation coefficient, pixels at the same position in the first frame image and the second frame image are compared.
Wherein the correlation coefficient is calculated using the following formulas (3) to (6), the correlation coefficient including correlation coefficients in the horizontal direction and the vertical direction. Further, the correlation coefficient is a confidence coefficient, which is denoted by "Q", wherein "Qx" denotes a confidence coefficient in a horizontal direction and "Qy" denotes a confidence coefficient in a vertical direction.
Horizontal direction:
Formula (3): I = Σ_{j=1..m} (Curr ProjHist_X_j)²

Formula (4): T = Σ_{j=1..m} (Pre ProjHist_X_j)²

Formula (5): I′×T′ = Σ_{j=1..m} (Curr ProjHist_X_j × Pre ProjHist_X_j)

Formula (6): Q_x = (I′×T′) / √(I×T)
vertical direction:
I = Σ_{j=1..n} (Curr ProjHist_Y_j)²

T = Σ_{j=1..n} (Pre ProjHist_Y_j)²

I′×T′ = Σ_{j=1..n} (Curr ProjHist_Y_j × Pre ProjHist_Y_j)

Q_y = (I′×T′) / √(I×T)
in a specific application example, assuming that m average pixel values in the horizontal direction of the second rectangular window of the current frame (Curr) image are {54, 64, 74, 84}, and m average pixel values in the horizontal direction of the first rectangular window of the previous frame (Pre) image are {100, 20, 150, 40}, determining whether the second rectangular window of the current frame image is a moving window according to the above formula includes:
horizontal direction:
I = 54² + 64² + 74² + 84² = 19544,
T = 100² + 20² + 150² + 40² = 34500,
I′×T′ = 54×100 + 64×20 + 74×150 + 84×40 = 21140,
Q_x = (I′×T′) / √(I×T) = 21140 / √(19544 × 34500) ≈ 0.81.

Generally, the correlation coefficient, that is, the confidence Q, takes values in the range [-1, 1], inclusive; the closer the calculated correlation coefficient is to 1, the more similar the two signals are; conversely, the closer to -1, the greater the difference between the two signals. In this embodiment, assuming that the first threshold is 0.9, since 0.81 < 0.9, that is, the correlation coefficient is smaller than the first threshold, the rectangular window is determined to be a motion window.
In the same way, the correlation coefficient (confidence) of each rectangular window is calculated by traversal, and compared with a first threshold value, all the motion windows of the second frame image compared with the first frame image are determined.
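The correlation-coefficient test can be sketched as below. This assumes the normalized cross-correlation form of formulas (3) to (6) reconstructed above; the function names are ours:

```python
import math

def confidence(curr, prev):
    """Confidence Q of two projection histograms (assumed form of
    formulas (3)-(6)): Q = (I'xT') / sqrt(I x T)."""
    i = sum(c * c for c in curr)                  # I: sum of squared current values
    t = sum(p * p for p in prev)                  # T: sum of squared previous values
    it = sum(c * p for c, p in zip(curr, prev))   # I'xT': cross term
    return it / math.sqrt(i * t)

def is_motion_window(curr, prev, threshold=0.9):
    """A window is a motion window when its confidence falls below the
    first threshold (0.9 in the embodiment's example)."""
    return confidence(curr, prev) < threshold
```

Running it on the embodiment's histograms {54, 64, 74, 84} and {100, 20, 150, 40} reproduces I = 19544, T = 34500 and I′×T′ = 21140, and classifies the window as a motion window.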
It should be noted that, in this embodiment, the horizontal direction and the vertical direction of each rectangular window are evaluated separately, and only when both the currently calculated horizontal-direction confidence and vertical-direction confidence of the rectangular window are greater than or equal to the first threshold is it determined that the rectangular window is not a motion window. Otherwise, if the confidence in at least one direction is smaller than the first threshold, the rectangular window is considered to be a motion window.
Alternatively, the first threshold may be obtained experimentally or empirically by one skilled in the art.
According to the method provided by this embodiment, the average pixel value and the correlation coefficient of each rectangular window are calculated using the low-memory projection histogram, so that the storage resources occupied by storing pixel values can be reduced, the hardware cost is lowered, and the amount of computation is reduced, thereby improving the calculation efficiency of determining the motion window. For example, taking a rectangular window of size 12x9 (m=12, n=9) in a two-dimensional eight-bit image as an example, storing the raw window requires 12x9x8 = 864 bits, while after the projection histograms are calculated, only 12x8 + 9x8 = 168 bits of storage are needed.
In addition, in the third section S3, the motion information estimation process further includes determining motion information of each rectangular window, where the motion information includes a speed and a direction of displacement of the object.
Specifically, the method includes the following steps: calculating, with a sum of absolute differences (SAD) algorithm, the SAD value at every candidate pixel shift within the search range; selecting the minimum SAD value from all the SAD values, and determining the speed of each motion window according to the minimum SAD value; wherein the speed is determined by at least one of the number of horizontally displaced pixels or the number of vertically displaced pixels within each motion window. The specific SAD algorithm belongs to the prior art, and reference may be made to the calculation process in the prior art, which is not described here.
The Search Range may be preset in relation to the hardware device. For example, if the time interval between the first frame image and the second frame image is 1/30 second, all processing must be completed within that 1/30 second interval, and at least two roughly estimated rectangular windows are used as the search range. In this embodiment, fig. 7 shows a search range of 7 rectangular windows, but the range may also be 3, 4 or another number of rectangular windows, which is not limited in this embodiment.
Further, a fixed range is traversed and searched in the horizontal direction, with pixels as the displacement unit; the projection histogram in the x direction and the SAD algorithm are used to calculate the horizontal pixel displacement of a single rectangular window of the motion area, and the x-direction displacement of the rectangular window is the shift, in pixels, at which the SAD value is minimal. A schematic diagram of scanning for the minimum SAD value using the SAD algorithm is shown in FIG. 7. Typically, the search range includes 3 rectangular windows extending horizontally in the positive x-axis direction and 3 rectangular windows extending in the negative x-axis direction, for a total of 6 rectangular windows. Referring to fig. 8, each rectangular window includes 4 horizontal pixel values; when searching for the minimum SAD value, the first rectangular window is shifted one pixel at a time until the position is found where its average pixel values are the same as, or closest to, those of the previous frame.
Similarly, in the vertical y-direction, the y-direction displacement of the rectangular window is similar to the x-direction, i.e., the projected histogram in the x-direction is replaced with the projected histogram in the y-direction, and a search is performed in the up-down range to determine the number of pixels of the minimum SAD value.
Alternatively, the search range may be preset and fixed. The group of pixels in the horizontal direction shown in fig. 7 has a range of 7 pixels.
The SAD algorithm is as follows:
In the horizontal direction:
Formula (7): SAD_X(d) = Σ_{j=1..m} |Pre ProjHist_X_{j+d} − Curr ProjHist_X_j|, where d is the candidate shift (in pixels) within the search range.
In the vertical direction:
Formula (8): SAD_Y(d) = Σ_{j=1..n} |Pre ProjHist_Y_{j+d} − Curr ProjHist_Y_j|.
Wherein Pre ProjHist_X_j represents an average pixel value in the horizontal direction of the previous frame image, and Pre ProjHist_Y_j represents an average pixel value in the vertical direction of the previous frame image; Curr ProjHist_X_j represents an average pixel value in the horizontal direction of the current frame image, and Curr ProjHist_Y_j represents an average pixel value in the vertical direction of the current frame image.
In the present embodiment, as shown in fig. 8, taking 4 average pixel values in the horizontal direction within one motion window of the current frame image as an example, it is assumed that the average pixel values in the horizontal direction of the previous frame (first frame) image (Pre ProjHist_X_j) are {84, 74, 64, 54}, and the average pixel values in the horizontal direction of the current frame (second frame) image (Curr ProjHist_X_j) are {54, 64, 74, 84}.
The calculated respective SAD values are:
SAD1=|84-54|+|74-64|+|64-74|+|54-84|=80,
SAD2=|74-54|+|64-64|+|54-74|+|64-84|=60,
SAD3=|64-54|+|54-64|+|64-74|+|74-84|=40,
SAD4=|54-54|+|64-64|+|74-74|+|84-84|=0。
Comparing SAD1 to SAD4, the minimum SAD value is SAD4 = 0, obtained at a shift of 3 pixels; that is, compared with the previous frame image, the horizontal displacement of the current motion window is 3 pixels, and the direction is horizontally to the right (4 candidate positions were compared in total).
Similarly, all SAD values in the vertical direction of each motion window are calculated according to the above formula (8), and the displacement and direction in the vertical direction are determined based on the smallest SAD value among all SAD values; and finally, according to the calculated displacement pixels and directions of the moving window in the horizontal direction and the vertical direction, the actual moving speed and direction of the moving window are determined in a synthesized mode.
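The SAD search can be sketched as follows. This assumes an extended previous-frame histogram covering the search range (the shifted segments in SAD2 and SAD3 above draw on neighboring-window values); the names and the extended-histogram layout are ours:

```python
def sad(prev_seg, curr):
    """Sum of absolute differences between two equal-length histogram segments."""
    return sum(abs(p - c) for p, c in zip(prev_seg, curr))

def best_shift(prev_ext, curr, max_shift):
    """Slide the current-frame histogram `curr` (length m) over the extended
    previous-frame histogram `prev_ext` (length >= m + max_shift) and return
    (shift, min_sad): the displacement in pixels with the minimum SAD value."""
    m = len(curr)
    scores = [(sad(prev_ext[d:d + m], curr), d) for d in range(max_shift + 1)]
    min_sad, shift = min(scores)   # ties resolve to the smallest shift
    return shift, min_sad
```

With the extended previous-frame histogram [84, 74, 64, 54, 64, 74, 84] and the current histogram [54, 64, 74, 84], the four candidate shifts reproduce the document's SAD1–SAD4 values (80, 60, 40, 0), so the best shift is 3 pixels.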
Specifically, the process of calculating the horizontal displacement and the vertical displacement of each moving window by using the formula (7) and the formula (8) may refer to the above-mentioned process of calculating the 4 average pixel values of one moving window in the horizontal direction, which is not described in detail in this embodiment.
It should be noted that according to this method, m×n rectangular windows may also be traversed, and the speed of each rectangular window is calculated, where for a non-moving window, its horizontal displacement or vertical displacement may be 0.
In this embodiment, the displacement and direction of the motion window are calculated from the SAD algorithm and the average pixel values of each rectangular window; a motion detection algorithm that requires no feature-point detection and has a low operation amount and low memory-access amount is adopted, so that compared with existing feature-point sampling algorithms, the operation amount is reduced and the operation efficiency is improved.
The motion of the object may also be represented by a "displacement amount", where the displacement amount includes a horizontal displacement amount and a vertical displacement amount. In this embodiment, the horizontal displacement amount of the motion window is 3 pixels; similarly, the vertical displacement amount represents the number of pixels that the motion window moves in the vertical direction.
Optionally, the embodiment further provides a flat area determining method, which can identify a low texture area of the image, and improve accuracy of motion information by adjusting a correlation coefficient, such as a confidence coefficient, of the low texture area.
Specifically, as shown in fig. 9, after calculating the correlation coefficient of each rectangular window of the second frame image in the above step 204, the method further includes:
step 301: and calculating the gradient value of each second rectangular window.
Wherein the gradient value includes the sum of a horizontal gradient value and a vertical gradient value within each second rectangular window; the horizontal gradient value is the sum of the absolute differences between every two adjacent values among the m horizontal-direction average pixel values, and the vertical gradient value is the sum of the absolute differences between every two adjacent values among the n vertical-direction average pixel values.
Specifically, the gradient value is calculated using the following formulas (9) to (11):
Formula (9): Gradient_X = Σ_{j=2..m} |Curr ProjHist_X_j − Curr ProjHist_X_{j−1}|

Formula (10): Gradient_Y = Σ_{j=2..n} |Curr ProjHist_Y_j − Curr ProjHist_Y_{j−1}|

Formula (11): Gradient = Gradient_X + Gradient_Y.
Wherein Gradient_X represents the sum of the absolute differences between every two adjacent values among the horizontal-direction average pixel values, Curr ProjHist_X_j represents the j-th horizontal-direction average pixel value of the current frame, and Curr ProjHist_X_{j−1} represents the (j−1)-th horizontal-direction average pixel value of the current frame; Gradient_Y represents the sum of the absolute differences between every two adjacent values among the vertical-direction average pixel values, Curr ProjHist_Y_j represents the j-th vertical-direction average pixel value of the current frame, and Curr ProjHist_Y_{j−1} represents the (j−1)-th vertical-direction average pixel value of the current frame; Gradient represents the total gradient value.
In one embodiment, taking the average pixel values in the 4 horizontal directions and the average pixel values in the 3 vertical directions shown in Table 1 as examples,
Curr ProjHist_X: 54, 64, 74, 84
Curr ProjHist_Y: 68, 69, 70
TABLE 1
The process of calculating the gradient value includes:
Gradient_X=|64-54|+|74-64|+|84-74|=30,
Gradient_Y=|69-68|+|70-69|=2,
Gradient=Gradient_X+Gradient_Y=30+2=32。
therefore, the gradient value of a second rectangular window in the image obtained after calculation is 32.
Similarly, the gradient values of other rectangular windows can also be calculated by the above formulas (9) to (11), and this embodiment will not be described in detail here.
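The gradient computation of formulas (9) to (11) can be sketched as below (the function name is ours); it reproduces the Table 1 example, where Gradient_X = 30, Gradient_Y = 2 and the total gradient is 32:

```python
def gradient_value(proj_x, proj_y):
    """Total gradient of a window (formulas (9)-(11)): the sum of absolute
    differences between adjacent values of the x- and y-projection
    histograms. Low totals indicate flat (low-texture) regions."""
    gx = sum(abs(a - b) for a, b in zip(proj_x[1:], proj_x))  # formula (9)
    gy = sum(abs(a - b) for a, b in zip(proj_y[1:], proj_y))  # formula (10)
    return gx + gy                                            # formula (11)
```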
Step 302: and determining the probability value of each second rectangular window according to the gradient value of each second rectangular window.
Wherein there is a correspondence between each gradient value and the probability of a flat region (ratio_sarea). As shown in FIG. 10, gradient values between 30 and 40, inclusive, correspond linearly to probabilities from 1 to 0. As can be seen from fig. 10, the smaller the gradient value, the larger the probability that the region is flat; conversely, the larger the gradient value, the smaller the corresponding probability value.
In this embodiment, when the gradient value is 32, the corresponding probability value is 0.8.
It can be understood that the present embodiment only exemplifies the probability value of the low texture region converted from the total gradient value through a nonlinear function in fig. 10, and may also include other corresponding relations, and the present embodiment does not limit the expression form of the corresponding relation between the gradient value and the probability value.
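One possible shape of the Fig. 10 mapping, consistent with the stated values (a gradient of 32 maps to a probability of 0.8, and gradients between 30 and 40 span probabilities 1 to 0), is a piecewise-linear ramp; the thresholds `lo` and `hi` and the function name are our assumptions, and as noted above other (e.g. nonlinear) correspondences are possible:

```python
def flat_probability(gradient, lo=30, hi=40):
    """Map a window's total gradient to a flat-region probability,
    assuming the piecewise-linear shape suggested by Fig. 10:
    gradients <= lo map to 1.0, >= hi map to 0.0, linear in between."""
    if gradient <= lo:
        return 1.0
    if gradient >= hi:
        return 0.0
    return (hi - gradient) / (hi - lo)  # e.g. gradient 32 -> 0.8
```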
Step 303: and judging whether the probability value is larger than a second threshold value.
Alternatively, the second threshold may be a number or a range of values. For example, the second threshold is 0.5, or 0.4 to 0.6.
Step 304: and if the probability value is determined to be larger than a second threshold value, reducing the correlation coefficient of the second rectangular window corresponding to the probability value.
In this embodiment, the calculated probability value 0.8 is greater than the second threshold 0.5; that is, since the probability value is greater than the second threshold, the confidence of the region is reduced. Specifically, the correlation coefficient of the region is reduced, where the correlation coefficient is the confidence of the second rectangular window.
Conversely, if the probability value is smaller than or equal to the second threshold, the rectangular window corresponding to the probability value is not regarded as a low-texture region. According to the above steps 301 to 304, all the second rectangular windows of the second frame image are traversed, whether each second rectangular window is a low-texture region is determined, and the confidence of all the determined low-texture regions is reduced, thereby improving the accuracy of the overall motion information.
In this embodiment, since the accuracy of the motion information of the low texture regions is low, all the low texture regions of the current frame image can be determined by calculating the gradient value, and the confidence of the low texture regions is reduced, thereby improving the accuracy of the overall motion information.
Optionally, the method provided in this embodiment further includes the "multidimensional filtering process" in S3, a multidimensional filtering algorithm applicable to noise suppression in high-resolution images: if the displacement calculated for a window differs greatly from that of the adjacent area, or the displacement at the same position differs greatly between successive images, the speed of the moving object or the frame of the moving-object window is prone to an unstable jumping phenomenon; the multidimensional filtering process is adopted to overcome this phenomenon.
Further, the flow of the "multidimensional filtering process" may include the following steps:
1. If the displacement amounts of the target window in the horizontal direction x and the vertical direction y differ greatly from the average displacement amounts of the multiple adjacent windows (for example, the 8 surrounding windows), the average displacement amount of the neighboring windows can be calculated to replace the current displacement amount, so as to achieve a denoising effect.
Specifically, the process is similar to the process of denoising with a two-dimensional low-pass filter (lowpass filter) in the above step S1, and the specific calculation process is referred to above, which is not repeated here for example.
2. The method further comprises the steps of: and performing spatial domain filtering processing on the MxN second rectangular windows in the second frame image to obtain M 'x N' third rectangular windows and the speed and the confidence of each third rectangular window, wherein M 'and N' are positive integers, M 'is smaller than M, and N' is smaller than N.
Specifically, the displacements and confidences of every 2x2 rectangular windows are processed by a maximum likelihood estimation (Maximum Likelihood Estimation) method to generate a displacement and confidence with higher spatial domain stability. After this step, the number of original rectangular windows is reduced; for example, an original grid of 32x32 rectangular windows is reduced to 16x16 after processing. In maximum likelihood estimation, known sample results are combined with a model in statistics to derive a reliable result; in this embodiment a weighted sum model (weight sum model) is used, in which the confidence serves as the weight value to estimate a more stable displacement, and the confidences are likewise fused into a more stable confidence. The calculation for the horizontal direction x is shown in the following formula (12) and formula (13).
Formula (12): V_x = Σ_{ij} ((W_x)_{ij} × (V_x)_{ij}) / Σ_{ij} (W_x)_{ij}

Formula (13): Q_x = (Σ_{ij} (W_x)_{ij}) / K, where K is the number of merged windows (K = 4 for a 2×2 group).
Wherein (W_x)_{ij} = (Q_x)_{ij}; (W_x)_{ij} represents the confidence of the rectangular window in the i-th row and j-th column for the horizontal x-axis; (V_x)_{ij} represents the displacement of the rectangular window in the i-th row and j-th column along the horizontal x-axis, the displacement being the number of pixels moved; V_x represents the synthesized target displacement on the horizontal x-axis, and Q_x represents the target confidence corresponding to V_x.
Specifically, as shown in fig. 11, taking 2×2 selected second rectangular windows as an example (M=2, N=2), suppose the displacements calculated for the four windows by the method of the above embodiment are {5, 7, 6, 20} and the corresponding confidences are {0.8, 0.8, 0.9, 0.2}. Spatial domain filtering is performed on these 4 second rectangular windows to generate 1 third rectangular window (M′=1, N′=1), whose displacement and confidence are calculated as:

V_x = (0.8×5 + 0.8×7 + 0.9×6 + 0.2×20) / (0.8 + 0.8 + 0.9 + 0.2) = 19 / 2.7 ≈ 7.0

Q_x = (0.8 + 0.8 + 0.9 + 0.2) / 4 = 0.675

Therefore, the calculated displacement and confidence of the third rectangular window are approximately [7.0, 0.675]; they replace the displacements and confidences of the original 2×2 second rectangular windows, so that the number of second rectangular windows is reduced and the stability of the spatial motion information is improved. Note how the low-confidence outlier displacement (20, confidence 0.2) contributes little to the fused result.
Similarly, the following equations (14) and (15) may also be employed in the vertical y-axis to calculate the displacement amount and the confidence of the third rectangular window.
Formula (14): V_y = Σ_{ij} ((W_y)_{ij} × (V_y)_{ij}) / Σ_{ij} (W_y)_{ij}

Formula (15): Q_y = (Σ_{ij} (W_y)_{ij}) / K.
Wherein (W_y)_{ij} = (Q_y)_{ij}; (W_y)_{ij} represents the confidence of the rectangular window in the i-th row and j-th column for the vertical y-axis; (V_y)_{ij} represents the displacement of the rectangular window in the i-th row and j-th column along the vertical y-axis, the displacement being the number of pixels moved; V_y represents the synthesized target displacement on the vertical y-axis, and Q_y represents the target confidence corresponding to V_y.
As shown in fig. 12, in the spatial domain filtering process, for example, a total of 32×32 second rectangular windows are included, and weighted averaging is performed in units of 4 (2×2) rectangular windows: first the 4 second rectangular windows V_00, V_01, V_10, V_11 are spatially filtered to obtain one third rectangular window V; then the next 4 second rectangular windows, e.g. V_02, V_03, V_12, V_13, are weighted and combined to obtain another third rectangular window; and so on until all the second rectangular windows are traversed, obtaining 16×16 third rectangular windows and improving the stability of the second image.
It should be noted that, in this embodiment, only 4 second rectangular windows are used for merging, and more or fewer rectangular window combinations may be included, for example, 8 or 2 second rectangular windows are merged into 1 third rectangular window, which is not limited in this embodiment.
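The merge step can be sketched as below. It assumes the weighted-sum form reconstructed for formulas (12) and (13) — a confidence-weighted mean for the displacement and an arithmetic mean of the confidences — which is our reading of the weight-sum model described above; the function name is ours:

```python
def merge_windows(displacements, confidences):
    """Fuse a group of rectangular windows (e.g. a 2x2 block) into one:
    displacement = confidence-weighted mean (assumed formula (12)),
    confidence   = mean of the group's confidences (assumed formula (13))."""
    w = sum(confidences)
    v = sum(c * d for c, d in zip(confidences, displacements)) / w
    q = w / len(confidences)
    return v, q
```

On the fig. 11 example ({5, 7, 6, 20} with confidences {0.8, 0.8, 0.9, 0.2}) this yields a fused confidence of 0.675 and a fused displacement of 19/2.7 ≈ 7.0, with the low-confidence outlier down-weighted.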
3. In the multidimensional filtering processing process provided in this embodiment, time domain filtering is also included. As shown in fig. 12, may be implemented by a time domain filter.
Specifically, the displacement and confidence of the window at the same position as the window at the previous frame in the current frame are averaged to generate a displacement and confidence with higher time domain stability, and the moving average may be a weighted moving average or an exponential moving average.
For example, the displacement of the target window in the current frame (V_t) is weighted-averaged with the displacement of the window at the same position in the previous frame (V_{t−1}) to generate a moving average (SV_t).
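One possible form of this temporal filter is an exponential moving average; the smoothing weight `alpha` and the function name are our assumptions, not values from the patent:

```python
def temporal_filter(v_curr, sv_prev, alpha=0.5):
    """Exponential moving average of a window's displacement across frames:
    SV_t = alpha * V_t + (1 - alpha) * SV_{t-1}.
    Smaller alpha gives stronger temporal smoothing."""
    return alpha * v_curr + (1 - alpha) * sv_prev
```

The same filter applies to the confidence values, yielding displacement and confidence with higher temporal stability.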
The image processing method provided by the embodiment comprises the following beneficial effects:
First, the embodiment of the application provides a motion detection method using average pixel values as a projection histogram, which requires little memory space and has low computational complexity compared with the feature point detection methods in the prior art, so that the cost and the power consumption are significantly reduced.
Second, the embodiment of the application provides a motion detection algorithm with low memory access and high stability, which includes using the projection histogram to detect low-texture regions and using multidimensional filtering to improve the stability of the motion information, and can provide the applied product with functional services based on information such as motion flags, direction, intensity and position.
Third, the method provided by the embodiment of the application solves the problem of instability caused by noise in motion direction and speed detection at high resolution, including low-pass-filtering noise reduction on the image and multidimensional filtering of the motion information; these processing methods can be implemented using simple multipliers, adders and shift operators, so that the cost and the power consumption are significantly reduced.
Referring to fig. 13, an image processing apparatus 130 is provided for implementing the image processing method in the foregoing embodiment.
As shown in fig. 13, the apparatus 130 may include an acquisition unit 1301 and a processing unit 1302, and in addition, the apparatus 130 may further include more or fewer components, such as a transmission unit, a storage unit, and the like, which is not limited in this application.
An acquisition unit 1301 for acquiring a first frame image and a second frame image of a moving object photographed.
A processing unit 1302, configured to divide the first frame image into MxN first rectangular windows, divide the second frame image into MxN second rectangular windows, and calculate a first average pixel value of each first rectangular window and a second average pixel value of each second rectangular window, where each of the first average pixel value and the second average pixel value includes m average pixel values in a horizontal direction and n average pixel values in a vertical direction; at least one motion window of the MxN second rectangular windows is determined from the first average pixel value of the MxN first rectangular windows and the second average pixel value of the MxN second rectangular windows.
Wherein M represents the number of rectangular windows in the horizontal direction, N represents the number of rectangular windows in the vertical direction, M and N are positive integers, each of the MxN first rectangular windows and the MxN second rectangular windows includes M times N pixels, M is the number of pixels in the horizontal direction, N is the number of pixels in the vertical direction, and M and N are positive integers.
The capture of the first frame image and the second frame image may be implemented by hardware, such as a camera or a camera device.
Optionally, in a specific implementation manner of this embodiment, the processing unit 1302 is specifically configured to: determining a first pixel area and a second pixel area of each first rectangular window, and determining a third pixel area and a fourth pixel area of each second rectangular window; calculating m average pixel values in the horizontal direction in a first pixel region, and obtaining m average pixel values in the horizontal direction in the first average pixel value; calculating average pixel values of n vertical directions in a second pixel region to obtain average pixel values of n vertical directions in the first average pixel value; calculating m average pixel values in the horizontal direction in the third pixel region to obtain m average pixel values in the horizontal direction in the second average pixel value; and calculating average pixel values in n vertical directions in the fourth pixel region, and obtaining average pixel values in n vertical directions in the second average pixel value.
Wherein the first pixel region includes the each first rectangular window and at least one first rectangular window horizontally adjacent to the each first rectangular window, the second pixel region includes the each first rectangular window and at least one first rectangular window vertically adjacent to the each first rectangular window, the third pixel region includes the each second rectangular window and at least one second rectangular window horizontally adjacent to the each second rectangular window, and the fourth pixel region includes the each second rectangular window and at least one second rectangular window vertically adjacent to the each second rectangular window.
Optionally, in another specific implementation manner of this embodiment, the processing unit 1302 is specifically configured to: calculating a correlation coefficient of a first average pixel value of each first rectangular window and a second average pixel value of a second rectangular window corresponding to each first rectangular window; judging whether the correlation coefficient is smaller than a first threshold value or not; and if so, determining the second rectangular window corresponding to the correlation coefficient as a motion window.
Optionally, in a further specific implementation manner of this embodiment, the processing unit 1302 is further configured to: calculating SAD values of all adjacent two pixels in the search range by using the absolute error and SAD algorithm; selecting a minimum SAD value from all the SAD values, and determining the speed of each motion window according to the minimum SAD value; wherein the velocity is determined by at least one of the number of horizontally displaced pixels or the number of vertically displaced pixels within each of the motion windows.
Optionally, in a further specific implementation manner of this embodiment, the processing unit 1302 is further configured to: after calculating the correlation coefficient, calculating a gradient value of each second rectangular window, wherein the gradient value comprises a sum of a horizontal gradient value and a vertical gradient value in each second rectangular window, the horizontal gradient value is a sum of differences between every two adjacent pixel values in m horizontal direction pixel values, and the vertical gradient value is a sum of differences between every two adjacent pixel values in n vertical direction pixel values; determining a probability value of each second rectangular window according to the gradient value of each second rectangular window; judging whether the probability value is larger than a second threshold value or not; and if so, reducing the correlation coefficient of the second rectangular window corresponding to the probability value.
Optionally, in a further specific implementation manner of this embodiment, the processing unit 1302 is further configured to: and performing spatial domain filtering processing on the MxN second rectangular windows in the second frame image to obtain M 'x N' third rectangular windows and the speed and the confidence of each third rectangular window, wherein M 'and N' are positive integers, M 'is smaller than M, and N' is smaller than N.
Optionally, in a further specific implementation manner of this embodiment, the processing unit 1302 is further configured to: time domain filtering is performed on each rectangular window of the current frame image and each rectangular window of the same position of the previous frame image, so as to generate a displacement amount and a confidence degree with higher time domain stability, wherein the time domain filtering is performed on each rectangular window specifically includes performing moving average, such as weighted moving average or exponential moving average, which is not limited in the embodiment.
It will be appreciated that the elements of the above apparatus embodiments may be implemented in software, hardware or a combination of software and hardware. The software may run on a computer or processor.
Referring to fig. 14, the embodiment of the present application further provides a communication apparatus, which may be a hardware device, for implementing part or all of the steps of the image processing method described in the foregoing embodiment. Alternatively, the communication device may be replaced by other devices having a camera function. Optionally, the hardware device is a terminal.
As shown in fig. 14, the hardware device 140 includes a processor 1401, a memory 1402, and an image collector 1403. In addition, the hardware device may include more or fewer components, combine certain components, or arrange the components differently, which is not limited in this application.
The processor 1401 is operable to implement the overall method flow of the first part S1, the second part S2, and the third part S3 in fig. 1 of the embodiments of the present application. The memory 1402 may be a random access memory (RAM), configured to store the projection histogram of the second frame image calculated in S2 and the motion information of the second frame image calculated in S3, and also configured to store the projection histogram and the motion information of the first frame image, so that this information can be used in the "motion information estimation" of S3. The image collector 1403 is configured to capture images of an object, for example, the first frame image and the second frame image.
Further, the processor 1401 is a control center of the hardware device, connects various parts of the entire hardware device using various interfaces and lines, and performs various functions of the hardware device by running or executing software programs and/or modules stored in the memory 1402, and calling data stored in the memory.
The processor 1401 may be composed of integrated circuits (ICs), for example, a single packaged IC, or multiple packaged ICs with the same or different functions connected together. For example, the processor may include only a CPU, or may be a combination of a GPU, a digital signal processor (DSP), and a control chip in the transceiver module.
The memory 1402 is used for storing program codes for executing the technical solutions of the present application, and is controlled to be executed by the processor 1401. The processor 1401 is configured to execute program codes stored in the memory 1402, and implement the image processing method in the above embodiment.
Further, the memory 1402 may be, but is not limited to, a read-only memory (ROM) or another type of static storage device that can store static information and instructions, a random access memory (RAM) or another type of dynamic storage device that can store information and instructions, an electrically erasable programmable read-only memory (EEPROM), a compact disc read-only memory (CD-ROM) or other optical disc storage (including compact disc, laser disc, optical disc, digital versatile disc, Blu-ray disc, and the like), magnetic disk storage media or other magnetic storage devices, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer. The memory may be separate or integrated with the processor.
The image collector 1403 may include a camera or other unit or module having a photographing function.
Illustratively, the hardware device is the image processing apparatus provided in the foregoing embodiment. Further, in the embodiment of the image processing apparatus shown in fig. 13 of the present application, the function to be implemented by the obtaining unit 1301 may be implemented by the processor 1401 of the device by controlling the image collector 1403, and the functions to be performed by the processing unit 1302 may be performed by the processor 1401 of the device.
In a specific implementation, the hardware device may be a terminal device. Further, the terminal device may also be referred to as a terminal, user equipment (UE), a mobile station (MS), a mobile terminal (MT), and so on. The terminal device may be a mobile phone, a tablet (Pad), a computer with a wireless transceiver function, a virtual reality (VR) terminal device, an augmented reality (AR) terminal device, a wireless terminal in industrial control, a wireless terminal in self-driving, a wireless terminal in remote medical surgery, a wireless terminal in a smart grid, a wireless terminal in transportation safety, a wireless terminal in a smart city, a wireless terminal in a smart home, a smart meter with a wireless communication function, a smart water meter, an environmental sensor, a device tag, a positioning tag, and the like.
The terminal device connects to a network device wirelessly, and the network device may connect to a core network device wirelessly or by wire. The core network device and the radio access network device may be separate physical devices; alternatively, the functions of the core network device and the logical functions of the radio access network device may be integrated on the same physical device, or the functions of part of the core network device and part of the radio access network device may be integrated on one physical device. The terminal device may be fixed in position or movable.
The terminal device provided in this embodiment can obtain motion detection information with high accuracy and high stability at low hardware cost, and, in cooperation with other algorithm modules, can achieve the following:
1. combined with exposure control, the success rate (keeper rate) of motion snapshots can be improved;
2. combined with automatic focusing, the focusing sharpness of a moving object can be improved;
3. combined with multi-frame image superposition modules, the image alignment accuracy during superposition can be improved.
Further, the present application provides a computer storage medium, where the storage medium may store a program, and when the program is executed, some or all of the steps in the embodiments of the image processing method provided in the present application may be performed. The storage medium may be a magnetic disk, an optical disc, a read-only memory (ROM), a random access memory (RAM), or the like.
The above embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented in software, the embodiments may be implemented in whole or in part in the form of a computer program product.
The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the procedures or functions described in the embodiments of the present application are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus.
The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another computer-readable storage medium, for example, the computer instructions may be transmitted from a network node, computer, server, or data center to another site, computer, or server by wire or wirelessly.
The same or similar parts between the various embodiments in this specification may be referred to each other. In particular, the embodiment of the image processing apparatus is described relatively simply because it is substantially similar to the method embodiments; for related parts, reference may be made to the description in the method embodiments.
Furthermore, in the description of the present application, unless otherwise indicated, "a plurality" means two or more. In addition, to clearly describe the technical solutions of the embodiments of the present application, the words "first", "second", and the like are used in the embodiments of the present application to distinguish between identical or similar items having substantially the same function and effect. A person skilled in the art will appreciate that the words "first", "second", and the like do not limit the quantity or the execution order, and do not indicate a necessary difference.
The above-described embodiments of the present application are not intended to limit the scope of the present application.

Claims (12)

1. An image processing method, the method comprising:
acquiring a first frame image and a second frame image of a shooting moving object;
dividing the first frame image into MxN first rectangular windows, dividing the second frame image into MxN second rectangular windows, wherein M represents the number of rectangular windows in the horizontal direction, N represents the number of rectangular windows in the vertical direction, M and N are positive integers, each of the MxN first rectangular windows and the MxN second rectangular windows comprises m multiplied by n pixels, m is the number of pixels in the horizontal direction, n is the number of pixels in the vertical direction, and m and n are positive integers;
calculating a first average pixel value of each first rectangular window and a second average pixel value of each second rectangular window, each of the first average pixel value and the second average pixel value comprising m average pixel values in a horizontal direction and n average pixel values in a vertical direction;
determining at least one motion window of the MxN second rectangular windows from the first average pixel value of the MxN first rectangular windows and the second average pixel value of the MxN second rectangular windows, comprising: calculating a correlation coefficient of a first average pixel value of each first rectangular window and a second average pixel value of a second rectangular window corresponding to each first rectangular window; judging whether the correlation coefficient is smaller than a first threshold value or not; and if so, determining the second rectangular window corresponding to the correlation coefficient as a motion window, wherein the motion window is a region where the displacement of the second frame image relative to the image content in the first frame image changes.
2. The method of claim 1, wherein the calculating the first average pixel value for each first rectangular window and the second average pixel value for each second rectangular window comprises:
determining first and second pixel regions of each first rectangular window, and determining third and fourth pixel regions of each second rectangular window, the first pixel region including the each first rectangular window and at least one first rectangular window horizontally adjacent to the each first rectangular window, the second pixel region including the each first rectangular window and at least one first rectangular window vertically adjacent to the each first rectangular window, the third pixel region including the each second rectangular window and at least one second rectangular window horizontally adjacent to the each second rectangular window, the fourth pixel region including the each second rectangular window and at least one second rectangular window vertically adjacent to the each second rectangular window;
calculating m average pixel values in the horizontal direction in the first pixel region, to obtain the m average pixel values in the horizontal direction in the first average pixel value;
calculating n average pixel values in the vertical direction in the second pixel region, to obtain the n average pixel values in the vertical direction in the first average pixel value;
calculating m average pixel values in the horizontal direction in the third pixel region, to obtain the m average pixel values in the horizontal direction in the second average pixel value; and
calculating n average pixel values in the vertical direction in the fourth pixel region, to obtain the n average pixel values in the vertical direction in the second average pixel value.
3. The method according to claim 1 or 2, characterized in that the method further comprises:
calculating, by using a sum of absolute differences (SAD) algorithm, SAD values of every two adjacent pixels within a search range;
selecting a minimum SAD value from all the SAD values, and determining the speed of each motion window according to the minimum SAD value; wherein the velocity is determined by at least one of the number of horizontally displaced pixels or the number of vertically displaced pixels within each of the motion windows.
4. A method according to claim 3, wherein after calculating the correlation coefficient, the method further comprises:
calculating a gradient value of each second rectangular window, wherein the gradient value comprises a sum of a horizontal gradient value and a vertical gradient value in each second rectangular window, the horizontal gradient value is a sum of differences between every two adjacent pixel values in m horizontal direction pixel values, and the vertical gradient value is a sum of differences between every two adjacent pixel values in n vertical direction pixel values;
determining a probability value of each second rectangular window according to the gradient value of each second rectangular window;
Judging whether the probability value is larger than a second threshold value or not;
and if so, reducing the correlation coefficient of the second rectangular window corresponding to the probability value.
5. The method according to claim 4, wherein the method further comprises:
and performing spatial domain filtering processing on the MxN second rectangular windows in the second frame image to obtain M 'x N' third rectangular windows and the speed and the confidence of each third rectangular window, wherein M 'and N' are positive integers, M 'is smaller than M, and N' is smaller than N.
6. An image processing apparatus, characterized in that the apparatus comprises:
an acquisition unit configured to acquire a first frame image and a second frame image of a moving object;
a processing unit configured to divide the first frame image into MxN first rectangular windows, divide the second frame image into MxN second rectangular windows, calculate a first average pixel value of each first rectangular window and a second average pixel value of each second rectangular window, each of the first average pixel value and the second average pixel value including m average pixel values in a horizontal direction and n average pixel values in a vertical direction; determining at least one motion window of the MxN second rectangular windows from the first average pixel value of the MxN first rectangular windows and the second average pixel value of the MxN second rectangular windows; comprising the following steps: calculating a correlation coefficient of a first average pixel value of each first rectangular window and a second average pixel value of a second rectangular window corresponding to each first rectangular window; judging whether the correlation coefficient is smaller than a first threshold value or not; if so, determining the second rectangular window corresponding to the correlation coefficient as a motion window, wherein the motion window is a region in which the displacement of the second frame image relative to the image content in the first frame image is changed;
Wherein M represents the number of rectangular windows in the horizontal direction, N represents the number of rectangular windows in the vertical direction, M and N are positive integers, each of the MxN first rectangular windows and the MxN second rectangular windows includes m multiplied by n pixels, m is the number of pixels in the horizontal direction, n is the number of pixels in the vertical direction, and m and n are positive integers.
7. The apparatus according to claim 6, wherein
the processing unit is specifically configured to determine a first pixel area and a second pixel area of each first rectangular window, and determine a third pixel area and a fourth pixel area of each second rectangular window,
calculating m average pixel values in the horizontal direction in the first pixel region, to obtain the m average pixel values in the horizontal direction in the first average pixel value;
calculating n average pixel values in the vertical direction in the second pixel region, to obtain the n average pixel values in the vertical direction in the first average pixel value;
calculating m average pixel values in the horizontal direction in the third pixel region, to obtain the m average pixel values in the horizontal direction in the second average pixel value; and
calculating n average pixel values in the vertical direction in the fourth pixel region, to obtain the n average pixel values in the vertical direction in the second average pixel value;
Wherein the first pixel region includes the each first rectangular window and at least one first rectangular window horizontally adjacent to the each first rectangular window, the second pixel region includes the each first rectangular window and at least one first rectangular window vertically adjacent to the each first rectangular window, the third pixel region includes the each second rectangular window and at least one second rectangular window horizontally adjacent to the each second rectangular window, and the fourth pixel region includes the each second rectangular window and at least one second rectangular window vertically adjacent to the each second rectangular window.
8. The apparatus according to claim 6 or 7, wherein,
the processing unit is further configured to calculate, by using a sum of absolute differences (SAD) algorithm, SAD values of every two adjacent pixels within a search range; select a minimum SAD value from all the SAD values, and determine the speed of each motion window according to the minimum SAD value; wherein the speed is determined by at least one of the number of horizontally displaced pixels or the number of vertically displaced pixels within each of the motion windows.
9. The apparatus according to claim 8, wherein
The processing unit is further configured to calculate, after calculating the correlation coefficient, a gradient value of each second rectangular window, where the gradient value includes a sum of a horizontal gradient value and a vertical gradient value in each second rectangular window, the horizontal gradient value is a sum of differences between every two adjacent pixel values in m horizontal direction pixel values, and the vertical gradient value is a sum of differences between every two adjacent pixel values in n vertical direction pixel values; determining a probability value of each second rectangular window according to the gradient value of each second rectangular window; judging whether the probability value is larger than a second threshold value or not; and if so, reducing the correlation coefficient of the second rectangular window corresponding to the probability value.
10. The apparatus according to claim 9, wherein
the processing unit is further configured to perform spatial domain filtering processing on MxN second rectangular windows in the second frame image to obtain M 'x N' third rectangular windows, and a speed and a confidence coefficient of each third rectangular window, where M 'and N' are positive integers, and M 'is smaller than M, and N' is smaller than N.
11. A communications device comprising a processor coupled to a memory, wherein,
The memory is used for storing instructions;
the processor is configured to execute the instructions in the memory, causing the communication device to perform the method of any one of claims 1 to 5.
12. A computer-readable storage medium having instructions stored therein, characterized in that,
the instructions, when executed, implement the method of any one of claims 1 to 5.
CN201880088083.0A 2018-12-04 2018-12-04 Image processing method and device Active CN111684489B (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2018/119162 WO2020113419A1 (en) 2018-12-04 2018-12-04 Image processing method and device

Publications (2)

Publication Number Publication Date
CN111684489A CN111684489A (en) 2020-09-18
CN111684489B true CN111684489B (en) 2023-07-11

Family

ID=70974824

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201880088083.0A Active CN111684489B (en) 2018-12-04 2018-12-04 Image processing method and device

Country Status (2)

Country Link
CN (1) CN111684489B (en)
WO (1) WO2020113419A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112184722B (en) * 2020-09-15 2024-05-03 上海传英信息技术有限公司 Image processing method, terminal and computer storage medium

Family Cites Families (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP4626158B2 (en) * 2004-03-01 2011-02-02 ソニー株式会社 Motion vector detection apparatus, motion vector detection method, and computer program
CA2720871A1 (en) * 2008-04-03 2009-10-08 Kai Medical, Inc. Non-contact physiologic motion sensors and methods for use
JP5204035B2 (en) * 2009-05-19 2013-06-05 日立コンピュータ機器株式会社 Image processing apparatus, processing method, and processing program
US8908101B2 (en) * 2012-03-14 2014-12-09 Samsung Techwin Co., Ltd. Method and apparatus for reducing noise of video
CN102932582B (en) * 2012-10-26 2015-05-27 华为技术有限公司 Method and device for realizing motion detection
CN104253929B (en) * 2013-06-28 2019-03-12 广州华多网络科技有限公司 Vedio noise reduction method and its system
CN103729844B (en) * 2013-12-20 2017-01-11 乐视致新电子科技(天津)有限公司 Motionlessness detecting method and device
CN108513075B (en) * 2018-04-17 2020-09-11 烟台艾睿光电科技有限公司 Image processing method, device, equipment, medium and infrared imaging device
CN108737749A (en) * 2018-06-12 2018-11-02 烟台艾睿光电科技有限公司 A kind of method, apparatus and storage medium of determining blind element cluster pixel value
CN108921800B (en) * 2018-06-26 2021-01-22 成都信息工程大学 Non-local mean denoising method based on shape self-adaptive search window

Also Published As

Publication number Publication date
WO2020113419A1 (en) 2020-06-11
WO2020113419A9 (en) 2021-03-18
CN111684489A (en) 2020-09-18

Similar Documents

Publication Publication Date Title
US9600887B2 (en) Techniques for disparity estimation using camera arrays for high dynamic range imaging
US9514522B2 (en) Depth data processing and compression
KR101643672B1 (en) Optical flow tracking method and apparatus
US8897546B2 (en) Semi-global stereo correspondence processing with lossless image decomposition
US10769798B2 (en) Moving object detection apparatus, moving object detection method and program
CN107481271B (en) Stereo matching method, system and mobile terminal
CN105303514A (en) Image processing method and apparatus
Im et al. High quality structure from small motion for rolling shutter cameras
CN111462185A (en) Tracker assisted image capture
EP2494524A2 (en) Algorithms for estimating precise and relative object distances in a scene
CN111614867B (en) Video denoising method and device, mobile terminal and storage medium
WO2014051801A1 (en) Video-assisted target location
CN112529854B (en) Noise estimation method, device, storage medium and equipment
CN110310301B (en) Method and device for detecting target object
AU2013273843A1 (en) Motion blur compensation for depth from defocus
GB2536429A (en) Image noise reduction
CN109902675B (en) Object pose acquisition method and scene reconstruction method and device
GB2536430A (en) Image noise reduction
CN111684489B (en) Image processing method and device
CN111798422A (en) Checkerboard angular point identification method, device, equipment and storage medium
CN113192101A (en) Image processing method, image processing device, computer equipment and storage medium
JP2012073703A (en) Image blur amount calculation device and program for the same
Gutev et al. Exploiting depth information to increase object tracking robustness
CN112288817B (en) Three-dimensional reconstruction processing method and device based on image
CN116266356A (en) Panoramic video transition rendering method and device and computer equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant