CN112001362A - Image analysis method, image analysis device and image analysis system - Google Patents
- Publication number: CN112001362A
- Application number: CN202010955196.5A
- Authority
- CN
- China
- Prior art keywords
- image
- pixel
- analyzed
- foreground
- gray
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Withdrawn
Classifications
- G06F18/24 — Pattern recognition; classification techniques
- G06N3/045 — Neural network architectures; combinations of networks
- G06T5/90 — Dynamic range modification of images or parts thereof
- G06T7/136 — Segmentation; edge detection involving thresholding
- G06T7/194 — Segmentation; edge detection involving foreground-background segmentation
- G06F2218/08 — Pattern recognition adapted for signal processing; feature extraction
- G06T2207/20056 — Discrete and fast Fourier transform [DFT, FFT]
- G06T2207/20084 — Artificial neural networks [ANN]
Abstract
The invention relates to the technical field of image analysis, and discloses an image analysis method comprising the following steps: acquiring an image to be analyzed and converting it into a gray image by a weighted average method; performing binarization processing on the gray image by a local adaptive binarization method to obtain a binarized image to be analyzed; segmenting the foreground and background of the binarized image by a maximum inter-class variance method based on neighborhood information; detecting foreground entities in the segmented image by an image detection algorithm based on the image background; extracting image features by a feature extraction algorithm based on the image signal spectrum to obtain image signal features; and classifying the image by taking the image signal features as input to a multilayer neural network. The invention also provides an image analysis device and an image analysis system, thereby realizing image analysis.
Description
Technical Field
The present invention relates to the field of image analysis technologies, and in particular, to an image analysis method, an image analysis apparatus, and an image analysis system.
Background
With the development of the internet, users upload ever more image data, and how to analyze images on the internet has become a hot topic in current research.
Existing image detection algorithms mainly include the background difference method, the inter-frame difference method and the optical flow method, but most of them are affected by the external environment, noise and other factors, which degrades detection. Meanwhile, among existing image segmentation algorithms, the maximum inter-class variance method mainly counts and summarizes the feature information of the image gray histogram and is suited to images whose gray histogram is unimodal or bimodal; when the gray histogram distribution is strongly skewed, no suitable threshold can be obtained, and the segmentation result is unsatisfactory.
In view of this, how to effectively detect an image, and perform segmentation, identification and analysis on the detected image becomes an urgent problem to be solved by those skilled in the art.
Disclosure of Invention
The invention provides an image analysis method in which the image is segmented by a maximum inter-class variance method based on neighborhood information, an improved image detection algorithm is used to detect the image target, image features are extracted by a feature extraction algorithm based on the image signal spectrum, and a multilayer neural network classifies the images.
In order to achieve the above object, the present invention provides an image analysis method, including:
acquiring an image to be analyzed, and converting the image to be analyzed into a gray image by using a weighted average method;
carrying out binarization processing on the gray level image by using a local self-adaptive binarization method to obtain a binarized image to be analyzed;
segmenting the foreground and the background of the binary image by using a maximum inter-class variance method based on neighborhood information;
detecting foreground entities in the segmented images by using an image detection algorithm based on an image background;
extracting image features by using a feature extraction algorithm based on an image signal frequency spectrum to obtain image signal features;
and taking the image signal characteristics as the input of the multilayer neural network to classify the images.
Optionally, the converting the image to be analyzed into a gray-scale map by using a weighted average method includes:
the weighted average formula is:
Gray(i,j) = 0.299*R(i,j) + 0.587*G(i,j) + 0.114*B(i,j)
wherein:
Gray(i, j) is the gray value of the pixel (i, j);
R(i, j), G(i, j), B(i, j) are the pixel values of pixel (i, j) on the R, G and B color channels, respectively.
Optionally, the binarizing the gray scale map by using a local adaptive binarization method includes:
the method comprises the following steps of utilizing a local self-adaptive binarization method to carry out binarization processing on a gray map, taking a pixel set reaching a self-adaptive threshold value as an image effective pixel, taking a pixel set not reaching the self-adaptive threshold value as an image invalid pixel, and simultaneously taking the image effective pixel as a binarization image to be analyzed, wherein the calculation formula of the local self-adaptive threshold value is as follows:
T=a*E+b*P+c*Q
wherein:
e represents the pixel average;
q is the root mean square value between pixels;
p is the square of the difference between pixels;
a, b and c are free parameters and are numbers between (0 and 1), and a + b + c is equal to 1.
Optionally, the segmenting the foreground and the background of the image by using the maximum inter-class variance method based on the neighborhood information includes:
1) dividing the gray level of the image and the neighborhood average gray level of the pixel into L levels according to the gray level value of each pixel;
2) calculating the neighborhood average gray g(x, y) of each pixel over a k×k window:
g(x, y) = (1/k²) * Σ_m Σ_n f(x+m, y+n), m, n ∈ [−(k−1)/2, (k−1)/2]
wherein:
f(x, y) represents the gray value at pixel (x, y);
3) representing each pixel of the binary image by the two-dimensional pair (i, j), where i = f(x, y) and j = g(x, y); the number of occurrences of the pair (i, j) is f_ij, and the probability of occurrence of (i, j) in the whole binary image is P_ij:
P_ij = f_ij / (M×N)
wherein:
M×N is the size of the binary image;
4) dividing the two-dimensional histogram into 4 regions clockwise by a threshold point (s, t), and defining the two-dimensional inter-class variance matrix:
S_B = Σ_k ω_k * (μ_k − μ_r)(μ_k − μ_r)^T
wherein:
ω_k is the probability of occurrence of pixels in region k;
μ_k is the pixel mean vector of region k;
μ_r is the pixel mean vector of the whole image;
5) using the trace of the two-dimensional inter-class variance matrix as the measure of inter-class dispersion, the optimal segmentation threshold (s*, t*) of the neighborhood-information-based maximum inter-class variance method is obtained as:
tr S_B(s*, t*) = max{ tr S_B(s, t) }
And segmenting the foreground and the background of the binary image by utilizing the optimal segmentation threshold value.
Optionally, the detecting the foreground entity by using an image detection algorithm based on an image background includes:
1) selecting a proper image B_n(x, y) from the segmented image background as the background image;
2) taking the difference between the foreground image to be analyzed and the background image to obtain F_D(x, y):
F_D(x, y) = 1 if |F_n(x, y) − B_n(x, y)| > T, and 0 otherwise
wherein:
F_n(x, y) is the foreground image to be analyzed;
T is a threshold value;
3) computing the intersection of F_D(x, y) with the foreground image to be analyzed; the resulting intersection is the detected target contour:
F = F_D(x, y) ∩ F_n(x, y)
wherein:
F_n(x, y) is the foreground image to be analyzed.
Optionally, the extracting the image features by using a feature extraction algorithm based on an image signal spectrum includes:
scanning the detection target image according to the sequence of rows and columns to obtain two one-dimensional signals in the horizontal direction and the vertical direction of the detection target image;
dividing the total observation time into many small segments, and performing an FFT operation and cyclic-frequency sliding within each segment to obtain the SCF of each signal;
tapering the SCF of each signal by using a Hamming window;
forming the image signal feature vector from the Norm-1 energy and the standard deviation in the tapered SCF, wherein the feature vector of the n-th region of each signal is:
v_n = [E_in, σ_in]
wherein:
E_in and σ_in are respectively the energy and the standard deviation of the SCF of the i-th signal in the n-th region.
Optionally, the classifying the image by using the multi-layer neural network includes:
1) taking the image signal features as input to the multilayer neural network, wherein the encoder structure is derived from a 2D DPN: 2 convolutional layers extract a feature map before the 1st Max-Pooling, then 8 dual-path connection blocks extract depth features; after the encoder, 3 dual-path connection blocks connect to the decoder, with cross-layer (skip) connections between parts of the same scale;
2) in the decoder part, the feature map is processed by transposed convolution and fused with a dual-path connection block; finally a dropout layer (to stabilize training) and a candidate-region output layer are connected;
3) in the final output matrix, each 15-dimensional vector represents 3 prediction candidate frames, corresponding during training to 3 candidate regions of preset size that determine whether a target is present. The preset candidate regions have preset target frames of three sizes: 1×1, 2×2 and 3×3 pixels. The intersection ratio between each prediction candidate frame and the preset target frame is calculated: when the intersection ratio is less than 0.5, the image category label in the 3×3-pixel target frame is output; when it equals 0.5, the label in the 2×2-pixel frame is output; and when it is greater than 0.5, the label in the 1×1-pixel frame is output.
In order to achieve the above object, the present invention provides an image analysis apparatus, comprising:
a communication unit for receiving image information;
the image processing unit is used for carrying out conversion and detection processing on the image information;
and the image analysis unit is used for analyzing and processing the image information.
Further, to achieve the above object, the present invention also provides an image analysis system, comprising:
image acquisition means for receiving an image to be analyzed;
the image processor is used for converting an image to be analyzed into a gray image by using a weighted average method, performing binarization processing on the gray image by using a local self-adaptive binarization method, and performing foreground and background segmentation on the binarized image by using a maximum inter-class variance method based on neighborhood information to obtain a segmented image;
the image analysis device is used for detecting foreground entities in the segmented images by using an image detection algorithm based on an image background, extracting image features by using a feature extraction algorithm based on an image signal frequency spectrum to obtain image signal features, and classifying the images by using the image signal features as the input of the multilayer neural network.
Further, to achieve the above object, the present invention also provides a computer readable storage medium having stored thereon image analysis program instructions executable by one or more processors to implement the steps of the implementation method of image analysis as described above.
Compared with the prior art, the invention provides an image analysis method, which has the following advantages:
First, the conventional maximum inter-class variance method is easily disturbed by noise during segmentation, making a good segmentation result hard to obtain. The invention therefore provides a maximum inter-class variance method based on neighborhood information: by introducing the neighborhood gray information of each pixel into the algorithm, the original gray histogram is converted into a two-dimensional histogram. Whereas the traditional technique divides the gray histogram into two parts, in the disclosed algorithm one threshold point (s, t) divides the two-dimensional gray histogram into four parts, and the adaptive threshold is computed from pixel probability values, so that the segmentation threshold depends on the pixel distribution of each image and the anti-interference capability of the segmentation result is effectively enhanced.
Second, the invention provides a feature extraction algorithm based on the image signal spectrum, i.e. second-order statistical features are extracted with a spectral correlation function over the signal frequencies. Two one-dimensional signals are first obtained by arranging the pixels of each image row by row and column by column; the spectral correlation function of each signal is then calculated by accumulated Fourier transform. Compared with the traditional spectrum extraction process, the invention divides the total observation time into many small segments and applies an FFT operation and cyclic-frequency sliding within each segment, so that the SCF is divided more quickly into many small regions along the two frequency dimensions, and the statistics of each region can be used for classification and extraction. The extracted image signal features therefore contain more local signal information; because the correlation between local signals is small, the disappearance of some features under occlusion does not affect the detection of the others. The extracted features are robust to image transformations such as illumination, rotation and viewpoint changes, which effectively improves the accuracy of image analysis.
Drawings
Fig. 1 is a schematic flow chart of an image analysis method according to an embodiment of the present invention;
fig. 2 is a schematic structural diagram of an image analysis system according to an embodiment of the present invention;
the implementation, functional features and advantages of the objects of the present invention will be further explained with reference to the accompanying drawings.
Detailed Description
It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
The image is segmented by utilizing a maximum inter-class variance method based on neighborhood information, an improved image detection algorithm is provided for detecting an image target, image features are extracted by utilizing a feature extraction algorithm based on an image signal frequency spectrum, and a multilayer neural network is used for classifying the images. Fig. 1 is a schematic diagram illustrating an image analysis method according to an embodiment of the present invention.
In the present embodiment, the image analysis method includes:
and S1, obtaining an image to be analyzed, converting the image to be analyzed into a gray map by using a weighted average method, and performing binarization processing on the gray map by using a local adaptive binarization method to obtain a binarized image to be analyzed.
Firstly, the invention acquires the image to be analyzed and, according to the importance of luminance quantization, computes the pixel luminance values of the image based on a weighted average method to obtain its gray image, wherein the weighted average formula is:
Gray(i,j) = 0.299*R(i,j) + 0.587*G(i,j) + 0.114*B(i,j)
wherein:
Gray(i, j) is the gray value of the pixel (i, j);
R(i, j), G(i, j), B(i, j) are the pixel values of pixel (i, j) on the R, G and B color channels, respectively.
Furthermore, since the main problem of local binarization is unreasonable threshold selection, the invention uses a local adaptive binarization method to binarize the gray image: the set of pixels reaching the adaptive threshold is taken as the valid image pixels, the set of pixels not reaching it as the invalid image pixels, and the valid pixels form the binarized image to be analyzed. The calculation formula of the local adaptive threshold is:
T = a*E + b*P + c*Q
wherein:
E represents the average pixel value;
Q is the root-mean-square value between pixels;
P is the squared difference between pixels;
a, b and c are free parameters in the interval (0, 1) with a + b + c = 1.
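As an illustration only, the grayscale conversion and local adaptive threshold above can be sketched as follows. This is a minimal sketch under stated assumptions: the statistics E, P and Q are computed over the whole image (the text does not fix a local window), P is read as the pixel variance, Q as the root-mean-square pixel value, and all function names are hypothetical:

```python
import numpy as np

def to_gray(rgb):
    """Weighted-average grayscale: Gray = 0.299*R + 0.587*G + 0.114*B."""
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    return 0.299 * r + 0.587 * g + 0.114 * b

def adaptive_threshold(gray, a=0.6, b=0.2, c=0.2):
    """T = a*E + b*P + c*Q with a + b + c = 1 (interpretation assumed)."""
    e = gray.mean()                      # E: average pixel value
    p = gray.var()                       # P: squared deviation (assumption)
    q = np.sqrt(np.mean(gray ** 2))      # Q: root-mean-square pixel value
    return a * e + b * p + c * q

def binarize(gray, **kw):
    """Pixels reaching the threshold are valid (1), the rest invalid (0)."""
    return (gray >= adaptive_threshold(gray, **kw)).astype(np.uint8)
```

Note that with P taken as the variance, the three terms live on different scales, so the weights a, b, c must be tuned per image class; the patent leaves this interpretation open.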
And S2, segmenting the foreground and the background of the binary image by utilizing a maximum inter-class variance method based on neighborhood information to obtain an image after segmentation.
Further, the invention utilizes a maximum inter-class variance method based on neighborhood information to carry out foreground and background segmentation on the binary image, and the image segmentation process comprises the following steps:
1) dividing the gray level of the image and the neighborhood average gray level of the pixel into L levels according to the gray level value of each pixel;
2) calculating the neighborhood average gray g(x, y) of each pixel over a k×k window:
g(x, y) = (1/k²) * Σ_m Σ_n f(x+m, y+n), m, n ∈ [−(k−1)/2, (k−1)/2]
wherein:
f(x, y) represents the gray value at pixel (x, y);
3) representing each pixel of the binary image by the two-dimensional pair (i, j), where i = f(x, y) and j = g(x, y); the number of occurrences of the pair (i, j) is f_ij, and the probability of occurrence of (i, j) in the whole binary image is P_ij:
P_ij = f_ij / (M×N)
wherein:
M×N is the size of the binary image;
4) dividing the two-dimensional histogram into 4 regions clockwise by a threshold point (s, t), and defining the two-dimensional inter-class variance matrix:
S_B = Σ_k ω_k * (μ_k − μ_r)(μ_k − μ_r)^T
wherein:
ω_k is the probability of occurrence of pixels in region k;
μ_k is the pixel mean vector of region k;
μ_r is the pixel mean vector of the whole image;
5) using the trace of the two-dimensional inter-class variance matrix as the measure of inter-class dispersion, the optimal segmentation threshold (s*, t*) of the neighborhood-information-based maximum inter-class variance method is obtained as:
tr S_B(s*, t*) = max{ tr S_B(s, t) }
And segmenting the foreground and the background of the binary image by utilizing the optimal segmentation threshold value.
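The neighborhood-information segmentation of step S2 might be sketched roughly as below. This is not the patent's exact algorithm: the equations behind steps 2)-5) are reconstructed from the standard two-dimensional Otsu method, the search uses only the two diagonal regions of the four (a common simplification), and all names are illustrative:

```python
import numpy as np

def neighborhood_mean(img, k=3):
    """k x k neighborhood average gray of each pixel (edge-padded)."""
    pad = k // 2
    p = np.pad(img.astype(float), pad, mode="edge")
    out = np.zeros(img.shape, dtype=float)
    for dy in range(k):
        for dx in range(k):
            out += p[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (k * k)

def otsu_2d(img, L=16):
    """Search the threshold pair (s, t) on the 2-D histogram of
    (gray level, neighborhood mean) maximizing the between-class
    dispersion (trace of the reconstructed S_B over two regions)."""
    g = np.clip((img.astype(float) * L / 256).astype(int), 0, L - 1)
    h = np.clip((neighborhood_mean(img) * L / 256).astype(int), 0, L - 1)
    P = np.zeros((L, L))
    np.add.at(P, (g.ravel(), h.ravel()), 1.0)   # 2-D histogram f_ij
    P /= P.sum()                                # probabilities P_ij
    I = np.arange(L, dtype=float)
    mu_T = np.array([(P.sum(axis=1) * I).sum(), (P.sum(axis=0) * I).sum()])
    best, best_st = -1.0, (0, 0)
    for s in range(L - 1):
        for t in range(L - 1):
            block = P[: s + 1, : t + 1]
            w0 = block.sum()
            if w0 <= 0.0 or w0 >= 1.0:
                continue
            mu0 = np.array([(block.sum(axis=1) * I[: s + 1]).sum(),
                            (block.sum(axis=0) * I[: t + 1]).sum()]) / w0
            w1 = 1.0 - w0
            mu1 = (mu_T - w0 * mu0) / w1
            tr = (w0 * ((mu0 - mu_T) ** 2).sum()
                  + w1 * ((mu1 - mu_T) ** 2).sum())
            if tr > best:
                best, best_st = tr, (s, t)
    return best_st
```

On a clean two-level image the returned pair straddles the two clusters; the threshold thus adapts to the pixel distribution rather than to a fixed histogram shape.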
And S3, detecting foreground entities in the segmented images by using an image detection algorithm based on image backgrounds.
Further, the invention uses an image detection algorithm based on image background to detect the entity in the segmentation image, and the algorithm flow is as follows:
1) selecting a proper image B_n(x, y) from the segmented image background as the background image;
2) taking the difference between the foreground image to be analyzed and the background image to obtain F_D(x, y):
F_D(x, y) = 1 if |F_n(x, y) − B_n(x, y)| > T, and 0 otherwise
wherein:
F_n(x, y) is the foreground image to be analyzed;
T is a threshold value;
3) computing the intersection of F_D(x, y) with the foreground image to be analyzed; the resulting intersection is the detected target contour:
F = F_D(x, y) ∩ F_n(x, y)
wherein:
F_n(x, y) is the foreground image to be analyzed.
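A minimal sketch of the background-difference detection in step S3, assuming the frame and background are single-channel arrays and that any nonzero pixel of the (already binarized) frame counts as foreground; function names are illustrative:

```python
import numpy as np

def background_difference(frame, background, T=30):
    """F_D(x, y) = 1 where |F_n(x, y) - B_n(x, y)| > T, else 0."""
    diff = np.abs(frame.astype(int) - background.astype(int))
    return (diff > T).astype(np.uint8)

def detect_target(frame, background, T=30):
    """F = F_D ∩ F_n: intersect the difference mask with the foreground
    (assumption: nonzero frame pixels are foreground)."""
    fd = background_difference(frame, background, T)
    fn = (frame > 0).astype(np.uint8)
    return fd & fn
```

The intersection step suppresses difference pixels that lie outside the foreground mask, which is what yields a target contour rather than a raw change mask.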
And S4, extracting the image features by using a feature extraction algorithm based on the image signal frequency spectrum to obtain the image signal features.
Further, the invention scans the detection target image according to the sequence of rows and columns respectively to obtain two one-dimensional signals of the detection target image in the horizontal direction and the vertical direction;
dividing the total observation time into many small segments, and performing an FFT operation and cyclic-frequency sliding within each segment to obtain the SCF of each signal;
tapering the SCF of each signal by using a Hamming window;
forming the image signal feature vector from the Norm-1 energy and the standard deviation in the tapered SCF, wherein the feature vector of the n-th region of each signal is:
v_n = [E_in, σ_in]
wherein:
E_in and σ_in are respectively the energy and the standard deviation of the SCF of the i-th signal in the n-th region.
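The row/column scanning and per-segment spectral features of step S4 might be sketched as below. The spectral correlation function itself is simplified here to a per-segment FFT magnitude tapered by a Hamming window, which only approximates the SCF pipeline in the text; all names are illustrative:

```python
import numpy as np

def row_col_signals(img):
    """Scan the detection-target image row by row and column by column,
    yielding the horizontal and vertical one-dimensional signals."""
    return img.ravel(order="C").astype(float), img.ravel(order="F").astype(float)

def segment_spectrum_features(signal, n_segments=4):
    """Split the signal into segments, FFT each, taper with a Hamming
    window, keep Norm-1 energy and standard deviation per segment."""
    seg_len = len(signal) // n_segments
    feats = []
    for k in range(n_segments):
        seg = signal[k * seg_len:(k + 1) * seg_len]
        spec = np.abs(np.fft.fft(seg)) * np.hamming(seg_len)
        feats.append(np.sum(np.abs(spec)))   # Norm-1 energy E_kn
        feats.append(np.std(spec))           # standard deviation sigma_kn
    return np.array(feats)

def image_signal_features(img, n_segments=4):
    """Concatenate per-segment features of both 1-D signals."""
    h, v = row_col_signals(img)
    return np.concatenate([segment_spectrum_features(h, n_segments),
                           segment_spectrum_features(v, n_segments)])
```

Each segment contributes two statistics per signal, so the feature vector length is 2 × n_segments × 2 signals, matching the idea that features are gathered region by region rather than from one global spectrum.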
And S5, classifying the images by taking the image signal characteristics as the input of the multilayer neural network.
Further, the present invention uses the image signal characteristics as input of a multilayer neural network to classify images, and the image classification process of the multilayer neural network is as follows:
1) taking the image signal features as input to the multilayer neural network, wherein the encoder structure is derived from a 2D DPN: 2 convolutional layers extract a feature map before the 1st Max-Pooling, then 8 dual-path connection blocks extract depth features; after the encoder, 3 dual-path connection blocks connect to the decoder, with cross-layer (skip) connections between parts of the same scale;
2) in the decoder part, the feature map is processed by transposed convolution and fused with a dual-path connection block; finally a dropout layer (to stabilize training) and a candidate-region output layer are connected;
3) in the final output matrix, each 15-dimensional vector represents 3 prediction candidate frames, corresponding during training to 3 candidate regions of preset size that determine whether a target is present. The preset candidate regions have preset target frames of three sizes: 1×1, 2×2 and 3×3 pixels. The intersection ratio between each prediction candidate frame and the preset target frame is calculated: when the intersection ratio is less than 0.5, the image category label in the 3×3-pixel target frame is output; when it equals 0.5, the label in the 2×2-pixel frame is output; and when it is greater than 0.5, the label in the 1×1-pixel frame is output.
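The candidate-frame rule in step 3) can be made concrete with a small intersection-over-union helper; the box format (x1, y1, x2, y2) and the function names are assumptions, and the exact-equality branch mirrors the text's "equal to 0.5" case:

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union else 0.0

def label_frame_size(pred_box, target_box):
    """Preset target-frame size whose label is output, per step 3):
    IoU < 0.5 -> 3x3 frame, IoU == 0.5 -> 2x2 frame, IoU > 0.5 -> 1x1."""
    v = iou(pred_box, target_box)
    if v < 0.5:
        return 3
    if v == 0.5:
        return 2
    return 1
```

In practice an exact floating-point comparison to 0.5 rarely fires, so a real implementation would likely use a tolerance band around 0.5 rather than strict equality.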
The following describes embodiments of the present invention through an algorithmic experiment and tests of the inventive processing method. The hardware test environment of the algorithm is: Ubuntu 16.04, the open-source framework TensorFlow 1.7, an Intel i7-7700K processor and an Nvidia GTX 1080-Ti graphics card; the comparison algorithm models are the SVM, LBP and LeNet models.
In the algorithm experiment of the invention, the data set is a plurality of collected image data sets with different labels. In the experiment, image data is input into an algorithm model, and the accuracy of algorithm label classification is used as an evaluation index of algorithm performance.
According to the experimental result, the image label classification accuracy of the SVM model is 79.62%, the image label classification accuracy of the LBP model is 86.39%, the image label classification accuracy of the LeNet model is 88.12%, and the image analysis accuracy of the algorithm is 90.11%.
The invention also provides an image analysis system. Fig. 2 is a schematic diagram of an internal structure of an image analysis system according to an embodiment of the present invention.
In the present embodiment, the image analysis system 1 includes at least an image acquisition device 11, an image processor 12, an image analysis device 13, a communication bus 14, and a network interface 15.
The image capturing device 11 may be a PC (Personal Computer), a terminal device such as a smart phone, a tablet Computer, or a mobile Computer, or may be a server.
The communication bus 14 is used to enable connection communication between these components.
The network interface 15 may optionally include a standard wired interface, a wireless interface (e.g., WI-FI interface), and is typically used to establish a communication link between the system 1 and other electronic devices.
Optionally, the system 1 may further comprise a user interface, which may comprise a Display (Display), an input unit such as a Keyboard (Keyboard), and optionally a standard wired interface, a wireless interface. Alternatively, in some embodiments, the display may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (Organic Light-Emitting Diode) touch device, or the like. The display, which may also be referred to as a display screen or display unit, is suitable for displaying information processed in the image analysis system 1 and for displaying a visualized user interface.
Fig. 2 only shows the image analysis system 1 with components 11-15; it will be understood by those skilled in the art that the structure shown in fig. 2 does not constitute a limitation of the image analysis system 1, which may comprise fewer or more components than shown, a combination of certain components, or a different arrangement of components.
In the embodiment of the system 1 shown in fig. 2, image analysis program instructions are stored in the image processor 12; the steps performed when the image analysis device 13 executes the image analysis program instructions stored in the image processor 12 are the same as those of the image analysis method described above and are not repeated here.
Furthermore, an embodiment of the present invention also provides a computer-readable storage medium having stored thereon image analysis program instructions executable by one or more processors to implement the following operations:
acquiring an image to be analyzed, and converting the image to be analyzed into a gray image by using a weighted average method;
carrying out binarization processing on the gray level image by using a local self-adaptive binarization method to obtain a binarized image to be analyzed;
segmenting the foreground and the background of the binary image by using a maximum inter-class variance method based on neighborhood information;
detecting foreground entities in the segmented images by using an image detection algorithm based on an image background;
extracting image features by using a feature extraction algorithm based on an image signal frequency spectrum to obtain image signal features;
and taking the image signal characteristics as the input of the multilayer neural network to classify the images.
It should be noted that the above-mentioned numbers of the embodiments of the present invention are merely for description, and do not represent the merits of the embodiments. The terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, apparatus, article, or method that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, apparatus, article, or method. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other like elements in a process, apparatus, article, or method that includes the element.
Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better implementation manner. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium (e.g., ROM/RAM, magnetic disk, optical disk) as described above and includes instructions for enabling a terminal device (e.g., a mobile phone, a computer, a server, or a network device) to execute the method according to the embodiments of the present invention.
The above description is only a preferred embodiment of the present invention, and not intended to limit the scope of the present invention, and all modifications of equivalent structures and equivalent processes, which are made by using the contents of the present specification and the accompanying drawings, or directly or indirectly applied to other related technical fields, are included in the scope of the present invention.
Claims (10)
1. A method of image analysis, the method comprising:
acquiring an image to be analyzed, and converting the image to be analyzed into a gray image by using a weighted average method;
carrying out binarization processing on the gray level image by using a local self-adaptive binarization method to obtain a binarized image to be analyzed;
segmenting the foreground and the background of the binary image by using a maximum inter-class variance method based on neighborhood information;
detecting foreground entities in the segmented images by using an image detection algorithm based on an image background;
extracting image features by using a feature extraction algorithm based on an image signal frequency spectrum to obtain image signal features;
and taking the image signal characteristics as the input of the multilayer neural network to classify the images.
2. An image analysis method as claimed in claim 1, characterized in that said converting the image to be analyzed into a gray map by means of a weighted average method comprises:
the weighted average formula is:
Gray(i, j) = 0.299*R(i, j) + 0.587*G(i, j) + 0.114*B(i, j)
wherein:
Gray(i, j) is the gray value of the pixel (i, j);
R(i, j), G(i, j), B(i, j) are the pixel values of pixel (i, j) on the R, G, B color channels, respectively.
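By way of a non-limiting illustration, the weighted-average conversion of claim 2 can be sketched as follows; the function name `to_gray`, the NumPy dependency, and the R, G, B channel ordering are the editor's assumptions, not part of the claim:

```python
import numpy as np

def to_gray(img):
    """Weighted-average grayscale conversion.

    `img` is an H x W x 3 array with channels ordered R, G, B.
    The weights are the standard ITU-R BT.601 luma coefficients,
    which the claim's formula appears to intend.
    """
    r, g, b = img[..., 0], img[..., 1], img[..., 2]
    return 0.299 * r + 0.587 * g + 0.114 * b
```

Note that the three weights sum to 1, so a uniform input maps to the same gray level.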
3. An image analysis method as claimed in claim 2, wherein the binarizing process of the gray map by using the local adaptive binarization method comprises:
performing binarization processing on the gray image by using the local adaptive binarization method, taking the set of pixels that reach the adaptive threshold as valid image pixels and the set of pixels that do not reach it as invalid image pixels, and taking the valid image pixels as the binarized image to be analyzed, wherein the local adaptive threshold is calculated by the following formula:
T=a*E+b*P+c*Q
wherein:
E represents the pixel average;
Q is the root mean square value between pixels;
P is the square of the difference between pixels;
a, b and c are free parameters, each a number in (0, 1), with a + b + c = 1.
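A minimal sketch of the threshold T = a*E + b*P + c*Q of claim 3, applied per non-overlapping window; the window size, the default weights, and the reading of P as the mean squared deviation from the window mean are the editor's assumptions, since the claim does not fix them:

```python
import numpy as np

def local_threshold(window, a=0.5, b=0.25, c=0.25):
    """T = a*E + b*P + c*Q over one window.

    E: mean pixel value; Q: root mean square of pixel values;
    P: mean squared deviation from the mean (one possible reading of
    the claim's "square of the difference between pixels").
    The claim requires a, b, c in (0, 1) with a + b + c = 1.
    """
    w = window.astype(float)
    E = w.mean()
    Q = np.sqrt((w ** 2).mean())
    P = ((w - E) ** 2).mean()
    return a * E + b * P + c * Q

def binarize(gray, size=16, **kw):
    """Binarize by comparing each non-overlapping tile to its own threshold."""
    out = np.zeros_like(gray, dtype=np.uint8)
    for i in range(0, gray.shape[0], size):
        for j in range(0, gray.shape[1], size):
            win = gray[i:i + size, j:j + size]
            out[i:i + size, j:j + size] = (win >= local_threshold(win, **kw)) * 255
    return out
```

With b = c = 0 the rule degenerates to thresholding at the local mean, which is a useful sanity check.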
4. An image analysis method as claimed in claim 3, wherein the segmenting the foreground and the background of the image by using the maximum inter-class variance method based on the neighborhood information comprises:
1) dividing the gray level of the image and the neighborhood average gray level of the pixel into L levels according to the gray level value of each pixel;
2) calculating the neighborhood average gray g(x, y) of each pixel over its k × k neighborhood:
g(x, y) = (1/(k × k)) ΣΣ f(x + m, y + n), with m and n ranging over the k × k neighborhood
wherein:
f(x, y) represents the gray value at the pixel (x, y);
3) describing each pixel of the binary image by the two-dimensional pair (i, j), where i = f(x, y) and j = g(x, y); the number of occurrences of the pair (i, j) is f_ij, and the probability of occurrence of (i, j) in the whole binary image is P_ij:
P_ij = f_ij / (M × N)
wherein:
M × N is the size of the binary image;
4) dividing the image into 4 image regions in the clockwise direction, and defining the two-dimensional inter-class variance matrix:
S_B = Σ_k ω_k (μ_k − μ_r)(μ_k − μ_r)^T
wherein:
ω_k is the probability distribution of the pixels of region k;
μ_k is the pixel mean vector of region k;
μ_r is the pixel mean vector of the whole image;
5) using the trace of the two-dimensional inter-class variance matrix as the measure of inter-class dispersion to obtain the optimal segmentation threshold (s*, t*) of the maximum inter-class variance method based on neighborhood information:
tr S_B(s*, t*) = max{tr S_B(s, t)}
And segmenting the foreground and the background of the binary image by utilizing the optimal segmentation threshold value.
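The segmentation steps of claim 4 can be sketched as a simplified 2D Otsu search; the quantization to L levels, the edge padding of the neighborhood mean, and the restriction of the scatter to the two diagonal quadrants of the (gray, neighborhood-gray) histogram are simplifications by the editor, not the claim's exact construction:

```python
import numpy as np

def neighborhood_mean(img, k=3):
    """Average gray of the k x k neighborhood of each pixel (edge-padded)."""
    p = k // 2
    padded = np.pad(img.astype(float), p, mode='edge')
    out = np.zeros(img.shape, dtype=float)
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (k * k)

def otsu_2d(img, L=32):
    """Search (s, t) maximizing the trace of the between-class scatter
    of the two diagonal quadrants of the 2D (gray, neighborhood) histogram."""
    g = neighborhood_mean(img)
    f = (img.astype(float) / 256 * L).astype(int).clip(0, L - 1)
    gq = (g / 256 * L).astype(int).clip(0, L - 1)
    hist = np.zeros((L, L))
    np.add.at(hist, (f, gq), 1)            # 2D histogram of (i, j) pairs
    P = hist / hist.sum()
    ii, jj = np.meshgrid(np.arange(L), np.arange(L), indexing='ij')
    mu_i, mu_j = (P * ii).sum(), (P * jj).sum()   # global mean vector
    best, best_st = -1.0, (0, 0)
    for s in range(1, L):
        for t in range(1, L):
            w0 = P[:s, :t].sum()           # background quadrant weight
            w1 = P[s:, t:].sum()           # foreground quadrant weight
            if w0 < 1e-9 or w1 < 1e-9:
                continue
            m0i = (P[:s, :t] * ii[:s, :t]).sum() / w0
            m0j = (P[:s, :t] * jj[:s, :t]).sum() / w0
            m1i = (P[s:, t:] * ii[s:, t:]).sum() / w1
            m1j = (P[s:, t:] * jj[s:, t:]).sum() / w1
            tr = (w0 * ((m0i - mu_i) ** 2 + (m0j - mu_j) ** 2)
                  + w1 * ((m1i - mu_i) ** 2 + (m1j - mu_j) ** 2))
            if tr > best:
                best, best_st = tr, (s, t)
    return best_st
```

The returned (s, t) pair plays the role of the claim's optimal threshold (s*, t*) for separating foreground from background.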
5. An image analysis method as claimed in claim 4, wherein the detecting of the foreground entity by using an image detection algorithm based on the image background comprises:
1) selecting a suitable image B_n(x, y) from the segmented image background as the background image;
2) taking the difference between the foreground image to be analyzed and the background image to obtain F_D(x, y):
F_D(x, y) = 1 when |F_n(x, y) − B_n(x, y)| > T, and F_D(x, y) = 0 otherwise
wherein:
F_n(x, y) is the foreground image to be analyzed;
T is a threshold value;
3) calculating the intersection of F_D(x, y) with the foreground image to be analyzed; the resulting intersection is the contour of the detection target:
F = F_D(x, y) ∩ F_n(x, y)
wherein:
F_n(x, y) is the foreground image to be analyzed.
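The background-difference detection of claim 5 can be sketched as follows; reading the intersection F_D ∩ F_n as masking the frame with the changed-pixel map is the editor's interpretation:

```python
import numpy as np

def detect_foreground(frame, background, T=25):
    """Background-difference detection.

    F_D is 1 where |frame - background| exceeds the threshold T;
    the returned target keeps the frame pixels inside that mask,
    i.e. F = F_D(x, y) ∩ F_n(x, y) read as element-wise masking.
    """
    fd = (np.abs(frame.astype(int) - background.astype(int)) > T).astype(np.uint8)
    return fd, frame * fd
```

In practice T would be tuned to the sensor noise level; the claim leaves it unspecified.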
6. An image analysis method as claimed in claim 5, wherein the extracting the image features by using the feature extraction algorithm based on the image signal spectrum comprises:
scanning the detection target image according to the sequence of rows and columns to obtain two one-dimensional signals in the horizontal direction and the vertical direction of the detection target image;
dividing the total observed time length into small segments, and performing an FFT operation and cyclic-frequency sliding within each segment to obtain the spectral correlation function (SCF) of each signal;
tapering the SCF of each signal by using a Hamming window;
forming the image signal feature vector from the Norm-1 energy and the standard deviation of the tapered SCF, wherein the feature vector of the n-th region of the i-th signal is:
[E_in, σ_in]
wherein:
E_in and σ_in are the energy and the standard deviation, respectively, of the SCF of the i-th signal in the n-th region.
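A simplified sketch of claim 6's feature extraction: the signal is segmented, each segment is Hamming-windowed and FFT'd, and the Norm-1 energy and standard deviation of the magnitude spectrum are kept per segment. Using the plain magnitude spectrum in place of a full spectral correlation function (which would also slide over cyclic frequencies) is a simplification by the editor:

```python
import numpy as np

def spectral_features(signal, seg_len=64):
    """Per-segment (E, sigma) features from the windowed magnitude spectrum."""
    w = np.hamming(seg_len)
    feats = []
    for start in range(0, len(signal) - seg_len + 1, seg_len):
        seg = signal[start:start + seg_len] * w   # Hamming taper
        mag = np.abs(np.fft.rfft(seg))
        feats.extend([mag.sum(), mag.std()])      # E_in, sigma_in
    return np.array(feats)

def image_signal_features(img):
    """Row-order and column-order scans give the two 1-D signals of claim 6."""
    h = img.flatten(order='C').astype(float)      # horizontal (row-wise) scan
    v = img.flatten(order='F').astype(float)      # vertical (column-wise) scan
    return np.concatenate([spectral_features(h), spectral_features(v)])
```

The concatenated vector is what claim 7 would then feed into the multilayer neural network.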
7. The image analysis method of claim 6, wherein the classifying the image by using the multi-layer neural network comprises:
1) taking the image signal features as the input of the multilayer neural network, wherein the structure of the encoder part is derived from a 2D DPN: a feature map is extracted by 2 convolutional layers before the 1st Max-Pooling, depth features are then extracted by 8 dual-path connection blocks, the decoder is connected after the encoder through 3 dual-path connection blocks, and parts of the same scale are connected across layers;
2) in the decoder part, the feature map is processed by transposed convolution and fused with a dual-path connection block, and finally a dropout layer (to stabilize training) and a candidate-region output layer are connected;
3) in the final output matrix, each 15-dimensional vector represents 3 prediction candidate frames, corresponding during training to 3 candidate regions of preset sizes for deciding whether the target is present; the preset candidate regions have preset target frames of three sizes, namely 1 × 1 pixel, 2 × 2 pixels and 3 × 3 pixels; the intersection ratio between the prediction candidate frame and the preset target frame is calculated: when the intersection ratio is less than 0.5, the image category label in the 3 × 3 pixel preset target frame is output; when the intersection ratio is equal to 0.5, the label in the 2 × 2 pixel preset target frame is output; and when the intersection ratio is greater than 0.5, the image category label in the 1 × 1 pixel preset target frame is output.
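The intersection-ratio rule of claim 7 can be sketched as follows; the box representation (x1, y1, x2, y2), the `anchors` dictionary, and the reading of "intersection ratio" as intersection-over-union against the largest preset frame are the editor's assumptions:

```python
def iou(box_a, box_b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    x1, y1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    x2, y2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / float(area_a + area_b - inter)

def select_label(pred_box, anchors):
    """Pick the label per the claim's three-way rule on the intersection ratio:
    < 0.5 -> 3x3 frame's label, == 0.5 -> 2x2, > 0.5 -> 1x1."""
    v = iou(pred_box, anchors['3x3']['box'])
    if v < 0.5:
        return anchors['3x3']['label']
    if v == 0.5:
        return anchors['2x2']['label']
    return anchors['1x1']['label']
```

Testing floating-point IoU for exact equality with 0.5 is fragile in practice; a real implementation would use a tolerance band around 0.5.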
8. An image analysis apparatus, characterized in that the apparatus comprises:
a communication unit for receiving image information;
the image processing unit is used for carrying out conversion and detection processing on the image information;
and the image analysis unit is used for analyzing and processing the image information.
9. An image analysis system, characterized in that the system comprises:
image acquisition means for receiving an image to be analyzed;
the image processor is used for converting an image to be analyzed into a gray image by using a weighted average method, performing binarization processing on the gray image by using a local self-adaptive binarization method, and performing foreground and background segmentation on the binarized image by using a maximum inter-class variance method based on neighborhood information to obtain a segmented image;
the image analysis device is used for detecting foreground entities in the segmented images by using an image detection algorithm based on an image background, extracting image features by using a feature extraction algorithm based on an image signal frequency spectrum to obtain image signal features, and classifying the images by using the image signal features as the input of the multilayer neural network.
10. A computer readable storage medium having stored thereon image analysis program instructions executable by one or more processors to perform the steps of a method of implementing image analysis according to any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010955196.5A CN112001362A (en) | 2020-09-11 | 2020-09-11 | Image analysis method, image analysis device and image analysis system |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112001362A true CN112001362A (en) | 2020-11-27 |
Family
ID=73469871
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010955196.5A Withdrawn CN112001362A (en) | 2020-09-11 | 2020-09-11 | Image analysis method, image analysis device and image analysis system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112001362A (en) |
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113450365A (en) * | 2021-07-16 | 2021-09-28 | 稿定(厦门)科技有限公司 | Automatic slicing method and device for PSD (position sensitive Detector) picture |
CN113450365B (en) * | 2021-07-16 | 2022-08-16 | 稿定(厦门)科技有限公司 | Automatic slicing method and device for PSD (position sensitive Detector) picture |
CN115358497A (en) * | 2022-10-24 | 2022-11-18 | 湖南长理尚洋科技有限公司 | GIS technology-based intelligent panoramic river patrol method and system |
CN116310352A (en) * | 2023-01-20 | 2023-06-23 | 首都医科大学宣武医院 | Alzheimer's disease MRI image multi-classification method and device |
CN116310352B (en) * | 2023-01-20 | 2024-04-12 | 首都医科大学宣武医院 | Alzheimer's disease MRI image multi-classification method and device |
CN116013091A (en) * | 2023-03-24 | 2023-04-25 | 山东康威大数据科技有限公司 | Tunnel monitoring system and analysis method based on traffic flow big data |
CN116013091B (en) * | 2023-03-24 | 2023-07-07 | 山东康威大数据科技有限公司 | Tunnel monitoring system and analysis method based on traffic flow big data |
CN117011244A (en) * | 2023-07-07 | 2023-11-07 | 中国人民解放军西部战区总医院 | Wrist multispectral image processing method |
CN117011244B (en) * | 2023-07-07 | 2024-03-22 | 中国人民解放军西部战区总医院 | Wrist multispectral image processing method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
WW01 | Invention patent application withdrawn after publication | Application publication date: 20201127 |