CN106874942B - Regular expression semantic-based target model rapid construction method - Google Patents

Regular expression semantic-based target model rapid construction method Download PDF

Info

Publication number
CN106874942B
CN106874942B CN201710044816.8A
Authority
CN
China
Prior art keywords
image
library
pixel
target
features
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710044816.8A
Other languages
Chinese (zh)
Other versions
CN106874942A (en)
Inventor
芦兵
许晓东
夏纯中
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jiangsu University
Original Assignee
Jiangsu University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jiangsu University filed Critical Jiangsu University
Priority to CN201710044816.8A priority Critical patent/CN106874942B/en
Publication of CN106874942A publication Critical patent/CN106874942A/en
Application granted granted Critical
Publication of CN106874942B publication Critical patent/CN106874942B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/56Extraction of image or video features relating to colour
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/74Image or video pattern matching; Proximity measures in feature spaces
    • G06V10/75Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; Coarse-fine approaches, e.g. multi-scale approaches; using context analysis; Selection of dictionaries
    • G06V10/751Comparing pixel values or logical combinations thereof, or feature values having positional relevance, e.g. template matching

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Data Mining & Analysis (AREA)
  • Multimedia (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a regular expression semantics-based method for rapidly constructing a target model, belonging to the field of machine vision and pattern recognition. First, the image of the recognized object is preprocessed to improve the quality of feature extraction; local features of the image target are then extracted with the Harris feature detection algorithm; finally, the target matching model of the recognized object is described through an extended regular expression semantics combined with a predefined image material library. The method extends the definition of the "regular" formalism of theoretical computer science so that it can be applied to image target recognition; it retains the high retrieval efficiency characteristic of regular matching and performs well when physical characteristics of the recognized object, such as outline and color, remain constant.

Description

Regular expression semantic-based target model rapid construction method
Technical Field
The invention relates to machine vision and pattern recognition technology, and in particular to an image target recognition method based on regular expressions.
Background
Image target identification processes and analyzes the available information, in its various forms, that characterizes objects in an image; it is the process of describing, identifying, classifying and interpreting the object. Target recognition can be divided into supervised and unsupervised classification; the main difference between them is whether the class of each training sample is known in advance. Supervised classification, also known as training classification, is the process of using sample pixels of confirmed classes to identify pixels of other, unknown classes. Prior to classification, the target attributes in the image are known a priori through means such as visual judgment. Generally, supervised classification requires a large number of samples of known classes; from the samples provided by a known training area, feature parameters are selected and calculated to serve as decision rules, and a discriminant function is established to classify the images to be classified, making this a pattern recognition method. The training area must be typical and representative. Over the years many methods for supervised feature extraction have been designed; they can be summarized into four types: bottom-level global features, bottom-level local features, middle-level features and attribute features. Early proposals such as Tamura texture, color histograms, the Harris operator, densely sampled SIFT and Textons are mainly bottom-level global and bottom-level local features.
Global feature extraction captures global information such as the color, texture and shape of the image, and played a great role in promoting early image understanding tasks, particularly content-based image retrieval systems. As research progressed, it was found that global visual features cannot fully meet the accuracy demands of the image classification task, whereas local features, represented by SIFT features, have stronger descriptive power and allow an image to be characterized more completely. In recent years, as the descriptive power of local features has steadily improved, research on local visual features has turned to how to improve their extraction efficiency while preserving that descriptive power. Meanwhile, middle-level features, which describe the organization of visual content on the basis of local features, have begun to attract academic attention, and attribute features carrying supervisory information about the image itself have become another research hotspot.
Disclosure of Invention
The invention aims to identify a recognized object in an image and to establish the matching template used during target matching. To this end, a regular expression semantics-based target model rapid construction method is provided.
The technical scheme adopted by the invention is as follows:
a regular expression semantic based target model rapid construction method comprises the following steps:
step 1: acquiring a position image of a target recognition object, and then performing image preprocessing: selecting an image acquisition point, and acquiring the great unchanged physical characteristics of the identified object; step 2: extracting target features, including extracting and selecting color features, spatial position features and angular point features; and step 3: establishing a pixel library of the image, and performing extended definition on a regular expression: firstly, establishing a pixel library capable of describing object characteristics according to physical characteristics of a general object, wherein the pixel library mainly comprises a line library, a shape library, a color library and a spatial position information library; organizing the image elements by utilizing regular semantics, and giving the image elements the capability of describing object features; and 4, step 4: and describing the target matching model by using the extended regular grammar.
Further, in step 1, in a specific scene the target image is acquired from the front, ensuring that target objects of the same type have the maximum physical-feature similarity. Image filtering and edge enhancement: the image is first converted to grayscale, background noise is then filtered out with median filtering, and finally edge enhancement is applied with the Canny operator.
Further, the step 1 further comprises:
1) replacing the value of a point in the image sequence with the median of the values in a neighborhood of that point, so that the surrounding pixel values are close to the true values: a two-dimensional sliding template with a square-matrix structure sorts the pixels within the template by pixel value, generating a monotonically increasing (or decreasing) two-dimensional data sequence; the two-dimensional median-filter output is g(x, y) = med{ f(x−k, y−l), (k, l) ∈ W }, where f(x, y) and g(x, y) are respectively the original and processed images, and W is a two-dimensional template, selected as a 3×3 region;
2) a higher brightness gradient is likely to be an edge, but no exact value defines how large the gradient must be; the effective gradient range of the acquired target is therefore dynamically adjusted with a hysteresis threshold in the Canny operator. Assuming that the important edges in the image are continuous curves, the blurred portions of a given curve can be tracked while noise pixels that do not constitute the curve are avoided as edges. Starting from a larger threshold identifies the more confident real edges, from which the entire edge is tracked in the image; during tracking, a smaller threshold is used so that the blurred portion of the curve can be tracked until the starting point is reached.
Further, the step 2 specifically includes:
extracting color features: the first-, second- and third-order moments are extracted from each color channel and calculated. Let h_ij denote the probability that a pixel with gray level j occurs in the i-th color channel component, and let n be the total number of pixels; the three low-order moments of the color moments are then:

μ_i = (1/n) · Σ_j h_ij

σ_i = [ (1/n) · Σ_j (h_ij − μ_i)² ]^(1/2)

s_i = [ (1/n) · Σ_j (h_ij − μ_i)³ ]^(1/3)
these 3 low-order moments are called mean, variance, and skewness, respectively;
spatial feature extraction: in order to improve the accuracy of the position description, the D4 model is adopted when calculating the position information of the feature vector:
D4(P,Q)=|xp-xq|+|yp-yq|
the D4 distance is the city-block distance; only the horizontal and vertical directions are used to calculate the relative distance;
extracting corner features based on the Harris operator: Harris corner detection is an algorithm that finds corner features in an image through mathematical computation, and it is rotation invariant. Before the feature regular expression for image matching is established, the lexical elements of the image features are detected with the Harris corner detector; the mathematical principle is as follows:

E(u, v) = Σ_(x,y) w(x, y) · [I(x+u, y+v) − I(x, y)]²

where w(x, y) denotes the moving window and I(x, y) the pixel gray value, in the range 0–255. Applying a first-order Taylor expansion to the shifted intensity and collecting the partial derivatives yields the Harris matrix:

M = Σ_(x,y) w(x, y) · [ Ix²  IxIy ; IxIy  Iy² ]

The eigenvalues λ1, λ2 of the Harris matrix are computed, and the Harris corner response value is then calculated:

R = det M − K(trace M)²

det M = λ1 · λ2

trace M = λ1 + λ2

where K is a coefficient whose value usually lies between 0.04 and 0.06.
Further, in step 3,
the line library is divided according to line shape and comprises: horizontal straight lines, oblique lines, right-angle lines, arcs, S-shaped arcs, and other custom line shapes; the shape library comprises squares, rectangles, circles, semicircles, rhombuses, heart shapes and other customized irregular figures; the color library is marked by a letter plus a numeric variable, Wk, where the first letter W denotes the color family and the numeric variable k denotes the brightness, in the range 0–255;
the position information library describes relative position information in the image with fully customized symbols: "|(x1)->(x2)|" denotes starting from the position of pel x1 and ending horizontally to the right at pel x2; "|(x1)-^(K)(x2)|", where K is a variable coefficient taking a value between 0 and 1, denotes moving the fraction K from pel x1 to the pel x2 vertically below it.
Further, only the basic definitions of the horizontal and vertical position information are given in step 3; position symbols and definition rules can be defined and added according to the actual situation. The pel library described here is an open element library: any technical index that describes physical characteristics can serve as a basic index of the pel library — for example, angle, temperature, timbre and vibration information can all be added as element-library index items for describing targets.
Further, in step 4, connectivity linking is first performed on the peripheral feature points of the target, and pels are then cut inside the connected image blocks according to the maximum-similarity principle, i.e. each cut local feature can find the most similar "pel" in the pel library.
The invention has the beneficial effects that:
the traditional approach to target matching template acquisition is through extensive supervised training. Generally, supervised classification usually needs to provide a large number of samples of known types, and according to the samples provided by a known training area, a feature parameter is calculated by selecting the feature parameter to serve as a decision rule, and a discriminant function is established to classify images to be classified. The invention can quickly define the matching model of the identified object by the extended definition of the regular algorithm and the combination of the pre-defined pixel material library, can adjust the model in time by a visual judgment method, has strong flexibility, can quickly construct the matching model of the identified object under the application scene that the physical characteristics of the identified object are not changed greatly, such as a high-speed outlet, a product production line conveyor belt and the like, and can greatly improve the efficiency of target identification.
Drawings
The invention is described in further detail below with reference to the following figures and detailed description:
fig. 1 is a flow chart of object recognition.
Fig. 2 is a diagram of the effect after image preprocessing.
Fig. 3 is an example of feature corner points selected from an image.
FIG. 4 is pixel elements extracted according to definitions in a pixel library.
FIG. 5 depicts the target matching model described using the extended regular grammar.
Detailed Description
The technical solution in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention.
FIG. 1 is the flow chart of object recognition. Image preprocessing aims to filter out environmental interference points and enhance the extracted object contour features. Feature extraction comprises color feature extraction based on color moments, spatial position feature extraction based on geometric invariants, and shape feature extraction. Shape feature extraction adopts the Harris corner feature extraction algorithm: from the selected set of feature corners, the corners that best represent the geometric features of the recognized object are chosen, and finally the target matching model closest to the target features is described with the basic pel material library through regular expression semantic rules.
Step 1: acquiring the position image of the target recognition object, and then carrying out image preprocessing
(1) Select an image acquisition point, so as to capture the most invariant physical characteristics of the recognized object.
The specific method is to acquire the target image from the front in a specific scene, such as a highway exit; this ensures that target objects of the same type have the maximum physical-feature similarity and improves the matching success rate of the matching model.
(2) Image filtering and edge enhancement
The image is first converted to grayscale; background noise is then filtered out with median filtering; finally the Canny operator is applied for edge enhancement, to improve the quality of edge feature extraction.
Further adopting mathematical morphology processing, specifically comprising:
1) The value of a point in the image sequence is replaced by the median of the values in a neighborhood of that point, so that the surrounding pixel values are close to the true values and isolated noise points are eliminated. The method uses a two-dimensional sliding template with a square-matrix structure to sort the pixels within the template by pixel value, generating a monotonically increasing (or decreasing) two-dimensional data sequence. The two-dimensional median-filter output is g(x, y) = med{ f(x−k, y−l), (k, l) ∈ W }, where f(x, y) and g(x, y) are the original and processed images, respectively. W is a two-dimensional template, selected here as a 3×3 region.
2) Higher intensity gradients are likely to be edges, but there is no exact value defining how large a gradient must be, so the effective gradient range of the acquired target can be dynamically adjusted with the hysteresis thresholds — a high and a low threshold — in the Canny operator. Assuming the important edges in the image are all continuous curves, the blurred portions of a given curve can be tracked while noisy pixels that do not form part of the curve are not taken as edges. Starting from the larger threshold identifies the more confident real edges, from which the entire edge can be tracked through the image; during tracking, the smaller threshold is used so that blurred portions of the curve can be followed back to the starting point. Fig. 2 shows the enhancement of the contour edges after edge detection and filtering.
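As an illustrative, non-authoritative sketch of the median-filtering step above (the function name and the toy image are our own; the patent itself prescribes only the 3×3 window W):

```python
import numpy as np

def median_filter_3x3(img):
    """g(x, y) = med{ f(x-k, y-l) : (k, l) in W } with a 3x3 window W.

    Border pixels are handled by replicating the edge rows/columns.
    """
    padded = np.pad(img, 1, mode="edge")
    out = np.empty_like(img)
    h, w = img.shape
    for y in range(h):
        for x in range(w):
            # median of the 3x3 neighbourhood centred on (y, x)
            out[y, x] = np.median(padded[y:y + 3, x:x + 3])
    return out

# An isolated impulse-noise pixel in a flat region is removed entirely.
img = np.full((5, 5), 10, dtype=np.uint8)
img[2, 2] = 255
filtered = median_filter_3x3(img)
```

In practice a library routine such as `scipy.ndimage.median_filter` or `cv2.medianBlur` would be used; the explicit loop only makes the definition concrete.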
Step 2: extracting target features, including extracting and selecting color features, spatial position features and corner features.
(1) Color feature extraction
Since the color information is mainly concentrated in the low-order moments, only the first-, second- and third-order moments need to be extracted for each color channel. Let h_ij denote the probability that a pixel with gray level j occurs in the i-th color channel component, and let n be the total number of pixels; the three low-order moments of the color moments are then:

μ_i = (1/n) · Σ_j h_ij    (1)

σ_i = [ (1/n) · Σ_j (h_ij − μ_i)² ]^(1/2)    (2)

s_i = [ (1/n) · Σ_j (h_ij − μ_i)³ ]^(1/3)    (3)
these 3 low-order moments are called mean, variance, and skewness, respectively.
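A minimal sketch of the three color moments (the function name is hypothetical, and the moments are computed directly from a channel's pixel values rather than from the h_ij probabilities — equivalent up to the same normalization):

```python
import numpy as np

def color_moments(channel):
    """Mean, standard deviation and skewness — the three low-order
    color moments — of a single color channel."""
    p = channel.astype(np.float64).ravel()
    n = p.size
    mu = p.sum() / n                               # first moment: mean
    sigma = np.sqrt(((p - mu) ** 2).sum() / n)     # second moment
    skew = np.cbrt(((p - mu) ** 3).sum() / n)      # third moment
    return mu, sigma, skew

# Toy 2x2 channel: symmetric values give zero skewness.
mu, sigma, skew = color_moments(np.array([[0, 100], [100, 200]]))
```

For a color image the function is applied to each channel, yielding a nine-dimensional color descriptor.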
(2) Spatial feature extraction
In order to improve the accuracy of the position description, the D4 model is adopted when calculating the position information of the feature vector:
D4(P,Q)=|xp-xq|+|yp-yq| (4)
The D4 distance is the city-block distance; it uses only the horizontal and vertical directions to calculate the relative distance, which is more convenient when describing positions.
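The D4 (city-block) distance of equation (4) amounts to one line; a hypothetical sketch:

```python
def d4_distance(p, q):
    """D4(P, Q) = |xp - xq| + |yp - yq| (city-block distance)."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])

# Compared with the Euclidean distance 5 between (0, 0) and (3, 4),
# D4 gives 7, since only horizontal and vertical steps are counted.
d = d4_distance((0, 0), (3, 4))
```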
(3) Harris operator-based corner feature extraction
Harris corner detection is an algorithm that finds corner features in an image by mathematical computation, and it is rotation invariant. Before the feature regular expression for image matching is established, the lexical elements of the image features are detected with the Harris corner detector; the mathematical principle is as follows:

E(u, v) = Σ_(x,y) w(x, y) · [I(x+u, y+v) − I(x, y)]²    (5)

where w(x, y) denotes the moving window and I(x, y) the pixel gray value, in the range 0–255. Applying a first-order Taylor expansion to the shifted intensity and collecting the partial derivatives yields the Harris matrix:

M = Σ_(x,y) w(x, y) · [ Ix²  IxIy ; IxIy  Iy² ]    (6)

The eigenvalues λ1, λ2 of the Harris matrix are computed, and the Harris corner response value is then calculated:

R = det M − K(trace M)²,  with det M = λ1 · λ2 and trace M = λ1 + λ2    (7)

where K is a coefficient whose value usually lies between 0.04 and 0.06. Fig. 3 shows the extracted feature points.
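A self-contained sketch of the Harris response of equations (5)–(7), using central-difference gradients and a 3×3 summation window (all naming is ours; a production implementation would typically call `cv2.cornerHarris`):

```python
import numpy as np

def harris_response(img, k=0.04):
    """Per-pixel Harris response R = det(M) - k * trace(M)^2.

    M accumulates the gradient products Ix^2, Iy^2, IxIy over a 3x3
    window; gradients use central differences. k is usually 0.04-0.06.
    """
    img = img.astype(np.float64)
    Iy, Ix = np.gradient(img)          # axis 0 = rows (y), axis 1 = cols (x)
    Ixx, Iyy, Ixy = Ix * Ix, Iy * Iy, Ix * Iy

    def box3(a):
        # sum of each pixel's 3x3 neighbourhood, edges replicated
        p = np.pad(a, 1, mode="edge")
        return sum(p[dy:dy + a.shape[0], dx:dx + a.shape[1]]
                   for dy in range(3) for dx in range(3))

    Sxx, Syy, Sxy = box3(Ixx), box3(Iyy), box3(Ixy)
    det = Sxx * Syy - Sxy ** 2
    trace = Sxx + Syy
    return det - k * trace ** 2

# A bright square on a dark background: its corner should score high.
img = np.zeros((9, 9))
img[4:, 4:] = 255.0
R = harris_response(img)
```

The sign of R separates the three cases in the text: large positive at corners (both eigenvalues large), negative along edges (one eigenvalue large), near zero in flat regions.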
And step 3: establishing a pixel library of the image and performing extension definition on regular expressions
First, a pel library capable of describing object characteristics is established according to the physical characteristics of common objects, mainly comprising a line library, a shape library, a color library and a spatial position information library; these pels can then be organized using regular semantics, giving them the ability to describe object characteristics.
The line library can be divided into horizontal straight lines, oblique lines, right-angle lines, arcs, S-shaped arcs and other custom line shapes. The shape library comprises conventional shapes such as squares, rectangles, circles, semicircles, rhombuses and heart shapes, along with other customized irregular figures. The color library is marked by a letter plus a number, such as Wk, where the first letter W denotes the color family and the number k denotes the brightness, in the range 0–255. The position information library describes relative position information in the image with fully self-defined symbols: for example, "|(x1)->(x2)|" denotes starting at the position of pel x1 and ending horizontally to the right at pel x2, and "|(x1)-^(K)(x2)|" denotes moving from pel x1 to the pel x2 vertically below it, where K is a variable coefficient between 0 and 1 — e.g. 0.25 denotes one quarter of the distance between x1 and x2. Only the basic definitions of the horizontal and vertical position information are given here; position symbols and definition rules can be defined and added according to the actual situation. The pel library described here is an open element library: any technical index that describes physical characteristics can serve as a basic index of the pel library — for example, angle, temperature, timbre and vibration information can all be added as element-library index items for describing targets.
TABLE 1 Pixel element library
[Table 1 (pel element library) is reproduced as an image in the original publication.]
These pels can then be organized using regular semantics, giving them the ability to describe object characteristics. The regular expression, also known as a rule expression, is a concept from computer science. Regular expressions are typically used to retrieve and replace text that conforms to a certain pattern (rule). A regular expression is a logical formula operating on character strings: a "regular string" is formed from predefined specific characters and combinations of them, and this "regular string" expresses a filtering logic over character strings. The core idea of regular expressions is to abstract and classify the object being described. Given a regular expression and another string, we can:
1. decide whether the given string conforms to the filtering logic of the regular expression (referred to as "matching");
2. obtain the specific parts we want from the string through the regular expression.
TABLE 2 conventional canonical grammar
[Table 2 (conventional regular-expression syntax) is reproduced as an image in the original publication.]
In Table 1, the "pel elements" are used to describe image features; organizing the pels according to the syntax of regular expressions allows the local features of an image to be defined quickly. For example, the expression "|●->●|" may represent a portion of an image matching two black circles in the horizontal direction. Of course, just as when defining a text regular expression, more compact characters may be used to represent the regular "pel elements" of an image.
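To make the analogy concrete, here is a purely hypothetical encoding in which each pel becomes a text token (e.g. "trap:W12" for a trapezoid with color code W12) and the position operators become separators ("->" horizontal, "^" vertical), so that an ordinary text regex engine can match a pel string; none of these token names appear in the patent:

```python
import re

# Target model: a W12 trapezoid, vertically above a B10 rectangle,
# followed horizontally by one or more B10 circles.
MODEL = re.compile(r"trap:W12\^rect:B10(->circ:B10)+")

candidate = "trap:W12^rect:B10->circ:B10->circ:B10"
match = MODEL.fullmatch(candidate) is not None
```

The point of the analogy is that once image features are tokenized into pels, matching a target model reduces to efficient string matching.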
And 4, step 4: describing a target matching model by using an extended regular grammar:
On the basis of fig. 3, the feature points that best describe the image features are selected, and pels are extracted according to the definitions in the pel element library, as shown in fig. 4. Connectivity linking is first performed on the peripheral feature points of the target, and pels are then cut inside the connected image blocks according to the maximum-similarity principle, i.e. each cut local feature can find the most similar "pel" in the pel library. After cutting, the shape pels extracted in fig. 4 are: trapezoid, rectangle, circle. The extracted color pels are W12 (white 12) and B10 (black 10), and the extracted position pel information is "|-^|" (from the upper position to the lower position), "|-^(0.25)|" (one quarter of the way from the upper position to the lower position) and "|-^(0.75)|" (three quarters of the way from the upper position to the lower position).
The pixel elements extracted are as follows:
TABLE 3 basic image elements
[Table 3 (basic pels) is reproduced as an image in the original publication.]
Based on the pixel elements in table 3, a target matching canonical expression based on the pixel elements can be constructed:
[The pel-based target matching regular expression is reproduced as an image in the original publication.]
The expression describes the target model with left-to-right priority. The content inside "{ }" is described as a whole, i.e. the position information inside the braces is relative to the first pel in the braces. The expression can be understood as starting with a trapezoid of color number W12, below which vertically lies a color-block element with color number B10, followed vertically by a "{ }" descriptor: a rectangle of color number W12, a circle of color number B10 at 1/4 of its lower extent, and another circle of color number B10 at 3/4. The graphical features depicted are shown in fig. 5.
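A hypothetical helper that assembles a pel model string of the shape just described — a start pel, a pel vertically below it, then a "{ }" group whose position coefficients are relative to the group's first pel. The textual encoding is our own stand-in for the expression shown as an image:

```python
def compose_model(start, below, group):
    """Build a pel model string: (start)-^(below)-^{group...}.

    `group` is a list whose first entry is the group's reference pel and
    whose remaining entries are (coefficient, pel) pairs placed vertically
    below it at the given fraction.
    """
    inner = group[0] + "".join(f"-^({k})({p})" for k, p in group[1:])
    return f"({start})-^({below})-^{{{inner}}}"

model = compose_model(
    "trap,W12",                      # trapezoid, colour W12
    "rect,B10",                      # colour block below it
    ["rect,W12",                     # group reference pel
     (0.25, "circ,B10"),             # circle at 1/4 of lower extent
     (0.75, "circ,B10")],            # circle at 3/4
)
```

Treating the model as a plain string is what lets a regex-style matcher compare it against pel strings extracted from candidate images.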
The preferred embodiment:
A preferred embodiment of the present invention: an image acquisition device is set up facing the complete surface of the target object, so as to capture the maximal physical characteristics in the image. The target edges are enhanced and denoised through median filtering and the Canny operator, which improves the quality of feature-point acquisition; edge corners in the image are detected with the Harris operator, and the corners that best mark the physical characteristics of the recognized object are selected; color information is provided by the color moments, and the relative position information of the image is calculated with the D4 operator. Finally, using the established pel element library, the matching model closest to the recognition target is described through the semantics of the extended regular expression.
Regular expressions over character elements are widely used in text retrieval because of their high matching efficiency and good adaptability. Although the content being matched differs greatly, image matching and text matching share a commonality of method: a target can be identified by regularly organizing basic elements to construct a matching model. Based on this commonality, the semantics of the regular expression are given an extended definition by endowing them with graphical attribute features, and the new concept of "pels" and its semantic rules are introduced. Experiments show that an image feature matching model can be established completely and conveniently by writing an image-matching regular expression, thereby achieving the goal of target matching.
In the description herein, references to the description of the term "one embodiment," "some embodiments," "an illustrative embodiment," "an example," "a specific example," or "some examples" or the like mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, the schematic representations of the terms used above do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
While embodiments of the invention have been shown and described, it will be understood by those of ordinary skill in the art that: various changes, modifications, substitutions and alterations can be made to the embodiments without departing from the principles and spirit of the invention, the scope of which is defined by the claims and their equivalents.

Claims (6)

1. A regular expression semantic based target model rapid construction method is characterized by comprising the following steps:
step 1: acquiring a position image of a target recognition object, and then performing image preprocessing: selecting an image acquisition point, and acquiring the great unchanged physical characteristics of the identified object; step 2: extracting target features, including extracting and selecting color features, spatial position features and angular point features; and step 3: establishing a pixel library of the image, and performing extended definition on a regular expression: firstly, establishing a pixel library capable of describing object characteristics according to physical characteristics of a general object, wherein the pixel library mainly comprises a line library, a shape library, a color library and a spatial position information library; organizing the image elements by utilizing regular semantics, and giving the image elements the capability of describing object features; and 4, step 4: describing a target matching model by using a regular grammar defined by an extension;
in the step 3, the step of processing the image,
the line library is divided according to line shape and comprises: horizontal lines, oblique lines, right-angle lines, arcs, S-shaped arcs, and other custom line shapes; the shape library comprises squares, rectangles, circles, semicircles, diamonds, heart shapes, and other custom irregular figures; the color library is marked by a letter plus a numeric variable, Ck, where the leading letter C denotes the color system and the numeric variable k denotes brightness, with a value range of 0-255;
the position information library describes relative position information in the image with fully custom symbols: "|(x1)->(x2)|" denotes starting from the position of pel x1 and ending horizontally to the right at pel x2; "|(x1)↓(T)(x2)|", where T is a variable coefficient taking values between 0 and 1, denotes moving from pel x1 vertically downward by a proportion T to the pel x2 below it.
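The pel-library and position-symbol scheme of claims 1 and 3 can be sketched in code. This is a minimal illustration, not the patent's implementation; all class and function names (`ColorPel`, `horiz`, `vert`) are hypothetical, and the rendered symbol strings approximate the claim's custom notation in ASCII.

```python
from dataclasses import dataclass

# The four base sub-libraries named in claim 1 (entries are illustrative).
LINE_LIB = {"horizontal", "oblique", "right_angle", "arc", "s_arc"}
SHAPE_LIB = {"square", "rectangle", "circle", "semicircle", "diamond", "heart"}

@dataclass(frozen=True)
class ColorPel:
    """Color-library pel 'Ck': a color-system letter plus brightness k in 0..255."""
    system: str
    k: int

    def __post_init__(self):
        if not 0 <= self.k <= 255:
            raise ValueError("brightness k must be in 0..255")

def horiz(x1: str, x2: str) -> str:
    """'|(x1)->(x2)|': start at pel x1, end horizontally to the right at pel x2."""
    return f"|({x1})->({x2})|"

def vert(x1: str, x2: str, t: float) -> str:
    """'|(x1)v(T)(x2)|': move from pel x1 vertically down by proportion T to pel x2."""
    if not 0.0 <= t <= 1.0:
        raise ValueError("T must be in [0, 1]")
    return f"|({x1})v({t})({x2})|"
```

A target description is then a string of such pels and position symbols, matched by the extended regular grammar of step 4.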
2. The regular expression semantic based target model rapid construction method according to claim 1, characterized in that in step 1 the target image is acquired from the front in a specific scene, which ensures that acquired target objects of the same type have maximum physical-feature similarity; image filtering and edge enhancement: the image is first converted to grayscale, background noise is then removed with median filtering, and finally the image edges are enhanced with the Canny operator.
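The grayscale-plus-median-filter preprocessing of claim 2 can be sketched with NumPy alone (an OpenCV call such as `cv2.Canny` would normally follow for edge enhancement). The luminance weights are the common ITU-R BT.601 choice, an assumption on my part since the claim does not specify the gray conversion.

```python
import numpy as np

def to_gray(rgb):
    # Weighted luminance conversion (BT.601 weights: 0.299 R + 0.587 G + 0.114 B).
    return rgb @ np.array([0.299, 0.587, 0.114])

def median3x3(img):
    """3x3 median filter: replace each pixel with the median of its neighborhood."""
    p = np.pad(img, 1, mode="edge")  # replicate edges so output keeps the input shape
    h, w = img.shape
    # Nine shifted views of the padded image, one per position in the 3x3 window.
    windows = [p[i:i + h, j:j + w] for i in range(3) for j in range(3)]
    return np.median(np.stack(windows), axis=0)
```

A single impulse-noise pixel (the case median filtering handles well) is removed while a uniform region is left unchanged.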
3. The regular expression semantics based target model rapid construction method according to claim 1, wherein the step 1 further comprises:
1) replacing the value of a point in the image sequence with the median of the values in a neighborhood of that point, so that the surrounding pixel values approach the true value: using a two-dimensional sliding template with a square-matrix structure, the pixels inside the template are sorted by pixel value, producing a monotonically increasing or decreasing two-dimensional data sequence; the two-dimensional median-filter output is g(x, y) = med{f(x-k, y-l), (k, l) ∈ W}, where f(x, y) and g(x, y) are the original and processed images respectively, and W is the two-dimensional template, chosen here as a 3 × 3 region;
2) a higher brightness gradient is more likely to be an edge, but there is no exact value limiting how large the gradient must be; the effective gradient range of the acquired target is therefore adjusted dynamically using the hysteresis thresholds of the Canny operator; assuming that the important edges in the image are continuous curves, the blurred portions of a given curve can be tracked while noise pixels that do not form part of the curve are not taken as edges; starting from the larger threshold, the more confident real edge points are identified, and from these the entire edge is tracked through the image; during tracking the smaller threshold is used, so that the blurred portions of the curve can be followed until the starting point is reached.
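The edge-tracking step of claim 3, item 2) is the standard hysteresis thresholding of the Canny operator: keep all pixels above the high threshold, then grow the kept set through pixels above the low threshold that are connected to it. A minimal sketch on a gradient-magnitude array (the 8-connectivity choice is an assumption; the claim does not fix the neighborhood):

```python
import numpy as np
from collections import deque

def hysteresis(grad, low, high):
    """Keep pixels >= high, plus any pixel >= low that is 8-connected
    (directly or transitively) to a kept pixel."""
    strong = grad >= high
    weak = grad >= low
    out = strong.copy()
    q = deque(zip(*np.nonzero(strong)))   # seed the search from confident edges
    h, w = grad.shape
    while q:
        y, x = q.popleft()
        for dy in (-1, 0, 1):             # visit the 8 neighbors of each kept pixel
            for dx in (-1, 0, 1):
                ny, nx = y + dy, x + dx
                if 0 <= ny < h and 0 <= nx < w and weak[ny, nx] and not out[ny, nx]:
                    out[ny, nx] = True
                    q.append((ny, nx))
    return out
```

Weak pixels continuing a strong edge survive; an isolated weak pixel does not.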
4. The regular expression semantic based target model rapid construction method according to claim 1, wherein the step 2 specifically comprises:
extracting color features: the first-, second-, and third-order moments are extracted from each color channel and computed as follows; let h<sub>ij</sub> denote the probability of occurrence of a pixel with gray level j in the i-th color channel component, and n the total number of pixels; the three low-order moments of the color moments are:

μ<sub>i</sub> = (1/n) Σ<sub>j=1..n</sub> h<sub>ij</sub>

σ<sub>i</sub> = [ (1/n) Σ<sub>j=1..n</sub> (h<sub>ij</sub> − μ<sub>i</sub>)² ]<sup>1/2</sup>

s<sub>i</sub> = [ (1/n) Σ<sub>j=1..n</sub> (h<sub>ij</sub> − μ<sub>i</sub>)³ ]<sup>1/3</sup>

these 3 low-order moments are called the mean, variance, and skewness, respectively;
spatial feature extraction: to improve the description accuracy of the position information, the D4 model is adopted when computing the position component of the feature vector:

D4(P,Q) = |x<sub>p</sub> − x<sub>q</sub>| + |y<sub>p</sub> − y<sub>q</sub>|

the D4 distance is the city-block distance; only the horizontal and vertical directions are used to compute the relative distance;
extracting corner features based on the Harris operator: Harris corner detection is an algorithm that finds corner features in an image through mathematical computation and has the property of rotation invariance; before the feature regular expression is established for image matching, the pels of the image features are detected with the Harris corner detector, whose mathematical principle is:

E(u,v) = Σ<sub>x,y</sub> w(x,y) [I(x+u, y+v) − I(x,y)]²

where w(x, y) denotes the moving window and I(x, y) the pixel gray value, with a range of 0-255; expanding by Taylor series and keeping the first-order partial derivatives finally yields the Harris matrix:

M = Σ<sub>x,y</sub> w(x,y) [ I<sub>x</sub>²  I<sub>x</sub>I<sub>y</sub> ; I<sub>x</sub>I<sub>y</sub>  I<sub>y</sub>² ]
the matrix eigenvalues λ<sub>1</sub>, λ<sub>2</sub> are computed from the Harris matrix, and then the Harris corner response value is calculated:

R = detM − K(traceM)²

detM = λ<sub>1</sub>λ<sub>2</sub>

traceM = λ<sub>1</sub> + λ<sub>2</sub>

where K is a coefficient whose value usually lies between 0.04 and 0.06, traceM denotes the trace of the matrix M, and detM its determinant.
5. The regular expression semantics based target model rapid construction method according to claim 1, wherein, beyond the basic definitions of horizontal and vertical position information given in step 3, the definition symbols and rules of the position information can be defined and extended according to actual conditions; the described pel library is an open element library: any technical index describing a physical feature can serve as a basic index of the pel library; for example, angle information, temperature information, hue information, and vibration information can each be added as an element-library index item describing the target.
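The "open element library" of claim 5 amounts to a registry that starts with the four base sub-libraries and accepts new physical-feature indices at runtime. A sketch under that reading; the class and method names are illustrative, not from the patent:

```python
class PelLibrary:
    """Open pel library: base indices from claim 1, extensible per claim 5."""

    def __init__(self):
        # The four base sub-libraries of claim 1.
        self.indices = {"line": set(), "shape": set(), "color": set(), "position": set()}

    def register_index(self, name):
        """Add a new physical-feature index (e.g. angle, temperature, hue, vibration)."""
        self.indices.setdefault(name, set())

    def add_pel(self, index, pel):
        """File a pel under an existing index."""
        if index not in self.indices:
            raise KeyError(f"unknown index {index!r}; register it first")
        self.indices[index].add(pel)
```

New sensing modalities thus extend the library without changing the matching grammar that consumes it.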
6. The regular expression semantic based target model rapid construction method according to claim 1, characterized in that in step 4 the peripheral feature points of the target are connected into connected components, and the pels are then cut within the connected graph blocks according to the maximum-similarity principle, that is, each cut local feature finds the most similar "pel" in the pel library.
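The maximum-similarity cutting of claim 6 assigns each local feature descriptor the library pel it most resembles. The cosine-similarity metric below is an assumption (the claim does not fix one), and the names `best_pel` / `cosine` are mine:

```python
import math

def cosine(a, b):
    """Cosine similarity of two feature vectors; 0.0 if either is zero-length."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def best_pel(feature, pel_library):
    """Return the key of the library pel whose descriptor maximises similarity
    to the given local feature, per the maximum-similarity principle."""
    return max(pel_library, key=lambda k: cosine(feature, pel_library[k]))
```

A feature close to the "arc" descriptor is cut as an arc pel even if it also weakly resembles a line.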
CN201710044816.8A 2017-01-21 2017-01-21 Regular expression semantic-based target model rapid construction method Active CN106874942B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710044816.8A CN106874942B (en) 2017-01-21 2017-01-21 Regular expression semantic-based target model rapid construction method

Publications (2)

Publication Number Publication Date
CN106874942A CN106874942A (en) 2017-06-20
CN106874942B true CN106874942B (en) 2020-03-31

Family

ID=59157763

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710044816.8A Active CN106874942B (en) 2017-01-21 2017-01-21 Regular expression semantic-based target model rapid construction method

Country Status (1)

Country Link
CN (1) CN106874942B (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108303745A (en) * 2018-03-19 2018-07-20 贵州电网有限责任公司 A kind of inversion method of the buried cable detection based on electromagnetic wave saturating ground technology
CN109915924B (en) * 2018-05-31 2020-11-24 徐州云创物业服务有限公司 Safety protection type electric heater
CN111310613B (en) * 2020-01-22 2023-04-07 腾讯科技(深圳)有限公司 Image detection method and device and computer readable storage medium
CN113111230B (en) * 2020-02-13 2024-04-12 北京明亿科技有限公司 Regular expression-based alarm receiving text home address extraction method and device
CN113111229B (en) * 2020-02-13 2024-04-12 北京明亿科技有限公司 Regular expression-based alarm receiving text track address extraction method and device
CN117789131B (en) * 2024-02-18 2024-05-28 广东电网有限责任公司广州供电局 Risk monitoring method, risk monitoring device, risk monitoring equipment and storage medium

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2016181470A1 (en) * 2015-05-11 2016-11-17 株式会社東芝 Recognition device, recognition method and program

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102508909B (en) * 2011-11-11 2014-08-20 苏州大学 Image retrieval method based on multiple intelligent algorithms and image fusion technology
CN105678312A (en) * 2014-11-19 2016-06-15 王云 Vehicle insurance mark identification method
CN104504161B (en) * 2015-01-21 2017-11-10 北京智富者机器人科技有限公司 A kind of image search method based on robot vision platform
CN105184822B (en) * 2015-09-29 2017-12-29 中国兵器工业计算机应用技术研究所 A kind of target following template renewal method
CN105760875B (en) * 2016-03-10 2019-03-01 西安交通大学 The similar implementation method of differentiation binary picture feature based on random forests algorithm


Also Published As

Publication number Publication date
CN106874942A (en) 2017-06-20


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant