CN107464245B - Image structure edge positioning method and device - Google Patents

Publication number
CN107464245B
CN107464245B
Authority
CN
China
Prior art keywords
edge
image
straight line
channel characteristics
determining
Prior art date
Legal status
Active
Application number
CN201710517455.4A
Other languages
Chinese (zh)
Other versions
CN107464245A (en)
Inventor
伍更新
高大帅
李健
张连毅
武卫东
Current Assignee
Beijing Sinovoice Technology Co Ltd
Original Assignee
Beijing Sinovoice Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing Sinovoice Technology Co Ltd filed Critical Beijing Sinovoice Technology Co Ltd
Priority to CN201710517455.4A
Publication of CN107464245A
Application granted
Publication of CN107464245B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/10 Segmentation; Edge detection
    • G06T7/13 Edge detection
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning
    • G06T2207/20084 Artificial neural networks [ANN]

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

Embodiments of the invention provide a method and a device for positioning image structure edges. The method detects the structure edges of an image to be detected with an edge detection model, generating an edge score matrix of the same size as the image, in which pixels on structure edges of interest score high while pixels on uninteresting structure edges and non-structure-edge pixels score low. Straight-line detection is then performed on the edge score matrix, and candidate structure edges are obtained from the detection result; finally, the target structure edge is determined among the candidates according to preset constraint conditions. Because the structure edge is located from the edge score matrix, the line detection, and the constraints, the method is more robust to illumination, shadow, and similar factors, and positions image structure edges more accurately.

Description

Image structure edge positioning method and device
Technical Field
The invention relates to the technical field of computer science, in particular to a method and a device for positioning an image structure edge.
Background
In existing image text recognition, whether the image is photographed or scanned, the structure edges in the image must be separated before text recognition, after which four-point perspective transformation and layout analysis are performed. Accurate positioning of image structure edges is therefore of great importance.
Existing image structure edge detection usually applies a color or gray-level gradient method, such as the Canny or Sobel operator, to obtain a binary image, then detects straight lines in the binary image and locates the structure edges from them. Determining the binary image from the gradient generally requires setting binarization threshold parameters, which are very sensitive to factors such as illumination and shadow; missed detections and false detections are therefore common, which greatly limits the application of such methods in natural scenes.
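The threshold sensitivity described here can be seen in a few lines of code. The sketch below (plain NumPy; the Sobel kernel is standard, but the threshold value and toy image are chosen purely for illustration) binarizes a gradient magnitude with a fixed threshold; the same scene under dimmer illumination falls below the threshold and the edge is lost:

```python
import numpy as np

SOBEL_X = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
SOBEL_Y = SOBEL_X.T

def gradient_magnitude(img):
    """Sobel gradient magnitude of a 2-D grayscale image (valid region only)."""
    h, w = img.shape
    gx = np.zeros((h - 2, w - 2))
    gy = np.zeros((h - 2, w - 2))
    for i in range(h - 2):
        for j in range(w - 2):
            patch = img[i:i + 3, j:j + 3]
            gx[i, j] = (SOBEL_X * patch).sum()
            gy[i, j] = (SOBEL_Y * patch).sum()
    return np.hypot(gx, gy)

def binarize(img, threshold):
    """Fixed-threshold edge map, as in the gradient methods described above."""
    return gradient_magnitude(img) > threshold

# A step edge: a dark card region next to a bright background.
img = np.full((10, 10), 40.0)
img[:, 5:] = 200.0

edges_bright = binarize(img, threshold=300.0)      # edge found
edges_dim = binarize(0.3 * img, threshold=300.0)   # same scene, dimmer: edge lost
```

The second call misses the edge entirely even though the scene is unchanged, which is exactly the failure mode the score-matrix approach below avoids.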
Therefore, one technical problem that needs to be urgently solved by those skilled in the art is: how to make the location of the edges of the image structure more accurate.
Disclosure of Invention
The technical problem to be solved by the embodiment of the invention is to provide a method for positioning the edge of an image structure, so that the positioning of the edge of the image structure is more accurate.
Correspondingly, the embodiment of the invention also provides a device for positioning the edge of the image structure, which is used for ensuring the realization and the application of the method.
In order to solve the above problem, the present invention discloses a method for positioning an edge of an image structure, the method comprising:
acquiring an image to be detected;
adopting a pre-trained edge detection model to carry out edge detection on the image to be detected to obtain an edge scoring matrix;
performing straight line detection on the edge score matrix;
determining candidate structure edges according to the straight line detection result;
and determining the target structure edge in the candidate structure edges according to a preset constraint condition.
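The five steps above can be sketched as a pipeline. Everything here (the function names and the toy stand-ins for the model, the line detector, and the constraints) is illustrative, not the patent's implementation:

```python
import numpy as np

def locate_structure_edge(image, edge_model, detect_lines, build_candidates, constraints):
    """Pipeline sketch of the five claimed steps."""
    score_matrix = edge_model(image)        # step 2: per-pixel edge score
    lines = detect_lines(score_matrix)      # step 3: straight-line detection
    candidates = build_candidates(lines)    # step 4: candidate structure edges
    # step 5: keep only candidates satisfying every preset constraint
    return [c for c in candidates if all(ok(c, score_matrix) for ok in constraints)]

# Toy stand-ins that exercise the control flow (not real detectors).
image = np.zeros((8, 8))
result = locate_structure_edge(
    image,
    edge_model=lambda img: np.ones_like(img),
    detect_lines=lambda s: ["h1", "h2", "v1", "v2"],
    build_candidates=lambda lines: [tuple(lines)],
    constraints=[lambda cand, s: len(cand) == 4],
)
```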
Preferably, the step of determining the candidate structure edge according to the straight line detection result includes:
determining a potential straight line according to the straight line detection result;
determining a horizontal line and a vertical line according to the angle of the potential straight line;
and obtaining candidate structure edges according to the horizontal lines and the vertical lines.
Preferably, the step of determining the horizontal line and the vertical line according to the angle of the potential straight line comprises:
when the included angle between the potential straight line and a preset reference straight line is more than 45 degrees, determining that the potential straight line is a vertical line;
and when the included angle between the potential straight line and a preset reference straight line is less than or equal to 45 degrees, determining that the potential straight line is a horizontal line.
Preferably, the step of obtaining candidate structure edges according to the horizontal lines and the vertical lines includes:
and obtaining a candidate structure edge according to any two horizontal lines and any two vertical lines, wherein the candidate structure edge is quadrilateral in shape.
Preferably, the method of line detection includes: hough transform or radon transform.
Preferably, the constraints comprise at least: one of an angle constraint, an aspect ratio constraint, an area constraint, a recommended region constraint, and a score matrix constraint.
Preferably, the step of obtaining the edge detection model includes:
acquiring an image sample;
labeling the image samples, and determining image blocks taking the structural edge pixel points as the center as positive samples and image blocks taking the non-structural edge pixel points as the center as negative samples;
acquiring multi-channel characteristics of the positive sample and the negative sample;
and training a machine learning classifier according to the image sample and the multi-channel characteristics of the positive sample and the negative sample to obtain an edge detection model.
Preferably, the machine learning classifier comprises a random forest classifier or a CNN convolutional neural network classifier.
Preferably, when the machine learning classifier is a random forest classifier, the multi-channel features include: color channel characteristics, gradient magnitude channel characteristics, gradient direction channel characteristics.
Correspondingly, the embodiment of the invention also provides a device for positioning the edge of the image structure, which comprises:
the image acquisition module is used for acquiring an image to be detected;
the edge detection module is used for carrying out edge detection on the image to be detected by adopting a pre-trained edge detection model to obtain an edge scoring matrix;
the straight line detection module is used for carrying out straight line detection on the edge scoring matrix;
an edge candidate module for determining candidate structure edges according to the results of the line detection module;
and the edge determining module is used for determining a target structure edge in the candidate structure edges according to a preset constraint condition.
Preferably, the edge candidate module includes:
the straight line determining submodule is used for determining a potential straight line according to the result of the straight line detecting module;
the line classification submodule is used for determining a horizontal line and a vertical line according to the angle of the potential line;
and the edge forming submodule is used for obtaining a candidate structure edge according to the horizontal line and the vertical line.
Preferably, the straight line classification submodule includes:
the vertical line determining submodule is used for determining the potential straight line as a vertical line when the included angle between the potential straight line and a preset reference straight line is more than 45 degrees;
and the horizontal line determining submodule is used for determining the potential straight line as the horizontal line when the included angle between the potential straight line and a preset reference straight line is less than or equal to 45 degrees.
Preferably, the edge forming submodule includes:
and the quadrangle forming submodule is used for obtaining a candidate structure edge according to any two horizontal lines and any two vertical lines, and the shape of the candidate structure edge is a quadrangle.
Preferably, the line detection module includes: a hough transform submodule, or a radon transform submodule.
Preferably, the constraints comprise at least: one of an angle constraint, an aspect ratio constraint, an area constraint, a recommended region constraint, and a score matrix constraint.
Preferably, the apparatus further comprises a model training module, the model training module comprising:
the sample acquisition submodule is used for acquiring an image sample;
the sample labeling submodule is used for labeling the image sample, and determining an image block taking a structural edge pixel point as a center as a positive sample and an image block taking a non-structural edge pixel point as a center as a negative sample;
the characteristic obtaining submodule is used for obtaining multi-channel characteristics of the positive sample and the negative sample;
and the model obtaining submodule is used for training a machine learning classifier according to the image sample and the multi-channel characteristics of the positive sample and the negative sample to obtain an edge detection model.
Preferably, the machine learning classifier comprises a random forest classifier or a CNN classifier.
Preferably, when the machine learning classifier is a random forest classifier, the multi-channel features include: color channel characteristics, gradient magnitude channel characteristics, gradient direction channel characteristics.
Compared with the prior art, the embodiment of the invention has the following advantages:
the method and the device detect the structural edge of the image to be detected through the edge detection model, generate an edge score matrix with the same size as the image to be detected, and have high scores of interested structural edge pixel points and low scores of uninteresting structural edge pixel points and non-structural edge pixel points; performing linear detection on the edge scoring matrix, and obtaining candidate structure edges according to a linear detection result; finally determining the target structure edge in the candidate structure edges according to a preset constraint condition; the method positions the structure edge according to the edge scoring matrix, the line detection and the constraint condition, has stronger robustness to illumination, shadow and the like, and ensures that the positioning of the image structure edge is more accurate.
Drawings
FIG. 1 is a flowchart illustrating a method for locating edges of an image structure according to an embodiment of the present invention;
FIG. 2 is a flowchart illustrating a step of determining candidate structure edges in a method for locating an edge of an image structure according to an embodiment of the present invention;
FIG. 3 is a flowchart illustrating a step of line classification in a method for locating an edge of an image structure according to an embodiment of the present invention;
FIG. 4 is a flowchart illustrating steps of obtaining an edge detection model in a method for locating an edge of an image structure according to an embodiment of the present invention;
FIG. 5 is a block diagram of an apparatus for locating an edge of an image structure according to an embodiment of the present invention;
FIG. 6 is a block diagram of an edge candidate module in an apparatus for locating an edge of an image structure according to an embodiment of the present invention;
FIG. 7 is a block diagram of a straight line classification sub-module in an apparatus for locating an edge of an image structure according to an embodiment of the present invention;
FIG. 8 is a block diagram of an edge forming sub-module in an apparatus for locating an edge of an image structure according to an embodiment of the present invention;
FIG. 9 is a block diagram of a model training module in an apparatus for locating an edge of an image structure according to an embodiment of the present invention.
Detailed Description
In order to make the aforementioned objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in further detail below.
Referring to fig. 1, a flowchart illustrating steps of a method for positioning an image structure edge according to an embodiment of the present invention is shown, which may specifically include the following steps:
step 101: and acquiring an image to be detected.
The image to be detected may be a photographed or scanned image of a document such as a business card, an identity card, or a driving licence. In optical character recognition preprocessing, it is often necessary to first extract the structure edges from the image to lay the groundwork for subsequent perspective transformation and layout analysis. Structure edges are the inherent edge patterns that may appear in an image, such as horizontal-line, vertical-line, T-shaped, and Y-shaped edges.
When this step is performed, the structure edge shape information of the image to be detected can also be obtained at the same time, so as to narrow the candidate range of the subsequent target structure edge.
Step 102: performing edge detection on the image to be detected with a pre-trained edge detection model to obtain an edge score matrix.
To build the edge detection model, various images in natural scenes can first be collected and their structure edges accurately labeled by hand; suitable edge feature vectors are then selected, and a machine learning classifier learns these features under supervision to obtain the model. The edge detection model performs edge detection on the image to be detected obtained in step 101: each pixel in the image is assigned an edge score, yielding a score matrix of the same size as the image to be detected, i.e., the edge score matrix, in which pixels on structure edges of interest score high while pixels on uninteresting structure edges and non-structure-edge pixels score low.
Step 103: performing straight line detection on the edge score matrix.
There are various methods for performing the line detection on the edge score matrix obtained in step 102. Optionally, the method of line detection may include: hough transform or radon transform.
The Hough transform converts the representation of a straight line in a Cartesian coordinate system into a point in Hough space; wherever the accumulated votes exceed a certain threshold, a straight line is declared. The Radon transform computes line integrals of the image pixels along particular angles; when the value of a line integral exceeds a certain threshold, it is regarded as a straight line.
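A minimal, score-weighted Hough accumulator over an edge score matrix might look as follows; the angular resolution, the two thresholds, and the peak-picking are illustrative assumptions, not the patent's parameters:

```python
import numpy as np

def hough_lines(score, n_theta=180, score_floor=0.5, vote_floor=4.0):
    """Score-weighted Hough transform: each high-scoring pixel votes for every
    (rho, theta) line through it; accumulator peaks above vote_floor are lines."""
    h, w = score.shape
    thetas = np.deg2rad(np.arange(n_theta))
    diag = int(np.ceil(np.hypot(h, w)))
    acc = np.zeros((2 * diag + 1, n_theta))
    ys, xs = np.nonzero(score > score_floor)
    for y, x in zip(ys, xs):
        rhos = np.round(x * np.cos(thetas) + y * np.sin(thetas)).astype(int)
        acc[rhos + diag, np.arange(n_theta)] += score[y, x]   # weighted vote
    peaks = np.argwhere(acc > vote_floor)
    return [(rho - diag, np.rad2deg(thetas[t])) for rho, t in peaks]

# A horizontal run of high scores on row 3 should yield the line y = 3
# (theta = 90 degrees, rho = 3 in the normal parameterization).
score = np.zeros((10, 10))
score[3, 2:9] = 1.0
lines = hough_lines(score)
```

Weighting the votes by the edge score, rather than voting on a hard binary image, is what lets the line detector exploit the graded output of the edge detection model.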
The method may be determined according to practical applications, which is not limited in the embodiment of the present invention.
Step 104: determining candidate structure edges according to the straight line detection result.
In practical application, straight lines may be determined according to the result of the straight line detection in step 103, and then various polygonal structures may be obtained through any combination of the straight lines, and the polygonal structures are used as candidate structure edges.
Step 105: determining the target structure edge among the candidate structure edges according to a preset constraint condition.
Optionally, the constraint condition may include at least: one of an angle constraint, an aspect ratio constraint, an area constraint, a recommended region constraint, and a score matrix constraint.
In practical applications, the candidate structure edges may be filtered according to the recommended-region constraint: a candidate structure edge is retained when the overlap ratio between it and the recommended region is greater than a preset overlap threshold. The recommended region can be determined by the EdgeBoxes algorithm, a fast region-of-interest recommendation algorithm, from the edge score matrix and the edge feature vectors, and the preset overlap threshold can be determined from historical experience.
The candidate structure edges may also be filtered according to the score matrix constraint: a candidate structure edge is retained when its edge score is greater than a preset score threshold, which can likewise be determined from historical experience.
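A sketch of the score matrix constraint, assuming the edge score of a candidate is the mean of the score matrix sampled along its four sides; the sampling scheme, threshold, and helper names are assumptions for illustration:

```python
import numpy as np

def side_score(score, p, q, n_samples=50):
    """Mean edge score sampled along the straight side from corner p to corner q.
    Corners are (row, col) pairs."""
    ts = np.linspace(0.0, 1.0, n_samples)
    ys = np.clip(np.round(p[0] + ts * (q[0] - p[0])).astype(int), 0, score.shape[0] - 1)
    xs = np.clip(np.round(p[1] + ts * (q[1] - p[1])).astype(int), 0, score.shape[1] - 1)
    return score[ys, xs].mean()

def passes_score_constraint(score, corners, score_threshold=0.5):
    """Score matrix constraint: keep a candidate quadrangle only if the mean
    score over its four sides exceeds the preset threshold."""
    sides = [(corners[i], corners[(i + 1) % 4]) for i in range(4)]
    return np.mean([side_score(score, p, q) for p, q in sides]) > score_threshold

score = np.zeros((20, 20))
score[5, 5:16] = score[15, 5:16] = 1.0     # strong horizontal edges
score[5:16, 5] = score[5:16, 15] = 1.0     # strong vertical edges
good = [(5, 5), (5, 15), (15, 15), (15, 5)]   # lies on the scored edges
bad = [(2, 2), (2, 18), (18, 18), (18, 2)]    # lies in the zero-score region
```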
In practical application, for the condition that the structure edge shape of the image to be detected is obtained in advance, corresponding constraint conditions can be preset according to the structure edge shape of the image to be detected, so that the target structure edge can be selected from candidate structure edges more quickly. For example, an angle constraint condition, an aspect ratio constraint condition, an area constraint condition, and the like corresponding to the structure edge shape of the image to be detected may be preset, and the candidate structure edge may be filtered according to the preset constraint condition.
The angle constraint retains a candidate structure edge when its angles meet a given condition. For example, when the structure edge of the image to be detected is known to be rectangular, the angle constraint can be set to retain a candidate quadrangle only when all four of its angles lie in the range of 70 to 110 degrees, filtering it out if any angle falls outside this range.
The aspect-ratio constraint retains a candidate structure edge when its aspect ratio lies within a given range. For example, for a rectangular structure edge, the constraint may retain the candidate when its aspect ratio is within ±30% of a standard aspect ratio obtained in advance, and filter it out otherwise.
The area constraint retains a candidate structure edge when its area lies within a given range. For example, the constraint may retain the candidate when its area is within ±30% of the area of a standard structure obtained in advance, and filter it out otherwise.
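The three geometric constraints, with the example numbers above (70 to 110 degrees, ±30%), can be sketched as follows; the corner ordering convention and helper names are illustrative:

```python
import math

def quad_angles(corners):
    """Interior angles (degrees) of a quadrangle given as four (x, y) corners in order."""
    angles = []
    for i in range(4):
        p, q, r = corners[i - 1], corners[i], corners[(i + 1) % 4]
        v1 = (p[0] - q[0], p[1] - q[1])
        v2 = (r[0] - q[0], r[1] - q[1])
        cos_a = (v1[0] * v2[0] + v1[1] * v2[1]) / (math.hypot(*v1) * math.hypot(*v2))
        angles.append(math.degrees(math.acos(max(-1.0, min(1.0, cos_a)))))
    return angles

def shoelace_area(corners):
    """Polygon area via the shoelace formula."""
    return 0.5 * abs(sum(corners[i - 1][0] * corners[i][1] - corners[i][0] * corners[i - 1][1]
                         for i in range(4)))

def passes_geometry(corners, std_ratio, std_area):
    """Angle (70 to 110 degrees), aspect-ratio (within 30%) and area (within 30%)
    constraints from the text, applied to one candidate quadrangle."""
    if not all(70.0 <= a <= 110.0 for a in quad_angles(corners)):
        return False
    s = sorted(math.dist(corners[i - 1], corners[i]) for i in range(4))
    ratio = (s[2] + s[3]) / (s[0] + s[1])        # avg long sides / avg short sides
    if not (0.7 * std_ratio <= ratio <= 1.3 * std_ratio):
        return False
    return 0.7 * std_area <= shoelace_area(corners) <= 1.3 * std_area

rect = [(0, 0), (100, 0), (100, 60), (0, 60)]      # clean 100 x 60 rectangle
skewed = [(0, 0), (100, 0), (140, 60), (40, 60)]   # sheared: corner angles fail
ok_rect = passes_geometry(rect, std_ratio=100 / 60, std_area=6000)
ok_skew = passes_geometry(skewed, std_ratio=100 / 60, std_area=6000)
```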
In practical applications, the constraint condition may be selected according to practical application situations, which is not limited in the embodiment of the present invention.
For the case where multiple candidate structure edges remain after filtering by the above constraints, a weighted combination of constraints can be used to determine the optimal target structure edge.
For example, when the structure edge of the image to be detected is known in advance to be rectangular, constraint filtering may first retain only the quadrangular candidate structure edges, and an angle constraint weight w1 may then be defined as:

w1 = sin(α1) × sin(α2) × sin(α3) × sin(α4)

where αi is the i-th included angle of the candidate structure edge. Each factor is close to 1 when its angle is close to 90 degrees, so w1 is close to 1 when all four included angles are close to 90 degrees; multiplying the weights of the four angles gives the angle constraint weight;
An aspect ratio constraint weight w2 may also be defined:

w2 = min(w/h, R) / max(w/h, R)

where R is the standard aspect ratio of the rectangular structure edge of the image to be detected, and w and h are the averages of the two long sides and of the two short sides of the candidate quadrangle, respectively. The closer the aspect ratio of the candidate quadrangle is to the standard aspect ratio, the closer w2 is to 1;
The edge score E of the candidate quadrangle may also be defined:

E = L1 + L2 + L3 + L4

where Li is the score of the i-th side of the candidate quadrangle, i.e., the sum of the scores of all pixel points on that side;
A final score S of the candidate quadrangle may then be defined:

S = w1 × w2 × E

The candidate quadrangle with the highest final score S is the target structure edge.
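A worked instance of the final score S = w1 × w2 × E, using a sin-based angle weight and a min/max ratio weight consistent with the description above (assumed forms, since the original weight formulas appear only as images in the source):

```python
import math

def angle_weight(angles_deg):
    """w1: product of four per-angle weights, each close to 1 near 90 degrees
    (sin-based form assumed here)."""
    return math.prod(math.sin(math.radians(a)) for a in angles_deg)

def ratio_weight(w, h, std_ratio):
    """w2: closer to 1 as the candidate aspect ratio w/h approaches the standard R
    (min/max form assumed here)."""
    ratio = w / h
    return min(ratio, std_ratio) / max(ratio, std_ratio)

def final_score(angles_deg, w, h, std_ratio, side_scores):
    """S = w1 * w2 * E, with E the total of the four side scores."""
    return angle_weight(angles_deg) * ratio_weight(w, h, std_ratio) * sum(side_scores)

# A near-rectangular candidate outranks a skewed one with identical side scores.
s_good = final_score([90, 90, 90, 90], w=100, h=60, std_ratio=100 / 60, side_scores=[10] * 4)
s_skew = final_score([60, 120, 60, 120], w=100, h=60, std_ratio=100 / 60, side_scores=[10] * 4)
```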
In practical applications, the weighted combination mode of the constraint conditions may be selected and defined according to practical situations, which is not limited in the embodiment of the present invention.
In summary, the method and the device detect the structure edges of the image to be detected with an edge detection model, generating an edge score matrix of the same size as the image, in which pixels on structure edges of interest score high while pixels on uninteresting structure edges and non-structure-edge pixels score low. Straight-line detection is performed on the edge score matrix, candidate structure edges are obtained from the detection result, and the target structure edge is finally determined among the candidates according to preset constraint conditions. Because the structure edge is located from the edge score matrix, the line detection, and the constraints, the method is more robust to illumination, shadow, and similar factors, and positions image structure edges more accurately.
In another embodiment of the present application, referring to fig. 2, the step 104 may further include:
step 201: and determining a potential straight line according to the straight line detection result.
From the result of the straight line detection in step 103, for example the peaks of the Hough or Radon transform, the potential straight lines, i.e., the potential sides of the candidate structure edges, can be determined.
Step 202: from the angles of the potential straight lines, the horizontal and vertical lines are determined.
In practical application, a reference straight line or a reference plane may be predefined, and the potential straight line is divided into two groups, i.e. a horizontal line and a vertical line, according to the size of the included angle between the potential straight line and the reference straight line or the reference plane. The method can be determined according to practical applications, and the embodiment of the invention is not limited thereto.
Step 203: and obtaining candidate structure edges according to the horizontal lines and the vertical lines.
In practical application, a plurality of polygons can be obtained by arbitrarily combining the horizontal lines and the vertical lines, and the polygons are used as candidate structural edges. For example, a triangle may be formed according to any one horizontal line and any two vertical lines as a candidate structure edge; a quadrangle formed by any two horizontal lines and any two vertical lines can also be used as a candidate structure edge.
In another embodiment of the present application, referring to fig. 3, the step 202 may further include:
step 301: and when the included angle between the potential straight line and a preset reference straight line is more than 45 degrees, determining that the potential straight line is a vertical line.
Step 302: and when the included angle between the potential straight line and a preset reference straight line is less than or equal to 45 degrees, determining that the potential straight line is a horizontal line.
In practical applications, the preset reference straight line may be a straight line parallel to the horizontal plane in the image plane to be detected. Because the included angle range of the two straight lines is 0-90 degrees, the potential straight line with the included angle of more than 45 degrees with the preset reference straight line can be defined as a vertical line, and the potential straight line with the included angle of less than or equal to 45 degrees with the preset reference straight line is defined as a horizontal line.
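A minimal classifier for this rule, assuming the reference line is horizontal and each potential line is given by two endpoints (representation chosen for illustration):

```python
import math

def angle_to_reference(p, q):
    """Acute angle (0 to 90 degrees) between segment pq and a horizontal reference line."""
    dx, dy = q[0] - p[0], q[1] - p[1]
    return math.degrees(math.atan2(abs(dy), abs(dx)))

def classify_line(p, q):
    """Angle with the reference greater than 45 degrees -> vertical; otherwise horizontal."""
    return "vertical" if angle_to_reference(p, q) > 45.0 else "horizontal"

kinds = [classify_line(*seg) for seg in
         [((0, 0), (10, 1)), ((0, 0), (1, 10)), ((0, 0), (-3, 4))]]
```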
In another embodiment of the present application, the step 203 may further include:
and obtaining candidate structure edges according to any two horizontal lines and any two vertical lines, wherein the candidate structure edges are quadrilateral in shape.
Specifically, any two horizontal lines and any two vertical lines can first be combined into candidate quadrangles, and whether each quadrangle is retained is then decided according to the preset constraint conditions.
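Enumerating the candidate quadrangles from the two line groups can be sketched as follows, with each line represented as a coefficient triple (a, b, c) of a·x + b·y = c (a representation chosen for illustration):

```python
from itertools import combinations

def intersect(l1, l2):
    """Intersection point of two lines given as (a, b, c) with a*x + b*y = c."""
    (a1, b1, c1), (a2, b2, c2) = l1, l2
    det = a1 * b2 - a2 * b1
    if abs(det) < 1e-12:
        return None                       # parallel lines: no corner
    return ((c1 * b2 - c2 * b1) / det, (a1 * c2 - a2 * c1) / det)

def candidate_quadrangles(horizontals, verticals):
    """Every pair of horizontal lines combined with every pair of vertical lines
    yields one candidate quadrangle, returned as its four corner points."""
    quads = []
    for h1, h2 in combinations(horizontals, 2):
        for v1, v2 in combinations(verticals, 2):
            corners = [intersect(h1, v1), intersect(h1, v2),
                       intersect(h2, v2), intersect(h2, v1)]
            if all(c is not None for c in corners):
                quads.append(corners)
    return quads

# Two horizontal lines (y = 0, y = 60) and three vertical lines (x = 0, 50, 100)
hs = [(0, 1, 0), (0, 1, 60)]
vs = [(1, 0, 0), (1, 0, 50), (1, 0, 100)]
quads = candidate_quadrangles(hs, vs)     # C(2,2) * C(3,2) = 3 candidates
```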
In another embodiment of the present application, referring to fig. 4, the step of obtaining the edge detection model in step 102 may include:
step 401: an image sample is acquired.
In practical applications, several thousand photographed or scanned images containing structure edges, such as the business cards, identity cards, and driving licences common in natural scenes, can be collected as samples; the larger the number of samples, the more accurate the trained edge detection model.
Step 402: and labeling the image samples, and determining the image blocks taking the structural edge pixel points as the center as positive samples and the image blocks taking the non-structural edge pixel points as the center as negative samples.
In practical applications, the structure edges of the image samples can be accurately labeled by hand. An image block of suitable size centered on a pixel lying on a structure edge is taken as a positive sample, and an image block of suitable size centered on a non-structure-edge pixel is taken as a negative sample. The size of the image block may be determined from the size of the image itself and from experience, and the center of a negative-sample image block should be at least 8 pixels away from any structure edge pixel.
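A sketch of this sampling scheme; the patch size, the sample counts, and the helper names are parameterized assumptions, while the 8-pixel margin check follows the text:

```python
import numpy as np

def sample_patches(img, edge_mask, patch=16, margin=8, n_each=4, seed=0):
    """Positive patches centered on labeled edge pixels; negative patches whose
    centers are at least `margin` pixels from every edge pixel."""
    rng = np.random.default_rng(seed)
    half = patch // 2
    h, w = img.shape
    edge_pts = np.argwhere(edge_mask)

    def valid(y, x):
        return half <= y < h - half and half <= x < w - half

    def cut(y, x):
        return img[y - half:y + half, x - half:x + half]

    pos = [cut(y, x) for y, x in edge_pts if valid(y, x)][:n_each]
    neg = []
    while len(neg) < n_each:
        y, x = rng.integers(half, h - half), rng.integers(half, w - half)
        # reject centers closer than `margin` to any labeled edge pixel
        if np.min(np.hypot(edge_pts[:, 0] - y, edge_pts[:, 1] - x)) >= margin:
            neg.append(cut(y, x))
    return pos, neg

img = np.random.default_rng(1).random((64, 64))
edge_mask = np.zeros((64, 64), dtype=bool)
edge_mask[20, :] = True                    # one labeled horizontal structure edge
pos, neg = sample_patches(img, edge_mask)
```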
Step 403: and acquiring multichannel characteristics of the positive sample and the negative sample.
In practical applications, the multi-channel features of the positive and negative sample image blocks can be extracted separately, and a structure edge feature vector generated by multi-channel fusion. The feature vector extracted from an image block represents the feature value of the block's center point; that is, whether the center is an edge point is characterized by the information of the whole block.
Step 404: and training the machine learning classifier according to the image samples and the multi-channel characteristics of the positive samples and the negative samples to obtain an edge detection model.
In practical applications, inputting the multi-channel features into a machine learning classifier yields the confidence of the classification result, i.e., the edge score. The machine learning classifier performs supervised learning on the multi-channel features of the manually labeled structure edges, finally producing the edge detection model.
In this embodiment, optionally, the machine learning classifier includes a random forest classifier or a CNN convolutional neural network classifier.
In this embodiment, optionally, when the machine learning classifier is a random forest classifier, the multi-channel features include: color channel characteristics, gradient magnitude channel characteristics, gradient direction channel characteristics.
Specifically, the random forest classifier combines many randomized classification decision trees into a forest, improving on the robustness and accuracy of a single decision tree. In practical applications, when the machine learning classifier is a random forest classifier, the multi-channel features may consist of color channel features, gradient magnitude channel features, and gradient direction channel features. For example, for a 32 × 32 image block, the multi-channel features may include the R, G, B color channels, or YCbCr or Lab color channels, which are more robust to illumination; the gradient magnitude and gradient direction features can be extracted with a Gabor filter bank, for example by combining two wavelengths and two directions to generate four magnitude feature channels and four direction feature channels. Before the random forest classifier is trained, principal component analysis or the k-means method can be used to reduce the dimensionality of the extracted high-dimensional multi-channel features. Principal component analysis is a common data dimension reduction method that keeps the reduced feature vectors mutually decorrelated while preserving most of the variance of the original data. K-means is a common clustering algorithm that achieves clustering or dimension reduction by iteratively updating cluster centers to reduce a loss function.
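The principal component analysis step can be sketched with an SVD; the synthetic low-rank features below stand in for real multi-channel features, and the function name is illustrative:

```python
import numpy as np

def pca_reduce(features, n_components):
    """Project feature vectors (one per row) onto the top principal components.
    The components are orthonormal, so the reduced features are mutually
    decorrelated, as the text notes."""
    centered = features - features.mean(axis=0)
    # Rows of vt are the principal directions, ordered by explained variance.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    explained = (s[:n_components] ** 2).sum() / (s ** 2).sum()
    return centered @ vt[:n_components].T, explained

rng = np.random.default_rng(0)
# 200 fake feature vectors of dimension 64 with rank-5 structure plus small noise
latent = rng.standard_normal((200, 5))
features = latent @ rng.standard_normal((5, 64)) + 0.01 * rng.standard_normal((200, 64))
reduced, explained = pca_reduce(features, n_components=5)
```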
In practical applications, the machine learning classifier may also be a convolutional neural network (CNN) classifier. A CNN is a locally connected network whose main characteristics are local connectivity and weight sharing. Local connectivity means that, for a given pixel p in an image, nearby pixels influence p more strongly than distant ones; weight sharing means that, owing to the statistical properties of natural images, the weights learned for one region can also be applied to another region, i.e. the convolution kernels are shared across the image. When a CNN classifier is used, the multi-channel features are extracted by a set of convolution kernels: convolving one kernel with the image yields one feature channel of the image, and different kernels extract different feature channels. After the multi-channel features are obtained by convolution, features at different positions in the image may be aggregated before model training, for example by computing the mean or maximum of a given feature over a region of the image; this aggregation step is called pooling. Pooling not only reduces the dimensionality of the multi-channel features but also helps avoid over-fitting. In practical applications, for 32 x 32 image blocks, three stacked convolutional stages followed by one fully connected layer generally suffice for feature extraction and classification, where each stacked stage consists of a convolutional layer, a rectified linear unit, and a pooling layer; the convolution kernel sizes may be unified to 3 x 3, the numbers of feature channels may be 8, 16, and 32, and the pooling window size may be 2 x 2.
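The layer sizes quoted above can be checked with a short shape-propagation helper. It assumes 'same'-padded convolutions (the padding and stride are not stated in the description), so only the 2 x 2 pooling layers change the spatial size; the helper name and signature are illustrative.

```python
def cnn_output_shape(size=32, channels=(8, 16, 32), pool=2):
    """Trace feature-map sizes through the stacked stages described
    above: each stage is a 3x3 same-padded convolution (spatial size
    unchanged) followed by a 2x2 pooling layer (spatial size halved),
    while the channel count grows through 8, 16, 32."""
    shapes = []
    for c in channels:
        size = size // pool  # pooling halves the spatial size
        shapes.append((size, size, c))
    # the fully connected layer consumes the flattened last feature map
    flat = size * size * channels[-1]
    return shapes, flat
```

For a 32 x 32 input this gives feature maps of 16 x 16 x 8, 8 x 8 x 16, and 4 x 4 x 32, so the fully connected classification layer receives a 512-dimensional vector.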
The above-described method for positioning an image structure edge can be applied to positioning any structure edge with polygonal characteristics. The edges of documents commonly handled in daily life, such as business cards, identity cards, and driving licenses, have pronounced rectangular characteristics. When the structure edge of the image to be detected is rectangular, presetting constraint conditions that correspond to a rectangle allows the target structure edge to be selected from the candidate structure edges more accurately and quickly; likewise, when determining the candidate structure edges, only the quadrilaterals formed by any two horizontal lines and any two vertical lines are taken as candidates, and the target structure edge is finally determined in combination with the constraint conditions.
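The candidate-construction rule described above (every pair of horizontal lines combined with every pair of vertical lines) is purely combinatorial and can be sketched as follows; the function name is illustrative, and lines may be any descriptors (e.g. rho-theta pairs from a Hough transform).

```python
import itertools

def candidate_quads(horizontals, verticals):
    """Each choice of two horizontal lines and two vertical lines
    defines one candidate quadrilateral structure edge; the preset
    constraints (angle, aspect ratio, area, etc.) are applied to this
    candidate set afterwards to pick the target structure edge."""
    quads = []
    for h1, h2 in itertools.combinations(horizontals, 2):
        for v1, v2 in itertools.combinations(verticals, 2):
            quads.append((h1, h2, v1, v2))
    return quads
```

With m horizontal and n vertical lines this yields C(m,2) x C(n,2) candidates, which is why the constraint filtering step matters for speed as well as accuracy.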
It should be noted that, for simplicity of description, the method embodiments are described as a series of acts or combination of acts, but those skilled in the art will recognize that the present invention is not limited by the illustrated order of acts, as some steps may occur in other orders or concurrently in accordance with the embodiments of the present invention. Further, those skilled in the art will appreciate that the embodiments described in the specification are presently preferred and that no particular act is required to implement the invention.
Referring to fig. 5, a structural block diagram of a device for locating an image structure edge according to an embodiment of the present invention is shown, which may specifically include the following modules:
an image obtaining module 501, configured to obtain an image to be detected;
an edge detection module 502, configured to perform edge detection on the image to be detected by using a pre-trained edge detection model to obtain an edge score matrix;
a line detection module 503, configured to perform line detection on the edge score matrix;
an edge candidate module 504, configured to determine a candidate structure edge according to a result of the straight line detection module 503;
an edge determining module 505, configured to determine a target structure edge among the candidate structure edges according to a preset constraint condition.
In this embodiment, optionally, the straight line detecting module 503 may include: a Hough transform submodule, or a Radon transform submodule.
In this embodiment, optionally, the constraint conditions in the edge determination module 505 at least include: one of an angle constraint, an aspect ratio constraint, an area constraint, a recommended region constraint, and a score matrix constraint.
In another embodiment of the present application, referring to fig. 6, the edge candidate module 504 may further include:
a straight line determining submodule 601, configured to determine a potential straight line according to a result of the straight line detecting module 503;
a line classification submodule 602, configured to determine a horizontal line and a vertical line according to the angle of the potential line;
an edge forming sub-module 603, configured to obtain candidate structure edges according to the horizontal lines and the vertical lines.
In another embodiment of the present application, referring to fig. 7, the straight line classification sub-module 602 may further include:
a vertical line determining submodule 701, configured to determine that a potential straight line is a vertical line when the angle between the potential straight line and a preset reference straight line is greater than 45 degrees;
a horizontal line determining submodule 702, configured to determine that a potential straight line is a horizontal line when the angle between the potential straight line and the preset reference straight line is smaller than or equal to 45 degrees.
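The 45-degree rule implemented by submodules 701 and 702 can be sketched as a single function. Folding an arbitrary line orientation into an acute angle against a horizontal reference is our assumption, since the description leaves the reference line unspecified.

```python
def classify_line(angle_deg):
    """Classify a detected line as horizontal or vertical by the
    45-degree threshold: fold the orientation into the acute angle
    against the horizontal reference, then compare with 45."""
    acute = abs(angle_deg) % 180
    if acute > 90:
        acute = 180 - acute
    return "vertical" if acute > 45 else "horizontal"
```

For example, a line at 80 degrees is classified as vertical, while lines at 10 or 170 degrees are horizontal.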
In another embodiment of the present application, referring to fig. 8, the edge forming sub-module 603 may further include:
a quadrilateral forming sub-module 801, configured to obtain candidate structure edges according to any two horizontal lines and any two vertical lines, the candidate structure edges being quadrilateral in shape.
In another embodiment of the present application, referring to fig. 9, the edge detection module 502 may further include a model training module 900, and the model training module 900 may include:
a sample obtaining submodule 901 for obtaining an image sample;
the sample labeling submodule 902 is configured to label an image sample, determine an image block with a structural edge pixel point as a center as a positive sample, and determine an image block with a non-structural edge pixel point as a center as a negative sample;
a feature obtaining submodule 903, configured to obtain multichannel features of the positive sample and the negative sample;
and a model obtaining sub-module 904, configured to train a machine learning classifier according to the image samples and the multi-channel features of the positive samples and the negative samples, to obtain an edge detection model.
In this embodiment, optionally, the machine learning classifier in the model obtaining sub-module 904 may include a random forest classifier or a CNN classifier.
In this embodiment, optionally, when the machine learning classifier is a random forest classifier, the multi-channel features in the feature acquisition sub-module 903 may include: color channel characteristics, gradient magnitude channel characteristics, gradient direction channel characteristics.
For the device embodiment, since it is substantially similar to the method embodiment, the description is relatively brief; for relevant details, refer to the corresponding parts of the description of the method embodiment.
The embodiments in the present specification are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, apparatus, or computer program product. Accordingly, embodiments of the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, embodiments of the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
Embodiments of the present invention are described with reference to flowchart illustrations and/or block diagrams of methods, terminal devices (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing terminal to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing terminal, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing terminal to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing terminal to cause a series of operational steps to be performed on the computer or other programmable terminal to produce a computer implemented process such that the instructions which execute on the computer or other programmable terminal provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present invention have been described, additional variations and modifications of these embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all such alterations and modifications as fall within the scope of the embodiments of the invention.
Finally, it should also be noted that, herein, relational terms such as first and second may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between those entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or terminal that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or terminal. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other like elements in a process, method, article, or terminal that comprises the element.
The method for positioning an image structure edge and the device for positioning an image structure edge provided by the present invention have been described in detail above. Specific examples are used herein to explain the principle and implementation of the present invention, and the description of the above embodiments is only intended to help in understanding the method of the present invention and its core idea. Meanwhile, for those skilled in the art, there may be variations in the specific embodiments and the application scope according to the idea of the present invention. In summary, the content of this specification should not be construed as limiting the present invention.

Claims (6)

1. A method for locating an edge of an image structure, the method comprising:
acquiring an image to be detected;
adopting a pre-trained edge detection model to carry out edge detection on the image to be detected to obtain an edge scoring matrix;
performing straight line detection on the edge score matrix;
determining candidate structure edges according to the straight line detection result;
determining a target structure edge in the candidate structure edges according to a preset constraint condition;
wherein the step of training the edge detection model comprises: acquiring an image sample;
labeling the image samples, and determining image blocks taking the structural edge pixel points as the center as positive samples and image blocks taking the non-structural edge pixel points as the center as negative samples;
acquiring multi-channel features of the positive samples and the negative samples, wherein the multi-channel features comprise color channel features, gradient magnitude channel features, and gradient direction channel features, and the color channel features are YCbCr or Lab color channel features;
and training a machine learning classifier according to the image sample and the multi-channel characteristics of the positive sample and the negative sample to obtain an edge detection model.
2. The method of claim 1, wherein the step of determining the candidate structure edge according to the straight line detection result comprises:
determining a potential straight line according to the straight line detection result;
determining a horizontal line and a vertical line according to the angle of the potential straight line;
and obtaining candidate structure edges according to the horizontal lines and the vertical lines.
3. The method of claim 2, wherein the step of obtaining candidate structure edges based on the horizontal lines and the vertical lines comprises:
and obtaining a candidate structure edge according to any two horizontal lines and any two vertical lines, wherein the candidate structure edge is quadrilateral in shape.
4. An apparatus for locating an edge of an image structure, the apparatus comprising:
the image acquisition module is used for acquiring an image to be detected;
the edge detection module is used for carrying out edge detection on the image to be detected by adopting a pre-trained edge detection model to obtain an edge scoring matrix;
the straight line detection module is used for carrying out straight line detection on the edge scoring matrix;
an edge candidate module for determining candidate structure edges according to the results of the line detection module;
an edge determining module, configured to determine a target structure edge from the candidate structure edges according to a preset constraint condition;
the apparatus further includes a model training module, the model training module including:
the sample acquisition submodule is used for acquiring an image sample;
the sample labeling submodule is used for labeling the image sample, and determining an image block taking a structural edge pixel point as a center as a positive sample and an image block taking a non-structural edge pixel point as a center as a negative sample;
a feature obtaining submodule, configured to acquire multi-channel features of the positive samples and the negative samples, wherein the multi-channel features consist of color channel features, gradient magnitude channel features, and gradient direction channel features, and the color channel features are YCbCr or Lab color channel features;
and the model obtaining submodule is used for training a machine learning classifier according to the image sample and the multi-channel characteristics of the positive sample and the negative sample to obtain an edge detection model.
5. The apparatus of claim 4, wherein the edge candidate module comprises:
the straight line determining submodule is used for determining a potential straight line according to the result of the straight line detecting module;
the line classification submodule is used for determining a horizontal line and a vertical line according to the angle of the potential line;
and the edge forming submodule is used for obtaining a candidate structure edge according to the horizontal line and the vertical line.
6. The apparatus of claim 5, wherein the edge forming submodule comprises:
and the quadrangle forming submodule is used for obtaining a candidate structure edge according to any two horizontal lines and any two vertical lines, and the shape of the candidate structure edge is a quadrangle.
CN201710517455.4A 2017-06-29 2017-06-29 Image structure edge positioning method and device Active CN107464245B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710517455.4A CN107464245B (en) 2017-06-29 2017-06-29 Image structure edge positioning method and device

Publications (2)

Publication Number Publication Date
CN107464245A CN107464245A (en) 2017-12-12
CN107464245B true CN107464245B (en) 2020-08-18

Family

ID=60544060

Country Status (1)

Country Link
CN (1) CN107464245B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108921864B (en) * 2018-06-22 2022-02-15 广东工业大学 Light strip center extraction method and device
CN111260876B (en) * 2018-11-30 2022-02-25 北京欣奕华科技有限公司 Image processing method and device
CN109598737B (en) * 2018-12-04 2021-01-12 广东智媒云图科技股份有限公司 Image edge identification method and system
CN109886302A (en) * 2019-01-21 2019-06-14 河北新兴铸管有限公司 Caliber judgment method and terminal device based on machine learning
CN110276346B (en) * 2019-06-06 2023-10-10 北京字节跳动网络技术有限公司 Target area recognition model training method, device and computer readable storage medium

Citations (4)

Publication number Priority date Publication date Assignee Title
CN103065140A (en) * 2012-12-30 2013-04-24 信帧电子技术(北京)有限公司 Location method and device for automobile logos
CN103425984A (en) * 2013-08-15 2013-12-04 北京京北方信息技术有限公司 Method and device for detecting regular polygonal seal in bill
CN105488791A (en) * 2015-11-25 2016-04-13 北京奇虎科技有限公司 Method and apparatus for locating image edge in natural background
CN106157308A (en) * 2016-06-30 2016-11-23 北京大学 Rectangular target object detecting method

Family Cites Families (1)

Publication number Priority date Publication date Assignee Title
DE102006059663B4 (en) * 2006-12-18 2008-07-24 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Apparatus, method and computer program for identifying a traffic sign in an image

Non-Patent Citations (1)

Title
"Research on edge detection methods based on machine learning" (in Chinese); Zhao Tongzhou, Wang Haihui, Xu Didi; Journal of Hubei University (Natural Science Edition); 30 Sep. 2011; Vol. 33, No. 3, pp. 370-372 *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant