CN112906708A - Picture processing method and device, electronic equipment and computer storage medium - Google Patents


Info

Publication number
CN112906708A
CN112906708A (application CN202110332466.1A)
Authority
CN
China
Prior art keywords
corner
picture
processed
boundary point
feature map
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110332466.1A
Other languages
Chinese (zh)
Other versions
CN112906708B (en)
Inventor
张子浩
杨家博
李兵
Current Assignee
Beijing Century TAL Education Technology Co Ltd
Original Assignee
Beijing Century TAL Education Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing Century TAL Education Technology Co Ltd filed Critical Beijing Century TAL Education Technology Co Ltd
Priority to CN202110332466.1A priority Critical patent/CN112906708B/en
Publication of CN112906708A publication Critical patent/CN112906708A/en
Application granted granted Critical
Publication of CN112906708B publication Critical patent/CN112906708B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/44Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/24Aligning, centring, orientation detection or correction of the image
    • G06V10/247Aligning, centring, orientation detection or correction of the image by affine transforms, e.g. correction due to perspective effects; Quadrilaterals, e.g. trapezoids

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

The embodiment of the application provides a picture processing method and apparatus, an electronic device and a computer storage medium. The method comprises the following steps: performing feature extraction on a picture to be processed to obtain at least six-channel feature maps, including a first corner feature map used for representing the upper left corner information of the picture content of the picture to be processed, a second corner feature map used for representing the lower left corner information of the picture content, a third corner feature map used for representing the upper right corner information of the picture content, a fourth corner feature map used for representing the lower right corner information of the picture content, and a fifth corner feature map and a sixth corner feature map used for representing boundary point information of two opposite sides corresponding to the picture content; determining, according to the at least six-channel feature maps, the upper left corner position, the lower left corner position, the upper right corner position, the lower right corner position, the first boundary point position and the second boundary point position of the picture content; and correcting the picture to be processed according to the corner positions to obtain a corrected picture. The scheme can better preserve the original content of the picture.

Description

Picture processing method and device, electronic equipment and computer storage medium
Technical Field
The present application relates to the field of computer technologies, and in particular, to a method and an apparatus for processing an image, an electronic device, and a computer storage medium.
Background
Pictures are widely used in people's work and daily life. However, in common scenarios, many pictures, especially user-taken pictures containing content such as text or drawings, are captured at a certain skewed angle, so that the content in the picture is distorted to some extent, for example presenting a certain slant (such as narrow at the top and wide at the bottom, or wide at the top and narrow at the bottom). Therefore, in order to better acquire the picture content carried in the picture, the picture needs to be rectified.
At present, a common method of picture rectification is to find the 4 straight lines at the edges of the picture content through a Hough transform, and then perform a perspective transform on the picture to obtain the final result.
However, cropping the picture content along these 4 edge lines can easily cut off information at the edges of the picture content, which is unfavorable for preserving the original content of the picture.
Disclosure of Invention
In view of the above, embodiments of the present application provide a picture processing method, an apparatus, an electronic device, and a computer storage medium to at least partially solve the above problem.
In a first aspect, an embodiment of the present application provides an image processing method, including:
performing feature extraction on a picture to be processed to obtain at least six-channel feature maps, wherein the at least six-channel feature maps comprise: a first corner feature map used for representing the upper left corner information of the picture content of the picture to be processed, a second corner feature map used for representing the lower left corner information of the picture content of the picture to be processed, a third corner feature map used for representing the upper right corner information of the picture content of the picture to be processed, a fourth corner feature map used for representing the lower right corner information of the picture content of the picture to be processed, and a fifth corner feature map and a sixth corner feature map used for representing the boundary point information of two opposite sides corresponding to the picture content of the picture to be processed;
determining the corner position of the picture content of the picture to be processed according to the at least six-channel feature map, including: the position of the upper left corner, the position of the lower left corner, the position of the upper right corner, the position of the lower right corner, and the positions of a first boundary point and a second boundary point on the two opposite edges;
and according to the corner position, correcting the picture to be processed to obtain a corrected picture.
In a second aspect, an embodiment of the present application further provides an image processing apparatus, including:
the processing module is used for extracting features of a picture to be processed to obtain at least six-channel feature maps, wherein the at least six-channel feature maps comprise: a first corner feature map used for representing the upper left corner information of the picture content of the picture to be processed, a second corner feature map used for representing the lower left corner information of the picture content of the picture to be processed, a third corner feature map used for representing the upper right corner information of the picture content of the picture to be processed, a fourth corner feature map used for representing the lower right corner information of the picture content of the picture to be processed, and a fifth corner feature map and a sixth corner feature map used for representing the boundary point information of two opposite sides corresponding to the picture content of the picture to be processed;
a determining module, configured to determine, according to the at least six-channel feature map obtained by the processing module, a corner position of picture content of the picture to be processed, including: the position of the upper left corner, the position of the lower left corner, the position of the upper right corner, the position of the lower right corner, and the positions of a first boundary point and a second boundary point on the two opposite edges;
and the correction module is used for correcting the picture to be processed according to the corner position determined by the determination module to obtain a corrected picture.
In a third aspect, an embodiment of the present application further provides an electronic device, including a processor and a memory, wherein the processor is connected with the memory, the memory stores a computer program, and the processor is configured to execute the computer program to implement the picture processing method provided in any one of the possible implementation manners of the first aspect.
In a fourth aspect, an embodiment of the present application further provides a computer storage medium, wherein the computer storage medium stores a computer program, and when the computer program is executed by a processor, the picture processing method provided in any one of the possible implementation manners of the first aspect is implemented.
According to the technical scheme, the picture to be processed may have a skewed angle, which causes problems when the picture is subsequently processed and may result in loss of the original picture content. Therefore, in the scheme of the embodiment of the application, feature extraction is performed on the picture to be processed so as to extract feature information in the picture to be processed, especially feature maps representing information of the corners of the part where the picture content is located, including at least six-channel feature maps: a first corner feature map, a second corner feature map, a third corner feature map, a fourth corner feature map, a fifth corner feature map and a sixth corner feature map. The upper left corner position, the lower left corner position, the upper right corner position, the lower right corner position, the first boundary point position and the second boundary point position of the picture content in the picture to be processed are then determined based on the at least six-channel feature maps. The skewed angle of the picture content in the picture to be processed is thus determined based on the corner positions, and the correction mode of the picture to be processed is further determined, so that information at the edges of the picture content can be effectively prevented from being cut off, and the original content of the picture can be better retained.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings needed to be used in the description of the embodiments or the prior art will be briefly introduced below, and it is obvious that the drawings in the following description are some embodiments of the present application, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without creative efforts.
Fig. 1 is a flowchart of a picture processing method according to an embodiment of the present application;
fig. 2 is a flowchart of another picture processing method according to the second embodiment of the present application;
FIG. 3 is a schematic diagram of a seven-channel feature diagram provided in the second embodiment of the present application;
fig. 4 is a schematic diagram of corner positions of picture contents in a picture to be processed according to a second embodiment of the present application;
fig. 5 is a schematic diagram of a divided first sub-picture provided in the second embodiment of the present application;
fig. 6 is a schematic diagram of a second divided sub-picture provided in the second embodiment of the present application;
fig. 7 is a schematic diagram of a first sub-picture and a second sub-picture after being stitched according to a second embodiment of the present application;
fig. 8 is a schematic diagram of a picture processing apparatus according to a third embodiment of the present application;
fig. 9 is a schematic view of an electronic device according to a fourth embodiment of the present application.
Detailed Description
In order to make those skilled in the art better understand the technical solutions in the embodiments of the present application, the technical solutions in the embodiments of the present application will be described clearly and completely below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all embodiments. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments provided in the present application shall fall within the scope of protection of the embodiments of the present application.
Example one
Fig. 1 is a flowchart of an image processing method according to an embodiment of the present application. Referring to fig. 1, the method comprises the steps of:
101. Extracting features of the picture to be processed to obtain at least six-channel feature maps, wherein the at least six-channel feature maps comprise: a first corner feature map used for representing the upper left corner information of the picture content of the picture to be processed, a second corner feature map used for representing the lower left corner information of the picture content of the picture to be processed, a third corner feature map used for representing the upper right corner information of the picture content of the picture to be processed, a fourth corner feature map used for representing the lower right corner information of the picture content of the picture to be processed, and a fifth corner feature map and a sixth corner feature map used for representing the boundary point information of two opposite sides corresponding to the picture content of the picture to be processed.
In the embodiment of the present application, the two opposite sides corresponding to the picture content of the picture to be processed may be the upper and lower boundaries, or the left and right boundaries, of the picture content of the picture to be processed.
The two opposite edges corresponding to the picture content are two edges facing each other, and the included angle between the two edges is larger than a preset included-angle threshold. For example, due to the user's shooting angle, the picture content of the picture to be processed is deformed in shape, for example into a quadrangle (e.g., a trapezoid), a pentagon (e.g., a quadrangle missing any one corner), or a hexagon (e.g., a quadrangle missing any two corners). The two opposite sides may be the two legs of the trapezoid, or the two sides opposite each other in the vertical or horizontal direction. The picture content of the picture to be processed may include, but is not limited to, text, drawings, and the like. The display direction of the picture content may be the forward direction that is convenient for people to read, or a direction with a certain deflection angle that is inconvenient to read; therefore, in order to reduce the deformation of the picture content to the maximum extent during correction, the two opposite sides may also be two opposite sides along the forward direction of the picture content.
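As an illustration of the included-angle test described above, the following sketch computes the angle between two opposite edges of the content region (the function name, point values and 5-degree threshold are illustrative assumptions, not part of the patent):

```python
import math

def included_angle_deg(edge_a, edge_b):
    """Return the included angle (degrees) between two edge direction vectors.

    Each edge is given as ((x0, y0), (x1, y1)). Truly parallel opposite
    sides give 0; a skewed (trapezoidal) content region gives a larger angle.
    """
    (ax0, ay0), (ax1, ay1) = edge_a
    (bx0, by0), (bx1, by1) = edge_b
    ua, va = ax1 - ax0, ay1 - ay0
    ub, vb = bx1 - bx0, by1 - by0
    cos = (ua * ub + va * vb) / (math.hypot(ua, va) * math.hypot(ub, vb))
    ang = math.degrees(math.acos(max(-1.0, min(1.0, cos))))
    return min(ang, 180.0 - ang)  # direction-independent angle

# Trapezoid whose left and right sides converge toward the top
left = ((10, 0), (0, 100))     # leans outward going down
right = ((90, 0), (100, 100))  # leans outward going down
print(included_angle_deg(left, right) > 5.0)  # exceeds a 5-degree threshold
```

A pair of parallel sides (e.g. the two parallel sides of the trapezoid) would return an angle of 0 and fail such a threshold test.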
The picture to be processed may be any suitable picture, including but not limited to a picture carrying text information, a picture including pictorial information, and the like.
The feature extraction of the picture to be processed can be implemented by a person skilled in the art in any appropriate manner. In the embodiment of the application, the feature extraction can be performed by means of a neural network model that has the function of detecting a target object, in particular detecting the corner points of the picture content. The neural network model can be implemented with structures such as CNN or ResNet.
The at least six-channel feature map obtained through feature extraction comprises a first corner feature map, a second corner feature map, a third corner feature map, a fourth corner feature map, a fifth corner feature map and a sixth corner feature map. Different feature maps can effectively represent the features of the regions where different corner points are located.
102. Determining the corner positions of the picture content of the picture to be processed according to the at least six-channel feature maps, including: the position of the upper left corner, the position of the lower left corner, the position of the upper right corner, the position of the lower right corner, and the positions of the first boundary point and the second boundary point on the two opposite edges.
Each feature map in the at least six-channel feature maps represents information of a corner point or a boundary point of the picture content of the picture to be processed, so the corner positions and boundary point positions of the picture content can be determined from the at least six-channel feature maps. The degree of deformation of the picture content can subsequently be determined from this information, and the picture to be processed can then be corrected in an appropriate manner.
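A minimal sketch of how the corner and boundary point positions might be read out of the six-channel feature maps (the per-channel argmax decoding and the channel ordering are assumptions for illustration, not prescribed by the patent):

```python
import numpy as np

def decode_corner_positions(feature_maps):
    """Take a (6, H, W) stack of corner heatmaps and return one (x, y)
    position per channel: upper-left, lower-left, upper-right, lower-right,
    first boundary point, second boundary point (channel order assumed)."""
    positions = []
    for heatmap in feature_maps:
        iy, ix = np.unravel_index(np.argmax(heatmap), heatmap.shape)
        positions.append((int(ix), int(iy)))
    return positions

# Toy 6-channel map with one hot pixel per channel
maps = np.zeros((6, 8, 8))
peaks = [(1, 1), (1, 6), (6, 1), (6, 6), (0, 3), (7, 4)]
for ch, (x, y) in enumerate(peaks):
    maps[ch, y, x] = 1.0
print(decode_corner_positions(maps) == peaks)  # True
```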
The upper left corner position, the lower left corner position, the upper right corner position and the lower right corner position respectively represent the top-leftmost, bottom-leftmost, top-rightmost and bottom-rightmost positions of the picture content. The first boundary point position and the second boundary point position may each be a position on the boundary of the picture content (other than the four corner positions); for example, the first or second boundary point position may be located on the boundary between the upper left corner position and the upper right corner position, or on the boundary between the upper right corner position and the lower right corner position, or on the boundary between the lower left corner position and the lower right corner position, or on the boundary between the upper left corner position and the lower left corner position. However, the first boundary point position and the second boundary point position are not on the same boundary of the picture content.
For example, the first boundary point position may be the intersection of the left boundary of the picture content and the left boundary of the picture to be processed (e.g., position 6 in fig. 4), and the second boundary point position may be the intersection of the right boundary of the picture content and the right boundary of the picture to be processed (e.g., position 3 in fig. 4). However, in some pictures to be processed, the boundary of the picture content may not intersect the boundary of the picture to be processed, in which case the first boundary point position and the second boundary point position may be an appropriate position on the left boundary and an appropriate position on the right boundary of the picture content, respectively. In one feasible manner, these may be positions near the corresponding lower left corner and lower right corner positions. The embodiment of the present application does not limit how near the boundary point position is to the corner position; for example, it may lie 1/3, 1/4 or 1/5 of the way down the line connecting the upper left corner position and the lower left corner position, and similarly on the right side.
103. According to the corner positions, performing correction processing on the picture to be processed to obtain a corrected picture.
According to the corner positions, the relationship between the boundary of the picture content and the boundary of the picture to be processed can be determined, and an appropriate correction processing strategy can then be selected accordingly.
For example, if it is determined according to the corner positions that the boundary of the picture content intersects the boundary of the picture to be processed at its middle section, the picture to be processed can be appropriately split and the parts corrected separately; if it is determined that the boundary of the picture content does not intersect the boundary of the picture to be processed, or intersects it only at the lower left corner or the lower right corner, the picture to be processed can be corrected directly. This ensures that the corrected picture retains the original content as much as possible.
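The correction itself is typically a perspective (homography) warp. The numpy sketch below solves the standard 8-equation homography system that maps four detected corner positions onto an upright rectangle; in practice a library routine such as OpenCV's getPerspectiveTransform would be used, and the point values here are illustrative:

```python
import numpy as np

def perspective_matrix(src, dst):
    """Solve the 3x3 homography mapping 4 src points to 4 dst points
    (direct linear solve of the standard 8-equation system)."""
    A, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y]); b.append(u)
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y]); b.append(v)
    h = np.linalg.solve(np.array(A, float), np.array(b, float))
    return np.append(h, 1.0).reshape(3, 3)

def warp_point(H, pt):
    """Apply the homography to a single (x, y) point."""
    x, y, w = H @ np.array([pt[0], pt[1], 1.0])
    return (x / w, y / w)

# Map a trapezoidal content region onto an upright rectangle
src = [(12, 0), (0, 100), (88, 0), (100, 100)]   # UL, LL, UR, LR (skewed)
dst = [(0, 0), (0, 100), (100, 0), (100, 100)]   # rectified corners
H = perspective_matrix(src, dst)
print(all(np.allclose(warp_point(H, s), d) for s, d in zip(src, dst)))  # True
```

When the picture is split into sub-pictures as described above, the same warp would simply be solved once per sub-picture before stitching.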
According to the scheme provided by the embodiment of the application, the picture to be processed may have a skewed angle, which causes problems when the picture is subsequently processed and may result in loss of the original picture content. Therefore, in the scheme of the embodiment of the application, feature extraction is performed on the picture to be processed so as to extract feature information in the picture to be processed, especially feature maps representing information of the corners of the part where the picture content is located, including at least six-channel feature maps: a first corner feature map, a second corner feature map, a third corner feature map, a fourth corner feature map, a fifth corner feature map and a sixth corner feature map. The upper left corner position, the lower left corner position, the upper right corner position, the lower right corner position, the first boundary point position and the second boundary point position of the picture content in the picture to be processed are then determined based on the at least six-channel feature maps. The skewed angle of the picture content in the picture to be processed is thus determined based on the corner positions, and the correction mode of the picture to be processed is further determined, so that information at the edges of the picture content can be effectively prevented from being cut off, and the original content of the picture can be better retained.
Example two
This embodiment describes the picture processing method provided in the second embodiment of the present application in detail, taking as an example a seven-channel feature map, where the first boundary point is a left boundary point on the left boundary of the picture content of the picture to be processed, and the second boundary point is a right boundary point on the right boundary of the picture content of the picture to be processed. However, it should be understood by those skilled in the art that the picture processing scheme of the present application can also be implemented with reference to this embodiment by using the six-channel feature map described in the first embodiment, or a feature map with more than seven channels, or by taking the first and second boundary point positions as positions on other boundaries of the picture content of the picture to be processed. In addition, in this embodiment, feature extraction is performed on the picture to be processed by means of a corner detection neural network model; therefore, the training process of the corner detection neural network model is explained first.
Referring to fig. 2, the method comprises the steps of:
the training process of the angular point detection neural network model comprises the following steps:
201. and inputting the picture training sample into the angular point detection neural network model for feature extraction to obtain a seven-channel sample feature map.
Firstly, a picture training sample containing corner position labeling information of the picture content is obtained, wherein the corner position labeling information comprises upper left corner position information, lower left corner position information, upper right corner position information, lower right corner position information, first boundary point position information and second boundary point position information of the picture content. The corner labeling information may be obtained through manual labeling, or through labeling by a model other than the corner detection neural network model. The first boundary point position information and the second boundary point position information may be set by a person skilled in the art according to actual requirements; for example, they may be boundary point information on the boundary between the upper left corner and the upper right corner of the picture content, or on the boundary between the upper left corner and the lower left corner, or on the boundary between the lower left corner and the lower right corner, or on the boundary between the upper right corner and the lower right corner, which is not limited in the embodiment of the present application.
Secondly, the picture training sample is input into the corner detection neural network model, and feature extraction is performed on it through the model to obtain a seven-channel sample feature map. The seven channels of sample feature maps comprise: a first corner sample feature map representing the upper left corner information of the picture content of the picture training sample, a second corner sample feature map representing the lower left corner information, a third corner sample feature map representing the upper right corner information, a fourth corner sample feature map representing the lower right corner information, a fifth corner sample feature map and a sixth corner sample feature map representing the boundary point information of two opposite sides corresponding to the picture content, and a background sample feature map used for representing the background information of the picture content of the picture training sample.
It should be noted that, in the embodiment of the present application, the specific form and structure adopted by the corner detection neural network model are not limited, as long as the positions of the at least six corners of the picture content can be detected. For example, the corner detection neural network model can be implemented as a Convolutional Neural Network (CNN) model, with the features of the picture to be processed extracted by the convolutional layers of the CNN; or it can be implemented as a residual network (ResNet) model, with feature extraction performed through the convolutional layers and residual layers, and so on. Taking ResNet as an example, the algorithm, channels, parameters and the like of the convolutional part of the ResNet are configured so that it extracts features from the picture and finally outputs a seven-channel feature map.
It should be noted that, compared with the six-channel feature map described in the first embodiment, adding the background feature map enables the corner detection neural network model to identify the picture content more accurately, and thus to better predict the corner positions of the picture to be processed. However, as mentioned above, the background feature map may be omitted in practical applications.
202. Predicting the corner predicted positions of the picture content of the picture training sample according to the seven-channel sample feature map.
As described above, the corner detection neural network model is configured to output a seven-channel feature map; in the training phase, a seven-channel sample feature map is output. From each sample feature map in the seven channels, the corner position within a certain region can be effectively predicted to obtain the corresponding corner predicted position. In this process, because the number of pixel points in the corresponding part of the picture within each region is usually large, in order to improve the accuracy of the corner predicted position, the corner region prediction can be performed through a Gaussian distribution to obtain, for each pixel point in the predicted corner region, the probability that it is the corner. Then, the pixel point with the maximum probability in each corner region is determined as the corner of that region, and the corner position of the determined corner is obtained.
In a feasible manner, when determining the pixel point with the maximum probability in each corner region as a corner, normalization processing can first be performed on the probability that each pixel point is a corner, so as to eliminate the influence of different magnitudes between features and map the probability of each pixel point into (0, 1). Optionally, the normalization processing may take the form of a sigmoid function.
After normalization processing, the pixel point with the highest probability in each corner region can be used as the predicted corner according to the result of normalization processing. And finally, determining the position of the corner point in the picture training sample, namely the corner point prediction position.
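A compact sketch of this normalize-then-argmax step (the raw score values are hypothetical stand-ins for network output):

```python
import numpy as np

def sigmoid(z):
    # Squashes raw scores into (0, 1), as in the normalization step above.
    return 1.0 / (1.0 + np.exp(-z))

def predict_corner(score_map):
    """Normalize one feature-map channel and return the (row, col) of the
    pixel with the highest corner probability."""
    probs = sigmoid(score_map)
    return np.unravel_index(np.argmax(probs), probs.shape)

scores = np.full((128, 128), -3.0)   # hypothetical raw network scores
scores[25, 40] = 5.0
corner = predict_corner(scores)      # -> (25, 40)
```

Note that the sigmoid is monotonic, so it does not change which pixel wins the argmax; its value is in making probabilities comparable on a common (0, 1) scale.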
203. And obtaining a corresponding loss value according to the corner predicted position, the corner position marking information and a preset loss function.
In this embodiment, the loss value is calculated by the following loss function:

MSE = (1/n) * Σ_{i=1}^{n} (y_i − ŷ_i)²

where MSE represents the loss value of the loss function, y_i represents the coordinate value of the i-th coordinate point of the corner predicted position of the picture training sample, ŷ_i represents the coordinate value of the i-th coordinate point of the corner position marking information of the picture training sample, and n represents the number of coordinate points of the corner position marking information.
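Numerically, this loss is an ordinary mean squared error over the coordinate values; a small numpy sketch with hypothetical predicted and annotated coordinates:

```python
import numpy as np

def mse_loss(y_pred, y_true):
    """MSE = (1/n) * sum_i (y_i - y_hat_i)^2 over the n coordinate values."""
    y_pred = np.asarray(y_pred, dtype=float)
    y_true = np.asarray(y_true, dtype=float)
    return float(np.mean((y_pred - y_true) ** 2))

# Hypothetical predicted vs. annotated corner coordinates, flattened as
# (x1, y1, x2, y2, ...).
pred  = [10.0, 12.0, 100.0, 11.0, 101.0, 98.0]
label = [10.0, 10.0, 100.0, 10.0, 100.0, 100.0]
loss = mse_loss(pred, label)   # (0 + 4 + 0 + 1 + 1 + 4) / 6
```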
204. And training the corner detection neural network model according to the loss value.
After the loss values are obtained, back propagation can be performed according to the loss values to adjust each model parameter of the corner detection neural network model.
It should be noted that, since the corner detection neural network model needs to be trained through a large number of picture training samples, the above steps 201 to 204 are executed iteratively until a training termination condition is reached, for example, when the loss value reaches a preset threshold or the number of training iterations reaches a preset number.
Through the above process, the training of the corner detection neural network model is completed, and the trained model can be reused subsequently. Based on the trained corner detection neural network model, the subsequent inference stage can be performed, that is, the corner detection neural network model is used for detecting the corners of a picture.
(II) Inference process of the corner detection neural network model:
205. And performing feature extraction on the picture to be processed by using a pre-trained corner detection neural network model to obtain a seven-channel feature map.
After the training of the corner detection neural network model is completed, the corner positions of the picture to be processed can be detected by using the model.
For example, a to-be-processed picture with a size of 512 × 512 × 3 may be subjected to feature extraction by using a RESNET50 model, so as to obtain a seven-channel feature map with a size of 128 × 128 × 7. The seven-channel feature map comprises a first corner feature map used for representing the upper left corner information of the picture content of the picture to be processed, a second corner feature map used for representing the lower left corner information of the picture content of the picture to be processed, a third corner feature map used for representing the upper right corner information of the picture content of the picture to be processed, a fourth corner feature map used for representing the lower right corner information of the picture content of the picture to be processed, a fifth corner feature map used for representing the left boundary information of the picture content of the picture to be processed, a sixth corner feature map used for representing the right boundary information of the picture content of the picture to be processed, and a background feature map used for representing the background information of the picture to be processed.
206. And performing corner region prediction on each feature map except the background feature map in the seven-channel feature map through a Gaussian distribution to obtain the probability that each pixel point in the corner region is a corner.
The Gaussian distribution is a normal distribution. In order to accurately predict the corner region, the feature vector data corresponding to each feature map may be processed by using a Gaussian distribution function to obtain a central region corresponding to each feature map; the central region is the corner region, and the pixel points that may be corners lie within it. The likelihood that a certain pixel point is a corner is represented by its corner probability.
207. And normalizing the probability of each pixel point as an angular point, determining the pixel point with the maximum probability in each angular point region as the angular point according to the normalization processing result, and obtaining the angular point position of the determined angular point.
The corner positions include: an upper left corner position, a lower left corner position, an upper right corner position, a lower right corner position, a first boundary point position, and a second boundary point position. Once the corner in each corner region is determined, the corresponding corner position can be determined.
The following takes the first corner feature map of the top left corner information for representing the picture content of the picture to be processed as an example to illustrate the determination of the top left corner position.
Since the pixel points in the predicted central region of the first corner feature map are the most likely candidates for the upper left corner, corner region prediction can be performed using a Gaussian distribution to obtain the probability that each pixel point in the corner region is the upper left corner.
The obtained probabilities are mapped to the (0, 1) interval by the sigmoid function. Then, by comparing the probabilities of the pixel points, the pixel point with the maximum probability is selected as the upper left corner. Further, the coordinates of the upper left corner in the feature map can be mapped in a subsequent step, such as step 208, so that the corner position of the upper left corner can be determined.
Fig. 3 is a schematic illustration of the predicted positions of corner points. In fig. 3, a mark a represents an upper left corner position of a picture content of a picture to be processed, a mark b represents an upper right corner position of the picture content, a mark c represents a second boundary point position of the picture content, a mark d represents a lower right corner position of the picture content, a mark e represents a lower left corner position of the picture content, a mark f represents a first boundary point position of the picture content, and a mark g represents background information of the picture to be processed.
208. And determining coordinates corresponding to the left boundary point position and the two corner point positions on the same side as the left boundary point position, and the right boundary point position and the two corner point positions on the same side as the right boundary point position in the picture to be processed respectively.
In this embodiment, since the position of the left boundary point is located on the left boundary of the picture content, the two corner point positions on the same side as the position of the left boundary point are two end points of the left boundary of the picture content, namely, the upper left corner and the lower left corner.
Since the position of the right boundary point is located on the right boundary of the picture content, two corner point positions on the same side as the position of the right boundary point are two end points of the right boundary of the picture content, namely, the upper right corner and the lower right corner.
It can be understood that, if the position of the first boundary point is located on the boundary between the upper left corner position and the upper right corner position, and the boundary is the upper boundary of the picture content of the picture to be processed, the two corner point positions on the same side as the position of the first boundary point are the upper left corner position and the upper right corner position. If the position of the first boundary point is located on the boundary between the lower left corner position and the lower right corner position, the boundary is the lower boundary of the picture content of the picture to be processed, and therefore, the two corner point positions on the same side as the position of the first boundary point are the lower left corner position and the lower right corner position.
It should be noted that some corner detection neural network models perform a downsampling operation during feature extraction. For example, a feature map obtained by feature extraction using the RESNET50 model is typically reduced by a factor of 4 compared to the original picture. In this case, if the coordinate of the upper left corner in the first corner feature map is (x, y), its coordinate mapped back into the picture to be processed is (4x, 4y), and the coordinate (4x, 4y) is the upper left corner position of the upper left corner in the picture to be processed.
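Under the 4x-downsampling assumption above, mapping feature-map coordinates back to the picture to be processed is a single multiplication per axis:

```python
def to_picture_coords(x, y, stride=4):
    """Map a corner coordinate from the downsampled feature map back to the
    original picture (stride 4 assumed, matching a RESNET50-style backbone)."""
    return x * stride, y * stride

# A corner found at (32, 17) in a 128 x 128 feature map lands at (128, 68)
# in the 512 x 512 picture to be processed.
pos = to_picture_coords(32, 17)
```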
209. And calculating, according to the determined coordinates, a first vector and a second vector corresponding to the left boundary point position and the two corner point positions on the same side as the left boundary point position.
210. And calculating, according to the determined coordinates, a third vector and a fourth vector corresponding to the right boundary point position and the two corner point positions on the same side as the right boundary point position.
In a possible way, the difference of the coordinates can be obtained as the corresponding vector by directly subtracting the coordinates.
It should be noted that, in this embodiment, the first vector and the second vector are obtained first and the third vector and the fourth vector are obtained afterwards only as an example; in practical applications, steps 209 and 210 need not be executed in this order and may also be executed in parallel.
211. And determining a first position relation between the position of the left boundary point and two corner points corresponding to the position according to the first vector and the second vector.
Specifically, in this embodiment, the first positional relationship is used to represent whether the left boundary point, the upper left corner, and the lower left corner are collinear.
212. And determining a second position relation between the position of the right boundary point and two corner points corresponding to the right boundary point according to the third vector and the fourth vector.
Specifically, in this embodiment, the second positional relationship is used to represent whether the right boundary point, the upper right corner, and the lower right corner are collinear.
After the corner positions of the picture content of the picture to be processed are determined, it needs to be judged whether the upper left corner, the upper right corner, the lower left corner and the lower right corner of the picture content are blocked. Taking the upper left corner position, the left boundary point position and the lower left corner position as an example, the vector (coordinates of the left boundary point position − coordinates of the upper left corner position) is denoted as vector a, and the vector (coordinates of the left boundary point position − coordinates of the lower left corner position) is denoted as vector b. The positional relationship among the upper left corner position, the left boundary point position and the lower left corner position of the picture content is judged according to vector a and vector b in either of the following two ways:
the first mode is as follows: by determining whether the included angle between the vector a and the vector b is within a preset first interval, if the included angle is within the first interval, the position of the upper left corner, the position of the left boundary point and the position of the lower left corner are collinear, and the straight line is from the position of the upper left corner to the position of the lower left corner. Otherwise, the upper left corner position, the left boundary position and the lower left corner position are considered not to form a straight line. Wherein the first interval is [175,180] or [178,180 ]. It is understood that the first interval can be set by those skilled in the art as appropriate according to actual needs.
The second way: determine whether the difference between the slopes of vector a and vector b is within a preset second interval. If it is within the second interval, the upper left corner position, the left boundary point position and the lower left corner position are collinear, and the straight line runs from the upper left corner position to the lower left corner position. Otherwise, the upper left corner position, the left boundary point position and the lower left corner position are considered not to form a straight line. The second interval is, for example, [−0.5, 0.5] or [−0.1, 0.1]. It can be understood that the second interval can be set appropriately by those skilled in the art according to actual needs.
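The first (included-angle) test can be sketched in a few lines; the 175°–180° interval comes from the example above, while the sample coordinates are hypothetical:

```python
import math

def angle_deg(a, b):
    # Included angle between vectors a and b, in degrees.
    dot = a[0] * b[0] + a[1] * b[1]
    cos = dot / (math.hypot(*a) * math.hypot(*b))
    return math.degrees(math.acos(max(-1.0, min(1.0, cos))))

def collinear_by_angle(top, boundary, bottom, lo=175.0, hi=180.0):
    """Vector a runs from the boundary point to the top corner, vector b
    from the boundary point to the bottom corner; the three points are
    taken as collinear when the included angle lies in [lo, hi] degrees."""
    a = (top[0] - boundary[0], top[1] - boundary[1])
    b = (bottom[0] - boundary[0], bottom[1] - boundary[1])
    return lo <= angle_deg(a, b) <= hi

# Boundary point on the segment between the two corners -> ~180 degrees.
on_line = collinear_by_angle((0, 0), (0, 50), (0, 100))    # collinear
# Boundary point pushed sideways off the segment -> far from 180 degrees.
off_line = collinear_by_angle((0, 0), (30, 50), (0, 100))  # not collinear
```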
The calculation method of the position relationship among the upper right corner position, the right boundary position and the lower right corner position is the same as above, and is not described herein again.
213. And according to the first position relation and the second position relation, correcting the picture to be processed to obtain a corrected picture.
The left boundary point position, the upper left corner position and the lower left corner position of the picture content of the picture to be processed may or may not lie on the same line. Similarly, the right boundary point position, the upper right corner position and the lower right corner position of the picture content may or may not lie on the same line. Therefore, the positional relationship between the corner positions may fall into the following cases:
the first condition is as follows: the position relations of the left boundary position, the upper left corner position and the lower left corner position of the picture content are collinear, and the position relations of the right boundary position, the upper right corner position and the lower right corner position of the picture content are not collinear (namely, the upper right corner position or the lower right corner position is blocked).
Case two: the position relationships of the left boundary position, the upper left corner position and the lower left corner position of the picture content are not collinear (that is, there is a case that the upper left corner position or the lower left corner position is blocked), and the position relationships of the right boundary position, the upper right corner position and the lower right corner position of the picture content are collinear.
Case three: the position relations of the left boundary position, the upper left corner position and the lower left corner position of the picture content are not collinear (namely, the condition that the upper left corner position or the lower left corner position is blocked), and the position relations of the right boundary position, the upper right corner position and the lower right corner position of the picture content are not collinear (namely, the condition that the upper right corner position or the lower right corner position is blocked).
Case four: the position relations of the left boundary position, the upper left corner position and the lower left corner position of the picture content are collinear, and the position relations of the right boundary position, the upper right corner position and the lower right corner position of the picture content are collinear, that is, the upper left corner position, the lower left corner position and the left boundary point position are on the same line, and the upper right corner position, the lower right corner position and the right boundary point position are on the same line.
For case one, case two, and case three:
In case one, case two and case three, corners that are not collinear exist in the picture to be processed, that is, at least one of the upper left corner, lower left corner, upper right corner and lower right corner of the picture content is blocked. At this time, sequentially connecting the upper left corner position, the upper right corner position, the right boundary point position, the lower right corner position, the lower left corner position and the left boundary point position of the picture content in the picture to be processed forms a hexagon, and the picture to be processed needs to be split; the split sub-pictures are corrected respectively and then spliced.
For example, a quadrangle can be obtained by sequentially connecting the upper left corner position, the upper right corner position, the right boundary point position and the left boundary point position of the picture content in the picture to be processed, and this quadrangle is divided from the picture to be processed to obtain a first sub-picture for the upper left corner position, the upper right corner position, the right boundary point position and the left boundary point position.
And sequentially connecting the left boundary point position, the right boundary point position, the lower right corner position and the lower left corner position to obtain another quadrangle, and dividing the quadrangle from the picture to be processed to obtain a second sub-picture aiming at the left boundary point position, the right boundary point position, the lower right corner position and the lower left corner position.
And determining the widths corresponding to the first sub-picture and the second sub-picture respectively, and correcting the first sub-picture and the second sub-picture based on the width with the maximum numerical value. By the mode, on one hand, the standard of correction can be unified, and on the other hand, the corrected pictures can be spliced more easily.
For example, the distance from the upper left corner position to the upper right corner position of the first sub-picture is 13 cm, and the distance from the left boundary point position to the right boundary point position of the first sub-picture is 18 cm. The distance from the left boundary point position to the right boundary point position of the second sub-picture is likewise 18 cm, and the distance from the lower left corner position to the lower right corner position of the second sub-picture is 16 cm. The distance from the left boundary point position to the right boundary point position is the largest, so the first sub-picture and the second sub-picture are both corrected based on the distance from the left boundary point position to the right boundary point position. However, the present application is not limited thereto; any other suitable way of splitting and rectifying the picture and determining the rectification reference is applicable to the solution of the embodiment of the present application.
And splicing the corrected first sub-picture and the corrected second sub-picture according to the relative positions of the corrected first sub-picture and the corrected second sub-picture in the picture to be processed.
For case four:
If the position relations of the left boundary position, the upper left corner position and the lower left corner position of the picture content of the picture to be processed are collinear, and the position relations of the right boundary position, the upper right corner position and the lower right corner position of the picture content are collinear, the picture content is quadrilateral in shape. Distortion of the shape of the picture content is usually caused by the user's shooting angle; since the upper left corner, the lower left corner, the upper right corner and the lower right corner of the picture content are all within the picture to be processed, the picture does not need to be split, and the shape of the picture content can be directly corrected based on the corner positions.
At this time, the picture content may be divided from the picture to be processed based on the corner position. For example, the upper left corner position, the upper right corner position, the right boundary point position, the lower right corner position, the lower left corner position, and the left boundary point position of the picture content in the picture to be processed are sequentially connected, and a corresponding picture content area is defined for correction processing.
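The patent leaves the concrete correction operation open; one common realization for case four is a perspective (homography) warp fitted to the four detected corner positions. A numpy-only sketch with hypothetical coordinates, solving the standard 8-unknown linear system (h33 fixed to 1):

```python
import numpy as np

def fit_homography(src, dst):
    """3x3 perspective transform mapping 4 src points onto 4 dst points."""
    A, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y]); b.append(u)
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y]); b.append(v)
    h = np.linalg.solve(np.asarray(A, float), np.asarray(b, float))
    return np.append(h, 1.0).reshape(3, 3)

def warp(H, p):
    # Apply the homography to a point in homogeneous coordinates.
    x, y, w = H @ np.array([p[0], p[1], 1.0])
    return x / w, y / w

# Hypothetical detected corners (TL, TR, BR, BL) of the skewed picture
# content, mapped onto an axis-aligned target rectangle.
src = [(12, 8), (498, 20), (505, 380), (5, 390)]
dst = [(0, 0), (512, 0), (512, 384), (0, 384)]
H = fit_homography(src, dst)
# warp(H, src[i]) lands on dst[i]; a real pipeline would then resample
# the pixels, e.g. with an inverse-warp bilinear lookup.
```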
Fig. 4 shows the detected position 1 of the upper left corner, the detected position 2 of the upper right corner, the detected position 3 of the right boundary, the detected position 4 of the lower right corner, the detected position 5 of the lower left corner, and the detected position 6 of the left boundary, where the identifier 7 represents the background information of the picture to be processed, and the identifier 8 represents the picture content of the picture to be processed, where the lower left corner and the lower right corner of the picture content in the picture to be processed are blocked due to the shooting angle problem.
Fig. 5 is a schematic diagram of a first sub-picture partitioned from a to-be-processed picture according to an upper left corner position, an upper right corner position, a right boundary position, and a left boundary position of the picture content when a lower left corner and a lower right corner of the picture content are blocked.
Fig. 6 is a schematic diagram of a second sub-picture partitioned from the to-be-processed picture according to a left boundary position, a right boundary position, a lower right corner position, and a lower left corner position of the picture content when the lower left corner and the lower right corner of the picture content are blocked.
Fig. 7 is a schematic diagram of the first sub-picture Z1 and the second sub-picture Z2 divided from the picture to be processed after being respectively subjected to rectification processing and splicing.
It should be noted that, the specific implementation means of the above-mentioned correction processing and splicing processing can be set by those skilled in the art according to actual needs, and the embodiment of the present application does not limit this.
In this embodiment, the picture to be processed may have a distorted angle, so that problems may occur when the picture to be processed is subsequently processed and the original picture content may be lost. Therefore, in the scheme of the embodiment of the application, feature extraction is performed on the picture to be processed, so that feature information in the picture to be processed, especially feature maps representing information of corners of the part where the picture content is located, can be extracted, and the feature maps include at least six-channel feature maps of a first corner feature map, a second corner feature map, a third corner feature map, a fourth corner feature map, a fifth corner feature map and a sixth corner feature map. The upper left corner position, the lower left corner position, the upper right corner position, the lower right corner position, the first boundary point position and the second boundary point position of the picture content in the picture to be processed are then determined based on the at least six-channel feature map. Thus, the distorted angle of the picture content in the picture to be processed is determined based on the corner positions, and the correction mode of the picture to be processed is further determined, so that information at the edge of the picture content can be effectively prevented from being cut off, and the original content of the picture can be better retained.
EXAMPLE III
Fig. 8 is a schematic diagram of a picture processing apparatus according to a third embodiment of the present application, and referring to fig. 8, the apparatus includes:
a processing module 801, configured to perform feature extraction on a picture to be processed to obtain at least six channel feature maps, where the at least six channel feature maps include: a first corner feature map used for representing the upper left corner information of the picture content of the picture to be processed, a second corner feature map used for representing the lower left corner information of the picture content of the picture to be processed, a third corner feature map used for representing the upper right corner information of the picture content of the picture to be processed, a fourth corner feature map used for representing the lower right corner information of the picture content of the picture to be processed, and a fifth corner feature map and a sixth corner feature map used for respectively representing the boundary point information of two opposite sides of the picture content of the picture to be processed;
a determining module 802, configured to determine, according to the at least six-channel feature map obtained by the processing module 801, a corner position of picture content of the picture to be processed, where the determining module includes: the position of the upper left corner, the position of the lower left corner, the position of the upper right corner, the position of the lower right corner, and the positions of a first boundary point and a second boundary point on the two opposite edges;
a correcting module 803, configured to perform correction processing on the to-be-processed picture according to the corner position determined by the determining module 802, so as to obtain a corrected picture.
In the embodiment of the present application, the to-be-processed picture may have a malformed angle, so that a problem may occur when the to-be-processed picture is subsequently processed correspondingly, and the original picture content may be lost. Therefore, in the scheme of the embodiment of the application, feature extraction is performed on the picture to be processed, so that feature information in the picture to be processed, especially feature maps representing information of corners of the part where the picture content is located, can be extracted, and the feature maps include at least six-channel feature maps of a first corner feature map, a second corner feature map, a third corner feature map, a fourth corner feature map, a fifth corner feature map and a sixth corner feature map. And determining the position of the upper left corner, the position of the lower left corner, the position of the upper right corner, the position of the lower right corner, the position of the first boundary point and the position of the second boundary point of the picture content in the picture to be processed based on the at least six-channel feature map. Therefore, the malformed angle of the picture content in the picture to be processed is determined based on the corner position, and the correction mode of the picture to be processed is further determined, so that the information at the edge of the picture content can be effectively prevented from being cut, and the original content of the picture can be better reserved.
In a possible implementation manner, the processing module 801 is configured to perform, for each feature map in the at least six-channel feature map, corner region prediction through gaussian distribution to obtain a probability that each pixel point in the corner region is a corner; and determining the pixel point with the maximum probability in each corner region as a corner, and obtaining the determined corner position of the corner.
In a possible implementation manner, the processing module 801 is configured to perform normalization processing on the probability that each pixel is an angular point, and determine, according to a result of the normalization processing, a pixel with the highest probability in each angular point region as an angular point.
In a possible implementation manner, the correcting module 803 is configured to determine two corner positions on the same side as the first boundary point position and two corner positions on the same side as the second boundary point position, respectively; determining a first position relation between the first boundary point and two corner points corresponding to the first boundary point and a second position relation between the second boundary point and two corner points corresponding to the second boundary point according to a coordinate relation between the position of the first boundary point and two corner points on the same side of the first boundary point and a coordinate relation between the position of the second boundary point and two corner points on the same side of the second boundary point; and according to the first position relation and the second position relation, correcting the picture to be processed to obtain a corrected picture.
In a possible implementation manner, the correcting module 803 is configured to determine coordinates of the first boundary point position and two corner point positions on the same side of the first boundary point position, and corresponding coordinates of the second boundary point position and two corner point positions on the same side of the second boundary point position in the to-be-processed picture, respectively; according to the coordinates, calculating a first vector and a second vector corresponding to the position of the first boundary point and the positions of two corner points on the same side of the first boundary point, and calculating a third vector and a fourth vector corresponding to the position of the second boundary point and the positions of two corner points on the same side of the second boundary point; determining a first position relation between the first boundary point and two corner points corresponding to the first boundary point according to the first vector and the second vector; and determining a second position relation between the second boundary point and two corner points corresponding to the second boundary point according to the third vector and the fourth vector.
In a possible implementation manner, the correcting module 803 is configured to, if two corner point positions on the same side as the first boundary point position are an upper left corner position and a lower left corner position, respectively, and two corner point positions on the same side as the second boundary point position are an upper right corner position and a lower right corner position, respectively;
if the first position relationship represents that the first boundary point position, the upper left corner position and the lower left corner position are collinear, and the second position relationship represents that the second boundary point position, the upper right corner position and the lower right corner position are collinear, obtaining a to-be-processed picture including the picture content according to the corner positions, and correcting the to-be-processed picture.
In a possible implementation manner, the correcting module 803 is configured to, if two corner point positions on the same side as the first boundary point position are an upper left corner position and a lower left corner position, respectively, and two corner point positions on the same side as the second boundary point position are an upper right corner position and a lower right corner position, respectively;
if the first position relationship represents that the first boundary point position, the upper left corner position and the lower left corner position are not collinear, or the second position relationship represents that the second boundary point position, the upper right corner position and the lower right corner position are not collinear, splitting the picture to be processed according to the corner point positions to obtain at least two split sub-pictures, and correcting the at least two split sub-pictures.
In a possible implementation manner, the correcting module 803 is configured to split a first sub-picture from the to-be-processed picture according to an upper-left corner position, an upper-right corner position, a second boundary point position, and a first boundary point position in the corner position; splitting a second sub-picture from the picture to be processed according to a first boundary point position, a second boundary point position, a lower right corner position and a lower left corner position in the corner positions; performing rectification processing on the first sub-picture and the second sub-picture; and splicing the first sub-picture after the correction processing and the second sub-picture after the correction processing according to the relative positions of the first sub-picture and the second sub-picture in the picture to be processed.
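The splitting described above can be sketched as follows (a minimal illustration; the argument names and the assumption that the first and second boundary points lie on the left and right edges are the editor's, not the patent's):

```python
def split_quads(tl, tr, br, bl, b1, b2):
    """Form the two sub-picture quadrilaterals from the six detected positions.

    tl/tr/br/bl are the upper-left, upper-right, lower-right and lower-left
    corner positions; b1 and b2 are the first and second boundary points.
    """
    first_sub = [tl, tr, b2, b1]    # upper-left, upper-right, second boundary, first boundary
    second_sub = [b1, b2, br, bl]   # first boundary, second boundary, lower-right, lower-left
    return first_sub, second_sub

# A page "folded" at mid-height: the boundary points sit inward of the edges.
first, second = split_quads((0, 0), (10, 0), (10, 10), (0, 10), (1, 5), (9, 5))
print(first)   # [(0, 0), (10, 0), (9, 5), (1, 5)]
print(second)  # [(1, 5), (9, 5), (10, 10), (0, 10)]
```

Each quadrilateral would then be rectified independently (for example with a perspective transform) and the two rectified sub-pictures stitched along the shared boundary-point edge, preserving their relative positions in the to-be-processed picture.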
In a possible implementation manner, the processing module 801 is configured to perform feature extraction on a picture to be processed by using a pre-trained corner detection neural network model, so as to obtain at least six-channel feature maps; the corner detection neural network model is trained by using a picture training sample containing corner position marking information, wherein the corner position marking information comprises: the picture content comprises upper left corner position information, lower left corner position information, upper right corner position information, lower right corner position information, first boundary point position information and second boundary point position information.
In a possible implementation manner, the processing module 801 is configured to input the picture training sample into the corner detection neural network model for feature extraction, so as to obtain a sample feature map of at least six channels; predicting and obtaining the corner point prediction position of the picture content of the picture training sample according to the sample feature map of the at least six channels; obtaining a corresponding loss value according to the corner predicted position, the corner position marking information and a preset loss function; and training the corner detection neural network model according to the loss value.
In one possible implementation, the loss function is:

$$MSE = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2$$

wherein $MSE$ characterizes the loss value of the loss function, $y_i$ represents the coordinate value of the $i$-th coordinate point of the corner predicted position of the picture training sample, $\hat{y}_i$ represents the coordinate value of the $i$-th coordinate point of the corner position marking information of the picture training sample, and $n$ represents the number of coordinate points of the corner position marking information.
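As a hedged sketch of the loss described above (assuming the predicted and labeled corner coordinates are flattened into equal-length lists; this is not the patent's code):

```python
def mse_loss(predicted, labeled):
    """Mean squared error between predicted corner coordinates and labels.

    predicted, labeled: equal-length sequences of coordinate values
    (the x and y values of all annotated corner points, flattened).
    """
    if len(predicted) != len(labeled):
        raise ValueError("coordinate lists must have the same length")
    n = len(predicted)
    return sum((y - y_hat) ** 2 for y, y_hat in zip(labeled, predicted)) / n

# Four coordinate values, each off by 1, give a loss of 1.0:
print(mse_loss([1, 2, 3, 4], [2, 3, 4, 5]))  # 1.0
```

In training, this scalar would be back-propagated to update the corner detection neural network's parameters.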
In a possible implementation manner, the at least six-channel feature map is a seven-channel feature map, and the seven-channel feature map further includes a background feature map used for representing background information of the picture to be processed.
In a possible implementation manner, the two opposite edges are two opposite edges having an included angle greater than a preset included angle threshold.
In a possible implementation manner, the two opposite edges are two edges opposite in a forward direction of the picture content.
Example four
Based on the picture processing methods described in the first and second embodiments, an embodiment of the present application provides an electronic device for executing the picture processing methods in the first and second embodiments. Fig. 9 is a schematic view of an electronic device according to a fourth embodiment of the present application. Referring to fig. 9, the electronic device 90 provided in the fourth embodiment of the present application includes: at least one processor (processor) 902, a memory 904, a bus 906, and a communication interface (communication interface) 908. Wherein:
the processor 902, communication interface 908, and memory 904 communicate with one another via a communication bus 906.
A communication interface 908 for communicating with other devices.
The processor 902 is configured to execute the program 910, and may specifically perform the relevant steps in the methods described in the first embodiment, the second embodiment, and the third embodiment.
In particular, the program 910 may include program code that includes computer operating instructions.
The processor 902 may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement the embodiments of the present application. The electronic device comprises one or more processors, which may be of the same type, such as one or more CPUs, or of different types, such as one or more CPUs and one or more ASICs.
The memory 904 is configured to store the program 910. The memory 904 may comprise high-speed RAM, and may also include non-volatile memory, such as at least one disk memory.
EXAMPLE five
An embodiment of the present application provides a computer storage medium, including: the computer storage medium stores a computer program, and when the computer program is executed by a processor, the picture processing method as described in any embodiment of the present application is implemented.
In the embodiment of the present application, the picture content in the to-be-processed picture may have a distorted angle, so that problems may occur when the to-be-processed picture is subsequently processed and the original picture content may be lost. Therefore, in the scheme of the embodiment of the application, feature extraction is performed on the to-be-processed picture so that feature information in the picture, in particular feature maps representing the corner information of the part where the picture content is located, can be extracted; the feature maps include at least six-channel feature maps: a first corner feature map, a second corner feature map, a third corner feature map, a fourth corner feature map, a fifth corner feature map and a sixth corner feature map. The upper left corner position, the lower left corner position, the upper right corner position, the lower right corner position, the first boundary point position and the second boundary point position of the picture content in the to-be-processed picture are then determined based on the at least six-channel feature map. The distorted angle of the picture content in the to-be-processed picture is thus determined based on the corner positions, and the correction mode of the to-be-processed picture is determined accordingly, so that information at the edges of the picture content is effectively prevented from being cut off, and the original content of the picture is better retained.
So far, specific embodiments of the present application have been described. Other embodiments are within the scope of the following claims. In some cases, the actions recited in the claims can be performed in a different order and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing may be advantageous.
For convenience of description, the above devices are described as being divided into various units by function, and are described separately. Of course, the functionality of the units may be implemented in one or more software and/or hardware when implementing the present application.
It will be apparent to those skilled in the art that embodiments of the present application may be provided as a method, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include forms of volatile memory in a computer readable medium, Random Access Memory (RAM) and/or non-volatile memory, such as Read Only Memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media, including both permanent and non-permanent, removable and non-removable media, may implement information storage by any method or technology. The information may be computer readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read only memory (ROM), electrically erasable programmable read only memory (EEPROM), flash memory or other memory technology, compact disc read only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information that can be accessed by a computing device. As defined herein, computer readable media does not include transitory computer readable media (transitory media), such as modulated data signals and carrier waves.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of additional identical elements in the process, method, article, or apparatus that comprises the element.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The application may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The application may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
The embodiments in the present specification are described in a progressive manner, and the same and similar parts among the embodiments are referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, for the system embodiment, since it is substantially similar to the method embodiment, the description is simple, and for the relevant points, reference may be made to the partial description of the method embodiment.
The above description is only an example of the present application and is not intended to limit the present application. Various modifications and changes may occur to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present application should be included in the scope of the claims of the present application.

Claims (16)

1. An image processing method, comprising:
performing feature extraction on a picture to be processed to obtain at least six-channel feature maps, wherein the at least six-channel feature maps comprise: a first corner feature map used for representing the upper left corner information of the picture content of the picture to be processed, a second corner feature map used for representing the lower left corner information of the picture content of the picture to be processed, a third corner feature map used for representing the upper right corner information of the picture content of the picture to be processed, a fourth corner feature map used for representing the lower right corner information of the picture content of the picture to be processed, a fifth corner feature map used for representing the boundary point information of two opposite sides corresponding to the picture content of the picture to be processed, and a sixth corner feature map;
determining the corner position of the picture content of the picture to be processed according to the at least six-channel feature map, including: the position of the upper left corner, the position of the lower left corner, the position of the upper right corner, the position of the lower right corner, and the positions of a first boundary point and a second boundary point on the two opposite edges;
and according to the corner position, correcting the picture to be processed to obtain a corrected picture.
2. The method according to claim 1, wherein the determining the corner positions of the picture content of the picture to be processed according to the at least six-channel feature map comprises:
performing corner region prediction on each feature map in the at least six-channel feature map through Gaussian distribution to obtain the probability that each pixel point in the corner region is a corner; and determining the pixel point with the maximum probability in each corner region as a corner, and obtaining the determined corner position of the corner.
3. The method according to claim 2, wherein the determining the pixel point with the highest probability in each corner region as the corner comprises:
and normalizing the probability of each pixel point as an angular point, and determining the pixel point with the maximum probability in each angular point region as the angular point according to the result of the normalization processing.
4. The method according to claim 1, wherein said performing a rectification process on the to-be-processed picture according to the corner position to obtain a rectified picture comprises:
respectively determining two corner positions on the same side as the first boundary point position and two corner positions on the same side as the second boundary point position;
determining a first position relation between the first boundary point and two corner points corresponding to the first boundary point and a second position relation between the second boundary point and two corner points corresponding to the second boundary point according to a coordinate relation between the position of the first boundary point and two corner points on the same side of the first boundary point and a coordinate relation between the position of the second boundary point and two corner points on the same side of the second boundary point;
and according to the first position relation and the second position relation, correcting the picture to be processed to obtain a corrected picture.
5. The method according to claim 4, wherein determining a first positional relationship between the first boundary point and its corresponding two corner points and a second positional relationship between the second boundary point and its corresponding two corner points according to the coordinate relationship between the first boundary point and its two corner point positions on the same side and the coordinate relationship between the second boundary point and its two corner point positions on the same side comprises:
determining coordinates of the first boundary point position and two corner point positions on the same side of the first boundary point position, and the second boundary point position and two corner point positions on the same side of the second boundary point position in the picture to be processed respectively;
according to the coordinates, calculating a first vector and a second vector corresponding to the position of the first boundary point and the positions of two corner points on the same side of the first boundary point, and calculating a third vector and a fourth vector corresponding to the position of the second boundary point and the positions of two corner points on the same side of the second boundary point;
determining a first position relation between a first boundary point and two corner points corresponding to the first boundary point according to the first vector and the second vector;
and determining a second position relation between the second boundary point and two corner points corresponding to the second boundary point according to the third vector and the fourth vector.
6. The method according to claim 5, wherein the performing a rectification process on the to-be-processed picture according to the first positional relationship and the second positional relationship to obtain a rectified picture comprises:
if the two corner point positions on the same side as the first boundary point position are respectively an upper left corner position and a lower left corner position, and the two corner point positions on the same side as the second boundary point position are respectively an upper right corner position and a lower right corner position;
if the first position relationship represents that the first boundary point position, the upper left corner position and the lower left corner position are collinear, and the second position relationship represents that the second boundary point position, the upper right corner position and the lower right corner position are collinear, obtaining a to-be-processed picture including the picture content according to the corner positions, and correcting the to-be-processed picture.
7. The method according to claim 5, wherein the performing a rectification process on the to-be-processed picture according to the first positional relationship and the second positional relationship to obtain a rectified picture comprises:
if the two corner point positions on the same side as the first boundary point position are respectively an upper left corner position and a lower left corner position, and the two corner point positions on the same side as the second boundary point position are respectively an upper right corner position and a lower right corner position;
if the first position relationship represents that the first boundary point position, the upper left corner position and the lower left corner position are not collinear, or the second position relationship represents that the second boundary point position, the upper right corner position and the lower right corner position are not collinear, splitting the picture to be processed according to the corner position to obtain at least two split sub-pictures, and correcting the at least two split sub-pictures;
and splicing the at least two sub-pictures after the correction treatment.
8. The method of claim 7,
splitting the picture to be processed according to the angular point positions to obtain at least two split sub-pictures, and correcting the at least two split sub-pictures, including: splitting a first sub-picture from the picture to be processed according to the upper left corner position, the upper right corner position, the second boundary point position and the first boundary point position in the corner position; splitting a second sub-picture from the picture to be processed according to a first boundary point position, a second boundary point position, a lower right corner position and a lower left corner position in the corner positions; performing rectification processing on the first sub-picture and the second sub-picture;
the splicing of the at least two sub-pictures after the correction processing includes: and splicing the first sub-picture after the correction processing and the second sub-picture after the correction processing according to the relative positions of the first sub-picture and the second sub-picture in the picture to be processed.
9. The method according to claim 1, wherein the extracting features of the picture to be processed to obtain at least six-channel feature maps comprises:
performing feature extraction on a picture to be processed by using a pre-trained corner detection neural network model to obtain at least six-channel feature maps;
the corner detection neural network model is trained by using a picture training sample containing corner position marking information, wherein the corner position marking information comprises: the picture content comprises upper left corner position information, lower left corner position information, upper right corner position information, lower right corner position information, first boundary point information and second boundary point information.
10. The method of claim 9, wherein the corner detection neural network model is trained by:
inputting the picture training sample into the corner detection neural network model for feature extraction to obtain a sample feature map of at least six channels;
predicting and obtaining the corner point prediction position of the picture content of the picture training sample according to the sample feature map of the at least six channels;
obtaining a corresponding loss value according to the corner predicted position, the corner position marking information and a preset loss function;
and training the corner detection neural network model according to the loss value.
11. The method of claim 10, wherein the loss function is:
$$MSE = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2$$

wherein $MSE$ characterizes the loss value of the loss function, $y_i$ represents the coordinate value of the $i$-th coordinate point of the corner predicted position of the picture training sample, $\hat{y}_i$ represents the coordinate value of the $i$-th coordinate point of the corner position marking information of the picture training sample, and $n$ represents the number of coordinate points of the corner position marking information.
12. The method according to any one of claims 1 to 11,
the at least six-channel feature map is a seven-channel feature map, and the seven-channel feature map further comprises a background feature map used for representing background information of the picture to be processed.
13. The method according to any one of claims 1 to 11,
the two opposite edges are two opposite edges with an included angle larger than a preset included angle threshold value;
or ,
the two opposite edges are two opposite edges in the forward direction of the picture content.
14. A picture processing apparatus, comprising:
the processing module is used for extracting features of a picture to be processed to obtain at least six-channel feature maps, wherein the at least six-channel feature maps comprise: a first corner feature map used for representing the upper left corner information of the picture content of the picture to be processed, a second corner feature map used for representing the lower left corner information of the picture content of the picture to be processed, a third corner feature map used for representing the upper right corner information of the picture content of the picture to be processed, a fourth corner feature map used for representing the lower right corner information of the picture content of the picture to be processed, a fifth corner feature map used for representing the boundary point information of two opposite sides corresponding to the picture content of the picture to be processed, and a sixth corner feature map;
a determining module, configured to determine, according to the at least six-channel feature map obtained by the processing module, a corner position of picture content of the picture to be processed, including: the position of the upper left corner, the position of the lower left corner, the position of the upper right corner, the position of the lower right corner, and the positions of a first boundary point and a second boundary point on the two opposite edges;
and the correction module is used for correcting the picture to be processed according to the corner position determined by the determination module to obtain a corrected picture.
15. An electronic device, comprising: a processor and a memory, the processor and the memory being connected, the memory storing a computer program, the processor being configured to execute the computer program to implement the picture processing method of any of the preceding claims 1 to 13.
16. A computer storage medium, comprising: the computer storage medium stores a computer program that, when executed by a processor, implements the picture processing method according to any one of claims 1 to 13.
CN202110332466.1A 2021-03-29 2021-03-29 Picture processing method and device, electronic equipment and computer storage medium Active CN112906708B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110332466.1A CN112906708B (en) 2021-03-29 2021-03-29 Picture processing method and device, electronic equipment and computer storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110332466.1A CN112906708B (en) 2021-03-29 2021-03-29 Picture processing method and device, electronic equipment and computer storage medium

Publications (2)

Publication Number Publication Date
CN112906708A true CN112906708A (en) 2021-06-04
CN112906708B CN112906708B (en) 2023-10-24

Family

ID=76109204

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110332466.1A Active CN112906708B (en) 2021-03-29 2021-03-29 Picture processing method and device, electronic equipment and computer storage medium

Country Status (1)

Country Link
CN (1) CN112906708B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113850238A (en) * 2021-11-29 2021-12-28 北京世纪好未来教育科技有限公司 Document detection method and device, electronic equipment and storage medium

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPS63232677A (en) * 1987-03-20 1988-09-28 Fuji Xerox Co Ltd Area recognizing device
JPH01307879A (en) * 1988-06-06 1989-12-12 Hitachi Software Eng Co Ltd Pattern recognizing device
US5129012A (en) * 1989-03-25 1992-07-07 Sony Corporation Detecting line segments and predetermined patterns in an optically scanned document
US20100302366A1 (en) * 2009-05-29 2010-12-02 Zhao Bingyan Calibration method and calibration device
US20150281519A1 (en) * 2014-03-31 2015-10-01 Brother Kogyo Kabushiki Kaisha Image processing apparatus configured to execute correction on scanned image
CN111241808A (en) * 2020-01-07 2020-06-05 北大方正集团有限公司 Character stroke splitting method and device
CN111339846A (en) * 2020-02-12 2020-06-26 深圳市商汤科技有限公司 Image recognition method and device, electronic equipment and storage medium
CN111598884A (en) * 2020-05-21 2020-08-28 北京世纪好未来教育科技有限公司 Image data processing method, apparatus and computer storage medium
WO2020223859A1 (en) * 2019-05-05 2020-11-12 华为技术有限公司 Slanted text detection method, apparatus and device

Patent Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPS63232677A (en) * 1987-03-20 1988-09-28 Fuji Xerox Co Ltd Area recognizing device
JPH01307879A (en) * 1988-06-06 1989-12-12 Hitachi Software Eng Co Ltd Pattern recognizing device
US5129012A (en) * 1989-03-25 1992-07-07 Sony Corporation Detecting line segments and predetermined patterns in an optically scanned document
US20100302366A1 (en) * 2009-05-29 2010-12-02 Zhao Bingyan Calibration method and calibration device
US20150281519A1 (en) * 2014-03-31 2015-10-01 Brother Kogyo Kabushiki Kaisha Image processing apparatus configured to execute correction on scanned image
WO2020223859A1 (en) * 2019-05-05 2020-11-12 华为技术有限公司 Slanted text detection method, apparatus and device
CN111241808A (en) * 2020-01-07 2020-06-05 北大方正集团有限公司 Character stroke splitting method and device
CN111339846A (en) * 2020-02-12 2020-06-26 深圳市商汤科技有限公司 Image recognition method and device, electronic equipment and storage medium
CN111598884A (en) * 2020-05-21 2020-08-28 北京世纪好未来教育科技有限公司 Image data processing method, apparatus and computer storage medium

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
OMAR BOUDRAA et al.: "Using skeleton and Hough transform variant to correct skew in historical documents", MATHEMATICS AND COMPUTERS IN SIMULATION, vol. 167, pages 389 - 403, XP085867894, DOI: 10.1016/j.matcom.2019.05.009 *
NIU QING: "An image skew correction algorithm based on edge detection", CHINA SCIENCE AND TECHNOLOGY INFORMATION, no. 19, pages 116 - 117 *

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113850238A (en) * 2021-11-29 2021-12-28 北京世纪好未来教育科技有限公司 Document detection method and device, electronic equipment and storage medium

Also Published As

Publication number Publication date
CN112906708B (en) 2023-10-24

Similar Documents

Publication Publication Date Title
CN112861560B (en) Two-dimensional code positioning method and device
US9305360B2 (en) Method and apparatus for image enhancement and edge verification using at least one additional image
CN108921897B (en) Method and apparatus for locating card area
US9390475B2 (en) Backlight detection method and device
CN111091123A (en) Text region detection method and equipment
CN111882520B (en) Screen defect detection method and device and head-mounted display equipment
CN109344824B (en) Text line region detection method, device, medium and electronic equipment
CN112348836A (en) Method and device for automatically extracting building outline
CN115937003A (en) Image processing method, image processing device, terminal equipment and readable storage medium
CN113449534A (en) Two-dimensional code image processing method and device
CN111079793A (en) Icon similarity determining method and electronic equipment
CN108960247B (en) Image significance detection method and device and electronic equipment
CN113850238B (en) Document detection method and device, electronic equipment and storage medium
CN112906708B (en) Picture processing method and device, electronic equipment and computer storage medium
CN110929738A (en) Certificate card edge detection method, device, equipment and readable storage medium
CN110969640A (en) Video image segmentation method, terminal device and computer-readable storage medium
CN113129298A (en) Definition recognition method of text image
CN116894849A (en) Image segmentation method and device
CN112580638B (en) Text detection method and device, storage medium and electronic equipment
CN107480710B (en) Feature point matching result processing method and device
CN115239612A (en) Circuit board positioning method, device, equipment and storage medium
CN113012132A (en) Image similarity determining method and device, computing equipment and storage medium
CN110866928A (en) Target boundary segmentation and background noise suppression method and device based on neural network
CN108073581B (en) Map area name display method and device
CN111582021B (en) Text detection method and device in scene image and computer equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant