CN111405194A - Image processing method and device - Google Patents


Info

Publication number
CN111405194A
Authority
CN
China
Prior art keywords
image
frame
target
shooting
target object
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010503030.XA
Other languages
Chinese (zh)
Inventor
方涛 (Fang Tao)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alipay Hangzhou Information Technology Co Ltd
Original Assignee
Alipay Hangzhou Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alipay Hangzhou Information Technology Co Ltd filed Critical Alipay Hangzhou Information Technology Co Ltd
Priority to CN202010503030.XA priority Critical patent/CN111405194A/en
Publication of CN111405194A publication Critical patent/CN111405194A/en
Pending legal-status Critical Current

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00 Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/80 Camera processing pipelines; Components thereof
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00 Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/60 Control of cameras or camera modules
    • H04N23/64 Computer-aided capture of images, e.g. transfer from script file into camera, check of taken image quality, advice or proposal for image composition or decision on when to take image

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Studio Devices (AREA)

Abstract

Embodiments of this specification provide an image processing method and an image processing apparatus. The image processing method includes: receiving a shooting request sent by a user, and acquiring parameter information of the user terminal based on the shooting request; generating a shooting plan for the target object to be shot by the user according to the parameter information of the user terminal; receiving a target image of the target object acquired by the user according to the shooting plan; and analyzing the target image to obtain the target characters in the target image. Because a dedicated shooting plan is tailored to the acquired parameter information of the user terminal, the user can shoot a clear image of the target object with the current user terminal, and quickly and accurately complete the related information confirmation based on that clear image.

Description

Image processing method and device
Technical Field
Embodiments of this specification relate to the field of computer technology, and in particular to an image processing method. One or more embodiments of this specification also relate to an image processing apparatus, a computing device, and a computer-readable storage medium.
Background
At present, online apps usually require a user to upload document materials to confirm certain information, such as identity information or a hospitalization summary. However, some documents have a compact layout, a large amount of information, and small fonts, and because of limitations such as the camera resolution of a low-end mobile phone, it is difficult to capture a clear photo of such a document in a single shot.
Therefore, there is an urgent need for an image processing method that can help a mobile phone capture a clear image, so that the user can complete the related information confirmation based on that clear image.
Disclosure of Invention
In view of this, the present specification provides an image processing method. One or more embodiments of the present specification also relate to an image processing apparatus, a computing device, and a computer-readable storage medium to address technical deficiencies in the prior art.
According to a first aspect of embodiments herein, there is provided an image processing method including:
receiving a shooting request sent by a user, and acquiring parameter information of a user terminal based on the shooting request;
generating a shooting plan for the target object to be shot by the user according to the parameter information of the user terminal;
receiving a target image of a target object acquired by the user according to the shooting plan;
and analyzing the target image to obtain the target characters in the target image.
According to a second aspect of embodiments herein, there is provided an image processing apparatus comprising:
the parameter acquisition module is configured to receive a shooting request sent by a user and acquire parameter information of a user terminal based on the shooting request;
the plan generating module is configured to generate a shooting plan for a target object to be shot by the user according to the parameter information of the user terminal;
an image acquisition module configured to receive a target image of a target object acquired by the user according to the shooting plan;
and the image analysis module is configured to analyze the target image to acquire target characters in the target image.
According to a third aspect of embodiments herein, there is provided a computing device comprising:
a memory and a processor;
the memory is configured to store computer-executable instructions and the processor is configured to execute the computer-executable instructions, wherein the processor implements the steps of the image processing method when executing the computer-executable instructions.
According to a fourth aspect of embodiments herein, there is provided a computer-readable storage medium storing computer-executable instructions that, when executed by a processor, implement the steps of the image processing method.
One embodiment of this specification implements an image processing method and an image processing apparatus. The image processing method includes: receiving a shooting request sent by a user, and acquiring parameter information of the user terminal based on the shooting request; generating a shooting plan for the target object to be shot by the user according to the parameter information of the user terminal; receiving a target image of the target object acquired by the user according to the shooting plan; and analyzing the target image to obtain the target characters in the target image. Because a dedicated shooting plan is tailored to the acquired parameter information of the user terminal, the user can shoot a clear image of the target object with the current user terminal, and quickly and accurately complete the related information confirmation based on that clear image.
Drawings
FIG. 1 is a flow chart of an image processing method provided in one embodiment of the present description;
FIG. 2 is a flowchart illustrating a processing procedure of an image processing method according to an embodiment of the present disclosure;
fig. 3 is a schematic structural diagram of an image processing apparatus according to an embodiment of the present specification;
fig. 4 is a block diagram of a computing device according to an embodiment of the present disclosure.
Detailed Description
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present description. This description may be embodied in many different forms and should not be construed as limited to the embodiments set forth herein, as those skilled in the art will be able to make and use the present disclosure without departing from the spirit and scope of the present disclosure.
The terminology used in the description of the one or more embodiments is for the purpose of describing the particular embodiments only and is not intended to be limiting of the description of the one or more embodiments. As used in one or more embodiments of the present specification and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used in one or more embodiments of the present specification refers to and encompasses any and all possible combinations of one or more of the associated listed items.
It will be understood that, although the terms first, second, etc. may be used herein in one or more embodiments to describe various information, this information should not be limited by these terms. These terms are only used to distinguish one type of information from another. For example, a first can also be referred to as a second and, similarly, a second can also be referred to as a first without departing from the scope of one or more embodiments of the present description. The word "if" as used herein may be interpreted as "when", "upon", or "in response to determining", depending on the context.
First, the noun terms to which one or more embodiments of the present specification relate are explained.
Image stitching (splicing) is a technique for combining multiple images with overlapping regions (which may be acquired at different times, from different viewing angles, or by different sensors) into one seamless panoramic or high-resolution image.
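As an illustration of this definition, the following minimal Python sketch splices two overlapping "frames" — modeled here as lists of scanned rows — by locating their shared region. It is a toy stand-in for the concept, not the patent's method; the names `stitch_pair`, `frame1`, and `frame2` are illustrative assumptions.

```python
def stitch_pair(top, bottom):
    """Splice two overlapping sequences of rows into one, by finding the
    longest suffix of `top` that equals a prefix of `bottom`."""
    for k in range(min(len(top), len(bottom)), 0, -1):
        if top[-k:] == bottom[:k]:
            return top + bottom[k:]          # drop the duplicated overlap
    return top + bottom                       # no overlap found: concatenate

# Two "frames" of a document, each a list of scanned rows, sharing rows c and d.
frame1 = ["a", "b", "c", "d"]
frame2 = ["c", "d", "e", "f"]
print(stitch_pair(frame1, frame2))  # ['a', 'b', 'c', 'd', 'e', 'f']
```

Real image stitching matches pixel features rather than exact rows, but the principle — detect the shared region, then join while keeping only one copy of it — is the same.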
In the present specification, an image processing method is provided. One or more embodiments of the present specification relate to an image processing apparatus, a computing device, and a computer-readable storage medium, which are described in detail in the following embodiments one by one.
Referring to fig. 1, fig. 1 shows a flowchart of an image processing method provided according to an embodiment of the present specification, including the steps of:
step 102: and receiving a shooting request sent by a user, and acquiring the parameter information of the user terminal based on the shooting request.
In practical applications, the image processing method may be applied to an application installed in a user terminal, such as an application for insurance claim payment, an application for instant messaging, and the like.
A typical application scenario of the image processing method is as follows: the user uploads document materials online, as required by the application, for identity confirmation, material verification, and so on. For example, in an insurance-claim scenario, the image processing method can be applied in an insurance claim application, which guides the user to upload the materials necessary for settling the claim, such as hospitalization summaries.
Specifically, receiving a shooting request sent by a user may be understood as receiving a shooting request triggered by the user operating the user terminal, for example a shooting request generated when the user taps the shooting control of an application installed on the user terminal.
The user terminal includes, but is not limited to, a mobile phone, a computer, and the like. The parameter information of the user terminal includes, but is not limited to, the resolution of the terminal's camera, the size of its display, its memory, its CPU, and so on — for example, the size of a phone's display screen or the pixel count of its camera.
In specific implementation, after receiving a shooting request sent by a user, an application program first obtains parameter information of a carrier (user terminal) of the application program based on the shooting request.
Step 104: and generating a shooting plan for the target object to be shot by the user according to the parameter information of the user terminal.
The target object is document information, including but not limited to certificate information and credential information, such as identity documents, invoices, admission summaries, discharge certificates, and the like.
In practical applications, the parameter information of the user terminal in the embodiments of the present specification may be understood as parameter information of a shooting device and a display device of the user terminal, such as a pixel of a camera of a mobile phone, a size of a screen of the mobile phone, and the like.
Different user terminals have different parameter information, so the way a user should shoot the target object also differs from terminal to terminal. A dedicated shooting plan can therefore be made for the user according to the parameter information of the user terminal, so that users with different terminals can all obtain a clear image of the target object with their current terminal; this avoids the cost of requiring the user to switch to a terminal with better parameters just to photograph the target object clearly. A specific implementation is as follows:
the generating of the shooting plan for the user according to the parameter information of the user terminal includes:
judging whether the parameter information of the user terminal is less than or equal to the preset terminal parameter information,
if yes, generating a first shooting plan for the user according to the parameter information of the user terminal,
and if not, generating a second shooting plan for the user according to the parameter information of the user terminal.
The preset terminal parameter information can be set according to the actual application, for example a preset phone resolution, camera pixel count, and the like. In practice, if images shot by a phone with a 640 × 480 resolution and a 300,000-pixel (0.3-megapixel) camera are unclear, while phones above that resolution and pixel count produce clear images, then the preset terminal parameter information may be set to a resolution of 640 × 480 and a pixel count of 300,000.
Specifically, parameter information of the user terminal is obtained first, then the obtained parameter information of the user terminal is compared with preset terminal parameter information, and if the parameter information of the user terminal is less than or equal to the preset terminal parameter information, a first shooting plan is generated for a user according to the parameter information of the user terminal; and if the parameter information of the user terminal is larger than the preset terminal parameter information, generating a second shooting plan for the user according to the parameter information of the user terminal.
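The comparison above can be sketched in a few lines of Python. This is a hedged illustration only: the patent does not name concrete parameter fields or a combination rule, so the field names `resolution` and `camera_pixels`, the `all(...)` combination, and the threshold values (taken from the example figures in the text) are assumptions.

```python
# Preset terminal parameter thresholds (illustrative values from the example:
# 640x480 resolution, 300,000-pixel camera).
PRESET = {"resolution": 640 * 480, "camera_pixels": 300_000}

def choose_shooting_plan(params):
    """Return 'first' (multi-frame plan) when the terminal's parameters are
    less than or equal to the preset thresholds, else 'second' (single-frame)."""
    low_end = all(params[key] <= PRESET[key] for key in PRESET)
    return "first" if low_end else "second"

print(choose_shooting_plan({"resolution": 640 * 480, "camera_pixels": 300_000}))       # first
print(choose_shooting_plan({"resolution": 1920 * 1080, "camera_pixels": 12_000_000}))  # second
```

A low-end terminal thus receives the multi-frame first shooting plan, while a capable terminal receives the single-frame second shooting plan.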
The first shooting plan can be understood as the shooting plan generated for the user when the parameters of the terminal's camera are low and a single shot would be blurry; the second shooting plan is generated when, relative to the first case, the camera's parameters are high and a single shot is relatively clear.
In specific implementation, the first shooting plan comprises multi-frame shooting of the target object, and a target image of the target object is obtained based on multi-frame shooting results;
correspondingly, after the first shooting plan is generated for the user according to the parameter information of the user terminal, the method further includes:
determining a multi-frame photographing mode of the user terminal based on the first photographing plan.
Multi-frame shooting can be understood as shooting three or more frames of the target object, that is, using the terminal's camera to capture three or more consecutive frames of the target object; the target image of the target object is then determined from the captured frames.
Specifically, the first shooting plan generated for the user according to the parameter information of the user terminal is as follows: and performing multi-frame shooting on the target object, and determining a multi-frame shooting mode of the user terminal based on the first shooting plan under the condition that a target image of the target object is obtained based on a multi-frame shooting result, namely, controlling the user terminal to enter the multi-frame shooting mode based on the first shooting plan, and guiding a user to perform multi-frame shooting on the target object according to shooting equipment of the user terminal.
In the image processing method provided by the embodiment of the present specification, when it is determined that the parameter information of the user terminal is less than or equal to the preset terminal parameter information, a first shooting plan is generated for a user, so that the user can perform multi-frame shooting on a target object according to the first shooting plan, and subsequently, multi-frame images of the target object can be spliced into a clear target image.
In another embodiment of the present specification, the second photographing plan includes performing a single photographing on the target object, and acquiring a target image of the target object based on a result of the single photographing;
correspondingly, after generating the second shooting plan for the user according to the parameter information of the user terminal, the method further includes:
determining a single-frame photographing mode of the user terminal based on the second photographing plan.
Specifically, under the condition that the parameter information of the user terminal is greater than the preset terminal parameter information, it can be determined that the image of the target object shot by the shooting device of the user terminal should be relatively clear.
In specific implementation, after a second shooting plan is generated for the user according to the parameter information of the user terminal, the application program controls the user terminal to enter a single-frame shooting mode based on the second shooting plan, and guides the user to shoot the target object once according to the shooting equipment of the user terminal, so that the target image of the target object can be rapidly obtained.
Step 106: and receiving a target image of a target object acquired by the user according to the shooting plan.
Specifically, after a shooting plan is generated for a target object to be shot by a user according to parameter information of a user terminal, a target image of the target object obtained by the user according to the shooting plan is received.
In specific implementation, the receiving the target image of the target object acquired by the user according to the shooting plan includes:
and receiving a target image of a target object acquired by the user in a multi-frame shooting mode of the user terminal according to the first shooting plan.
The first shooting plan includes shooting the target object in multiple frames and obtaining the target image of the target object from the multi-frame results. After the first shooting plan is determined according to the parameter information of the user terminal, the application controls the user terminal to enter the multi-frame shooting mode based on the first shooting plan, and then receives the consecutive initial frame images of the target object shot by the user in that mode. These consecutive initial frames can subsequently be spliced into the target image of the target object, guaranteeing both the clarity and the completeness of the target image. The consecutive initial frames are shot from the head of the target object to its tail, with each pair of adjacent frames sharing a partially overlapping region, so that they can later be spliced into a complete target image of the target object. In practice, to keep the subsequent image processing efficient, the overlapping region between consecutive initial frames should not be large: an overlap of one quarter of each frame is preferred. This guarantees the completeness of the splice while avoiding overlaps so large that multiple characters or patterns coincide and would later have to be de-duplicated.
To ensure that a clear and complete image of the target object can be obtained even when the terminal's camera quality is low, thereby meeting the user's needs, a clear and complete target image of the target object can be spliced from multiple initial frame images acquired region by region. A specific implementation is as follows:
the receiving of the target image of the target object acquired by the user in the multi-frame shooting mode of the user terminal according to the first shooting plan includes:
receiving continuous multi-frame initial images of the target object, which are acquired by the user in the multi-frame shooting mode of the user terminal according to the first shooting plan;
and under the condition that each frame of initial image meets the requirement of a preset image, splicing the plurality of frames of initial images to form a target image of the target object.
After the multi-frame shooting mode of the user terminal is determined according to the first shooting plan, at least two to-be-shot regions of the target object can be determined according to that mode. For example, if the target object is a discharge summary, then after the multi-frame shooting mode is determined, it can be decided into how many to-be-shot regions the discharge summary is best divided for shooting — say, three overlapping to-be-shot regions from top to bottom.
Specifically, the initial images of at least two to-be-shot regions of the target object, acquired by the user in the multi-frame shooting mode of the user terminal according to the first shooting plan, are received; the at least two to-be-shot regions can be understood as regions with partially overlapping areas. In practice, this multi-frame shooting mode may be understood as scanning: scanning proceeds from the first to-be-shot region of the target object (e.g. its head) to the last to-be-shot region (e.g. its tail), each scanned region yields one initial frame image, and after the whole target object has been scanned, multiple initial frames corresponding to its to-be-shot regions are formed. To ensure that these frames can later be spliced into a complete target image with nothing missing, part of each newly shot frame overlaps part of the immediately preceding frame. At the same time, to keep the overlap between two frames from growing so large that it burdens subsequent image recognition, the tail quarter of each frame overlaps the head quarter of the next frame, or the overlapping region is smaller than a quarter.
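The quarter-overlap region planning described above can be sketched as follows. This is an illustrative assumption about how the regions might be computed — the patent specifies only the overlap ratio, not a formula; `plan_regions`, `doc_height`, and `frame_height` are invented names.

```python
def plan_regions(doc_height, frame_height):
    """Divide a document of height doc_height into vertical to-be-shot
    regions of frame_height pixels, each overlapping the previous region
    by one quarter of a frame."""
    step = frame_height - frame_height // 4   # advance 3/4 of a frame per shot
    regions, top = [], 0
    while True:
        regions.append((top, min(top + frame_height, doc_height)))
        if top + frame_height >= doc_height:  # last region reaches the tail
            break
        top += step
    return regions

# A 1000-pixel-tall document shot with 400-pixel frames yields three regions,
# matching the "three overlapping regions from top to bottom" example.
print(plan_regions(doc_height=1000, frame_height=400))
# [(0, 400), (300, 700), (600, 1000)]
```

Each consecutive pair of regions shares 100 rows — exactly a quarter of a 400-pixel frame.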
In addition, if a splicing error were only discovered when the characters in the target image are recognized after all the initial frames have been spliced, all the operations would have to be executed again, increasing the workload. Therefore, when the initial frames are spliced, the result of each pairwise splice is analyzed. This guarantees the accuracy of every splice and avoids the situation where an error is discovered only at the end, the whole processing flow must be repeated, the image processing is burdened, and the user has a poor experience.
In a specific implementation, the initial frame images of the at least two to-be-shot regions of the target object are obtained in the above manner, and the frames are spliced into the target image of the target object provided each frame meets a preset image requirement. The preset image requirement may be set according to the actual application and is not limited here; for example, it may require that the image resolution reach a preset threshold (e.g. 3840 × 2160), that all characters in the image be clearly recognizable, and/or that the image be free of shadow, distortion, noise, and the like. In practice, when every initial frame meets the preset image requirement, each frame can be judged clear and quickly recognizable later; the frames are then spliced in shooting order into the target image of the target object. Because the user terminal acquires the frames sequentially from the top of the target object to the bottom in the multi-frame shooting mode, consecutive frames are naturally associated. In addition, to ease the subsequent splicing of the target object, each frame acquired by the user terminal in the multi-frame shooting mode can be numbered in shooting order, and splicing can then proceed according to these numbers. This avoids splicing disorder, saves acquisition time for the target image, and improves the efficiency of the subsequent character recognition on the target image.
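The frame-numbering idea above can be shown in two lines. The `(number, image)` pairing is an illustrative representation — the patent says only that frames are numbered in shooting order.

```python
# Frames tagged with their shot number at capture time; they may arrive
# out of order, but sorting by the number restores the shooting sequence.
frames = [(3, "img3"), (1, "img1"), (2, "img2")]   # (shot number, frame data)
ordered = [img for _, img in sorted(frames)]
print(ordered)  # ['img1', 'img2', 'img3']
```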
Specifically, the initial frames can be spliced into the target image of the target object in several ways. For example, the initial frames can be spliced pairwise, the resulting images spliced pairwise again, and so on until a single target image remains. Alternatively, the first initial frame is spliced with the second to form a spliced image, that spliced image is spliced with the third initial frame, the result with the fourth, and so on until the last initial frame has been spliced in and a single target image is obtained. Any method capable of splicing images can be applied to this scheme without limitation, but the embodiments of this specification describe the second, frame-by-frame splicing method in detail:
the splicing the multiple frames of initial images to form the target image of the target object comprises:
splicing the ith initial frame image and the (i + 1)th initial frame image to obtain the jth spliced frame image, where i and j are positive integers, i ∈ [1, n], and n is the number of initial frame images;
it is determined whether i +1 is greater than n,
if yes, the j-th spliced frame image is taken as a target image of the target object,
if not, increasing the number i by 1, and splicing the jth spliced frame image with the (i + 1) th frame initial image to obtain a (j + 1) th spliced frame image;
it is determined whether i +1 is greater than n,
if yes, increasing j by 1, and taking the j-th spliced frame image as a target image of the target object,
and if not, increasing both i and j by 1, and splicing the jth spliced frame image with the (i + 1) th frame initial image to obtain a (j + 1) th spliced frame image.
Specifically, take initial values i = 1 and j = 1 as an example. The 1st initial frame and the 2nd initial frame are spliced to obtain the 1st spliced frame image, and it is judged whether the 2nd initial frame is the last one. If it is, the initial images of the target object obtained by the user terminal in the multi-frame shooting mode may consist of only two frames, and splicing those two frames together yields the target image of the target object. (In a specific implementation, after a shooting plan is generated for the target object according to the parameter information of the user terminal, and if that plan is the first shooting plan, an initial complete image of the target object may first be acquired and analyzed to determine into how many frames the target object is best shot; if the analysis determines that two frames are best, the first shooting plan may likewise include shooting the target object in two frames and obtaining the target image from the two-frame result.) If the 2nd initial frame is not the last, the 1st spliced frame image is spliced with the 3rd initial frame to obtain the 2nd spliced frame image;
it is then judged whether the 3rd initial frame is the last one. If so, the initial images of the target object acquired by the user terminal in the multi-frame shooting mode number three frames, and splicing the three frames together yields the target image of the target object; if not, the 2nd spliced frame image is spliced with the 4th initial frame to obtain the 3rd spliced frame image, and the above steps continue until all the initial frames have been spliced into one image. The concrete image splicing can be realized by extracting traditional texture features from the images and splicing two initial frames when their texture features match; on this basis, all the initial frames are preliminarily spliced.
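The frame-by-frame loop described above (splice frame 1 with frame 2, then the result with frame 3, and so on) can be sketched as a fold. The helper `splice_pair` is a placeholder for the texture-feature-based splicing step, which the sketch does not implement; the string frames below are toy data whose overlapping region is a single shared character.

```python
def splice_frames(frames, splice_pair):
    """Iteratively splice a list of initial frames into one target image:
    frame 1 with frame 2, then that result with frame 3, and so on
    (the second splicing method described above)."""
    if not frames:
        raise ValueError("no initial frames captured")
    spliced = frames[0]
    for i in range(1, len(frames)):       # the (i + 1)th frame, 1-based
        spliced = splice_pair(spliced, frames[i])
    return spliced

# Toy splice_pair: each frame's first character repeats the previous frame's
# last character, so the overlap to drop is one character.
target = splice_frames(["abc", "cde", "efg"], lambda a, b: a + b[1:])
print(target)  # abcdefg
```

With n frames the loop performs n − 1 pairwise splices, matching the i/j bookkeeping in the steps above.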
To ensure the accuracy of the spliced frame images, each spliced frame image is verified as soon as it is obtained. A specific implementation is as follows:
after the j-th spliced frame image is obtained, the method further comprises the following steps:
analyzing the jth spliced frame image, judging whether the character information in the jth spliced frame image after analysis meets the preset character requirement or not,
and if not, re-acquiring the ith frame initial image and/or the (i + 1) th frame initial image.
The preset character requirement can be set according to the actual application; for example, it may require that the text read fluently, be semantically correct, and the like.
Specifically, taking j = 1 as an example: the 1st spliced frame image is analyzed to obtain its text information, and it is then judged whether that text reads smoothly and has correct semantics. If so, the 1st spliced frame image is determined to be a correct image; if not, the 1st initial image and/or the 2nd initial image that were spliced to form the 1st spliced frame image are re-acquired.
The following description takes as an example a spliced frame image obtained by splicing a previous spliced frame image with an initial image:
after the j +1 th spliced frame image is obtained, the method further comprises the following steps:
analyzing the j +1 th spliced frame image, judging whether the character information in the analyzed j +1 th spliced frame image meets the preset character requirement or not,
and if not, re-acquiring the initial image of the (i + 1) th frame.
Whether a spliced frame image is obtained by splicing two initial images, or by splicing a spliced frame image with an initial image, it is analyzed and verified, ensuring the accuracy of the finally spliced target image. This avoids the situation in which a splicing error is discovered only after the final target image is obtained and all the initial images have to be re-acquired; the method thereby improves the acquisition speed and reliability of the target image and streamlines the workflow.
More conveniently, whether each spliced frame image is a correctly spliced image can be judged by a pre-trained language model. A specific implementation is as follows:
after the j-th spliced frame image is obtained, the method further comprises the following steps:
extracting the character information in the jth spliced frame image by an optical character recognition method;
inputting the character information in the jth spliced frame image into a language model to obtain a first score;
and re-acquiring the ith frame initial image and/or the (i + 1)th frame initial image under the condition that the first score is smaller than a first preset score threshold. And
after the j +1 th spliced frame image is obtained, the method further comprises the following steps:
extracting character information in the j +1 th spliced frame image by an optical character recognition method;
inputting the character information in the j +1 th spliced frame image into a language model to obtain a second score;
and under the condition that the second score is smaller than a second preset score threshold value, the (i + 1) th frame initial image is obtained again.
In the same manner as above, after each spliced frame image is formed, the character information in it is extracted and input into the pre-trained language model for score prediction. A score below the preset score threshold indicates that the text obtained from the spliced frame image does not read smoothly, which may be caused by a splicing error in the initial images; in that case the initial images of the target object need to be re-acquired and spliced again, ensuring the accuracy of the target image of the target object.
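The per-splice check can be sketched as below. Both `ocr` and `lm_score` are assumed callables, not concrete APIs: a real system might back them with an OCR engine such as Tesseract and an n-gram or neural language model, and the score threshold is an application-specific setting.

```python
def validate_splice(spliced_image, ocr, lm_score, threshold):
    """Return True when the text read from a spliced frame image scores
    as fluent; otherwise the caller re-acquires the source frames."""
    text = ocr(spliced_image)
    return lm_score(text) >= threshold

# Toy stand-ins for illustration only.
fake_ocr = lambda img: img["text"]
fake_lm = lambda text: 95 if " " in text else 40  # pretend fluent text scores high

ok = validate_splice({"text": "policy number 1234"}, fake_ocr, fake_lm, threshold=80)
bad = validate_splice({"text": "pol#cynum"}, fake_ocr, fake_lm, threshold=80)
print(ok, bad)  # True False
```

When `validate_splice` returns False, the control flow described above re-acquires the ith and/or (i + 1)th initial frame and splices again.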
In another embodiment of the present specification, the receiving a target image of a target object acquired by the user according to the shooting plan includes:
receiving an initial image of a target object acquired by the user in a single-frame shooting mode of the user terminal according to the second shooting plan;
and taking the initial image as a target image of the target object under the condition that the initial image meets the preset image requirement.
Specifically, in the single-frame shooting mode of the user terminal only a single initial image of the target object is acquired, and this initial image is taken as the target image of the target object once it is determined to meet the preset image requirement. In other words, when the shooting device of the user terminal is of sufficiently good quality, the target image can be determined from a single shot, which greatly improves image processing efficiency. For the preset image requirement, reference may be made to the above embodiment, and details are not repeated here.
Step 108: and analyzing the target image to obtain the target characters in the target image.
Specifically, after the target image is confirmed, it is parsed to obtain the target characters in the target image, so as to complete a specific application such as an insurance claim review or an identity information review.
In another embodiment of this specification, after the target image of the target object is acquired, in order to further ensure the integrity of the target image, the target image is detected again, and the specific implementation manner is as follows:
after receiving the target image of the target object acquired by the user according to the shooting plan, the method further includes:
and carrying out corner detection on the target object and the target image so as to determine the integrity of the target image.
In addition, after the target image is determined to be complete through corner detection, the target image may be analyzed to obtain the target characters in the target image, and for a specific manner of analyzing the target image to obtain the target characters, reference may be made to the above-mentioned embodiment, which is not described herein again.
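A minimal sketch of the completeness check, under a strong simplifying assumption: "corner detection" is reduced to verifying that all four document corners lie inside the image bounds. A production system would run an actual corner detector (e.g. Harris) on the photograph; the function and its inputs here are illustrative only.

```python
def document_complete(corners, width, height):
    """corners: (x, y) tuples of the detected document corners.
    The document counts as complete when exactly four corners were
    found and none of them falls outside the image bounds."""
    if len(corners) != 4:
        return False
    return all(0 <= x < width and 0 <= y < height for x, y in corners)

print(document_complete([(10, 10), (300, 12), (305, 200), (8, 198)], 320, 240))  # True
print(document_complete([(10, 10), (300, 12), (330, 200), (8, 198)], 320, 240))  # False: one corner clipped
```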
In a specific implementation, in order to further confirm the accuracy of the target characters in the target object, after the target characters of the target image are acquired, the target characters may be detected again, and a specific implementation manner is as follows:
after analyzing the target image to obtain the target characters in the target image, the method further includes:
inputting the target characters into a language model to obtain a third score;
and under the condition that the third score is smaller than a third preset score threshold value, the target image is obtained again.
The third preset score threshold, like the first preset score threshold and the second preset score threshold, may be set according to the actual application and is not limited herein; for example, it may be set to 80, 90, or the like.
In the image processing method provided by this embodiment of the specification, after multiple frames of initial images of the target object are acquired, adjacent initial images are spliced according to texture features, the character information in each spliced frame image is then extracted by OCR, and an NLP language model judges, based on the semantic information, whether the context of the character information obtained from the spliced frame image reads smoothly. If so, splicing, extraction and detection continue as described above; if not, the result is returned in time so that the user can shoot again.
The following further describes the image processing method provided in this specification with reference to fig. 2, taking its application to image processing of a certificate as an example. The method specifically includes the following steps:
step 202: receiving a shooting request sent by a user a, and acquiring parameter information of a mobile phone of the user a based on the shooting request.
Specifically, a shooting request sent by user a is received, and parameter information of user a's mobile phone is acquired based on the shooting request. This can be understood as follows: user a sends the shooting request to an application program through the mobile phone, and after receiving the shooting request the application program acquires the parameter information of user a's mobile phone.
Step 204: and generating a first shooting plan for the certificate to be shot of the user a according to the parameter information of the mobile phone of the user a.
Step 206: receiving initial images of three areas to be shot of the certificate, which are acquired by a user a in a multi-frame shooting mode of the mobile phone according to a first shooting plan: initial image 1, initial image 2, and initial image 3.
Step 208: and splicing the initial image 1, the initial image 2 and the initial image 3 by an image splicing algorithm to obtain a target image of the complete voucher.
Step 210: and analyzing the target image to obtain target characters in the certificate.
In this embodiment of the specification, after the shooting request sent by user a is received, a first shooting plan is generated for the certificate to be shot by user a according to the parameter information of user a's mobile phone. Images are then received that were obtained by progressively scanning each local part of the certificate according to interactive prompts; because the shooting area is smaller, a low-end phone can capture it clearly more easily than it could the complete certificate. The images of the scanned local parts are then spliced, by an image splicing algorithm based on image features and character information, into a target image of the complete certificate, and finally the characters in the certificate are accurately obtained from the complete, clear target image. This embodiment addresses the problem that low-end devices (mobile phones, computers and the like) cannot directly shoot a clear certificate because of resolution limits: the scheme shoots only part of the certificate in a single shot and then splices the multi-frame images into a complete certificate picture. Since the area shot each time is smaller, a clearer image is easier to capture, which solves the difficulty of shooting certificates with low-end phones.
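Steps 202 through 210 can be tied together as a pipeline in which every component is injected. All of the stand-ins below (the single resolution field, the canned frames, the toy splice and parse functions) are assumptions for illustration, not the patent's concrete algorithms:

```python
def process_certificate(request, get_params, plan_for, capture, splice, parse_text):
    params = get_params(request)   # step 202: parameter info of the phone
    plan = plan_for(params)        # step 204: first or second shooting plan
    frames = capture(plan)         # step 206: initial images per the plan
    target = splice(frames)        # step 208: splice into the target image
    return parse_text(target)      # step 210: target characters

result = process_certificate(
    request={"user": "a"},
    get_params=lambda req: {"resolution": 5_000_000},          # assumed field
    plan_for=lambda p: "first" if p["resolution"] <= 8_000_000 else "second",
    capture=lambda plan: ["NAME: A", "LICE", " ZHANG"] if plan == "first"
                         else ["NAME: ALICE ZHANG"],
    splice=lambda frames: "".join(frames),                     # toy splicer
    parse_text=lambda img: img.split("NAME:")[1].strip(),      # toy OCR/parse
)
print(result)  # ALICE ZHANG
```

Injecting the components keeps the decision logic (low-end terminal, multi-frame plan, splice, parse) testable independently of any camera or OCR backend.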
Corresponding to the above method embodiment, the present specification further provides an image processing apparatus embodiment, and fig. 3 shows a schematic structural diagram of an image processing apparatus provided in an embodiment of the present specification. As shown in fig. 3, the apparatus includes:
a parameter obtaining module 302 configured to receive a shooting request sent by a user, and obtain parameter information of a user terminal based on the shooting request;
a plan generating module 304 configured to generate a shooting plan for a target object to be shot by the user according to the parameter information of the user terminal;
an image acquisition module 306 configured to receive a target image of a target object acquired by the user according to the shooting plan;
an image parsing module 308 configured to parse the target image to obtain target characters in the target image.
Optionally, the plan generating module 304 is further configured to:
judging whether the parameter information of the user terminal is less than or equal to the preset terminal parameter information,
if yes, generating a first shooting plan for the user according to the parameter information of the user terminal,
and if not, generating a second shooting plan for the user according to the parameter information of the user terminal.
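The branch the plan generating module implements can be sketched as follows; abstracting the terminal parameter information to a single resolution number, and the 8 MP cut-off value, are assumptions made for illustration, as the patent leaves the parameter comparison abstract.

```python
PRESET_RESOLUTION = 8_000_000  # assumed preset terminal parameter (~8 MP)

def choose_plan(terminal_resolution):
    """Terminals at or below the preset parameter get the first
    (multi-frame) plan; more capable ones get the second (single-frame) plan."""
    if terminal_resolution <= PRESET_RESOLUTION:
        return "first"
    return "second"

print(choose_plan(5_000_000))   # first
print(choose_plan(12_000_000))  # second
```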
Optionally, the first shooting plan includes performing multi-frame shooting on the target object, and acquiring a target image of the target object based on a multi-frame shooting result;
correspondingly, the device further comprises:
a multi-frame mode determination module configured to determine a multi-frame photographing mode of the user terminal based on the first photographing plan.
Optionally, the image obtaining module 306 is further configured to:
and receiving a target image of a target object acquired by the user in a multi-frame shooting mode of the user terminal according to the first shooting plan.
Optionally, the image obtaining module 306 is further configured to:
receiving continuous multi-frame initial images of the target object, which are acquired by the user in the multi-frame shooting mode of the user terminal according to the first shooting plan;
and under the condition that each frame of initial image meets the requirement of a preset image, splicing the plurality of frames of initial images to form a target image of the target object.
Optionally, the image obtaining module 306 is further configured to:
splicing the ith frame initial image and the (i + 1) th frame initial image to obtain a jth spliced frame image, wherein i and j are positive integers, and i ∈ [1, n ];
it is determined whether i +1 is greater than n,
if yes, the j-th spliced frame image is taken as a target image of the target object,
if not, increasing the number i by 1, and splicing the jth spliced frame image with the (i + 1) th frame initial image to obtain a (j + 1) th spliced frame image;
it is determined whether i +1 is greater than n,
if yes, increasing j by 1, and taking the j-th spliced frame image as a target image of the target object,
and if not, increasing both i and j by 1, and splicing the jth spliced frame image with the (i + 1) th frame initial image to obtain a (j + 1) th spliced frame image.
Optionally, the apparatus further includes:
a first image analysis module configured to analyze the jth stitched frame image, determine whether text information in the jth stitched frame image after analysis meets a preset text requirement,
and the first retry module is configured to reacquire the ith frame initial image and/or the (i + 1) th frame initial image under the condition that the text information in the jth spliced frame image does not meet the preset text requirement.
Optionally, the apparatus further includes:
a second image analysis module configured to analyze the j +1 th stitched frame image, determine whether the text information in the analyzed j +1 th stitched frame image meets a preset text requirement,
and the second retry module is configured to reacquire the i +1 th frame initial image under the condition that the text information in the j +1 th spliced frame image does not meet the preset text requirement.
Optionally, the apparatus further includes:
a first character extraction module configured to extract the character information in the jth spliced frame image by an optical character recognition method;
a first score obtaining module configured to input the character information in the jth spliced frame image into a language model to obtain a first score;
a first image re-acquisition module configured to re-acquire the ith frame initial image and/or the (i + 1)th frame initial image if the first score is smaller than a first preset score threshold.
Optionally, the apparatus further includes:
the second character extraction module is configured to extract character information in the j +1 th spliced frame image by an optical character recognition method;
a second score obtaining module configured to input the text information in the j +1 th spliced frame image into the language model to obtain a second score;
a second image re-acquisition module configured to re-acquire the (i + 1)th frame initial image if the second score is smaller than a second preset score threshold.
Optionally, the apparatus further includes:
a corner detection module configured to perform corner detection on the target object and the target image to determine the integrity of the target image.
Optionally, the apparatus further includes:
a third score obtaining module configured to input the target text into a language model to obtain a third score;
a third image re-acquisition module configured to re-acquire the target image if the third score is smaller than a third preset score threshold.
Optionally, the second shooting plan includes performing a single shooting on the target object, and acquiring a target image of the target object based on a single shooting result;
correspondingly, the device further comprises:
a single frame mode determination module configured to determine a single frame photographing mode of the user terminal based on the second photographing plan.
Optionally, the image obtaining module 306 is further configured to:
receiving an initial image of a target object acquired by the user in a single-frame shooting mode of the user terminal according to the second shooting plan;
and taking the initial image as a target image of the target object under the condition that the initial image meets the preset image requirement.
The above is a schematic configuration of an image processing apparatus of the present embodiment. It should be noted that the technical solution of the image processing apparatus belongs to the same concept as the technical solution of the image processing method, and details that are not described in detail in the technical solution of the image processing apparatus can be referred to the description of the technical solution of the image processing method.
FIG. 4 illustrates a block diagram of a computing device 400 provided according to one embodiment of the present description. The components of the computing device 400 include, but are not limited to, a memory 410 and a processor 420. The processor 420 is coupled to the memory 410 via a bus 430, and a database 450 is used to store data.
The computing device 400 also includes an access device 440 that enables the computing device 400 to communicate via one or more networks 460. Examples of such networks include the Public Switched Telephone Network (PSTN), a local area network (LAN), a wide area network (WAN), a personal area network (PAN), or a combination of communication networks such as the Internet. The access device 440 may include one or more of any type of network interface, wired or wireless, such as a network interface card (NIC), an IEEE 802.11 wireless local area network (WLAN) interface, a worldwide interoperability for microwave access (WiMAX) interface, an Ethernet interface, a universal serial bus (USB) interface, a cellular network interface, a Bluetooth interface, a near field communication (NFC) interface, and so forth.
In one embodiment of the present description, the above-described components of computing device 400, as well as other components not shown in FIG. 4, may also be connected to each other, such as by a bus. It should be understood that the block diagram of the computing device architecture shown in FIG. 4 is for purposes of example only and is not limiting as to the scope of the present description. Those skilled in the art may add or replace other components as desired.
Computing device 400 may be any type of stationary or mobile computing device, including a mobile computer or mobile computing device (e.g., tablet, personal digital assistant, laptop, notebook, netbook, etc.), mobile phone (e.g., smartphone), wearable computing device (e.g., smartwatch, smartglasses, etc.), or other type of mobile device, or a stationary computing device such as a desktop computer or PC. Computing device 400 may also be a mobile or stationary server.
Wherein the processor 420 is configured to execute computer-executable instructions, wherein the processor implements the steps of the image processing method when executing the computer-executable instructions.
The above is an illustrative scheme of a computing device of the present embodiment. It should be noted that the technical solution of the computing device and the technical solution of the image processing method belong to the same concept, and details that are not described in detail in the technical solution of the computing device can be referred to the description of the technical solution of the image processing method.
An embodiment of the present specification also provides a computer readable storage medium storing computer instructions, which when executed by a processor, implement the steps of the image processing method.
The above is an illustrative scheme of a computer-readable storage medium of the present embodiment. It should be noted that the technical solution of the storage medium belongs to the same concept as the technical solution of the image processing method, and details that are not described in detail in the technical solution of the storage medium can be referred to the description of the technical solution of the image processing method.
The foregoing description has been directed to specific embodiments of this disclosure. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims may be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing may also be possible or may be advantageous.
The computer instructions comprise computer program code which may be in the form of source code, object code, an executable file or some intermediate form, or the like. The computer-readable medium may include: any entity or device capable of carrying the computer program code, recording medium, usb disk, removable hard disk, magnetic disk, optical disk, computer Memory, Read-Only Memory (ROM), Random Access Memory (RAM), electrical carrier wave signals, telecommunications signals, software distribution medium, and the like. It should be noted that the computer readable medium may contain content that is subject to appropriate increase or decrease as required by legislation and patent practice in jurisdictions, for example, in some jurisdictions, computer readable media does not include electrical carrier signals and telecommunications signals as is required by legislation and patent practice.
It should be noted that, for the sake of simplicity, the foregoing method embodiments are described as a series of acts, but those skilled in the art should understand that the present embodiment is not limited by the described acts, because some steps may be performed in other sequences or simultaneously according to the present embodiment. Further, those skilled in the art should also appreciate that the embodiments described in this specification are preferred embodiments and that acts and modules referred to are not necessarily required for an embodiment of the specification.
In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
The preferred embodiments of the present specification disclosed above are intended only to aid in the description of the specification. Alternative embodiments are not exhaustive and do not limit the invention to the precise embodiments described. Obviously, many modifications and variations are possible in light of the above teaching. The embodiments were chosen and described in order to best explain the principles of the embodiments and the practical application, to thereby enable others skilled in the art to best understand and utilize the embodiments. The specification is limited only by the claims and their full scope and equivalents.

Claims (17)

1. An image processing method comprising:
receiving a shooting request sent by a user, and acquiring parameter information of a user terminal based on the shooting request;
generating a shooting plan for the target object to be shot by the user according to the parameter information of the user terminal;
receiving a target image of a target object acquired by the user according to the shooting plan;
and analyzing the target image to obtain the target characters in the target image.
2. The image processing method according to claim 1, wherein the generating of the shooting plan for the user according to the parameter information of the user terminal comprises:
judging whether the parameter information of the user terminal is less than or equal to the preset terminal parameter information,
if yes, generating a first shooting plan for the user according to the parameter information of the user terminal,
and if not, generating a second shooting plan for the user according to the parameter information of the user terminal.
3. The image processing method according to claim 2, wherein the first shooting plan includes performing multi-frame shooting on the target object, and acquiring a target image of the target object based on a result of the multi-frame shooting;
correspondingly, after the first shooting plan is generated for the user according to the parameter information of the user terminal, the method further includes:
determining a multi-frame photographing mode of the user terminal based on the first photographing plan.
4. The image processing method according to claim 3, the receiving a target image of a target object acquired by the user according to the shooting plan comprising:
and receiving a target image of a target object acquired by the user in a multi-frame shooting mode of the user terminal according to the first shooting plan.
5. The image processing method according to claim 4, wherein the receiving of the target image of the target object acquired by the user in the multi-frame shooting mode of the user terminal according to the first shooting plan includes:
receiving continuous multi-frame initial images of the target object, which are acquired by the user in the multi-frame shooting mode of the user terminal according to the first shooting plan;
and under the condition that each frame of initial image meets the requirement of a preset image, splicing the plurality of frames of initial images to form a target image of the target object.
6. The image processing method according to claim 5, wherein the stitching the plurality of frames of initial images to form a target image of the target object comprises:
splicing the ith frame initial image and the (i + 1) th frame initial image to obtain a jth spliced frame image, wherein i and j are positive integers, and i ∈ [1, n ];
it is determined whether i +1 is greater than n,
if yes, the j-th spliced frame image is taken as a target image of the target object,
if not, increasing the number i by 1, and splicing the jth spliced frame image with the (i + 1) th frame initial image to obtain a (j + 1) th spliced frame image;
it is determined whether i +1 is greater than n,
if yes, increasing j by 1, and taking the j-th spliced frame image as a target image of the target object,
and if not, increasing both i and j by 1, and splicing the jth spliced frame image with the (i + 1) th frame initial image to obtain a (j + 1) th spliced frame image.
7. The image processing method according to claim 6, further comprising, after obtaining the jth stitched frame image:
analyzing the jth spliced frame image, judging whether the character information in the jth spliced frame image after analysis meets the preset character requirement or not,
and if not, re-acquiring the ith frame initial image and/or the (i + 1) th frame initial image.
8. The image processing method according to claim 6, further comprising, after obtaining the j +1 th stitched frame image:
analyzing the j +1 th spliced frame image, judging whether the character information in the analyzed j +1 th spliced frame image meets the preset character requirement or not,
and if not, re-acquiring the initial image of the (i + 1) th frame.
9. The image processing method according to claim 6, further comprising, after obtaining the jth stitched frame image:
extracting character information in the jth spliced frame image by an optical character recognition method;
inputting the character information in the jth spliced frame image into a language model to obtain a first score;
and under the condition that the first score is smaller than a first preset score threshold value, the ith frame initial image and/or the (i + 1) th frame initial image are/is acquired again.
10. The image processing method according to claim 6, further comprising, after obtaining the j +1 th stitched frame image:
extracting character information in the j +1 th spliced frame image by an optical character recognition method;
inputting the character information in the j +1 th spliced frame image into a language model to obtain a second score;
and under the condition that the second score is smaller than a second preset score threshold value, the (i + 1) th frame initial image is obtained again.
11. The image processing method according to claim 1, further comprising, after receiving a target image of a target object acquired by the user according to the shooting plan:
and carrying out corner detection on the target object and the target image so as to determine the integrity of the target image.
12. The image processing method of claim 11, after parsing the target image to obtain target characters in the target image, further comprising:
inputting the target characters into a language model to obtain a third score;
and under the condition that the third score is smaller than a third preset score threshold value, the target image is obtained again.
13. The image processing method according to claim 2, the second shooting plan including a single shooting of the target object, the target image of the target object being acquired based on a single shooting result;
correspondingly, after generating the second shooting plan for the user according to the parameter information of the user terminal, the method further includes:
determining a single-frame photographing mode of the user terminal based on the second photographing plan.
14. The image processing method according to claim 13, the receiving a target image of a target object acquired by the user according to the shooting plan comprising:
receiving an initial image of a target object acquired by the user in a single-frame shooting mode of the user terminal according to the second shooting plan;
and taking the initial image as a target image of the target object under the condition that the initial image meets the preset image requirement.
15. An image processing apparatus comprising:
the parameter acquisition module is configured to receive a shooting request sent by a user and acquire parameter information of a user terminal based on the shooting request;
the plan generating module is configured to generate a shooting plan for a target object to be shot by the user according to the parameter information of the user terminal;
an image acquisition module configured to receive a target image of a target object acquired by the user according to the shooting plan;
and the image analysis module is configured to analyze the target image to acquire target characters in the target image.
16. A computing device, comprising:
a memory and a processor;
the memory is configured to store computer-executable instructions, and the processor is configured to execute the computer-executable instructions, wherein the processor implements the steps of the image processing method according to any one of claims 1 to 14 when executing the computer-executable instructions.
17. A computer readable storage medium storing computer instructions which, when executed by a processor, implement the steps of the image processing method of any one of claims 1 to 14.
CN202010503030.XA 2020-06-05 2020-06-05 Image processing method and device Pending CN111405194A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010503030.XA CN111405194A (en) 2020-06-05 2020-06-05 Image processing method and device

Publications (1)

Publication Number Publication Date
CN111405194A true CN111405194A (en) 2020-07-10

Family

ID=71433679

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010503030.XA Pending CN111405194A (en) 2020-06-05 2020-06-05 Image processing method and device

Country Status (1)

Country Link
CN (1) CN111405194A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102074001A (en) * 2010-11-25 2011-05-25 上海合合信息科技发展有限公司 Method and system for stitching text images
US8335402B1 (en) * 2008-01-23 2012-12-18 A9.Com, Inc. Method and system for detecting and recognizing text in images
CN106878623A (en) * 2017-04-20 2017-06-20 努比亚技术有限公司 Photographic method, mobile terminal and computer-readable recording medium
CN107105167A (en) * 2017-06-05 2017-08-29 广东小天才科技有限公司 Method and device for shooting picture during scanning question and terminal equipment
CN108848309A (en) * 2018-07-13 2018-11-20 维沃移动通信有限公司 Camera program starting method and mobile terminal

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112437231A (en) * 2020-11-24 2021-03-02 维沃移动通信(杭州)有限公司 Image shooting method and device, electronic equipment and storage medium
WO2022111458A1 (en) * 2020-11-24 2022-06-02 维沃移动通信(杭州)有限公司 Image capture method and apparatus, electronic device, and storage medium
CN112437231B (en) * 2020-11-24 2023-11-14 维沃移动通信(杭州)有限公司 Image shooting method and device, electronic equipment and storage medium

Similar Documents

Publication Publication Date Title
US20190205618A1 (en) Method and apparatus for generating facial feature
KR102385463B1 (en) Facial feature extraction model training method, facial feature extraction method, apparatus, device and storage medium
US10867171B1 (en) Systems and methods for machine learning based content extraction from document images
JP6030240B2 (en) Method and apparatus for face recognition
US11625433B2 (en) Method and apparatus for searching video segment, device, and medium
CN111652093B (en) Text image processing method and device
CN109409349B (en) Credit certificate authentication method, credit certificate authentication device, credit certificate authentication terminal and computer readable storage medium
CN105678242B (en) Focusing method and device under hand-held certificate mode
CN109712082B (en) Method and device for collaboratively repairing picture
CN109886223B (en) Face recognition method, bottom library input method and device and electronic equipment
CN111507181B (en) Correction method and device for bill image and computer equipment
CN112101359B (en) Text formula positioning method, model training method and related device
CN110969154A (en) Text recognition method and device, computer equipment and storage medium
CN112883983B (en) Feature extraction method, device and electronic system
CN110942067A (en) Text recognition method and device, computer equipment and storage medium
US8773733B2 (en) Image capture device for extracting textual information
CN113642639A (en) Living body detection method, living body detection device, living body detection apparatus, and storage medium
CN112866577B (en) Image processing method and device, computer readable medium and electronic equipment
CN110490022A (en) Method and device for identifying a bar code in a picture
CN111405194A (en) Image processing method and device
CN112101296B (en) Face registration method, face verification method, device and system
US20230245483A1 (en) Handwriting recognition method and apparatus, and electronic device and storage medium
KR20220005243A (en) Sharing and recognition method and device of handwritten scanned document
CN110956133A (en) Training method of single character text normalization model, text recognition method and device
CN115860026A (en) Bar code detection method and device, bar code detection equipment and readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20200710