CN112861844A - Service data processing method and device and server


Info

Publication number: CN112861844A
Application number: CN202110341368.4A
Authority: CN (China)
Prior art keywords: text field, target, target text, preset, proprietary
Legal status: Pending
Other languages: Chinese (zh)
Inventors: 刘华杰, 王雅欣, 冯歆然, 罗杰文
Current Assignee: Industrial and Commercial Bank of China Ltd (ICBC)
Original Assignee: Industrial and Commercial Bank of China Ltd (ICBC)
Application filed by Industrial and Commercial Bank of China Ltd (ICBC)
Priority to CN202110341368.4A
Publication of CN112861844A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/22 Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Multimedia (AREA)
  • Character Discrimination (AREA)

Abstract

The specification provides a service data processing method, a service data processing device, and a server. Based on the method, a target image containing target service data can be processed with an existing preset OCR recognition model, and a target text field extracted from the target image. Whether the target text field belongs to a proprietary text field to be corrected is then determined against a target text field library matched to the target service data. If it does, sound-shape code data containing both a scale code (derived from pronunciation features) and a font code (derived from font features) is acquired. A reference text field can then be screened from the preset proprietary text fields in the target text field library according to the sound-shape code data of the target text field and the edit distance between the target text field and each preset proprietary text field. The reference text field is used to apply targeted correction to the target text field, yielding a more accurate corrected target text field.

Description

Service data processing method and device and server
Technical Field
The present specification belongs to the technical field of artificial intelligence, and in particular, relates to a method, an apparatus, and a server for processing service data.
Background
In some service data processing scenarios, image data containing service data may be acquired directly; in this case, the relevant text fields need to be identified and extracted from the image data, so that the extracted text fields, in the form of text data, can subsequently be used for specific service data processing.
Based on existing business data processing methods, if relevant text fields need to be identified and extracted from image data containing certain business data, a large amount of image data containing business data of the same subject type often has to be collected as sample data; extensive model training is then carried out on that sample data to obtain an OCR (Optical Character Recognition) model for the subject type; finally, the image data can be processed with this OCR recognition model to extract the relevant text fields.
Therefore, the conventional business data processing method consumes a large amount of resources, time, and cost to specially train an OCR recognition model for the subject type of the business data to be processed, which increases the processing cost of the business data and reduces its processing efficiency.
In view of the above problems, no effective solution has been proposed.
Disclosure of Invention
The specification provides a method, a device and a server for processing service data, which can efficiently and accurately identify and extract a special text field in target service data from a target image directly based on an existing preset OCR recognition model, effectively reduce the processing cost of the service data and improve the processing efficiency of the service data.
The present specification provides a method for processing service data, including:
acquiring a target image; wherein, the target image contains target service data;
processing the target image by calling a preset OCR recognition model to obtain a corresponding OCR recognition result; extracting a target text field according to the OCR recognition result;
determining whether the target text field belongs to a special text field to be corrected or not according to a target text field library; the target text field library is matched with the subject type of the target service data; the target text field library stores a plurality of preset proprietary text fields;
under the condition that the target text field is determined to belong to the special text field to be corrected, acquiring the sound and shape code data of the target text field; the sound-shape code data comprises a scale code obtained based on the pronunciation characteristics of the characters and a font code obtained based on the font characteristics of the characters;
determining similarity parameters of the target text field and a preset proprietary text field according to the sound-shape code data of the target text field and the editing distance between the target text field and the preset proprietary text field;
screening out a preset proprietary text field meeting the requirement from a target text field library according to the similarity parameter to serve as a reference text field; and correcting the target text field according to the reference text field.
In one embodiment, the target service data includes at least one of the following: financial statements, transfer vouchers, trade orders.
In one embodiment, processing the target image by calling a preset OCR recognition model to obtain a corresponding OCR recognition result includes:
determining the subject type of the target business data;
screening out a target structure template corresponding to the theme type from a plurality of preset structure templates according to the theme type of the target service data;
according to the target structure template, carrying out slice processing on the target image to obtain a plurality of image slices;
and calling a preset OCR recognition model to process the plurality of image slices so as to obtain a corresponding OCR recognition result.
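The template-driven slicing step above can be sketched as follows. The template format (named regions mapped to pixel boxes) and the toy row-major "image" are illustrative assumptions, not the patent's actual data structures:

```python
# Sketch of template-driven image slicing (assumed template format:
# a mapping from field name to a (left, top, right, bottom) pixel box).

def slice_image(image, template):
    """Crop one slice per template region from a row-major 2D image."""
    slices = {}
    for name, (left, top, right, bottom) in template.items():
        slices[name] = [row[left:right] for row in image[top:bottom]]
    return slices

# A 4x6 toy "image" of pixel values.
image = [[r * 10 + c for c in range(6)] for r in range(4)]

# Hypothetical structure template for a "financial statement" subject type.
template = {
    "company_name": (0, 0, 3, 2),   # top-left region
    "total_amount": (3, 2, 6, 4),   # bottom-right region
}

slices = slice_image(image, template)
print(len(slices["company_name"]))  # 2 rows
print(slices["total_amount"][0])    # [23, 24, 25]
```

Each slice would then be passed individually to the preset OCR recognition model, so that every recognized field is already associated with a known position in the document.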
In one embodiment, determining whether the target text field belongs to a proprietary text field to be corrected based on a library of target text fields comprises:
determining a target text field library matched with the subject type of the target service data from a plurality of preset text field libraries;
counting the number of characters contained in the target text field;
screening out preset special text fields with the same number of characters from the target text field library to serve as comparison text fields;
counting the proportion of the same characters between the contrast text field and the target text field;
detecting whether the ratio of the same characters between at least one contrast text field and a target text field is larger than a preset ratio threshold value;
under the condition that the proportion of the same character between at least one contrast text field and the target text field is determined to be larger than a preset proportion threshold value, obtaining a confidence coefficient parameter of the target text field;
and determining whether the target text field belongs to the proprietary text field to be corrected or not according to the confidence coefficient parameter of the target text field.
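The screening logic above (same-length candidates, shared-character ratio against a threshold, then a confidence check) can be sketched as follows; the threshold values and the externally supplied confidence parameter are illustrative assumptions:

```python
# Sketch: decide whether an OCR-extracted field is a proprietary text field
# that needs correction. Thresholds are illustrative, not from the patent.

def needs_correction(target, library, ratio_threshold=0.5,
                     confidence=1.0, confidence_threshold=0.9):
    # Keep only preset proprietary fields with the same character count.
    candidates = [f for f in library if len(f) == len(target)]
    for field in candidates:
        # Proportion of positions holding the same character.
        same = sum(a == b for a, b in zip(field, target)) / len(target)
        if same > ratio_threshold:
            # A near-miss exists; treat as "to be corrected" only when the
            # OCR confidence for this field is low.
            return confidence < confidence_threshold
    return False

library = ["营业收入", "营业成本", "净利润"]
print(needs_correction("营业收人", library, confidence=0.6))   # True
print(needs_correction("净利润", library, confidence=0.99))    # False
```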
In one embodiment, obtaining the confidence parameter of the target text field comprises:
when the preset OCR recognition model is called to process the plurality of image slices to obtain the corresponding OCR recognition result, acquiring the prediction score value output by the preset OCR recognition model for that OCR recognition result;
and determining a confidence coefficient parameter of the target text field according to the prediction score value.
In one embodiment, obtaining the phonetic and shape code data of the target text field comprises:
determining the initials, finals, and tones of the characters in the target text field;
constructing the corresponding scale codes from the initials, finals, and tones of the characters in the target text field according to a preset coding rule;
determining the structure, four-corner coding and stroke number of characters in the target text field;
constructing a corresponding font code according to the structure, the four-corner coding and the stroke number of the characters in the target text field and a preset coding rule;
and splicing the scale codes and the font codes to obtain the sound-shape code data of the target text field.
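The construction steps above can be sketched as follows. The per-character pinyin entries (initial, final, tone) and glyph entries (structure label, four-corner code, stroke count), as well as the concatenation rule, are hypothetical stand-ins for real lookup data and the patent's actual preset coding rule:

```python
# Illustrative sketch of sound-shape code construction.

PINYIN = {  # char -> (initial, final, tone); hypothetical lookup table
    "入": ("r", "u", "4"),
    "人": ("r", "en", "2"),
}
GLYPH = {  # char -> (structure label, four-corner code, stroke count)
    "入": ("D", "8000", "2"),   # "D" = single-component structure (assumed)
    "人": ("D", "8000", "2"),
}

def scale_code(text):
    # Concatenate initial + final + tone per character (assumed rule).
    return "".join("".join(PINYIN[c]) for c in text)

def font_code(text):
    # Concatenate structure + four-corner code + stroke count per character.
    return "".join("".join(GLYPH[c]) for c in text)

def sound_shape_code(text):
    # Splice the scale code and the font code into the field's
    # sound-shape code data.
    return scale_code(text) + "|" + font_code(text)

print(sound_shape_code("入"))  # ru4|D80002
print(sound_shape_code("人"))  # ren2|D80002
```

Note that 入 and 人 share an identical font code here, which is exactly the property the correction step exploits: OCR confusions tend to occur between visually similar characters.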
In one embodiment, determining a similarity parameter between a target text field and a preset proprietary text field according to the phonetic and font code data of the target text field and the edit distance between the target text field and the preset proprietary text field includes:
calculating, according to the sound-shape code data of the target text field, the scale code similarity parameter and the font code similarity parameter between the target text field and the preset proprietary text field;
and according to a preset weight parameter, carrying out weighted summation on the editing distance between the target text field and a preset proprietary text field, the scale code similarity parameter and the font code similarity parameter to obtain the similarity parameter between the target text field and the preset proprietary text field.
In one embodiment, the similarity parameter between the target text field and the preset proprietary text field is calculated according to the following formula (the formula appears only as an image in the source; the form below is reconstructed from the weighted-summation description):

E = w1 × (1 − edit_distance / max(s1, s2)) + w2 × P + w3 × S

wherein E is the similarity parameter between the target text field and the preset proprietary text field, edit_distance is the edit distance between the target text field and the preset proprietary text field, P is the scale code similarity parameter, S is the font code similarity parameter, s1 is the character length of the target text field, s2 is the character length of the preset proprietary text field, w1 is a first weight parameter corresponding to the edit distance, w2 is a second weight parameter corresponding to the scale code similarity parameter, and w3 is a third weight parameter corresponding to the font code similarity parameter.
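A minimal sketch of this weighted similarity computation follows. Since the patent's exact formula is available only as an image, the normalisation of the edit distance by the longer field length, and the weight values, are assumptions consistent with the surrounding weighted-summation description:

```python
# Sketch of the similarity computation: a weighted combination of edit
# distance, scale code similarity P, and font code similarity S.

def edit_distance(a, b):
    """Standard Levenshtein distance via dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]

def similarity(target, preset, P, S, w1=0.4, w2=0.3, w3=0.3):
    d = edit_distance(target, preset)
    s1, s2 = len(target), len(preset)
    # Normalise edit distance into a [0, 1] similarity before weighting.
    return w1 * (1 - d / max(s1, s2)) + w2 * P + w3 * S

print(edit_distance("营业收人", "营业收入"))  # 1
e = similarity("营业收人", "营业收入", P=0.9, S=0.8)
print(round(e, 3))  # 0.81
```

In practice, P and S would come from comparing the sound-shape code data of the two fields, and the preset proprietary field with the highest E would be chosen as the reference text field.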
In one embodiment, correcting the target text field based on the reference text field comprises:
comparing the characters in the reference text field with the characters in the target text field to determine the character positions with character differences;
and replacing the characters at the same character positions in the target text field with the characters at the character positions with character differences in the reference text field to obtain the corrected target text field.
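The positional replacement described above can be sketched as follows; equal field lengths are assumed, since the reference field was screened from same-length candidates:

```python
# Sketch of the targeted correction step: find positions where the
# reference and target fields differ, and copy the reference characters
# into those positions of the target field.

def correct(target, reference):
    chars = list(target)
    for pos, (t, r) in enumerate(zip(target, reference)):
        if t != r:
            chars[pos] = r  # replace the differing character
    return "".join(chars)

print(correct("营业收人", "营业收入"))  # 营业收入
```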
In one embodiment, prior to acquiring the target image, the method further comprises:
acquiring a plurality of historical service data; the historical business data and the target business data have the same theme type;
extracting a plurality of text fields from a plurality of historical service data;
and filtering out text fields representing data values from the text fields to obtain a plurality of preset proprietary text fields so as to construct a preset text field library matched with the subject type of the target service data.
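The library construction above can be sketched as follows. The heuristic used to detect data-value fields (presence of digits) is an illustrative assumption, not the patent's filtering rule:

```python
# Sketch of library construction: extract text fields from historical
# service data and filter out fields that represent data values, keeping
# the proprietary key fields.

def build_field_library(historical_fields):
    def is_data_value(field):
        # Illustrative heuristic: numeric content marks a data-value field.
        return any(ch.isdigit() for ch in field)
    # Deduplicate while preserving order.
    library = []
    for field in historical_fields:
        if not is_data_value(field) and field not in library:
            library.append(field)
    return library

fields = ["营业收入", "1,234.56", "净利润", "营业收入", "2021-03-29"]
print(build_field_library(fields))  # ['营业收入', '净利润']
```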
This specification also provides a service data processing apparatus, including:
the first acquisition module is used for acquiring a target image; wherein, the target image contains target service data;
the first processing module is used for processing the target image by calling a preset OCR recognition model to obtain a corresponding OCR recognition result; extracting a target text field according to the OCR recognition result;
the first determining module is used for determining whether the target text field belongs to the special text field to be corrected according to the target text field library; the target text field library is matched with the subject type of the target service data; the target text field library stores a plurality of preset proprietary text fields;
the second acquisition module is used for acquiring the sound and shape code data of the target text field under the condition that the target text field is determined to belong to the special text field to be corrected; the sound-shape code data comprises a scale code obtained based on the pronunciation characteristics of the characters and a font code obtained based on the font characteristics of the characters;
the second determining module is used for determining similarity parameters of the target text field and the preset proprietary text field according to the sound-shape code data of the target text field and the editing distance between the target text field and the preset proprietary text field;
the second processing module is used for screening out a preset proprietary text field meeting the requirement from the target text field library according to the similarity parameter and taking the text field as a reference text field; and correcting the target text field according to the reference text field.
The present specification also provides a server comprising a processor and a memory for storing processor-executable instructions, wherein the processor executes the instructions to implement the steps associated with the method for processing business data.
The present specification also provides a computer readable storage medium having stored thereon computer instructions which, when executed, implement the relevant steps of the method for processing business data.
The present specification also provides a method for processing service data, including:
acquiring a target image; wherein, the target image contains target service data;
processing the target image by calling a preset OCR recognition model to obtain a corresponding OCR recognition result; extracting a target text field according to the OCR recognition result;
determining whether the target text field belongs to a special text field to be corrected or not according to a target text field library; the target text field library is matched with the subject type of the target service data; the target text field library stores a plurality of preset proprietary text fields;
under the condition that the target text field is determined to belong to the special text field to be corrected, acquiring the sound and shape code data of the target text field; the sound-shape code data comprises a scale code obtained based on the pronunciation characteristics of the characters and a font code obtained based on the font characteristics of the characters;
determining similarity parameters of the target text field and a preset proprietary text field according to the sound-shape code data of the target text field;
screening out a preset proprietary text field meeting the requirement from a target text field library according to the similarity parameter to serve as a reference text field; and correcting the target text field according to the reference text field.
Based on the method, the server can directly process the target image containing the target service data with an existing preset OCR recognition model and preliminarily extract the target text field from the resulting OCR recognition result. It then determines, against a target text field library matched to the subject type of the target service data, whether the target text field belongs to a proprietary text field to be corrected. If so, the server acquires sound-shape code data containing both a scale code obtained from the pronunciation features of the characters and a font code obtained from the font features of the characters. A matched reference text field meeting the requirements can then be accurately screened from the preset proprietary text fields in the target text field library according to the sound-shape code data of the target text field and the edit distance between the target text field and each preset proprietary text field, and the reference text field is used to apply targeted correction to the target text field, yielding a more accurate corrected result. In this way, the server can identify and extract the proprietary text fields in the target service data from the target image efficiently and accurately, without consuming a large amount of resources, time, and cost to train an OCR recognition model specifically for the subject type of the target service data; this effectively reduces the processing cost of service data and improves its processing efficiency.
Drawings
In order to more clearly illustrate the embodiments of the present specification, the drawings needed to be used in the embodiments will be briefly described below, and the drawings in the following description are only some of the embodiments described in the present specification, and it is obvious to those skilled in the art that other drawings can be obtained according to the drawings without any creative effort.
Fig. 1 is a schematic diagram of an embodiment of a structural composition of a system to which a method for processing service data provided by an embodiment of the present specification is applied;
fig. 2 is a flowchart illustrating a method for processing service data according to an embodiment of the present disclosure;
fig. 3 is a schematic diagram of an embodiment of a method for processing service data provided by an embodiment of the present specification, in an example scenario;
fig. 4 is a flowchart illustrating a method for processing service data according to an embodiment of the present disclosure;
FIG. 5 is a schematic diagram of a server according to an embodiment of the present disclosure;
fig. 6 is a schematic structural composition diagram of a service data processing device provided in an embodiment of the present specification;
fig. 7 is a schematic diagram of an embodiment of the method for processing service data provided by an embodiment of the present specification, in an example scenario;
fig. 8 is a schematic diagram of an embodiment of the method for processing service data provided by an embodiment of the present specification, in an example scenario;
fig. 9 is a schematic diagram of an embodiment of the method for processing service data provided by an embodiment of the present specification, in an example scenario.
Detailed Description
In order to make those skilled in the art better understand the technical solutions in the present specification, the technical solutions in the embodiments of the present specification will be clearly and completely described below with reference to the drawings in the embodiments of the present specification, and it is obvious that the described embodiments are only a part of the embodiments of the present specification, and not all of the embodiments. All other embodiments obtained by a person skilled in the art based on the embodiments in the present specification without any inventive step should fall within the scope of protection of the present specification.
Generally, if a general-purpose OCR recognition model, or an OCR recognition model trained for business data of other subject types, is directly used to process target image data containing target business data, the resulting OCR recognition result often has a relatively large error; consequently, a text field extracted from that result also carries a large error and is not accurate enough.
Therefore, based on the existing processing method of business data, it is often necessary to train an OCR recognition model specially for the subject type of the target business data, and then process the target image data by using the OCR recognition model to obtain the text field with satisfactory accuracy. However, based on the above method, it is necessary to consume a large amount of resources, time, and cost to specially train the OCR recognition model, thereby increasing the processing cost of the business data and reducing the processing efficiency of the business data.
To address the root cause of the above problems, this specification considers directly using an existing preset OCR recognition model to process the target image containing the target service data and extract a target text field that may contain errors. Whether the target text field belongs to a proprietary text field to be corrected can then be determined against a target text field library matched to the subject type of the target service data. If it does, sound-shape code data containing both a scale code obtained from pronunciation features and a font code obtained from font features is acquired. A well-matched reference text field meeting the requirements can then be accurately screened from the preset proprietary text fields in the target text field library according to the sound-shape code data of the target text field and the edit distance between the target text field and each preset proprietary text field; the reference text field is used to correct the target text field, eliminating errors and yielding a corrected target text field with higher accuracy.
By the mode, the OCR recognition model aiming at the theme type to which the target service data belongs can be specially trained without consuming a large amount of resources, time and cost, and the special text field in the target service data can be efficiently and accurately recognized and extracted from the target image, so that the processing cost of the service data can be reduced, and the processing efficiency of the service data can be improved.
The embodiment of the present specification provides a method for processing service data, which may be specifically applied to a system including a server and a terminal device. In particular, reference may be made to fig. 1. The server and the terminal equipment can be connected in a wired or wireless mode so as to carry out specific data interaction.
In specific implementation, the terminal equipment can acquire a target image containing target service data; and then the collected target image is sent to a server.
For example, a user may capture a financial statement (a kind of target business data) through a camera disposed on a smart phone (a kind of terminal device) to obtain a corresponding target image; and then the target image is sent to a server through a mobile phone network.
Correspondingly, the server receives the target image sent by the terminal equipment and acquires the target image. The server needs to perform corresponding data processing on the target image to accurately identify and extract the proprietary text field in the target service data from the target image.
Firstly, a server can process the target image by calling a preset OCR recognition model to obtain a corresponding OCR recognition result; and extracting a target text field according to the OCR recognition result. The preset OCR recognition model may be a general OCR recognition model, or an OCR recognition model trained on business data of other topic types.
At this time, the target text field extracted by the server is a text field which may have errors and is not accurate enough, and further correction processing is required.
Then, the server can determine a preset text field library matched with the subject type of the target service data as a target text field library; determining whether the target text field belongs to a special text field to be corrected or not according to a target text field library; wherein the target text field bank stores a plurality of preset proprietary text fields.
Under the condition that the target text field is determined to belong to the special text field to be corrected, the server can acquire the sound and shape code data of the target text field; the sound-shape code data comprises a scale code obtained based on the pronunciation characteristics of the characters and a font code obtained based on the font characteristics of the characters.
Then, the server can comprehensively use the sound-shape code data of the target text field and the edit distance between the target text field and each preset proprietary text field to determine effective similarity parameters of high reference value between the target text field and the preset proprietary text fields.
Furthermore, the server can accurately screen out the preset proprietary text fields meeting the requirements from a plurality of preset proprietary text fields contained in the target text field library according to the similarity parameters to serve as reference text fields; and according to the reference text field, the target text field is corrected in a targeted manner so as to eliminate the error existing when the preset OCR recognition model is used for extracting the target text field, and the corrected target text field with smaller error and higher accuracy is obtained.
Further, the server may perform subsequent service data processing according to the corrected target text field. For example, accounting and logging of financial data, etc.
Therefore, the server can efficiently and accurately identify and extract the special text field in the target business data from the target image without specially training an OCR (optical character recognition) model aiming at the theme type to which the target business data belongs.
In this embodiment, the server may specifically include a background server applied on the service data processing platform side that can implement functions such as data transmission and data processing. Specifically, the server may be, for example, an electronic device with data computation, storage, and network interaction functions; alternatively, it may be a software program running in an electronic device that provides support for data processing, storage, and network interaction. The number of servers is not particularly limited in this embodiment: the server may be a single server, several servers, or a server cluster formed by several servers.
In this embodiment, the terminal device may specifically include a front-end electronic device that is applied to a user (e.g., an attendant) side and can implement functions such as data acquisition and data transmission. Specifically, the terminal device may be, for example, a desktop computer, a tablet computer, a notebook computer, a smart phone, and the like. Alternatively, the terminal device may be a software application capable of running in the electronic device. For example, it may be some APP running on a smartphone, etc.
Referring to fig. 2, an embodiment of the present disclosure provides a method for processing service data, where the method is specifically applied to a server side. In particular implementations, the method may include the following.
S201: acquiring a target image; wherein, the target image contains target service data;
s202: processing the target image by calling a preset OCR recognition model to obtain a corresponding OCR recognition result; extracting a target text field according to the OCR recognition result;
s203: determining whether the target text field belongs to a special text field to be corrected or not according to a target text field library; the target text field library is matched with the subject type of the target service data; the target text field library stores a plurality of preset proprietary text fields;
s204: under the condition that the target text field is determined to belong to the special text field to be corrected, acquiring the sound and shape code data of the target text field; the sound-shape code data comprises a scale code obtained based on the pronunciation characteristics of the characters and a font code obtained based on the font characteristics of the characters;
s205: determining similarity parameters of the target text field and a preset proprietary text field according to the sound-shape code data of the target text field and the editing distance between the target text field and the preset proprietary text field;
s206: screening out a preset proprietary text field meeting the requirement from a target text field library according to the similarity parameter to serve as a reference text field; and correcting the target text field according to the reference text field.
In this embodiment, the preset OCR recognition model may be specifically understood as an existing OCR recognition model. Specifically, it may be a general-purpose OCR recognition model (e.g., a basic OCR recognition model), or an OCR recognition model trained for business data of other subject types.
Through this embodiment, the server does not need to consume a large amount of resources, time, and cost to train an OCR recognition model specifically for the subject type to which the target service data belongs. Instead, the existing preset OCR recognition model can be used to efficiently and accurately recognize and extract the required proprietary text fields of the target service data from the target image, thereby effectively reducing the processing cost of the service data and improving its processing efficiency.
In some embodiments, the target image may be specifically understood as image data containing the target service data. Specifically, it may be a photograph obtained by shooting the target service data with a camera, a screenshot containing the target service data captured from a video, or the like.
In some embodiments, the target service data may be specifically understood as service data to be processed. In specific implementation, the target service data may include service data of a plurality of different topic types for different application scenarios.
In some embodiments, taking a business data processing scenario of a bank as an example, the target business data may specifically include at least one of the following: financial statements, transfer certificates, trade orders, and the like. It should be noted that the above listed subject types of the service data are only schematic illustrations. In specific implementation, according to a specific application scenario and a processing requirement, service data of other theme types may be included.
Through the embodiment, the service data processing method provided by the specification can be further expanded and applied to service data of more theme types in more application scenes, so that the application range of the service data processing method can be effectively expanded.
In some embodiments, the target service data may include proprietary text fields as well as data value fields. A proprietary text field may be specifically understood as a field representing a key in the target service data; correspondingly, a data value field may be understood as a field representing a value in the target service data.
Specifically, for example, in a financial statement, the field "unallocated profit" may be understood to be a proprietary text field; the field "12.23 ten thousand yuan" in the same row and located after the field "unallocated profit" can be understood as a kind of data value field.
In this embodiment, it is desirable to process the target image by the service data processing method provided in this specification, so as to accurately identify and extract the proprietary text field in the target service data from the target image.
In some embodiments, after the target image is obtained, the preset OCR recognition model may be directly invoked to process the target image, so as to obtain an OCR recognition result that may contain errors and have limited accuracy. The preset OCR recognition model may be understood as an existing OCR recognition model that has not undergone special targeted training. Specifically, it may be a common OCR recognition model, or an OCR recognition model trained for service data of other subject types.
In some embodiments, the aforementioned processing the target image by invoking a preset OCR recognition model to obtain a corresponding OCR recognition result may include the following steps: determining the subject type of the target business data; screening out a target structure template corresponding to the theme type from a plurality of preset structure templates according to the theme type of the target service data; according to the target structure template, carrying out slice processing on the target image to obtain a plurality of image slices; and calling a preset OCR recognition model to process the plurality of image slices so as to obtain a corresponding OCR recognition result.
Through this embodiment, the target image can first be located and segmented using the structure template corresponding to the subject type of the target service data, to obtain a plurality of image slices each containing a text field; OCR recognition is then performed on each image slice using the preset OCR recognition model. In this way, the text fields in the image slices can be recognized more accurately, yielding an OCR recognition result with higher accuracy and better effect.
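As an illustrative sketch of the slicing step above, a structure template can be represented as a set of named regions in relative coordinates that are mapped to pixel crop boxes for a given image (the template format, region names, and coordinates here are assumptions for illustration, not taken from this embodiment):

```python
# Hypothetical sketch: map a structure template's relative regions to pixel
# crop boxes. Each resulting box could then be cropped (e.g., with Pillow's
# Image.crop) and fed to the preset OCR recognition model one slice at a time.

def slice_boxes(image_size, template):
    """Convert relative regions (x0, y0, x1, y1 in [0, 1]) into absolute
    pixel boxes suitable for cropping the target image."""
    width, height = image_size
    boxes = {}
    for name, (x0, y0, x1, y1) in template.items():
        boxes[name] = (int(x0 * width), int(y0 * height),
                       int(x1 * width), int(y1 * height))
    return boxes

# A toy "financial statement" template; each region is expected to contain
# one text field of interest.
financial_statement_template = {
    "title": (0.10, 0.02, 0.90, 0.10),
    "profit_row": (0.05, 0.40, 0.50, 0.46),
}

boxes = slice_boxes((1000, 2000), financial_statement_template)
```

A real template library would hold one such template per subject type, learned in advance from the layout of historical service data.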
In some embodiments, before specific implementation, the layout structures of service data of different subject types may be learned in advance to construct a plurality of preset structure templates, where each preset structure template corresponds to one subject type.
In some embodiments, a specific text field may be extracted from the OCR recognition result as the target text field. It should be noted that, since the target text field is obtained by directly invoking the preset OCR recognition model, it may contain errors, or may not even be a required proprietary text field, so further determination and processing are required.
In some embodiments, the extracting a target text field according to the OCR recognition result may include: and performing word segmentation processing on the OCR recognition result to obtain a plurality of corresponding target text fields.
In some embodiments, it may first be determined whether the target text field is possibly a required proprietary text field. In the case that the target text field is determined to be a required proprietary text field, it can be further judged whether the target text field contains errors and needs to be corrected, i.e., whether it belongs to the proprietary text field to be corrected. In the case that the target text field is determined to belong to the proprietary text field to be corrected, targeted correction processing is performed on the target text field, so as to finally obtain a proprietary text field with errors eliminated and high accuracy.
In some embodiments, the determining whether the target text field belongs to the proprietary text field to be corrected according to the target text field library may include the following steps.
S1: determining, from a plurality of preset text field libraries, a target text field library matching the subject type of the target service data.
S2: counting the number of characters contained in the target text field.
S3: screening out, from the target text field library, the preset proprietary text fields having the same number of characters, to serve as comparison text fields.
S4: counting the same-character ratio between each comparison text field and the target text field.
S5: detecting whether the same-character ratio between at least one comparison text field and the target text field is greater than a preset ratio threshold.
S6: in the case that the same-character ratio between at least one comparison text field and the target text field is determined to be greater than the preset ratio threshold, obtaining a confidence parameter of the target text field.
S7: determining, according to the confidence parameter of the target text field, whether the target text field belongs to the proprietary text field to be corrected.
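The screening steps S2 to S7 can be sketched as follows (a minimal illustration with assumed function names and thresholds; the same-character ratio is computed position by position here, which matches the five-character example given later in this embodiment but is only one possible reading):

```python
# Illustrative sketch of steps S2-S7. A small character-count tolerance is
# allowed when selecting comparison fields, so that fields whose length was
# misrecognized by one character are not missed.

def same_char_ratio(target, candidate):
    """Fraction of the target's positions at which the candidate carries
    the same character."""
    n = min(len(target), len(candidate))
    if n == 0:
        return 0.0
    same = sum(1 for i in range(n) if target[i] == candidate[i])
    return same / len(target)

def needs_correction(target, field_library, ratio_threshold=0.7,
                     confidence=None, confidence_threshold=0.9,
                     char_tolerance=1):
    # S3: keep only preset proprietary fields whose character count is within
    # the tolerance of the target's character count
    candidates = [f for f in field_library
                  if abs(len(f) - len(target)) <= char_tolerance]
    # S4/S5: does any comparison field share enough characters with the target?
    if not any(same_char_ratio(target, f) > ratio_threshold for f in candidates):
        return False  # probably not a required proprietary field at all
    # S6/S7: plausibly a proprietary field; correct only if OCR confidence is low
    return confidence is not None and confidence < confidence_threshold
```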
Through this embodiment, on the one hand, the text fields that are highly likely to be the proprietary text fields to be acquired can be screened out more accurately from the target text fields obtained from the OCR recognition result; on the other hand, those proprietary text fields that contain errors and need to be corrected can be identified. Subsequent correction processing can then be performed only on the proprietary text fields to be corrected, rather than indiscriminately on all target text fields, which reduces the amount of data processed during correction and improves overall data processing efficiency.
In some embodiments, each preset text field library corresponds to one subject type, and specifically stores the proprietary text fields (which may be recorded as preset proprietary text fields) appearing in the service data of that subject type. The preset text field library may be obtained by summarizing and sorting the historical service data of the corresponding subject type in advance. The manner of obtaining the preset text field library will be described later.
In some embodiments, in specific implementation, the subject type of the target service data in the target image may be determined first; and determining a preset text field library matched with the subject type of the target service data from a plurality of preset text field libraries as the target text field library according to the subject type of the target service data.
The server may then count the number of characters contained in the target text field. For example, take as an example a target text field that is a misrecognition of the proprietary field "unallocated profit" (a five-character field in the original Chinese, with one character misread). The server may count the number of characters contained in this target text field as 5.
Further, the server may first filter out a preset proprietary text field with the same number of characters from the target text field library as a comparison text field. In this way, only the comparison text fields are required to be compared subsequently to determine whether the target text field belongs to the proprietary text field to be corrected, and all the preset proprietary text fields contained in the target text field library are not required to be compared. Therefore, the data processing amount in the process of determining whether the target text field belongs to the special text field to be corrected can be reduced, and the overall data processing efficiency is improved.
For example, the server may screen out the preset proprietary text fields containing 5 characters from the target text field library, to serve as the comparison text fields associated with this target text field.
In specific implementation, it is considered that the target text field is obtained by directly calling the preset OCR recognition model, so the number of characters in the obtained target text field may itself be wrong. For example, a field originally containing 4 characters may be recognized as a field containing 5 characters. Thus, when screening the comparison text fields, the server may simultaneously screen out the preset proprietary text fields whose number of characters equals the number of characters of the target text field, that number minus a character-count threshold, or that number plus the character-count threshold, as the comparison text fields associated with the target text field. In this way, the proprietary text field that truly matches the target text field is not missed, improving the accuracy of subsequent correction. The character-count threshold may be set to a small number such as 1.
In some embodiments, the server may compare the target text field with the plurality of screened-out comparison text fields. Specifically, the server may count the same-character ratio between the target text field and each comparison text field, and judge whether the target text field is possibly the proprietary text field to be acquired by detecting whether the same-character ratio between any comparison text field and the target text field is greater than the preset ratio threshold.
Specifically, for example, the server may count the same-character ratio between the misrecognized target text field and one comparison text field, "unallocated profit": the two five-character fields share four characters, so the ratio is 4/5 = 0.8. Similarly, the same-character ratio between the target text field and another comparison text field, "assets to be allocated", which shares only the two characters corresponding to "allocated", is counted as 2/5 = 0.4. The same-character ratio between each comparison text field and the target text field can be counted in this manner. The largest ratio (e.g., 0.8) is then found and compared with the preset ratio threshold (e.g., 0.7).
If the largest ratio is found to be greater than the preset ratio threshold, it is determined that the same-character ratio between at least one comparison text field and the target text field is greater than the preset ratio threshold. It can then be determined that the target text field has a relatively high probability of being a proprietary text field to be acquired, and further determination processing can be performed on it.
On the contrary, if the largest ratio is found to be less than or equal to the preset ratio threshold, it is determined that the same-character ratio between no comparison text field and the target text field exceeds the threshold. It can then be determined that the target text field has a relatively high probability of not being a required proprietary text field, and further determination processing for it can be stopped.
In some embodiments, in the case that it is determined that the target text field has a higher probability of belonging to the proprietary text field that needs to be acquired, an error condition of the target text field may be further determined by acquiring and according to the confidence parameter of the target text field, so as to determine whether further correction needs to be performed on the target text field. The confidence parameter may be specifically understood as parameter data for representing the confidence of the target text field.
If the target text field is judged to have a large error and low reliability (for example, the confidence coefficient parameter is smaller than a preset parameter threshold) according to the confidence coefficient parameter, the target text field can be determined to belong to the proprietary text field to be corrected, and further, the subsequent correction processing can be performed on the target text field.
On the contrary, if the target text field is judged to have a small error and a high confidence level (for example, the confidence level parameter is greater than or equal to a preset parameter threshold) according to the confidence level parameter, it may be determined that the target text field does not belong to the proprietary text field to be corrected, that is, the target text field is accurate and does not need to be subjected to subsequent correction processing.
In some embodiments, the obtaining the confidence parameter of the target text field may include, in specific implementation: calling a preset OCR recognition model to process the plurality of image slices so as to obtain a corresponding OCR recognition result, and acquiring a prediction score value aiming at the OCR recognition result and output by the preset OCR recognition model; and determining a confidence coefficient parameter of the target text field according to the prediction score value.
In this embodiment, when a conventional OCR recognition model performs recognition processing on an image, a prediction score value for the OCR recognition result is calculated internally by the model at the same time, to assist the model in finally outputting the OCR recognition result.
Therefore, while calling the preset OCR recognition model to process the plurality of image slices to obtain the corresponding OCR recognition results, the server may export, through a relevant interface of the preset OCR recognition model, the prediction score value generated and used by the model when producing the OCR recognition result, as the confidence parameter for judging the credibility of the target text field obtained from that OCR recognition result.
By the embodiment, the server can obtain the confidence coefficient parameter for judging the credibility of the target text field without consuming extra processing resources and processing time.
In some embodiments, in the case that the target text field is determined to belong to the proprietary text field to be corrected, the sound-shape code data of the target text field can be obtained. According to the sound-shape code data of the target text field, combined with the edit distance, the preset proprietary text field that truly matches the target text field can be accurately found from the target text field library, to serve as the reference text field. The reference text field can then be used to perform targeted correction processing on the target text field, so as to finally extract a proprietary text field with errors eliminated and higher accuracy.
In some embodiments, the sound-shape code data can be specifically understood as encoded data that simultaneously characterizes the pronunciation characteristics and glyph characteristics of a character. The sound-shape code data may specifically include a sound code obtained based on the pronunciation characteristics of the characters in the target text field and a font code obtained based on the glyph characteristics of the same characters.
It should be noted that, compared with the edit distance alone, for ideographic characters such as Chinese characters, which carry both phonetic and semantic structure, the sound-shape code data can describe the characteristics of the characters more comprehensively and accurately. Characters with a higher matching degree and closer semantics can therefore be found more accurately using the sound-shape code data, giving a better effect in use. The edit distance was designed for Latin scripts such as English, which have only a phonetic structure, and cannot describe the glyph characteristics of characters; errors therefore easily occur when the edit distance is used alone to search for characters with a high matching degree and close semantics.
In some embodiments, obtaining the sound-shape code data of the target text field may include the following steps: determining the finals, initials, and tones of the characters in the target text field; constructing the corresponding sound codes according to the finals, initials, and tones of the characters and a preset encoding rule; determining the structures, four-corner codes, and stroke counts of the characters in the target text field; constructing the corresponding font codes according to the structures, four-corner codes, and stroke counts of the characters and the preset encoding rule; and splicing the sound codes and the font codes to obtain the sound-shape code data of the target text field.
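The encoding steps above can be sketched as follows, reusing the example values this embodiment gives for the character "Lang" in fig. 3; the lookup tables standing in for the preset encoding rule are assumptions:

```python
# Minimal sketch of the sound-shape encoding. The tables below are
# assumptions standing in for the preset encoding rule; the stroke-count
# key in particular is hypothetical.

FINAL_CODES = {"ANG": "F"}             # final -> 1st data bit
INITIAL_CODES = {"L": "7"}             # initial -> 2nd data bit
STRUCTURE_CODES = {"left-right": "1"}  # character structure -> 5th data bit
STROKE_CODES = {11: "B"}               # stroke count -> 10th data bit (hypothetical key)

def sound_code(final, initial, tone):
    # data bits p1-p4: final, initial, reserved (null, recorded as 0), tone
    return FINAL_CODES[final] + INITIAL_CODES[initial] + "0" + str(tone)

def font_code(structure, four_corner, strokes):
    # data bits p5-p10: structure, the four four-corner digits, stroke count
    return STRUCTURE_CODES[structure] + four_corner + STROKE_CODES[strokes]

def sound_shape_code(attrs):
    # splice the sound code and the font code into the 10-bit sound-shape code
    return (sound_code(attrs["final"], attrs["initial"], attrs["tone"])
            + font_code(attrs["structure"], attrs["four_corner"], attrs["strokes"]))

lang = {"final": "ANG", "initial": "L", "tone": 2,
        "structure": "left-right", "four_corner": "1313", "strokes": 11}
```

For a multi-character field, the per-character codes would be concatenated to form the sound-shape code data of the whole field.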
Through this embodiment, sound-shape code data that simultaneously describes the pronunciation characteristics and glyph characteristics of the characters contained in the target text field can be accurately obtained. Based on the sound-shape code data, the reference text field with a higher matching degree and closer semantics can subsequently be found more accurately for correction processing.
Specifically, for example, take obtaining the sound-shape code data of the character "Lang" in the target text field; see fig. 3.
It can be determined that the final of the character is "ANG", the initial is "L", and the tone is the second tone. Then, according to the preset encoding rule, the encoded value of the first data bit (which may be denoted as p1), corresponding to the final, is set to "F"; the encoded value of the second data bit (p2), corresponding to the initial, is set to "7"; the encoded value of the third data bit (p3) is set to null, recorded as 0; and the encoded value of the fourth data bit (p4), corresponding to the tone, is set to "2". The sound code describing the pronunciation characteristics of the character is thus obtained: "F702", corresponding to the first 4 data-bit encoded values in the sound-shape code of fig. 3.
Similarly, the structure of the character (e.g., left-right, top-bottom, enclosing), its four-corner code, and its stroke count may be further determined. Then, according to the preset encoding rule, the encoded value of the fifth data bit (which may be denoted as p5), corresponding to the structure of the character, is set to the corresponding "1"; the encoded values of the sixth through ninth data bits (p6, p7, p8, p9), corresponding to the four-corner code of the character, are set in sequence to the corresponding "1", "3", "1", "3"; and the encoded value of the tenth data bit (p10), corresponding to the stroke count, is set to the corresponding "B". The font code describing the glyph characteristics of the character is thus obtained: "11313B", corresponding to the last 6 data-bit encoded values in the sound-shape code of fig. 3.
The sound code and the font code are then spliced, yielding the sound-shape code data of fig. 3, which simultaneously describes the pronunciation characteristics and glyph characteristics of the character "Lang".
In the above manner, the sound-shape code data of the other characters in the target text field can be determined. The sound-shape code data of the characters contained in the target text field are then combined to obtain the sound-shape code data of the whole target text field.
In some embodiments, in specific implementation, the similarity parameter between the target text field and each preset proprietary text field in the target text field library can be calculated by simultaneously using the sound-shape code data of the target text field, the sound-shape code data of the preset proprietary text field in the target text field library, and the editing distance between the target text field and the preset proprietary text field; and based on the similarity parameter, finding out a preset special text field which is relatively matched with the target text field and has relatively close semantics from a plurality of preset special text fields contained in the target text field library, and using the preset special text field as a reference text field meeting the requirement.
The similarity parameter may be specifically understood as parameter data that comprehensively describes the similarity between different text fields based on multiple dimensions such as glyph, pronunciation, and semantics. Generally, the larger the similarity parameter of two text fields, the more similar they are and the higher the probability that they are the same text field; conversely, the smaller the similarity parameter, the more dissimilar they are and the lower the probability that they are the same text field.
In some embodiments, determining the similarity parameter between the target text field and the preset proprietary text field according to the sound-shape code data of the target text field and the edit distance between the two fields may include the following steps: calculating, according to the sound-shape code data of the target text field, the sound code similarity parameter and the font code similarity parameter between the target text field and the preset proprietary text field; and performing, according to preset weight parameters, a weighted summation of the edit distance between the target text field and the preset proprietary text field, the sound code similarity parameter, and the font code similarity parameter, to obtain the similarity parameter between the target text field and the preset proprietary text field.
Through the embodiment, the similarity parameter between the target text field with good effect and high accuracy and the preset proprietary text field can be accurately calculated.
In some embodiments, the preset weight parameters may specifically include: a first weight parameter (which may be denoted as w1) corresponding to the edit distance, a second weight parameter (w2) corresponding to the sound code similarity parameter, and a third weight parameter (w3) corresponding to the font code similarity parameter.
In some embodiments, the values of the three weight parameters may be set flexibly according to the type characteristics of the characters in the target text field. Specifically, for example, for ideographic characters such as Chinese characters, the third weight parameter and the first weight parameter may each be set relatively large, e.g., 0.4, and the second weight parameter may be set relatively small, e.g., 0.2. Of course, the listed preset weight parameters are only illustrative; in specific implementation, other values may be used according to the specific situation and precision requirements. This specification is not limited in this respect.
In some embodiments, when implemented, the similarity parameter between the target text field and the preset proprietary text field may be calculated, for example, according to a formula of the following form:

E = w1 × (1 − edit_distance / max(s1, s2)) + w2 × P + w3 × S

where E denotes the similarity parameter between the target text field and the preset proprietary text field; edit_distance denotes the edit distance between the target text field and the preset proprietary text field; P denotes the sound code similarity parameter of the two fields; S denotes the font code similarity parameter of the two fields; s1 denotes the character length of the target text field; s2 denotes the character length of the preset proprietary text field; w1 denotes the first weight parameter corresponding to the edit distance; w2 denotes the second weight parameter corresponding to the sound code similarity parameter; and w3 denotes the third weight parameter corresponding to the font code similarity parameter.

A similarity parameter with higher precision can be accurately calculated by the above formula.
In some embodiments, when the edit distance between the target text field and the preset proprietary text field is specifically calculated, the target text field and the preset proprietary text field may be converted into two corresponding character strings based on the same conversion mapping rule, and then the feature vectors corresponding to the two character strings are calculated; and calculating the vector distance between the two feature vectors to obtain the editing distance between the target text field and the preset special text field.
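A sketch of how these quantities might be combined, assuming the example weights 0.4/0.2/0.4 given above. A standard Levenshtein dynamic program is used here as a stand-in for the feature-vector-based edit distance this embodiment describes, and the normalization of the edit distance by the longer field length is an assumption:

```python
# Hedged sketch: weighted combination of the edit distance, the sound code
# similarity P, and the font code similarity S into one similarity parameter.

def edit_distance(a, b):
    """Standard Levenshtein distance via dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return prev[-1]

def similarity(target, preset, P, S, w1=0.4, w2=0.2, w3=0.4):
    # convert the edit distance into a similarity in [0, 1] before weighting
    edit_sim = 1 - edit_distance(target, preset) / max(len(target), len(preset))
    return w1 * edit_sim + w2 * P + w3 * S
```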
In some embodiments, when the sound code similarity parameter between the target text field and the preset proprietary text field is specifically calculated, a preset similarity operation may be performed using the encoded values of the data bits of the sound code of each character in the target text field and the encoded values of the corresponding data bits of the sound code of the corresponding character in the preset proprietary text field, so as to obtain the sound code similarity parameter between the target text field and the preset proprietary text field.
Specifically, taking the calculation of the sound code similarity parameter between a target text field containing only one character and a preset proprietary text field as an example, the similarity parameter between the corresponding sound codes can be calculated in a form such as:

P = (1/4) × [δ(p1, p1′) + δ(p2, p2′) + δ(p3, p3′) + δ(p4, p4′)]

where P is the sound code similarity parameter of the target text field and the preset proprietary text field; δ(p1, p1′) indicates that the preset similarity operation is performed using the encoded value of the first data bit of the character in the target text field and the encoded value of the first data bit of the character at the same position in the preset proprietary text field, and likewise for the second, third, and fourth data bits. The preset similarity operation δ returns 1 when the two encoded values participating in the operation are the same, and otherwise returns 0.
In some embodiments, similarly, when the similarity parameter of the font codes of the target text field and the preset proprietary text field is specifically calculated, a preset similarity operation may be performed by using the coded value of the data bit of the font code of each character in the target text field and the coded value of the corresponding data of the font code of the corresponding character in the preset proprietary text field, so as to obtain the similarity parameter of the font codes of the target text field and the preset proprietary text field.
Specifically, taking the calculation of the font code similarity parameter between a target text field containing only one character and a preset proprietary text field as an example, the similarity parameter between the corresponding font codes can be calculated in a form such as:

S = (1/6) × [δ(p5, p5′) + δ(p6, p6′) + δ(p7, p7′) + δ(p8, p8′) + δ(p9, p9′) + δ(p10, p10′)]

where S is the font code similarity parameter of the target text field and the preset proprietary text field; p5 through p10 are the encoded values of the fifth through tenth data bits (the font code) of the character in the target text field, p10 in particular being the stroke-count data bit; p5′ through p10′ are the encoded values of the corresponding data bits of the font code of the character at the same position in the preset proprietary text field; and δ is the preset similarity operation defined above.
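The bit-wise comparison described above can be sketched as follows for single-character fields; averaging the per-bit results of the preset similarity operation is one plausible reading of the formulas, which appear only as images in the original:

```python
# delta is the preset similarity operation: 1 when the two encoded values are
# the same, 0 otherwise. Codes are 10-symbol sound-shape codes; data bits
# p1-p4 form the sound code and data bits p5-p10 form the font code.

def delta(a, b):
    return 1 if a == b else 0

def sound_code_sim(code_a, code_b):
    # average the per-bit comparisons over the four sound code bits
    return sum(delta(x, y) for x, y in zip(code_a[:4], code_b[:4])) / 4

def font_code_sim(code_a, code_b):
    # average the per-bit comparisons over the six font code bits
    return sum(delta(x, y) for x, y in zip(code_a[4:10], code_b[4:10])) / 6
```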
Through this embodiment, the similarity parameters between the target text field and each preset proprietary text field contained in the target text field library can be calculated. The preset proprietary text field having the largest similarity parameter with the target text field can then be screened out from the target text field library, to serve as the reference text field that best matches the target text field, has the closest semantics, and meets the requirement.
In some embodiments, in order to reduce the amount of data processing and improve data processing efficiency when calculating the similarity parameters, comparison text fields may first be screened from the plurality of preset proprietary text fields contained in the target text field library according to the number of characters contained in the target text field. The similarity parameters are then calculated only between the target text field and each comparison text field, and the reference text field that matches and corresponds to the target text field is screened out from the comparison text fields based on those similarity parameters.
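A minimal sketch of this length-based prescreening, assuming a simple character-count tolerance; the function name and tolerance value are illustrative, not taken from the patent:

```python
def prescreen_by_length(target, library, tolerance=1):
    """Keep only library fields whose character count is within
    `tolerance` of the target's, before any costlier similarity math."""
    n = len(target)
    return [field for field in library if abs(len(field) - n) <= tolerance]
```

Only the surviving comparison fields then need to go through the full sound-shape similarity computation.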
In some embodiments, correcting the target text field according to the reference text field may include the following steps: comparing the characters in the reference text field with the characters in the target text field to determine the character positions at which they differ; and replacing the characters at those positions in the target text field with the characters at the same positions in the reference text field, so as to obtain the corrected target text field.
Through this embodiment, the standard reference text field can be effectively used to correct the target text field in a targeted manner, eliminating the errors in the target text field, obtaining a corrected target text field with high accuracy, and thereby identifying and extracting the required proprietary text field from the target image more accurately.
Specifically, take as an example a target text field in which the proprietary term "未分配利润" ("undistributed profit") has had one character mis-recognized. The preset proprietary text field with the largest similarity parameter found from the target text field library in the manner above is "未分配利润". Accordingly, that preset proprietary text field can be determined as the reference text field and used to perform targeted correction processing on the target text field.
During the correction, the character at each position in the target text field is compared in turn with the character at the same position in the reference text field, so as to find the character positions where they differ. For example, if the character at the fifth character position in the target text field was mis-recognized, while the character at the fifth character position in the reference text field is "润", the fifth character position is determined to be a position where a character difference exists.
Then, taking the reference text field as the standard, the characters at the positions where differences exist are replaced with the corresponding characters from the reference text field, so that the target text field is corrected in a targeted manner. For example, the mis-recognized character at the fifth character position in the target text field can be replaced with the character "润" from the reference text field, yielding the corrected target text field.
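The per-position compare-and-replace step can be sketched as follows. The sample strings use "未分配利润" ("undistributed profit"), with "闰" standing in as a hypothetical mis-recognized fifth character — an illustrative assumption, not taken from the patent text:

```python
def diff_positions(target, reference):
    """Character positions at which two equal-length fields differ."""
    return [i for i, (t, r) in enumerate(zip(target, reference)) if t != r]

def correct_field(target, reference):
    """Replace each differing character in `target` with the character
    at the same position in the standard `reference` field."""
    chars = list(target)
    for i in diff_positions(target, reference):
        chars[i] = reference[i]
    return "".join(chars)
```

For equal-length fields the result agrees with the reference at every previously differing position.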
In some embodiments, after the corrected target text field is obtained, the method may further include: performing the corresponding target service data processing according to the corrected target text field. For example, the corrected target text field may be used to extract the data value field that corresponds to it, so as to restore the target service data in complete text form; the text-form target service data can then be used for related data processing such as service handling, service data statistics, and service data storage.
In some embodiments, before implementation, a plurality of preset text field libraries may be established by sorting and summarizing the historical business data.
Specifically, before the target image is acquired, the method may further include: acquiring a plurality of pieces of historical service data, where the historical service data has the same theme type as the target service data; extracting a plurality of text fields from the historical service data; and filtering out the text fields that represent data values, so as to obtain a plurality of preset proprietary text fields and construct a preset text field library matched with the theme type of the target service data.
Through this embodiment, preset text field libraries that respectively correspond to a plurality of theme types, each storing the preset proprietary text fields related to its theme type, can be constructed in advance.
In some embodiments, after the text fields representing data values are filtered out to obtain the plurality of preset proprietary text fields, the historical usage frequency of each preset proprietary text field may be counted from the historical service data. The preset proprietary text fields with higher usage frequency can then be screened out according to this historical usage frequency to construct the corresponding preset text field library, yielding a more compact library that is more efficient in subsequent use.
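A minimal sketch of this frequency-based library construction, assuming the historical proprietary fields arrive as a flat list; the threshold value is illustrative:

```python
from collections import Counter

def build_field_library(historical_fields, min_frequency=2):
    """Count occurrences of each proprietary text field in the
    historical business data and keep only the frequent ones."""
    freq = Counter(historical_fields)
    return {field for field, n in freq.items() if n >= min_frequency}
```

Raising `min_frequency` trades library coverage for a smaller, faster lookup set.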
In some embodiments, after the preset text field library is constructed in the above manner, the service data of the same theme type may be acquired once per preset time period, and the library may then be updated with the service data of the most recent period. For example, preset proprietary text fields that newly appeared in the previous period may be added to the library, and preset proprietary text fields whose usage frequency in the previous period was 0 may be deleted from it.
As can be seen from the above, in the service data processing method provided in this specification, the server can directly use an existing preset OCR recognition model to process the target image containing the target service data and extract the target text field from the resulting OCR recognition result. It then determines, according to the target text field library matched with the theme type of the target service data, whether the target text field is a proprietary text field to be corrected. If so, the server acquires the sound-shape code data of the field, which contains both a phonetic code obtained from the pronunciation characteristics of the characters and a font code obtained from their font characteristics. According to this sound-shape code data and the edit distance between the target text field and each preset proprietary text field, a reference text field meeting the requirements can be accurately screened out from the preset proprietary text fields contained in the target text field library, and the currently recognized target text field can be corrected against it in a targeted manner to obtain a more accurate corrected target text field. The server can therefore identify and extract the proprietary text fields in the target service data from the target image efficiently and accurately, without consuming a large amount of resources, time, and cost to specially train an OCR recognition model for the theme type of the target service data, which effectively reduces the processing cost of the service data and improves its processing efficiency.
Referring to fig. 4, another method for processing service data is further provided in the embodiments of the present specification. When the method is implemented, the following contents may be included.
S401: acquiring a target image, where the target image contains target service data;
S402: processing the target image by invoking a preset OCR recognition model to obtain a corresponding OCR recognition result, and extracting a target text field from the OCR recognition result;
S403: determining, according to a target text field library, whether the target text field is a proprietary text field to be corrected, where the target text field library is matched with the theme type of the target service data and stores a plurality of preset proprietary text fields;
S404: in the case that the target text field is determined to be a proprietary text field to be corrected, acquiring the sound-shape code data of the target text field, where the sound-shape code data contains a phonetic code obtained from the pronunciation characteristics of the characters and a font code obtained from the font characteristics of the characters;
S405: determining the similarity parameter between the target text field and each preset proprietary text field according to the sound-shape code data of the target text field;
S406: screening out a preset proprietary text field meeting the requirements from the target text field library according to the similarity parameter, using it as a reference text field, and correcting the target text field according to the reference text field.
Through this embodiment, the corrected target text field can be acquired from the target image relatively accurately without calculating the edit distance, which further simplifies the data processing flow and improves data processing efficiency.
Embodiments of the present specification further provide a server, including a processor and a memory for storing processor-executable instructions. When implemented, the processor may perform the following steps according to the instructions: acquiring a target image, where the target image contains target service data; processing the target image by invoking a preset OCR recognition model to obtain a corresponding OCR recognition result, and extracting a target text field from the OCR recognition result; determining, according to a target text field library, whether the target text field is a proprietary text field to be corrected, where the target text field library is matched with the theme type of the target service data and stores a plurality of preset proprietary text fields; in the case that the target text field is determined to be a proprietary text field to be corrected, acquiring the sound-shape code data of the target text field, where the sound-shape code data contains a phonetic code obtained from the pronunciation characteristics of the characters and a font code obtained from the font characteristics of the characters; determining the similarity parameter between the target text field and each preset proprietary text field according to the sound-shape code data of the target text field and the edit distance between the target text field and the preset proprietary text field; screening out a preset proprietary text field meeting the requirements from the target text field library according to the similarity parameter, using it as a reference text field; and correcting the target text field according to the reference text field.
In order to more accurately complete the above instructions, referring to fig. 5, another specific server is provided in the embodiments of the present specification, wherein the server includes a network communication port 501, a processor 502 and a memory 503, and the above structures are connected by an internal cable, so that the structures can perform specific data interaction.
The network communication port 501 may be specifically configured to acquire a target image; wherein, the target image contains target service data.
The processor 502 may be specifically configured to: process the target image by invoking a preset OCR recognition model to obtain a corresponding OCR recognition result; extract a target text field from the OCR recognition result; determine, according to a target text field library, whether the target text field is a proprietary text field to be corrected, where the target text field library is matched with the theme type of the target service data and stores a plurality of preset proprietary text fields; in the case that the target text field is determined to be a proprietary text field to be corrected, acquire the sound-shape code data of the target text field, where the sound-shape code data contains a phonetic code obtained from the pronunciation characteristics of the characters and a font code obtained from the font characteristics of the characters; determine the similarity parameter between the target text field and each preset proprietary text field according to the sound-shape code data of the target text field and the edit distance between them; screen out a preset proprietary text field meeting the requirements from the target text field library according to the similarity parameter as a reference text field; and correct the target text field according to the reference text field.
The memory 503 may be specifically configured to store a corresponding instruction program.
In this embodiment, the network communication port 501 may be a virtual port bound to different communication protocols, so that different data can be sent or received. For example, it may be a port responsible for web data communication, FTP data communication, or mail data communication. The network communication port can also be a physical communication interface or communication chip, for example a wireless mobile network communication chip such as a GSM or CDMA chip, a Wi-Fi chip, or a Bluetooth chip.
In this embodiment, the processor 502 may be implemented in any suitable manner. For example, the processor may take the form of, for example, a microprocessor or processor and a computer-readable medium that stores computer-readable program code (e.g., software or firmware) executable by the (micro) processor, logic gates, switches, an Application Specific Integrated Circuit (ASIC), a programmable logic controller, an embedded microcontroller, and so forth. The description is not intended to be limiting.
In this embodiment, the memory 503 may comprise multiple levels. In a digital system, anything that can store binary data may serve as memory; in an integrated circuit, a circuit with a storage function but no physical form is also called a memory, such as a RAM or a FIFO; and in a system, a storage device in physical form is likewise called a memory, such as a memory bank.
An embodiment of the present specification further provides a computer storage medium based on the foregoing service data processing method, the computer storage medium storing computer program instructions which, when executed, implement: acquiring a target image, where the target image contains target service data; processing the target image by invoking a preset OCR recognition model to obtain a corresponding OCR recognition result, and extracting a target text field from the OCR recognition result; determining, according to a target text field library, whether the target text field is a proprietary text field to be corrected, where the target text field library is matched with the theme type of the target service data and stores a plurality of preset proprietary text fields; in the case that the target text field is determined to be a proprietary text field to be corrected, acquiring the sound-shape code data of the target text field, where the sound-shape code data contains a phonetic code obtained from the pronunciation characteristics of the characters and a font code obtained from the font characteristics of the characters; determining the similarity parameter between the target text field and each preset proprietary text field according to the sound-shape code data of the target text field and the edit distance between them; screening out a preset proprietary text field meeting the requirements from the target text field library according to the similarity parameter as a reference text field; and correcting the target text field according to the reference text field.
In this embodiment, the storage medium includes, but is not limited to, a Random Access Memory (RAM), a Read-Only Memory (ROM), a Cache (Cache), a Hard Disk Drive (HDD), or a Memory Card (Memory Card). The memory may be used to store computer program instructions. The network communication unit may be an interface for performing network connection communication, which is set in accordance with a standard prescribed by a communication protocol.
In this embodiment, the functions and effects specifically realized by the program instructions stored in the computer storage medium can be explained by comparing with other embodiments, and are not described herein again.
Referring to fig. 6, at the software level, an embodiment of the present specification further provides a device for processing service data, and the device may specifically include the following structural modules.
The first obtaining module 601 may be specifically configured to obtain a target image; wherein, the target image contains target service data;
the first processing module 602 may be specifically configured to process the target image by invoking a preset OCR recognition model to obtain a corresponding OCR recognition result; extracting a target text field according to the OCR recognition result;
the first determining module 603 may be specifically configured to determine, according to a target text field library, whether the target text field belongs to a proprietary text field to be corrected; the target text field library is matched with the subject type of the target service data; the target text field library stores a plurality of preset proprietary text fields;
the second obtaining module 604 may be specifically configured to obtain the sound-shape code data of the target text field when it is determined that the target text field is a proprietary text field to be corrected, where the sound-shape code data contains a phonetic code obtained from the pronunciation characteristics of the characters and a font code obtained from the font characteristics of the characters;
the second determining module 605 may be specifically configured to determine the similarity parameter between the target text field and the preset proprietary text field according to the sound-shape code data of the target text field and the edit distance between the target text field and the preset proprietary text field;
the second processing module 606 may be specifically configured to screen out a preset proprietary text field meeting the requirement from the target text field library according to the similarity parameter, and use the preset proprietary text field as a reference text field; and correcting the target text field according to the reference text field.
It should be noted that, the units, devices, modules, etc. illustrated in the above embodiments may be implemented by a computer chip or an entity, or implemented by a product with certain functions. For convenience of description, the above devices are described as being divided into various modules by functions, and are described separately. It is to be understood that, in implementing the present specification, functions of each module may be implemented in one or more pieces of software and/or hardware, or a module that implements the same function may be implemented by a combination of a plurality of sub-modules or sub-units, or the like. The above-described embodiments of the apparatus are merely illustrative, and for example, the division of the units is only one logical division, and other divisions may be realized in practice, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
As can be seen from the above, the processing apparatus for business data provided in this specification can identify and extract the proprietary text field in the target business data from the target image relatively efficiently and accurately without consuming a large amount of resources, time, and cost to train the OCR recognition model specifically for the topic type to which the target business data belongs, so that the processing cost of the business data can be effectively reduced, and the processing efficiency of the business data can be improved.
In a specific scenario example, the processing method of the service data provided in the present specification may be applied to correct the text field recognized by the OCR. The following can be referred to for specific implementation.
As shown in fig. 7, the method mainly includes: data acquisition, OCR recognition, professional vocabulary judgment, judgment on whether to correct errors or not, correction calculation, result output and the like.
Step 1: first, data collection is performed (for example, obtaining a target image containing target business data); mainly, financial statements and certificates (target business data) are collected, for example by scanning or photographing. The collected data is sliced according to a format (e.g., a preset structure template) to obtain corresponding fragment data, as shown in fig. 8. The sliced fragment data can then be used for subsequent OCR recognition.
Step 2: the OCR recognition is mainly performed on the sliced image patches (obtaining OCR recognition results).
Step 3: the recognized text is word-segmented, and a professional-vocabulary determination is made as to whether the recognized text is professional vocabulary of a professional domain (e.g., the target theme type).
Step 4: this step determines whether the text identified as professional vocabulary in the previous step requires error-correction processing (which corresponds to determining whether the target text field is a proprietary text field to be corrected). Specifically, a confidence-based decision method may be used to judge whether the confidence of the OCR recognition is above a threshold. If it is below the threshold, error correction is deemed necessary and step 5 is triggered.
Step 5: this step performs the correction calculation, specifically using a similarity calculation method based on the font code and the phonetic code (e.g., the sound-shape code data of the characters in the field).
Step 6: the result is output and the process ends.
In this scenario example, the similarity calculation method to be described may specifically be to calculate the similarity between two different character strings by a certain method. A percentage may be used to measure the degree of similarity between different strings.
Existing methods can achieve good results on text matching of Latin-script words. However, because they were designed for Latin characters, applying them to similarity calculation for logographic scripts (e.g., Chinese characters) easily produces errors, and the effect is comparatively poor.
For example, consider 南通市 (Nantong City), 难通市, and 北通市. Using the edit-distance algorithm, the similarity between 南通市 and 难通市 comes out the same as the similarity between 南通市 and 北通市, because converting either variant into 南通市 carries the same cost of one substitution. The N-Gram algorithm gives the same result.
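This can be checked with a plain Levenshtein implementation; the point is that edit distance alone scores both variants identically, even though 难通市 is a homophone of 南通市 while 北通市 is not:

```python
def edit_distance(a, b):
    """Classic Levenshtein distance via a rolling one-row DP table."""
    dp = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        prev, dp[0] = dp[0], i          # prev holds the diagonal cell
        for j, cb in enumerate(b, 1):
            prev, dp[j] = dp[j], min(dp[j] + 1,       # deletion
                                     dp[j - 1] + 1,   # insertion
                                     prev + (ca != cb))  # substitution
    return dp[-1]
```

Both pairs differ by exactly one substitution, so the algorithm assigns them identical distances.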
However, for anyone familiar with Chinese, 南通市 and 难通市 are actually far more similar, because the two are pronounced identically. When someone says "nan tong shi", a Chinese speaker very quickly maps it to 南通市.
As another example, consider a pair of four-character fields whose first two characters differ. Judged as if they were Latin strings, the two fields are only about 50% similar: two of the four characters are different, and the differing characters are also completely different in pronunciation.
However, for anyone familiar with Chinese, the two fields should have a relatively high degree of similarity, because the differing characters, although distinct, have very close glyphs.
In order to solve the above problem, in this scenario example, a processing idea is proposed: the Chinese character in the field is firstly converted into a group of alphanumeric sequences, and the hash algorithm used for the conversion must be capable of retaining the font characteristics and pronunciation characteristics of the Chinese character. Based on the conversion, the similarity problem of Chinese characters can be changed into the similarity problem of two groups of alphanumeric sequences. Therefore, the key to the algorithm is to find a suitable hash algorithm.
Based on this idea, long-term practical testing led to introducing the sound-shape code (e.g., the sound-shape code data) to solve the above problems. The sound-shape code is an encoding scheme for Chinese characters that converts one Chinese character into a ten-character alphanumeric sequence while preserving, to a certain extent, the character's pronunciation and glyph characteristics. As shown in fig. 3, the whole sound-shape code is divided into two parts: the first part is the phonetic code (corresponding to the tone code), and the second part is the font code.
The phonetic code mainly covers the final, the initial, a supplementary code, and the tone.
In the font code, the first bit (the fifth data bit) may be called the structure bit; it represents the structure of the Chinese character with a single character according to the character's structural type. The next four bits (the sixth, seventh, eighth, and ninth data bits) use the Four-Corner code to describe the shape of the Chinese character. The last bit (the tenth data bit) is the stroke bit: 1 to 9 represent stroke counts below 10, A represents 10, B represents 11, and so on, with Z representing 35 as well as any stroke count greater than 35.
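A small sketch of the stroke-bit mapping described above (1-9 literal, 10 → 'A', 11 → 'B', …, 35 and above → 'Z'); the function name is illustrative:

```python
def stroke_digit(stroke_count):
    """Encode a stroke count as the last character of the font code."""
    if stroke_count < 10:
        return str(stroke_count)       # 1..9 kept as literal digits
    if stroke_count >= 35:
        return "Z"                     # 35 and anything above map to Z
    return chr(ord("A") + stroke_count - 10)  # 10 -> A, 11 -> B, ...
```

This keeps the stroke information to one character so the whole sound-shape code stays a fixed ten characters.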
After encoding in the above manner, similarity calculation can be performed, including calculating the phonetic similarity (e.g., the phonetic-code similarity parameter) and the glyph similarity (e.g., the font-code similarity parameter).
In the present scenario example, the similarity calculation can be based on the phonetic code and the font code simultaneously, in combination with the edit distance. The calculation consists of two parts: the first part is the edit-distance term, and the second part is the sound-shape similarity terms. The formula can be expressed as:

Sim = w1 · (1 − D / max(S1, S2)) + w2 · P + w3 · S

where S1 and S2 respectively indicate the string lengths of the two fields, D is the edit distance between them, P is the similarity of the phonetic codes (e.g., the phonetic-code similarity parameter), and S is the similarity of the font codes (e.g., the font-code similarity parameter). w1, w2, and w3 are the weights controlling the respective portions (i.e., the first, second, and third weight parameters).

P and S are calculated as the per-bit agreement of the corresponding codes:

P = (1/n) · Σ_i (q_i ⊕ q_i'),  S = (1/n) · Σ_i (p_i ⊕ p_i')

where q_i and p_i are the encoded values of the i-th data bits of the phonetic code and the font code respectively, n is the number of data bits compared in the respective code, and ⊕ represents the operation: if the two characters are the same, return 1; otherwise, return 0.
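A runnable sketch of this weighted blend, under stated assumptions: the edit-distance term is normalized by the longer string length, P and S are per-position match fractions over the code strings, and the weights w1 = 0.4, w2 = w3 = 0.3 are illustrative placeholders (the text does not fix their values):

```python
def edit_distance(a, b):
    """Classic Levenshtein distance via a rolling one-row DP table."""
    dp = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        prev, dp[0] = dp[0], i
        for j, cb in enumerate(b, 1):
            prev, dp[j] = dp[j], min(dp[j] + 1, dp[j - 1] + 1,
                                     prev + (ca != cb))
    return dp[-1]

def match_fraction(code_a, code_b):
    """Average of the per-position indicator: 1 if the bits agree, else 0."""
    pairs = list(zip(code_a, code_b))
    return sum(x == y for x, y in pairs) / len(pairs)

def combined_similarity(s1, s2, phon1, phon2, glyph1, glyph2,
                        w1=0.4, w2=0.3, w3=0.3):
    """Sim = w1*(1 - D/max(|s1|,|s2|)) + w2*P + w3*S."""
    edit_sim = 1 - edit_distance(s1, s2) / max(len(s1), len(s2))
    p = match_fraction(phon1, phon2)   # phonetic-code similarity P
    s = match_fraction(glyph1, glyph2)  # font-code similarity S
    return w1 * edit_sim + w2 * p + w3 * s
```

With weights summing to 1, identical fields and codes score 1.0, and any mismatch lowers the score in proportion to the affected term's weight.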
Step 6: finally, the most similar professional vocabulary is output as the corrected result.
In the present scenario example, the result after correction can be seen in fig. 9. With this method, only the fragment with sequence number 3 was recognized with an error, and an accurate result was obtained for it through correction; the other fragments were recognized correctly.
Through the scene example, the method provided by the specification is verified, and the special text field in the target service data can be identified and extracted from the target image more efficiently and accurately only by using the existing OCR recognition model, so that the processing cost of the service data can be effectively reduced, and the processing efficiency of the service data is improved.
Although the present specification provides method steps as described in the examples or flowcharts, additional or fewer steps may be included based on conventional or non-inventive means. The order of steps recited in the embodiments is merely one manner of performing the steps in a multitude of orders and does not represent the only order of execution. When an apparatus or client product in practice executes, it may execute sequentially or in parallel (e.g., in a parallel processor or multithreaded processing environment, or even in a distributed data processing environment) according to the embodiments or methods shown in the figures. The terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, the presence of additional identical or equivalent elements in a process, method, article, or apparatus that comprises the recited elements is not excluded. The terms first, second, etc. are used to denote names, but not any particular order.
Those skilled in the art will also appreciate that, in addition to implementing the controller as pure computer readable program code, the same functionality can be implemented by logically programming method steps such that the controller is in the form of logic gates, switches, application specific integrated circuits, programmable logic controllers, embedded microcontrollers and the like. Such a controller may therefore be considered as a hardware component, and the means included therein for performing the various functions may also be considered as a structure within the hardware component. Or even means for performing the functions may be regarded as being both a software module for performing the method and a structure within a hardware component.
This specification may be described in the general context of computer-executable instructions executed by a computer, such as program modules. Generally, program modules include routines, programs, objects, components, data structures, classes, and the like that perform particular tasks or implement particular abstract data types. The specification may also be practiced in distributed computing environments where tasks are performed by remote processing devices linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media, including memory storage devices.
From the above description of the embodiments, it is clear to those skilled in the art that this specification can be implemented by software plus a necessary general-purpose hardware platform. Based on this understanding, the technical solutions of this specification may essentially be embodied in the form of a software product, which may be stored in a storage medium such as a ROM/RAM, a magnetic disk, or an optical disk, and which includes several instructions for enabling a computer device (which may be a personal computer, a mobile terminal, a server, a network device, or the like) to execute the methods described in the embodiments, or in some parts of the embodiments, of this specification.
The embodiments in this specification are described in a progressive manner; identical or similar parts among the embodiments may be referred to one another, and each embodiment focuses on its differences from the others. The specification is operational with numerous general-purpose or special-purpose computing-system environments or configurations, for example: personal computers, server computers, handheld or portable devices, tablet devices, multiprocessor systems, microprocessor-based systems, set-top boxes, programmable electronic devices, network PCs, minicomputers, mainframe computers, and distributed computing environments that include any of the above systems or devices.
While the specification has been described with reference to examples, those skilled in the art will appreciate that numerous variations and permutations exist that do not depart from its spirit, and the appended claims are intended to cover such variations and modifications.

Claims (14)

1. A service data processing method, characterized by comprising the following steps:
acquiring a target image, wherein the target image contains target service data;
processing the target image by calling a preset OCR recognition model to obtain a corresponding OCR recognition result, and extracting a target text field from the OCR recognition result;
determining, according to a target text field library, whether the target text field belongs to a proprietary text field to be corrected, wherein the target text field library matches the subject type of the target service data and stores a plurality of preset proprietary text fields;
in the case that the target text field is determined to belong to a proprietary text field to be corrected, acquiring sound-shape code data of the target text field, wherein the sound-shape code data comprises a pronunciation code obtained from the pronunciation characteristics of the characters and a font code obtained from the glyph characteristics of the characters;
determining a similarity parameter between the target text field and a preset proprietary text field according to the sound-shape code data of the target text field and the edit distance between the target text field and the preset proprietary text field;
and screening out, from the target text field library according to the similarity parameter, a preset proprietary text field that meets a requirement to serve as a reference text field, and correcting the target text field according to the reference text field.
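The edit distance used in the last two steps is not defined in the claim; the standard Levenshtein distance is the usual choice. A minimal single-row dynamic-programming sketch (the function name is ours, not the patent's):

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two strings (insert/delete/substitute)."""
    m, n = len(a), len(b)
    # dp[j] holds the distance between a[:i] and b[:j] for the current row i
    dp = list(range(n + 1))
    for i in range(1, m + 1):
        prev, dp[0] = dp[0], i
        for j in range(1, n + 1):
            cur = dp[j]
            dp[j] = min(dp[j] + 1,                          # deletion
                        dp[j - 1] + 1,                      # insertion
                        prev + (a[i - 1] != b[j - 1]))      # substitution
            prev = cur
    return dp[n]
```

Because OCR confusions are usually single-character substitutions of similar-looking glyphs, the distance between a misread field and its true proprietary term is typically 1 or 2.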
2. The method of claim 1, wherein the target service data comprises at least one of the following: a financial statement, a transfer voucher, or a trade order.
3. The method of claim 2, wherein processing the target image by calling the preset OCR recognition model to obtain the corresponding OCR recognition result comprises:
determining the subject type of the target service data;
screening out, from a plurality of preset structure templates, a target structure template corresponding to the subject type of the target service data;
slicing the target image according to the target structure template to obtain a plurality of image slices;
and calling the preset OCR recognition model to process the plurality of image slices to obtain the corresponding OCR recognition result.
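The template-driven slicing of claim 3 can be sketched as a lookup from subject type to named crop boxes that an image library would then use to cut slices. The subject types, region names, and coordinates below are illustrative assumptions, not values from the patent:

```python
from typing import Dict, List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels

# Hypothetical preset structure templates keyed by subject type.
TEMPLATES: Dict[str, Dict[str, Box]] = {
    "transfer_voucher": {
        "payer":  (0,   0,  400, 60),
        "payee":  (0,  60,  400, 120),
        "amount": (400, 0,  600, 60),
    },
}

def slice_regions(subject_type: str) -> List[Tuple[str, Box]]:
    """Screen out the structure template matching the subject type and
    return the named crop boxes for each field region."""
    template = TEMPLATES[subject_type]
    return sorted(template.items())  # deterministic order for OCR batching
```

Running OCR per slice keeps each recognized field tied to a known region name, which is what later lets the method match a field against the right proprietary lexicon.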
4. The method of claim 3, wherein determining whether the target text field belongs to a proprietary text field to be corrected according to the target text field library comprises:
determining, from a plurality of preset text field libraries, the target text field library matching the subject type of the target service data;
counting the number of characters contained in the target text field;
screening out, from the target text field library, the preset proprietary text fields with the same number of characters to serve as comparison text fields;
counting the proportion of identical characters between each comparison text field and the target text field;
detecting whether the proportion of identical characters between at least one comparison text field and the target text field is greater than a preset proportion threshold;
in the case that the proportion of identical characters between at least one comparison text field and the target text field is determined to be greater than the preset proportion threshold, acquiring a confidence parameter of the target text field;
and determining, according to the confidence parameter of the target text field, whether the target text field belongs to a proprietary text field to be corrected.
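The screening in claim 4 can be sketched as below; the two thresholds and the rule that an exact lexicon hit needs no correction are our assumptions:

```python
from typing import Iterable

def is_candidate_for_correction(field: str, lexicon: Iterable[str],
                                confidence: float,
                                ratio_threshold: float = 0.5,
                                confidence_threshold: float = 0.9) -> bool:
    """Decide whether an OCR field should be treated as a proprietary
    text field to be corrected (claim-4 logic; thresholds illustrative)."""
    if field in lexicon:            # exact hit: nothing to fix
        return False
    # comparison text fields: same character count as the target field
    same_length = [t for t in lexicon if len(t) == len(field)]
    for term in same_length:
        same = sum(a == b for a, b in zip(field, term))
        if same / len(field) > ratio_threshold:
            # close to a known proprietary term: correct only when the
            # OCR model itself was unsure about this field
            return confidence < confidence_threshold
    return False
```

The length filter keeps the comparison cheap, and gating on the model's own confidence avoids "correcting" fields the OCR engine already read reliably.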
5. The method of claim 4, wherein obtaining the confidence parameter for the target text field comprises:
when the preset OCR recognition model is called to process the plurality of image slices to obtain the corresponding OCR recognition result, acquiring a prediction score output by the preset OCR recognition model for the OCR recognition result;
and determining the confidence parameter of the target text field according to the prediction score.
6. The method of claim 1, wherein acquiring the sound-shape code data of the target text field comprises:
determining the initials, finals, and tones of the characters in the target text field;
constructing a corresponding pronunciation code from the initials, finals, and tones of the characters in the target text field according to a preset coding rule;
determining the structure, four-corner code, and stroke count of the characters in the target text field;
constructing a corresponding font code from the structure, four-corner code, and stroke count of the characters in the target text field according to a preset coding rule;
and splicing the pronunciation code and the font code to obtain the sound-shape code data of the target text field.
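The construction in claim 6 can be sketched with per-character lookup tables. A real system would derive the pronunciation features from a pinyin dictionary (e.g. the pypinyin library) and the glyph features from a character database; the tiny tables below, including the structure labels and four-corner values, are toy illustrations only:

```python
# Toy lookup tables for two characters; values are illustrative.
PRONUNCIATION = {          # char -> (initial, final, tone)
    "工": ("g", "ong", "1"),
    "商": ("sh", "ang", "1"),
}
GLYPH = {                  # char -> (structure, four-corner code, strokes)
    "工": ("single", "1010", "3"),
    "商": ("top-bottom", "0022", "11"),
}

def phono_glyph_code(text: str) -> str:
    """Splice a pronunciation code and a font (glyph) code per character,
    then join per-character codes into the field's sound-shape code data."""
    parts = []
    for ch in text:
        ini, fin, tone = PRONUNCIATION[ch]
        struct, corners, strokes = GLYPH[ch]
        parts.append(f"{ini}{fin}{tone}|{struct}-{corners}-{strokes}")
    return ";".join(parts)
```

Encoding both channels lets the method catch OCR errors that sound right but look wrong and vice versa, since the two codes diverge in different ways for homophones and look-alikes.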
7. The method of claim 6, wherein determining the similarity parameter between the target text field and the preset proprietary text field according to the sound-shape code data of the target text field and the edit distance between the target text field and the preset proprietary text field comprises:
calculating, according to the sound-shape code data of the target text field, a pronunciation-code similarity parameter and a font-code similarity parameter between the target text field and the preset proprietary text field;
and performing, according to preset weight parameters, a weighted summation of the edit distance between the target text field and the preset proprietary text field, the pronunciation-code similarity parameter, and the font-code similarity parameter to obtain the similarity parameter between the target text field and the preset proprietary text field.
8. The method of claim 7, wherein the similarity parameter between the target text field and the predetermined proprietary text field is calculated according to the following equation:
[Equation image: FDA0002999176500000031]
wherein E is the similarity parameter between the target text field and the preset proprietary text field; edit_distance is the edit distance between the target text field and the preset proprietary text field; P is the pronunciation-code similarity parameter; S is the font-code similarity parameter; s1 is the character length of the target text field; s2 is the character length of the preset proprietary text field; w1 is a first weight parameter corresponding to the edit distance; w2 is a second weight parameter corresponding to the pronunciation-code similarity parameter; and w3 is a third weight parameter corresponding to the font-code similarity parameter.
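One plausible reading of the weighted summation in claims 7-8; the patent's exact formula is in the equation image and is not reproduced here, so the length normalization via max(s1, s2) and the default weights are our assumptions:

```python
def similarity(edit_dist: int, s1: int, s2: int,
               p: float, s: float,
               w1: float = 0.4, w2: float = 0.3, w3: float = 0.3) -> float:
    """Combine a length-normalized edit-distance score with the
    pronunciation-code similarity p and the font-code similarity s."""
    edit_score = 1.0 - edit_dist / max(s1, s2)   # 1.0 when the fields are identical
    return w1 * edit_score + w2 * p + w3 * s
```

The candidate with the highest E among the comparison fields would then be taken as the reference text field.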
9. The method of claim 1, wherein correcting the target text field according to the reference text field comprises:
comparing the characters in the reference text field with the characters at the same character positions in the target text field to determine the character positions at which the characters differ;
and replacing, at each character position where the characters differ, the character in the target text field with the character at that position in the reference text field to obtain the corrected target text field.
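For equal-length fields, the claim-9 replacement reduces to copying the reference character at every differing position; the equal-length check is our assumption (the comparison fields of claim 4 already share the target's character count):

```python
def correct_field(target: str, reference: str) -> str:
    """Replace each character of target that differs from reference at the
    same position; with equal lengths the result equals the reference."""
    if len(target) != len(reference):
        raise ValueError("positionwise comparison requires equal lengths")
    return "".join(r if t != r else t for t, r in zip(target, reference))
```

Although the result equals the reference here, recording which positions changed is still useful for auditing what the OCR engine misread.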
10. The method of claim 1, wherein prior to acquiring the target image, the method further comprises:
acquiring a plurality of pieces of historical service data, wherein the historical service data have the same subject type as the target service data;
extracting a plurality of text fields from the plurality of pieces of historical service data;
and filtering out the text fields that represent data values from the plurality of text fields to obtain a plurality of preset proprietary text fields, so as to construct a preset text field library matching the subject type of the target service data.
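The library construction in claim 10 can be sketched as filtering value-like fields (amounts, dates, quantities) out of historical OCR output; the regular expression defining a "data value" is an illustrative assumption:

```python
import re

# Fields consisting only of digits and numeric punctuation/currency marks
# are treated as data values, not proprietary terms.
_VALUE_LIKE = re.compile(r"^[\d.,:/\-%¥$\s]+$")

def build_lexicon(historical_fields):
    """Keep non-empty fields that are not pure data values; the survivors
    become the preset proprietary text fields of the library."""
    return {f for f in historical_fields if f and not _VALUE_LIKE.match(f)}
```

Building one library per subject type keeps each lexicon small, which makes the same-length screening of claim 4 both faster and less prone to spurious matches.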
11. A service data processing apparatus, comprising:
a first acquisition module configured to acquire a target image, wherein the target image contains target service data;
a first processing module configured to process the target image by calling a preset OCR recognition model to obtain a corresponding OCR recognition result, and to extract a target text field from the OCR recognition result;
a first determination module configured to determine, according to a target text field library, whether the target text field belongs to a proprietary text field to be corrected, wherein the target text field library matches the subject type of the target service data and stores a plurality of preset proprietary text fields;
a second acquisition module configured to acquire sound-shape code data of the target text field in the case that the target text field is determined to belong to a proprietary text field to be corrected, wherein the sound-shape code data comprises a pronunciation code obtained from the pronunciation characteristics of the characters and a font code obtained from the glyph characteristics of the characters;
a second determination module configured to determine a similarity parameter between the target text field and a preset proprietary text field according to the sound-shape code data of the target text field and the edit distance between the target text field and the preset proprietary text field;
and a second processing module configured to screen out, from the target text field library according to the similarity parameter, a preset proprietary text field that meets a requirement to serve as a reference text field, and to correct the target text field according to the reference text field.
12. A server comprising a processor and a memory for storing processor-executable instructions which, when executed by the processor, implement the steps of the method of any one of claims 1 to 10.
13. A computer-readable storage medium having stored thereon computer instructions which, when executed, implement the steps of the method of any one of claims 1 to 10.
14. A service data processing method, characterized by comprising the following steps:
acquiring a target image, wherein the target image contains target service data;
processing the target image by calling a preset OCR recognition model to obtain a corresponding OCR recognition result, and extracting a target text field from the OCR recognition result;
determining, according to a target text field library, whether the target text field belongs to a proprietary text field to be corrected, wherein the target text field library matches the subject type of the target service data and stores a plurality of preset proprietary text fields;
in the case that the target text field is determined to belong to a proprietary text field to be corrected, acquiring sound-shape code data of the target text field, wherein the sound-shape code data comprises a pronunciation code obtained from the pronunciation characteristics of the characters and a font code obtained from the glyph characteristics of the characters;
determining a similarity parameter between the target text field and a preset proprietary text field according to the sound-shape code data of the target text field;
and screening out, from the target text field library according to the similarity parameter, a preset proprietary text field that meets a requirement to serve as a reference text field, and correcting the target text field according to the reference text field.
CN202110341368.4A 2021-03-30 2021-03-30 Service data processing method and device and server Pending CN112861844A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110341368.4A CN112861844A (en) 2021-03-30 2021-03-30 Service data processing method and device and server

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110341368.4A CN112861844A (en) 2021-03-30 2021-03-30 Service data processing method and device and server

Publications (1)

Publication Number Publication Date
CN112861844A true CN112861844A (en) 2021-05-28

Family

ID=75993241

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110341368.4A Pending CN112861844A (en) 2021-03-30 2021-03-30 Service data processing method and device and server

Country Status (1)

Country Link
CN (1) CN112861844A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113268537A (en) * 2021-06-23 2021-08-17 北京深度制耀科技有限公司 Genetic resource declaration writing method and device
CN114926831A (en) * 2022-05-31 2022-08-19 平安普惠企业管理有限公司 Text-based recognition method and device, electronic equipment and readable storage medium
CN115547508A (en) * 2022-11-29 2022-12-30 联仁健康医疗大数据科技股份有限公司 Data correction method, data correction device, electronic equipment and storage medium


Similar Documents

Publication Publication Date Title
CN112861844A (en) Service data processing method and device and server
CN113313022B (en) Training method of character recognition model and method for recognizing characters in image
CN112528637B (en) Text processing model training method, device, computer equipment and storage medium
CN107229627B (en) Text processing method and device and computing equipment
CN113254654B (en) Model training method, text recognition method, device, equipment and medium
CN112257437B (en) Speech recognition error correction method, device, electronic equipment and storage medium
CN111814822B (en) Sensitive picture detection method and device and electronic equipment
CN112347223B (en) Document retrieval method, apparatus, and computer-readable storage medium
CN115443490A (en) Image auditing method and device, equipment and storage medium
CN111046879A (en) Certificate image classification method and device, computer equipment and readable storage medium
CN112784009B (en) Method and device for mining subject term, electronic equipment and storage medium
CN112861842A (en) Case text recognition method based on OCR and electronic equipment
CN115862040A (en) Text error correction method and device, computer equipment and readable storage medium
CN115658955A (en) Cross-media retrieval and model training method, device, equipment and menu retrieval system
CN111222005A (en) Voiceprint data reordering method and device, electronic equipment and storage medium
CN114973229A (en) Text recognition model training method, text recognition device, text recognition equipment and medium
CN112257689A (en) Training and recognition method of face recognition model, storage medium and related equipment
CN117009968A (en) Homology analysis method and device for malicious codes, terminal equipment and storage medium
CN116311276A (en) Document image correction method, device, electronic equipment and readable medium
CN107016316B (en) barcode identification method and device
CN115909381A (en) Text image recognition method, system and related device
CN115618043A (en) Text operation graph mutual inspection method and model training method, device, equipment and medium
CN115578739A (en) Training method and device for realizing IA classification model by combining RPA and AI
CN115080745A (en) Multi-scene text classification method, device, equipment and medium based on artificial intelligence
CN113095313A (en) Text string recognition method and device and server

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination