CN111027345A

CN111027345A - Font identification method and apparatus

Info

Publication number: CN111027345A
Application number: CN201811172388.8A
Authority: CN
Inventors: 邓斌; 章庆元
Original assignee: Beijing Kingsoft Office Software Inc; Zhuhai Kingsoft Office Software Co Ltd; Guangzhou Kingsoft Mobile Technology Co Ltd
Current assignee: Beijing Kingsoft Office Software Inc; Zhuhai Kingsoft Office Software Co Ltd; Guangzhou Kingsoft Mobile Technology Co Ltd
Priority date: 2018-10-09
Filing date: 2018-10-09
Publication date: 2020-04-17

Abstract

An embodiment of the invention provides a font identification method and a font identification apparatus. The method comprises: determining a picture to be identified, wherein the picture to be identified is a picture containing characters to be identified; and inputting the picture to be identified into a font recognition network model to obtain the font of the characters to be identified in the picture. The font recognition network model is obtained by training on a training set, and the training set comprises: a plurality of sample pictures each containing a character, and the real font of the character in each sample picture. Because a scanned document can be converted into a picture format, the font identification method provided by the embodiment of the invention can be used to identify the font of the characters in a scanned document.

Description

Font identification method and apparatus

Technical Field

The present invention relates to the field of image processing technologies, and in particular, to a font identification method and apparatus.

Background

A scanned document is an electronic file obtained by scanning a paper document with a scanner. The text in a scanned document is not editable. To make that text editable, the scanned document can be converted into an editable document.

To convert a scanned document into an editable document, information such as the font size, font and position of the characters in the scanned document must be acquired. The font size and position of the characters can be acquired by existing image processing methods. In the prior art, however, the font of the characters in a scanned document cannot be identified.

Disclosure of Invention

The embodiment of the invention aims to provide a font identification method and a font identification apparatus so as to identify the font of the characters in a scanned document. The specific technical solution is as follows:

In order to solve the above problem, an embodiment of the present invention provides a font identification method, where the method includes:

determining a picture to be identified, wherein the picture to be identified is a picture containing characters to be identified;

inputting the picture to be recognized into a font recognition network model to obtain the font of the character to be recognized in the picture to be recognized;

the font recognition network model is a model obtained by training a preset neural network model according to a training set, wherein the training set comprises: a plurality of sample pictures including a text, and a real font of the text in each sample picture.

Optionally, the step of determining the picture to be recognized includes:

acquiring a scanned document to be identified;

and intercepting the character area in the scanned document to be identified as the picture to be identified.

Optionally, if the picture to be recognized includes a plurality of characters to be recognized, the step of inputting the picture to be recognized into the font recognition network model includes:

cutting the picture to be recognized into a plurality of target pictures, wherein each target picture comprises a character to be recognized;

and respectively inputting each target picture into the font identification network model.

Optionally, the font recognition network model is obtained by training through the following steps:

acquiring a preset neural network model and the training set;

inputting the sample pictures contained in the training set into the neural network model to obtain the predicted fonts of characters in the sample pictures;

determining a loss value according to the obtained predicted font and the real font of the characters in the sample pictures contained in the training set;

determining whether the neural network model converges according to the loss value;

if not, adjusting parameter values in the neural network model, and returning to the step of inputting the sample pictures contained in the training set into the updated neural network model to obtain the predicted fonts of the characters in the sample pictures;

and if so, determining the current neural network model as the font recognition network model.

Optionally, the preset neural network model is a TensorFlow neural network model.

In order to solve the above problem, an embodiment of the present invention further provides a font identification apparatus, where the apparatus includes:

a determining module, configured to determine a picture to be identified, wherein the picture to be identified is a picture containing characters to be identified;

the recognition module is used for inputting the picture to be recognized into a font recognition network model to obtain the font of the character to be recognized in the picture to be recognized;

Optionally, the determining module is specifically configured to:

acquiring a scanned document to be identified;

Optionally, if the picture to be recognized includes a plurality of characters to be recognized, the recognition module is specifically configured to:

Optionally, the apparatus further comprises a training module;

the training module is specifically configured to:

acquiring a preset neural network model and the training set;

The embodiment of the invention also provides electronic equipment which comprises a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory complete mutual communication through the communication bus;

a memory for storing a computer program;

and the processor is used for realizing any method step when executing the program stored in the memory.

An embodiment of the present invention further provides a computer-readable storage medium, in which a computer program is stored, and the computer program, when executed by a processor, implements any of the above method steps.

Of course, not all of the advantages described above need to be achieved at the same time in the practice of any one product or method of the invention.

Drawings

To illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly introduced below. The drawings described below show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.

Fig. 1 is a flowchart of a font identification method according to an embodiment of the present invention;

Fig. 2 is a schematic diagram of cropping a scanned document to obtain a picture to be recognized according to an embodiment of the present invention;

Fig. 3 is a schematic diagram of a picture to be recognized according to an embodiment of the present invention;

Fig. 4 is a flowchart of training a font recognition network model according to an embodiment of the present invention;

Fig. 5 is a schematic structural diagram of a font identification apparatus according to an embodiment of the present invention;

Fig. 6 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

To address the technical problem that the font of the characters in a scanned document cannot be identified in the prior art, the embodiment of the invention provides a font identification method, which comprises: determining a picture to be identified, wherein the picture to be identified is a picture containing characters to be identified; and inputting the picture to be identified into a font recognition network model to obtain the font of the characters to be identified in the picture. The font recognition network model is obtained by training on a training set, and the training set comprises: a plurality of sample pictures each containing a character, and the real font of the character in each sample picture. Because a scanned document can be converted into a picture format, the font identification method provided by the embodiment of the invention can be used to identify the font of the characters in a scanned document.

The present invention will be described in detail below with reference to specific examples.

Referring to fig. 1, fig. 1 is a flowchart of a font identification method according to an embodiment of the present invention, where the method may include the following steps:

S101: determining a picture to be identified, wherein the picture to be identified is a picture containing characters to be identified.

In one embodiment, the picture to be identified may be a picture cropped from a scanned document. Specifically, step S101 of determining the picture to be identified may include the following steps:

Step 11: determining a scanned document to be identified.

The scanned document to be identified may be a scan in PDF format or a scan in a picture format.

Step 12: cropping a text area from the scanned document as the picture to be identified.

To identify the font of the text in the scanned document, the text area in the scanned document may be cropped out. The cropped text area is in a picture format and can be used as the picture to be identified. Referring to fig. 2, fig. 2 is a schematic diagram of cropping a scanned document to obtain a picture to be identified.
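Steps 11 and 12 can be illustrated with a minimal sketch. The patent does not say how the text area is located, so the approach below (the bounding box of all dark pixels) is only an assumption, as are the names `find_text_bbox`, `crop` and `ink_threshold` and the representation of a page as rows of grayscale values.

```python
from typing import List, Tuple

# Assumed representation: a grayscale page as rows of pixel values,
# where 0 is black ink and 255 is white paper.
Image = List[List[int]]


def find_text_bbox(page: Image, ink_threshold: int = 128) -> Tuple[int, int, int, int]:
    """Return (top, left, bottom, right) of the smallest box holding all ink pixels.

    bottom and right are exclusive, so they can be used directly as slice bounds.
    """
    rows = [r for r, row in enumerate(page) if any(p < ink_threshold for p in row)]
    cols = [c for c in range(len(page[0])) if any(row[c] < ink_threshold for row in page)]
    if not rows or not cols:
        raise ValueError("no text pixels found on the page")
    return rows[0], cols[0], rows[-1] + 1, cols[-1] + 1


def crop(page: Image, bbox: Tuple[int, int, int, int]) -> Image:
    """Cut the rectangle described by bbox out of the page."""
    top, left, bottom, right = bbox
    return [row[left:right] for row in page[top:bottom]]
```

A real scanned page would first need to be decoded from PDF or an image format into such a pixel grid, and a production system would likely use a layout-analysis step rather than a single bounding box.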

S102: inputting the picture to be recognized into the font recognition network model to obtain the font of the character to be recognized in the picture to be recognized. The font recognition network model is obtained by training a preset neural network model according to a training set. The training set comprises: a plurality of sample pictures each containing a character, and the real font of the character in each sample picture.

In an embodiment of the present invention, the preset neural network model may be a TensorFlow neural network model.

In the embodiment of the invention, the font identification network model is trained according to the sample picture containing one character and the real font of the character in the sample picture, and after the training is finished, the picture to be identified is input into the font identification network model, so that the font of the character in the picture to be identified can be obtained.

In an embodiment of the present invention, if the picture to be recognized includes a plurality of characters to be recognized, the step of inputting the picture to be recognized into the font recognition network model may include the following refining steps:

Step 21: cutting the picture to be recognized into a plurality of target pictures, wherein each target picture contains one character to be recognized.

If the picture to be recognized contains a plurality of characters, it can be cut to obtain a plurality of target pictures, each of which contains one character.

For example, referring to fig. 3, fig. 3 is a schematic diagram of a picture to be recognized. Since the picture to be recognized shown in fig. 3 contains 7 characters, it can be divided into 7 target pictures, each containing one character.

Step 22: inputting each target picture into the font recognition network model separately.

After the cutting is finished, the target pictures can be input into the font identification network model respectively to obtain the fonts of the characters in the target pictures.
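The cutting in step 21 can be sketched as follows. The patent does not specify a segmentation algorithm; this example assumes a single line of text and splits it wherever a fully blank pixel column separates two runs of ink (a vertical projection). The function name `split_characters` and the threshold are hypothetical.

```python
from typing import List

# Assumed representation: rows of grayscale pixels, 0 = ink, 255 = paper.
Image = List[List[int]]


def split_characters(picture: Image, ink_threshold: int = 128) -> List[Image]:
    """Split a one-line text picture into target pictures at blank columns."""
    width = len(picture[0])
    # True for every column that contains at least one ink pixel.
    has_ink = [any(row[c] < ink_threshold for row in picture) for c in range(width)]

    targets: List[Image] = []
    start = None
    # The trailing False is a sentinel that closes a run ending at the right edge.
    for c, ink in enumerate(has_ink + [False]):
        if ink and start is None:
            start = c                      # a new character begins
        elif not ink and start is not None:
            targets.append([row[start:c] for row in picture])
            start = None                   # the character ended at column c - 1
    return targets
```

Each returned target picture would then be passed to the font recognition network model on its own. Characters made of horizontally separated components can be over-split by this simple rule, so a practical system would need a more robust segmentation.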

In an embodiment provided by the present invention, the fonts of the characters within a single paragraph of a scanned document are usually consistent. Therefore, when font recognition is performed on the target pictures corresponding to a single paragraph, if a few recognition results differ from the others, those few results may be considered incorrect and may be corrected.
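One way to realize this correction is a majority vote over the recognition results of one paragraph, sketched below. The patent does not define how many results count as "a few"; the `min_share` parameter and the function name `correct_outliers` are assumptions made for illustration.

```python
from collections import Counter
from typing import List


def correct_outliers(predicted_fonts: List[str], min_share: float = 0.5) -> List[str]:
    """Replace minority predictions with the dominant font of the paragraph.

    The correction is applied only when the most common font accounts for
    strictly more than min_share of all predictions; otherwise the
    predictions are returned unchanged.
    """
    if not predicted_fonts:
        return []
    font, count = Counter(predicted_fonts).most_common(1)[0]
    if count / len(predicted_fonts) > min_share:
        return [font] * len(predicted_fonts)
    return list(predicted_fonts)
```

For example, `correct_outliers(["Song", "Song", "Kai", "Song"])` returns four "Song" entries, while an evenly split input such as `["Song", "Kai"]` is left untouched.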

Referring to fig. 4, in an embodiment of the present invention, the font identification network model may be obtained by training through the following steps:

S401: acquiring a preset neural network model and a training set.

The training set may comprise: a plurality of sample pictures each containing a character, and the real font of the character in each sample picture.

S402: inputting the sample pictures contained in the training set into the neural network model to obtain the predicted fonts of the characters in the sample pictures.

The neural network model may be a recurrent neural network model, a convolutional neural network model, a recurrent convolutional neural network model, a deep neural network model, or the like. The embodiment of the present invention is not limited in this respect.

In an embodiment of the invention, after a sample picture is input into the neural network model, the model can extract features from the sample picture, perform convolution on the extracted features, and then apply max pooling to the convolution result to obtain a max pooling result. The max pooling result is then input into a fully connected layer of the neural network model to obtain the predicted font of the characters in the sample picture.
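The convolution, max pooling and fully connected stages can be sketched in plain Python to show the data flow. This is not the patent's network: a single kernel, a 2x2 pooling window, and the names `conv2d`, `max_pool`, `fully_connected` and `predict_font_index` are all assumptions, and a real model would be built with a framework such as TensorFlow and have many kernels and layers.

```python
from typing import List

Matrix = List[List[float]]


def conv2d(image: Matrix, kernel: Matrix) -> Matrix:
    """Stride-1 convolution without padding (cross-correlation, as in CNN layers)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = len(image) - kh + 1, len(image[0]) - kw + 1
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for c in range(out_w)]
            for r in range(out_h)]


def max_pool(feature_map: Matrix, size: int = 2) -> Matrix:
    """Non-overlapping max pooling; rows and columns that do not fill a window are dropped."""
    out_h, out_w = len(feature_map) // size, len(feature_map[0]) // size
    return [[max(feature_map[r * size + i][c * size + j]
                 for i in range(size) for j in range(size))
             for c in range(out_w)]
            for r in range(out_h)]


def fully_connected(features: List[float], weights: Matrix, biases: List[float]) -> List[float]:
    """Compute one score per font class: dot(weights[k], features) + biases[k]."""
    return [sum(w * x for w, x in zip(row, features)) + b
            for row, b in zip(weights, biases)]


def predict_font_index(image: Matrix, kernel: Matrix, weights: Matrix, biases: List[float]) -> int:
    """Run convolution, max pooling and the fully connected layer; return the best class index."""
    pooled = max_pool(conv2d(image, kernel))
    flat = [value for row in pooled for value in row]
    scores = fully_connected(flat, weights, biases)
    return max(range(len(scores)), key=scores.__getitem__)
```

The returned index would be mapped back to a font name through whatever labeling scheme the training set uses.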

S403: determining a loss value according to the obtained predicted font and the real font of the characters in the sample picture contained in the training set.

After the predicted font of the characters in the sample picture is obtained, the loss value can be calculated by combining the real font of the characters in the sample picture.

In one embodiment of the present invention, the fonts of the text can be labeled with numbers. For example, the Song typeface is labeled "0001", the Kai (regular script) typeface is labeled "0002", the Hei (sans-serif) typeface is labeled "0003", and so on.

The loss value may be calculated from the numerical label of the real font of the text in the sample picture and the numerical label of the predicted font of the text in the sample picture. When calculating the loss value, the Mean Squared Error (MSE) may be used as the loss function, or another loss function may be selected; the embodiment of the present invention is not limited in this respect.
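As a small worked example of the MSE option, the sketch below converts the numerical font labels (strings such as "0001") to numbers and averages the squared differences. The helper names `label_to_number` and `mse_loss` are hypothetical.

```python
from typing import Sequence


def label_to_number(label: str) -> int:
    """Convert a numerical font label such as "0002" into the integer 2."""
    return int(label)


def mse_loss(predicted_labels: Sequence[str], real_labels: Sequence[str]) -> float:
    """Mean squared error between predicted and real numerical font labels."""
    if len(predicted_labels) != len(real_labels) or not real_labels:
        raise ValueError("label sequences must be non-empty and of equal length")
    squared_errors = [
        (label_to_number(p) - label_to_number(r)) ** 2
        for p, r in zip(predicted_labels, real_labels)
    ]
    return sum(squared_errors) / len(squared_errors)
```

With predictions "0001", "0002", "0003" and real fonts "0001", "0003", "0003", only the second sample is wrong, by one label step, so the loss is (0 + 1 + 0) / 3. Note that MSE treats the labels as ordered quantities, so a confusion between "0001" and "0003" is penalized more than one between "0001" and "0002"; this is one reason a different loss function might be chosen.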

S404: determining whether the neural network model has converged according to the loss value. If so, step S405 is performed. If not, step S406 is performed.

In an embodiment of the present invention, a loss threshold may be preset. When the calculated loss value is greater than the preset loss threshold, the neural network model has not converged and training continues. When the calculated loss value is not greater than the preset loss threshold, the neural network model has converged.

In another embodiment of the present invention, a maximum number of iterations may be preset. When the number of iterations reaches the maximum during training, the neural network model may be considered to have converged. Before the maximum number of iterations is reached, the neural network model is also considered to have converged if the calculated loss value is not greater than a preset loss threshold.

S405: determining the current neural network model as the font recognition network model.

S406: adjusting parameter values in the neural network model, and returning to step S402.
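The loop formed by steps S402 to S406, with both convergence criteria described above, can be sketched generically. The model itself is abstracted behind three caller-supplied functions (`predict`, `loss_fn`, `adjust`); these names, the signature of `train` and the default values are assumptions for illustration and do not come from the patent.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

P = TypeVar("P")  # type of the model parameters
X = TypeVar("X")  # type of one sample picture
Y = TypeVar("Y")  # type of one label or prediction


def train(
    params: P,
    samples: Sequence[X],
    labels: Sequence[Y],
    predict: Callable[[P, X], Y],
    loss_fn: Callable[[List[Y], Sequence[Y]], float],
    adjust: Callable[[P, Sequence[X], Sequence[Y], List[Y]], P],
    loss_threshold: float = 0.01,
    max_iterations: int = 1000,
) -> Tuple[P, float, int]:
    """Repeat S402-S406 until the loss threshold or the iteration limit is reached."""
    if max_iterations < 1:
        raise ValueError("max_iterations must be at least 1")
    iteration = 0
    while True:
        predictions = [predict(params, s) for s in samples]      # S402
        loss = loss_fn(predictions, labels)                      # S403
        iteration += 1
        # S404: converged if the loss is small enough or the limit is reached.
        if loss <= loss_threshold or iteration >= max_iterations:
            return params, loss, iteration                       # S405
        params = adjust(params, samples, labels, predictions)    # S406
```

The function returns the final parameters together with the loss computed for exactly those parameters and the number of iterations used, so the caller can tell which of the two criteria ended the training.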

Based on the same inventive concept, according to the above font identification method embodiment, an embodiment of the present invention further provides a font identification apparatus, referring to fig. 5, which may include the following modules:

the determining module 501 is configured to determine a picture to be recognized, where the picture to be recognized is a picture including characters to be recognized.

The recognition module 502 is configured to input the picture to be recognized into the font recognition network model, so as to obtain a font of the character to be recognized in the picture to be recognized;

the font recognition network model is obtained by training a preset neural network model according to a training set, wherein the training set comprises: a plurality of sample pictures including a text, and a real font of the text in each sample picture.

In an embodiment of the present invention, the determining module 501 may be specifically configured to:

acquiring a scanned document to be identified;

and cropping a text area from the scanned document to be identified as the picture to be identified.

In an embodiment of the present invention, if the picture to be recognized includes a plurality of characters to be recognized, the recognition module 502 may be specifically configured to:

cutting a picture to be recognized into a plurality of target pictures, wherein each target picture comprises a character to be recognized;

and respectively inputting each target picture into the font recognition network model.

In an embodiment of the present invention, on the basis of the apparatus shown in fig. 5, the apparatus may further include a training module, and the training module may be specifically configured to: acquiring a preset neural network model and a training set;

inputting the sample pictures contained in the training set into a neural network model to obtain the predicted fonts of characters in the sample pictures;

determining a loss value according to the obtained predicted font and the real font of the characters in the sample picture contained in the training set;

Therefore, the font identification apparatus provided by the embodiment of the invention can determine a picture to be identified, wherein the picture to be identified is a picture containing characters to be identified, and can input the picture to be identified into the font recognition network model to obtain the font of the characters to be identified in the picture. The font recognition network model is obtained by training on a training set, and the training set comprises: a plurality of sample pictures each containing a character, and the real font of the character in each sample picture. Because a scanned document can be converted into a picture format, the font identification apparatus provided by the embodiment of the invention can identify the font of the characters in a scanned document.

Based on the same inventive concept, according to the above font identification method embodiment, an embodiment of the present invention further provides an electronic device, as shown in fig. 6, including a processor 601, a communication interface 602, a memory 603, and a communication bus 604, where the processor 601, the communication interface 602, and the memory 603 complete mutual communication through the communication bus 604,

a memory 603 for storing a computer program;

the processor 601 is configured to implement the following steps when executing the program stored in the memory 603:

Therefore, the electronic device provided by the embodiment of the invention can determine a picture to be recognized, wherein the picture to be recognized is a picture containing characters to be recognized, and can input the picture to be recognized into the font recognition network model to obtain the font of the characters to be recognized in the picture. The font recognition network model is obtained by training on a training set, and the training set comprises: a plurality of sample pictures each containing a character, and the real font of the character in each sample picture. Because a scanned document can be converted into a picture format, the electronic device provided by the embodiment of the invention can identify the font of the characters in a scanned document.

The communication bus mentioned for the electronic device may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The communication bus may be divided into an address bus, a data bus, a control bus, and so on. For ease of illustration, only one thick line is shown in the figure, but this does not mean that there is only one bus or only one type of bus.

The communication interface is used for communication between the electronic equipment and other equipment.

The Memory may include a Random Access Memory (RAM) or a Non-Volatile Memory (NVM), such as at least one disk Memory. Optionally, the memory may also be at least one memory device located remotely from the processor.

The Processor may be a general-purpose Processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; but may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other Programmable logic device, discrete Gate or transistor logic device, discrete hardware component.

Based on the same inventive concept, according to the above-mentioned font identification method embodiment, an embodiment of the present invention further provides a computer-readable storage medium, in which a computer program is stored, and the computer program, when executed by a processor, implements any of the above-mentioned method steps.

It is noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.

The embodiments in this specification are described in a related manner; for parts that are the same or similar across embodiments, the embodiments may be referred to one another, and each embodiment focuses on its differences from the others. In particular, because the embodiments of the font identification apparatus, the electronic device and the computer-readable storage medium are substantially similar to the embodiments of the font identification method, they are described relatively briefly; for the relevant points, refer to the corresponding description of the method embodiments.

The above description is only for the preferred embodiment of the present invention, and is not intended to limit the scope of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention shall fall within the protection scope of the present invention.

Claims

1. A method for font recognition, the method comprising:

2. The method according to claim 1, wherein the step of determining the picture to be recognized comprises:

acquiring a scanned document to be identified;

3. The method of claim 1, wherein if the image to be recognized includes a plurality of words to be recognized, the step of inputting the image to be recognized into a font recognition network model comprises:

4. The method of claim 1, wherein the font recognition network model is obtained by training using the following steps:

acquiring a preset neural network model and the training set;

5. The method according to claim 1, wherein the preset neural network model is a TensorFlow neural network model.

6. A font recognition apparatus, characterized in that the apparatus comprises:

7. The apparatus of claim 6, wherein the determining module is specifically configured to:

acquiring a scanned document to be identified;

8. The apparatus of claim 6, wherein if the picture to be recognized includes a plurality of words to be recognized, the recognition module is specifically configured to:

9. The apparatus of claim 6, further comprising a training module;

the training module is specifically configured to:

acquiring a preset neural network model and the training set;

10. The apparatus according to claim 6, wherein the preset neural network model is a TensorFlow neural network model.

11. An electronic device, characterized by comprising a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory communicate with one another through the communication bus;

a memory for storing a computer program;

a processor for implementing the method steps of any one of claims 1 to 5 when executing a program stored in the memory.

12. A computer-readable storage medium, characterized in that a computer program is stored in the computer-readable storage medium, which computer program, when being executed by a processor, carries out the method steps of any one of the claims 1-5.