CN112686319A - Merging method of electric power signal model training files - Google Patents

Publication number
CN112686319A
Authority
CN
China
Prior art keywords
file
picture
character
suffix
tif
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011638466.6A
Other languages
Chinese (zh)
Inventor
张海永
高承贵
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing Taisi De Intelligent Electric Co ltd
Original Assignee
Nanjing Taisi De Intelligent Electric Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2020-12-31
Filing date
2020-12-31
Publication date
2021-04-20
Application filed by Nanjing Taisi De Intelligent Electric Co ltd
Priority to CN202011638466.6A
Publication of CN112686319A

Landscapes

  • Processing Or Creating Images (AREA)

Abstract

The invention discloses a method for merging training files of a power signal model, comprising the following steps: selecting .box text files and picture files with the suffix .tif; setting file parameters; automatically naming the .tif picture files and the .box files to form training files that meet the naming specification; writing the font name taken from each file name, in the set format, into a text file named font_properties; calling the tesseract command to generate a text file with the suffix .tr for each .tif file; combining the .box file names and the .tr file names into space-separated character strings, respectively; and finally calling the combine_tessdata command of tesseract to generate a traineddata file. By merging multiple models, the method combines character pictures that were manually marked in practical applications with character pictures generated by the method, and uses the manually marked data to correct wrongly recognized characters, thereby reducing the training workload and improving the accuracy of character recognition.

Description

Merging method of electric power signal model training files
Technical Field
The invention belongs to the technical field of training methods for Chinese character recognition models, and particularly relates to a method for merging training files of a power signal model.
Background
The Chinese model of the Tesseract character recognition engine has a low recognition rate, and users commonly improve it by retraining the engine on the characters they use most. Because training requires adjusting the positions of characters in pictures and the sizes of character boxes in large quantities, it brings a heavy workload; and if the training files are not merged, processing time is long and recognition efficiency is low.
Disclosure of Invention
The technical problem to be solved by the invention is to provide a method for merging training files of a power signal model, so as to solve the problems in the prior art.
The technical scheme adopted by the invention is as follows: a method for merging training files of a power signal model, comprising the following steps:
1) selecting .box text files and picture files with the suffix .tif, including both manually marked files and generated files;
2) setting the training parameters of each file, including the training language and the page mode, with the training language defaulting to Chinese;
3) automatically naming the .tif picture files and the .box files according to the selected file name and the name of the model to be generated, to form training files that meet the specification, wherein the .tif picture file is named according to the following specification:
[Figure BDA0002879261620000021: naming specification for .tif picture files]
and the .box text file is named according to the following specification:
[Figure BDA0002879261620000022: naming specification for .box text files]
4) for each .tif picture file name, taking the name between the first "." and the second "." as the font name, and writing it into the text file named font_properties in the following format, one line per font:
font 0 0 0 0 0
5) calling the tesseract command to generate a text file with the suffix .tr for each .tif file, the command being as follows:
tesseract power.font.exp0.tif power.font.exp0 -psm 6 nobatch box.train
6) combining the names of all .box text files into a space-separated character string, passing it as the parameters of the unicharset_extractor command, and executing that command; combining the names of all .tr text files into a space-separated character string, passing it as the parameters of the shapeclustering, mftraining and cntraining commands, and executing those commands;
7) finally, calling the combine_tessdata command of tesseract to generate a traineddata file, which is the finally merged character model.
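Step 4) can be illustrated with a minimal Python sketch. It assumes .tif names shaped like the power.font.exp0 example in step 5); the helper names font_name and font_properties_lines, the second sample file name, and the de-duplication of repeated fonts are illustrative assumptions rather than part of the described method.

```python
from pathlib import PurePath


def font_name(tif_name: str) -> str:
    """Return the text between the first and second '.' of a .tif file name."""
    parts = PurePath(tif_name).name.split(".")
    if len(parts) < 3:
        raise ValueError(f"expected at least two '.' in {tif_name!r}")
    return parts[1]


def font_properties_lines(tif_names):
    """Build one '<font> 0 0 0 0 0' line per distinct font, in first-seen order."""
    seen = []
    for name in tif_names:
        font = font_name(name)
        if font not in seen:  # assumption: each font is listed only once
            seen.append(font)
    return [f"{font} 0 0 0 0 0" for font in seen]


# Hypothetical input file names, used only for illustration.
lines = font_properties_lines(["power.font.exp0.tif", "power.userfont.exp0.tif"])
print("\n".join(lines))
```

Writing the returned lines to a file named font_properties, one per line, would give the file layout shown in step 4).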
In step 1), the .box text files and the .tif picture files are generated as follows:
1) reading a txt or Excel file;
2) setting the font, size and model name parameters of the training characters;
3) reading the selected txt or Excel file line by line, recording the total number of lines as num_lines and the number of characters in the longest line as max_length;
4) calculating the width and height of the character picture to be generated from the set character spacing (gap), line spacing (linespacing), page margin (padding), maximum picture width, single-character width (width) and single-character height (height);
5) if the calculated picture width is larger than the set maximum picture width, taking the maximum picture width as the width of the picture to be generated, and recalculating the picture height from the character spacing (gap), the line spacing (linespacing) and the page margin (padding);
6) recording the picture size calculated in step 5) as imgsize, calling the QImage class of Qt to generate an all-white picture, and saving it as a picture file with the suffix .tif;
7) drawing the characters one by one on the all-white picture;
8) scanning the pixel values inside each character's rectangle in four directions (top to bottom, bottom to top, left to right, and right to left), according to the position of each character and the rectangle width and height recorded in step 7);
9) converting the coordinates [(x, y), (end_x, end_y)] of the minimum bounding rectangle of each character obtained in step 8) into the character coordinate system used for tesseract training, [(t_x, t_y), (t_endx, t_endy)], the conversion formulas being as follows:
t_x = x
t_y = height_image - y - (end_y - y)
t_endx = x + width_image
t_endy = height_image - y;
10) performing the coordinate conversion of step 9) for each character, and writing the data of each character into a text file with the suffix .box in the following format, one character per line:
character t_x t_y t_endx t_endy.
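Steps 9) and 10) can be sketched in Python as follows. The sketch assumes a picture coordinate system with its origin at the top-left corner and y growing downward, and a training coordinate system with its origin at the bottom-left corner and y growing upward. One deliberate deviation is an assumption of this sketch: the text gives t_endx = x + width_image, but width_image is elsewhere the width of the whole picture, so the code uses the right edge end_x of the character rectangle instead. The function names are hypothetical.

```python
def to_training_coords(x, y, end_x, end_y, height_image):
    """Convert a top-left-origin rectangle to bottom-left-origin coordinates."""
    t_x = x
    t_y = height_image - y - (end_y - y)  # simplifies to height_image - end_y
    t_endx = end_x                        # assumption: right edge of the character box
    t_endy = height_image - y
    return t_x, t_y, t_endx, t_endy


def box_line(char, x, y, end_x, end_y, height_image):
    """Format one line of the .box file: the character, then its four coordinates."""
    coords = to_training_coords(x, y, end_x, end_y, height_image)
    return f"{char} " + " ".join(str(v) for v in coords)


# A character occupying columns 10..40 and rows 20..60 of a 100-pixel-high picture.
print(box_line("A", 10, 20, 40, 60, 100))  # prints "A 10 40 40 80"
```

The vertical flip is the only real change: the top edge y of the picture rectangle becomes the upper bound t_endy, and the bottom edge end_y becomes the lower bound t_y.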
The invention has the following beneficial effects: compared with the prior art, the method merges multiple models so that character pictures manually marked in practical applications are combined with character pictures generated by the method, and wrongly recognized characters are corrected using the manually marked data, which reduces the training workload while improving the accuracy of character recognition.
Drawings
FIG. 1 is a flow chart of a model training method;
FIG. 2 is a flow chart of a merging method;
FIG. 3 is a schematic diagram of coordinate system conversion.
Detailed Description
The invention is further described below with reference to specific figures and embodiments.
Example 1: as shown in FIG. 1, an efficient model training method comprises the following steps:
1) compiling the station names, state names and the like of the power dispatching master station system into a txt or Excel file, and reading that file; because the number of Chinese characters is huge, this step restricts training to the characters that actually need to be recognized in the application, which reduces the size of the generated model and improves recognition speed;
2) setting the font, size and model name parameters of the training characters, which improves the accuracy of character recognition;
3) reading the selected txt or Excel file line by line, and obtaining the total number of lines (num_lines) and the number of characters in the longest line (max_length); these are the initial data for calculating the size of the training picture;
4) calculating the width and height of the character picture to be generated from the set character spacing (gap), line spacing (linespacing), page margin (padding), maximum picture width, single-character width (width) and single-character height (height), the calculation formulas being as follows:
picture width (width_image) = padding * 2 + max_length * (gap + width)
picture height (height_image) = padding * 2 + num_lines * (linespacing + height)
5) if the calculated picture width is larger than the set maximum picture width, the maximum picture width is taken as the width of the picture to be generated (width_image), and the picture height is recalculated from the character spacing (gap), the line spacing (linespacing) and the page margin (padding). The calculation flow is as follows:
a) calculating the number of characters that one picture line can contain (max_word_length):
words_num = (width_image - padding * 2) / (gap + width)
max_word_length is the smallest integer greater than or equal to words_num (rounded up)
b) calculating the number of text lines and the picture height:
lines_word = total number of characters / max_word_length
the number of text lines (num_lines) is the smallest integer greater than or equal to lines_word (rounded up)
picture height (height_image) = padding * 2 + num_lines * (linespacing + height)
6) according to the picture size calculated in step 5) (recorded as imgsize), calling the QImage class of Qt (a cross-platform C++ graphical user interface application development framework) to generate an all-white picture, the calling code being as follows:
QImage img(imgsize,QImage::Format_RGB888);
img.fill(QColor(255,255,255));
7) drawing the characters one by one on the all-white picture. The drawing steps are as follows:
a) setting the position of the first character (abscissa startx, ordinate starty), with the following initial values:
startx = padding; starty = padding;
b) drawing the first character centered within the rectangle whose starting point is (startx, starty), whose width is the single-character width (width) and whose height is the single-character height (height);
c) calculating startx and starty of the next character as follows:
startx = startx + width + gap
starty = padding
if (startx + padding) > picture width, then:
starty = starty + number of text lines already drawn * (height + linespacing)
startx = padding
d) recording and saving the startx, starty, width and height of each character, and repeating steps a) to c) until the whole picture has been drawn.
8) scanning the pixel values inside each character's rectangle in four directions (top to bottom, bottom to top, left to right, and right to left), according to the character positions and rectangle dimensions recorded in step 7). Computing the bounding rectangle by scanning inward from all four directions is faster than scanning the whole character in only two directions (top to bottom and left to right), since each scan stops at the first row or column that contains a non-white pixel:
a) scanning the pixels of the character rectangle row by row from top to bottom until a row contains a non-white pixel, and recording that row number as y;
b) scanning the pixels of the character rectangle column by column from left to right until a column contains a non-white pixel, and recording that column number as x;
c) scanning the pixels of the character rectangle row by row from bottom to top until a row contains a non-white pixel, and recording that row number as end_y;
d) scanning the pixels of the character rectangle column by column from right to left until a column contains a non-white pixel, and recording that column number as end_x;
e) taking the data obtained in a) to d) as the minimum bounding rectangle of the character, whose upper-left vertex is (x, y) and whose lower-right vertex is (end_x, end_y);
9) the QImage coordinate system differs from the character coordinate system used for tesseract training (see FIG. 3), and tesseract requires the lower-left and upper-right vertices of the minimum bounding rectangle of each character as training data; therefore the coordinates [(x, y), (end_x, end_y)] of the minimum bounding rectangle obtained in step 8) are converted into the tesseract training coordinate system [(t_x, t_y), (t_endx, t_endy)], the conversion formulas being as follows:
t_x = x
t_y = height_image - y - (end_y - y)
t_endx = x + width_image
t_endy = height_image - y
10) performing the calculation of step 9) for each character, and writing the data of each character into a text file with the suffix .box in the following format, one character per line:
character t_x t_y t_endx t_endy
The .box file generated in this step is the character position file required for tesseract training; because the training tool generates it automatically, this is faster and more convenient than the traditional method of manually adjusting the bounding rectangle of each character;
11) according to the set parameters, such as the training language and the page mode, generating a text file with the suffix .tr by using the tesseract command, the command being as follows:
tesseract power.font.exp0.tif power.font.exp0 -psm 6 nobatch box.train
The .tr file generated in this step is the character feature file required for tesseract training; because the tool calls the tesseract command automatically, this is faster and more convenient than the traditional method of calling the command manually;
12) reading the files with the suffixes .tif, .box and .tr in the manually marked folder, and checking whether a text file named font_properties exists in that folder; if not, taking the name between the first "." and the second "." of each .tif picture file name as the font name and writing it into a text file named font_properties, one line per font, in the following format:
font 0 0 0 0 0
for example, if the .tif picture file is named power.user.exp0.tif, the content written into the font_properties file is:
userfont 0 0 0 0 0
The generated font_properties file is the font file required for tesseract training; generating it automatically is faster and more convenient than the traditional method of looking up the file name of each .tif file in turn, creating the font_properties file by hand and typing in each font;
13) sequentially executing the tesseract training commands unicharset_extractor, shapeclustering, mftraining and cntraining to generate a file with the suffix .traineddata, which is the character model file; because the tool calls and executes these commands automatically and in order, this is faster and more convenient than entering and executing them manually one by one as in the traditional method;
14) after training is finished, the training tool automatically calls the tesseract command to recognize the picture generated in step 7), compares the recognition result with the input characters, and reports the wrongly recognized characters, the character recall rate and the accuracy rate; having the training tool compute these automatically is faster and more convenient than finding and counting the wrong characters by hand;
15) if the user wants to further improve the character recognition rate, the user can use a picture tool to convert real application pictures containing wrongly recognized characters into .tif picture files, manually mark them with a marking tool to generate .box text files, add these files to the manually marked folder, and train again.
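The four-direction scan of step 8) in Example 1 can be sketched in Python. This is an illustration only: it assumes the picture is held as a list of rows of grey values in which 255 is white, whereas the described tool reads pixels from a Qt QImage. The function name min_bounding_box and the choice to return inclusive pixel indices are assumptions.

```python
WHITE = 255  # assumption: grey-value picture where 255 means a white pixel


def min_bounding_box(pixels, left, top, width, height):
    """Scan a character cell from its four sides; return (x, y, end_x, end_y).

    Coordinates are inclusive pixel indices. Returns None for a blank cell.
    """
    rows = range(top, top + height)
    cols = range(left, left + width)

    def row_has_ink(r):
        return any(pixels[r][c] != WHITE for c in cols)

    def col_has_ink(c):
        return any(pixels[r][c] != WHITE for r in rows)

    # a) top to bottom: first row that contains a non-white pixel
    y = next((r for r in rows if row_has_ink(r)), None)
    if y is None:
        return None
    # b) left to right: first column that contains a non-white pixel
    x = next(c for c in cols if col_has_ink(c))
    # c) bottom to top: last row that contains a non-white pixel
    end_y = next(r for r in reversed(rows) if row_has_ink(r))
    # d) right to left: last column that contains a non-white pixel
    end_x = next(c for c in reversed(cols) if col_has_ink(c))
    return x, y, end_x, end_y


# A 6-row by 8-column white picture with three dark pixels.
image = [[WHITE] * 8 for _ in range(6)]
image[2][3] = image[2][4] = image[3][5] = 0
print(min_bounding_box(image, 0, 0, 8, 6))  # prints (3, 2, 5, 3)
```

Each of the four scans stops as soon as it meets a non-white pixel, so the blank margin around a character is visited at most once per side and the interior of the character is never scanned in full.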
Example 2: as shown in FIG. 2, a method for merging training files of a power signal model comprises the following steps:
1) selecting .box text files and picture files with the suffix .tif, including both manually marked files and generated files, wherein the .box text files and .tif picture files may be those generated in steps 1) to 10) of Example 1 or may be produced or adjusted with other methods or tools; .tif picture files under different paths can be selected, which is more convenient than the traditional method of manually copying the files to be merged into the same path;
2) setting the training parameters of each file, including the training language and the page mode, with the training language defaulting to Chinese; different parameters can be set for each selected file, which is more intuitive and convenient than the traditional method of entering parameters and executing commands separately for each file;
3) automatically naming the .tif picture files and the .box files according to the selected file name and the name of the model to be generated, to form training files that meet the specification, wherein the .tif picture file is named according to the following specification:
[Figure BDA0002879261620000091: naming specification for .tif picture files]
and the .box text file is named according to the following specification:
[Figure BDA0002879261620000092: naming specification for .box text files]
The merging tool renames the .tif and .box files automatically, which is faster and more convenient than the traditional method of renaming them by hand and copying them into the same folder;
4) for each .tif picture file name, taking the name between the first "." and the second "." as the font name, and writing it into the text file named font_properties in the following format, one line per font:
font 0 0 0 0 0
font_properties is the font file required by the tesseract training model; the merging tool obtains the font name of each .tif file automatically and writes it into the font_properties file, which is faster, more convenient and less error-prone than the traditional method of determining each font by hand, creating the font_properties file manually and typing the fonts into it;
5) calling the tesseract command to generate a text file with the suffix .tr for each .tif file, the command being as follows:
tesseract power.font.exp0.tif power.font.exp0 -psm 6 nobatch box.train
The .tr file is the character feature file required by the tesseract training model; the merging tool calls the command automatically once per file, which is faster and more convenient than the traditional method of calling it manually for each file name in turn;
6) combining the names of all .box text files into a space-separated character string, passing it as the parameters of the unicharset_extractor command, and executing that command; combining the names of all .tr text files into a space-separated character string, passing it as the parameters of the shapeclustering, mftraining and cntraining commands, and executing those commands;
The merging tool assembles the command statements and calls the training commands automatically, which is faster and more convenient than the traditional method of writing, checking and executing each command by hand in sequence;
7) finally, calling the combine_tessdata command of tesseract to generate a traineddata file, which is the finally merged character model.
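Steps 5) to 7) amount to assembling and running a fixed sequence of commands. The Python sketch below only builds the argument lists and does not execute anything. It assumes the Tesseract 3.x training tools named in the text are installed under those names, and it passes only the file names, as the text describes; real invocations usually need further options (for example, pointing the tools at the font_properties file) that are omitted here. The function name build_training_commands and the trailing-dot prefix passed to combine_tessdata are assumptions of this sketch.

```python
import shlex
from pathlib import PurePath


def build_training_commands(tif_files, model_name, psm=6):
    """Return the argument list of every training step, in execution order."""
    commands, box_files, tr_files = [], [], []
    for tif in tif_files:
        # power.font.exp0.tif -> power.font.exp0 (only the last suffix is dropped)
        base = str(PurePath(tif).with_suffix(""))
        commands.append(
            ["tesseract", tif, base, "-psm", str(psm), "nobatch", "box.train"]
        )
        box_files.append(base + ".box")
        tr_files.append(base + ".tr")
    # All .box names become the parameters of unicharset_extractor.
    commands.append(["unicharset_extractor", *box_files])
    # All .tr names become the parameters of the three training commands.
    for tool in ("shapeclustering", "mftraining", "cntraining"):
        commands.append([tool, *tr_files])
    # Assumption: combine_tessdata is given the model name followed by a dot.
    commands.append(["combine_tessdata", model_name + "."])
    return commands


for cmd in build_training_commands(
    ["power.font.exp0.tif", "power.font.exp1.tif"], "power"
):
    print(shlex.join(cmd))
```

Keeping each command as a list of arguments rather than one pre-joined string avoids quoting problems when a selected file path contains spaces; the list could be handed to subprocess.run one command at a time.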
The above description is only an embodiment of the present invention, but the scope of protection of the present invention is not limited thereto; any change or substitution that a person skilled in the art could readily conceive within the technical scope disclosed by the invention falls within that scope, and the scope of protection of the present invention is therefore determined by the claims.

Claims (3)

1. A method for merging training files of a power signal model, characterized in that the method comprises the following steps:
1) selecting .box text files and picture files with the suffix .tif, including both manually marked files and generated files;
2) setting the training parameters of each file, including the training language and the page mode, with the training language defaulting to Chinese;
3) automatically naming the .tif picture files and the .box files according to the selected file name and the name of the model to be generated, to form training files that meet the specification, wherein the .tif picture file is named according to the following specification:
[Figure FDA0002879261610000011: naming specification for .tif picture files]
and the .box text file is named according to the following specification:
[Figure FDA0002879261610000012: naming specification for .box text files]
4) for each .tif picture file name, taking the name between the first "." and the second "." as the font name, and writing it into the text file named font_properties in the following format, one line per font:
font 0 0 0 0 0
5) calling the tesseract command to generate a text file with the suffix .tr for each .tif file, the command being as follows:
tesseract power.font.exp0.tif power.font.exp0 -psm 6 nobatch box.train
6) combining the names of all .box text files into a space-separated character string, passing it as the parameters of the unicharset_extractor command, and executing that command; combining the names of all .tr text files into a space-separated character string, passing it as the parameters of the shapeclustering, mftraining and cntraining commands, and executing those commands;
7) finally, calling the combine_tessdata command of tesseract to generate a traineddata file, which is the finally merged character model.
2. The method of claim 1, characterized in that, in step 1), the .box text files and the .tif picture files are generated as follows:
1) reading a txt or Excel file;
2) setting the font, size and model name parameters of the training characters;
3) reading the selected txt or Excel file line by line, recording the total number of lines as num_lines and the number of characters in the longest line as max_length;
4) calculating the width and height of the character picture to be generated from the set character spacing (gap), line spacing (linespacing), page margin (padding), maximum picture width, single-character width (width) and single-character height (height);
5) if the calculated picture width is larger than the set maximum picture width, taking the maximum picture width as the width of the picture to be generated, and recalculating the picture height from the character spacing (gap), the line spacing (linespacing) and the page margin (padding);
6) recording the picture size calculated in step 5) as imgsize, calling the QImage class of Qt to generate an all-white picture, and saving it as a picture file with the suffix .tif;
7) drawing the characters one by one on the all-white picture;
8) scanning the pixel values inside each character's rectangle in four directions (top to bottom, bottom to top, left to right, and right to left), according to the position of each character and the rectangle width and height recorded in step 7);
9) converting the coordinates [(x, y), (end_x, end_y)] of the minimum bounding rectangle of each character obtained in step 8) into the character coordinate system used for tesseract training, [(t_x, t_y), (t_endx, t_endy)], the conversion formulas being as follows:
t_x = x
t_y = height_image - y - (end_y - y)
t_endx = x + width_image
t_endy = height_image - y;
10) performing the coordinate conversion of step 9) for each character, and writing the data of each character into a text file with the suffix .box in the following format, one character per line:
character t_x t_y t_endx t_endy.
3. The method of claim 1, characterized in that a picture is converted into a picture file with the suffix .tif by using a picture tool, the picture file is manually marked by using a marking tool to generate a .box text file, and the .box text file is added to the manually marked folder for retraining.
CN202011638466.6A 2020-12-31 2020-12-31 Merging method of electric power signal model training files Pending CN112686319A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011638466.6A CN112686319A (en) 2020-12-31 2020-12-31 Merging method of electric power signal model training files

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011638466.6A CN112686319A (en) 2020-12-31 2020-12-31 Merging method of electric power signal model training files

Publications (1)

Publication Number Publication Date
CN112686319A true CN112686319A (en) 2021-04-20

Family

ID=75456595

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011638466.6A Pending CN112686319A (en) 2020-12-31 2020-12-31 Merging method of electric power signal model training files

Country Status (1)

Country Link
CN (1) CN112686319A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN118014072A (en) * 2024-04-10 2024-05-10 中国电建集团昆明勘测设计研究院有限公司 Construction method and system of knowledge graph for hydraulic and hydroelectric engineering

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination