CN105808511A - Spatial position-based method for reconstructing text information in CAD electronic data - Google Patents

Spatial position-based method for reconstructing text information in CAD electronic data

Info

Publication number
CN105808511A
Authority
CN
China
Prior art keywords
text
text object
coordinate
basic point
discrete
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201610119018.2A
Other languages
Chinese (zh)
Inventor
万庆
周良辰
贾明元
闾国年
张明波
谢炯
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing Normal University
Institute of Geographic Sciences and Natural Resources of CAS
Original Assignee
Nanjing Normal University
Institute of Geographic Sciences and Natural Resources of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing Normal University, Institute of Geographic Sciences and Natural Resources of CAS
Priority to CN201610119018.2A
Publication of CN105808511A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/103 Formatting, i.e. changing of presentation of documents
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00 Computer-aided design [CAD]
    • G06F30/20 Design optimisation, verification or simulation

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Document Processing Apparatus (AREA)

Abstract

The invention discloses a spatial position-based method for reconstructing text information in CAD electronic data. The method comprises the following steps: reading the discrete text objects into memory and sorting them in descending order of their base point Y coordinates; selecting the text objects that belong to the same row and sorting them in ascending order of their base point X coordinates to obtain an ordered left-to-right text sequence; and connecting and recombining the text content of each row in sequence to obtain multi-line text information with logical relationships. Based on the base point coordinates of the text objects, the method builds the discrete text objects into text information with logical relationships through two sorts; compared with prior-art methods, it is simple, convenient, easy to operate and highly reliable.

Description

A spatial position-based method for reconstructing text information in CAD electronic data
Technical field
The present invention relates to a reconstruction method, and in particular to a spatial position-based method for reconstructing text information in CAD electronic data.
Background art
Computer-aided design (CAD, Computer Aided Design) refers to the use of computers and their graphics equipment to help designers carry out design work. Text is an important data type in CAD electronic data. It can express design intent that graphics and symbols fail to convey clearly, and is an important supplement to the attribute information of design objects. In the prior art, the text in CAD electronic data is generally understood by manually reading the drawings. In the fields of CAD data sharing and automatic text extraction, existing work mainly converts CAD text objects to other formats or extracts the content of individual text segments, which must then be interpreted and recombined manually; this approach is cumbersome and inefficient.
Summary of the invention
To overcome the shortcomings of the above technology, the present invention provides a spatial position-based method for reconstructing text information in CAD electronic data, which automatically reconstructs the independent text objects in CAD electronic data into text information with complete logic.
To solve the above technical problem, the technical solution adopted by the present invention is a spatial position-based method for reconstructing text information in CAD electronic data, comprising the following steps:
Step 1: read the discrete text objects to be reconstructed, which are stored in the CAD electronic data, into memory, including the text content, text height and base point coordinates of each text object;
Step 2: read the text objects into memory one by one in physical storage order, then sort the discrete text objects and divide them into rows according to the Y coordinates of their base points. The text height of a text object is used as the tolerance: when the difference between the base point Y coordinates of two text objects is within the tolerance, the two text objects are treated as belonging to the same row. The rows of text objects are then sorted in descending order of base point Y coordinate, so that the discrete text objects are divided into several rows;
Step 3: for the rows of text objects obtained in step 2, select the text objects of the same row and reorder the discrete text objects of that row according to the X coordinates of their base points. That is, each row of text objects produced by the row division in step 2 is sorted in ascending order of base point X coordinate, giving an ordered left-to-right text sequence;
Step 4: based on the row order obtained in step 2 and the within-row order obtained in step 3, obtain ordered multi-row text running left to right and top to bottom; then connect and recombine the text content of each row in sequence to obtain multi-line text information with logical relationships.
Based on the base point coordinates of the text objects, the present invention sorts the discrete text objects twice, once by base point Y coordinate and once by base point X coordinate, and thereby reconstructs them into text information with logical relationships. Compared with prior-art methods, it is simple, convenient, easy to operate and highly reliable.
Brief description of the drawings
Fig. 1 is the overall flowchart of the present invention.
Fig. 2 is a schematic block diagram of the physical storage order of the text objects of the present invention.
Fig. 3 is a schematic block diagram of the logical order of the text objects before sorting.
Fig. 4 is a schematic block diagram of the logical order of the text objects after sorting by Y coordinate.
Fig. 5 is a schematic block diagram of the logical order of the text objects after sorting by X coordinate.
Detailed description of the embodiments
The present invention is described in further detail below with reference to the drawings and specific embodiments.
As shown in Fig. 1, based on the base point coordinates of the independent text objects, the present invention first determines which text objects belong to the same row, then analyzes the text objects of each row to determine their order, and finally recombines the text objects in that order to obtain recombined text information with logical relationships.
The specific steps are as follows:
Step 1: read in the discrete text objects to be reconstructed. The text objects stored in the CAD electronic data are read into memory, specifically including the text content, text height and base point coordinates of each text object.
The text objects are organized in computer memory by an in-memory data structure implemented in a computer programming language; this data structure organizes and manages the text content, text height and base point coordinates of each text object. Fig. 2 shows four text objects, TextA to TextD, displayed there as dashed boxes.
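The in-memory data structure of step 1 could look like the following minimal Python sketch. The class name `TextObject`, its field names, and every coordinate and height value are assumptions made for illustration; the patent only states that text content, text height and base point coordinates are stored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TextObject:
    """One discrete CAD text object (hypothetical structure)."""
    content: str   # text content
    height: float  # text height, later used as the row tolerance
    x: float       # base point X coordinate
    y: float       # base point Y coordinate


# Objects listed in physical storage order, as in Fig. 2.
# All coordinate and height values are made up for illustration.
text_objects = [
    TextObject("TextA", height=2.5, x=30.0, y=10.0),
    TextObject("TextB", height=2.5, x=0.0, y=10.0),
    TextObject("TextC", height=2.5, x=30.0, y=20.0),
    TextObject("TextD", height=2.5, x=0.0, y=20.0),
]
```

The values are chosen so that they match the relationships described for Figs. 4 and 5: TextC and TextD share the larger Y coordinate, and TextD and TextB have the smaller X coordinates within their rows.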
Step 2: as shown in Fig. 3, the text objects TextA to TextD are read into memory one by one in physical storage order, and the discrete text objects are then sorted and divided into rows according to the Y coordinates of their base points.
Specifically, the text height of a text object is used as the tolerance: when the difference between the base point Y coordinates of two text objects is within the tolerance, the two text objects are treated as belonging to the same row. The rows of text objects are then sorted in descending order of base point Y coordinate, so that the discrete text objects are divided into several rows.
As shown in Fig. 4, the base point Y coordinates of TextC and TextD are identical and larger, so these two objects form the first row; the base point Y coordinates of TextA and TextB are identical and smaller than those of TextC and TextD, so they form the second row.
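The row division of step 2 can be sketched as follows. This is one possible reading, not the patent's definitive procedure: the text does not say which object's height serves as the tolerance or which object a candidate is compared against, so the sketch assumes the first object of the current row supplies both. `TextObject`, `group_into_rows` and all numeric values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TextObject:
    content: str
    height: float  # text height, used as the row tolerance
    x: float       # base point X coordinate
    y: float       # base point Y coordinate


def group_into_rows(objs):
    """Sort by base point Y (descending) and split into rows.

    Assumption: an object joins the current row when its Y differs from
    the row's first object by no more than that first object's height.
    """
    rows = []
    for obj in sorted(objs, key=lambda t: t.y, reverse=True):
        if rows and abs(rows[-1][0].y - obj.y) <= rows[-1][0].height:
            rows[-1].append(obj)
        else:
            rows.append([obj])
    return rows


# Made-up values in physical storage order (TextA..TextD).
text_objects = [
    TextObject("TextA", 2.5, 30.0, 10.0),
    TextObject("TextB", 2.5, 0.0, 10.0),
    TextObject("TextC", 2.5, 30.0, 20.0),
    TextObject("TextD", 2.5, 0.0, 20.0),
]
rows = group_into_rows(text_objects)
row_names = [[t.content for t in row] for row in rows]
```

With these values `row_names` holds TextC and TextD in the first row and TextA and TextB in the second, as in Fig. 4. Comparing against the first object of the row, rather than the previous object, keeps a long run of slightly drifting Y values from chaining into a single row; that choice is a design assumption of this sketch.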
Step 3: the discrete text objects within each row obtained in step 2 are reordered according to the X coordinates of their base points; that is, each row of text objects produced by the row division in step 2 is sorted by base point X coordinate. Specifically, the smaller the base point X coordinate of a text object, the further left it lies and, by writing convention, the earlier it comes logically. The text objects in each row are therefore sorted in ascending order of base point X coordinate, giving an ordered left-to-right text sequence.
As shown in Figs. 4 and 5, comparing the base point X coordinates of TextC and TextD in the first row, TextD has the smaller coordinate, so TextD comes first and TextC second; likewise, TextB comes first and TextA second.
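The within-row ordering of step 3 amounts to an ascending sort on the base point X coordinate of each row. A minimal sketch, assuming the rows have already been formed and using hypothetical names and coordinate values:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TextObject:
    content: str
    height: float
    x: float  # base point X coordinate
    y: float  # base point Y coordinate


def sort_rows_left_to_right(rows):
    """Return new rows, each sorted by ascending base point X."""
    return [sorted(row, key=lambda t: t.x) for row in rows]


# Rows as they might come out of the Y-based row division (made-up values).
rows = [
    [TextObject("TextC", 2.5, 30.0, 20.0), TextObject("TextD", 2.5, 0.0, 20.0)],
    [TextObject("TextA", 2.5, 30.0, 10.0), TextObject("TextB", 2.5, 0.0, 10.0)],
]
ordered = sort_rows_left_to_right(rows)
ordered_names = [[t.content for t in row] for row in ordered]
```

Here `ordered_names` places TextD before TextC and TextB before TextA, matching the order described for Fig. 5. The row order itself is left untouched, so the top-to-bottom sequence from the Y sort is preserved.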
Step 4: the discrete text objects are connected and recombined according to the row order and within-row order obtained in steps 2 and 3 respectively. After the two sorts of steps 2 and 3, ordered multi-row text running left to right and top to bottom is obtained; connecting the text content of each row in sequence then recombines the discrete text objects into multi-line text information with logical relationships.
As shown in Fig. 5, the logical order after sorting is TextD, TextC, TextB, TextA; connecting the text content stored in each text object in this order yields the reconstructed text information.
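Putting the four steps together, an end-to-end reconstruction might be sketched as below. The function name `reconstruct_text`, the choice of an empty string to join fragments within a row, the newline between rows, and the fragment contents are all assumptions for illustration; the patent does not specify separators or sample text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TextObject:
    content: str
    height: float
    x: float  # base point X coordinate
    y: float  # base point Y coordinate


def reconstruct_text(objs, joiner="", line_sep="\n"):
    """Rebuild multi-line text from discrete text objects."""
    # Step 2: descending Y sort, then row division with height as tolerance
    # (assumed to be measured against the first object of each row).
    rows = []
    for obj in sorted(objs, key=lambda t: t.y, reverse=True):
        if rows and abs(rows[-1][0].y - obj.y) <= rows[-1][0].height:
            rows[-1].append(obj)
        else:
            rows.append([obj])
    # Step 3: ascending X sort inside each row.
    # Step 4: join contents within rows, then join rows top to bottom.
    lines = [
        joiner.join(t.content for t in sorted(row, key=lambda t: t.x))
        for row in rows
    ]
    return line_sep.join(lines)


# Made-up fragments in physical storage order (roles of TextA..TextD).
fragments = [
    TextObject("C30 concrete", 2.5, 30.0, 10.0),   # TextA
    TextObject("Material: ", 2.5, 0.0, 10.0),      # TextB
    TextObject("5 kN/m2", 2.5, 30.0, 20.0),        # TextC
    TextObject("Design load: ", 2.5, 0.0, 20.0),   # TextD
]
result = reconstruct_text(fragments)
```

For these fragments `result` is the two-line string "Design load: 5 kN/m2" followed by "Material: C30 concrete", which corresponds to the TextD, TextC, TextB, TextA order of Fig. 5.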
The objects TextA to TextD shown in Figs. 2 to 5 are text objects drawn as dashed boxes. In the original data the text is split across these objects for storage and does not form a complete text passage; the method of the present invention recombines TextA to TextD into one. It can thus be seen that, based on the base point coordinates of the text objects, the present invention can reconstruct discrete text objects into text information with logical relationships through two sorts.
The above embodiment does not limit the present invention, and the present invention is not limited to the above example; changes, modifications, additions or substitutions made by those skilled in the art within the scope of the technical solution of the present invention also fall within the protection scope of the present invention.

Claims (3)

1. A spatial position-based method for reconstructing text information in CAD electronic data, characterized in that the method comprises the following steps:
Step 1: read the discrete text objects to be reconstructed, which are stored in the CAD electronic data, into memory, including the text content, text height and base point coordinates of each text object;
Step 2: read the text objects into memory one by one in physical storage order, then sort the discrete text objects and divide them into rows according to the Y coordinates of their base points, obtaining several rows of text objects;
Step 3: for the rows of text objects obtained in step 2, select the text objects of the same row and reorder the discrete text objects of that row according to the X coordinates of their base points; that is, each row of text objects produced by the row division in step 2 is sorted by base point X coordinate, giving an ordered text sequence of the text objects;
Step 4: based on the row order obtained in step 2 and the within-row order obtained in step 3, obtain ordered multi-row text running left to right and top to bottom; then connect and recombine the text content of each row in sequence to obtain multi-line text information with logical relationships.
2. The spatial position-based method for reconstructing text information in CAD electronic data according to claim 1, characterized in that the discrete text objects are sorted and divided into rows in step 2 as follows: the text height of a text object is used as the tolerance; when the difference between the base point Y coordinates of two text objects is within the tolerance, the two text objects are treated as belonging to the same row; the rows of text objects are then sorted in descending order of base point Y coordinate, so that the discrete text objects are divided into several rows.
3. The spatial position-based method for reconstructing text information in CAD electronic data according to claim 1, characterized in that in step 3 the text objects are sorted in ascending order of base point X coordinate, giving an ordered left-to-right text sequence.
CN201610119018.2A 2016-03-02 2016-03-02 Spatial position-based method for reconstructing text information in CAD electronic data Pending CN105808511A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610119018.2A CN105808511A (en) 2016-03-02 2016-03-02 Spatial position-based method for reconstructing text information in CAD electronic data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610119018.2A CN105808511A (en) 2016-03-02 2016-03-02 Spatial position-based method for reconstructing text information in CAD electronic data

Publications (1)

Publication Number Publication Date
CN105808511A true CN105808511A (en) 2016-07-27

Family

ID=56466579

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610119018.2A Pending CN105808511A (en) 2016-03-02 2016-03-02 Spatial position-based method for reconstructing text information in CAD electronic data

Country Status (1)

Country Link
CN (1) CN105808511A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2002149735A (en) * 2000-11-13 2002-05-24 Sumitomo Metal Electronics Devices Inc Information retrieval processing method
CN102194247A (en) * 2010-03-11 2011-09-21 新奥特(北京)视频技术有限公司 Method for judging graphic element information in modeling process of vector word triangular plate
CN103310077A (en) * 2013-07-08 2013-09-18 攀钢集团攀枝花钢钒有限公司 Modeling method for sequencing and stretching post-rolling pieces to form body
CN105302626A (en) * 2015-11-09 2016-02-03 深圳市依伴数字科技有限公司 Analytic method of XPS (XML Paper Specification) structural data

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
吴伟中 等: "AutoCAD 普通表格转换为表格对象的方法", 《制造业信息化》 *
杨小虎: "面向金融企业的内容管理和发布***的开发", 《中国硕士学位论文全文数据库•信息科技辑》 *

Similar Documents

Publication Publication Date Title
US10360294B2 (en) Methods and systems for efficient and accurate text extraction from unstructured documents
US7982737B2 (en) System and method for independent font substitution of string characters
TWI595366B (en) Detection and reconstruction of east asian layout features in a fixed format document
CN104866498A (en) Information processing method and device
CN110427884A (en) The recognition methods of the document structure of an article, device, equipment and storage medium
CN105528935A (en) Writing sequence guiding method and device
CN106325596B (en) Automatic handwriting error correction method and system
CN102063620A (en) Handwriting identification method, system and terminal
CN101483035B (en) Method and system for display text on graphical interface
CN102136154B (en) Cartoon manufacture method and device
CN109446506A (en) A kind of method and apparatus that electronic spreadsheet table reproduces automatically
CN102314252A (en) Character segmentation method and device for handwritten character string
KR101772831B1 (en) Method and apparatus of building intermediate character library
CN101452368B (en) Hand-written character input method
CN105808511A (en) Spatial position-based method for reconstructing text information in CAD electronic data
CN104809483A (en) Method and system for realizing segmentation of text lines written in any directions
CN104536947A (en) Layout document processing method and device
CN102750272B (en) Method and system for optimizing hand-input candidate item of character
CN111199086A (en) Three-dimensional geometric discretization processing system
CN116226681A (en) Text similarity judging method and device, computer equipment and storage medium
CN115544975A (en) Log format conversion method and device
JP2000040085A (en) Method and device for post-processing for japanese morpheme analytic processing
CN102722490B (en) A character-capturing method and a character-capturing device of an electronic reader and the same
CN105653549A (en) Method and device for extracting document information
US20140267380A1 (en) System and method for efficiently viewing a style

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20160727

RJ01 Rejection of invention patent application after publication