CN102541905B - Attribute processing method and device for PDF files - Google Patents

Attribute processing method and device for PDF files

Info

Publication number
CN102541905B
Authority
CN
China
Prior art keywords
attribute
pdf document
dictionary
file
pdf
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201010605620.XA
Other languages
Chinese (zh)
Other versions
CN102541905A (en)
Inventor
张立业
卢秀琴
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Peking University Founder Group Co Ltd
Beijing Founder Electronics Co Ltd
Original Assignee
Peking University Founder Group Co Ltd
Beijing Founder Electronics Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Peking University Founder Group Co Ltd and Beijing Founder Electronics Co Ltd
Priority to CN201010605620.XA
Publication of CN102541905A
Application granted
Publication of CN102541905B
Legal status: Expired - Fee Related
Anticipated expiration

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides an attribute processing method for PDF files, comprising the following steps: parsing the PDF files and obtaining their attributes according to a preset attribute dictionary, wherein the preset attribute dictionary contains the specific attributes of the PDF files that are expected to be tested, and the specific attributes are used for searching; and adding the obtained attributes of each PDF file, together with its file name, to a database as one record. The invention also provides an attribute processing device for PDF files, comprising: an acquisition module for parsing the PDF files and obtaining their attributes according to the preset attribute dictionary, wherein the preset attribute dictionary contains the specific attributes of the PDF files that are expected to be tested, and the specific attributes are used for searching; and a recording module for adding the obtained attributes of each PDF file, together with its file name, to a database as one record. The invention saves labor cost and improves efficiency.

Description

Attribute processing method and device for PDF files
Technical field
The present invention relates to the field of printing, and in particular to an attribute processing method and device for PDF (Portable Document Format) files.
Background art
When testing software for the printing industry, it is often necessary to select, from a large existing collection of sample PDF files, those files that possess a certain specific attribute (key) or a certain set of specific attributes, so that targeted tests can be carried out.
At present there are two ways to screen out PDF files with specific attributes. The first is to encode the important attributes of a file directly in its file name when the PDF file is created, and later to screen by file name. However, because the system places fairly strict limits on file name length and on the characters that may be used, only a few attributes can be listed, and screening on combinations of attributes is hard to implement. The second is to manually open every PDF file each time a test is run and check its attributes one by one, which is very time-consuming and inefficient.
Because such tests are fairly frequent and their schedules are strict, neither of the two prior-art methods is practical.
Summary of the invention
The present invention aims to provide an attribute processing method and device for PDF files, so as to solve the problem that existing methods of screening PDF files by attribute are very inefficient.
An embodiment of the present invention provides an attribute processing method for PDF files, comprising the following steps: presetting an attribute dictionary, wherein the specific attributes of the PDF files that are expected to be tested are added to the attribute dictionary as the attributes to be searched for; obtaining the attributes of the PDF files; and adding the obtained attributes of each PDF file, together with its file name, to a database as one record. Obtaining the attributes of a PDF file comprises: parsing the PDF file to obtain its file header, content streams, and file dictionaries; and obtaining the attributes of the PDF file from the file header, the content streams, and the file dictionaries. Obtaining the attributes from the file header, the content streams, and the file dictionaries comprises: traversing all dictionary objects in the file header, the content streams, and the file dictionaries, and determining during the traversal whether each traversed dictionary object has an attribute listed in the preset attribute dictionary.
An embodiment of the present invention also provides an attribute processing device for PDF files, comprising: an attribute dictionary setting module for presetting an attribute dictionary, wherein the specific attributes of the PDF files that are expected to be tested are added to the attribute dictionary as the attributes to be searched for; an acquisition module for obtaining the attributes of the PDF files; and a recording module for adding the obtained attributes of each PDF file, together with its file name, to a database as one record. The acquisition module comprises: a PDF file parsing module for parsing the PDF file to obtain its file header, content streams, and file dictionaries; and a PDF file dictionary parsing module for obtaining the attributes of the PDF file from the file header, the content streams, and the file dictionaries. The PDF file dictionary parsing module is configured to traverse all dictionary objects in the file header, the content streams, and the file dictionaries, and to determine during the traversal whether each traversed dictionary object has an attribute listed in the preset attribute dictionary.
Because the attribute processing method and device of the above embodiments record PDF file attributes in a database, which makes later queries easy, they overcome the inefficiency of existing methods of screening PDF files by attribute, and therefore save labor cost and improve efficiency.
Brief description of the drawings
The accompanying drawings described here provide a further understanding of the present invention and form part of this application. The illustrative embodiments of the present invention and their descriptions serve to explain the invention and do not unduly limit it. In the drawings:
Fig. 1 shows a flowchart of an attribute processing method for PDF files according to an embodiment of the present invention;
Fig. 2 shows a flowchart of an attribute processing method for PDF files according to a preferred embodiment of the present invention;
Fig. 3 shows a schematic diagram of an attribute processing device for PDF files according to an embodiment of the present invention;
Fig. 4 shows a schematic diagram of an attribute processing device for PDF files according to a preferred embodiment of the present invention.
Detailed description of the embodiments
The present invention is described in detail below with reference to the accompanying drawings and in conjunction with the embodiments.
Fig. 1 shows a flowchart of an attribute processing method for PDF files according to an embodiment of the present invention, comprising the following steps:
Step S10: obtain the attributes of the PDF files;
Step S20: add the obtained attributes of each PDF file, together with its file name, to a database as one record.
In the prior art, every PDF file is opened manually each time a test is run and its attributes are checked one by one, which is very time-consuming and inefficient. Because this attribute processing method records PDF file attributes in a database, which makes later queries easy, there is no need to open each PDF file manually again for every test. This overcomes the inefficiency of existing methods of screening PDF files by attribute, and therefore saves labor cost and improves efficiency.
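As a minimal sketch of step S20, the snippet below stores the attributes of one PDF file together with its file name as a single database row. The choice of SQLite, the table name `pdf_attributes`, the JSON-encoded attribute column, and the attribute names `PDFVersion` and `HasOutputIntent` are all illustrative assumptions; the patent does not specify a database engine or a schema.

```python
import json
import sqlite3


def add_record(conn, filename, attributes):
    """Store one PDF file's attributes and its file name as one record."""
    # Hypothetical schema: the attribute set is kept as a JSON string.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS pdf_attributes ("
        "filename TEXT PRIMARY KEY, attributes TEXT NOT NULL)"
    )
    # INSERT OR REPLACE keeps a single record per file if it is re-analyzed.
    conn.execute(
        "INSERT OR REPLACE INTO pdf_attributes (filename, attributes) "
        "VALUES (?, ?)",
        (filename, json.dumps(attributes, sort_keys=True)),
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
add_record(conn, "sample.pdf", {"PDFVersion": "1.4", "HasOutputIntent": True})
row = conn.execute("SELECT filename, attributes FROM pdf_attributes").fetchone()
```

Keeping the attributes in one JSON column is only one possible layout; a table with one column per attribute would make combined SQL filters easier to write.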
Preferably, step S10 comprises: parsing the PDF file to obtain its file header, content streams (contents), and file dictionaries; and obtaining the attributes of the PDF file from the file header, the content streams, and the file dictionaries. Because this parsing can be carried out by computer software, manual analysis of PDF files is eliminated entirely, which greatly reduces labor cost and considerably improves efficiency. Of course, in a basic embodiment of the present invention, the attributes of a PDF file may also be obtained by manual analysis.
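One attribute that can be read from the file header alone is the PDF version, since a PDF file begins with a header of the form `%PDF-M.N`. The following is a small sketch of that one lookup; the function name `read_pdf_version` and the decision to inspect only the first 1024 bytes are assumptions of this example, not details given in the patent.

```python
import re

# A PDF header looks like b"%PDF-1.4"; capture the "major.minor" part.
_HEADER_RE = re.compile(rb"%PDF-(\d+\.\d+)")


def read_pdf_version(data):
    """Return the version string from the PDF header bytes, or None."""
    # Assumption of this sketch: the header lies within the first 1024 bytes.
    match = _HEADER_RE.search(data[:1024])
    return match.group(1).decode("ascii") if match else None


version = read_pdf_version(b"%PDF-1.4\n%\xe2\xe3\xcf\xd3\n1 0 obj\n")
missing = read_pdf_version(b"plain text, not a PDF")
```

For a file on disk, the bytes could be obtained with `open(path, "rb").read(1024)` before calling the function.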
Preferably, obtaining the attributes of the PDF file from the file header, the content streams, and the file dictionaries comprises: traversing all dictionary objects in the file header, the content streams, and the file dictionaries, and determining during the traversal whether each traversed dictionary object has an attribute listed in the preset attribute dictionary. In this preferred embodiment, the attributes to be searched for are preset in the attribute dictionary, which speeds up the program's search for PDF attributes.
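The traversal described above can be sketched as follows, assuming the parsed PDF dictionary objects are already available as nested Python `dict` and `list` values and the attribute dictionary is a set of key names. Both representations, and the function name `find_attributes`, are hypothetical; a real implementation would sit on top of an actual PDF parser.

```python
def find_attributes(obj, attribute_dictionary, found=None):
    """Recursively walk nested dicts/lists and collect the wanted keys."""
    if found is None:
        found = set()
    if isinstance(obj, dict):
        for key, value in obj.items():
            # A dictionary object "has" an attribute if the key is present.
            if key in attribute_dictionary:
                found.add(key)
            find_attributes(value, attribute_dictionary, found)
    elif isinstance(obj, list):
        for item in obj:
            find_attributes(item, attribute_dictionary, found)
    return found


# Toy stand-in for a parsed document catalog.
catalog = {
    "Type": "Catalog",
    "OutputIntents": [{"S": "GTS_PDFX"}],
    "Pages": {"Count": 2, "Kids": [{"Type": "Page", "Annots": []}]},
}
wanted = {"OutputIntents", "Annots", "Encrypt"}
result = find_attributes(catalog, wanted)
```

Here `Encrypt` is requested but absent from the toy catalog, so only the other two keys are reported.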
Preferably, the attribute processing method for PDF files further comprises: presetting the attribute dictionary, wherein the specific attributes of the PDF files that are expected to be tested are added to the attribute dictionary as the attributes to be searched for. In this preferred embodiment, because the attribute dictionary is preset according to the purpose of testing the PDF files, the results of the PDF attribute analysis are guaranteed to be usable in those tests. In addition, because the attribute dictionary can be set manually, the PDF attribute analysis is adjustable: when the test purpose changes, the analysis procedure itself does not need to be changed, and only the attribute dictionary needs to be updated. Because the attribute dictionary is customized as required, the scheme is also easy to extend. If a new attribute is required, it is enough to modify the dictionary and run the system again to parse the PDF files and store the results.
Preferably, determining during the traversal whether a traversed dictionary object has an attribute listed in the preset attribute dictionary comprises: for the dictionary object currently being traversed, checking only those attributes of the attribute dictionary that the PDF file has not yet been found to have; attributes of the attribute dictionary that the PDF file has already been found to have are not checked again. According to this preferred embodiment, when the attribute dictionary contains several attributes and a dictionary object of the PDF file is found during the traversal to have one of them, that attribute need not be checked in the rest of the traversal; only the remaining attributes of the attribute dictionary still have to be checked. This clearly improves execution efficiency, and when the number of PDF files is very large it speeds up attribute processing significantly.
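A sketch of this optimization, under the same assumption that dictionary objects are plain Python dicts: attributes already confirmed are removed from a `pending` set so that later objects are tested against fewer candidates. The early `break` once nothing is pending is an extension added for this example, and the `checks` counter exists only to make the saving visible.

```python
def scan_objects(dictionary_objects, attribute_dictionary):
    """Check each dictionary object only for attributes not yet confirmed."""
    pending = set(attribute_dictionary)
    found = set()
    checks = 0  # number of membership tests actually performed
    for obj in dictionary_objects:
        if not pending:
            break  # every wanted attribute is already confirmed
        for attr in list(pending):  # copy, since pending shrinks in the loop
            checks += 1
            if attr in obj:
                found.add(attr)
                pending.discard(attr)
    return found, checks


objects = [{"Encrypt": {}}, {"OutputIntents": []}, {"Annots": []}]
found, checks = scan_objects(objects, {"Encrypt", "OutputIntents"})
```

In this toy run the first object costs two tests, the second costs one, and the third is never examined, so three tests are performed instead of the six a naive scan would need.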
Preferably, step S10 comprises: obtaining a path from an input character string; and traversing all PDF files in the path to obtain the attributes of each traversed PDF file. According to this preferred embodiment, the user only needs to enter a path, and attribute processing is then carried out automatically on all PDF files in that path, which reduces the user's manual workload and improves efficiency.
Preferably, the attributes include at least one of the following:
Document type, PDF version, whether the file is pre-separated, total page count, whether an OutputIntent exists, whether the file is submitted in stream mode, whether OptionalContent is processed, whether annotations are parsed (AnnotationProcessed), whether the file is encrypted, whether the encryption is "soaring" encryption, PDFXVersion, whether the cross-reference table is a stream object, whether there are multiple cross-reference tables, whether a Content that is a stream object exists, whether a Content that is an array object exists, whether a Content that is an empty object exists;
annotation content attributes [annotation type (WidgetType, Link, FreeText, CirCle, Polygon, Ployline, Highlight, Underline, Squiggly, StrikeOut, Stamp, Caret, Ink, FileAttachment, sound, Movie, PrinterMark, TrapNet, WaterMark, ThreeD), whether the Widge can be output, type of the N object in the AP dictionary (stream object, dictionary object, other object)];
optional content attributes [optional content object type (OCG, OCMenberShip), whether a MemberShip determines the OC state, OC state (ON, OFF, UnDenfined), MemberShip computation rule (VE, ANYON, ANYOFF, ALLON, ALLOFF)];
image object attributes [image type (Normal, InlineImage, Mask, explictMask, ColorkeyMask, Smask), bit depth (1, 2, 4, 6, 8, 16), whether an image with a height of 1 exists, whether an image with a width of 1 exists, X-direction resolution, Y-direction resolution, whether a default Decode exists, rendering intent, overprint mode, whether overprinted, whether assembled at the front end, image processing type, whether scaled at the front end, image scaling algorithm, whether scanned from left to right, whether scanned from top to bottom, Transfer type, whether clipped, number of color planes, whether deformed, whether UCR is included, whether BG is included, halftone type, whether a Transfer exists in the halftone, halftone Spot function type, bHasTwoSquaresThreshold];
gradient attributes [gradient type, whether a background color is defined, overprint mode, whether a BBox is defined, whether UCR is included, whether it is a type 2 Pattern, Transfer type, whether there are multiple output functions, whether overprinted, whether BG is included, function type, whether multi-output, whether multi-input, whether a Range entry exists];
path attributes [path type, whether a closed SubPath exists, whether a curve exists, whether a zero vector exists, whether a fixed-point number overflows, drawing operator, Transfer type, whether overprinted, whether UCR is included, whether multiple SubPaths exist, whether an unclosed SubPath exists, whether buffered, whether Flatness is less than the default value, whether nearly vertical or horizontal straight lines exist, overprint mode, Flatness and whether it is a curve, whether BG is included];
font attributes [font type (Type0, Type1, Type3, TrueType), font name, base font name, font encoding type, width table type, whether the font file is embedded, font PaintType, whether a synthetic bold effect is applied, whether a synthetic italic effect is applied, whether it is an OpenType font, whether it is a non-indirect object, whether it is a Symbolic font];
hidden graphic element attributes [types of graphic elements that have an OC attribute, types of hidden graphic elements (StrokeElement, FillElement, TextElement, ShadingElement, XobjectElement), whether nested in multiple levels of MarkedContent];
font content attributes [TextRenderMode, TextKnockOut, whether Type3 characters that will enter the cache exist, whether Type3 characters that cannot be cached exist, whether Type3 characters containing an Image exist, whether Type3 characters containing a Form exist, whether Type3 characters containing a Font exist, whether Type1 characters containing the seac instruction exist, whether Type1 characters containing StemHint exist, whether Type1 characters containing CounterHint exist, whether the width table information in the dictionary is inconsistent with the metrics in the font file, TransferType, whether TrueType characters containing Instruction exist, whether UCR is included, overprint mode, whether BG is included, whether overprinted, font type];
color space type [CS_DeviceGray, CS_DeviceRGB, CS_DeviceCMYK, CS_CalGray, CS_CalRGB, CS_ICCBased, CS_Separation, CS_DeviceN, CS_Indexed, CS_Lab, CS_Pattern];
function attributes [function type (SampleFunc, ExpFunc, StitchFunc, PSFunc), whether multi-output, whether multi-input, whether a Range entry exists];
transparency attributes [graphic elements in a transparency group, whether a graphic element contains a spot color, whether softImageMask is included, parent transparency attributes, the transparency group's own attributes (Isolated, Konckout, PageGroup), transparency graphics state attributes (BlendMode, AIS, OP, OPM, SoftMask type, background color)];
FilterType [ASCIIHEX, ASCII85, RLE, LZW, FLATE, FAX, DCT, JBIG2, CRYPT, SUBFILE, RESTREAM, SPECIAL, JPX].
The preferred embodiments of the present invention include, but are not limited to, the above attributes.
Fig. 2 shows a flowchart of an attribute processing method for PDF files according to a preferred embodiment of the present invention. This preferred embodiment combines the schemes of the embodiments described above.
For a character string entered by the user, the steps shown in Fig. 2 automatically parse and compare the attributes of all PDF files in every path contained in the string, and generate database records for unified management:
Step S1: split the input character string to obtain the valid paths.
Step S2: traverse all PDF files in a path.
Step S3: parse the traversed files one by one.
Step S4: perform the following operations on the PDF file currently being parsed:
Step S41: analyze the dictionary objects of the PDF file by performing the following operations:
Step S411: obtain a dictionary object of the PDF file.
Step S412: search whether the PDF dictionary object contains the specified attributes.
Step S413: record the search result.
Step S42: analyze the content stream of each page dictionary in the PDF file by performing the following operations:
Step S421: obtain the content stream in a page dictionary object of the PDF file.
Step S422: search whether the page content stream contains the specified attributes.
Step S423: record the search result.
Step S5: determine whether all pages of the PDF file have been analyzed; if not, continue with step S3.
Step S6: if all pages of the PDF file have been analyzed, generate a data record in the specified database table and fill it, in the required format, with the recorded attributes of the PDF file.
Step S7: determine whether all PDF files in the specified path have been analyzed. If not, continue with steps S2 to S6; if so, end the process.
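The control flow of steps S2 to S7 can be sketched with a deliberately simplified in-memory model. Everything about the data layout here is an assumption made for illustration: each document is a dict with a `filename`, a flat `dictionary`, and a list of `pages` whose `content_stream` is a whitespace-separated string of tokens. Real content-stream parsing is considerably more involved than splitting on whitespace.

```python
def process_file(document, attribute_dictionary):
    """Steps S4 to S6 for one file, on a simplified in-memory model."""
    found = set()
    # S41: look for the wanted attributes among the file's dictionary keys.
    found |= attribute_dictionary & set(document["dictionary"])
    # S42 and S5: repeat the search on every page's content stream.
    for page in document["pages"]:
        tokens = set(page["content_stream"].split())
        found |= attribute_dictionary & tokens
    # S6: build one record per file, one field per wanted attribute.
    record = {"filename": document["filename"]}
    record.update({attr: attr in found for attr in sorted(attribute_dictionary)})
    return record


def process_all(documents, attribute_dictionary):
    """Steps S2, S3 and S7: analyze every file, one record per file."""
    return [process_file(doc, attribute_dictionary) for doc in documents]


documents = [
    {
        "filename": "a.pdf",
        "dictionary": {"Type": "Catalog", "OutputIntents": []},
        "pages": [{"content_stream": "q BI /W 1 /H 1 ID x EI Q"}],
    }
]
records = process_all(documents, {"OutputIntents", "BI", "Encrypt"})
```

In this toy input, `OutputIntents` is found in the file dictionary, the inline-image operator `BI` is found in the page content stream, and `Encrypt` is not found anywhere.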
Fig. 3 shows a schematic diagram of an attribute processing device for PDF files according to an embodiment of the present invention, comprising:
an acquisition module 10 for obtaining the attributes of the PDF files;
a recording module 20 for adding the obtained attributes of each PDF file, together with its file name, to a database as one record.
In the prior art, every PDF file is opened manually each time a test is run and its attributes are checked one by one, which is very time-consuming and inefficient. Because this attribute processing device records PDF file attributes in a database, which makes later queries easy, there is no need to open each PDF file manually again for every test. This overcomes the inefficiency of existing methods of screening PDF files by attribute, and therefore saves labor cost and improves efficiency.
Preferably, the acquisition module 10 comprises: a PDF file parsing module for parsing the PDF file to obtain its file header, content streams, and file dictionaries; and a PDF file dictionary parsing module for obtaining the attributes of the PDF file from the file header, the content streams, and the file dictionaries. Because this parsing can be carried out by computer software, manual analysis of PDF files is eliminated entirely, which greatly reduces labor cost and considerably improves efficiency. Of course, in a basic embodiment of the present invention, the attributes of a PDF file may also be obtained by manual analysis.
Preferably, the acquisition module 10 comprises: a file path acquisition module for obtaining a path from an input character string; and a path traversal and PDF file extraction module for traversing all PDF files in the path to obtain the attributes of each traversed PDF file. According to this preferred embodiment, the user only needs to enter a path, and attribute processing is then carried out automatically on all PDF files in that path, which reduces the user's manual workload and improves efficiency.
Fig. 4 shows a schematic diagram of an attribute processing device for PDF files according to a preferred embodiment of the present invention. This preferred embodiment combines the schemes of the embodiments described above. The attribute processing device comprises:
a file path acquisition module 12, a path traversal and PDF file extraction module 14, a PDF file parsing module 22, a PDF file dictionary parsing module 24, a page content stream parsing module 26, an attribute search module 28, a PDF attribute recording module 32, and a database record generation module 34, wherein:
The file path acquisition module 12 is configured to obtain each valid file path from the input character string. For example, the file path acquisition module 12 splits the input character string into multiple valid paths by looking for the special separator "|", and then passes each valid path in turn to the subsequent modules for processing.
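A minimal sketch of the splitting performed by the file path acquisition module. The "|" separator comes from the description; treating a path as "valid" when it names an existing directory, and the function name `split_valid_paths`, are assumptions of this example.

```python
import os
import tempfile


def split_valid_paths(input_string, separator="|"):
    """Split the input on the separator and keep only existing directories."""
    candidates = (part.strip() for part in input_string.split(separator))
    # Assumption: a "valid" path is a non-empty string naming a directory.
    return [path for path in candidates if path and os.path.isdir(path)]


# Usage: one real directory, one empty entry, one path that does not exist.
with tempfile.TemporaryDirectory() as existing:
    paths = split_valid_paths(existing + " | | /no/such/dir-for-this-sketch")
    expected = [existing]
```

Empty entries and non-existent paths are silently dropped here; a production tool might prefer to report them to the user.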
The path traversal and PDF file extraction module 14 is configured to traverse every PDF file in the specified path. For example, for each valid path passed in, the module traverses every file in it, filters by file suffix, and passes each file with the ".pdf" suffix in turn to the subsequent modules for processing.
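The suffix-based filtering can be sketched with `os.walk`. Descending into subdirectories and matching the suffix case-insensitively are assumptions of this example; the description only states that files are filtered by the ".pdf" suffix.

```python
import os
import tempfile


def iter_pdf_files(root):
    """Yield every file under root whose suffix is .pdf (case-insensitive)."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            if name.lower().endswith(".pdf"):
                yield os.path.join(dirpath, name)


# Usage: build a small directory tree and list the PDF files found in it.
with tempfile.TemporaryDirectory() as root:
    os.mkdir(os.path.join(root, "sub"))
    for rel in ("a.pdf", "notes.txt", os.path.join("sub", "B.PDF")):
        with open(os.path.join(root, rel), "wb"):
            pass  # create an empty placeholder file
    names = sorted(os.path.relpath(p, root) for p in iter_pdf_files(root))
```

Writing the function as a generator lets each file be handed to the next processing stage as soon as it is found, without first building the whole list.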
The PDF file parsing module 22 is configured to determine, by parsing, whether a PDF file contains the defined specified attributes. It includes the PDF file dictionary parsing module and the page content stream parsing module.
The PDF file dictionary parsing module 24 is configured to obtain the dictionaries of the PDF file and search whether they contain the defined attributes. For example, the PDF file dictionary parsing module 24 obtains the dictionary objects of the PDF file passed in, calls the attribute search module to search whether those dictionary objects contain the defined attributes, and records the basic file information and the search results.
The page content stream parsing module 26 is configured to split out the content stream in each page dictionary, process the obtained page content streams one by one, and search whether they contain the defined attributes. For example, the page content stream parsing module 26 splits out the content stream in the dictionary object of each page of the PDF file, processes the content stream of each page in turn, calls the attribute search module to search whether the page content stream contains the defined attributes, and records the search results.
The attribute search module 28 is configured to search whether a specified attribute exists in a specific dictionary object.
The file path acquisition module 12, path traversal and PDF file extraction module 14, PDF file parsing module 22, PDF file dictionary parsing module 24, page content stream parsing module 26, and attribute search module 28 together implement the acquisition module 10 in Fig. 3.
The PDF attribute recording module 32 is configured to save the basic information of the PDF file and the specific attributes that the search found the file to contain.
The database record generation module 34 is configured to write the saved PDF attribute record into the specified database table as a database record. For example, the database record generation module adds a new database record to the specified database, consolidates the previously recorded attribute search results for the PDF file, and fills the database record in the specified format.
The PDF attribute recording module 32 and the database record generation module 34 together implement the recording module 20 in Fig. 3.
Because the whole process in this preferred embodiment can run in batches without human intervention, with every step from extracting files to parsing them and storing the results completed automatically, it greatly reduces labor cost and improves efficiency. Once the results are stored, the stored content can be screened quickly and concisely at any time, and screening on arbitrary combinations of attributes becomes possible, which eases management and allows finer-grained screening.
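To illustrate screening on a combination of attributes once the records are stored, the sketch below runs a single SQL query with several conditions. The table layout, the column names `is_encrypted`, `has_output_intent`, and `page_count`, and the sample rows are all hypothetical; they are not taken from the patent.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: one column per attribute of interest.
conn.execute(
    "CREATE TABLE pdf_attributes ("
    "filename TEXT PRIMARY KEY, is_encrypted INTEGER, "
    "has_output_intent INTEGER, page_count INTEGER)"
)
conn.executemany(
    "INSERT INTO pdf_attributes VALUES (?, ?, ?, ?)",
    [
        ("a.pdf", 0, 1, 12),
        ("b.pdf", 1, 1, 3),
        ("c.pdf", 0, 0, 40),
        ("d.pdf", 0, 1, 2),
    ],
)
# Combined screening: unencrypted files with an OutputIntent and over 5 pages.
rows = conn.execute(
    "SELECT filename FROM pdf_attributes "
    "WHERE is_encrypted = 0 AND has_output_intent = 1 AND page_count > 5 "
    "ORDER BY filename"
).fetchall()
matching = [name for (name,) in rows]
```

A combined filter of this kind is difficult to express through file names alone, which is the limitation of the first prior-art approach described in the background section.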
As can be seen from the above description, the embodiments of the present invention overcome the inefficiency of existing methods of screening PDF files by attribute, and therefore save labor cost and improve efficiency.
Obviously, those skilled in the art should understand that the modules or steps of the present invention described above can be implemented with a general-purpose computing device. They can be concentrated on a single computing device or distributed over a network of multiple computing devices. Optionally, they can be implemented as program code executable by a computing device, so that they can be stored in a storage device and executed by the computing device; or they can each be made into a separate integrated circuit module, or several of the modules or steps can be made into a single integrated circuit module. The present invention is thus not limited to any specific combination of hardware and software.
The foregoing describes only preferred embodiments of the present invention and is not intended to limit it. For those skilled in the art, the present invention may have various modifications and variations. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.

Claims (6)

1. An attribute processing method for PDF files, characterized by comprising the following steps:
presetting an attribute dictionary, wherein the specific attributes of the PDF files that are expected to be tested are added to the attribute dictionary as the attributes to be searched for;
obtaining the attributes of the PDF files;
adding the obtained attributes of each said PDF file, together with its file name, to a database as one record;
wherein said obtaining the attributes of the PDF files comprises:
parsing said PDF file to obtain a file header, content streams, and file dictionaries;
obtaining the attributes of said PDF file from said file header, said content streams, and said file dictionaries;
wherein said obtaining the attributes of said PDF file from said file header, said content streams, and said file dictionaries comprises:
traversing all dictionary objects in said file header, said content streams, and said file dictionaries, and determining during the traversal whether each traversed dictionary object has an attribute in said preset attribute dictionary.
2. The method according to claim 1, characterized in that determining during the traversal whether said traversed dictionary object has an attribute in the preset attribute dictionary comprises:
for said dictionary object currently being traversed, determining whether it has any of the attributes of said attribute dictionary that said PDF file has not yet been determined to have, wherein attributes of said attribute dictionary that said PDF file has already been determined to have are not checked again.
3. The method according to claim 1, characterized in that obtaining the attributes of the PDF files comprises:
obtaining a path from an input character string;
traversing all PDF files in said path to obtain the attributes of each said traversed PDF file.
4. The method according to any one of claims 1-3, characterized in that said attributes include at least one of the following:
document type, PDF version, whether the file is pre-separated, total page count, whether an OutputIntent exists, whether the file is submitted in stream mode, whether OptionalContent is processed, whether annotations are parsed (AnnotationProcessed), whether the file is encrypted, whether the encryption is "soaring" encryption, PDFXVersion, whether the cross-reference table is a stream object, whether there are multiple cross-reference tables, whether a Content that is a stream object exists, whether a Content that is an array object exists, whether a Content that is an empty object exists, annotation content attributes, optional content attributes, image object attributes, gradient attributes, path attributes, font attributes, font content attributes, color space type, function attributes, transparency attributes, FilterType.
5. An attribute processing device for PDF files, characterized by comprising:
an attribute dictionary setting module for presetting an attribute dictionary, wherein the specific attributes of the PDF files that are expected to be tested are added to the attribute dictionary as the attributes to be searched for;
an acquisition module for obtaining the attributes of the PDF files;
a recording module for adding the obtained attributes of each said PDF file, together with its file name, to a database as one record;
said acquisition module comprising:
a PDF file parsing module for parsing said PDF file to obtain a file header, content streams, and file dictionaries;
a PDF file dictionary parsing module for obtaining the attributes of said PDF file from said file header, said content streams, and said file dictionaries;
said PDF file dictionary parsing module being configured to:
traverse all dictionary objects in said file header, said content streams, and said file dictionaries, and determine during the traversal whether each traversed dictionary object has an attribute in said preset attribute dictionary.
6. The device according to claim 5, characterized in that said acquisition module comprises:
a file path acquisition module for obtaining a path from an input character string;
a path traversal and PDF file extraction module for traversing all PDF files in said path to obtain the attributes of each said traversed PDF file.
CN201010605620.XA 2010-12-15 2010-12-15 Attribute processing method and device for PDF files Expired - Fee Related CN102541905B (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN201010605620.XA | 2010-12-15 | 2010-12-15 | Attribute processing method and device for PDF files

Publications (2)

Publication Number | Publication Date
CN102541905A (en) | 2012-07-04
CN102541905B (en) | 2015-11-25

Family

ID=46348822

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
CN201010605620.XA (Expired - Fee Related, CN102541905B (en)) | Attribute processing method and device for PDF files | 2010-12-15 | 2010-12-15

Country Status (1)

Country | Link
CN (1) | CN102541905B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN103853791A (en) * | 2012-12-07 | 2014-06-11 | Tencent Technology (Shenzhen) Co., Ltd. | Implementation method and device for quick file retrieval
CN104166849B (en) * | 2013-05-17 | 2017-04-19 | Peking University Founder Group Co., Ltd. | Electronic document identification method and apparatus
CN105095158B (en) * | 2014-05-06 | 2018-07-10 | Peking University Founder Group Co., Ltd. | PDF-level local linked-network processing method and apparatus
CN113128175B (en) * | 2021-04-19 | 2023-01-24 | Fujian Foxit Software Development Joint Stock Co., Ltd. | Method and system for merging large batches of PDF (Portable Document Format) files

Citations (2)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN1633080A (en) * | 2003-12-24 | 2005-06-29 | Huawei Technologies Co., Ltd. | Method for implementing logging in a network management system
CN101178730A (en) * | 2007-12-14 | 2008-05-14 | Tsinghua University | Document management method oriented to an integrated business model

Also Published As

Publication number | Publication date
CN102541905A (en) | 2012-07-04

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20151125

Termination date: 20191215
