Detailed Description
In order for those skilled in the art to better understand the technical solutions in the embodiments of the present specification, the technical solutions in the embodiments of the present specification will be described in detail below with reference to the drawings in the embodiments of the present specification, and it is apparent that the described embodiments are only some embodiments of the present specification, not all embodiments. All other embodiments obtained by a person skilled in the art based on the embodiments in the present specification shall fall within the scope of protection.
In one prior-art image merging scheme, the different PDF image files must be parsed and rendered in sequence onto the same medium (such as a virtual canvas), and the medium content with the 2 images rendered is then stored as a PDF file to obtain the merged image file. The merging efficiency of this scheme is low, and the merged image file is large.
In another prior-art image merging scheme, a merging format is preset and the 2 image files to be merged are merged according to that format. Merging with less information improves the efficiency of generating the merged PDF file, but the scheme loses more information, so that the merged image deviates in information and color.
In view of the above problems, the embodiments of the present disclosure provide an image merging method, as shown in fig. 2, which may include the following steps:
S101, obtaining an object list of a target page in a background image file, wherein the object list comprises attribute information of each object in the page; the background image file is in a PDF format;
As a structured file format, PDF does not require objects such as directories, pages, and images to be stored in the file in display order; they may appear in any order and are located by reference. For example, in a PDF file with 3 pages, page 3 may be stored before page 1, but after the references are resolved at display time, pages 1 to 3 are presented in sequence, which keeps the displayed PDF document readable.
The main structure of a PDF file may be as shown in fig. 3. The file header, which appears in the first line of the PDF file, specifies the version number of the PDF specification to which the file conforms. The file body, also called the object set, is the main part of the PDF file and consists of a series of objects. The cross-reference table is an address index table set up to facilitate random access to objects, and typically stores object addresses as byte offsets. The file tail declares the address of the cross-reference table and indicates the root directory of the file body, so that the position of each object in the PDF file can be found and random access achieved.
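By way of illustration only, the four-part structure described above can be sketched in Python as follows. The function name `build_minimal_pdf`, the choice of version 1.4, and the convention that object 1 is the root directory are assumptions made for this sketch and are not part of the embodiments.

```python
def build_minimal_pdf(objects):
    """Assemble file header, file body, cross-reference table and file tail.

    `objects` is a list of raw object bodies; object number i + 1 holds
    objects[i]. Assumption for this sketch: object 1 is the root directory.
    """
    out = bytearray(b"%PDF-1.4\n")          # file header with version number
    offsets = []
    for num, body in enumerate(objects, start=1):
        offsets.append(len(out))            # byte offset of this object
        out += b"%d 0 obj\n" % num + body + b"\nendobj\n"
    xref_pos = len(out)                     # address of the cross-reference table
    out += b"xref\n0 %d\n" % (len(objects) + 1)
    out += b"0000000000 65535 f \n"         # entry 0 is always a free entry
    for off in offsets:
        out += b"%010d 00000 n \n" % off    # one 20-byte entry per object
    # The file tail names the root directory and declares the table address.
    out += b"trailer\n<< /Size %d /Root 1 0 R >>\n" % (len(objects) + 1)
    out += b"startxref\n%d\n%%%%EOF\n" % xref_pos
    return bytes(out), offsets, xref_pos
```

Each cross-reference entry stores a byte offset, so any object can be read directly without scanning the objects stored before it.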
The hierarchical structure of a PDF file may be as shown in fig. 4, where the root directory is indicated by the file tail, and the page directory and the outline directory are referenced in the root directory. The outline directory is the bookmark tree of the PDF file; the page directory contains the number of pages in the PDF file and the reference identification of each page. The page object is the most important object in the PDF. Its page information describes how to display the page and includes, for example, an object list recording attribute information of each object in the page; the attribute information may be, for example, the fonts used, the content contained (text, pictures, etc.), the size of the page, and the like.
The image included in the background image file in this embodiment may be a general part or a custom part, and the image included in the target image file may correspondingly be a custom part or a general part, which is not limited in this specification. The background image file is a PDF file, whereas the format of the target image file is not limited and may be PDF, JPG, PNG, or the like.
In addition, the background image file may comprise a plurality of pages, and the image in the target image file may be embedded into any one of those pages. The image file obtained after merging can be printed directly to produce the image object to be used.
In the scheme, before the object list of the target page in the background image file is obtained, the file header can be checked first, and the background image file is determined to be a file in PDF format according to the version number in the file header. The file tail can also be checked, information in the file tail is read, and the integrity of the background image file is determined.
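As a minimal sketch of these two checks, the following Python function reads the version number from the file header and looks for the closing keywords in the file tail. The function name `check_pdf` and the 1024-byte tail window are assumptions of this sketch; the embodiments do not prescribe how integrity is determined.

```python
import re


def check_pdf(data: bytes):
    """Return (version, complete) for the raw bytes of a candidate file.

    version is None when the file header is not a PDF header.
    complete is a heuristic: the file tail must contain both the
    'startxref' keyword and the '%%EOF' marker.
    """
    header = re.match(rb"%PDF-(\d+\.\d+)", data)
    if header is None:
        return None, False
    version = header.group(1).decode("ascii")
    tail = data[-1024:]                      # assumed window for the file tail
    complete = b"startxref" in tail and b"%%EOF" in tail
    return version, complete
```

A file that fails the header check, such as a JPG file, yields `(None, False)`, and a truncated PDF yields its version together with `False`.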
In this scheme, when obtaining the object list of the target page in the background image file, the target page into which the target image is to be embedded may first be determined in the background image file. The cross-reference table of the background image file is then obtained by parsing the file tail of the background image file; for example, the file tail may be parsed, the address of the cross-reference table declared in the file tail may be determined according to a preset declaration character, and the cross-reference table may be obtained from that address.
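For illustration, in the PDF format the declaration character that precedes the address of the cross-reference table is the keyword `startxref`. The following Python sketch locates it; the function name `find_xref_offset` and the 1024-byte tail window are assumptions of this sketch.

```python
import re


def find_xref_offset(data: bytes) -> int:
    """Return the byte address of the cross-reference table.

    The address is the decimal number that follows the last 'startxref'
    keyword in the file tail.
    """
    tail = data[-1024:]                      # assumed window for the file tail
    matches = re.findall(rb"startxref\s+(\d+)", tail)
    if not matches:
        raise ValueError("no startxref keyword found in the file tail")
    return int(matches[-1])                  # the last declaration wins
```

Taking the last match means that, if the file has been appended to, the most recently declared table is the one that is used.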
Thus, the object list of the target page can be obtained according to the addresses referenced in the cross-reference table. For example, the cross-reference table is parsed to obtain the addresses of the referenced objects recorded in the table, and the file tail of the background image file is parsed to obtain the reference identification of the root directory. The root directory is then obtained according to its reference identification and the addresses recorded in the cross-reference table, and the object list of the target page is obtained according to the page information recorded in the root directory.
As an example, the root directory may be parsed to obtain the reference identification of the page directory, and the page directory may be obtained according to that reference identification and the addresses recorded in the cross-reference table, where the page directory includes the number of pages in the background image file and the reference identification of each page. The object list of the target page is then obtained according to the reference identification of the target page in the page directory; for example, the page directory is parsed to obtain the reference identification of the target page, the page information of the target page is obtained according to that reference identification and the addresses recorded in the cross-reference table, and the obtained page information is parsed to obtain the object list of the target page.
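The chain of lookups described above (file tail, cross-reference table, root directory, page directory, target page) can be sketched in Python as follows. This is a simplified sketch under stated assumptions: the file uses a classic uncompressed cross-reference table, the page directory lists its pages directly in `/Kids` without nesting, and no stream contains the text `endobj`. All function names are hypothetical.

```python
import re


def parse_xref(data, xref_pos):
    """Return {object number: byte offset} read from the cross-reference table."""
    lines = data[xref_pos:].split(b"\n")
    if lines[0].strip() != b"xref":
        raise ValueError("no cross-reference table at the given address")
    table, i = {}, 1
    # Each subsection starts with "<first object number> <entry count>".
    while i < len(lines) and re.fullmatch(rb"\d+ \d+", lines[i].strip()):
        first, count = map(int, lines[i].split())
        for k in range(count):
            offset, _gen, flag = lines[i + 1 + k].split()
            if flag == b"n":                 # 'n' marks an object in use
                table[first + k] = int(offset)
        i += 1 + count
    return table


def get_object(data, table, num):
    """Return the raw body of object `num`, between 'obj' and 'endobj'."""
    start = table[num]
    end = data.index(b"endobj", start)
    return data[data.index(b"obj", start) + 3:end].strip()


def ref(body, key):
    """Return the object number referenced under /key in a dictionary body."""
    m = re.search(rb"/" + key + rb"\s+(\d+)\s+\d+\s+R", body)
    if m is None:
        raise KeyError(key)
    return int(m.group(1))


def locate_page(data, page_index):
    """Return (object number, raw page information) of the page at page_index."""
    xref_pos = int(re.findall(rb"startxref\s+(\d+)", data)[-1])
    table = parse_xref(data, xref_pos)
    file_tail = data[data.index(b"trailer", xref_pos):]
    root = get_object(data, table, ref(file_tail, b"Root"))
    page_dir = get_object(data, table, ref(root, b"Pages"))
    kids = re.search(rb"/Kids\s*\[(.*?)\]", page_dir, re.S).group(1)
    page_nums = [int(n) for n in re.findall(rb"(\d+)\s+\d+\s+R", kids)]
    target = page_nums[page_index]
    return target, get_object(data, table, target)
```

Only the root directory, the page directory, and the target page are read; the other objects in the file body are never parsed.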
S102, obtaining merging requirement information of a target image file; the merging requirement information is used for representing: the target image file is embedded into attribute information in the background image file;
S103, writing the merging requirement information into an object list of the target page to obtain a merged page object list;
S104, analyzing the combined page object list, and rendering to generate a combined image with the target image file embedded in the background image file.
For convenience of description, S102 to S104 will be described in combination.
The merging requirement information of the target image file can comprise binary information obtained by binary reading the target image file, and the reference to the target image is realized through binary reading; information on the position, size, etc. in the target page after the image in the target image file is embedded may also be included. After the merging requirement information is written into the object list of the target page, a merged page object list can be obtained, and the attribute information included in the list describes the image obtained by merging the universal part and the customized part.
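One possible in-memory representation of the merging requirement information is sketched below in Python. The class name `MergeRequirement` and its field names are hypothetical; the embodiments only require that the binary information and the position and size information be available.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class MergeRequirement:
    """Attribute information of the target image once embedded in the page."""
    image_bytes: bytes   # binary information read from the target image file
    page_index: int      # zero-based index of the target page
    x: float             # position of the lower-left corner on the page
    y: float
    width: float         # displayed size on the page
    height: float

    @classmethod
    def from_file(cls, path, page_index, x, y, width, height):
        # The target image file is read in binary mode and is not decoded.
        return cls(Path(path).read_bytes(), page_index, x, y, width, height)
```

Because the image is carried as raw bytes, no pixel data is decoded or re-encoded when the requirement information is prepared.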
In this scheme, after the merging requirement information is written into the object list of the target page and the merged page object list is obtained, the page information of the target page and the cross-reference table can be further updated according to the merged page object list, and the address of the cross-reference table declared in the file tail is then updated according to the updated cross-reference table. In addition, the contents of the page directory and the root directory may be updated. Of course, during updating, any part whose content is the same as in the background image file before merging can be copied directly from the original.
The image merging method provided in the present specification will be described with reference to a more specific example.
Assume that the images to be combined are respectively shown as (a) (b) in fig. 5, wherein the image shown in (a) is included in a background image file in PDF format, the image shown in (b) is a target image file in JPG format, and the image (b) needs to be embedded in the image (a).
First, the file header of the background image file is checked to determine that it is a PDF file, and the file tail is checked to determine the integrity of the file. The file tail is then parsed to obtain the starting point information of the cross-reference table, and thereby the position of the cross-reference table.
The content of the cross-reference table is parsed at that position, yielding the total number of objects in the background image file, the reference identification of each object, and the position of each object in the file, as recorded in the cross-reference table. In addition, the file tail is parsed to obtain the reference identification of the root directory, and the position of the root directory is obtained according to the correspondence between object reference identifications and positions recorded in the cross-reference table.
The content of the root directory of the background image file is parsed at that position to obtain the reference identification of the page directory, and the position of the page directory is obtained according to the correspondence between object reference identifications and positions recorded in the cross-reference table.
The content of the page directory is parsed at that position to obtain the number of pages in the background image file and the reference identification of each page, and the position of the target page is obtained according to the correspondence between object reference identifications and positions recorded in the cross-reference table.
The page information of the target page is parsed at that position; the page information comprises the page size, the image resource list in the page, the Contents list, and the like.
The target image file is read in binary mode, the binary information read is added as a new object to the image resource list, and drawing information such as the position, transformation, and size of the target image in the target page is appended to the tail of the Contents list.
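For a target image file in JPG format, the two additions described above might look as follows in Python. This is a sketch under stated assumptions: the JPG bytes are stored unmodified using the PDF `/DCTDecode` filter, the color space is assumed to be `/DeviceRGB`, and the pixel dimensions are supplied by the caller. The function names and the resource name passed in are hypothetical.

```python
def image_xobject(jpeg_bytes, px_width, px_height):
    """Wrap raw JPG bytes as the body of a PDF image object.

    Assumption: the image is 8-bit RGB. The bytes are not decoded, because
    the /DCTDecode filter tells a PDF reader that the stream is JPEG data.
    """
    head = (b"<< /Type /XObject /Subtype /Image /Width %d /Height %d "
            b"/ColorSpace /DeviceRGB /BitsPerComponent 8 "
            b"/Filter /DCTDecode /Length %d >>\nstream\n"
            % (px_width, px_height, len(jpeg_bytes)))
    return head + jpeg_bytes + b"\nendstream"


def draw_stream(name, x, y, width, height):
    """Build a content stream that paints image /name at (x, y).

    'cm' sets the transformation (scale to width x height, move to x, y),
    'Do' paints the named image, and 'q' / 'Q' save and restore the
    graphics state so earlier page content is unaffected.
    """
    ops = b"q %g 0 0 %g %g %g cm /%s Do Q" % (width, height, x, y, name)
    return b"<< /Length %d >>\nstream\n" % len(ops) + ops + b"\nendstream"
```

For example, `draw_stream(b"ImNew", 50, 60, 150, 75)` produces a stream containing `q 150 0 0 75 50 60 cm /ImNew Do Q`. The new stream is appended after the existing entries of the Contents list, so the target image is drawn on top of the existing page content.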
The cross-reference table is updated according to the added content, the file tail is rewritten, and the position of the updated cross-reference table is re-indicated in the file tail to obtain the merged PDF file. Parsing and rendering this PDF file displays the merged image (shown on the left side of FIG. 1).
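One way to realize this update, shown here only as a sketch, is the incremental-update mechanism of the PDF format: the new and changed objects are appended after the original bytes, followed by a new cross-reference section and a new file tail whose `/Prev` entry points to the previous table. Whether the embodiments append in this way or rewrite the whole table is not specified; the function name `append_update` and its parameters are assumptions of this sketch.

```python
import re


def append_update(data, new_objects, root_num, old_size):
    """Append objects, a new cross-reference section and a new file tail.

    new_objects maps object numbers to raw bodies (new or replaced objects).
    root_num is the object number of the root directory, and old_size is
    the /Size value declared in the original file tail.
    """
    prev = int(re.findall(rb"startxref\s+(\d+)", data)[-1])
    out = bytearray(data)                    # original content is copied as-is
    if not out.endswith(b"\n"):
        out += b"\n"
    offsets = {}
    for num in sorted(new_objects):
        offsets[num] = len(out)
        out += b"%d 0 obj\n" % num + new_objects[num] + b"\nendobj\n"
    xref_pos = len(out)
    out += b"xref\n"
    for num in sorted(offsets):              # one single-entry subsection each
        out += b"%d 1\n%010d 00000 n \n" % (num, offsets[num])
    size = max(old_size, max(offsets) + 1)
    out += (b"trailer\n<< /Size %d /Root %d 0 R /Prev %d >>\n"
            % (size, root_num, prev))
    out += b"startxref\n%d\n%%%%EOF\n" % xref_pos   # re-indicated table address
    return bytes(out)
```

With this approach the unchanged objects of the background image file are copied byte for byte, and only the target page, its resources, and the added image and content objects need to be written.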
It can be seen that the prior-art scheme requires parsing and rendering all objects in the PDF file, including decoding the picture objects, marking the path objects, querying and rendering the text objects, and the like. When the present scheme is applied, parsing and rendering all objects in the PDF file can be avoided: a new picture object is added directly at the semantic level, and all information and content of the source file are retained, so that image merging for PDF files can be achieved efficiently and accurately.
Corresponding to the above method embodiment, the embodiment of the present disclosure further provides an image merging device, as shown in fig. 6, which may include:
a list obtaining module 110, configured to obtain an object list of a target page in a background image file, where the object list includes attribute information of each object in the page; the background image file is in a PDF format;
a requirement obtaining module 120, configured to obtain merging requirement information of the target image file; the merging requirement information is used for representing: the target image file is embedded into attribute information in the background image file;
the list updating module 130 is configured to write the merging requirement information into the object list of the target page to obtain a merged page object list;
the image generating module 140 is configured to parse the merged page object list, and render and generate a merged image in which the target image file is embedded in the background image file.
In one specific embodiment provided in the present specification, the list obtaining module 110 may include:
the page determining submodule is used for determining a target page to be embedded with a target image in the background image file;
the file tail analysis sub-module is used for analyzing the file tail of the background image file and obtaining a cross-reference table of the background image file;
and the list obtaining sub-module is used for obtaining the object list of the target page according to the addresses referenced in the cross reference table.
In a specific embodiment provided in the present specification, the end-of-file parsing sub-module may specifically be used for:
analyzing the file tail of the background image file, and determining the address of a cross reference table declared in the file tail according to a preset declaration character;
and obtaining the cross reference table of the background image file according to the address of the cross reference table.
In a specific embodiment provided in the present specification, the list obtaining submodule may include:
a reference table analyzing unit for analyzing the cross reference table to obtain the address of the referenced object recorded in the table;
the file tail analyzing unit is used for analyzing the file tail of the background image file and obtaining the reference mark of the root directory;
a root directory obtaining unit, configured to obtain the root directory according to a reference identifier of the root directory and an address recorded in the cross reference table;
and the list obtaining unit is used for obtaining the object list of the target page according to the page information recorded by the root directory.
In a specific embodiment provided in the present specification, the list obtaining unit may include:
the root directory analyzing subunit is used for analyzing the root directory and obtaining the reference identification of the page directory;
a page directory obtaining subunit, configured to obtain the page directory according to a reference identifier of the page directory and an address recorded in the cross reference table; the page directory includes: the number of pages in the background image file and the reference mark of each page;
and the list obtaining subunit is used for obtaining the object list of the target page according to the reference identification of the target page in the page directory.
In a specific embodiment provided in the present specification, the list obtaining subunit may specifically be used to:
analyzing the page directory to obtain the reference identification of the target page;
acquiring page information of the target page according to the reference identification of the target page and the address recorded in the cross reference table;
and analyzing the obtained page information to obtain an object list of the target page.
In a specific embodiment provided in the present disclosure, the list updating module is configured to write the merging requirement information into the object list of the target page, and after obtaining the merged page object list, may be further configured to:
updating the page information of the target page and the cross reference table according to the merged page object list;
and updating the address of the cross-reference table declared in the file tail according to the updated cross-reference table.
The implementation process of the functions and roles of each module in the above device is described in detail in the implementation process of the corresponding steps in the above method, and is not repeated here.
The embodiments of the present disclosure also provide a computer device, which at least includes a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor implements the aforementioned image merging method when executing the program. The method at least comprises the following steps:
obtaining an object list of a target page in a background image file, wherein the object list comprises attribute information of each object in the page; the background image file is in a PDF format;
obtaining merging requirement information of a target image file; the merging requirement information is used for representing: the target image file is embedded into attribute information in the background image file;
writing the merging requirement information into an object list of the target page to obtain a merged page object list;
and analyzing the combined page object list, and rendering to generate a combined image in which the target image file is embedded into the background image file.
FIG. 7 illustrates a more specific hardware architecture diagram of a computing device provided by embodiments of the present description, which may include: a processor 1010, a memory 1020, an input/output interface 1030, a communication interface 1040, and a bus 1050. Wherein processor 1010, memory 1020, input/output interface 1030, and communication interface 1040 implement communication connections therebetween within the device via a bus 1050.
The processor 1010 may be implemented by a general-purpose CPU (Central Processing Unit), microprocessor, application specific integrated circuit (Application Specific Integrated Circuit, ASIC), or one or more integrated circuits, etc. for executing relevant programs to implement the technical solutions provided in the embodiments of the present disclosure.
The memory 1020 may be implemented in the form of ROM (Read Only Memory), RAM (Random Access Memory), static storage device, dynamic storage device, or the like. Memory 1020 may store an operating system and other application programs, and when the embodiments of the present specification are implemented in software or firmware, the associated program code is stored in memory 1020 and executed by processor 1010.
The input/output interface 1030 is used to connect with an input/output module for inputting and outputting information. The input/output module may be configured as a component in a device (not shown) or may be external to the device to provide corresponding functionality. Wherein the input devices may include a keyboard, mouse, touch screen, microphone, various types of sensors, etc., and the output devices may include a display, speaker, vibrator, indicator lights, etc.
Communication interface 1040 is used to connect communication modules (not shown) to enable communication interactions of the present device with other devices. The communication module may implement communication through a wired manner (such as USB, network cable, etc.), or may implement communication through a wireless manner (such as mobile network, WIFI, bluetooth, etc.).
Bus 1050 includes a path for transferring information between components of the device (e.g., processor 1010, memory 1020, input/output interface 1030, and communication interface 1040).
It should be noted that although the above-described device only shows processor 1010, memory 1020, input/output interface 1030, communication interface 1040, and bus 1050, in an implementation, the device may include other components necessary to achieve proper operation. Furthermore, it will be understood by those skilled in the art that the above-described apparatus may include only the components necessary to implement the embodiments of the present description, and not all the components shown in the drawings.
The present embodiment also provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the aforementioned image merging method. The method at least comprises the following steps:
obtaining an object list of a target page in a background image file, wherein the object list comprises attribute information of each object in the page; the background image file is in a PDF format;
obtaining merging requirement information of a target image file; the merging requirement information is used for representing: the target image file is embedded into attribute information in the background image file;
writing the merging requirement information into an object list of the target page to obtain a merged page object list;
and analyzing the combined page object list, and rendering to generate a combined image in which the target image file is embedded into the background image file.
Computer-readable media, including both permanent and non-permanent, removable and non-removable media, may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of storage media for a computer include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read only memory (ROM), electrically erasable programmable read only memory (EEPROM), flash memory or other memory technology, compact disc read only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium, which can be used to store information that can be accessed by a computing device. Computer-readable media, as defined herein, does not include transitory computer-readable media (transmission media), such as modulated data signals and carrier waves.
From the foregoing description of embodiments, it will be apparent to those skilled in the art that the present embodiments may be implemented in software plus a necessary general purpose hardware platform. Based on such understanding, the technical solutions of the embodiments of the present specification may be embodied in essence or what contributes to the prior art in the form of a software product, which may be stored in a storage medium, such as a ROM/RAM, a magnetic disk, an optical disk, etc., including several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the method described in the embodiments or some parts of the embodiments of the present specification.
The system, apparatus, module or unit set forth in the above embodiments may be implemented in particular by a computer chip or entity, or by a product having a certain function. A typical implementation device is a computer, which may be in the form of a personal computer, laptop computer, cellular telephone, camera phone, smart phone, personal digital assistant, media player, navigation device, email device, game console, tablet computer, wearable device, or a combination of any of these devices.
In this specification, the embodiments are described in a progressive manner; identical and similar parts of the embodiments may be referred to one another, and each embodiment mainly describes its differences from the other embodiments. In particular, since the device embodiments are substantially similar to the method embodiments, they are described relatively simply, and the relevant points can be found in the description of the method embodiments. The device embodiments described above are merely illustrative: the modules illustrated as separate components may or may not be physically separate, and when the embodiments of the present disclosure are implemented, the functions of the modules may be implemented in one or more pieces of software and/or hardware. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement the embodiments without creative effort.
The foregoing is merely a specific implementation of the embodiments of this disclosure. It should be noted that a person skilled in the art may make several improvements and modifications without departing from the principles of the embodiments of this disclosure, and these improvements and modifications should also be considered to fall within the scope of protection of the embodiments of this disclosure.