CN109918074B - Compiling link optimization method - Google Patents
- Publication number
- CN109918074B (application CN201711294532.0A)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Landscapes
- Devices For Executing Special Programs (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention relates to a compile-and-link optimization method. Building on the address space allocation, symbol resolution, and relocation functions of LD, the linker in the GNU open source toolset BINUTILS, the method optimizes the linker's linking functions; the optimized functions are symbol table creation, lookup, and query during linking. The method addresses slow compiling and linking and can reduce memory usage during linking, thereby speeding up compiling and linking, saving time, and improving production efficiency.
Description
Technical Field
The invention relates to the technical field of computer software, and in particular to a compile-and-link optimization method.
Background
The GNU toolchain plays a significant role in Linux systems, and compiling and linking account for an important part of it. Linux has developed rapidly in recent years: as computer technology advances, more and more individuals and enterprises use Linux at scale, the variety of applications keeps growing, and programs become more diverse and complex. As a result, code size and the number of modules grow rapidly, which places a heavy burden on compiling and linking. During compilation and integration, more modules mean more binary object files and therefore far more symbols to link. At that order of magnitude, linking can occupy a large share of system resources and seriously slow the system down, so a query technique from big-data processing is introduced to solve the problems that linking causes.
The original linker uses a hash algorithm. A hash algorithm maps data of arbitrary length (characters, numeric values, and so on) through a given function to a shorter, fixed-length value, called the hash value, which serves as an index. A hash table maps a set of keys into an allocated memory region by means of a given hash function H(key) and a collision-handling method; H(key) gives the storage position of the key within that region. The memory region is called the hash table, and the resulting storage position is called the hash address. As a linear data structure, a hash table is clearly among the faster to search when compared with lists, queues, and the like.
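The hash table described above, with a hash function H(key) plus a collision-handling method, can be sketched as follows. This is only an illustration: the djb2 hash, linear probing, the table size, and the names `entry`, `ht_put`, and `ht_get` are assumptions chosen for the example and are not taken from LD or from the patent.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TABLE_SIZE 8u   /* deliberately tiny so collisions are easy to see */

typedef struct {
    const char *key;    /* NULL marks an empty slot */
    uint64_t    value;
} entry;

/* H(key): djb2 string hash reduced to a table position (the hash address). */
static size_t H(const char *key) {
    uint32_t h = 5381u;
    while (*key) h = h * 33u + (uint8_t)*key++;
    return h % TABLE_SIZE;
}

/* Store key at H(key); on a collision, probe the following slots.
   Returns 0 on success, -1 when every slot is taken. */
static int ht_put(entry *tab, const char *key, uint64_t value) {
    assert(tab != NULL && key != NULL);
    for (size_t n = 0; n < TABLE_SIZE; n++) {
        entry *e = &tab[(H(key) + n) % TABLE_SIZE];
        if (e->key == NULL || strcmp(e->key, key) == 0) {
            e->key = key;
            e->value = value;
            return 0;
        }
    }
    return -1;
}

/* Follow the same probe sequence until the key or an empty slot is found. */
static const entry *ht_get(const entry *tab, const char *key) {
    assert(tab != NULL && key != NULL);
    for (size_t n = 0; n < TABLE_SIZE; n++) {
        const entry *e = &tab[(H(key) + n) % TABLE_SIZE];
        if (e->key == NULL) return NULL;
        if (strcmp(e->key, key) == 0) return e;
    }
    return NULL;
}
```

A caller zero-initializes an array of `TABLE_SIZE` entries, then uses `ht_put` and `ht_get`. Once the table fills up, every insert has to walk the whole probe sequence, which is the kind of cost the following paragraph attributes to collisions at scale.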
Over the whole linking process, building and searching the symbol table is the most time-consuming part. With a small number of object files the cost is not noticeable, but when hundreds of object files are linked and each contains hundreds of symbols (or more) to be linked, that order of magnitude exposes the hardware shortcomings of the Loongson platform. If the linker's existing hash algorithm continues to be used at this order of magnitude or higher, its main drawback, low space efficiency, becomes apparent: hash collisions occur at large scale, and more memory must be allocated to resolve them. The hash algorithm used by the current linker therefore occupies a large amount of memory, which severely affects system speed while linking runs and can easily make the system stall. The impact of the system resources consumed by linking cannot be ignored.
Disclosure of Invention
To overcome the defects of the prior art, the invention provides a compile-and-link optimization method. Building on the address space allocation, symbol resolution, and relocation functions of LD, the linker in the GNU open source toolset BINUTILS, the method optimizes the linker's linking functions; the optimized functions are symbol table creation, lookup, and query during linking.
The optimization of symbol table creation consists of creating a GL_KV symbol table.
The GL_KV symbol table is created through the following steps:
Step S1: collect the GNU library symbols and the basic graphics library symbols separately;
Step S2: input the symbol names of the symbols collected in step S1 into a bloom filter;
Step S3: use the output of the bloom filter as the input of an indexing algorithm to determine the position of each symbol in the symbol table;
Step S4: write the symbol information into the symbol table.
The method further includes:
Step S5: if the indexing algorithm yields a repeated position, create a second table to store the data; whether a repetition exists is determined by GL_KV_sym->rep->sym_r.
In step S3, the indexing algorithm is a hash indexing algorithm.
In steps S3 to S4, the bloom filter applies several hash functions to obtain several values and uses those values as indexes to mark the GL_KV bit array; the same values serve as input to the hash indexing algorithm, which yields the position of each symbol in the GL_KV table, and the symbol information is written into the GL_KV table.
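The marking and indexing in steps S2 to S4 can be sketched roughly as follows. The patent does not name its hash functions or its indexing formula, so everything concrete here is an assumption made for illustration: the sizes `GL_BITS`, `GL_K`, and `GL_SLOTS`, the FNV-1a and djb2 hashes combined by double hashing, the way the hash values are mixed into a slot index, and the names `gl_bloom_t`, `gl_bloom_add`, and `gl_bloom_maybe_contains`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define GL_BITS  4096u   /* size of the bit array (assumed) */
#define GL_K     3u      /* number of hash functions (assumed) */
#define GL_SLOTS 1024u   /* slots in the symbol table (assumed) */

typedef struct {
    uint8_t bits[GL_BITS / 8];
} gl_bloom_t;

/* 32-bit FNV-1a string hash. */
static uint32_t hash_fnv1a(const char *s) {
    uint32_t h = 2166136261u;
    while (*s) { h ^= (uint8_t)*s++; h *= 16777619u; }
    return h;
}

/* djb2 string hash. */
static uint32_t hash_djb2(const char *s) {
    uint32_t h = 5381u;
    while (*s) h = h * 33u + (uint8_t)*s++;
    return h;
}

/* i-th hash value, derived from two base hashes (double hashing). */
static uint32_t gl_hash(const char *name, uint32_t i) {
    return hash_fnv1a(name) + i * (hash_djb2(name) | 1u);
}

/* Steps S2 and S3: mark one bit per hash value, then feed the same
   values into an index computation. Returns the table slot. */
static uint32_t gl_bloom_add(gl_bloom_t *bf, const char *name) {
    assert(bf != NULL && name != NULL);
    uint32_t mix = 0;
    for (uint32_t i = 0; i < GL_K; i++) {
        uint32_t h = gl_hash(name, i);
        uint32_t bit = h % GL_BITS;
        bf->bits[bit / 8] |= (uint8_t)(1u << (bit % 8));
        mix = mix * 31u + h;           /* hash values as index factors */
    }
    return mix % GL_SLOTS;
}

/* Returns 0 if the name was definitely never added, 1 if it may have been. */
static int gl_bloom_maybe_contains(const gl_bloom_t *bf, const char *name) {
    assert(bf != NULL && name != NULL);
    for (uint32_t i = 0; i < GL_K; i++) {
        uint32_t bit = gl_hash(name, i) % GL_BITS;
        if (!(bf->bits[bit / 8] & (1u << (bit % 8)))) return 0;
    }
    return 1;
}
```

The slot returned by `gl_bloom_add` is where step S4 would write the symbol information. The same name always maps to the same slot, which is what lets a later lookup find the entry again.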
The lookup function is optimized through the following steps:
Step SA: build a lookup table for the GL_KV symbols;
Step SB: input a sym_key and use the bloom filter to determine whether the corresponding sym_key exists in the lookup table;
Step SC: if it does not exist, report that the lookup failed; if it does, execute step SD;
Step SD: determine whether the sym_key that was found is a false positive; if so, report that the lookup failed, and if not, report that the lookup succeeded.
In step SB, the bloom filter consists of several hash functions and a bit array; in step SD, whether the sym_key that was found is a false positive is determined from the value of GL_KV_sym->sym_key.
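Steps SB to SD can be illustrated with a small sketch. The entry layout, the seeded FNV-1a hashes, the slot formula, and the names `gl_kv_table`, `table_put`, `table_lookup`, and `lookup_result` are all hypothetical; only the idea of comparing the stored `sym_key` with the queried key to catch a false positive comes from the text. Slot collisions are ignored to keep the example short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BITS  1024u  /* bit array size (assumed) */
#define K     3u     /* number of hash functions (assumed) */
#define SLOTS 64u    /* table slots (assumed) */

typedef struct {
    const char *sym_key;    /* NULL marks an empty slot */
    uint64_t    sym_value;
} gl_kv_sym;

typedef struct {
    uint8_t   bits[BITS / 8];
    gl_kv_sym slots[SLOTS];
} gl_kv_table;

typedef enum { LOOKUP_OK, LOOKUP_ABSENT, LOOKUP_FALSE_POSITIVE } lookup_result;

/* Seeded FNV-1a stands in for hash[0] .. hash[n-1]. */
static uint32_t hash_i(const char *s, uint32_t seed) {
    uint32_t h = 2166136261u ^ (seed * 0x9e3779b9u);
    while (*s) { h ^= (uint8_t)*s++; h *= 16777619u; }
    return h;
}

/* Indexing algorithm: mix the K hash values into one slot number. */
static uint32_t slot_of(const char *key) {
    uint32_t mix = 0;
    for (uint32_t i = 0; i < K; i++) mix = mix * 31u + hash_i(key, i);
    return mix % SLOTS;
}

static void table_put(gl_kv_table *t, const char *key, uint64_t value) {
    assert(t != NULL && key != NULL);
    for (uint32_t i = 0; i < K; i++) {
        uint32_t bit = hash_i(key, i) % BITS;
        t->bits[bit / 8] |= (uint8_t)(1u << (bit % 8));
    }
    gl_kv_sym *e = &t->slots[slot_of(key)];   /* collisions ignored here */
    e->sym_key = key;
    e->sym_value = value;
}

static lookup_result table_lookup(const gl_kv_table *t, const char *key,
                                  uint64_t *out) {
    assert(t != NULL && key != NULL && out != NULL);
    /* Steps SB and SC: any unset bit means the key was never inserted. */
    for (uint32_t i = 0; i < K; i++) {
        uint32_t bit = hash_i(key, i) % BITS;
        if (!(t->bits[bit / 8] & (1u << (bit % 8)))) return LOOKUP_ABSENT;
    }
    /* Step SD: the filter said "maybe"; the stored key decides. */
    const gl_kv_sym *e = &t->slots[slot_of(key)];
    if (e->sym_key == NULL || strcmp(e->sym_key, key) != 0)
        return LOOKUP_FALSE_POSITIVE;
    *out = e->sym_value;
    return LOOKUP_OK;
}
```

Both `LOOKUP_ABSENT` and `LOOKUP_FALSE_POSITIVE` correspond to a failed lookup in the steps above; the sketch keeps them apart only so the two paths are visible.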
The query function used during linking is optimized through the following steps:
Step Sa: build one lookup table for the OL_KV symbols and one for the GL_KV symbols;
Step Sb: input a sym_key;
Step Sc: determine whether a qualifying sym_key exists in the lookup table for the OL_KV symbols, and if so, mark it as present;
Step Sd: determine whether a qualifying sym_key exists in the lookup table for the GL_KV symbols, and if so, mark it as present;
Step Se: if, after steps Sc and Sd, neither lookup table contains a marked sym_key, report an error;
if exactly one of the two lookup tables contains a marked sym_key, use the sym_value corresponding to that sym_key;
if both lookup tables contain a marked sym_key, compare symbol strength: report an error if both marked sym_keys are strong symbols, and select the sym_value of the strong symbol if only one of them is a strong symbol.
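The decision made in step Se can be written as a small function. The types `marked_sym` and `resolve_status` and the function `resolve_symbol` are hypothetical names for this sketch. The steps above do not say what happens when both marked symbols are weak, so the sketch returns a separate status for that case rather than guessing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Result of step Sc (OL_KV) or step Sd (GL_KV) for one sym_key. */
typedef struct {
    bool     marked;     /* true if the table holds a qualifying sym_key */
    bool     strong;     /* true for a strong symbol, false for a weak one */
    uint64_t sym_value;
} marked_sym;

typedef enum {
    RESOLVE_OK,
    RESOLVE_UNDEFINED,         /* marked in neither table: error */
    RESOLVE_DUPLICATE_STRONG,  /* strong in both tables: error */
    RESOLVE_BOTH_WEAK          /* not covered by the steps; left to the caller */
} resolve_status;

static resolve_status resolve_symbol(const marked_sym *ol,
                                     const marked_sym *gl,
                                     uint64_t *out) {
    assert(ol != NULL && gl != NULL && out != NULL);
    if (!ol->marked && !gl->marked)
        return RESOLVE_UNDEFINED;
    if (ol->marked != gl->marked) {            /* exactly one table has it */
        *out = ol->marked ? ol->sym_value : gl->sym_value;
        return RESOLVE_OK;
    }
    if (ol->strong && gl->strong)              /* both marked, both strong */
        return RESOLVE_DUPLICATE_STRONG;
    if (ol->strong != gl->strong) {            /* one strong, one weak */
        *out = ol->strong ? ol->sym_value : gl->sym_value;
        return RESOLVE_OK;
    }
    return RESOLVE_BOTH_WEAK;
}
```

Keeping the strength comparison separate from the table queries means the same function serves whichever way the two `marked_sym` results were obtained.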
Step Sc comprises:
Step Sc1: use the bloom filter to determine whether the corresponding sym_key exists in the lookup table for the OL_KV symbols; if not, execute step Sd; if so, execute step Sc2;
Step Sc2: determine whether the corresponding sym_key is a false-positive sym_key; if not, mark the sym_key to obtain a marked sym_key and execute step Sd; if so, execute step Sc3;
Step Sc3: query whether the sym_key exists in the repeated data; if so, mark the sym_key to obtain a marked sym_key and execute step Sd; if not, execute step Sd directly.
Step Sd comprises:
Step Sd1: use the bloom filter to determine whether the corresponding sym_key exists in the lookup table for the GL_KV symbols; if not, execute step Se; if so, execute step Sd2;
Step Sd2: determine whether the corresponding sym_key is a false-positive sym_key; if not, mark the sym_key to obtain a marked sym_key and execute step Se; if so, execute step Sd3;
Step Sd3: query whether the sym_key exists in the repeated data; if so, mark the sym_key to obtain a marked sym_key and execute step Se; if not, execute step Se directly.
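The three sub-steps shared by Sc and Sd (filter check, false-positive check, then a look at the repeated data) might look like the sketch below. It assumes that the "repeated data" is a small overflow list holding entries whose computed slot was already taken; that reading, the table layout, the hashes, and the names `kv_table` and `query_marked` are assumptions for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BITS  512u   /* bit array size (assumed) */
#define K     2u     /* number of hash functions (assumed) */
#define SLOTS 16u    /* main table slots (assumed) */
#define DUPS  8u     /* capacity of the repeated-data list (assumed) */

typedef struct {
    const char *sym_key;    /* NULL marks an empty slot */
    uint64_t    sym_value;
} kv_sym;

typedef struct {
    uint8_t bits[BITS / 8];
    kv_sym  slots[SLOTS];
    kv_sym  dup[DUPS];      /* entries whose slot was already occupied */
    size_t  ndup;
} kv_table;

/* Seeded FNV-1a stands in for the filter's hash functions. */
static uint32_t hash_i(const char *s, uint32_t seed) {
    uint32_t h = 2166136261u ^ (seed * 0x9e3779b9u);
    while (*s) { h ^= (uint8_t)*s++; h *= 16777619u; }
    return h;
}

static uint32_t slot_of(const char *key) {
    uint32_t mix = 0;
    for (uint32_t i = 0; i < K; i++) mix = mix * 31u + hash_i(key, i);
    return mix % SLOTS;
}

static bool filter_hit(const kv_table *t, const char *key) {
    for (uint32_t i = 0; i < K; i++) {
        uint32_t bit = hash_i(key, i) % BITS;
        if (!(t->bits[bit / 8] & (1u << (bit % 8)))) return false;
    }
    return true;
}

/* Returns true (the key is "marked") and stores its value, or false. */
static bool query_marked(const kv_table *t, const char *key, uint64_t *value) {
    assert(t != NULL && key != NULL && value != NULL);
    if (!filter_hit(t, key))                      /* Sc1 / Sd1 */
        return false;
    const kv_sym *e = &t->slots[slot_of(key)];
    if (e->sym_key != NULL && strcmp(e->sym_key, key) == 0) {
        *value = e->sym_value;                    /* Sc2 / Sd2: real hit */
        return true;
    }
    for (size_t i = 0; i < t->ndup; i++) {        /* Sc3 / Sd3 */
        if (strcmp(t->dup[i].sym_key, key) == 0) {
            *value = t->dup[i].sym_value;
            return true;
        }
    }
    return false;
}
```

Running `query_marked` once against the OL_KV table and once against the GL_KV table yields the two marks that step Se then compares.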
In step Sa, the lookup tables for the OL_KV symbols and the GL_KV symbols are computed by a bloom filter, composed of several hash functions and a bit array, together with a hash indexing algorithm.
The compiling and linking optimization method provided by the invention overcomes the defect of low compiling and linking speed, and can reduce the memory occupancy rate during linking, thereby achieving the purposes of improving the compiling and linking speed, saving the time cost and improving the production efficiency.
Drawings
FIG. 1: implementation flow of the symbol table creation optimization of the invention;
FIG. 2: implementation flow of the lookup function optimization of the invention;
FIG. 3: implementation flow of the query function optimization during linking.
Detailed Description
To aid understanding of the technical solution and advantages of the invention, they are described in detail below with reference to the accompanying drawings.
The main aim of the compile-and-link optimization method is to optimize the linking functions of the linker, building on the address space allocation, symbol resolution, and relocation functions of LD, the linker in the GNU open source toolset BINUTILS. The algorithms behind symbol creation, lookup, and relocation are all improved: the original plain hash algorithm is abandoned in favor of a bloom filter, and a table of the symbols that the GNU component libraries and the basic graphics library can provide is maintained. This speeds up the compile-and-link stage across the board and works around hardware limitations in software.
Specifically, FIG. 1 to FIG. 3 show the implementation flows of, respectively, the symbol table creation optimization, the lookup function optimization, and the query function optimization during linking.
First, FIG. 1 shows the implementation flow of the symbol table creation optimization. When compile-and-link optimization is performed, the GNU component first maintains a basic table of the symbols that the GNU toolchain and the basic graphics library can provide. Each entry mainly contains the symbol name, the symbol value (address), the type, the file in which the symbol is located, the number of symbols with the same name, and so on; the symbol serves as the key and the address as the value. The table entry is an extension of the system structure struct elfxx_Sym (xx may be 64 or 32). The table is currently used only for compiling and linking, so it is not resident in memory and there is no need to worry about the load it places on the system. The table is named GL_KV (G: GNU, L: linker, K: key, V: value), its entries are named GL_KV_sym, and the data is stored in a data file. In an embodiment of the present invention, GL_KV has the following general structure:
The GL_KV symbol table data file is created as follows:
(1) collect the GNU library symbols and the basic graphics library symbols separately;
(2) use the symbol names in the collected symbol set as the input of a bloom filter;
(3) use the output of the bloom filter as the input of an indexing algorithm to determine the position of each symbol in the data file;
(4) write the symbol information into the symbol table;
(5) if the indexing algorithm yields a repeated position, create a second table to store the data; whether a repetition exists is determined by GL_KV_sym->rep->sym_r.
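Creation steps (2) to (5) can be sketched as below. The patent's actual entry structure is not reproduced in this text, so the layout here is hypothetical: only the names `sym_key`, `sym_value`, `rep`, and `sym_r` echo identifiers that appear in the description. The sizes, the hashes, the in-memory arrays standing in for the data file, and the choice to treat any occupied slot as a repetition are assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BITS  1024u  /* bit array size (assumed) */
#define K     3u     /* number of hash functions (assumed) */
#define SLOTS 32u    /* main table slots (assumed) */
#define DUPS  32u    /* capacity of the second table (assumed) */

typedef struct { int sym_r; } gl_kv_rep;   /* nonzero: a repetition exists */

typedef struct {
    const char *sym_key;     /* symbol name; NULL marks an empty slot */
    uint64_t    sym_value;   /* symbol address */
    gl_kv_rep  *rep;         /* repetition record for this slot */
} gl_kv_sym;

typedef struct {
    uint8_t   bits[BITS / 8];
    gl_kv_sym slots[SLOTS];
    gl_kv_rep reps[SLOTS];
    gl_kv_sym dup[DUPS];     /* the second table of step (5) */
    size_t    ndup;
} gl_kv_table;

static void gl_kv_init(gl_kv_table *t) {
    assert(t != NULL);
    memset(t, 0, sizeof *t);
    for (size_t i = 0; i < SLOTS; i++) t->slots[i].rep = &t->reps[i];
}

/* Seeded FNV-1a stands in for the filter's hash functions. */
static uint32_t hash_i(const char *s, uint32_t seed) {
    uint32_t h = 2166136261u ^ (seed * 0x9e3779b9u);
    while (*s) { h ^= (uint8_t)*s++; h *= 16777619u; }
    return h;
}

/* Steps (2) to (5) for one symbol. Returns 0 on success,
   -1 if the second table is full. */
static int gl_kv_insert(gl_kv_table *t, const char *name, uint64_t addr) {
    assert(t != NULL && name != NULL);
    uint32_t mix = 0;
    for (uint32_t i = 0; i < K; i++) {           /* (2) bloom filter */
        uint32_t h = hash_i(name, i);
        uint32_t bit = h % BITS;
        t->bits[bit / 8] |= (uint8_t)(1u << (bit % 8));
        mix = mix * 31u + h;
    }
    gl_kv_sym *e = &t->slots[mix % SLOTS];       /* (3) position */
    if (e->sym_key == NULL) {                    /* (4) write entry */
        e->sym_key = name;
        e->sym_value = addr;
        return 0;
    }
    if (t->ndup == DUPS) return -1;
    e->rep->sym_r = 1;                           /* (5) record repetition */
    t->dup[t->ndup++] = (gl_kv_sym){ name, addr, NULL };
    return 0;
}
```

After `gl_kv_init`, each collected symbol is passed to `gl_kv_insert` once. A later query that lands on a slot with `rep->sym_r` set knows it may also need to consult the second table.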
Next, FIG. 2 shows the implementation flow of the lookup function optimization. A lookup table is formed from n hash functions and a bit array; this lookup table (that is, the bit array) is dedicated to the GL_KV table and is used to perform lookups on it. The bloom filter quickly determines whether a sym_value corresponding to a sym_key exists, and hash[0](...), ..., hash[n-1](...) are used as factors of the indexing algorithm to determine the sym_value quickly. A bloom filter can return false positives. Although the probability is very small, such an error is fatal to the linking process, so the value of GL_KV_sym->sym_key is used to decide whether the result is a false-positive sym_value (as shown in the following code). This removes the effect of bloom filter false positives and ensures the filter gives correct results in use.
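/* Del_FP presumably compares the queried key with the key stored in the
   GL_KV entry; a nonzero result means the filter hit was a false positive. */
if (Del_FP(Get_S_K(key), Get_S_K(gl_sym_key)))
    goto F_Positon;    /* false-positive path: treat the lookup as failed */
else {
    ...
}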
Finally, FIG. 3 shows the implementation flow of the query function optimization during linking. Linking queries the symbols of hundreds of object files in the same way. First, an OL_KV table (O: object file, L: link) is established (as shown in the following code).
A lookup table is formed from n hash functions and a bit array; it is dedicated to OL_KV and is used to perform lookups on the OL_KV table. hash[0](...), ..., hash[n-1](...) are used as factors of the indexing algorithm to determine the sym_value, which completes symbol relocation and address fix-up. To avoid the effect of bloom filter false positives, OL_KV->sym_key is used to ensure correctness. When the table is built, symbols with the same name are checked: if they comprise a strong symbol and weak symbols, the strong symbol is used; if they are all weak symbols, they are uniformly marked as 0 and a later lookup reports failure; if they are all strong symbols, an error is reported immediately and a lookup failure is likewise reported.
In particular, linking involves many symbols with the same name. When a key is queried during linking, the OL_KV table is searched first to check whether the key exists, and if it does, the symbol is marked. The GL_KV table is then queried for the key. If the key exists there too, it is compared with the symbol from the OL_KV table: if both are strong symbols, an error is reported; if only one is a strong symbol, the strong symbol is selected. If the key exists in neither the GL_KV table nor the OL_KV table, an error is reported; if it exists in only one of them, the value corresponding to the key is used.
Note: the OL_KV table is built in the same way as the GL_KV table.
The compiling and linking optimization method provided by the invention overcomes the defect of low compiling and linking speed, and can reduce the memory occupancy rate during linking, thereby achieving the purposes of improving the compiling and linking speed, saving the time cost and improving the production efficiency.
The compiling link optimizing method provided by the invention is suitable for optimizing compiling links under various software platforms, and is particularly suitable for compiling links under a Loongson platform.
In the present invention, the term "linker" refers to LD, the linker in the GNU open source toolset BINUTILS, which links multiple object binary files into one executable binary file.
In the present invention, the "GNU toolchain" is the quiet supporting force behind every major open source project, including the Linux kernel itself. It consists of the set of tools and software needed to compile and debug all kinds of software, from the smallest utility up to software as complex as the Linux kernel.
In the present invention, the hash algorithm is a one-way scheme: an irreversible mapping from plaintext to ciphertext that has an encryption process but no decryption process.
In the present invention, a "bloom filter" consists of a long binary vector and a series of random mapping functions. It can be used to test whether an element is in a set. Its advantage is that its space efficiency and query time are far better than those of general algorithms; its disadvantages are a certain false-recognition rate and the difficulty of deleting elements. A false recognition is called a false positive.
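For reference, the false-recognition rate mentioned above has a standard textbook approximation. The symbols are chosen here and do not come from the patent: m is the number of bits in the array, n the number of inserted elements, and k the number of hash functions, with the hashes assumed independent and uniformly distributed.

```latex
p \approx \left(1 - e^{-kn/m}\right)^{k},
\qquad
k_{\mathrm{opt}} = \frac{m}{n}\,\ln 2
```

Under these assumptions, enlarging the bit array relative to the number of symbols lowers p, which in turn reduces how often the stored sym_key comparison has to reject a filter hit.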
Although the present invention has been described with reference to the preferred embodiments, it should be understood that the scope of the present invention is not limited thereto, and those skilled in the art will appreciate that various changes and modifications can be made without departing from the spirit and scope of the present invention.
Claims (4)
1. A compile-and-link optimization method, characterized in that: building on the address space allocation function, the symbol resolution function, and the relocation function of LD, the linker in the GNU open source toolset BINUTILS, the linking functions of the linker are optimized, the optimized functions comprising a symbol table creation function, a lookup function, and a query function used during linking;
the GNU component maintains a basic symbol table of the symbols that the GNU toolchain and the basic graphics library can provide; the symbol table includes the symbol name, the symbol value or address, the type, the file in which the symbol is located, and the number of symbols with the same name, with the symbol as the key and the address as the value; the symbol table is named GL_KV, where G: GNU, L: linker, K: key, V: value, and its entries are named GL_KV_sym; the optimization of the symbol table creation function consists of creating the GL_KV symbol table, and creating the GL_KV symbol table includes the following steps:
Step S1: collect the GNU library symbols and the basic graphics library symbols separately;
Step S2: input the symbol names of the symbols collected in step S1 into a bloom filter;
Step S3: use the output of the bloom filter as the input of an indexing algorithm to determine the position of each symbol in the symbol table;
Step S4: write the symbol information into the symbol table;
the optimization of the lookup function is completed through the following steps:
Step SA: build a lookup table for the GL_KV symbols;
Step SB: input a sym_key and use the bloom filter to determine whether the corresponding sym_key exists in the lookup table;
Step SC: if it does not exist, report that the lookup failed; if it does, execute step SD;
Step SD: determine whether the sym_key that was found is a false positive; if so, report that the lookup failed, and if not, report that the lookup succeeded;
the optimization of the query function used during linking is completed through the following steps:
Step Sa: build one lookup table for the OL_KV symbols and one for the GL_KV symbols, where O: object file;
Step Sb: input a sym_key;
Step Sc: determine whether a qualifying sym_key exists in the lookup table for the OL_KV symbols, and if so, mark it as present;
Step Sd: determine whether a qualifying sym_key exists in the lookup table for the GL_KV symbols, and if so, mark it as present;
Step Se: if, after steps Sc and Sd, neither lookup table contains a marked sym_key, report an error;
if exactly one of the two lookup tables contains a marked sym_key, use the sym_value corresponding to that sym_key;
if both lookup tables contain a marked sym_key, compare symbol strength: report an error if both marked sym_keys are strong symbols, and select the sym_value of the strong symbol if only one of them is a strong symbol.
2. The compile-and-link optimization method of claim 1, wherein step Sc comprises:
Step Sc1: use the bloom filter to determine whether the corresponding sym_key exists in the lookup table for the OL_KV symbols; if not, execute step Sd; if so, execute step Sc2;
Step Sc2: determine whether the corresponding sym_key is a false-positive sym_key; if not, mark the sym_key to obtain a marked sym_key and execute step Sd; if so, execute step Sc3;
Step Sc3: query whether the sym_key exists in the repeated data; if so, mark the sym_key to obtain a marked sym_key and execute step Sd; if not, execute step Sd directly.
3. The compile-and-link optimization method of claim 1, wherein step Sd comprises:
Step Sd1: use the bloom filter to determine whether the corresponding sym_key exists in the lookup table for the GL_KV symbols; if not, execute step Se; if so, execute step Sd2;
Step Sd2: determine whether the corresponding sym_key is a false-positive sym_key; if not, mark the sym_key to obtain a marked sym_key and execute step Se; if so, execute step Sd3;
Step Sd3: query whether the sym_key exists in the repeated data; if so, mark the sym_key to obtain a marked sym_key and execute step Se; if not, execute step Se directly.
4. The compile-and-link optimization method of claim 1, wherein in step Sa the lookup tables for the OL_KV symbols and the GL_KV symbols are computed by a bloom filter, composed of several hash functions and a bit array, together with a hash indexing algorithm.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711294532.0A CN109918074B (en) | 2017-12-08 | 2017-12-08 | Compiling link optimization method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711294532.0A CN109918074B (en) | 2017-12-08 | 2017-12-08 | Compiling link optimization method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109918074A CN109918074A (en) | 2019-06-21 |
CN109918074B true CN109918074B (en) | 2022-09-27 |
Family
ID=66956601
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201711294532.0A Active CN109918074B (en) | 2017-12-08 | 2017-12-08 | Compiling link optimization method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109918074B (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111208978B (en) * | 2019-12-31 | 2023-05-23 | 杭州安恒信息技术股份有限公司 | Character bloom filter implemented by taking Python as interface C++, and method for implementing character bloom filter |
CN111736816B (en) * | 2020-07-20 | 2020-11-24 | 华控清交信息科技(北京)有限公司 | Compiling and linking method and device and compiling and linking device |
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH10275088A (en) * | 1997-03-31 | 1998-10-13 | Hitachi Ltd | Link optimizing method |
CN102385524A (en) * | 2011-12-23 | 2012-03-21 | 浙江大学 | Method for replacing compiling chain order based on mixed-compiling order set |
CN103034486A (en) * | 2012-11-28 | 2013-04-10 | 清华大学 | Automatic optimization method based on full-system expansion call graph for mobile terminal operation system |
CN104951290A (en) * | 2014-03-31 | 2015-09-30 | 国际商业机器公司 | Method and equipment for optimizing software |
CN105320654A (en) * | 2014-05-28 | 2016-02-10 | 中国科学院深圳先进技术研究院 | Dynamic bloom filter and element operating method based on same |
Non-Patent Citations (1)
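Title |
---|
"Design and Analysis of a Post-Link Optimizer for Loongson"; Chen Yu et al.; Journal of Computer Research and Development; 20060911; Vol. 43, No. 8; pp. 1450-1456 * |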
Also Published As
Publication number | Publication date |
---|---|
CN109918074A (en) | 2019-06-21 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||