CN109933532A - Matching-based method for identifying library functions in Internet of Things firmware - Google Patents

Matching-based method for identifying library functions in Internet of Things firmware

Info

Publication number
CN109933532A
Authority
CN
China
Prior art keywords
function
firmware
library
feature
internet
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910216299.7A
Other languages
Chinese (zh)
Inventor
朱立鹏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xidian University
Original Assignee
Xidian University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2019-03-20
Filing date
2019-03-20
Publication date
2019-06-25
Application filed by Xidian University
Priority to CN201910216299.7A
Publication of CN109933532A
Legal status: Pending

Landscapes

  • Computer And Data Communications (AREA)

Abstract

The present invention relates to the field of software reverse engineering, and specifically to a matching-based method for identifying library functions in Internet of Things (IoT) firmware, comprising the following steps: 1) identifying the load address of the IoT firmware; 2) building a library function feature database; 3) extracting firmware function features; 4) matching function features; 5) matching the sub-functions of identified functions. The invention extracts binary function code features along multiple dimensions, such as function signatures, assembly instructions, and string information, and builds a feature database for the firmware and for the function library respectively. It then identifies the library functions in the firmware by feature matching, and automatically matches the sub-functions of identified functions, which further improves the efficiency and accuracy of library function identification. The technical solution enables fast and accurate identification of library functions during IoT firmware analysis, and improves the correctness and readability of the results of automated firmware analysis and vulnerability discovery.

Description

A matching-based method for identifying library functions in Internet of Things firmware
Technical field
The present invention relates to the field of software reverse engineering, and specifically to a matching-based method for identifying library functions in Internet of Things (IoT) firmware.
Background
The IoT industry has developed rapidly in recent years, and more and more devices have entered everyday life. IoT devices span personal wearables, home security, communication and logistics, smart homes, and other sectors, and have brought great convenience to people's lives. At the same time, these devices are vulnerable to malicious attacks. By exploiting vulnerabilities in device firmware, an attacker can take control of a device, monitor user behavior, and steal private data, seriously endangering users' privacy and property. To defend against such attacks, many security researchers perform binary analysis on firmware. However, because IoT devices have limited storage and computing resources, symbol table information is often stripped when the firmware is compiled. During binary vulnerability discovery it is therefore difficult to locate specific library functions, which makes it impossible to analyze a program's runtime state and vulnerabilities accurately.
At present, library function analysis is mainly based on function signature identification. Function features are signed with a hash algorithm, and identifying functions only requires loading the corresponding signature file. The signature mechanism avoids unnecessary matching and greatly improves the efficiency of function identification.
However, signature-based library function identification is tightly coupled to the compiler and its optimization settings. A small difference in compile options can change a function's signature drastically, which raises the miss rate (false negatives) of library function identification and makes automatic identification harder. To reduce these obstacles, a library function identification method with high accuracy and a low miss rate is needed.
Therefore, in view of the above, there is an urgent need for a matching-based method for identifying library functions in IoT firmware that overcomes the shortcomings of current practice.
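The brittleness of signature-only identification described in the background can be illustrated with a minimal sketch. It assumes a signature is simply a SHA-256 hash over the raw bytes of a function body; the byte strings and the `signature_db` dictionary are made-up stand-ins, not data from the patent.

```python
import hashlib


def signature(code: bytes) -> str:
    """Hash the raw bytes of a function body into a fixed-size signature."""
    return hashlib.sha256(code).hexdigest()


# Hypothetical signature file: signature -> library function name.
# The bytes are invented placeholders for a compiled function body.
lib_body = bytes.fromhex("10b5044600f012f810bd")
signature_db = {signature(lib_body): "QSPI_WriteByte"}


def identify(code: bytes):
    """Exact lookup: return the library name, or None if no signature matches."""
    return signature_db.get(signature(code))


same_build = lib_body
# Same function built with a different compile option: a single byte differs.
other_build = bytes.fromhex("10b5054600f012f810bd")

print(identify(same_build))   # found through the signature file
print(identify(other_build))  # missed, although it is the same library function
```

The lookup is fast because it is a single hash comparison, but any byte-level difference yields a miss, which is the false-negative problem that motivates combining signatures with other code features.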
Summary of the invention
The purpose of the present invention is to provide a matching-based method for identifying library functions in IoT firmware, so as to solve the problems raised in the Background section above.
To achieve this purpose, the invention provides the following technical solution:
A matching-based method for identifying library functions in IoT firmware, comprising the following steps:
1) IoT firmware load address identification: by searching the strings in the firmware for chip model information and matching the signature code corresponding to each chip model, determine the correct load address and section (block) information of the IoT firmware. This step is the prerequisite for ensuring that the extracted function features are correct;
2) Build the library function feature database:
(2a) Select a disassembly engine: according to the chip model obtained in step 1), collect the corresponding function libraries, and select a suitable disassembly engine for the chip architecture to load the function libraries for later use;
(2b) Extract function code features: use the disassembly engine to extract the features of each function. The required features are: function name, function address, function assembly code, function instruction set, function code hash, numeric constants in the function, and strings indirectly referenced by the function;
(2c) Extract function call flow features: by searching for jump instructions, extract the sub-function information from each function's calls. Each function is treated as a node and the functions it calls as its child nodes, so that the entire function library forms one complete call flow graph. Record the function name, child nodes, out-degree, and in-degree of every node in this graph;
(2d) Build the feature database: store the function code features and function call flow features in a database, organized into separate tables;
3) Firmware function feature extraction: load the firmware using the firmware load address and section information obtained in step 1), then follow the method of step 2) to extract the features of each function in the firmware and the call flow graph of the whole firmware;
4) Function feature matching: match the features extracted in steps 2) and 3) by comparing each feature in the feature databases, with matching rules formulated according to the different weights of the features;
5) Sub-function matching of identified functions: each matched function is assumed to call its sub-functions in the same order in the library and in the firmware. To speed up library function matching, sub-functions are matched directly: for each identified function, its child nodes are ranked by call order according to the call flow information in the feature database and matched pairwise, completing the library function identification.
As a further solution of the present invention: in step 1), the IoT firmware includes firmware for IoT devices based on the ARM, MIPS, and Xtensa architectures and firmware for IoT devices based on Linux.
As a further solution of the present invention: in step (2a), the disassembly engines include IDA Pro and Radare2.
As a further solution of the present invention: in step (2c), the jump instructions include the BL, B, and BLX instructions.
As a further solution of the present invention: in step (2d), the database types include MySQL and Sqlite3.
Compared with the prior art, the beneficial effects of the present invention are as follows. The invention extracts binary function code features along multiple dimensions, such as function signatures, assembly instructions, and string information, and builds a feature database for the firmware and for the function library respectively. It then identifies library functions in the firmware by feature matching, and automatically matches the sub-functions of identified functions, further improving the efficiency and accuracy of library function identification. This enables fast and accurate identification of library functions in IoT firmware analysis, improves the correctness and readability of the results of automated firmware analysis and vulnerability discovery, and provides a knowledge base for both. By combining code features with code signatures, it addresses the high miss rate of current approaches that rely on library function signatures alone.
Brief description of the drawings
Fig. 1 is a flow diagram of the matching-based method for identifying library functions in IoT firmware.
Fig. 2 is a schematic diagram of the structure of the firmware section information and load address information in the Marvell chip firmware header, as used in the method.
Fig. 3 is the call flow graph of the wdt_drv_init function in the function library.
Fig. 4 is the call flow graph of the wdt_drv_init function in the firmware.
Detailed description of the embodiments
The technical solution of this patent is explained in further detail below with reference to specific embodiments.
Embodiments of this patent are described in detail below, and examples of the embodiments are shown in the accompanying drawings, in which the same or similar reference numerals throughout denote the same or similar elements, or elements having the same or similar functions. The embodiments described below with reference to the drawings are exemplary; they serve only to explain this patent and are not to be construed as limiting it.
In the description of this patent, it should be understood that orientation or position terms such as "center", "upper", "lower", "front", "rear", "left", "right", "vertical", "horizontal", "top", "bottom", "inner", and "outer" refer to the orientations or positions shown in the drawings. They are used only to simplify the description, and do not indicate or imply that the device or element referred to must have a particular orientation or be constructed and operated in a particular orientation; they are therefore not to be construed as limiting this patent.
In the description of this patent, it should be noted that, unless otherwise expressly specified and limited, the terms "installed", "connected", "coupled", and "provided" are to be understood broadly; for example, a connection or arrangement may be fixed, detachable, or integral. A person of ordinary skill in the art can understand the specific meaning of these terms in this patent according to the circumstances.
Referring to Figs. 1-4, a matching-based method for identifying library functions in IoT firmware comprises the following steps:
1) IoT firmware load address identification: by searching the strings in the firmware for chip model information and matching the signature code corresponding to each chip model, determine the correct load address and section information of the IoT firmware. The load address and section information are used when the disassembly engine loads the firmware;
The IoT firmware includes, but is not limited to, firmware for IoT devices based on the ARM, MIPS, and Xtensa architectures and firmware for IoT devices based on Linux. Specifically, in this embodiment the IoT firmware is Marvell chip firmware, and the section information and load address information in the firmware header are as follows:
The first 4 bytes, "MRVL", are the signature code of Marvell chip firmware; bytes 12-16 hold a 32-bit number giving the number of sections in the firmware; bytes 21-24 mark the type of section 1; bytes 25-28 give the starting position of section 1 in the file; bytes 29-32 give the length of section 1; and bytes 33-36 give the load address of section 1 in the firmware.
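The header layout above can be read with a few lines of Python. This is a sketch under stated assumptions rather than the patent's implementation: byte positions are taken as 1-based, fields are assumed to be little-endian, the 32-bit section count is assumed to sit at bytes 13-16, and the load address of section 1 is assumed to sit at bytes 33-36. The sample header is synthetic.

```python
import struct

MAGIC = b"MRVL"  # signature code quoted in the text


def parse_header(blob: bytes) -> dict:
    """Read the section count and the section-1 descriptor from a firmware header."""
    if blob[:4] != MAGIC:
        raise ValueError("signature code 'MRVL' not found")
    # Assumption: the 32-bit section count occupies bytes 13-16 (offset 12).
    (section_count,) = struct.unpack_from("<I", blob, 12)
    # Bytes 21-36 (offset 20): type, file offset, length, load address of section 1.
    sec_type, file_offset, length, load_address = struct.unpack_from("<4I", blob, 20)
    return {
        "section_count": section_count,
        "type": sec_type,
        "file_offset": file_offset,
        "length": length,
        "load_address": load_address,
    }


# Synthetic header for illustration only; every value here is made up.
sample = (
    MAGIC
    + bytes(8)                 # bytes 5-12, not described in the text
    + struct.pack("<I", 3)     # section count
    + bytes(4)                 # bytes 17-20, not described in the text
    + struct.pack("<4I", 2, 0xC8, 0x1000, 0x00100000)
)
info = parse_header(sample)
print(info["section_count"], hex(info["load_address"]))
```

The returned file offset, length, and load address are what a disassembly engine would need in order to map section 1 at the right address.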
2) Build the library function feature database:
(2a) Select a disassembly engine: IoT devices use different architectures, such as ARM, MIPS, and Xtensa, and therefore different instruction sets, so a suitable disassembly engine must be selected. In this embodiment the disassembly engines include, but are not limited to, IDA Pro and Radare2. Suitable parameters are chosen according to the firmware's chip architecture, and the function library is loaded into the disassembly engine for later use;
(2b) Extract function code features: use the disassembly engine to extract the features of each function. The required features are: function name, function address, function assembly code, function instruction set, function code hash, numeric constants in the function, and strings indirectly referenced by the function;
In this embodiment, the QSPI_WriteByte function is taken as an example. Its extracted code features are shown in Table 1:
Table 1: Code features extracted from the QSPI_WriteByte function
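As a rough illustration of step (2b), the sketch below derives the listed features from an already disassembled function. It does not call IDA Pro or Radare2; the `(address, mnemonic, operands)` tuples, the operand syntax (`#imm` for constants, `=addr` for literal references), and the choice of hashing the mnemonic sequence are all assumptions made for the example.

```python
import hashlib
import re


def extract_features(name, instructions, strings):
    """instructions: list of (address, mnemonic, operands); strings: address -> text."""
    asm = [f"{m} {ops}".strip() for _, m, ops in instructions]
    mnemonics = [m for _, m, _ in instructions]
    constants, referenced = [], []
    for _, _, ops in instructions:
        # Numeric constants written as immediates, e.g. "#0x10" or "#4".
        for tok in re.findall(r"#(0x[0-9a-fA-F]+|\d+)", ops):
            constants.append(int(tok, 16) if tok.startswith("0x") else int(tok))
        # Strings referenced indirectly through a literal address, e.g. "=0x2000".
        for tok in re.findall(r"=(0x[0-9a-fA-F]+)", ops):
            addr = int(tok, 16)
            if addr in strings:
                referenced.append(strings[addr])
    return {
        "name": name,
        "address": instructions[0][0],
        "asm": asm,
        "mnemonics": mnemonics,
        # Hashing only the mnemonics ignores register and address differences.
        "code_hash": hashlib.sha256(" ".join(mnemonics).encode()).hexdigest(),
        "constants": constants,
        "strings": referenced,
    }


# Hypothetical disassembly; not the real body of QSPI_WriteByte.
insns = [
    (0x1000, "PUSH", "{r4, lr}"),
    (0x1002, "MOV", "r4, #0x10"),
    (0x1004, "LDR", "r0, =0x2000"),
    (0x1006, "BL", "sub_1100"),
    (0x100A, "POP", "{r4, pc}"),
]
feats = extract_features("QSPI_WriteByte", insns, {0x2000: "qspi write"})
print(feats["mnemonics"], feats["constants"], feats["strings"])
```

Keeping several independent features per function is what later allows a match even when one of them, such as the hash, differs between builds.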
(2c) Extract function call flow features: a function may call internal or external functions while it runs. By searching for jump instructions such as BL, B, and BLX, the sub-function information of each function is extracted. Each function is treated as a node and the functions it calls as its child nodes, so that the entire function library forms one complete call flow graph. The function name, child nodes, out-degree, and in-degree of every node in this graph are recorded;
In this embodiment, the QSPI_WriteByte function is again taken as an example. Its call information is shown in Table 2:
Table 2: Call information of the QSPI_WriteByte function
Node name    QSPI_WriteByte
Out-degree   1
In-degree    7
Child nodes  ["QSPI_WriteByte"]
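The call flow graph of step (2c) can be sketched as follows. The input format, a mapping from function name to `(mnemonic, target)` pairs in program order, is an assumption, and the instruction sequence for wdt_drv_init is invented; only the names of its two sub-functions come from the text.

```python
from collections import defaultdict

CALL_MNEMONICS = {"BL", "B", "BLX"}


def build_call_graph(functions):
    """functions: name -> list of (mnemonic, target) pairs in program order."""
    children = {name: [] for name in functions}
    in_degree = defaultdict(int)
    for name, insns in functions.items():
        for mnemonic, target in insns:
            # Only jumps that land on a known function start count as calls;
            # branches to local labels inside a function are skipped.
            if mnemonic in CALL_MNEMONICS and target in functions:
                if target not in children[name]:
                    children[name].append(target)  # keeps first-call order
                    in_degree[target] += 1
    return {
        name: {
            "children": kids,
            "out_degree": len(kids),
            "in_degree": in_degree[name],
        }
        for name, kids in children.items()
    }


# Hypothetical instruction stream for illustration.
library = {
    "wdt_drv_init": [
        ("PUSH", None),
        ("BL", "mdev_get_handle"),
        ("B", "loc_10"),
        ("BL", "mdev_register"),
    ],
    "mdev_get_handle": [],
    "mdev_register": [],
}
graph = build_call_graph(library)
print(graph["wdt_drv_init"])
```

Child nodes are stored in call order because the later sub-function matching step relies on that order.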
(2d) Build the feature database: store the function code features and function call flow features in a database, organized into separate tables, for convenient use during library function matching. Available database types include, but are not limited to, MySQL and Sqlite3;
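One possible layout for the feature database of step (2d) is sketched below with Python's built-in `sqlite3` module. The table and column names are assumptions; the patent only states that code features and call flow features are stored in separate tables. The inserted row reuses the Table 2 values for QSPI_WriteByte.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")  # use a file path for a persistent database
conn.executescript(
    """
    CREATE TABLE code_features (
        name       TEXT PRIMARY KEY,
        address    INTEGER,
        asm        TEXT,
        mnemonics  TEXT,
        code_hash  TEXT,
        constants  TEXT,
        strings    TEXT
    );
    CREATE TABLE call_flow (
        name        TEXT PRIMARY KEY,
        out_degree  INTEGER,
        in_degree   INTEGER,
        children    TEXT  -- JSON list of child node names, in call order
    );
    """
)

# Call information of QSPI_WriteByte as listed in Table 2.
conn.execute(
    "INSERT INTO call_flow VALUES (?, ?, ?, ?)",
    ("QSPI_WriteByte", 1, 7, json.dumps(["QSPI_WriteByte"])),
)
conn.commit()

row = conn.execute(
    "SELECT out_degree, in_degree, children FROM call_flow WHERE name = ?",
    ("QSPI_WriteByte",),
).fetchone()
children = json.loads(row[2])
print(row[0], row[1], children)
```

Using the same schema for the library database and the firmware database keeps the later comparison a column-by-column operation.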
3) Firmware function feature extraction: code features are extracted from the firmware in the same way, and with the same content, as described in step (2), and the resulting code feature database tables have the same columns;
4) Function feature matching: the features extracted in steps (2) and (3) are matched. To improve matching efficiency while preserving accuracy, the matching rules are set as shown in Table 3:
Table 3: Function feature matching rules
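The contents of Table 3 are not reproduced in this text, so the following is only an illustration of what a weighted matching rule could look like. The weights, the threshold, the use of Jaccard similarity, and the hash shortcut are all assumptions, not the patent's rules.

```python
# Hypothetical weights and threshold; Table 3 is not available here.
WEIGHTS = {"mnemonics": 0.3, "constants": 0.15, "strings": 0.15}
THRESHOLD = 0.5


def jaccard(a, b):
    """Set overlap in [0, 1]; two empty sets give no evidence and score 0."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def score(lib, fw):
    # Cheap check first: identical code hashes are accepted immediately.
    if lib["code_hash"] == fw["code_hash"]:
        return 1.0
    total = 0.0
    for key, weight in WEIGHTS.items():
        total += weight * jaccard(lib[key], fw[key])
    return total


def match(fw_func, library):
    """Return (library name or None, best score) for one firmware function."""
    best = max(library, key=lambda lib: score(lib, fw_func))
    best_score = score(best, fw_func)
    return (best["name"] if best_score >= THRESHOLD else None, best_score)


library = [
    {"name": "QSPI_WriteByte", "code_hash": "aa",
     "mnemonics": ["PUSH", "MOV", "BL", "POP"], "constants": [16], "strings": ["qspi"]},
    {"name": "mdev_register", "code_hash": "bb",
     "mnemonics": ["LDR", "STR", "BX"], "constants": [1], "strings": []},
]
# Firmware function whose hash differs (different build) but whose other features agree.
fw = {"name": "sub_1000", "code_hash": "cc",
      "mnemonics": ["PUSH", "MOV", "BL", "POP"], "constants": [16], "strings": ["qspi"]}
name, best_score = match(fw, library)
print(name, round(best_score, 2))
```

The point of the weighting is that a function missed by the hash comparison can still be identified when its remaining features agree strongly enough.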
5) Sub-function matching of identified functions: this step matches the sub-functions of functions that have already been identified. According to the call flow information in the feature database, the child nodes are ranked by call order and matched pairwise, which completes the library function identification;
This embodiment takes the wdt_drv_init function as an example and compares the function library code with the firmware code to illustrate the importance of sub-function matching;
Fig. 3 shows the call flow graph of wdt_drv_init in the function library; it has two sub-functions, mdev_get_handle and mdev_register;
Fig. 4 shows the call flow graph of the wdt_drv_init function identified in the firmware; it has the sub-functions sub_104398 and sub_10430c. The call flow information is consistent with that of the library function, so the sub-functions are matched directly without feature matching.
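The wdt_drv_init example can be sketched as a propagation of names along the call graph. The code assumes that each graph maps a function name to its child nodes in call order, and that the order in which the text lists the sub-functions reflects that call order; the consistency check on the number of children is also an assumption.

```python
from collections import deque


def match_subfunctions(lib_graph, fw_graph, matched):
    """matched: firmware function name -> library function name from feature matching."""
    names = dict(matched)
    queue = deque(matched.items())
    while queue:
        fw_name, lib_name = queue.popleft()
        lib_kids = lib_graph.get(lib_name, [])
        fw_kids = fw_graph.get(fw_name, [])
        if len(lib_kids) != len(fw_kids):
            continue  # call flows disagree; leave these children unmatched
        # Pair children position by position, i.e. by call order.
        for fw_kid, lib_kid in zip(fw_kids, lib_kids):
            if fw_kid not in names:
                names[fw_kid] = lib_kid
                queue.append((fw_kid, lib_kid))
    return names


lib_graph = {
    "wdt_drv_init": ["mdev_get_handle", "mdev_register"],
    "mdev_get_handle": [],
    "mdev_register": [],
}
fw_graph = {
    "wdt_drv_init": ["sub_104398", "sub_10430c"],
    "sub_104398": [],
    "sub_10430c": [],
}
# wdt_drv_init was already identified in the firmware by feature matching.
result = match_subfunctions(lib_graph, fw_graph, {"wdt_drv_init": "wdt_drv_init"})
print(result)
```

Because each newly named child is queued in turn, one identified function can resolve a whole subtree of the firmware's call graph without any further feature comparison.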
In this embodiment, the feature database built in step (2d) contains 2411 entries and the firmware in step (4) contains 1972 functions. Step (4) identifies 1137 library functions and step (5) identifies a further 172, improving the function identification rate by 15% compared with function signatures alone.
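The counts reported for this embodiment can be combined into a few ratios. The text does not state how the 15% figure was computed, so the calculation below is only one way of relating the numbers, not a confirmed derivation.

```python
firmware_functions = 1972   # functions found in the firmware
feature_matched = 1137      # identified by feature matching, step (4)
subfunction_matched = 172   # additionally identified by sub-function matching, step (5)

total_identified = feature_matched + subfunction_matched
coverage = total_identified / firmware_functions
relative_gain = subfunction_matched / feature_matched

print(f"{total_identified} functions identified ({coverage:.1%} of the firmware)")
print(f"sub-function matching adds {relative_gain:.1%} on top of feature matching")
```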
The above is only a preferred embodiment of the present invention. It should be noted that a person skilled in the art may make modifications and improvements without departing from the inventive concept; these also fall within the scope of protection of the invention, and none of them affects the effect of implementing the invention or the practicability of the patent.

Claims (5)

1. A matching-based method for identifying library functions in Internet of Things (IoT) firmware, characterized by comprising the following steps:
1) IoT firmware load address identification: by searching the strings in the firmware for chip model information and matching the signature code corresponding to each chip model, determine the correct load address and section information of the IoT firmware;
2) Build the library function feature database:
(2a) Select a disassembly engine: according to the chip model obtained in step 1), collect the corresponding function libraries, and select a suitable disassembly engine for the chip architecture to load the function libraries for later use;
(2b) Extract function code features: use the disassembly engine to extract the features of each function, the required features being: function name, function address, function assembly code, function instruction set, function code hash, numeric constants in the function, and strings indirectly referenced by the function;
(2c) Extract function call flow features: by searching for jump instructions, extract the sub-function information from each function's calls; treat each function as a node and the functions it calls as its child nodes, so that the entire function library forms one complete call flow graph; and record the function name, child nodes, out-degree, and in-degree of every node in this graph;
(2d) Build the feature database: store the function code features and function call flow features in a database, organized into separate tables;
3) Firmware function feature extraction: load the firmware using the firmware load address and section information obtained in step 1), then follow the method of step 2) to extract the features of each function in the firmware and the call flow graph of the whole firmware;
4) Function feature matching: match the features extracted in steps 2) and 3) by comparing each feature in the feature databases, with matching rules formulated according to the different weights of the features;
5) Sub-function matching of identified functions: match the sub-functions of identified functions by ranking the child nodes by call order according to the call flow information in the feature database and matching them pairwise, completing the library function identification.
2. The matching-based method for identifying library functions in IoT firmware according to claim 1, characterized in that, in step 1), the IoT firmware includes firmware for IoT devices based on the ARM, MIPS, and Xtensa architectures and firmware for IoT devices based on Linux.
3. The matching-based method for identifying library functions in IoT firmware according to claim 2, characterized in that, in step (2a), the disassembly engines include IDA Pro and Radare2.
4. The matching-based method for identifying library functions in IoT firmware according to claim 3, characterized in that, in step (2c), the jump instructions include the BL, B, and BLX instructions.
5. The matching-based method for identifying library functions in IoT firmware according to any one of claims 1 to 4, characterized in that, in step (2d), the database types include MySQL and Sqlite3.
CN201910216299.7A 2019-03-20 2019-03-20 Matching-based method for identifying library functions in Internet of Things firmware Pending CN109933532A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910216299.7A CN109933532A (en) 2019-03-20 2019-03-20 Matching-based method for identifying library functions in Internet of Things firmware

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910216299.7A CN109933532A (en) 2019-03-20 2019-03-20 Matching-based method for identifying library functions in Internet of Things firmware

Publications (1)

Publication Number Publication Date
CN109933532A true CN109933532A (en) 2019-06-25

Family

ID=66987860

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910216299.7A Pending CN109933532A (en) 2019-03-20 2019-03-20 Matching-based method for identifying library functions in Internet of Things firmware

Country Status (1)

Country Link
CN (1) CN109933532A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112306522A (en) * 2020-09-29 2021-02-02 北京软慧科技有限公司 Firmware updating mode identification method and device
CN113382006A (en) * 2021-06-15 2021-09-10 中国信息通信研究院 Internet of things terminal security and risk assessment and evaluation method
CN113382006B (en) * 2021-06-15 2022-12-16 中国信息通信研究院 Internet of things terminal security and risk assessment and evaluation method
CN114666134A (en) * 2022-03-23 2022-06-24 南昌大学 Intelligent discovery and mining method and system for network vulnerabilities

Legal Events

Date Code Title Description
PB01 Publication
WD01 Invention patent application deemed withdrawn after publication
Application publication date: 20190625