CN113849785B - Mobile terminal information asset use behavior identification method for application program - Google Patents

Mobile terminal information asset use behavior identification method for application program

Info

Publication number
CN113849785B
Authority
CN
China
Prior art keywords
application program
information
acquiring
target application
user
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110866631.1A
Other languages
Chinese (zh)
Other versions
CN113849785A (en
Inventor
何能强
秦佳伟
贾世林
张华�
涂腾飞
关振智
关广振
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhongshi Ruian Beijing Network Technology Co ltd
Beijing University of Posts and Telecommunications
National Computer Network and Information Security Management Center
Original Assignee
Zhongshi Ruian Beijing Network Technology Co ltd
Beijing University of Posts and Telecommunications
National Computer Network and Information Security Management Center
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhongshi Ruian Beijing Network Technology Co ltd, Beijing University of Posts and Telecommunications, National Computer Network and Information Security Management Center filed Critical Zhongshi Ruian Beijing Network Technology Co ltd
Priority to CN202110866631.1A priority Critical patent/CN113849785B/en
Publication of CN113849785A publication Critical patent/CN113849785A/en
Application granted granted Critical
Publication of CN113849785B publication Critical patent/CN113849785B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/30 Authentication, i.e. establishing the identity or authorisation of security principals
    • G06F21/31 User authentication
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/36 Preventing errors by testing or debugging software
    • G06F11/3668 Software testing

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Hardware Design (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Quality & Reliability (AREA)
  • Stored Programmes (AREA)

Abstract

The present disclosure provides a mobile terminal information asset usage behavior identification method for an application program, comprising: determining the information assets in the mobile terminal that relate to sensitive user information; acquiring, based on instrumentation and kernel monitoring, the behavior of a target application program in acquiring and/or using the information assets; acquiring the user agreement of the target application program, and extracting, based on a semantic analysis model, the terms of the user agreement that relate to the acquisition and/or use of user information; and comparing the behavior of the target application program in acquiring and/or using the information assets with those terms, and identifying violations in that behavior. Through instrumentation and kernel monitoring, the disclosure obtains the application program's actual use of information assets in the mobile terminal and compares it with the permissions that the application program declares in its user agreement, so that whether the application program exceeds its declared permissions can be determined accurately and efficiently.

Description

Mobile terminal information asset use behavior identification method for application program
Technical Field
The disclosure relates to the technical field of information security, and in particular to a mobile terminal information asset usage behavior identification method for application programs.
Background
With the development of the networked, intelligent era, mobile devices are used more and more, and the number of application programs in application stores at home and abroad keeps growing. Mobile applications make daily life more convenient, and application programs that provide functions such as mobile payment and online shopping have gradually become part of people's lives. While enjoying the unprecedented convenience of feature-rich applications, users are also constantly exposed to attacks and privacy disclosure risks caused by insecure applications. Leakage of users' personal information is increasingly common, as are cases of reaping exorbitant profits by illegally buying and selling personal private information.
Malicious software on the Android platform keeps increasing, and more and more malicious application programs actually use permissions that exceed those declared in their privacy policies. Monitoring how Android applications use and call information assets is therefore a necessary means of protecting user privacy. How to accurately and efficiently detect whether massive numbers of Android application programs illegally acquire and use user asset data is thus a problem to be solved.
Disclosure of Invention
In view of the foregoing, an object of the present disclosure is to provide a method for identifying a mobile terminal information asset usage behavior for an application.
Based on the above object, the present disclosure provides a mobile terminal information asset usage behavior recognition method for an application, including:
determining information assets in the mobile terminal related to the user sensitive information;
based on instrumentation and kernel monitoring, acquiring the behavior of a target application program to acquire and/or use the information asset;
acquiring the user agreement of the target application program, and extracting, based on a pre-constructed and trained semantic analysis model, the terms of the user agreement that relate to the acquisition and/or use of user information;
and comparing the behavior of the target application program in acquiring and/or using the information assets with the terms related to the acquisition and/or use of user information, and identifying violations in that behavior.
As can be seen from the above, the present disclosure provides a mobile terminal information asset usage behavior identification method for an application program, comprising: determining the information assets in the mobile terminal that relate to sensitive user information; acquiring, based on instrumentation and kernel monitoring, the behavior of a target application program in acquiring and/or using the information assets; acquiring the user agreement of the target application program, and extracting, based on a pre-constructed and trained semantic analysis model, the terms of the user agreement that relate to the acquisition and/or use of user information; and comparing the behavior of the target application program in acquiring and/or using the information assets with those terms, and identifying violations in that behavior. Through instrumentation and kernel monitoring, the disclosure obtains the application program's actual use of information assets in the mobile terminal and compares it with the permissions that the application program declares in its user agreement, so that whether the application program exceeds its declared permissions can be determined accurately and efficiently.
Drawings
In order to more clearly illustrate the technical solutions of the present disclosure or related art, the drawings required for the embodiments or related art description will be briefly described below, and it is apparent that the drawings in the following description are only embodiments of the present disclosure, and other drawings may be obtained according to these drawings without inventive effort to those of ordinary skill in the art.
FIG. 1 is a schematic flow chart of a method for identifying usage behavior of an information asset of a mobile terminal for an application according to an embodiment of the present disclosure;
FIG. 2 is a flow chart of a behavior acquisition method for an application to acquire information assets, provided in accordance with an embodiment of the present disclosure;
FIG. 3 is a schematic diagram comparing a call to a generic interface with a call to a monitored interface, provided in accordance with an embodiment of the present disclosure;
FIG. 4 is a schematic structural diagram of a mobile terminal information asset usage behavior recognition device for an application according to an embodiment of the present disclosure;
FIG. 5 is a schematic diagram of a hardware structure of an electronic device according to an embodiment of the present disclosure.
Detailed Description
For the purposes of promoting an understanding of the principles and advantages of the disclosure, reference will now be made to the embodiments illustrated in the drawings and specific language will be used to describe the same.
It should be noted that unless otherwise defined, technical or scientific terms used in the embodiments of the present disclosure should be given the ordinary meaning as understood by one of ordinary skill in the art to which the present disclosure pertains. The terms "first," "second," and the like, as used in embodiments of the present disclosure, do not denote any order, quantity, or importance, but rather are used to distinguish one element from another. The word "comprising" or "comprises", and the like, means that elements or items preceding the word are included in the element or item listed after the word and equivalents thereof, but does not exclude other elements or items. The terms "connected" or "connected," and the like, are not limited to physical or mechanical connections, but may include electrical connections, whether direct or indirect. "upper", "lower", "left", "right", etc. are used merely to indicate relative positional relationships, which may also be changed when the absolute position of the object to be described is changed.
As described in the background, the technical problem actually to be solved by the present disclosure is how to accurately and efficiently identify whether an application program's use of information assets in a mobile terminal goes beyond the scope of permissions declared in the application program's agreement. In view of this, the present disclosure proposes a mobile terminal information asset usage behavior identification method for an application program.
Referring to fig. 1, a flow chart of a method for identifying mobile terminal information asset usage behavior for an application according to an embodiment of the disclosure is shown; the mobile terminal information asset use behavior identification method for the application program comprises the following steps:
S110, determining the information assets in the mobile terminal that relate to sensitive user information.
The method specifically comprises the following steps:
In some embodiments, the relevant specifications and regulations concerning an application program's acquisition and/or use of user information in the mobile terminal are obtained, and the information assets in the mobile terminal that relate to sensitive user information are determined by analyzing these specifications and regulations.
In some embodiments, information assets in the mobile terminal which are related to the user sensitive information are preset, and when the method is implemented, the information assets in the mobile terminal which are related to the user sensitive information are directly determined according to preset contents.
After the information assets related to sensitive user information in the mobile terminal are determined, the method further comprises:
determining a mapping relationship between the behavior of acquiring and/or using the information assets and the interfaces and permissions of the target application program.
According to this mapping relationship, whether the target application program acquires and/or uses an information asset can be determined from the interfaces it calls and the permissions it uses.
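The mapping relationship can be pictured as a lookup table from each information asset to the interfaces and permissions that indicate its use. The following is a minimal sketch under stated assumptions: the table `ASSET_MAPPING`, its three entries, and the helper `assets_touched` are hypothetical illustrations chosen for this example, not the mapping actually used by the disclosure.

```python
# Hypothetical mapping from information assets to the Android interfaces and
# permissions whose use indicates that the asset is being acquired or used.
ASSET_MAPPING = {
    "location": {
        "interfaces": {"android.location.LocationManager.getLastKnownLocation"},
        "permissions": {"android.permission.ACCESS_FINE_LOCATION"},
    },
    "camera": {
        "interfaces": {"android.hardware.Camera.open"},
        "permissions": {"android.permission.CAMERA"},
    },
    "microphone": {
        "interfaces": {"android.media.AudioRecord.startRecording"},
        "permissions": {"android.permission.RECORD_AUDIO"},
    },
}


def assets_touched(called_interfaces, used_permissions):
    """Return the sorted asset names implied by observed calls and permissions."""
    called = set(called_interfaces)
    used = set(used_permissions)
    touched = set()
    for asset, entry in ASSET_MAPPING.items():
        # An asset counts as touched if any mapped interface or permission appears.
        if entry["interfaces"] & called or entry["permissions"] & used:
            touched.add(asset)
    return sorted(touched)
```

Keeping interfaces and permissions in one table means a single lookup answers the question posed in this step, regardless of whether the evidence is a call or a permission.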
S120, based on instrumentation and kernel monitoring, the behavior of the target application program for acquiring and/or using the information asset is acquired.
The method specifically comprises the following steps:
Referring to FIG. 2, a flow chart of a behavior acquisition method for an application program to acquire information assets according to an embodiment of the present disclosure is shown:
S210, installing the target application program into the mobile terminal.
In some embodiments, the adb tool is used to install application programs automatically. adb (Android Debug Bridge) is a command-line tool that can perform various device operations (e.g., installing and debugging applications) and provides access to a Unix shell, which can be used to run various commands on a device. It is a client-server program.
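Automated installation with adb might be scripted as below. This is a sketch, assuming `adb` is on the PATH and a device is connected; the function names `build_install_cmd` and `install_apk` are hypothetical, while `adb -s <serial> install -r <apk>` is the standard adb command form.

```python
import subprocess


def build_install_cmd(apk_path, serial=None, reinstall=True):
    """Build the adb command line that installs an APK on a device."""
    cmd = ["adb"]
    if serial:
        cmd += ["-s", serial]      # target one specific device
    cmd.append("install")
    if reinstall:
        cmd.append("-r")           # replace an existing installation
    cmd.append(apk_path)
    return cmd


def install_apk(apk_path, serial=None):
    """Run the install command; returns True when adb exits successfully."""
    result = subprocess.run(build_install_cmd(apk_path, serial),
                            capture_output=True, text=True)
    return result.returncode == 0
```

Separating command construction from execution lets the command be inspected or logged without a device attached.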
S220, constructing a driving event based on a human-computer interaction interface, and realizing automatic driving of the target application program.
Typically, an application program executes a process (a process is one running activity of a program on a certain data set in a computer) upon receiving a click command or an input command. In the present disclosure, driving refers to an action that causes an application program to execute a process, and it includes a click class and an input class.
In the related art, click events and input events are generally constructed from the source code of an application program in order to drive the application program for automated testing. This approach has two drawbacks. First, it generally does not support automated simulated driving of inter-process communication events. The user spaces of processes are independent of one another and generally cannot be accessed directly, so different processes running at the same time in one operating system must communicate and exchange information through inter-process communication, which is what enables one program to handle the requests of many users at the same time. Second, for each different application program the click events and input events must be completely reconstructed from that program's source code; in short, manual reprogramming is required, so the approach is costly, inefficient, and has a high implementation threshold.
The method of the present disclosure drives the application program automatically based on a human-computer interaction interface call graph. It constructs click events and input events without relying on the source code of the application program, and it supports automated simulated driving of inter-process communication events.
The human-computer interaction interface comprises virtual keys and input windows; constructing driving events based on the human-computer interaction interface to drive the target application program automatically comprises:
in response to determining that an input window exists in the human-computer interaction interface, acquiring description information of the input window and coordinate information of the text box corresponding to the input window, and acquiring simulated data corresponding to the description information to construct an input-class driving event, so as to realize a simulated input operation;
and in response to determining that a virtual key exists in the human-computer interaction interface, acquiring coordinate information of the virtual key to construct a click-class driving event, so as to realize a simulated click operation.
Specifically, when the application program starts, what it presents to the user is an intuitive human-computer interaction interface, in which input-class and click-class events are displayed in detail. The present disclosure proposes a human-computer interaction interface call graph (UCG) to represent the relationship between human-computer interaction interfaces (UI) and driving events. If driving an event causes the human-computer interaction interface $UI_k$ to jump to the human-computer interaction interface $UI_w$, then there is an edge $\langle UI_k, UI_w \rangle \in E$. $V$ denotes the set of all interfaces of the application program, and $E$ denotes the set of all events. The human-computer interaction interface call graph is $\mathrm{UCG} = (V, E) = \{\langle UI_k, UI_w \rangle \mid UI_k, UI_w \in V\}$.
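As an illustration of the call graph, the sketch below stores interfaces as nodes and driving events as labeled directed edges. The class name `UCG` mirrors the term in the text, but the method names (`add_transition`, `has_edge`, `reachable`) and the screen and event names in the usage note are hypothetical and are not taken from the disclosure.

```python
from collections import defaultdict


class UCG:
    """Directed graph: nodes are UI screens, edges are driving events."""

    def __init__(self):
        self.nodes = set()               # V: every interface seen so far
        self.edges = defaultdict(set)    # E: ui_k -> {(event, ui_w), ...}

    def add_transition(self, ui_k, event, ui_w):
        """Record that driving `event` on ui_k jumps to ui_w."""
        self.nodes.update((ui_k, ui_w))
        self.edges[ui_k].add((event, ui_w))

    def has_edge(self, ui_k, ui_w):
        """True if some event leads directly from ui_k to ui_w."""
        return any(target == ui_w for _, target in self.edges.get(ui_k, ()))

    def reachable(self, start):
        """Return every interface reachable from `start` via driving events."""
        seen, stack = {start}, [start]
        while stack:
            for _, nxt in self.edges.get(stack.pop(), ()):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return seen
```

For example, after `g.add_transition("login", "click_submit", "home")`, the call `g.has_edge("login", "home")` is true, and `g.reachable("login")` lists the screens a driver could still visit.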
The present disclosure simulates the way a user uses an application program and triggers the functions presented by the current human-computer interaction interface. If an event is an input-class event, its description information is acquired; this includes key information such as "enter account", "search", and "enter password". According to this information, data of the corresponding type is selected from an information base for simulated input, which drives more functions of the application program under test and improves code coverage. Because entering text requires knowing where the text box is located in the human-computer interaction interface, the coordinates x and y of the text box are obtained. With these data, input-class events can be entered automatically. If an event is a click-class event, triggering it only requires its position on the human-computer interaction interface, so the automated simulated click needs only the coordinates x and y of the event. Because a click event is likely to change the current human-computer interaction interface, the UCG is constructed again after each simulated click event. After all driving events of the human-computer interaction interfaces have been simulated, simulation of inter-process communication events begins: the actions, connections, and data recorded in the events are acquired, an Intent message is constructed automatically and sent to the application program under test, and, when the application program displays a new human-computer interaction interface that is not in the existing UCG, a UCG is created for it for subsequent driving.
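One way to realize these simulated events is through adb shell commands. The sketch below only builds the command lines. `input tap`, `input text`, and `am start -a/-d/-n` are standard Android shell commands, whereas the information base `SIMULATED_DATA`, its contents, and the function names are hypothetical, and the source does not state that adb is the mechanism used for driving.

```python
# Hypothetical information base: description keyword -> simulated input data.
SIMULATED_DATA = {
    "account": "test_user",
    "password": "Passw0rd",
    "search": "weather",
}


def click_cmd(x, y):
    """Click-class event: tap the screen at coordinates (x, y)."""
    return ["adb", "shell", "input", "tap", str(x), str(y)]


def input_cmds(description, x, y):
    """Input-class event: focus the text box at (x, y), then type matching data."""
    text = next((value for key, value in SIMULATED_DATA.items()
                 if key in description.lower()), "test")
    return [click_cmd(x, y), ["adb", "shell", "input", "text", text]]


def intent_cmd(action, data_uri, component):
    """Inter-process event: send an Intent with the given action and data."""
    return ["adb", "shell", "am", "start",
            "-a", action, "-d", data_uri, "-n", component]
```

Each list can be passed to `subprocess.run` when a device is attached; keeping them as data makes it easy to skip events that have already been driven.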
The automated driving provided by the present disclosure improves coverage of the application program while avoiding repeatedly driving the same event.
S230, according to the mapping relation, based on instrumentation and kernel monitoring, the behavior of the target application program for acquiring and/or using the information asset in the driving event is acquired.
The method specifically comprises the following steps:
The monitoring system is implanted into the Android framework, instrumentation points are set on the interfaces of the target application program for monitoring, and, in response to a monitored interface being called, the call is reported to a log record service and persistently stored.
The monitoring system is an interface behavior monitoring engine (an engine is a core component of a program or system). Following the open-source design of the Android system, it is implanted into the Android framework by means of dynamic proxies and monitors the instrumentation points set on the interfaces, covering the Android system from top to bottom. When a monitored interface is called, the call is immediately reported to the log record service for persistent storage. This approach keeps the monitoring system non-invasive to the original framework and leaves applications completely unaware of it, so that the monitoring engine can truly and effectively monitor all privacy information acquisition interfaces in real time. The engine covers, and monitors in real time, application programs' acquisition of information assets such as camera shooting, microphone recording, geographic location acquisition, and the clipboard.
Referring to fig. 3, a comparison diagram of calling a generic interface and calling a monitored interface is provided according to an embodiment of the present disclosure.
When a monitored interface or behavior is called, it is immediately reported to the log record service for persistent storage. Call records can be displayed in real time, interface coverage is comprehensive, and statistical records can be exported to a PC. The behaviors monitored by the engine can be mapped to system interfaces, and the recorded content includes the access subject, access object, time of occurrence, raw data, behavior description, and so on. The engine monitors by passive triggering at the system level: it can monitor calls to or accesses of system interfaces in real time and generate the corresponding log records, the monitoring cannot be stopped or bypassed, and the monitoring results are consistent with the actual execution results of the system.
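A log record with the fields listed above could be modeled as follows. This is a minimal sketch: the class `CallRecord`, its field names, and the JSON-lines file format are assumptions made for illustration, since the source does not specify how the log record service stores its data.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class CallRecord:
    subject: str        # access subject, e.g. the calling package
    obj: str            # access object, e.g. the monitored interface
    timestamp: float    # time of occurrence (seconds since the epoch)
    raw_data: str       # raw data captured at the instrumentation point
    description: str    # human-readable behavior description


def persist(record, path):
    """Append one record as a JSON line so the log survives restarts."""
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps(asdict(record), ensure_ascii=False) + "\n")


def load(path):
    """Read every stored record back, e.g. for export or comparison."""
    with open(path, encoding="utf-8") as fh:
        return [CallRecord(**json.loads(line)) for line in fh if line.strip()]
```

An append-only file keeps each report independent, so a crash between two calls cannot corrupt records that were already written.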
The engine is adapted to two different virtual machines so as to be compatible with all Android framework versions: an adaptation based on the Dalvik virtual machine and an adaptation based on the ART virtual machine.
Adaptation scheme based on Dalvik virtual machine:
Devices running Android versions earlier than 4.4 use the Dalvik virtual machine to interpret and execute bytecode. To adapt to the Dalvik virtual machine, the system changes every sensitive information acquisition interface into a JNI method and then binds that JNI method to a custom local processing function of the system. The system obtains all sensitive information acquisition interface methods and passes each method, together with its index value in the Dalvik virtual machine, to a Native-layer hosting function preset by the system. The system thereby takes over the calls of all monitored behaviors and monitors calls to the privacy information acquisition interfaces in real time.
When a monitored privacy information acquisition interface is called, the Dalvik virtual machine interprets and executes the code. When a Java-layer method enters the Dalvik virtual machine, it is converted into an instance of the Method type, and the virtual machine checks that instance dynamically. If the instance is found to be a JNI method, the bound Native Func function is called directly, which enters the Native-layer hosting function preset by the system. This function forwards the parameters of the monitored method, persistently stores the call log, saves the return value once the method has finished executing, and returns it to the caller of the original interface, which completes the dynamic monitoring of the monitored behavior.
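The takeover logic of the hosting function (forward the parameters, store the call log, save and return the result) can be illustrated with a plain Python wrapper. This is only an analogy under stated assumptions: the real mechanism rebinds JNI methods inside the virtual machine, and the names `monitored`, `CALL_LOG`, and `get_last_known_location` are hypothetical.

```python
import functools
import time

CALL_LOG = []   # stand-in for the log record service


def monitored(interface_name):
    """Wrap a function so each call is logged before its result is returned."""
    def decorator(func):
        @functools.wraps(func)
        def hosting(*args, **kwargs):
            result = func(*args, **kwargs)       # forward the parameters
            CALL_LOG.append({                    # store the call log
                "interface": interface_name,
                "args": args,
                "time": time.time(),
                "return": result,                # saved return value
            })
            return result                        # hand back to the original caller
        return hosting
    return decorator


@monitored("LocationManager.getLastKnownLocation")
def get_last_known_location(provider):
    return (39.9, 116.4)   # dummy coordinates for illustration
```

The caller sees the same return value as before, which matches the requirement that the monitored application stays unaware of the monitoring.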
An adaptation scheme based on an ART virtual machine:
Since Android 5.0, the Android system has formally adopted the ART virtual machine, and the Dalvik virtual machine has become history. All mobile phones currently on the market run in ART mode, so adapting to the ART virtual machine is very important.
The adaptation scheme for the ART virtual machine is more complicated than that for the Dalvik virtual machine: the source code of the ART virtual machine must be modified and recompiled, and the modified libart.so replaces the original ART virtual machine.
The ART virtual machine runs in AOT (ahead-of-time) compilation mode to improve execution efficiency. The bytecode contained in the APK is converted into machine instructions during application installation, so that at run time the application directly executes the machine instructions corresponding to each method, which is more efficient than the Dalvik virtual machine with JIT (just-in-time) compilation.
Similar to the Dalvik adaptation scheme, the present disclosure hands all monitored privacy acquisition interface methods to a preset Native-layer takeover function. This function uses the ArtMethod conversion function provided by the framework to convert the Java-layer reflection object, a Method instance, into an ArtMethod instance, and at the same time creates a copy of the object so that the original method can still be used normally afterwards. It then resets the memory address of the ArtMethod instance's assembly code so that it points to a fixed code segment. That segment was originally used to process Java dynamic code Method functions; the present system modifies it and adds real-time monitoring logic, thereby realizing dynamic monitoring of the monitored behavior.
Not all privacy data acquisition interfaces provided by Java SDKs can be intercepted and captured at the framework layer, so monitoring of some interfaces would be missed. To make interface monitoring complete, a dynamic binary instrumentation technique is adopted to implement the dynamic behavior monitoring engine for mobile application programs. The engine can identify the sensitive permissions and sensitive behaviors of an application program, and it can also identify the SDKs embedded in the application program and their sensitive permissions.
S130, acquiring the user agreement of the target application program, and extracting, based on a pre-constructed and trained semantic analysis model, the terms of the user agreement that relate to the acquisition and/or use of user information.
The method specifically comprises the following steps:
Each application program has a specific privacy policy or privacy agreement (e.g., the privacy agreement for user registration and use of the application), which is preprocessed to obtain a textual feature representation before being input into the semantic analysis model. The preprocessing comprises sentence splitting, word segmentation, stop-word removal, and so on.
Sentence splitting: for privacy policy analysis, accurately detecting sentence boundaries is critical to splitting a privacy policy description into sentences. The present disclosure distinguishes two kinds of sentence boundary: one is marked by punctuation such as "。", "！" and "？", and the other by bullets such as "1" and "(1)". Sentence splitting divides the sections of a privacy policy into individual sentences for the subsequent word segmentation and stop-word removal.
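The two kinds of boundary can be handled with regular expressions, as in the sketch below. It assumes that each bulleted item starts on its own line and that bullets look like "1.", "1、", "(1)" or "（1）"; the exact patterns and the function name `split_sentences` are illustrative assumptions rather than the rules used by the disclosure.

```python
import re

# Split after sentence-ending punctuation (full-width and ASCII forms).
_END_MARKS = re.compile(r"(?<=[。！？!?])")
# Leading bullets such as "1." / "1、" / "(1)" / "（1）".
_BULLET = re.compile(r"^\s*(?:[\(（]\d+[\)）]|\d+[\.、])\s*")


def split_sentences(text):
    """Split privacy policy text into sentences, dropping bullet markers."""
    sentences = []
    for line in text.splitlines():
        for piece in _END_MARKS.split(line):
            piece = _BULLET.sub("", piece).strip()
            if piece:
                sentences.append(piece)
    return sentences
```

The lookbehind keeps the punctuation attached to the sentence it ends, which preserves the original wording for the later segmentation step.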
Word segmentation: unlike English, Chinese text has no explicit delimiters between words, so it must be segmented according to the characteristics of its words. Word segmentation is a basic step in text processing, and a mature segmentation algorithm yields a better representation of sentence meaning. Privacy policy text differs from other text in that it contains many fixed collocations for private data and functions; "face recognition", for example, is a fixed collocation and should not be split. The word segmentation tool used in the present disclosure is Jieba, which performs well on Chinese. It offers the following three modes:
Accurate mode: the cut function of Jieba is called with two parameters, the input sentence and cut_all, and cut_all is set to False, meaning that full mode is not used.
Full mode: the cut function of Jieba is called with the same two parameters as in accurate mode; the first is the input sentence, and the second, cut_all, is set to True, meaning that full mode is used.
Search engine mode: the cut_for_search function of Jieba is called with a single parameter, the sentence to be segmented.
Stop-word removal: to avoid the influence of irrelevant words, word segmentation is typically followed by stop-word removal. In Chinese privacy policies, certain words (e.g., "we", "you") are very common but carry no specific meaning in a sentence. If they are not removed, they add noise to the subsequent text classification. The present disclosure therefore uses the Harbin Institute of Technology (HIT) stop-word list to filter irrelevant words out of the permission information by string matching, which avoids introducing noise into the subsequent privacy policy classification task. An analysis of a large number of privacy policies gives the five most frequent stop words shown in Table 1.
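Filtering by string matching against a stop-word list takes only a few lines. The sketch below assumes the list is available as text with one word per line; the four Chinese words in `SAMPLE_LIST` are an illustrative guess at typical entries, not the actual contents of the stop-word list named in the text, and the function names are hypothetical.

```python
# Hypothetical excerpt of a stop-word list, one word per line.
SAMPLE_LIST = "我们\n您\n和\n等\n"


def load_stop_words(lines):
    """Build a set of stop words from an iterable of text lines."""
    return {line.strip() for line in lines if line.strip()}


def remove_stop_words(tokens, stop_words):
    """Keep only the tokens that do not exactly match a stop word."""
    return [token for token in tokens if token not in stop_words]


STOP_WORDS = load_stop_words(SAMPLE_LIST.splitlines())
```

Using a set makes each membership check constant-time, which matters when many privacy policies are processed in bulk.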
Table 1. Top 5 most frequent stop words in privacy policies
Stop word                    Word frequency
We                           8782
You                          7801
And                          6535
Etc.                         518
And is also provided with    3487
When the model is trained with phrases as the basic unit, word-level features are quite sparse because a phrase contains few words, and they cannot reflect the complete semantics of a privacy permission. Furthermore, static word vectors such as Word2vec cannot easily distinguish feature words according to context, especially in the case of polysemy. The present disclosure therefore adopts the dynamic word vector model BERT, which takes characters as processing units and captures context through the attention mechanism.
The semantic analysis model is used in the present disclosure to determine the terms related to permissions, a process called permission filtering, which divides the phrases in a privacy policy into permission-related and permission-unrelated phrases. Assume a set of $N$ privacy policy phrases $\{(X_1, Y_1), (X_2, Y_2), \ldots, (X_N, Y_N)\}$, where $X_t$ ($t = 1, 2, \ldots, N$) represents one permission phrase and $Y_t \in \{0, 1\}$ represents the label of that phrase: 1 denotes a permission-related phrase and 0 a permission-unrelated phrase.
The pre-trained BERT model first loads the pre-trained parameters as the initial parameters of the model and then obtains the feature representation $V_t$ of each phrase, where $V_t \in \mathbb{R}^n$ is an n-dimensional feature vector (word vector). A downstream task then fits a function $\hat{Y}_t = f(V_t)$, where $\hat{Y}_t$ is the predicted label; for the binary classification task it is output by judging whether the predicted probability is greater than 0.5 (the default value). Finally, the loss function $L(Y_t, \hat{Y}_t)$ is minimized to obtain the optimal parameters. Through this learning process, the permission filter obtains the optimal model parameters and can then correctly identify permission phrases.
Authority-irrelevant information is filtered out by a classification task built on the BERT model. The BERT-base Chinese pre-trained model is selected; its network has 12 layers, a hidden size of 768, and 12 attention heads, for a total of 110M parameters. The model consists of three parts: first, an input layer, whose main function is to feed the privacy phrases produced by data preprocessing into the model as training units; second, the BERT layer, which encodes each privacy phrase to obtain the sentence vector $H_{[CLS]}$; finally, an output layer, which performs authority-related classification through a fully connected layer and the softmax function.
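The output layer described above (a fully connected layer followed by softmax over the sentence vector) can be illustrated in plain Python. The 4-dimensional vector and the weights below are toy values standing in for the 768-dimensional $H_{[CLS]}$ and trained parameters; they are assumptions for illustration only.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]


def classify(h_cls, weights, bias):
    """Apply one fully connected layer to the sentence vector, then softmax.

    weights holds one row per class; each row has len(h_cls) entries.
    Returns the index of the most probable class and the class probabilities.
    """
    logits = [sum(w * h for w, h in zip(row, h_cls)) + b
              for row, b in zip(weights, bias)]
    probs = softmax(logits)
    return probs.index(max(probs)), probs


# Toy 4-dimensional stand-in for the 768-dimensional sentence vector.
h_cls = [0.5, -1.0, 0.25, 2.0]
weights = [[0.1, 0.2, 0.0, -0.3],    # class 0: rights-unrelated
           [-0.1, -0.2, 0.0, 0.3]]   # class 1: rights-related
bias = [0.0, 0.0]
label, probs = classify(h_cls, weights, bias)
```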
S140, comparing the behavior of the target application program for acquiring and/or using the information asset with the protocol related to the user information acquisition and/or use, and identifying the illegal behavior in the behavior of the target application program for acquiring and/or using the information asset.
The method specifically comprises the following steps:
the asset-use behavior generated while the application program actually runs is compared with the asset-use behavior declared in the application program's user protocol, and the inconsistencies between the two types of data are used to identify whether the application program uses assets illegally.
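Under the simplifying assumption that both the observed behaviors and the declared behaviors have been normalized to asset names, the comparison reduces to a set difference. The asset names and the function name below are hypothetical.

```python
def find_violations(observed, declared):
    """Return the observed asset uses that the user protocol does not declare."""
    return sorted(set(observed) - set(declared))


# Assets the application was seen to access at run time (hypothetical).
observed_assets = {"location", "contacts", "device_id"}
# Assets the user protocol states the application will use (hypothetical).
declared_assets = {"location", "camera"}

violations = find_violations(observed_assets, declared_assets)
```

Here `contacts` and `device_id` are flagged because they were accessed but never declared, while the declared but unused `camera` is not a violation.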
As can be seen from the above, the present disclosure provides a mobile terminal information asset usage behavior recognition method for an application, including: determining information assets in the mobile terminal related to the user sensitive information; based on instrumentation and kernel monitoring, acquiring the behavior of a target application program to acquire and/or use information assets; acquiring a user protocol of the target application program, and acquiring a protocol related to user information acquisition and/or use in the user protocol based on a pre-constructed and trained semantic analysis model; comparing the behavior of the target application program to acquire and/or use the information asset with the protocol related to user information acquisition and/or use, and identifying violations in that behavior. Through instrumentation and kernel monitoring, the present disclosure obtains the behavior by which the application program actually uses information assets in the mobile terminal and compares it with the authority declared for the application program in the user protocol, so that whether the application program commits violations exceeding the declared authority can be determined accurately and efficiently.
Furthermore, when the application program is driven automatically, driving events are constructed based on the human-computer interaction interface, which realizes automated driving of the target application program. This driving method does not depend on the source code of the application program to construct click events and input events, and it can also support automated simulated driving of inter-process communication events.
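One way to picture how driving events connect interfaces is the UI call graph UCG = (V, E) recited in the claims, in which an edge from UI_k to UI_w is recorded whenever an event causes that jump. The sketch below builds such a graph and lists the interfaces reachable from a starting page; the page names, event labels, and function names are hypothetical, and breadth-first traversal is one possible exploration order rather than the disclosed one.

```python
from collections import defaultdict, deque


def build_ucg(events):
    """Build UCG = (V, E) from (source_ui, event, target_ui) triples."""
    vertices = set()
    edges = defaultdict(set)
    for src, _event, dst in events:
        vertices.update((src, dst))
        edges[src].add(dst)
    return vertices, edges


def reachable(edges, start):
    """Breadth-first search: interfaces reachable from start via driving events."""
    seen = {start}
    queue = deque([start])
    while queue:
        ui = queue.popleft()
        for nxt in sorted(edges.get(ui, ())):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


# Hypothetical driving events observed while exploring an application.
observed_events = [
    ("Login", "click:submit", "Home"),
    ("Home", "click:settings", "Settings"),
    ("Home", "click:profile", "Profile"),
    ("Help", "click:back", "Home"),
]
ucg_vertices, ucg_edges = build_ucg(observed_events)
covered = reachable(ucg_edges, "Login")
```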
It should be noted that the method of the embodiments of the present disclosure may be performed by a single device, such as a computer or a server. The method of the embodiment can also be applied to a distributed scene, and is completed by mutually matching a plurality of devices. In the case of such a distributed scenario, one of the devices may perform only one or more steps of the methods of embodiments of the present disclosure, the devices interacting with each other to accomplish the methods.
It should be noted that the foregoing describes some embodiments of the present disclosure. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims may be performed in a different order than in the embodiments described above and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing are also possible or may be advantageous.
Based on the same inventive concept, the present disclosure also provides a mobile terminal information asset usage behavior recognition device for an application corresponding to the method of any embodiment.
Referring to fig. 4, the mobile terminal information asset usage behavior recognition device for an application program includes:
an information asset determination module 310 configured to determine information assets in the mobile terminal that are related to the user-sensitive information;
an application behavior acquisition module 320 configured to acquire the behavior of an application to acquire and/or use the information asset based on instrumentation and kernel monitoring;
a related protocol determining module 330 configured to obtain a user protocol of the application program, and obtain a protocol related to user information collection and/or use in the user protocol based on a pre-constructed and trained semantic analysis model;
a violation determination module 340 configured to compare the behavior of the application acquiring and/or using the information asset with the protocol related to user information acquisition and/or use, and identify violations in the behavior of the application acquiring and/or using the information asset.
For convenience of description, the above devices are described as being functionally divided into various modules, respectively. Of course, the functions of the various modules may be implemented in the same one or more pieces of software and/or hardware when implementing the present disclosure.
The device of the foregoing embodiment is configured to implement the corresponding method for identifying the usage behavior of the mobile terminal information asset for the application in any of the foregoing embodiments, and has the beneficial effects of the corresponding method embodiment, which are not described again here.
Based on the same inventive concept, the present disclosure also provides an electronic device corresponding to the method of any embodiment, which includes a memory, a processor, and a computer program stored on the memory and capable of running on the processor, where the processor implements the method for identifying usage behavior of information assets of a mobile terminal for an application according to any embodiment when executing the program.
Fig. 5 shows a more specific hardware architecture of an electronic device according to this embodiment, where the device may include: a processor 1010, a memory 1020, an input/output interface 1030, a communication interface 1040, and a bus 1050. Wherein processor 1010, memory 1020, input/output interface 1030, and communication interface 1040 implement communication connections therebetween within the device via a bus 1050.
The processor 1010 may be implemented by a general-purpose CPU (Central Processing Unit), a microprocessor, an application-specific integrated circuit (ASIC), or one or more integrated circuits, and executes the related programs to implement the technical solutions provided in the embodiments of the present disclosure.
The memory 1020 may be implemented in the form of ROM (Read-Only Memory), RAM (Random Access Memory), a static storage device, a dynamic storage device, or the like. The memory 1020 may store an operating system and other application programs; when the embodiments of the present specification are implemented in software or firmware, the associated program code is stored in the memory 1020 and executed by the processor 1010.
The input/output interface 1030 is used to connect with an input/output module for inputting and outputting information. The input/output module may be configured as a component in a device (not shown) or may be external to the device to provide corresponding functionality. Wherein the input devices may include a keyboard, mouse, touch screen, microphone, various types of sensors, etc., and the output devices may include a display, speaker, vibrator, indicator lights, etc.
Communication interface 1040 is used to connect communication modules (not shown) to enable communication interactions of the present device with other devices. The communication module may implement communication through a wired manner (such as USB, network cable, etc.), or may implement communication through a wireless manner (such as mobile network, WIFI, bluetooth, etc.).
Bus 1050 includes a path for transferring information between components of the device (e.g., processor 1010, memory 1020, input/output interface 1030, and communication interface 1040).
It should be noted that although the above-described device only shows processor 1010, memory 1020, input/output interface 1030, communication interface 1040, and bus 1050, in an implementation, the device may include other components necessary to achieve proper operation. Furthermore, it will be understood by those skilled in the art that the above-described apparatus may include only the components necessary to implement the embodiments of the present description, and not all the components shown in the drawings.
The electronic device of the foregoing embodiment is configured to implement the corresponding method for identifying the usage behavior of the mobile terminal information asset for the application in any of the foregoing embodiments, and has the beneficial effects of the corresponding method embodiment, which are not described again here.
Based on the same inventive concept, corresponding to any of the above-described embodiment methods, the present disclosure further provides a non-transitory computer-readable storage medium storing computer instructions for causing the computer to perform the mobile terminal information asset usage behavior identification method for an application according to any of the above-described embodiments.
The computer-readable media of the present embodiments include permanent and non-permanent, removable and non-removable media, and may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device.
The storage medium of the foregoing embodiment stores computer instructions for causing the computer to execute the method for identifying the usage behavior of the mobile terminal information asset for the application according to any one of the foregoing embodiments, and has the beneficial effects of the corresponding method embodiments, which are not described again here.
It should be noted that the embodiments of the present disclosure may be further described in the following manner:
a mobile terminal information asset usage behavior recognition method for an application, comprising:
determining information assets in the mobile terminal related to the user sensitive information;
based on instrumentation and kernel monitoring, acquiring the behavior of an application program to acquire and/or use the information asset;
acquiring a user protocol of the application program, and acquiring a protocol related to user information acquisition and/or use in the user protocol based on a pre-constructed and trained semantic analysis model;
comparing the behavior of the application to acquire and/or use the information asset with the protocol associated with the user's acquisition and/or use of information, and identifying violations in the behavior of the application to acquire and/or use the information asset.
Optionally, after determining the information asset related to the user sensitive information in the mobile terminal, the method further includes:
a mapping relationship between the behavior of acquiring and/or using the information asset and the interface and rights of the application is determined.
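A mapping of this kind can be held in a simple lookup table. The sketch below is illustrative only: the Android class, method, and permission names are real, but which interfaces to include, the behavior keys, and the function name are assumptions, and the permission required by a given interface can vary across Android versions.

```python
# Illustrative mapping: behavior -> (information asset, framework interface, permission).
# The selection of entries and the behavior keys are assumptions for this sketch.
BEHAVIOR_MAP = {
    "read_device_id": ("device identifier",
                       "android.telephony.TelephonyManager.getDeviceId",
                       "android.permission.READ_PHONE_STATE"),
    "read_location": ("location",
                      "android.location.LocationManager.getLastKnownLocation",
                      "android.permission.ACCESS_FINE_LOCATION"),
    "read_contacts": ("contacts",
                      "android.content.ContentResolver.query",
                      "android.permission.READ_CONTACTS"),
}


def interfaces_to_monitor(behavior_map):
    """Invert the mapping so that a monitored interface resolves to its behavior."""
    return {iface: behavior
            for behavior, (_asset, iface, _perm) in behavior_map.items()}


monitor_table = interfaces_to_monitor(BEHAVIOR_MAP)
```

Inverting the table gives the monitoring side a direct lookup from a called interface to the asset-use behavior it represents.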
Optionally, the act of acquiring, based on instrumentation and kernel monitoring, the information asset acquired and/or used by the application program includes:
Installing the application program into the mobile terminal;
constructing a driving event based on a human-computer interaction interface, and realizing automatic driving of the application program;
and according to the mapping relation, based on instrumentation and kernel monitoring, acquiring the behavior of the application program for acquiring and/or using the information asset in the driving event.
Optionally, the man-machine interaction interface comprises a virtual key and an input window;
the constructing of a driving event based on the man-machine interaction interface to realize the automatic driving of the application program comprises:
in response to determining that the input window exists in the man-machine interaction interface, acquiring description information of the input window and coordinate information of a text box corresponding to the input window, and acquiring simulation data corresponding to the description information to construct an input-type driving event, so as to realize a simulated input operation;
and in response to determining that the virtual key exists in the man-machine interaction interface, acquiring coordinate information of the virtual key to construct a click-type driving event, so as to realize a simulated click operation.
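The two branches above (input windows and virtual keys) can be sketched as a function that turns the widgets found on a page into driving events. The `Widget` and `DriveEvent` types, the `SIMULATED_DATA` table, and the fallback text are all hypothetical; a real driver would read widget descriptions and coordinates from the device's UI hierarchy.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Widget:
    kind: str               # "input" (input window) or "button" (virtual key)
    x: int
    y: int
    description: str = ""   # description information of an input window


@dataclass
class DriveEvent:
    action: str             # "input" or "click"
    x: int
    y: int
    text: Optional[str] = None


# Hypothetical simulation data keyed by input-window description.
SIMULATED_DATA = {"phone": "13800000000", "username": "test_user"}


def build_events(widgets: List[Widget]) -> List[DriveEvent]:
    """Construct one driving event per recognized widget."""
    events = []
    for w in widgets:
        if w.kind == "input":
            # Pick simulation data matching the description, else a fallback.
            text = SIMULATED_DATA.get(w.description, "test")
            events.append(DriveEvent("input", w.x, w.y, text))
        elif w.kind == "button":
            events.append(DriveEvent("click", w.x, w.y))
    return events


page = [Widget("input", 100, 200, "phone"), Widget("button", 100, 400)]
page_events = build_events(page)
```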
Optionally, the obtaining, according to the mapping relationship, based on instrumentation and kernel monitoring, the behavior of the application program for obtaining and/or using the information asset in the driving event includes:
implanting the monitoring system into the Android framework, setting instrumentation points (hooks) on the interfaces of the application program to be monitored, and, in response to a monitored interface being called, reporting the call to a log record service for persistent storage.
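The report-on-call pattern can be illustrated by analogy in Python, where a decorator plays the role of the instrumentation point. This is only a conceptual sketch: the disclosed monitoring lives inside the Android framework, whereas here `LOG_RECORDS` is a stand-in for the log record service, `persist` writes JSON lines to an arbitrary file, and the monitored function returns dummy data.

```python
import functools
import json
import time

LOG_RECORDS = []  # stand-in for the log record service


def monitored(interface_name):
    """Wrap a function so every call is reported before the original runs."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            LOG_RECORDS.append({"interface": interface_name, "time": time.time()})
            return func(*args, **kwargs)
        return wrapper
    return decorator


def persist(records, path):
    """Persist the reported calls as JSON lines, one record per line."""
    with open(path, "w", encoding="utf-8") as fh:
        for rec in records:
            fh.write(json.dumps(rec) + "\n")


@monitored("LocationManager.getLastKnownLocation")
def get_last_known_location():
    """Dummy monitored interface returning fixed coordinates."""
    return (39.9, 116.4)
```

Each call to `get_last_known_location()` appends one record to `LOG_RECORDS`, which `persist` can then write out for later comparison against the user protocol.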
Optionally, the acquiring the user protocol of the application program, and based on a semantic analysis model, acquiring a protocol related to user information acquisition and/or use in the user protocol, further includes:
preprocessing a user protocol of the application program; the preprocessing comprises sentence segmentation, word segmentation and stop-word removal.
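The first preprocessing step, sentence segmentation, can be sketched with a regular expression that cuts the text after sentence-ending punctuation. The punctuation set and function name are assumptions; word segmentation of Chinese text would additionally require a dedicated tokenizer, which is outside this sketch.

```python
import re

# A sentence is a run of non-terminator characters plus an optional terminator
# (Chinese or Western full stop equivalents, question/exclamation marks, semicolons).
_SENTENCE = re.compile(r"[^。！？；!?;]+[。！？；!?;]?")


def split_sentences(text):
    """Split a user protocol into sentences, dropping empty fragments."""
    return [s.strip() for s in _SENTENCE.findall(text) if s.strip()]


# "We collect your location information. We will not sell your data!"
policy = "我们收集您的位置信息。我们不会出售您的数据！"
sentences = split_sentences(policy)
```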
Optionally, the method further comprises:
constructing a sample set comprising a number of samples; wherein each sample comprises sample data and tag data; the sample data includes a training user protocol; the tag data includes the protocol related to user information acquisition and/or use that corresponds to the training user protocol;
and constructing and training to obtain the semantic analysis model through a preset natural language processing algorithm according to the sample set.
A mobile terminal information asset usage behavior recognition device for an application, comprising:
An information asset determination module configured to determine information assets in the mobile terminal that are related to the user-sensitive information;
an application behavior acquisition module configured to acquire behavior of an application to acquire and/or use the information asset based on instrumentation and kernel monitoring;
the related protocol determining module is configured to acquire a user protocol of the application program, and acquire a protocol related to user information acquisition and/or use in the user protocol based on a pre-constructed and trained semantic analysis model;
and a violation determination module configured to compare the behavior of the application program for acquiring and/or using the information asset with the protocol related to user information acquisition and/or use, and identify violations in the behavior of the application program for acquiring and/or using the information asset.
An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing a method as described above when executing the program.
A non-transitory computer readable storage medium storing computer instructions for causing a computer to perform the above-described method.
Those of ordinary skill in the art will appreciate that the discussion of any of the embodiments above is merely exemplary and is not intended to suggest that the scope of the disclosure, including the claims, is limited to these examples; under the idea of the present disclosure, the technical features of the above embodiments or of different embodiments may also be combined, the steps may be implemented in any order, and there are many other variations of the different aspects of the embodiments of the present disclosure as described above, which are not provided in detail for the sake of brevity.
Additionally, well-known power/ground connections to Integrated Circuit (IC) chips and other components may or may not be shown within the provided figures, in order to simplify the illustration and discussion, and so as not to obscure the embodiments of the present disclosure. Furthermore, the devices may be shown in block diagram form in order to avoid obscuring the embodiments of the present disclosure, and this also accounts for the fact that specifics with respect to implementation of such block diagram devices are highly dependent upon the platform on which the embodiments of the present disclosure are to be implemented (i.e., such specifics should be well within purview of one skilled in the art). Where specific details (e.g., circuits) are set forth in order to describe example embodiments of the disclosure, it should be apparent to one skilled in the art that embodiments of the disclosure can be practiced without, or with variation of, these specific details. Accordingly, the description is to be regarded as illustrative in nature and not as restrictive.
While the present disclosure has been described in conjunction with specific embodiments thereof, many alternatives, modifications, and variations of those embodiments will be apparent to those skilled in the art in light of the foregoing description. For example, other memory architectures (e.g., dynamic RAM (DRAM)) may use the embodiments discussed.
The disclosed embodiments are intended to embrace all such alternatives, modifications and variances which fall within the broad scope of the appended claims. Accordingly, any omissions, modifications, equivalents, improvements, and the like, which are within the spirit and principles of the embodiments of the disclosure, are intended to be included within the scope of the disclosure.

Claims (8)

1. A mobile terminal information asset usage behavior recognition method for an application, comprising:
determining information assets in the mobile terminal related to the user sensitive information;
based on instrumentation and kernel monitoring, acquiring the behavior of a target application program to acquire and/or use the information asset; the method specifically comprises the following steps: installing the target application program into the mobile terminal; constructing a driving event based on a human-computer interaction interface to realize automatic driving of the target application program, which specifically comprises: representing the relation between the human-computer interaction interface UI and the driving event by using a human-computer interaction interface call graph UCG, wherein if the driving of an event causes the human-computer interaction interface $UI_k$ to jump to the human-computer interaction interface $UI_w$, then there is an edge $\langle UI_k, UI_w \rangle \in E$, V represents all interfaces of the application program, E represents the set of all events, and the human-computer interaction interface call graph $UCG = (V, E) = \{\langle UI_k, UI_w \rangle \mid UI_k, UI_w \in V\}$; determining a mapping relationship between the behavior of acquiring and/or using the information asset and the interface and rights of the target application; acquiring, according to the mapping relationship and based on instrumentation and kernel monitoring, the behavior of the target application program for acquiring and/or using the information asset in the driving event;
acquiring a user protocol of the target application program, and acquiring a protocol related to user information acquisition and/or use in the user protocol based on a pre-constructed and trained semantic analysis model;
and comparing the behavior of the target application program for acquiring and/or using the information asset with the protocol related to user information acquisition and/or use, and identifying the illegal behavior in the behavior of the target application program for acquiring and/or using the information asset.
2. The method of claim 1, wherein the human-machine interaction interface comprises virtual keys and an input window;
the constructing of a driving event based on the man-machine interaction interface to realize the automatic driving of the target application program comprises:
in response to determining that the input window exists in the man-machine interaction interface, acquiring description information of the input window and coordinate information of a text box corresponding to the input window, and acquiring simulation data corresponding to the description information to construct an input-type driving event, so as to realize a simulated input operation;
and in response to determining that the virtual key exists in the man-machine interaction interface, acquiring coordinate information of the virtual key to construct a click-type driving event, so as to realize a simulated click operation.
3. The method of claim 1, wherein the obtaining, based on instrumentation and kernel monitoring, the behavior of the target application to obtain and/or use the information asset in the driving event according to the mapping relationship comprises:
implanting the monitoring system into the Android framework, setting instrumentation points (hooks) on the interfaces of the target application program to be monitored, and, in response to a monitored interface being called, reporting the call to a log record service for persistent storage.
4. The method of claim 1, wherein the acquiring the user protocol of the target application program and acquiring a protocol related to user information collection and/or use in the user protocol based on a semantic analysis model further comprises:
preprocessing a user protocol of the target application program; the preprocessing comprises sentence segmentation, word segmentation and stop-word removal.
5. The method of claim 1, further comprising:
constructing a sample set comprising a number of samples; wherein each sample comprises sample data and tag data; the sample data includes a training user protocol; the tag data includes the protocol related to user information acquisition and/or use that corresponds to the training user protocol;
and constructing and training to obtain the semantic analysis model through a preset natural language processing algorithm according to the sample set.
6. A mobile terminal information asset usage behavior recognition device for an application, comprising:
an information asset determination module configured to determine information assets in the mobile terminal that are related to the user-sensitive information;
an application behavior acquisition module configured to acquire behaviors of a target application to acquire and/or use the information asset based on instrumentation and kernel monitoring; specifically configured to: install the target application program into the mobile terminal; construct a driving event based on the human-computer interaction interface to realize automatic driving of the target application program, and specifically configured to: represent the relation between the human-computer interaction interface UI and the driving event by using a human-computer interaction interface call graph UCG, wherein if the driving of an event causes the human-computer interaction interface $UI_k$ to jump to the human-computer interaction interface $UI_w$, then there is an edge $\langle UI_k, UI_w \rangle \in E$, V represents all interfaces of the application program, E represents the set of all events, and the human-computer interaction interface call graph $UCG = (V, E) = \{\langle UI_k, UI_w \rangle \mid UI_k, UI_w \in V\}$; determine a mapping relationship between the behavior of acquiring and/or using the information asset and the interface and rights of the target application; and acquire, according to the mapping relationship and based on instrumentation and kernel monitoring, the behavior of the target application program for acquiring and/or using the information asset in the driving event;
the related protocol determining module is configured to acquire a user protocol of the target application program, and acquire a protocol related to user information acquisition and/or use in the user protocol based on a pre-constructed and trained semantic analysis model;
and the illegal action determining module is configured to compare the action of the target application program for acquiring and/or using the information asset with the protocol related to the user information acquisition and/or use and identify the illegal action in the action of the target application program for acquiring and/or using the information asset.
7. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the method of any one of claims 1 to 5 when the program is executed.
8. A non-transitory computer readable storage medium storing computer instructions for causing a computer to perform the method of any one of claims 1 to 5.
CN202110866631.1A 2021-07-29 2021-07-29 Mobile terminal information asset use behavior identification method for application program Active CN113849785B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110866631.1A CN113849785B (en) 2021-07-29 2021-07-29 Mobile terminal information asset use behavior identification method for application program

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110866631.1A CN113849785B (en) 2021-07-29 2021-07-29 Mobile terminal information asset use behavior identification method for application program

Publications (2)

Publication Number Publication Date
CN113849785A CN113849785A (en) 2021-12-28
CN113849785B true CN113849785B (en) 2024-01-30

Family

ID=78975233

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110866631.1A Active CN113849785B (en) 2021-07-29 2021-07-29 Mobile terminal information asset use behavior identification method for application program

Country Status (1)

Country Link
CN (1) CN113849785B (en)

Also Published As

Publication number Publication date
CN113849785A (en) 2021-12-28

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant