CN112685072A - Method, device, equipment and storage medium for generating communication address knowledge base - Google Patents


Info

Publication number
CN112685072A
Authority
CN
China
Prior art keywords
communication address
application program
stored
target application
name
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011629917.XA
Other languages
Chinese (zh)
Other versions
CN112685072B (en
Inventor
张健
李超
石磊
孟宝权
王杰
杨满智
蔡琳
梁彧
田野
傅强
金红
陈晓光
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Eversec Beijing Technology Co Ltd
Original Assignee
Eversec Beijing Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Eversec Beijing Technology Co Ltd filed Critical Eversec Beijing Technology Co Ltd
Priority to CN202011629917.XA priority Critical patent/CN112685072B/en
Publication of CN112685072A publication Critical patent/CN112685072A/en
Application granted granted Critical
Publication of CN112685072B publication Critical patent/CN112685072B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • Y - GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 - TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D - CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 - Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Stored Programmes (AREA)

Abstract

The embodiment of the invention discloses a method, a device, equipment and a storage medium for generating a communication address knowledge base. The method comprises the following steps: analyzing a target application program through an application program analysis tool to obtain basic information of the target application program and all communication addresses corresponding to it; determining, according to the basic information and the main communication address and the supplementary communication address that correspond to the target application program and are stored in the communication address knowledge base, whether each communication address is a main communication address to be stored, that is, a main communication address that does not overlap with the main communication address already stored in the communication address knowledge base for the target application program; and storing the main communication address to be stored among the communication addresses in the communication address knowledge base. The embodiment of the invention can generate a relatively complete knowledge base of the communication addresses that provide the main support service for application programs, and provides a basis for subsequently determining the communication address that provides the main support service for an illegal application program, so that this communication address can be determined rapidly and accurately.

Description

Method, device, equipment and storage medium for generating communication address knowledge base
Technical Field
The embodiment of the invention relates to the technical field of computers, in particular to a method, a device, equipment and a storage medium for generating a communication address knowledge base.
Background
With the rapid development of the internet in China, the number of application programs (App) is rapidly increased, and the service content is continuously enriched. While the number of applications is increasing, application violations are becoming more and more serious. When an application violates a rule, the owner of the offending application needs to be located, i.e., the communication address that provides the primary support service for the offending application is determined.
In the related art, when an application program violates a rule, a technician typically analyzes all communication addresses for providing functional services to the application program, and determines a communication address for providing a main support service to the application program from all communication addresses for providing functional services to the application program. All communication addresses that provide functional services for an application include communication addresses that provide primary support services for the application and communication addresses that provide supplemental functionality for the application. With the continuous enrichment of the service content of the application program, the number of communication addresses providing the supplementary function for the application program is increased, so that the number of communication addresses providing the functional service for the application program is large, more manpower is required to be invested for analysis, and the communication address providing the main support service for the illegal application program is difficult to be determined quickly and accurately.
Disclosure of Invention
Embodiments of the present invention provide a method, an apparatus, a device, and a storage medium for generating a communication address knowledge base, which may generate a knowledge base including a communication address providing a main support service for an application, and provide a basis for subsequently determining a communication address providing a main support service for an illegal application, thereby quickly and accurately determining a communication address providing a main support service for an illegal application.
In a first aspect, an embodiment of the present invention provides a method for generating a communication address knowledge base, including:
analyzing a target application program through an application program analysis tool, and acquiring basic information of the target application program and all communication addresses corresponding to the target application program; the basic information comprises an application program name, an application program package name and an information abstract;
determining, according to the basic information and the main communication address and the supplementary communication address that correspond to the target application program and are stored in a communication address knowledge base, whether each communication address is a to-be-stored main communication address, namely a main communication address that does not overlap with the main communication address corresponding to the target application program stored in the communication address knowledge base; the main communication address is a communication address for providing main support service for the application program, and the supplementary communication address is a communication address for providing supplementary functions for the application program;
and storing the main communication address to be stored in each communication address in the communication address knowledge base.
In a second aspect, an embodiment of the present invention further provides a device for generating a communication address knowledge base, including:
the application program analysis module is used for analyzing a target application program through an application program analysis tool to acquire basic information of the target application program and all communication addresses corresponding to the target application program; the basic information comprises an application program name, an application program package name and an information abstract;
a main communication address determining module, configured to determine, according to the basic information and the main communication address and the supplementary communication address that correspond to the target application program and are stored in a communication address knowledge base, whether each of the communication addresses is a main communication address to be stored, namely a main communication address that does not overlap with the main communication address corresponding to the target application program stored in the communication address knowledge base; the main communication address is a communication address for providing main support service for the application program, and the supplementary communication address is a communication address for providing supplementary functions for the application program;
and the main communication address storage module is used for storing the main communication address to be stored in each communication address in the communication address knowledge base.
In a third aspect, an embodiment of the present invention further provides a computer device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor, when executing the computer program, implements the method for generating the communication address knowledge base according to the embodiment of the present invention.
In a fourth aspect, the embodiment of the present invention further provides a computer-readable storage medium, on which a computer program is stored, where the computer program is executed by a processor to implement the method for generating a knowledge base of communication addresses according to the embodiment of the present invention.
In the technical scheme of the embodiment of the invention, a target application program is analyzed through an application program analysis tool to obtain basic information of the target application program and all communication addresses corresponding to the target application program; then, according to the basic information and the main communication address and the supplementary communication address that correspond to the target application program and are stored in a communication address knowledge base, it is determined whether each communication address is a main communication address to be stored that does not overlap with the main communication address corresponding to the target application program stored in the communication address knowledge base; and the main communication address to be stored among the communication addresses is stored in the communication address knowledge base. In this way, the communication addresses corresponding to the application program can be analyzed more accurately and rapidly, the main communication address to be stored among them is identified and stored in the communication address knowledge base, and a relatively complete knowledge base containing the communication addresses that provide the main support service for application programs is generated. This provides good technical support for analyzing and grasping the relationship between application programs and the communication addresses that provide their main support service, and provides a basis for subsequently determining the communication address that provides the main support service for an illegal application program, so that this communication address can be determined rapidly and accurately.
Drawings
Fig. 1A is a flowchart of a method for generating a communication address knowledge base according to an embodiment of the present invention.
Fig. 1B is a flowchart of a method for acquiring basic information of a target application and all communication addresses corresponding to the target application according to an embodiment of the present invention.
Fig. 1C is a flowchart of a method for obtaining a top-level domain name corresponding to a target communication address according to an embodiment of the present invention.
Fig. 1D is a schematic diagram of a detection process of a target communication address according to an embodiment of the present invention.
Fig. 2 is a flowchart of a method for generating a communication address knowledge base according to a second embodiment of the present invention.
Fig. 3 is a schematic structural diagram of a device for generating a communication address knowledge base according to a third embodiment of the present invention.
Fig. 4 is a schematic structural diagram of a computer device according to a fourth embodiment of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not limiting of the invention.
It should be further noted that, for the convenience of description, only some but not all of the relevant aspects of the present invention are shown in the drawings. Before discussing exemplary embodiments in more detail, it should be noted that some exemplary embodiments are described as processes or methods depicted as flowcharts. Although a flowchart may describe the operations (or steps) as a sequential process, many of the operations can be performed in parallel, concurrently or simultaneously. In addition, the order of the operations may be re-arranged. The process may be terminated when its operations are completed, but may have additional steps not included in the figure. The processes may correspond to methods, functions, procedures, subroutines, and the like.
Example one
Fig. 1A is a flowchart of a method for generating a communication address knowledge base according to an embodiment of the present invention. The embodiment of the invention is applicable to generating a knowledge base containing the communication addresses that provide the main support service for application programs. The method can be executed by the device for generating the communication address knowledge base provided by the embodiment of the invention; the device can be implemented in software and/or hardware and can generally be integrated in a computer device, for example in a server. As shown in fig. 1A, the method of the embodiment of the present invention specifically includes:
step 101, analyzing a target application program through an application program analysis tool, and acquiring basic information of the target application program and all communication addresses corresponding to the target application program.
The basic information comprises an application program name, an application program package name and an information abstract. The application program name is the name of the application program. The application program package name (Package Name) is the unique identification of the target application program. The information abstract (message digest) is the Message-Digest Algorithm 5 (MD5) information corresponding to the target application program. The MD5 information corresponding to the target application program may be used to verify whether the target application program has been modified and whether it is safe.
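As a minimal sketch of how such an information abstract could be computed, the following Python snippet hashes an installation package file with the standard `hashlib` module. The function names `md5_of_file` and `md5_of_bytes` and the chunk size are assumptions made for illustration; the text does not state how the analysis tool computes the digest.

```python
import hashlib


def md5_of_bytes(data):
    """Return the hexadecimal MD5 digest of an in-memory byte string."""
    return hashlib.md5(data).hexdigest()


def md5_of_file(path, chunk_size=65536):
    """Return the hexadecimal MD5 digest of a file, read in chunks.

    Reading in chunks keeps memory use small for large installation
    packages. The chunk size is an arbitrary illustrative choice.
    """
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns b"" at end of file.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Comparing the digest of a package under analysis with a previously recorded digest shows whether the file content has changed; it does not by itself say anything about who changed it.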
An application program (App) is a computer program that performs one or more specific tasks, operates in user mode, can interact with a user, and has a visual user interface. In this embodiment of the present invention, the target application program may be any one or more application programs, which is not limited in this embodiment of the present invention. The application program analysis tool may be software for analyzing an application program, for example, the application program analysis software ExEinfo PE.
The communication address corresponding to the target application program is a communication address for providing a functional service for the target application program. Optionally, the communication address is in a Uniform Resource Locator (URL) format.
The target application program can establish communication connection with a background server for providing the functional service through a communication address for providing a certain functional service for the target application program, so that the function is realized. The communication address corresponding to the target application may include a primary communication address and a supplemental communication address.
The primary communication address corresponding to the target application is the communication address that provides the primary support service for the target application. The target application program can establish communication connection with a background server for providing main support service through the main communication address, so that the background server provides various service supports for the target application program, and the main functions of the target application program are realized.
The supplemental communication address corresponding to the target application is a communication address that provides supplemental functionality for the target application. The target application may have a plurality of Software Development Kits (SDKs) for providing supplemental functionality to the target application. Each software development kit provides different complementary functionality for the target application. Each software development kit has a corresponding supplemental communication address. The software development kit of the target application program can establish communication connection with a background server for providing a certain supplementary function through a supplementary communication address corresponding to the software development kit, so as to realize the supplementary function.
Optionally, analyzing the target application program through an application program analysis tool, and acquiring the basic information of the target application program and all communication addresses corresponding to the target application program, including: performing static analysis and dynamic analysis on a target application program through an application program analysis tool to obtain a static analysis result and a dynamic analysis result of the target application program; acquiring basic information of the target application program in the static analysis result; and acquiring all communication addresses corresponding to the target application program in the dynamic analysis result.
Optionally, the static analysis is to analyze information such as permission, components, sensitive functions and the like in the installation package file of the target application program by decompiling the installation package file of the target application program, so as to obtain a static analysis result of the target application program. The static analysis result of the target application program is information obtained by analysis in the static analysis process of the target application program. The static analysis result of the target application program comprises an application program name, an application program package name and an information abstract of the target application program.
Optionally, the dynamic analysis is to run the target application program on the simulator, then perform some operations on the target application program to trigger as many behaviors as possible, then output the log, and analyze information in the log through the script to obtain a dynamic analysis result of the target application program. The dynamic analysis result of the target application includes all communication addresses corresponding to the target application.
In an embodiment, fig. 1B is a flowchart of a method for acquiring basic information of a target application and all communication addresses corresponding to the target application according to an embodiment of the present invention. As shown in fig. 1B, the method specifically includes:
Step 1011, performing static analysis and dynamic analysis on the target application program through the application program analysis tool to obtain a static analysis result and a dynamic analysis result of the target application program.
Step 1012, acquiring basic information of the target application program in the static analysis result.
Optionally, the basic information of the target application program in the static analysis result is extracted.
Step 1013, in the dynamic analysis result, acquiring all communication addresses corresponding to the target application program.
Optionally, all communication addresses corresponding to the target application program in the dynamic analysis result are extracted.
Therefore, the static analysis and the dynamic analysis are carried out on the target application program through the application program analysis tool, the basic information of the target application program is obtained through the static analysis, and the network activity information of the target application program, namely all communication addresses corresponding to the target application program, is obtained through the dynamic analysis.
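The script-based log analysis mentioned for the dynamic analysis can be illustrated with a short sketch. It assumes, hypothetically, that the emulator log is plain text in which requested addresses appear as `http://` or `https://` URLs; the log format, the regular expression and the name `extract_urls` are illustrative and are not taken from the text.

```python
import re

# Hypothetical pattern: a URL runs until whitespace, a quote or an angle bracket.
URL_PATTERN = re.compile(r"https?://[^\s\"'<>]+")


def extract_urls(log_lines):
    """Return the distinct URLs found in the log lines, in first-seen order."""
    seen = {}
    for line in log_lines:
        for url in URL_PATTERN.findall(line):
            # Strip punctuation that commonly trails a URL inside a log message.
            seen.setdefault(url.rstrip(".,;)"), None)
    return list(seen)


sample_log = [
    "GET https://api.abc.com/v1/login 200",
    "connect http://sdk.example.net/track, retry",
    "GET https://api.abc.com/v1/login 200",
]
all_addresses = extract_urls(sample_log)
```

Duplicates are dropped so that each communication address is examined once in the later steps.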
Step 102, determining, according to the basic information and the main communication address and the supplementary communication address that correspond to the target application program and are stored in the communication address knowledge base, whether each communication address is a main communication address to be stored that does not overlap with the main communication address corresponding to the target application program stored in the communication address knowledge base.
The main communication address is a communication address for providing main support service for the application program, and the supplementary communication address is a communication address for providing supplementary functions for the application program.
The communication address knowledge base may include main communication addresses corresponding to a plurality of application programs.
Optionally, determining, according to the basic information and the main communication address and the supplementary communication address that correspond to the target application program and are stored in the communication address knowledge base, whether each communication address is a main communication address to be stored that does not overlap with the main communication address corresponding to the target application program stored in the communication address knowledge base, includes performing the following for each communication address corresponding to the target application program: acquiring a top-level domain name corresponding to the target communication address; when it is detected that neither the main communication address nor the supplementary communication address corresponding to the target application program stored in the communication address knowledge base contains the target communication address, calculating the similarity between the top-level domain name and the reverse-order result of the application program package name by using an edit distance algorithm; and when it is detected that the similarity is greater than a preset similarity threshold, determining that the target communication address is a to-be-stored main communication address that does not overlap with the main communication address corresponding to the target application program stored in the communication address knowledge base.
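The comparison between a domain name and the reverse-order result of the package name can be sketched as follows. Two points are assumptions of this sketch rather than statements of the text: the reverse order is taken over the dot-separated parts of the package name, and the edit distance (Levenshtein distance) is normalized by the length of the longer string to obtain a similarity between 0 and 1.

```python
def reverse_package_name(package_name):
    """Reverse the dot-separated parts, e.g. 'com.abc.app' -> 'app.abc.com'."""
    return ".".join(reversed(package_name.split(".")))


def edit_distance(a, b):
    """Levenshtein distance between two strings, computed row by row."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            substitution_cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,                      # delete char_a
                current[j - 1] + 1,                   # insert char_b
                previous[j - 1] + substitution_cost,  # substitute or match
            ))
        previous = current
    return previous[-1]


def edit_similarity(a, b):
    """Assumed normalization: 1 minus distance over the longer length."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(a, b) / longest
```

With this normalization, identical strings score 1.0 and strings that share no characters in any aligned position score close to 0, so a single preset threshold can be applied regardless of string length.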
Optionally, determining, according to the basic information and the main communication address and the supplementary communication address that correspond to the target application program and are stored in the communication address knowledge base, whether each communication address is a main communication address to be stored that does not overlap with the main communication address corresponding to the target application program stored in the communication address knowledge base, further includes: when it is detected that the similarity is less than or equal to the preset similarity threshold, acquiring the developer name of the target application program according to the basic information; performing a record query and a domain name query according to the top-level domain name, and acquiring a subject name, a website name and an owner name corresponding to the target communication address; calculating the similarity between the developer name and the subject name, the similarity between the developer name and the website name, and the similarity between the developer name and the owner name, respectively, by using a cosine similarity algorithm; calculating the proportion that the top-level domain name of the target communication address accounts for among the top-level domain names of all communication addresses corresponding to the target application program; performing a weighted calculation according to the proportion, the similarity between the developer name and the subject name, the similarity between the developer name and the website name, and the similarity between the developer name and the owner name to obtain a weighted calculation result; when it is detected that the weighted calculation result is greater than a preset result threshold, determining that the target communication address is a to-be-stored main communication address that does not overlap with the main communication address corresponding to the target application program stored in the communication address knowledge base; and when it is detected that the weighted calculation result is less than or equal to the preset result threshold, determining that the target communication address is not a to-be-stored main communication address that does not overlap with the main communication address corresponding to the target application program stored in the communication address knowledge base.
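The weighted calculation can be illustrated with a small sketch. Several details are assumptions, since the text does not fix them: names are turned into character-count vectors before the cosine similarity is taken, the proportion is the share of addresses whose domain equals that of the target address, and the four weights are equal. The names `cosine_similarity`, `domain_share` and `weighted_score` are hypothetical.

```python
import math
from collections import Counter


def cosine_similarity(name_a, name_b):
    """Cosine similarity of character-count vectors (assumed vectorization)."""
    vec_a, vec_b = Counter(name_a), Counter(name_b)
    dot = sum(vec_a[ch] * vec_b[ch] for ch in vec_a.keys() & vec_b.keys())
    norm_a = math.sqrt(sum(count * count for count in vec_a.values()))
    norm_b = math.sqrt(sum(count * count for count in vec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def domain_share(target_domain, all_domains):
    """Fraction of entries in all_domains equal to target_domain."""
    if not all_domains:
        return 0.0
    matches = sum(1 for domain in all_domains if domain == target_domain)
    return matches / len(all_domains)


def weighted_score(share, sim_subject, sim_website, sim_owner,
                   weights=(0.25, 0.25, 0.25, 0.25)):
    """Weighted sum of the four indicators; the equal weights are illustrative."""
    values = (share, sim_subject, sim_website, sim_owner)
    return sum(weight * value for weight, value in zip(weights, values))
```

The resulting score would then be compared with the preset result threshold; in practice both the weights and the threshold would be tuned to the service requirement.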
Optionally, determining, according to the basic information and the main communication address and the supplementary communication address that correspond to the target application program and are stored in the communication address knowledge base, whether each communication address is a main communication address to be stored that does not overlap with the main communication address corresponding to the target application program stored in the communication address knowledge base, further includes: when it is detected that the main communication address or the supplementary communication address corresponding to the target application program stored in the communication address knowledge base contains the target communication address, determining that the target communication address is not a main communication address to be stored that does not overlap with the main communication address corresponding to the target application program stored in the communication address knowledge base.
Optionally, the obtaining a top-level domain name corresponding to the target communication address includes: extracting a domain name in the target communication address; and extracting the top-level domain name in the domain names.
The target communication address is any one of all communication addresses corresponding to the target application program. The communication address is in URL format and includes a domain name. A domain name, also called a network domain, is the name of a computer or group of computers on the internet; it consists of a string of names separated by dots and is used to locate and identify the computer during data transmission (sometimes it also indicates a geographical location). In practical applications, domain names adopt a hierarchical structure: the highest level is the root domain name, the next level is the top-level domain name, followed by the first-level domain name, the second-level domain name, the third-level domain name and so on. For example, "." is the root domain name, ".com" and ".cn" are top-level domain names, "abc.com" is a first-level domain name, and "www.abc.com" is a second-level domain name.
In an embodiment, fig. 1C is a flowchart of a method for obtaining a top-level domain name corresponding to a target communication address according to an embodiment of the present invention. As shown in fig. 1C, the method specifically includes:
Step 1021, extracting the domain name in the target communication address.
Step 1022, extracting the top-level domain name in the domain name.
Therefore, the domain name of the target communication address in the URL format is extracted, and then the top-level domain name is extracted from the domain name, so that the top-level domain name corresponding to the target communication address is obtained.
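Steps 1021 and 1022 can be sketched with Python's standard `urllib.parse` module. How many trailing labels of the domain name are kept for the later comparison is left as a parameter here, because the hierarchy described above distinguishes ".com" (top-level) from "abc.com" (first-level); the function names and the `count` parameter are assumptions of this sketch, and multi-part suffixes such as ".com.cn" are not treated specially.

```python
from urllib.parse import urlsplit


def extract_domain(url):
    """Step 1021 (sketch): return the host part of a URL, or '' if absent."""
    host = urlsplit(url).hostname
    return host or ""


def trailing_labels(domain, count):
    """Step 1022 (sketch): keep the last `count` dot-separated labels."""
    if count < 1:
        raise ValueError("count must be at least 1")
    labels = [label for label in domain.split(".") if label]
    return ".".join(labels[-count:])


domain = extract_domain("https://www.abc.com:8443/path?q=1")
```

For the domain above, `trailing_labels(domain, 1)` keeps only the final label and `trailing_labels(domain, 2)` keeps the final two.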
Optionally, after the top-level domain name corresponding to the target communication address is obtained, it is detected whether the primary communication address and the supplementary communication address corresponding to the target application program stored in the communication address knowledge base include the target communication address.
The main communication address corresponding to the target application program stored in the communication address knowledge base is a known main communication address corresponding to the target application program. Optionally, a correspondence between each known main communication address and the application program name of the target application program is established in advance, and each known main communication address corresponding to the target application program is then stored in the communication address knowledge base according to the correspondence. Subsequently, the main communication address corresponding to the target application program stored in the communication address knowledge base is acquired according to the application program name of the target application program.
The supplementary communication address corresponding to the target application program stored in the communication address knowledge base is a known supplementary communication address corresponding to the target application program. Optionally, a correspondence between each known supplementary communication address and the application program name of the target application program is established in advance, and each known supplementary communication address corresponding to the target application program is then stored in the communication address knowledge base according to the correspondence. Subsequently, the supplementary communication address corresponding to the target application program stored in the communication address knowledge base is acquired according to the application program name of the target application program.
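The correspondence between application program names and known addresses can be pictured as a simple in-memory mapping. The class `AddressKnowledgeBase` and its method names are hypothetical; the text does not say how the knowledge base is actually stored.

```python
from collections import defaultdict


class AddressKnowledgeBase:
    """In-memory sketch: application name -> known main and supplementary addresses."""

    def __init__(self):
        self._primary = defaultdict(set)
        self._supplementary = defaultdict(set)

    def add_primary(self, app_name, address):
        self._primary[app_name].add(address)

    def add_supplementary(self, app_name, address):
        self._supplementary[app_name].add(address)

    def primary_for(self, app_name):
        # .get() avoids creating an empty entry for unknown applications.
        return frozenset(self._primary.get(app_name, ()))

    def supplementary_for(self, app_name):
        return frozenset(self._supplementary.get(app_name, ()))


kb = AddressKnowledgeBase()
kb.add_primary("DemoApp", "https://api.abc.com/v1")
kb.add_supplementary("DemoApp", "https://push.sdk-vendor.example/register")
```

Looking up by application program name then yields the two address sets against which a target communication address is checked.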
In an embodiment, fig. 1D is a schematic diagram of a detection process of a target communication address according to an embodiment of the present invention. As shown in fig. 1D, the method specifically includes:
Step 1201, acquiring a top-level domain name corresponding to the target communication address.
Step 1202, determining whether the primary communication address corresponding to the target application program stored in the communication address knowledge base contains the target communication address: if not, go to step 1203; if so, go to step 1213.
Optionally, if the primary communication address corresponding to the target application program stored in the communication address knowledge base contains the target communication address, this indicates that the target communication address overlaps with a primary communication address already stored in the communication address knowledge base for the target application program, so it may be determined that the target communication address is not a primary communication address to be stored that does not overlap with the primary communication address corresponding to the target application program stored in the communication address knowledge base.
Step 1203, determining whether the supplementary communication address corresponding to the target application program stored in the communication address knowledge base includes a target communication address: if not, go to step 1204; if so, go to step 1213.
Optionally, if the supplementary communication address corresponding to the target application program stored in the communication address knowledge base contains the target communication address, the target communication address is a supplementary communication address that overlaps with a supplementary communication address already stored for the target application program. It may therefore be determined that the target communication address is not a main communication address to be stored that does not overlap with the main communication address corresponding to the target application program stored in the communication address knowledge base.
Step 1204, calculating the similarity between the top-level domain name and the reverse-order result of the application program package name by using an edit distance algorithm.
When detecting that the main communication address and the supplementary communication address corresponding to the target application program stored in the communication address knowledge base do not contain the target communication address, further analyzing the target communication address.
Optionally, the reverse order result of the application package name is a result obtained after name reverse order operation is performed on the application package name. Illustratively, the reverse order of the application package name "com.
Step 1205, judging whether the similarity is greater than a preset similarity threshold value: if yes, go to step 1212; if not, go to step 1206.
Optionally, the preset similarity threshold may be set according to a service requirement.
Optionally, if the similarity between the top-level domain name and the reverse-order result of the application package name is greater than the preset similarity threshold, the top-level domain name of the target communication address closely resembles the reverse-order result of the application package name, which indicates that the target communication address is a main communication address corresponding to the target application program. Since the preceding determinations have established that the main communication address corresponding to the target application program stored in the communication address knowledge base does not contain the target communication address, it may be determined that the target communication address is a main communication address to be stored that does not overlap with the main communication address corresponding to the target application program stored in the communication address knowledge base.
Optionally, if the similarity between the top-level domain name and the reverse-order result of the application package name is less than or equal to the preset similarity threshold, the top-level domain name of the target communication address bears little resemblance to the reverse-order result of the application package name, so the package name alone does not indicate that the target communication address is a primary communication address corresponding to the target application program. The target communication address is therefore analyzed further, starting from step 1206, before a determination is made.
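Steps 1204 and 1205 can be illustrated with a minimal Python sketch. The text does not say how the edit distance is converted into a similarity, nor exactly how the package name is reversed, so two assumptions are made here: the reverse-order result is obtained by reversing the dot-separated segments of the package name, and the similarity is one minus the edit distance divided by the length of the longer string. The function names and the package name `com.example.app` are hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with a two-row dynamic programme."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute ca with cb
        prev = curr
    return prev[-1]


def reverse_package_name(package_name: str) -> str:
    """Assumed reading of 'reverse order': reverse the dot-separated segments."""
    return ".".join(reversed(package_name.split(".")))


def package_similarity(top_level_domain: str, package_name: str) -> float:
    """Assumed normalisation: 1 - distance / length of the longer string."""
    reversed_name = reverse_package_name(package_name)
    longest = max(len(top_level_domain), len(reversed_name))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(top_level_domain, reversed_name) / longest
```

Under these assumptions, a top-level domain name that matches the reversed package name exactly scores 1.0, and the score falls toward 0 as more character edits are needed. The result would then be compared against the preset similarity threshold of step 1205.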
Step 1206, acquiring the developer name of the target application program according to the basic information.
Optionally, the developer name of an application program is the name of the developer of that application program. A correspondence between the basic information of each application program and the developer name of the application program is established in advance, and the basic information and the developer name of each application program are stored in a database according to the correspondence. The developer name of the target application program can therefore be obtained from the database according to the basic information of the target application program.
Step 1207, performing a filing query and a domain name query according to the top-level domain name, and acquiring a subject name, a website name and an owner name corresponding to the target communication address.
Optionally, the filing query is to query filing information of the domain name in the domain name filing system. The domain name query is to query relevant information of the domain name on a domain name information platform provided by a domain name registrar, for example, to query a current owner name of the domain name, a contact address of the owner, a current state of the domain name, and the like. The domain name information platform provided by the domain name registrar may be the WHOIS domain name information platform.
Optionally, performing the filing query and the domain name query according to the top-level domain name, and acquiring the subject name, the website name and the owner name corresponding to the target communication address, includes: querying the subject name, the website name and the owner name of the top-level domain name of the target communication address on a domain name filing system and a domain name information platform according to the top-level domain name; and determining the subject name, the website name and the owner name of the top-level domain name of the target communication address as the subject name, the website name and the owner name corresponding to the target communication address.
Step 1208, calculating a similarity between the developer name and the subject name, a similarity between the developer name and the website name, and a similarity between the developer name and the owner name, respectively, using a cosine similarity algorithm.
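As a rough illustration of step 1208, the sketch below computes a cosine similarity between two names in Python. The text does not specify how a name is turned into a vector, so the character-bigram counts used here are an assumption, and the function names are hypothetical.

```python
import math
from collections import Counter


def char_bigrams(text: str) -> Counter:
    """Assumed vectorisation: counts of lower-cased character bigrams."""
    s = text.strip().lower()
    if len(s) < 2:
        return Counter([s]) if s else Counter()
    return Counter(s[i:i + 2] for i in range(len(s) - 1))


def cosine_similarity(name_a: str, name_b: str) -> float:
    """Cosine of the angle between the two bigram count vectors."""
    va, vb = char_bigrams(name_a), char_bigrams(name_b)
    dot = sum(count * vb[gram] for gram, count in va.items())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty name is treated as sharing nothing
    return dot / (norm_a * norm_b)
```

The same function would be called three times, pairing the developer name with the subject name, the website name and the owner name in turn.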
Step 1209, calculating the ratio of the target communication address in the top-level domain names of all the communication addresses corresponding to the target application program.
The ratio of the target communication address in the top-level domain names of all the communication addresses corresponding to the target application program is the number of occurrences of the target communication address in the top-level domain names of all the communication addresses, divided by the total number of all the communication addresses. Illustratively, if the number of occurrences of the target communication address in the top-level domain names of all communication addresses corresponding to the target application program is 60, and the total number of all communication addresses corresponding to the target application program is 100, the ratio is 0.6.
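The ratio of step 1209 can be sketched in Python as below. The sketch assumes that an "occurrence" means the top-level domain name of the target communication address appearing in the list of top-level domain names extracted from all communication addresses, with one list entry per address. The function name and the domain names are hypothetical.

```python
from typing import Sequence


def occurrence_ratio(target_tld: str, all_tlds: Sequence[str]) -> float:
    """Occurrences of target_tld divided by the total number of addresses."""
    if not all_tlds:
        return 0.0  # no addresses observed, so no meaningful ratio
    occurrences = sum(1 for tld in all_tlds if tld == target_tld)
    return occurrences / len(all_tlds)


# Mirrors the illustrative figures in the text: 60 occurrences out of 100.
observed = ["example.com"] * 60 + ["example.net"] * 40
ratio = occurrence_ratio("example.com", observed)
```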
Step 1210, performing weighted calculation according to the ratio, the similarity between the developer name and the subject name, the similarity between the developer name and the website name, and the similarity between the developer name and the owner name to obtain a weighted calculation result.
Optionally, performing weighted calculation according to the ratio, the similarity between the developer name and the subject name, the similarity between the developer name and the website name, and the similarity between the developer name and the owner name to obtain a weighted calculation result includes: calculating the product of the ratio and the similarity between the developer name and the subject name, the product of the ratio and the similarity between the developer name and the website name, and the product of the ratio and the similarity between the developer name and the owner name, and then summing the three products to obtain the weighted calculation result. Illustratively, the ratio of the target communication address in the top-level domain names of all communication addresses corresponding to the target application program is 0.6, the similarity between the developer name and the subject name is 0.6, the similarity between the developer name and the website name is 0.4, and the similarity between the developer name and the owner name is 0.6. The product of the ratio and the similarity between the developer name and the subject name is 0.36, the product of the ratio and the similarity between the developer name and the website name is 0.24, and the product of the ratio and the similarity between the developer name and the owner name is 0.36; summing these products gives a weighted calculation result of 0.96.
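The weighted calculation of step 1210 amounts to a sum of three products, which the following Python sketch reproduces with the illustrative figures from the text. The function and parameter names are hypothetical.

```python
def weighted_result(ratio: float,
                    sim_subject: float,
                    sim_website: float,
                    sim_owner: float) -> float:
    """Sum of the ratio multiplied by each of the three name similarities."""
    return ratio * sim_subject + ratio * sim_website + ratio * sim_owner


# Illustrative figures from the text: 0.36 + 0.24 + 0.36.
result = weighted_result(0.6, 0.6, 0.4, 0.6)
print(round(result, 2))
```

Because the ratio is a common factor, the result equals the ratio multiplied by the sum of the three similarities, so it can exceed 1 when the similarities are all high. The preset result threshold of step 1211 would need to be chosen with that range in mind.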
Step 1211, determining whether the weighted calculation result is greater than a preset result threshold: if yes, go to step 1212; if not, go to step 1213.
Optionally, the preset result threshold may be set according to a service requirement.
Optionally, if the weighted calculation result is greater than the preset result threshold, that is, the similarity between the developer name and the subject name, the website name, and the owner name is high, the target communication address is a primary communication address corresponding to the target application program. Since the preceding determinations have established that the primary communication address corresponding to the target application program stored in the communication address knowledge base does not contain the target communication address, it may be determined that the target communication address is a primary communication address to be stored that does not overlap with the primary communication address corresponding to the target application program stored in the communication address knowledge base.
Optionally, if the weighted calculation result is less than or equal to the preset result threshold, that is, the similarity between the developer name and the subject name, the website name, and the owner name is low, the probability that the target communication address is a primary communication address corresponding to the target application program is low. It may therefore be determined that the target communication address is not a primary communication address to be stored that does not overlap with the primary communication address corresponding to the target application program stored in the communication address knowledge base.
Step 1212, determining that the target communication address is a to-be-stored main communication address that does not overlap with the main communication address corresponding to the target application program stored in the communication address knowledge base.
Step 1213, determining that the target communication address is not a to-be-stored primary communication address that does not overlap with the primary communication address corresponding to the target application program stored in the communication address knowledge base.
Thus, it is possible to comprehensively determine whether or not each communication address is a master communication address to be stored which does not overlap with a master communication address corresponding to the target application stored in the communication address knowledge base through the above-described detection process.
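The branching of steps 1202 to 1213 can be summarised in a short Python sketch. It is only an outline under stated assumptions: the two scores are passed in as zero-argument callables so that the filing and domain name queries behind the weighted result are only triggered when the flow actually reaches step 1206, and all names and threshold values are hypothetical.

```python
from typing import Callable, Set


def is_main_address_to_store(target: str,
                             stored_main: Set[str],
                             stored_supplementary: Set[str],
                             package_similarity: Callable[[], float],
                             weighted_result: Callable[[], float],
                             similarity_threshold: float,
                             result_threshold: float) -> bool:
    """Return True for step 1212 and False for step 1213."""
    if target in stored_main:           # step 1202: already a known main address
        return False
    if target in stored_supplementary:  # step 1203: known supplementary address
        return False
    if package_similarity() > similarity_threshold:  # steps 1204 and 1205
        return True
    # Steps 1206 to 1211: fall back to the developer-name based weighted result.
    return weighted_result() > result_threshold
```

A caller would run this once for every communication address obtained for the target application program and keep the addresses for which it returns True.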
Step 103, storing the main communication address to be stored in each communication address in the communication address knowledge base.
Optionally, the communication address knowledge base is a knowledge base containing main communication addresses corresponding to the plurality of application programs, that is, the knowledge base contains communication addresses providing main support services for the plurality of application programs. The main communication address to be stored is the main communication address corresponding to the application program which is not stored in the communication address knowledge base.
Optionally, storing the main communication address to be stored in each communication address in the communication address knowledge base includes: establishing a correspondence between the main communication address to be stored in each communication address and the application program name; and storing the main communication address to be stored in the communication address knowledge base according to the correspondence. Subsequently, the main communication address corresponding to the target application program stored in the communication address knowledge base can be acquired according to the application program name of the target application program.
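The storage and later lookup by application program name can be pictured with a small in-memory Python sketch. The text does not describe the storage technology of the knowledge base, so the dictionary-of-sets structure, the class name and the method names below are all assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set


class CommunicationAddressKnowledgeBase:
    """Hypothetical in-memory stand-in for the communication address knowledge base."""

    def __init__(self) -> None:
        # Correspondence: application program name -> main communication addresses.
        self._main: Dict[str, Set[str]] = defaultdict(set)

    def store_main(self, app_name: str, addresses: Iterable[str]) -> None:
        """Store to-be-stored main addresses under the application program name."""
        self._main[app_name].update(addresses)

    def main_addresses(self, app_name: str) -> Set[str]:
        """Return a copy of the main addresses stored for the application program."""
        return set(self._main.get(app_name, set()))


kb = CommunicationAddressKnowledgeBase()
kb.store_main("ExampleApp", ["example.com"])
kb.store_main("ExampleApp", ["example.com", "example.net"])
```

Using a set per application program means that storing an address twice leaves a single entry, which matches the requirement that only non-overlapping addresses are added.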
In the related art, as the service content of the application program is continuously enriched, the number of communication addresses providing the supplementary function for the application program is continuously increased, so that the number of communication addresses providing the functional service for the application program is large, more manpower is required to be invested for analysis, and it is difficult to quickly and accurately determine the communication address providing the main support service for the illegal application program.
The embodiment of the invention analyzes the basic information (related information such as application program name, application program package name, information abstract and the like) of the application program based on an application program analysis tool, analyzes the network activity information (all communication addresses corresponding to the application program) of the application program, analyzes the communication addresses, analyzes the main communication address to be stored in each communication address, stores the main communication address to be stored in each communication address in the communication address knowledge base, generates a relatively complete knowledge base containing the communication address for providing the main support service for the application program, and plays a good technical support role in analyzing and mastering the relation between the application program and the communication address for providing the main support service. When the application program has a violation, the communication address for providing the main support service for the violation application program can be quickly and accurately acquired from the knowledge base containing the communication address for providing the main support service for the violation application program, so that the communication address for providing the main support service for the violation application program can be quickly and accurately determined.
Whether each communication address corresponding to the application program is a main communication address to be stored that does not overlap with the known main communication address corresponding to the application program stored in the communication address knowledge base is comprehensively determined according to the basic information of the application program and the known main communication address and supplementary communication address corresponding to the application program stored in the communication address knowledge base. In this way, the communication addresses corresponding to the application program can be analyzed more accurately and rapidly, the main communication address information corresponding to the application program can be extracted, and the activity characteristics of the application program can be understood in greater depth, so that support for related industry management and industry development can be provided more comprehensively and efficiently.
The embodiment of the invention provides a method for generating a communication address knowledge base. A target application program is analyzed through an application program analysis tool to obtain basic information of the target application program and all communication addresses corresponding to the target application program. Whether each communication address is a main communication address to be stored that does not overlap with the main communication address corresponding to the target application program stored in the communication address knowledge base is then determined according to the basic information and the main communication address and supplementary communication address corresponding to the target application program stored in the communication address knowledge base. The main communication address to be stored in each communication address is stored in the communication address knowledge base. Because the determination draws jointly on the basic information of the application program and on the known main communication address and supplementary communication address stored in the communication address knowledge base, the communication addresses corresponding to the application program can be analyzed more accurately and quickly. This generates a relatively complete knowledge base containing the communication addresses that provide main support services for application programs, gives good technical support for analyzing and mastering the relation between an application program and the communication addresses providing its main support services, and provides a basis for subsequently determining, quickly and accurately, the communication addresses that provide main support services for an illegal application program.
Example two
Fig. 2 is a flowchart of a method for generating a communication address knowledge base according to a second embodiment of the present invention. Embodiments of the invention may be combined with various alternatives in one or more of the embodiments described above. As shown in fig. 2, the method of the embodiment of the present invention specifically includes:
Step 201, performing static analysis and dynamic analysis on a target application program through an application program analysis tool to obtain a static analysis result and a dynamic analysis result of the target application program.
Optionally, the static analysis is to analyze information such as permission, components, sensitive functions and the like in the installation package file of the target application program by decompiling the installation package file of the target application program, so as to obtain a static analysis result of the target application program. The static analysis result of the target application program is information obtained by analysis in the static analysis process of the target application program. The static analysis result of the target application program comprises an application program name, an application program package name and an information abstract of the target application program.
Optionally, the dynamic analysis is to run the target application program on the simulator, then perform some operations on the target application program to trigger as many behaviors as possible, then output the log, and analyze information in the log through the script to obtain a dynamic analysis result of the target application program. The dynamic analysis result of the target application includes all communication addresses corresponding to the target application.
Step 202, obtaining basic information of the target application program in the static analysis result.
Optionally, the basic information of the target application program in the static analysis result is extracted.
Step 203, acquiring all communication addresses corresponding to the target application program in the dynamic analysis result.
Optionally, all communication addresses corresponding to the target application program in the dynamic analysis result are extracted.
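As a hedged illustration of the script that analyzes the dynamic analysis log, the Python sketch below pulls host names out of log lines. The text does not describe the log format, so the sample lines, the URL pattern and the function name are hypothetical. The sketch also assumes that communication addresses appear in the log as `http` or `https` URLs.

```python
import re
from typing import Iterable, List
from urllib.parse import urlsplit

# Assumed pattern: an http(s) URL running up to whitespace, quote or angle bracket.
URL_PATTERN = re.compile(r"https?://[^\s\"'<>]+")


def extract_communication_addresses(log_lines: Iterable[str]) -> List[str]:
    """Collect the host of every URL found in the log, keeping duplicates."""
    hosts = []
    for line in log_lines:
        for url in URL_PATTERN.findall(line):
            host = urlsplit(url).hostname
            if host:
                hosts.append(host)
    return hosts


sample_log = [
    "12:00:01 GET https://api.example.com/v1/login 200",
    "12:00:02 no network activity",
    "12:00:03 POST http://cdn.example.net:8080/upload 201",
]
addresses = extract_communication_addresses(sample_log)
```

Duplicates are kept on purpose, since the number of occurrences of an address is needed later when the ratio among all communication addresses is calculated.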
Step 204, determining, according to the basic information and the main communication address and supplementary communication address corresponding to the target application program stored in the communication address knowledge base, whether each communication address is a main communication address to be stored that does not overlap with the main communication address corresponding to the target application program stored in the communication address knowledge base.
The main communication address is a communication address for providing main support service for the application program, and the supplementary communication address is a communication address for providing supplementary functions for the application program.
Step 205, establishing a corresponding relationship between the main communication address to be stored in each communication address and the application program name.
Step 206, storing the main communication address to be stored in the communication address knowledge base according to the corresponding relation.
The embodiment of the invention provides a method for generating a communication address knowledge base. Static analysis and dynamic analysis are performed on a target application program through an application program analysis tool: the static analysis yields the basic information of the target application program, and the dynamic analysis yields its network activity information, that is, all communication addresses corresponding to the target application program. Whether each communication address corresponding to the application program is a main communication address to be stored that does not overlap with the known main communication address corresponding to the application program stored in the communication address knowledge base can then be comprehensively determined according to the basic information of the application program and the known main communication address and supplementary communication address corresponding to the application program stored in the communication address knowledge base. In this way, the communication addresses corresponding to the application program can be analyzed more accurately and rapidly, and the main communication address information corresponding to the application program can be extracted.
Example three
Fig. 3 is a schematic structural diagram of a device for generating a communication address knowledge base according to a third embodiment of the present invention. As shown in fig. 3, the apparatus includes: an application program analysis module 301, a primary communication address determining module 302, and a primary communication address storage module 303.
The application program analysis module 301 is configured to analyze a target application program through an application program analysis tool, and obtain basic information of the target application program and all communication addresses corresponding to the target application program; the basic information comprises an application program name, an application program package name and an information abstract. The primary communication address determining module 302 is configured to determine, according to the basic information and the primary communication address and supplementary communication address corresponding to the target application program stored in a communication address knowledge base, whether each of the communication addresses is a primary communication address to be stored that does not overlap with the primary communication address corresponding to the target application program stored in the communication address knowledge base; the primary communication address is a communication address for providing main support service for the application program, and the supplementary communication address is a communication address for providing supplementary functions for the application program. The primary communication address storage module 303 is configured to store the primary communication address to be stored in each of the communication addresses in the communication address knowledge base.
The embodiment of the invention provides a device for generating a communication address knowledge base. A target application program is analyzed through an application program analysis tool to obtain basic information of the target application program and all communication addresses corresponding to the target application program. Whether each communication address is a main communication address to be stored that does not overlap with the main communication address corresponding to the target application program stored in the communication address knowledge base is then determined according to the basic information and the main communication address and supplementary communication address corresponding to the target application program stored in the communication address knowledge base. The main communication address to be stored in each communication address is stored in the communication address knowledge base. Because the determination draws jointly on the basic information of the application program and on the known main communication address and supplementary communication address stored in the communication address knowledge base, the communication addresses corresponding to the application program can be analyzed more accurately and quickly. This generates a relatively complete knowledge base containing the communication addresses that provide main support services for application programs, gives good technical support for analyzing and mastering the relation between an application program and the communication addresses providing its main support services, and provides a basis for subsequently determining, quickly and accurately, the communication addresses that provide main support services for an illegal application program.
In an optional implementation manner of the embodiment of the present invention, optionally, the application program analysis module 301 may include: an application program analysis unit, configured to perform static analysis and dynamic analysis on the target application program through an application program analysis tool to obtain a static analysis result and a dynamic analysis result of the target application program; a basic information obtaining unit, configured to obtain basic information of the target application program from the static analysis result; and a communication address acquisition unit, configured to acquire all communication addresses corresponding to the target application program from the dynamic analysis result.
In an optional implementation manner of the embodiment of the present invention, optionally, the primary communication address determining module 302 is specifically configured to: performing the following for each communication address corresponding to the target application: acquiring a top-level domain name corresponding to a target communication address; when detecting that the main communication address and the supplementary communication address corresponding to the target application program stored in the communication address knowledge base do not contain the target communication address, calculating the similarity between the top-level domain name and the reverse-order result of the application program package name by using an edit distance algorithm; and when the similarity is detected to be larger than a preset similarity threshold value, determining that the target communication address is a to-be-stored main communication address which is not overlapped with a main communication address corresponding to the target application program and stored in the communication address knowledge base.
In an optional implementation manner of the embodiment of the present invention, optionally, the primary communication address determining module 302 is further specifically configured to: when the similarity is detected to be smaller than or equal to a preset similarity threshold value, acquiring the developer name of the target application program according to the basic information; performing a filing query and a domain name query according to the top-level domain name, and acquiring a subject name, a website name and an owner name corresponding to the target communication address; calculating a similarity between the developer name and the subject name, a similarity between the developer name and the website name, and a similarity between the developer name and the owner name, respectively, using a cosine similarity algorithm; calculating the ratio of the target communication address in the top-level domain names of all communication addresses corresponding to the target application program; performing weighted calculation according to the ratio, the similarity between the developer name and the subject name, the similarity between the developer name and the website name, and the similarity between the developer name and the owner name to obtain a weighted calculation result; when the weighted calculation result is detected to be larger than a preset result threshold value, determining that the target communication address is a to-be-stored main communication address that does not overlap with the main communication address corresponding to the target application program stored in the communication address knowledge base; and when the weighted calculation result is detected to be less than or equal to the preset result threshold value, determining that the target communication address is not a to-be-stored main communication address that does not overlap with the main communication address corresponding to the target application program stored in the communication address knowledge base.
In an optional implementation manner of the embodiment of the present invention, optionally, when the primary communication address determining module 302 acquires the top-level domain name corresponding to the target communication address, it is specifically configured to: extracting a domain name in the target communication address; and extracting the top-level domain name in the domain names.
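The two extraction operations can be sketched in Python as follows. This is a deliberately naive sketch: it assumes that the "top-level domain name" referred to here is the last two dot-separated labels of the domain name (for example `example.com` for `api.example.com`), which is not correct for multi-part suffixes such as `co.uk` and does not handle IP addresses. The function names are hypothetical.

```python
from urllib.parse import urlsplit


def extract_domain(address: str) -> str:
    """Extract the domain name (host) from a communication address."""
    # urlsplit only recognises the host when the address contains '//'.
    candidate = address if "//" in address else "//" + address
    return (urlsplit(candidate).hostname or "").lower()


def naive_top_level_domain(domain: str) -> str:
    """Assumed reading: keep the last two labels of the domain name."""
    labels = [label for label in domain.split(".") if label]
    return ".".join(labels[-2:])


domain = extract_domain("https://api.example.com:8443/v1/login")
top_level = naive_top_level_domain(domain)
```

A production implementation would more likely consult a maintained list of public suffixes rather than count labels.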
In an optional implementation manner of the embodiment of the present invention, optionally, the primary communication address determining module 302 is further specifically configured to: when detecting that a primary communication address or a supplementary communication address corresponding to the target application program stored in the communication address knowledge base contains the target communication address, determining that the target communication address is not a primary communication address to be stored which does not overlap with the primary communication address corresponding to the target application program stored in the communication address knowledge base.
In an optional implementation of the embodiment of the present invention, the primary communication address storage module 303 may include: a relation establishing unit, configured to establish a correspondence between each to-be-stored main communication address among the communication addresses and the application program name; and an address storage unit, configured to store the to-be-stored main communication addresses in the communication address knowledge base according to the correspondence.
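One way to picture the correspondence between application program names and stored addresses is an in-memory mapping, sketched below. The class and method names are hypothetical and the source does not specify a storage backend; a real knowledge base would more likely be a database table keyed by application name.

```python
from collections import defaultdict


class CommunicationAddressKnowledgeBase:
    """Toy in-memory knowledge base: application name -> sets of addresses (illustrative only)."""

    def __init__(self):
        self._primary = defaultdict(set)
        self._supplementary = defaultdict(set)

    def add_supplementary(self, app_name, addresses):
        self._supplementary[app_name].update(addresses)

    def contains(self, app_name, address):
        """True if the address is already stored as primary or supplementary for this app."""
        return (address in self._primary.get(app_name, ())
                or address in self._supplementary.get(app_name, ()))

    def store_primary(self, app_name, addresses):
        """Store only addresses not yet known for the app; return those actually stored."""
        stored = []
        for address in addresses:
            if not self.contains(app_name, address):
                self._primary[app_name].add(address)
                stored.append(address)
        return stored

    def primary_addresses(self, app_name):
        return sorted(self._primary.get(app_name, ()))
```

The `contains` check corresponds to the rule that an address already stored as a primary or supplementary address for the application is not stored again.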
With regard to the apparatus in the above-described embodiment, the specific manner in which each module performs the operation has been described in detail in the embodiment related to the method, and will not be elaborated here.
The device for generating the communication address knowledge base can execute the method for generating the communication address knowledge base provided by any embodiment of the invention, and has the corresponding functional modules and beneficial effects for executing the method for generating the communication address knowledge base.
Example four
Fig. 4 is a schematic structural diagram of a computer device according to a fourth embodiment of the present invention. FIG. 4 illustrates a block diagram of an exemplary computer device 12 suitable for use in implementing embodiments of the present invention. The computer device 12 shown in FIG. 4 is only one example and should not bring any limitations to the functionality or scope of use of embodiments of the present invention.
As shown in FIG. 4, computer device 12 is in the form of a general purpose computing device. The components of computer device 12 may include, but are not limited to: one or more processors 16, a memory 28, and a bus 18 that connects the various system components (including the memory 28 and the processors 16).
Bus 18 represents one or more of any of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures. By way of example, such architectures include, but are not limited to, Industry Standard Architecture (ISA) bus, Micro Channel Architecture (MCA) bus, Enhanced ISA (EISA) bus, Video Electronics Standards Association (VESA) local bus, and Peripheral Component Interconnect (PCI) bus.
Computer device 12 typically includes a variety of computer system readable media. Such media may be any available media that is accessible by computer device 12 and includes both volatile and nonvolatile media, removable and non-removable media.
The memory 28 may include computer system readable media in the form of volatile memory, such as Random Access Memory (RAM) 30 and/or cache memory 32. Computer device 12 may further include other removable/non-removable, volatile/nonvolatile computer system storage media. By way of example only, storage system 34 may be used to read from and write to non-removable, nonvolatile magnetic media (not shown in FIG. 4, and commonly referred to as a "hard drive"). Although not shown in FIG. 4, a magnetic disk drive for reading from and writing to a removable, nonvolatile magnetic disk (e.g., a "floppy disk") and an optical disk drive for reading from or writing to a removable, nonvolatile optical disk (e.g., a CD-ROM, DVD-ROM, or other optical media) may be provided. In these cases, each drive may be connected to bus 18 by one or more data media interfaces. Memory 28 may include at least one program product having a set (e.g., at least one) of program modules that are configured to carry out the functions of embodiments of the invention.
A program/utility 40 having a set (at least one) of program modules 42 may be stored, for example, in memory 28, such program modules 42 including, but not limited to, an operating system, one or more application programs, other program modules, and program data, each of which examples or some combination thereof may comprise an implementation of a network environment. Program modules 42 generally carry out the functions and/or methodologies of the described embodiments of the invention.
Computer device 12 may also communicate with one or more external devices 14 (e.g., keyboard, pointing device, display 24, etc.), with one or more devices that enable a user to interact with computer device 12, and/or with any devices (e.g., network card, modem, etc.) that enable computer device 12 to communicate with one or more other computing devices. Such communication may be through an input/output (I/O) interface 22. Also, computer device 12 may communicate with one or more networks (e.g., a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network such as the Internet) via network adapter 20. As shown, network adapter 20 communicates with the other modules of computer device 12 via bus 18. It should be appreciated that although not shown in FIG. 4, other hardware and/or software modules may be used in conjunction with computer device 12, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems, among others.
The processor 16 executes various functional applications and data processing by executing programs stored in the memory 28, for example, to implement the method for generating the communication address knowledge base provided by the embodiment of the present invention. The method specifically comprises the following steps: analyzing a target application program through an application program analysis tool, and acquiring basic information of the target application program and all communication addresses corresponding to the target application program; the basic information comprises an application program name, an application program package name and an information abstract; determining whether each communication address is a to-be-stored main communication address which is not overlapped with a main communication address corresponding to the target application program and stored in a communication address knowledge base or not according to the basic information, the main communication address corresponding to the target application program and a supplementary communication address stored in the communication address knowledge base; the main communication address is a communication address for providing main support service for the application program, and the supplementary communication address is a communication address for providing supplementary functions for the application program; and storing the main communication address to be stored in each communication address in the communication address knowledge base.
Example five
The fifth embodiment of the present invention provides a computer-readable storage medium on which a computer program is stored, where the computer program, when executed by a processor, implements, for example, the method for generating a communication address knowledge base provided by the embodiment of the present invention. The method specifically comprises the following steps: analyzing a target application program through an application program analysis tool, and acquiring basic information of the target application program and all communication addresses corresponding to the target application program; the basic information comprises an application program name, an application program package name and an information abstract; determining whether each communication address is a to-be-stored main communication address which is not overlapped with a main communication address corresponding to the target application program and stored in a communication address knowledge base or not according to the basic information, the main communication address corresponding to the target application program and a supplementary communication address stored in the communication address knowledge base; the main communication address is a communication address for providing main support service for the application program, and the supplementary communication address is a communication address for providing supplementary functions for the application program; and storing the main communication address to be stored in each communication address in the communication address knowledge base.
Any combination of one or more computer-readable media may be employed. The computer readable medium may be a computer readable signal medium or a computer readable storage medium. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
A computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Computer program code for carrying out operations for aspects of the present invention may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C++ or the like and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or computer device. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
It is to be noted that the foregoing is only illustrative of the preferred embodiments of the present invention and the technical principles employed. It will be understood by those skilled in the art that the present invention is not limited to the particular embodiments described herein, but is capable of various obvious changes, rearrangements and substitutions as will now become apparent to those skilled in the art without departing from the scope of the invention. Therefore, although the present invention has been described in greater detail by the above embodiments, the present invention is not limited to the above embodiments, and may include other equivalent embodiments without departing from the spirit of the present invention, and the scope of the present invention is determined by the scope of the appended claims.

Claims (10)

1. A method for generating a communication address knowledge base is characterized by comprising the following steps:
analyzing a target application program through an application program analysis tool, and acquiring basic information of the target application program and all communication addresses corresponding to the target application program; the basic information comprises an application program name, an application program package name and an information abstract;
determining whether each communication address is a to-be-stored main communication address which is not overlapped with a main communication address corresponding to the target application program and stored in a communication address knowledge base or not according to the basic information, the main communication address corresponding to the target application program and a supplementary communication address stored in the communication address knowledge base; the main communication address is a communication address for providing main support service for the application program, and the supplementary communication address is a communication address for providing supplementary functions for the application program;
and storing the main communication address to be stored in each communication address in the communication address knowledge base.
2. The method of claim 1, wherein analyzing a target application program through an application program analysis tool to obtain basic information of the target application program and all communication addresses corresponding to the target application program comprises:
performing static analysis and dynamic analysis on a target application program through an application program analysis tool to obtain a static analysis result and a dynamic analysis result of the target application program;
acquiring basic information of the target application program in the static analysis result;
and acquiring all communication addresses corresponding to the target application program in the dynamic analysis result.
3. The method of claim 1, wherein determining whether each communication address is a primary communication address to be stored that does not overlap with a primary communication address stored in the communication address repository and corresponding to the target application program according to the basic information, a primary communication address and a supplementary communication address stored in the communication address repository and corresponding to the target application program comprises:
performing the following for each communication address corresponding to the target application:
acquiring a top-level domain name corresponding to a target communication address;
when detecting that the main communication address and the supplementary communication address corresponding to the target application program stored in the communication address knowledge base do not contain the target communication address, calculating the similarity between the top-level domain name and the reverse-order result of the application program package name by using an edit distance algorithm;
and when the similarity is detected to be larger than a preset similarity threshold value, determining that the target communication address is a to-be-stored main communication address which is not overlapped with a main communication address corresponding to the target application program and stored in the communication address knowledge base.
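The comparison in claim 3 can be illustrated with a short sketch. It assumes that the "reverse-order result" of a package name means reversing its dot-separated segments (so `com.example.app` becomes `app.example.com`) and that the edit distance is normalised to a similarity in the range 0 to 1 by dividing by the longer string's length. Both choices, and the function names, are assumptions for illustration rather than details given in the claims.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,               # deletion
                current[j - 1] + 1,            # insertion
                previous[j - 1] + (ca != cb),  # substitution (free if characters match)
            ))
        previous = current
    return previous[-1]


def reverse_package_name(package: str) -> str:
    """Reverse the dot-separated segments of a package name (assumed interpretation)."""
    return ".".join(reversed(package.split(".")))


def edit_similarity(a: str, b: str) -> float:
    """Normalise edit distance to [0, 1]; 1.0 means identical strings (assumed normalisation)."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest
```

Under these assumptions, an address whose top-level domain name is `example.com` scores high against the package `com.example.app`, whereas an unrelated advertising or analytics domain scores low and falls through to the weighted check of claim 4.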
4. The method of claim 3, wherein determining whether each communication address is a primary communication address to be stored that does not overlap with a primary communication address corresponding to the target application program stored in the communication address repository is based on the basic information, a primary communication address corresponding to the target application program stored in the communication address repository, and a supplementary communication address, and further comprising:
when the similarity is detected to be smaller than or equal to a preset similarity threshold value, acquiring the developer name of the target application program according to the basic information;
performing a record query and a domain name query according to the top-level domain name, and acquiring a subject name, a website name and an owner name corresponding to the target communication address;
calculating a similarity between the developer name and the subject name, a similarity between the developer name and the website name, and a similarity between the developer name and the owner name, respectively, using a cosine similarity algorithm;
calculating the ratio of the target communication address in the top-level domain names of all communication addresses corresponding to the target application program;
performing weighted calculation according to the ratio, the similarity between the developer name and the subject name, the similarity between the developer name and the website name, and the similarity between the developer name and the owner name to obtain a weighted calculation result;
when the weighting calculation result is detected to be larger than a preset result threshold value, determining that the target communication address is a to-be-stored main communication address which is stored in the communication address knowledge base and does not overlap with a main communication address corresponding to the target application program;
and when the weighting calculation result is detected to be less than or equal to a preset result threshold value, determining that the target communication address is not a to-be-stored main communication address which is stored in the communication address knowledge base and does not overlap with a main communication address corresponding to the target application program.
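The ratio used in claim 4 can be read as the share of the application's communication addresses whose top-level domain name matches that of the target address. The sketch below follows that reading, which is an interpretation of the claim wording; the function name is hypothetical.

```python
from collections import Counter


def domain_ratio(target_domain: str, all_domains: list) -> float:
    """Fraction of the app's addresses whose top-level domain equals the target's."""
    if not all_domains:
        return 0.0
    return Counter(all_domains)[target_domain] / len(all_domains)
```

A domain that accounts for most of an application's traffic endpoints is more likely to belong to the developer, which is why this share enters the weighted calculation alongside the name similarities.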
5. The method of claim 3, wherein obtaining the top-level domain name corresponding to the target communication address comprises:
extracting a domain name from the target communication address;
and extracting the top-level domain name from the domain name.
6. The method of claim 3, wherein determining whether each communication address is a primary communication address to be stored that does not overlap with a primary communication address corresponding to the target application program stored in the communication address repository is based on the basic information, a primary communication address corresponding to the target application program stored in the communication address repository, and a supplementary communication address, and further comprising:
when detecting that a primary communication address or a supplementary communication address corresponding to the target application program stored in the communication address knowledge base contains the target communication address, determining that the target communication address is not a primary communication address to be stored which does not overlap with the primary communication address corresponding to the target application program stored in the communication address knowledge base.
7. The method of claim 1, wherein storing a primary communication address to be stored in each of the communication addresses in the communication address repository comprises:
establishing a corresponding relation between a main communication address to be stored in each communication address and the application program name;
and storing the main communication address to be stored in the communication address knowledge base according to the corresponding relation.
8. An apparatus for generating a knowledge base of communication addresses, comprising:
the application program analysis module is used for analyzing a target application program through an application program analysis tool to acquire basic information of the target application program and all communication addresses corresponding to the target application program; the basic information comprises an application program name, an application program package name and an information abstract;
a main communication address determining module, configured to determine, according to the basic information, a main communication address and a supplementary communication address that are stored in a communication address knowledge base and correspond to the target application program, whether each of the communication addresses is a main communication address to be stored that is not overlapped with the main communication address that is stored in the communication address knowledge base and corresponds to the target application program; the main communication address is a communication address for providing main support service for the application program, and the supplementary communication address is a communication address for providing supplementary functions for the application program;
and the main communication address storage module is used for storing the main communication address to be stored in each communication address in the communication address knowledge base.
9. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the method of generating a knowledge base of communication addresses according to any one of claims 1 to 7 when executing the computer program.
10. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out a method of generating a knowledge base of communication addresses according to any one of claims 1 to 7.
CN202011629917.XA 2020-12-31 2020-12-31 Method, device, equipment and storage medium for generating communication address knowledge base Active CN112685072B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011629917.XA CN112685072B (en) 2020-12-31 2020-12-31 Method, device, equipment and storage medium for generating communication address knowledge base

Publications (2)

Publication Number Publication Date
CN112685072A true CN112685072A (en) 2021-04-20
CN112685072B CN112685072B (en) 2023-08-01

Family

ID=75455869

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011629917.XA Active CN112685072B (en) 2020-12-31 2020-12-31 Method, device, equipment and storage medium for generating communication address knowledge base

Country Status (1)

Country Link
CN (1) CN112685072B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113890866A (en) * 2021-09-26 2022-01-04 恒安嘉新(北京)科技股份公司 Illegal application software identification method, device, medium and electronic equipment

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030182401A1 (en) * 2002-03-25 2003-09-25 Alps System Integration Co., Ltd. URL information sharing system using proxy cache of proxy log
CN106067879A (en) * 2016-06-07 2016-11-02 腾讯科技(深圳)有限公司 The detection method of information and device
CN107018210A (en) * 2017-04-12 2017-08-04 北京微影时代科技有限公司 A kind of IP address base establishing method and device
CN108076006A (en) * 2016-11-09 2018-05-25 华为技术有限公司 A kind of lookup is by the method and log management server of attack host
CN111478984A (en) * 2020-03-17 2020-07-31 平安科技(深圳)有限公司 Server IP address obtaining method and device and computer readable storage medium
CN111782231A (en) * 2020-07-14 2020-10-16 厦门市美亚柏科信息股份有限公司 Service deployment method and device
CN112015910A (en) * 2020-08-20 2020-12-01 恒安嘉新(北京)科技股份公司 Method and device for generating domain name knowledge base, computer equipment and storage medium

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
ANGEL_CG: "一个完整的URL 解析过程", pages 1 - 3, Retrieved from the Internet <URL:https://blog.csdn.net/angle_chen123/article/details/85335244> *
CAYMANT: "计算两个URL的相似度 编辑距离和docsim", pages 1 - 2, Retrieved from the Internet <URL:https://blog.csdn.net/cayman_2015/article/details/84950524> *
XIAOBIN FU等: "Mining navigation history for recommendation", PROCEEDINGS OF THE 5TH INTERNATIONAL CONFERENCE ON INTELLIGENT USER INTERFACES, pages 106 *
张志海: "移动互联网绿色上网管理***的设计与实现", 中国优秀硕士学位论文全文数据库信息科技辑, no. 3, pages 139 - 182 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113890866A (en) * 2021-09-26 2022-01-04 恒安嘉新(北京)科技股份公司 Illegal application software identification method, device, medium and electronic equipment
CN113890866B (en) * 2021-09-26 2024-03-12 恒安嘉新(北京)科技股份公司 Illegal application software identification method, device, medium and electronic equipment

Also Published As

Publication number Publication date
CN112685072B (en) 2023-08-01


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant