CN109981389A - Phone number recognition methods, device, equipment and medium - Google Patents

Info

Publication number
CN109981389A
CN109981389A CN201711459451.1A CN201711459451A CN109981389A CN 109981389 A CN109981389 A CN 109981389A CN 201711459451 A CN201711459451 A CN 201711459451A CN 109981389 A CN109981389 A CN 109981389A
Authority
CN
China
Prior art keywords
phone number
doubtful
fixed network
data
network data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201711459451.1A
Other languages
Chinese (zh)
Inventor
高东生
王欣
王峰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Mobile Communications Group Co Ltd
China Mobile Group Liaoning Co Ltd
Original Assignee
China Mobile Communications Group Co Ltd
China Mobile Group Liaoning Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Mobile Communications Group Co Ltd and China Mobile Group Liaoning Co Ltd
Priority to CN201711459451.1A
Publication of CN109981389A
Legal status: Pending

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/02 Capturing of monitoring data
    • H04L43/026 Capturing of monitoring data using flow identification
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/02 Capturing of monitoring data
    • H04L43/028 Capturing of monitoring data by filtering
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/06 Generation of reports
    • H04L43/062 Generation of reports related to network traffic

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention discloses a mobile phone number identification method, device, equipment and medium. The method comprises: extracting a data list from a fixed-network traffic data packet that contains a suspected mobile phone number, the data list including the suspected mobile phone number, a keyword associated with the suspected mobile phone number, and HTTP message basic information; judging, based on the HTTP message basic information, whether the suspected mobile phone number is a mobile phone number; and, when the suspected mobile phone number is determined to be a mobile phone number, writing the keyword associated with it into a mobile phone number feature keyword library as a mobile phone number feature keyword. By parsing fixed-network traffic data packets quickly and precisely, the method extracts mobile phone number feature keywords efficiently and improves the accuracy of mobile phone number identification.

Description

Mobile phone number identification method, device, equipment and medium
Technical field
The present invention relates to the technical field of network security auditing and traffic feature analysis, and in particular to a mobile phone number identification method, device, equipment and medium for fixed-network traffic data packets.
Background
With the emergence of self-built home WiFi and free city WiFi, more and more mobile terminals access the wired broadband Internet through WiFi to obtain mobile Internet content. For a telecommunications carrier to parse and identify network traffic, and thereby achieve the goals of network security auditing and traffic feature analysis, it is particularly important to extract mobile phone number features efficiently and to identify users' mobile phone numbers accurately. With the rapid growth of Internet information volume and traffic rates, existing schemes can no longer analyze mobile phone number features and identify mobile phone numbers efficiently enough to meet analysis needs.
In summary, an efficient and accurate mobile phone number identification scheme is urgently needed.
Summary of the invention
Embodiments of the present invention provide a mobile phone number identification method, device, equipment and medium for fixed-network traffic data packets, which extract mobile phone number features from fixed-network traffic data packets and accurately identify the mobile phone numbers in network traffic. Further, the corresponding mobile phone number keywords improve the efficiency of mobile phone number identification.
In a first aspect, an embodiment of the present invention provides a mobile phone number identification method for fixed-network traffic data packets, the method comprising:
extracting a data list from a fixed-network traffic data packet that contains a suspected mobile phone number, the data list including the suspected mobile phone number, a keyword associated with the suspected mobile phone number, and HTTP message basic information;
judging, based on the HTTP message basic information, whether the suspected mobile phone number is a mobile phone number; and
when the suspected mobile phone number is determined to be a mobile phone number, writing the keyword associated with the suspected mobile phone number into a mobile phone number feature keyword library as a mobile phone number feature keyword.
In a second aspect, an embodiment of the present invention provides a mobile phone number identification device for fixed-network traffic data packets, the device comprising:
a list extraction unit, configured to extract a data list from a fixed-network traffic data packet that contains a suspected mobile phone number, the data list including the suspected mobile phone number, a keyword associated with the suspected mobile phone number, and HTTP message basic information;
a mobile phone number judging unit, configured to judge, based on the HTTP message basic information, whether the suspected mobile phone number is a mobile phone number; and
a keyword writing unit, configured to write, when the suspected mobile phone number is determined to be a mobile phone number, the keyword associated with the suspected mobile phone number into a mobile phone number feature keyword library as a mobile phone number feature keyword.
In a third aspect, an embodiment of the present invention provides a computing device, comprising: at least one processor, at least one memory, and computer program instructions stored in the memory, wherein the method of the first aspect in the above embodiments is implemented when the computer program instructions are executed by the processor.
In a fourth aspect, an embodiment of the present invention provides a computer-readable storage medium on which computer program instructions are stored, wherein the method of the first aspect in the above embodiments is implemented when the computer program instructions are executed by a processor.
The mobile phone number identification method, device, equipment and medium for fixed-network traffic data packets provided by the embodiments of the present invention extract mobile phone number feature data from fixed-network traffic data packets quickly and efficiently, and accurately identify the mobile phone numbers in network traffic. Further, the corresponding mobile phone number keywords improve the efficiency of mobile phone number identification.
Brief description of the drawings
To illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings needed in the embodiments are briefly introduced below. Those of ordinary skill in the art can obtain other drawings from these drawings without creative effort.
Fig. 1 shows a flow diagram of a mobile phone number identification method for fixed-network traffic data packets according to an embodiment of the present invention.
Fig. 2 shows a module diagram of an application example according to the present invention.
Fig. 3 shows a schematic block diagram of a mobile phone number identification device according to an embodiment of the present invention.
Fig. 4 shows a hardware structure diagram of a computing device provided by an embodiment of the present invention.
Detailed description of embodiments
The features and exemplary embodiments of various aspects of the present invention are described in detail below. To make the objectives, technical solutions, and advantages of the present invention clearer, the present invention is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here are intended only to explain the present invention, not to limit it. To those skilled in the art, the present invention can be practiced without some of these specific details. The following description of the embodiments is intended only to provide a better understanding of the present invention by showing examples of it.
It should be noted that, in this document, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between these entities or operations. Moreover, the terms "include", "comprise", or any other variant are intended to cover a non-exclusive inclusion, so that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element defined by the phrase "including ..." does not exclude the presence of other identical elements in the process, method, article, or device that includes the element.
At present, the Internet traffic identification and classification methods commonly used by carriers include port analysis, application-layer payload analysis, traffic behavior feature analysis, and statistical traffic feature identification.
Application-layer payload analysis: because of its high identification accuracy and its ability to detect early, it has become the mainstream method actually used by operators. Foreign network equipment manufacturers and Internet service providers have released related products or technologies, such as the NetFlow technology of Cisco, the CacheLogic P2P Management Solution of CacheLogic, and the NetSpective series of products of Verso Technologies. Whether foreign or domestic, most traffic identification products use DPI (deep packet inspection) technology and identification technology based on data flow features, and the technologies are essentially the same.
Port detection: P2P software typically starts from default port numbers, and this port information can be used for P2P traffic detection. For example, early eDonkey used ports 4661 and 4662, and BT used ports 6881-6890. A monitoring system checks whether the port used by network traffic is a typical P2P port to determine whether the corresponding data packet is a P2P data packet.
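The port check described above can be sketched in a few lines. This is a minimal illustration rather than any vendor's implementation: the port numbers come from the text, while the function name `classify_p2p` and the table layout are assumptions made for the example.

```python
# Default ports named in the text: eDonkey on 4661/4662, BT on 6881-6890.
P2P_DEFAULT_PORTS = {4661: "eDonkey", 4662: "eDonkey"}
P2P_DEFAULT_PORTS.update({port: "BT" for port in range(6881, 6891)})


def classify_p2p(src_port: int, dst_port: int):
    """Return the P2P application whose default port is in use, or None."""
    return P2P_DEFAULT_PORTS.get(dst_port) or P2P_DEFAULT_PORTS.get(src_port)
```

A lookup of this kind only recognizes traffic that stays on default ports, which is one reason the text treats payload analysis as the mainstream method.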
Traffic behavior feature analysis and statistical traffic feature identification: these rely mainly on a mature traffic feature tag database for identification. Traffic features are compared with the feature tag data in the traffic feature database to work out the features of the traffic. Both methods depend heavily on manually constructed traffic feature databases. Moreover, traffic feature analysis is at present mainly applied by equipment that directly collects mobile traffic, and to targeted analysis of specific traffic (such as video traffic), while analysis of fixed-network traffic (such as home broadband network traffic) is still at an exploratory stage.
In summary, analysis of the prior art shows that most traffic analysis currently relies mainly on manual analysis and databases, and that the prior art has the following deficiencies:
(1) The prior art requires a person to simulate a user's behavior in a specific application, capture packets with a packet capture tool, and compare and extract key feature words by experience and by eye, forming a feature word library for that application. This manual offline matching method involves a heavy workload, obtains feature words inefficiently, covers only a limited range of simulated behaviors, and places high demands on personnel.
(2) With the rapid growth of Internet information volume and traffic rates, the prior art is increasingly unable to cope with traffic feature analysis. On the one hand, the growth in features of existing traffic feature databases cannot keep up with rapidly growing traffic, and the rate of expansion is declining. On the other hand, because traffic grows over time, existing manual analysis efficiency can no longer meet analysis needs.
In view of this, the present invention proposes an efficient and accurate mobile phone number identification scheme for fixed-network traffic data packets, to solve at least one of the above technical problems.
The present invention uses technologies such as deep packet inspection (DPI), Hyperscan (a high-speed regular expression matching engine), and distributed crawlers (based on distributed ETL) to propose an automatic mobile phone number feature extraction technique for home broadband WiFi, achieving efficient mobile phone number feature extraction and mobile phone number identification. Meanwhile, in-depth analysis of the traffic resources accessed by the mobile phone number improves the accuracy of identification and determines that the feature keyword associated with the mobile phone number is a mobile phone number feature keyword. This keyword can serve as feature data for mobile phone number identification and is written into the mobile phone number feature keyword library, which in turn improves identification efficiency.
The mobile phone number feature extraction scheme of the present invention is described in detail below with reference to the drawings and embodiments.
Fig. 1 shows a flow diagram of a mobile phone number identification method for fixed-network traffic data packets according to an embodiment of the present invention.
Referring to Fig. 1, in step S110, a data list is extracted from a fixed-network traffic data packet that contains a suspected mobile phone number. The data list includes the suspected mobile phone number, a keyword associated with the suspected mobile phone number, and HTTP message basic information.
The fixed network of the present invention differs from a traditional mobile network. It refers to network access modes such as home broadband WiFi and free city WiFi, in which mobile Internet content is obtained by accessing the wired broadband Internet through the fixed network. Fixed-network traffic is the data transmitted over the fixed network, and fixed-network traffic data packets are the data packets it transmits. The present invention preferably collects and analyzes network traffic under home broadband.
A suspected mobile phone number is a digit string contained in a fixed-network traffic data packet that has numerical features similar to those of a mobile phone number, for example, it has eleven digits and its first three digits correspond to a predetermined carrier.
In a preferred embodiment, the one or more fixed-network traffic data packets containing suspected mobile phone numbers may be screened out based on predetermined mobile phone number rules.
For example, application-layer payload analysis in deep packet inspection (DPI) can be used to quickly filter out unrelated network traffic data packets (such as mail and FTP logs) and to extract feature data from the relevant network traffic data packets. Hyperscan (a high-speed regular expression matching engine) is used to match multiple regular expressions in parallel, which reduces the time needed to search for and match suspected mobile phone numbers. The resulting Hyperscan database is used to match data packets quickly and find the suspected mobile phone numbers in them, thereby determining the one or more fixed-network traffic data packets that contain suspected mobile phone numbers.
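The screening rule for suspected numbers (eleven digits, first three digits matching a carrier) can be sketched as follows. This sketch uses Python's standard `re` module as a stand-in and is not the Hyperscan API; the prefix set `CARRIER_PREFIXES` holds made-up illustrative values, and `find_suspected_numbers` is a hypothetical helper name.

```python
import re

# Hypothetical carrier prefixes, for illustration only; a real deployment
# would load the full set of number rules.
CARRIER_PREFIXES = {"134", "135", "138", "139", "150", "188"}

# Exactly eleven digits that are not part of a longer digit run.
ELEVEN_DIGITS = re.compile(r"(?<!\d)\d{11}(?!\d)")


def find_suspected_numbers(payload: str) -> list:
    """Return eleven-digit strings whose first three digits are a known prefix."""
    return [m for m in ELEVEN_DIGITS.findall(payload) if m[:3] in CARRIER_PREFIXES]
```

The look-around assertions keep longer digit runs, such as millisecond timestamps, from being cut into false eleven-digit matches.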
Then, the data list corresponding to the suspected mobile phone number is extracted from the identified fixed-network traffic data packets. The data list records the suspected mobile phone number together with its related data, for example, the keyword associated with the suspected mobile phone number, the HTTP message basic information, and the corresponding broadband account. The HTTP message basic information may include, but is not limited to, the relevant timestamp, keyword, uri, host, ua, referer, cookie, and content. The data list serves as the data basis for subsequent mobile phone number identification.
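One possible shape for a data list entry is sketched below. The field names follow the fields listed above, but the record type `ListRecord`, the helper `build_records`, and in particular the rule that the associated keyword is the name of the query parameter carrying the number are assumptions for illustration; the text does not specify how the keyword is located.

```python
from dataclasses import dataclass
from urllib.parse import parse_qsl, urlsplit


@dataclass(frozen=True)
class ListRecord:
    """One data list entry binding a suspected number to its context."""
    timestamp: int
    broadband_account: str
    suspected_number: str
    keyword: str
    uri: str
    host: str
    ua: str


def build_records(timestamp, broadband_account, uri, host, ua, suspected_numbers):
    """Create one record per query parameter whose value is a suspected number.

    Assumption: the keyword is the parameter name in front of the number.
    """
    records = []
    for key, value in parse_qsl(urlsplit(uri).query):
        if value in suspected_numbers:
            records.append(
                ListRecord(timestamp, broadband_account, value, key, uri, host, ua)
            )
    return records
```

Numbers carried in cookies or message bodies would need their own extraction rule; the sketch covers the URI query string only.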
Then, in step S120, whether the suspected mobile phone number is a mobile phone number can be judged based on, for example, the HTTP message basic information.
A complete data packet generally contains fields such as uri, ua, and host. The host, ua, and other fields from the same data packet as the phone number can be used to identify the mobile phone number and to judge whether it is accurate, and thus to ensure that the associated mobile phone number feature keyword is accurate.
Therefore, the present invention can judge whether the suspected mobile phone number is a mobile phone number using fields such as ua and host, and delete the corresponding data list when the suspected mobile phone number is determined not to be a mobile phone number.
In a preferred embodiment, the accuracy of mobile phone number identification can be improved by judging whether the fixed-network traffic data packet comes from a mobile phone terminal.
For example, whether the fixed-network traffic data packet comes from a mobile phone can be judged based on the user terminal information in the HTTP message basic information. Specifically, the ua field in the HTTP message basic information can be used to identify the terminal type, operating system and version, CPU type, browser and version, browser rendering engine, browser language, browser plug-ins, and so on used by the user.
When the ua field matches a mobile phone terminal, the fixed-network traffic data packet is judged to come from a mobile phone terminal, and the suspected mobile phone number extracted from it is considered to be a mobile phone number.
When the ua field matches a non-mobile-phone terminal, the fixed-network traffic data packet is judged to come from a non-mobile-phone terminal. In that case the suspected mobile phone number extracted from the data packet is judged to be a digit string that merely matches mobile phone number features, not a real mobile phone number, so the keyword associated with it is not suitable as a mobile phone number keyword.
Therefore, the data list can be deleted when the fixed-network traffic data packet is determined not to come from a mobile phone. Noise data from non-mobile-phone terminals is thereby rejected, which further improves the accuracy of mobile phone number identification.
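The ua-based filtering can be sketched as a simple substring test. The token list `MOBILE_UA_TOKENS` is an illustrative assumption (the text instead matches against a crawled terminal information database), and records are represented here as plain dictionaries with a `ua` key.

```python
# Illustrative tokens only; the text matches ua values against a terminal
# information database built by crawling.
MOBILE_UA_TOKENS = ("android", "iphone", "windows phone")


def is_mobile_phone_ua(ua: str) -> bool:
    """Return True when the ua string contains a mobile phone token."""
    lowered = ua.lower()
    return any(token in lowered for token in MOBILE_UA_TOKENS)


def drop_non_mobile(records: list) -> list:
    """Keep only the data list records whose packet came from a mobile phone."""
    return [r for r in records if is_mobile_phone_ua(r.get("ua", ""))]
```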
In a preferred embodiment, the traffic resources accessed by the mobile phone number can also be analyzed to identify which application (APP) the fixed-network traffic data packet comes from.
For example, the application corresponding to the fixed-network traffic data packet can be determined based on the host name in the HTTP message basic information. Specifically, the application (APP) corresponding to the host field in the HTTP message basic information can be identified from that field.
When the fixed-network traffic data packet corresponds to a predetermined application (such as a certain mobile phone APP), the suspected mobile phone number extracted from it can be considered to be a mobile phone number.
When the fixed-network traffic data packet does not correspond to a predetermined application (for example, it corresponds to an unknown application), the data list is deleted. Noise data from unknown applications is thereby rejected, which further improves the accuracy of mobile phone number identification.
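The host-to-application check can be sketched as a dictionary lookup. The table `HOST_TO_APP` and its entries are hypothetical placeholders for a host name database, and `app_for_host` and `keep_known_apps` are names chosen for this example.

```python
# Hypothetical host name database mapping host names to application names.
HOST_TO_APP = {
    "api.shop.example.com": "ExampleShop",
    "im.chat.example.net": "ExampleChat",
}


def app_for_host(host: str):
    """Return the application for a host field, ignoring case and any port."""
    name = host.strip().lower().split(":", 1)[0]
    return HOST_TO_APP.get(name)


def keep_known_apps(records: list) -> list:
    """Drop data list records whose host does not map to a known application."""
    return [r for r in records if app_for_host(r["host"]) is not None]
```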
There are many brands and types of mobile phone terminals and many types of applications on the market, and the feature information of each terminal and application differs to some degree. To guarantee the accuracy of the above identification and of the associated keywords, in a preferred embodiment the present invention can build the corresponding databases by means such as web crawling, and match the HTTP message basic information of the data list against the feature data in the respective database.
Specifically, for example, the WebMagic crawler framework can be used to crawl terminal information from e-commerce sites in advance, and a terminal information database can be built from the crawled information. The terminal information in the database is preferably mobile phone terminal information (and may also include non-mobile-phone terminal information). Whether the fixed-network traffic data packet comes from a mobile phone is then judged by whether the user terminal information in the HTTP message basic information matches the mobile phone terminal information in the terminal information database.
Alternatively, the WebMagic crawler framework can be used to crawl the correspondence between host names and application (for example, mobile phone APP) names in advance and build a host name database, in which each host name is recorded in association with its application. The application corresponding to the fixed-network traffic data packet is then determined by matching the host name in the HTTP message basic information against the host names in the host name database.
Thus, databases built through big data analysis, and the feature data in them, make it possible to match and analyze the HTTP message basic information and to decide whether a number is a mobile phone number.
Then, in step S130, when the suspected mobile phone number is determined to be a mobile phone number, the keyword associated with it is written into the mobile phone number feature keyword library as a mobile phone number feature keyword.
In this way, mobile phone number features in network traffic are extracted and mobile phone numbers are identified precisely. Once a mobile phone number has been identified precisely, the keyword associated with it can be regarded as a mobile phone number feature keyword and written into the mobile phone number feature keyword library.
Because mobile phone number features can be output automatically on a daily schedule, the library can be used as follows: after a mobile phone number feature keyword is identified in a fixed-network traffic data packet, the number associated with that keyword can be determined to be a mobile phone number. The accuracy of identifying mobile phone numbers from these feature keywords can exceed 80%, and compared with the conventional offline manual visual identification method, the efficiency of mobile phone number identification is greatly improved.
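Once the keyword library exists, identification reduces to a lookup. The sketch below assumes, for illustration only, that a feature keyword is the name of a URI query parameter; `identify_numbers` is a hypothetical helper and not part of the described system.

```python
import re
from urllib.parse import parse_qsl, urlsplit

ELEVEN_DIGITS = re.compile(r"\d{11}")


def identify_numbers(uri: str, keyword_library: set) -> list:
    """Return eleven-digit values whose parameter name is a known feature keyword."""
    pairs = parse_qsl(urlsplit(uri).query)
    return [
        value
        for key, value in pairs
        if key in keyword_library and ELEVEN_DIGITS.fullmatch(value)
    ]
```

The same eleven-digit value is ignored when it appears under a parameter name that is not in the library, which is how the library separates phone numbers from look-alike identifiers.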
Thus, the above feature extraction and identification greatly improve the accuracy of the mobile phone number keywords, which in turn improves the accuracy with which network traffic analysis identifies mobile phone numbers.
In addition, a suspected mobile phone number digit string may come from a timestamp of some application or from temporarily reported data, which is noise data that is temporary and time-limited. Alternatively, the digit string may be an application's own ID number that identifies a user.
Therefore, to ensure data accuracy, the data list can also be analyzed using big data analysis capability during the above process, to further ensure the accuracy of the mobile phone number and the corresponding mobile phone number feature keyword.
In a preferred embodiment, the fixed-network broadband account, the suspected mobile phone number, the keyword associated with the suspected mobile phone number, and the application identifier in the data list can be taken as one four-tuple. The accumulated number of days or number of times the four-tuple appears is counted, and when it reaches a first predetermined threshold, the suspected mobile phone number is determined to be a mobile phone number and the associated keyword is written into the mobile phone number feature keyword library as a mobile phone number feature keyword.
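The four-tuple counting can be sketched as follows, using accumulated distinct days as the statistic. The function name `confirm_by_days`, the input layout (one tuple per observation with a day string appended), and the threshold values used in any call are assumptions for illustration.

```python
from collections import defaultdict


def confirm_by_days(observations, min_days: int) -> set:
    """Return four-tuples seen on at least `min_days` distinct days.

    Each observation is (account, number, keyword, app, day).
    """
    days_seen = defaultdict(set)
    for account, number, keyword, app, day in observations:
        days_seen[(account, number, keyword, app)].add(day)
    return {key for key, days in days_seen.items() if len(days) >= min_days}
```

Counting distinct days rather than raw hits means a burst of requests on a single day does not promote a short-lived value such as a timestamp.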
By setting the time threshold reasonably, temporary and time-limited noise data is filtered out effectively. A genuine mobile phone number that has not been adopted because it has appeared on too few days can still, through continued accumulation of data, be determined to be a mobile phone number once it exceeds the first predetermined threshold, and its associated keyword is then written into the mobile phone number feature keyword library.
The two-tuple formed by the fixed-network broadband account and the mobile phone number stands in a one-to-one relationship with an application; that is, such a two-tuple can correspond to only one application. Therefore, in a preferred embodiment, the number of different applications to which a suspected mobile phone number corresponds under the same broadband account can be counted, and when this application count exceeds a second predetermined threshold, the suspected mobile phone number is determined to be a mobile phone number and its associated keyword is written into the mobile phone number feature keyword library.
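The application-count check can be sketched in the same style. The function name `confirm_by_app_count` and the input layout are assumptions; the comparison is strict because the text says the count must exceed the second predetermined threshold.

```python
from collections import defaultdict


def confirm_by_app_count(observations, threshold: int) -> set:
    """Return (account, number) pairs seen in more than `threshold` applications.

    Each observation is (account, number, app).
    """
    apps_seen = defaultdict(set)
    for account, number, app in observations:
        apps_seen[(account, number)].add(app)
    return {pair for pair, apps in apps_seen.items() if len(apps) > threshold}
```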
In addition, to ensure the accuracy of the mobile phone number feature keywords, in a preferred embodiment the keywords determined by the above identification process can also be checked. When the keyword associated with a suspected mobile phone number belongs to a keyword blacklist, that keyword is rejected. When it does not belong to the keyword blacklist, it is written into the mobile phone number feature keyword library as a mobile phone number feature keyword. Interfering keywords are thereby rejected.
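The blacklist check can be sketched as a filter in front of the library write. The blacklist entries shown are hypothetical examples, and the library is modeled as an in-memory set rather than whatever storage the system actually uses.

```python
# Hypothetical blacklist entries, for illustration only.
KEYWORD_BLACKLIST = {"ts", "timestamp", "uid"}


def write_keywords(candidates, library: set, blacklist=KEYWORD_BLACKLIST) -> set:
    """Add each candidate keyword to the library unless it is blacklisted."""
    for keyword in candidates:
        if keyword not in blacklist:
            library.add(keyword)
    return library
```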
The mobile phone number identification method for fixed-network traffic data packets of the present invention has now been described in detail with reference to the method flow diagram of Fig. 1.
Fig. 2 shows a module diagram of an application example according to the present invention.
For a clearer understanding of the technical solution of the present invention, the mobile phone number identification method is described in detail below with reference to the data modules of the application example shown in Fig. 2 and its process of mobile phone number feature extraction and identification.
To mine the feature keywords corresponding to mobile phone numbers, this application example collects network traffic under home broadband, quickly filters out unrelated traffic using application-layer payload analysis in deep packet inspection (DPI), and identifies the suspected mobile phone numbers and associated keywords in the data packets. The identified keywords and their data lists are then imported into the Hadoop platform. Related information, such as the domain names corresponding to applications (APPs) and mobile phone terminal information, is crawled using web crawler technology, and the data is correlated along the time dimension and the application dimension to filter out mobile phone number feature keywords of higher accuracy.
Referring to Fig. 2, the present invention proposes a novel automatic mobile phone number feature extraction module, composed mainly of a DPI data cleaning module and a big data analysis module. The DPI data cleaning module mainly applies the automatic mobile phone number feature extraction method. The big data analysis module analyzes the cleaned data and correlates it along the time dimension and the application dimension to further screen for mobile phone numbers and associated mobile phone number feature keywords of higher accuracy. Based on the mobile phone number feature keyword library, the efficiency of identifying and analyzing mobile phone numbers in network traffic is further improved. Each module and its implementation are described in detail below.
(1) DPI data cleaning module
The DPI data cleaning module is based on deep packet inspection (DPI). It performs deep inspection of different network application-layer payloads (such as HTTP and DNS), performs protocol analysis on the sorted logs, separates out HTTP messages and RADIUS messages, and cleans away interfering information (for example, by filtering out mail and FTP logs).
Then, the DPI data cleaning module extracts the key information of the messages. For a RADIUS message, it extracts the fixed-network broadband account. For an HTTP message, it extracts the field information, decodes the extracted fields uniformly, and, according to the mobile phone number prefix rules, parses out the mobile Internet access messages that carry suspected mobile phone number prefixes.
For a big data analysis system, traffic data contains a large amount of redundant information. Mobile phone numbers have certain numerical features: for example, the first three digits represent the operator and the middle four digits represent regional information. These rules are currently fixed, and the limited set of mobile phone numbers forms a group of about 300,000 number features. These features can be used in reverse to extract the keywords that accompany a mobile phone number.
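The number structure described above (three operator digits, four regional digits, then the remaining four) can be sketched as a split plus a segment lookup. The single entry in `NUMBER_SEGMENTS` is a made-up placeholder for the roughly 300,000 number features mentioned in the text, and both function names are hypothetical.

```python
# Hypothetical seven-digit number segments (operator digits + regional digits).
NUMBER_SEGMENTS = {"1381234"}


def split_number(number: str):
    """Split an eleven-digit number into operator, regional, and remaining parts."""
    if len(number) != 11 or not number.isdigit():
        raise ValueError("expected an eleven-digit string")
    return number[:3], number[3:7], number[7:]


def matches_known_segment(number: str) -> bool:
    """Return True when the first seven digits form a known number segment."""
    operator, region, _ = split_number(number)
    return operator + region in NUMBER_SEGMENTS
```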
In cleaning procedure in initialization, Hyperscan high speed matching technique can use simultaneously to multiple regular expressions Formula carries out PARALLEL MATCHING, and number one code word hat is configured in Hyperscan database, forms phone number Hyperscan characteristic According to library, in order to search match time using reduce phone number the characteristics of Hyperscan.
The DPI data cleaning module then uses this database to rapidly match one or more data packets. A preliminary analysis result record is formed for each doubtful phone number: the doubtful phone number and the other relevant information are combined into an HTTP message information string in which the doubtful phone number, the keyword, the HTTP message basic information, and the fixed network broadband account are bound together. The HTTP message information string recorded as the result contains the timestamp, fixed network broadband account, doubtful phone number, keyword, uri, host, ua, referer, cookie, content, and similar information. These information strings can be organized into a series of data lists and sent to the big data analysis module for further analysis.
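One possible shape for such a result record is sketched below. The field names follow the list above; the class name `CleanedRecord`, the function `build_records`, and the rule that the keyword is the parameter name written directly before the number are assumptions made for illustration.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class CleanedRecord:
    timestamp: int
    broadband_account: str
    doubtful_number: str
    keyword: str
    uri: str
    host: str
    ua: str
    referer: str = ""
    cookie: str = ""
    content: str = ""


# Assumption: the keyword is the parameter name directly before the number,
# as in "mobile=13812345678".
KEY_BEFORE_NUMBER = re.compile(r"([A-Za-z_][\w.-]*)[=:](1[3-9]\d{9})(?!\d)")


def build_records(ts, account, uri, host, ua, **extra):
    """Return one record per (keyword, number) pair found in the decoded URI."""
    return [
        CleanedRecord(ts, account, number, key, uri, host, ua, **extra)
        for key, number in KEY_BEFORE_NUMBER.findall(uri)
    ]
```

Binding the broadband account into every record is what later allows counting per account, per keyword, and per application.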
(2) big data analysis module
(1) Establishing the analysis library using big data analysis capabilities
A complete data packet essentially always contains fields such as uri, ua, and host. The host and ua fields that come from the same data packet as a phone number can be used to judge whether the phone number is accurate, and hence whether the phone number feature keyword in that data packet is accurate.
The UA (User-Agent, that is, the user terminal information) field enables the server to identify the terminal type, operating system and version, CPU type, browser and version, browser rendering engine, browser language, browser plug-ins, and so on, used by the user.
For this purpose, this application example can use the WebMagic crawler framework to crawl relevant information in advance as analysis feature data with which to create the analysis library. The analysis library is formed mainly through big data analysis capabilities: following data modeling methods, network data crawled over a long period is compared with DPI-parsed data, accumulated, and consolidated, and the library is formed by mathematical analysis, which avoids the inaccuracy of an analysis library built from a single data source. Different information libraries can also be constructed within the analysis library for different kinds of information.
There are currently many mobile phone terminal brands and models on the market, and the characteristic information of each terminal differs to some degree. If matching were done by phone number feature keyword alone, many phone numbers would be misjudged, making the obtained phone numbers inaccurate.
In this application example, terminal information from online electronics stores can be crawled in advance using the WebMagic crawler framework, and the information obtained is added to the terminal information library of the analysis library. The terminal information library thus formed records analysis feature data of user terminals and can be used to support mathematical analysis and calculation on user behavior.
In this application example, the correspondence between host names and app names can also be crawled in advance using the WebMagic crawler framework, and the information obtained is added to the host name database of the analysis library. The host name information library thus formed records host-related analysis feature data and can be used to support mathematical analysis and calculation on user behavior.
(2) The big data analysis module analyzes the data list after DPI data cleaning
With the explosive growth in home broadband customers, traditional data processing and analysis methods cannot cope with the massive volume of home broadband internet access records.
This application example uses the powerful distributed computing capability of the Hadoop platform. A MapReduce script is developed to match the ua field of each internet access record: a packaged terminal matching function calls the corresponding analysis features in the analysis library to judge which terminal model the data packet containing the phone number keyword comes from. When a non-mobile-phone terminal model is matched, the 11-digit number is judged not to be a real phone number but merely a string of digits that happens to match the phone number features. It follows that what this phone number feature keyword yields is not a phone number, and the phone number feature keyword is invalid.
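The terminal matching function can be sketched in plain Python as follows; in the described system the same logic would run inside a MapReduce job. The contents of `TERMINAL_DB` are invented, substring matching on the UA is a simplification, and dropping records whose UA matches nothing is an assumption that the description does not settle.

```python
# Hypothetical terminal information library: UA token -> (model, is_mobile_phone).
TERMINAL_DB = {
    "iphone": ("Apple iPhone", True),
    "android": ("Android handset", True),  # simplification: ignores tablets and TVs
    "windows nt": ("Windows PC", False),
    "macintosh": ("Mac", False),
    "smarttv": ("Smart TV", False),
}


def match_terminal(ua: str, terminal_db=TERMINAL_DB):
    """Return (model, is_mobile_phone) for the first token found in the UA, else None."""
    lowered = ua.lower()
    for token, info in terminal_db.items():
        if token in lowered:
            return info
    return None


def from_mobile_phone(ua: str) -> bool:
    """True only when the UA matches a known mobile phone terminal."""
    info = match_terminal(ua)
    return info is not None and info[1]
```

A record whose UA fails this check would be deleted from the data list, and the keyword that produced its 11-digit string would not be promoted on the strength of that record.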
The host field information obtained after DPI cleaning represents the host name of the data packet carrying the phone number feature keyword. Because the present invention judges the accuracy of phone number keyword data through analysis along the application dimension, this application example crawls the correspondence between host names and app names in advance using the WebMagic crawler framework and writes MapReduce code that matches the host field against the host information library. The host matching MapReduce job is packaged into a jar and run with the hadoop jar command on the Hadoop cluster, and the packaged host matching function judges which app a phone number keyword comes from. This effectively maps the many different host names produced by one application to that same application, records and rejects noise data from unknown applications, and helps the later analysis compute the number of different apps in which a phone number appears under the same broadband account, improving the extraction accuracy of phone number feature keywords.
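The host matching function can be sketched as a lookup that tries the full host name and then each parent domain, so that several host names of one application resolve to the same app. The entries in `HOST_DB` and the function name `app_for_host` are hypothetical, and the sketch is plain Python rather than MapReduce code.

```python
# Hypothetical host name library: host name -> app name.
HOST_DB = {
    "shop.example.com": "ExampleShop",
    "cdn-shop.example.net": "ExampleShop",
    "video.example.org": "ExampleVideo",
}


def app_for_host(host: str, host_db=HOST_DB):
    """Return the app for the host or its nearest known parent domain, else None."""
    name = host.lower().split(":")[0].rstrip(".")  # drop port and trailing dot
    labels = name.split(".")
    for i in range(len(labels)):
        candidate = ".".join(labels[i:])
        if candidate in host_db:
            return host_db[candidate]
    return None
```

Records for which the lookup returns nothing correspond to the unknown applications that the description treats as noise.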
In the above data processing, to ensure the accuracy of the data, the data analysis module uses two dimensions, together with big data analysis capabilities, to improve the accuracy of the analyzed phone number keywords.
A. time dimension
A phone number digit string washed out by DPI may actually come from a timestamp of some application or from temporarily reported data, so noise data of this kind is transient and time-limited. This method therefore accumulates data: the four-tuple of broadband account, phone number, phone number feature keyword, and app name is used as the key (unique identifier), data is accumulated daily, and the Hive distributed database in Hadoop is used to compute the number of days on which each four-tuple occurs. Computing the number of days on which a phone number feature keyword occurs from this mass of data, for use in data analysis, effectively reduces redundant data and relieves the storage pressure on the system, and finally yields a relatively lightweight data table that accumulates every day on which the program has run.
By setting a time threshold, transient, time-limited noise data is effectively filtered out. A genuine phone number that was not adopted because it occurred on too few days is classified into the phone number feature keyword library once continued accumulation of data brings it above the time threshold.
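The time dimension count can be sketched in memory as follows; the described system performs the equivalent aggregation with Hive on Hadoop. The function names and the record layout (account, number, keyword, app, day) are assumptions, and the comparison uses "reaches the threshold" as worded in the claims.

```python
from collections import defaultdict


def days_seen(records):
    """Map each (account, number, keyword, app) four-tuple to its count of distinct days."""
    days = defaultdict(set)
    for account, number, keyword, app, day in records:
        days[(account, number, keyword, app)].add(day)
    return {key: len(seen) for key, seen in days.items()}


def passes_time_threshold(records, min_days):
    """Return the four-tuples seen on at least min_days distinct days."""
    return {key for key, n in days_seen(records).items() if n >= min_days}
```

Storing only the four-tuple and its day count, rather than every raw record, is what keeps the accumulated table lightweight.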
B. Application dimension
A phone number digit string washed out by DPI may actually be the proprietary ID number that some application uses to identify a user. In noise data of this kind, the two-tuple of broadband account and phone number is in a one-to-one relationship with the application; that is, the two-tuple of such a noise data packet corresponds to exactly one application. In view of this characteristic, this method counts the number of different apps in which a phone number appears under the same broadband account, using the Hive database in Hadoop to compute, in a distributed manner, the number of different apps in which each two-tuple occurs. Finally, using the two-tuple of broadband account and phone number as the key (unique identifier), matching is performed against the table filtered along the time dimension, the app count element is filled in, existing app counts are accumulated, and the app counts of all two-tuples are updated.
By setting an application threshold, the proprietary ID numbers that applications use to identify users are effectively filtered out. A genuine phone number that was not adopted because it was seen in too few applications is classified into the phone number feature keyword library once continued accumulation of data brings it above the application threshold.
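The application dimension count admits a similar in-memory sketch, again standing in for the Hive computation. The function names are hypothetical, and the comparison uses "exceeds the threshold" as worded in the claims.

```python
from collections import defaultdict


def app_counts(four_tuples):
    """Map each (account, number) two-tuple to its count of distinct apps."""
    apps = defaultdict(set)
    for account, number, _keyword, app in four_tuples:
        apps[(account, number)].add(app)
    return {pair: len(seen) for pair, seen in apps.items()}


def passes_app_threshold(four_tuples, threshold):
    """Return the two-tuples seen in more than `threshold` distinct apps."""
    return {pair for pair, n in app_counts(four_tuples).items() if n > threshold}
```

An application-specific user ID appears under exactly one app, so any threshold of one or more removes it while a number reported by several apps survives.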
Experiments show that by using DPI, Hyperscan high-speed matching, Hadoop, and crawler technology, the phone number feature keywords of users under fixed network WiFi can be identified more quickly and accurately, so that phone number feature keywords are output both accurately and efficiently. This greatly improves the efficiency with which DPI technicians mine phone number feature keywords and can be of great help to telecom operators in network security auditing and traffic feature analysis. Specifically:
1) The present invention solves the earlier problem of missing data caused by phone numbers going online over WiFi. It effectively makes up for the traffic information lost when WiFi is used and enables operators to analyze mobile internet traffic behavior better;
2) The present invention combines deep DPI parsing with Hyperscan (a high-speed regular expression matching engine), distributed crawlers (based on distributed ETL), and related technologies to parse mobile WiFi internet access information accurately and quickly. It can rapidly output a phone number feature result table, solving the problems that conventional phone number mining is difficult, slow to produce output, and inaccurate;
3) The present invention uses mathematical models of the time dimension and the application dimension, together with big data analysis capabilities, to establish a long-term discrimination mechanism that ensures the accuracy of identification.
In addition, the phone number recognition method for network traffic data packets of the present invention can also be implemented by a phone number identification device for network traffic data packets. Fig. 3 shows a schematic block diagram of a phone number identification device according to an embodiment of the present invention. The functional modules of the phone number identification device 300 can be implemented by hardware, software, or a combination of hardware and software that carries out the principles of the invention. Those skilled in the art will appreciate that the functional modules described in Fig. 3 can be combined or divided into submodules to carry out the principles of the invention described above. The description herein therefore supports any possible combination, division, or further definition of the functional modules described herein.
The phone number identification device 300 shown in Fig. 3 can be used to implement the phone number recognition method shown in Fig. 1. Only the functional modules that the phone number identification device 300 may have, and the operations each functional module may perform, are briefly described below; for details, refer to the description given above in connection with Fig. 1, which is not repeated here.
As shown in Fig. 3, the phone number identification device 300 of the present invention may include a list extraction unit 310, a phone number judging unit 320, and a keyword writing unit 330.
The list extraction unit 310 is configured to extract a data list from a fixed network traffic data packet containing a doubtful phone number, the data list including the doubtful phone number, a keyword associated with the doubtful phone number, and HTTP message basic information;
The phone number judging unit 320 is configured to judge, based on the HTTP message basic information, whether the doubtful phone number is a phone number; and
The keyword writing unit 330 is configured to write, when the doubtful phone number is determined to be a phone number, the keyword associated with the doubtful phone number into the phone number feature keyword library as a phone number feature keyword.
Preferably, the phone number identification device 300 can also include a deleting unit. When the phone number judging unit 320 determines that the doubtful phone number is not a phone number, the deleting unit deletes the data list.
Preferably, the phone number judging unit 320 can judge, based on the user terminal information in the HTTP message basic information, whether the fixed network traffic data packet comes from a mobile phone. When the fixed network traffic data packet is judged not to come from a mobile phone, the deleting unit can delete the data list.
Preferably, the phone number identification device 300 can also include a database construction unit. The database construction unit can construct a terminal information database, and whether the fixed network traffic data packet comes from a mobile phone is judged based on whether the user terminal information in the HTTP message basic information matches a mobile phone terminal in the terminal information database.
Preferably, the phone number judging unit 320 can determine the application corresponding to the fixed network traffic data packet based on the host name in the HTTP message basic information. When the fixed network traffic data packet does not correspond to a predetermined application, the data list is deleted.
Preferably, the phone number identification device 300 can also include a database construction unit. The database construction unit can construct a host name database in which each host name is recorded in association with its corresponding application, and the application corresponding to the fixed network traffic data packet is determined based on the host name in the host name database that matches the host name in the HTTP message basic information.
Preferably, the phone number judging unit 320 can count the cumulative number of days or occurrences of four-tuple data, the four-tuple data comprising the fixed network broadband account, the doubtful phone number, the keyword, and an identifier of the application. When the cumulative number of days or occurrences of the four-tuple data reaches a first predetermined threshold, the doubtful phone number is determined to be a phone number.
Preferably, the phone number judging unit 320 can count the number of different applications to which the doubtful phone number corresponds under the same broadband account. When that number exceeds a second predetermined threshold, the doubtful phone number is determined to be a phone number.
Preferably, the phone number identification device 300 can also include a blacklist unit. When the keyword associated with the doubtful phone number belongs to a keyword blacklist, the blacklist unit can reject the keyword.
Preferably, the phone number identification device 300 can also include a screening unit. The screening unit can screen out, based on predetermined phone number rules, the fixed network traffic data packets that contain a doubtful phone number.
Preferably, the phone number identification device 300 can also include a keyword recognition unit. The keyword recognition unit can identify phone number feature keywords in fixed network traffic data packets based on the phone number feature keyword library, and determine the number associated with a phone number feature keyword to be a phone number.
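How a keyword recognition unit might apply the finished keyword library can be sketched as follows. The "keyword=number" and "keyword:number" layouts, the 11-digit pattern, and the function names are assumptions made for illustration; the description does not fix how a keyword and its number are laid out in the packet.

```python
import re


def compile_library(keywords):
    """Build one regex that finds '<keyword>=<number>' or '<keyword>:<number>'."""
    if not keywords:
        raise ValueError("keyword library is empty")
    # Longest keywords first so that a longer keyword wins over its own suffix.
    alternation = "|".join(
        re.escape(k) for k in sorted(keywords, key=len, reverse=True)
    )
    return re.compile(rf"(?<![\w.-])({alternation})[=:](1[3-9]\d{{9}})(?!\d)")


def identify(payload, library_regex):
    """Return (keyword, number) pairs whose keyword is in the library."""
    return [(m.group(1), m.group(2)) for m in library_regex.finditer(payload)]
```

With a library containing only trusted keywords, a number that follows an unlisted parameter name is ignored, which is the point of building the library before recognition.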
In summary, to achieve its goal of automatically extracting phone number features under home WiFi, the present invention relies mainly on an "automatic number extraction method", which comprises a DPI data cleaning model and a big data analysis and matching model. The key points are as follows:
(1) DPI data cleaning model and its output string
The DPI data cleaning model comprises the following cleaning steps. First, protocol analysis is performed on the sorted logs and the HTTP messages and RADIUS messages are sorted out, the purpose being to clean away interfering information (for example, by filtering out mail logs, FTP logs, and the like). Second, the key information of the messages is extracted: the HTTP messages are analyzed, the broadband accounts are extracted from the RADIUS messages and associated with them, and the fields are uniformly decoded; according to the phone number prefix rules, the mobile internet access messages carrying a phone number prefix are parsed out. Third, feature matching is performed against the phone number feature list: the phone number, the phone keyword, and the HTTP message basic information are matched first and the broadband account afterwards, forming an HTTP message information string in which the phone number, the phone keyword, and the broadband account are bound together.
(2) Data analysis and mining model
The host field information obtained after DPI cleaning represents the host name of the data packet carrying the phone number feature keyword. This method crawls the correspondence between host names and app names in advance using the WebMagic crawler framework to form an analysis feature library, and MapReduce code is written to perform matching against the feature model of the analysis feature library. The host matching MapReduce job is packaged into a jar and run with the hadoop jar command on the Hadoop cluster, and the packaged host matching function judges which app a phone number keyword comes from. This effectively maps the many different host names produced by one application to that same application, records and rejects noise data from unknown applications by means of the time dimension and the application dimension, and helps the later analysis compute the number of different apps in which a phone number appears under the same broadband account, improving the extraction accuracy of phone number feature keywords.
In addition, the phone number recognition method for network traffic data packets of the embodiment of the present invention described in connection with Fig. 1 can be implemented by a computing device. Fig. 4 shows a schematic diagram of the hardware structure of a computing device provided by an embodiment of the present invention.
The computing device may include a processor 401 and a memory 402 storing computer program instructions.
Specifically, the processor 401 may include a central processing unit (CPU) or an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), or may be configured as one or more integrated circuits that implement the embodiments of the present invention.
The memory 402 may include mass storage for data or instructions. By way of example and not limitation, the memory 402 may include a hard disk drive (Hard Disk Drive, HDD), a floppy disk drive, flash memory, an optical disc, a magneto-optical disc, magnetic tape, or a Universal Serial Bus (USB) drive, or a combination of two or more of these. Where appropriate, the memory 402 may include removable or non-removable (or fixed) media. Where appropriate, the memory 402 may be internal or external to the data processing device. In a particular embodiment, the memory 402 is non-volatile solid-state memory. In a particular embodiment, the memory 402 includes read-only memory (ROM). Where appropriate, the ROM may be mask-programmed ROM, programmable ROM (PROM), erasable PROM (EPROM), electrically erasable PROM (EEPROM), electrically alterable ROM (EAROM), or flash memory, or a combination of two or more of these.
The processor 401 reads and executes the computer program instructions stored in the memory 402 to implement any one of the phone number recognition methods for network traffic data packets in the above embodiments.
In one example, the computing device may also include a communication interface 403 and a bus 410. As shown in Fig. 4, the processor 401, the memory 402, and the communication interface 403 are connected by the bus 410 and communicate with one another through it.
The communication interface 403 is mainly used to implement communication between the modules, devices, units, and/or equipment in the embodiments of the present invention.
The bus 410 includes hardware, software, or both, and couples the components of the computing device to one another. By way of example and not limitation, the bus may include an Accelerated Graphics Port (AGP) or other graphics bus, an Enhanced Industry Standard Architecture (EISA) bus, a front side bus (FSB), a HyperTransport (HT) interconnect, an Industry Standard Architecture (ISA) bus, an InfiniBand interconnect, a low pin count (LPC) bus, a memory bus, a Micro Channel Architecture (MCA) bus, a Peripheral Component Interconnect (PCI) bus, a PCI-Express (PCI-X) bus, a Serial Advanced Technology Attachment (SATA) bus, a Video Electronics Standards Association local (VLB) bus, or another suitable bus, or a combination of two or more of these. Where appropriate, the bus 410 may include one or more buses. Although the embodiments of the present invention describe and illustrate a specific bus, the present invention contemplates any suitable bus or interconnect.
In addition, in connection with the phone number recognition method for network traffic data packets in the above embodiments, an embodiment of the present invention can provide a computer readable storage medium for its implementation. Computer program instructions are stored on the computer readable storage medium; when executed by a processor, the computer program instructions implement any one of the phone number recognition methods in the above embodiments.
It should be clear that the present invention is not limited to the specific configurations and processing described above and shown in the figures. For brevity, detailed descriptions of known methods are omitted here. In the above embodiments, several specific steps have been described and illustrated as examples. However, the method process of the present invention is not limited to the specific steps described and illustrated; those skilled in the art can, after understanding the spirit of the present invention, make various changes, modifications, and additions, or change the order of the steps.
The functional blocks shown in the structural block diagrams described above can be implemented as hardware, software, firmware, or a combination thereof. When implemented in hardware, they may be, for example, electronic circuits, application-specific integrated circuits (ASIC), appropriate firmware, plug-ins, function cards, and so on. When implemented in software, the elements of the present invention are the programs or code segments used to perform the required tasks. The programs or code segments can be stored in a machine readable medium, or transmitted over a transmission medium or communication link by a data signal carried in a carrier wave. A "machine readable medium" may include any medium capable of storing or transmitting information. Examples of machine readable media include electronic circuits, semiconductor memory devices, ROM, flash memory, erasable ROM (EROM), floppy disks, CD-ROMs, optical discs, hard disks, fiber optic media, radio frequency (RF) links, and so on. Code segments can be downloaded via a computer network such as the Internet or an intranet.
It should also be noted that the exemplary embodiments mentioned in the present invention describe some methods or systems on the basis of a series of steps or devices. However, the present invention is not limited to the order of the steps above; that is, the steps may be performed in the order mentioned in the embodiments or in a different order, or several steps may be performed simultaneously.
The above is merely a specific embodiment of the present invention. Those skilled in the art will clearly understand that, for convenience and brevity of description, the specific working processes of the systems, modules, and units described above can be found in the corresponding processes of the foregoing method embodiments and are not repeated here. It should be understood that the scope of protection of the present invention is not limited thereto; any person skilled in the art can readily conceive of various equivalent modifications or substitutions within the technical scope disclosed by the present invention, and these modifications or substitutions shall fall within the scope of protection of the present invention.

Claims (14)

1. A phone number recognition method for fixed network traffic data packets, characterized in that the method comprises:
extracting a data list from a fixed network traffic data packet containing a doubtful phone number, the data list including the doubtful phone number, a keyword associated with the doubtful phone number, and HTTP message basic information;
judging, based on the HTTP message basic information, whether the doubtful phone number is a phone number; and
when the doubtful phone number is determined to be a phone number, writing the keyword associated with the doubtful phone number into a phone number feature keyword library as a phone number feature keyword.
2. The method according to claim 1, characterized in that the method further comprises:
when the doubtful phone number is determined not to be a phone number, deleting the data list.
3. The method according to claim 1, characterized in that the step of judging whether the doubtful phone number is a phone number comprises:
judging, based on the user terminal information in the HTTP message basic information, whether the fixed network traffic data packet comes from a mobile phone; and
when the fixed network traffic data packet is determined not to come from a mobile phone, deleting the data list.
4. The method according to claim 3, characterized in that the step of judging, based on the user terminal information in the HTTP message basic information, whether the fixed network traffic data packet comes from a mobile phone comprises:
constructing a terminal information database; and
judging whether the fixed network traffic data packet comes from a mobile phone based on whether the user terminal information in the HTTP message basic information matches a mobile phone terminal in the terminal information database.
5. The method according to claim 1, characterized in that the step of judging whether the doubtful phone number is a phone number comprises:
determining, based on the host name in the HTTP message basic information, the application corresponding to the fixed network traffic data packet;
when the fixed network traffic data packet does not correspond to a predetermined application, deleting the data list.
6. The method according to claim 5, characterized in that the step of determining, based on the host name in the HTTP message basic information, the application corresponding to the fixed network traffic data packet comprises:
constructing a host name database in which each host name is recorded in association with its corresponding application; and
determining the application corresponding to the fixed network traffic data packet based on the host name in the host name database that matches the host name in the HTTP message basic information.
7. The method according to claim 5, characterized in that the data list further includes a fixed network broadband account, and the step of judging whether the doubtful phone number is a phone number further comprises:
counting the cumulative number of days or occurrences of four-tuple data, the four-tuple data comprising the fixed network broadband account, the doubtful phone number, the keyword, and an identifier of the application; and
when the cumulative number of days or occurrences of the four-tuple data reaches a first predetermined threshold, determining that the doubtful phone number is a phone number.
8. The method according to claim 5, characterized in that the step of judging whether the doubtful phone number is a phone number comprises:
counting the number of different applications to which the doubtful phone number corresponds under the same broadband account; and
when that number exceeds a second predetermined threshold, determining that the doubtful phone number is a phone number.
9. The method according to claim 1, characterized in that the method further comprises:
when the keyword associated with the doubtful phone number belongs to a keyword blacklist, rejecting the keyword.
10. The method according to claim 1, characterized in that the method further comprises:
screening out, based on phone number rules, the fixed network traffic data packets that contain a doubtful phone number.
11. The method according to claim 1, characterized in that the method further comprises:
identifying, based on the phone number feature keyword library, phone number feature keywords in fixed network traffic data packets; and
determining the number associated with a phone number feature keyword to be a phone number.
12. A phone number identification device for fixed network traffic data packets, characterized in that the device comprises:
a list extraction unit, configured to extract a data list from a fixed network traffic data packet containing a doubtful phone number, the data list including the doubtful phone number, a keyword associated with the doubtful phone number, and HTTP message basic information;
a phone number judging unit, configured to judge, based on the HTTP message basic information, whether the doubtful phone number is a phone number; and
a keyword writing unit, configured to write, when the doubtful phone number is determined to be a phone number, the keyword associated with the doubtful phone number into a phone number feature keyword library as a phone number feature keyword.
13. A computing device, characterized by comprising: at least one processor, at least one memory, and computer program instructions stored in the memory, wherein the method according to any one of claims 1-11 is implemented when the computer program instructions are executed by the processor.
14. A computer readable storage medium on which computer program instructions are stored, characterized in that the method according to any one of claims 1-11 is implemented when the computer program instructions are executed by a processor.
CN201711459451.1A 2017-12-28 2017-12-28 Phone number recognition methods, device, equipment and medium Pending CN109981389A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711459451.1A CN109981389A (en) 2017-12-28 2017-12-28 Phone number recognition methods, device, equipment and medium

Publications (1)

Publication Number Publication Date
CN109981389A true CN109981389A (en) 2019-07-05

Family

ID=67074717

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711459451.1A Pending CN109981389A (en) 2017-12-28 2017-12-28 Phone number recognition methods, device, equipment and medium

Country Status (1)

Country Link
CN (1) CN109981389A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111104579A (en) * 2019-12-31 2020-05-05 北京神州绿盟信息安全科技股份有限公司 Identification method and device for public network assets and storage medium
CN112367663A (en) * 2019-07-23 2021-02-12 ***通信集团广东有限公司 Method, device and equipment for determining broadband access user number
CN112583832A (en) * 2020-12-14 2021-03-30 北京鼎普科技股份有限公司 DPI-based application layer protocol identification method and system
CN113127767A (en) * 2019-12-31 2021-07-16 ***通信集团四川有限公司 Mobile phone number extraction method and device, electronic equipment and storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102469117A (en) * 2010-11-08 2012-05-23 ***通信集团广东有限公司 Method and device for identifying abnormal access action
CN104283918A (en) * 2013-07-05 2015-01-14 ***通信集团浙江有限公司 Method and system for obtaining wireless local area network (WLAN) terminal types
CN106452859A (en) * 2016-09-29 2017-02-22 南京邮电大学 Automatic cell phone number characteristic keyword extraction method under fixed network WiFi environment
CN106991316A (en) * 2016-01-21 2017-07-28 滴滴(中国)科技有限公司 A kind of method for identifying ID and device

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112367663A (en) * 2019-07-23 2021-02-12 ***通信集团广东有限公司 Method, device and equipment for determining broadband access user number
CN112367663B (en) * 2019-07-23 2023-04-07 ***通信集团广东有限公司 Method, device and equipment for determining broadband access user number
CN111104579A (en) * 2019-12-31 2020-05-05 北京神州绿盟信息安全科技股份有限公司 Identification method and device for public network assets and storage medium
CN113127767A (en) * 2019-12-31 2021-07-16 ***通信集团四川有限公司 Mobile phone number extraction method and device, electronic equipment and storage medium
CN113127767B (en) * 2019-12-31 2023-02-10 ***通信集团四川有限公司 Mobile phone number extraction method and device, electronic equipment and storage medium
CN112583832A (en) * 2020-12-14 2021-03-30 北京鼎普科技股份有限公司 DPI-based application layer protocol identification method and system

Similar Documents

Publication Publication Date Title
CN103297435B Abnormal access behavior detection method and system based on WEB logs
CN109981389A (en) Phone number recognition methods, device, equipment and medium
CN110620770B (en) Method and device for analyzing network black product account number
US11537751B2 (en) Using machine learning algorithm to ascertain network devices used with anonymous identifiers
CN102420723A (en) Anomaly detection method for various kinds of intrusion
CN107222511B (en) Malicious software detection method and device, computer device and readable storage medium
CN111431939A (en) CTI-based SDN malicious traffic defense method and system
CN111371778B (en) Attack group identification method, device, computing equipment and medium
CN108334758A Method, device and equipment for detecting unauthorized user behavior
CN110691080A (en) Automatic tracing method, device, equipment and medium
KR102086936B1 (en) User data sharing method and device
CN110020161B (en) Data processing method, log processing method and terminal
CN107209834A (en) Malicious communication pattern extraction apparatus, malicious communication schema extraction system, malicious communication schema extraction method and malicious communication schema extraction program
CN114006765A (en) Method and device for detecting sensitive information in message and electronic equipment
CN116915450A (en) Topology pruning optimization method based on multi-step network attack recognition and scene reconstruction
CN114422211A (en) HTTP malicious traffic detection method and device based on graph attention network
CN108055227B (en) WAF unknown attack defense method based on site self-learning
CN105528352B Method for establishing a correspondence between a mobile communication subscriber and its network account information
CN106528805A User-based intelligent analysis and mining method for mobile internet malicious program URLs
CN110336798A (en) Message matching filtering method and device based on DPI
CN108199878B (en) Personal identification information identification system and method in high-performance IP network
CN109672586A (en) A kind of DPI service traffics recognition methods, device and computer readable storage medium
Huang et al. On the understanding of interdependency of mobile app usage
CN108650145A Automatic phone number feature extraction method under home broadband WiFi
CN116738369A (en) Traffic data classification method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20190705)