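WO2017000761A1 - Method and apparatus for extracting feature information of a terminal device - Google Patents

Method and apparatus for extracting feature information of a terminal device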

Info

Publication number
WO2017000761A1
Authority
WO
WIPO (PCT)
Prior art keywords
feature information
terminal device
string
feature
library file
Prior art date
Application number
PCT/CN2016/085592
Other languages
English (en)
French (fr)
Inventor
李燕
Original Assignee
中兴通讯股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 中兴通讯股份有限公司
Publication of WO2017000761A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/08 Configuration management of networks or network elements
    • H04L41/085 Retrieval of network configuration; Tracking network configuration history

Definitions

  • This document relates to, but is not limited to, the field of communication technologies, and in particular, to a method and an apparatus for extracting feature information of a terminal device.
  • For an operator, knowing the feature information of the terminal devices in the communication network, including information about the terminal devices and their operating systems, is useful in two ways: it helps the operator understand its customers and optimize network services, and it lets the operator collect the market share of each type of terminal device and operating system to better promote products.
  • However, the techniques in the related art for extracting terminal device and operating system information are not mature enough to extract this information accurately.
  • Embodiments of the present invention provide a method and an apparatus for extracting feature information of a terminal device, which can accurately extract terminal device information and thereby better optimize network services.
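  • An embodiment of the present invention provides a method for extracting feature information of a terminal device, including: constructing, according to the feature information of terminal devices and the user agent (UA) string corresponding to each piece of feature information, a UA feature library file that identifies the mapping between the feature information and the UA strings; collecting a request message sent by a first terminal device and obtaining the first UA string of the first terminal device from the request message; and searching the UA feature library file, according to the first UA string, for the first feature information corresponding to the first UA string to obtain the feature information of the first terminal device.
  • Optionally, the feature information of the terminal device includes feature information obtained from multiple dimensions, and the dimensions include at least one of the terminal device system, the terminal device manufacturer, and the terminal device model.
  • Optionally, the request message includes a Hypertext Transfer Protocol (HTTP) message.
  • Optionally, searching the UA feature library file for the first feature information corresponding to the first UA string includes: applying a multi-pattern matching (AC) algorithm to the first UA string to look up the corresponding first feature information in the UA feature library file.
  • Optionally, the extraction method further includes: if the first UA string corresponds to two or more pieces of first feature information that all belong to the same dimension, obtaining the length of each piece of first feature information, and selecting the longest one as the feature information of the first terminal device.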
  • An embodiment of the present invention further provides an apparatus for extracting feature information of a terminal device, including:
  • a construction module, configured to construct, according to the feature information of terminal devices and the user agent (UA) string corresponding to each piece of feature information, a UA feature library file that identifies the mapping between the feature information and the UA strings;
  • a collection module, configured to collect a request message sent by a first terminal device and obtain the first UA string of the first terminal device from the request message;
  • a determining module, configured to search the UA feature library file, according to the first UA string, for the first feature information corresponding to the first UA string, and to use the first feature information found as the feature information of the first terminal device.
  • Optionally, the feature information of the terminal device includes feature information obtained from multiple dimensions, and the dimensions include at least one of the terminal device system, the terminal device manufacturer, and the terminal device model.
  • Optionally, the collection module is configured to collect a Hypertext Transfer Protocol (HTTP) message sent by the first terminal device, treat the HTTP message as the request message, and obtain the first UA string of the first terminal device from it.
  • Optionally, the determining module is configured to apply a multi-pattern matching (AC) algorithm to the first UA string to look up the corresponding first feature information in the UA feature library file.
  • Optionally, the extraction apparatus further includes: an obtaining module, configured to obtain the length of each piece of first feature information if the first UA string corresponds to two or more pieces of first feature information that all belong to the same dimension; and an information determining module, configured to select the longest first feature information as the feature information of the first terminal device.
  • Compared with the related art, the technical solution provided by the embodiment of the present invention includes: constructing a UA feature library file that identifies the mapping between the feature information of terminal devices and UA strings; and searching the UA feature library file, according to the UA string to be analyzed of the first terminal device, for the matching first feature information to obtain the feature information of the first terminal device. By parsing the UA string, the method obtains the feature information of the terminal device and thereby extracts it accurately.
  • FIG. 1 is a flowchart showing a method for extracting feature information of a terminal device according to an embodiment of the present invention
  • FIG. 2 is a schematic diagram showing the structure of a goto table of an AC algorithm
  • FIG. 3 is a schematic structural diagram of another goto table of an AC algorithm according to an embodiment of the present invention.
  • FIG. 4 is a flowchart of constructing the UA feature library file in a method for extracting feature information of a terminal device according to an embodiment of the present invention
  • FIG. 5 is a flowchart of extracting feature information in a method for extracting feature information of a terminal device according to an embodiment of the present invention
  • FIG. 6 is a block diagram showing the structure of an apparatus for extracting feature information of a terminal device according to an embodiment of the present invention.
  • To address the problem in the related art that operators cannot accurately obtain the feature information of terminal devices, the embodiment of the present invention provides a method and an apparatus for extracting feature information of a terminal device. A UA feature library file identifying the mapping between the feature information of terminal devices and user agent (UA) strings is constructed first; the UA feature library file is then searched, according to the UA string to be analyzed of the first terminal device, for the matching first feature information, which gives the feature information of the first terminal device. By parsing the UA string, the extraction method obtains the feature information of the terminal device and meets the operator's need to acquire it.
  • an embodiment of the present invention provides a method for extracting feature information of a terminal device, including:
  • Step 11: Construct, according to the feature information of terminal devices and the user agent (UA) string corresponding to each piece of feature information, a UA feature library file that identifies the mapping between the feature information and the UA strings.
  • Note that this UA feature library file can be built from a feature library in the related art; constructing a feature library file is a routine technique for those skilled in the art and is not described further here.
  • Step 12: Collect a request message sent by the first terminal device, and obtain the first UA string of the first terminal device from the request message.
  • Step 13: Search the UA feature library file, according to the first UA string, for the first feature information corresponding to the first UA string, and use the first feature information found as the feature information of the first terminal device.
  • The user agent (UA) is part of the HTTP protocol and is one of the header fields. Its information covers the hardware platform, system software, application software, and user preferences.
  • For example, a User-Agent may be: AppStore/2.0iOS/7.1.2model/iPhone3,1build/11D257(4;dt:27). Terminal device information can be extracted accurately from the UA.
  • The parameter types contained in a UA include the terminal system platform, the terminal product model, the user application, and so on; one or several of them may be present. For example, AppStore/2.0 is user application information, iOS/7.1.2 is the terminal system platform and version, and iPhone3 is user terminal information.
  • In the embodiment, the feature information of a terminal device is information that can identify the terminal device, for example the serial number and name of the feature information.
  • Each piece of feature information has a corresponding UA string, which is the feature string that indicates that feature information.
  • The UA feature library file stores each UA string together with its corresponding feature information in a one-to-one mapping, which makes lookup in the subsequent steps easier.
  • Optionally, in step 12 the first UA string is taken from the collected request message of the first terminal device (which may be a data-domain message); the UA feature library file is then searched according to the first UA string to obtain the first feature information, which is the feature information of the first terminal device.
  • Optionally, to describe a terminal device fully, the feature information of the terminal device includes feature information obtained from multiple dimensions, and the dimensions include at least one of the terminal device system, the terminal device manufacturer, and the terminal device model.
  • Obtaining the feature information from only one of these dimensions is also an optional embodiment of the present invention.
  • the terminal device system generally includes the version number of the terminal system
  • the terminal device manufacturer generally includes the name of the terminal device manufacturer
  • the terminal device model generally includes the model number of the terminal device.
  • Optionally, step 12 includes:
  • Step 121: Collect a Hypertext Transfer Protocol (HTTP) message sent by the first terminal device, and obtain the first UA string of the first terminal device from the HTTP message.
  • Because the UA is part of the HTTP protocol, the embodiment collects only the HTTP messages sent by the first terminal device and obtains the UA string from an HTTP message (for example, the request (REQ) message). This UA string is the first UA string; it includes the system software, hardware platform, application software, and/or user preferences of the first terminal device.
  • Because the embodiment obtains feature information from the UA string in HTTP, it is more accurate than the related art.
  • Optionally, step 13 includes:
  • Step 131: Apply a multi-pattern matching (AC) algorithm to the first UA string to look up the corresponding first feature information in the UA feature library file.
  • In an optional embodiment, a lightweight multi-pattern matching engine based on the AC algorithm recognizes the UA string and produces the recognition result for the device information (that is, the first feature information).
  • The AC algorithm is a classic multi-pattern matching algorithm built from three parts: a goto table, a fail table, and an output table, which are the standard tables of the algorithm itself. For a given text of length n and a pattern set P{p1, p2, ..., pm}, it finds all target patterns in the text in O(n) time, independent of the size m of the pattern set.
  • The pattern strings matched in the output table are the first feature information corresponding to the first UA string in the embodiment of the present invention.
  • The method of the embodiment of the present invention further includes:
  • Step 14: If the first UA string corresponds to two or more pieces of first feature information and all of them belong to the same dimension, obtain the length of each piece of first feature information.
  • Step 15: Select the longest first feature information as the feature information of the first terminal device.
  • Steps 14 and 15 handle the case in which one UA string corresponds to several pieces of first feature information: the embodiment selects the longest of them as the feature information of the first terminal.
  • For example, if the matching feature information obtained from the first UA string is iphone3 and iphone3,1build, then iphone3,1build is selected as the feature information of the first terminal device.
  • Selecting the longest first feature information rests on the principle that a longer match is more precise, and it is only one optional embodiment of the present invention.
  • Selecting the shortest first feature information may equally suit some application scenarios and also falls within the protection scope of the embodiment, as do other preset selection rules.
  • As shown in FIG. 4, the UA feature library file is constructed as follows in the embodiment of the present invention.
  • Assume the file covers the feature information of terminal devices in three dimensions: the device system (system), the terminal device manufacturer (vendor), and the terminal device model (brand).
  • The steps are as follows:
  • Step 401: Record the system, vendor, and brand feature information of terminal devices in separate tables in advance.
  • Each table mainly contains TOKEN_ID (serial number), NAME (recognition result), and PATTERN (UA string) data.
  • Step 402: Construct the system table, and obtain and fill in the TOKEN_ID, NAME, and PATTERN data of each record in the table.
  • Step 403: Construct the vendor table, and obtain and fill in the TOKEN_ID, NAME, and PATTERN data of each record in the table.
  • Step 404: Construct the brand table, and obtain and fill in the TOKEN_ID, NAME, and PATTERN data of each record in the table.
  • Step 405: Generate the UA feature library file from the system, vendor, and brand tables. The process ends.
  • As shown in FIG. 5, the feature information of a terminal device is extracted as follows in the embodiment of the present invention.
  • Assume the file covers the feature information of terminal devices in three dimensions: the device system (system), the terminal device manufacturer (vendor), and the terminal device model (brand).
  • When the device feature library covers the terminal device feature information contained in the UA string, the steps are as follows:
  • Step 501: Load the UA feature library file that was built.
  • Step 502: Collect an HTTP message, and obtain the first UA string from the HTTP message (for example, the request (REQ) message).
  • Here the HTTP message is one kind of request message.
  • Step 503: Apply the AC algorithm to the first UA string to look up the corresponding first feature information in the UA feature library file.
  • Step 504: Iterate over the first feature information found for the first UA string and compute the length of each piece; while iterating, the method may keep a record of the ID and length of the longest piece found so far.
  • Step 505: Select the longest first feature information as the feature information of the first terminal device.
  • Step 506: Return the version number of the device system of the terminal device according to the feature information of the first terminal device.
  • Step 507: Return the terminal device model of the terminal device, for example the brand, according to the feature information of the first terminal device.
  • Step 508: Return the terminal device manufacturer of the terminal device according to the feature information of the first terminal device. The process ends.
  • The foregoing extraction method may be carried out by a server or a collection and analysis device at the operator.
  • The embodiment of the present invention further provides a computer storage medium storing computer-executable instructions, which are used to execute the above method for extracting feature information of a terminal device.
  • As shown in FIG. 6, the embodiment of the present invention further provides an apparatus for extracting feature information of a terminal device, including:
  • a construction module 61, configured to construct, according to the feature information of terminal devices and the user agent (UA) string corresponding to each piece of feature information, a UA feature library file that identifies the mapping between the feature information and the UA strings;
  • a collection module 62, configured to collect a request message sent by the first terminal device and obtain the first UA string of the first terminal device from the request message;
  • a determining module 63, configured to search the UA feature library file, according to the first UA string, for the first feature information corresponding to the first UA string, and to use the first feature information found as the feature information of the first terminal device.
  • Optionally, the feature information of the terminal device includes feature information obtained from multiple dimensions, and the dimensions include at least one of the terminal device system, the terminal device manufacturer, and the terminal device model.
  • Optionally, the collection module 62 is configured to collect a Hypertext Transfer Protocol (HTTP) message sent by the first terminal device, treat the HTTP message as the request message, and obtain the first UA string of the first terminal device from it.
  • Optionally, the determining module 63 is configured to apply a multi-pattern matching (AC) algorithm to the first UA string to look up the corresponding first feature information in the UA feature library file.
  • Optionally, the extraction apparatus further includes:
  • an obtaining module, configured to obtain the length of each piece of first feature information if the first UA string corresponds to two or more pieces of first feature information and all of them belong to the same dimension;
  • an information determining module, configured to select the longest first feature information as the feature information of the first terminal device.
  • The extraction apparatus provided by the foregoing embodiment applies the extraction method described above, so all embodiments of the extraction method apply to the apparatus and achieve the same or similar benefits.
  • Each module or unit in the foregoing embodiment may be implemented in hardware, for example as an integrated circuit that performs the corresponding function, or as a software function module, for example as programs or instructions stored in a memory and executed by a processor.
  • The invention is not limited to any specific combination of hardware and software.
  • The above technical solution achieves accurate extraction of the feature information of a terminal device.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

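A method and an apparatus for extracting feature information of a terminal device, including: constructing, according to the feature information of terminal devices and the user agent (UA) string corresponding to each piece of feature information, a UA feature library file that identifies the mapping between the feature information and the UA strings; collecting a request message sent by a first terminal device and obtaining the first UA string of the first terminal device from the request message; and searching the UA feature library file, according to the first UA string, for the first feature information corresponding to the first UA string, and using the first feature information found as the feature information of the first terminal device. The method of the embodiment of the present invention achieves accurate extraction of the feature information of a terminal device.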

Description

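Method and apparatus for extracting feature information of a terminal device
Technical Field
This document relates to, but is not limited to, the field of communication technologies, and in particular to a method and an apparatus for extracting feature information of a terminal device.
Background
With the rapid development of mobile communication technologies and services, a large number of terminal device manufacturers have emerged at home and abroad, and terminal devices and their operating systems are also developing rapidly.
For an operator, knowing the feature information of the terminal devices in its communication network, including information about the terminal devices and their operating systems, is useful in two ways. On the one hand, it helps the operator understand its customers and optimize network services; on the other hand, the operator can collect the market share of each type of terminal device and operating system to better promote products. However, the techniques in the related art for extracting terminal device and operating system information are not mature enough to extract this information accurately.
Summary
The following is an overview of the subject matter described in detail herein. This overview is not intended to limit the scope of protection of the claims.
Embodiments of the present invention provide a method and an apparatus for extracting feature information of a terminal device, which can accurately extract terminal device information and thereby better optimize network services.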
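An embodiment of the present invention provides a method for extracting feature information of a terminal device, including:
constructing, according to the feature information of terminal devices and the user agent (UA) string corresponding to each piece of feature information, a UA feature library file that identifies the mapping between the feature information and the UA strings;
collecting a request message sent by a first terminal device, and obtaining the first UA string of the first terminal device from the request message;
searching the UA feature library file, according to the first UA string, for the first feature information corresponding to the first UA string to obtain the feature information of the first terminal device.
Optionally, the feature information of the terminal device includes feature information obtained from multiple dimensions;
the dimensions include at least one of the terminal device system, the terminal device manufacturer, and the terminal device model.
Optionally, the request message includes a Hypertext Transfer Protocol (HTTP) message.
Optionally, searching the UA feature library file for the first feature information corresponding to the first UA string includes:
applying a multi-pattern matching (AC) algorithm to the first UA string to look up the corresponding first feature information in the UA feature library file.
Optionally, the extraction method further includes:
if the first UA string corresponds to two or more pieces of first feature information and all of them belong to the same dimension, obtaining the length of each piece of first feature information;
selecting the longest first feature information as the feature information of the first terminal device.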
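An embodiment of the present invention further provides an apparatus for extracting feature information of a terminal device, including:
a construction module, configured to construct, according to the feature information of terminal devices and the user agent (UA) string corresponding to each piece of feature information, a UA feature library file that identifies the mapping between the feature information and the UA strings;
a collection module, configured to collect a request message sent by a first terminal device and obtain the first UA string of the first terminal device from the request message;
a determining module, configured to search the UA feature library file, according to the first UA string, for the first feature information corresponding to the first UA string, and to use the first feature information found as the feature information of the first terminal device.
Optionally, the feature information of the terminal device includes feature information obtained from multiple dimensions;
the dimensions include at least one of the terminal device system, the terminal device manufacturer, and the terminal device model.
Optionally, the collection module is configured to:
collect a Hypertext Transfer Protocol (HTTP) message sent by the first terminal device, treat the HTTP message as the request message, and obtain the first UA string of the first terminal device from the HTTP message.
Optionally, the determining module is configured to:
apply a multi-pattern matching (AC) algorithm to the first UA string to look up the corresponding first feature information in the UA feature library file.
Optionally, the extraction apparatus further includes:
an obtaining module, configured to obtain the length of each piece of first feature information if the first UA string corresponds to two or more pieces of first feature information and all of them belong to the same dimension;
an information determining module, configured to select the longest first feature information as the feature information of the first terminal device.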
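Compared with the related art, the technical solution provided by the embodiment of the present invention includes: constructing a UA feature library file that identifies the mapping between the feature information of terminal devices and UA strings; and searching the UA feature library file, according to the UA string to be analyzed of the first terminal device, for the matching first feature information to obtain the feature information of the first terminal device. By parsing the UA string, the method obtains the feature information of the terminal device and thereby extracts it accurately.
Other aspects will become apparent upon reading and understanding the drawings and the detailed description.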
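Brief Description of the Drawings
FIG. 1 is a flowchart of a method for extracting feature information of a terminal device according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of the structure of a goto table of the AC algorithm;
FIG. 3 is a schematic diagram of the structure of another goto table of the AC algorithm according to an embodiment of the present invention;
FIG. 4 is a flowchart of constructing the UA feature library file in the method for extracting feature information of a terminal device according to an embodiment of the present invention;
FIG. 5 is a flowchart of extracting feature information in the method for extracting feature information of a terminal device according to an embodiment of the present invention;
FIG. 6 is a schematic diagram of the structure of an apparatus for extracting feature information of a terminal device according to an embodiment of the present invention.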
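Detailed Description
Embodiments of the present application are described in detail below with reference to the drawings. Note that, where no conflict arises, the embodiments in the present application and the features in the embodiments may be combined with one another arbitrarily.
To address the problem in the related art that operators cannot accurately obtain the feature information of terminal devices, the embodiment of the present invention provides a method and an apparatus for extracting feature information of a terminal device. A UA feature library file identifying the mapping between the feature information of terminal devices and user agent (UA) strings is constructed first; the UA feature library file is then searched, according to the UA string to be analyzed of the first terminal device, for the matching first feature information, which gives the feature information of the first terminal device. By parsing the UA string, the extraction method obtains the feature information of the terminal device and meets the operator's need to acquire it.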
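As shown in FIG. 1, an embodiment of the present invention provides a method for extracting feature information of a terminal device, including:
Step 11: Construct, according to the feature information of terminal devices and the user agent (UA) string corresponding to each piece of feature information, a UA feature library file that identifies the mapping between the feature information and the UA strings.
Note that this UA feature library file can be built from a feature library in the related art; constructing a feature library file is a routine technique for those skilled in the art and is not described further here.
Step 12: Collect a request message sent by the first terminal device, and obtain the first UA string of the first terminal device from the request message.
Step 13: Search the UA feature library file, according to the first UA string, for the first feature information corresponding to the first UA string, and use the first feature information found as the feature information of the first terminal device.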
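The user agent (UA) is part of the HTTP protocol and is one of the header fields; its information covers the hardware platform, system software, application software, user preferences, and so on. For example, a User-Agent may be: AppStore/2.0iOS/7.1.2model/iPhone3,1build/11D257(4;dt:27). Terminal device information can be extracted accurately from the UA. The parameter types contained in a UA include the terminal system platform, the terminal product model, the user application, and so on; one or several of them may be present. For example, AppStore/2.0 is user application information, iOS/7.1.2 is the terminal system platform and version, and iPhone3 is user terminal information.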
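The extraction of the UA string from a collected HTTP request can be sketched as follows. This is a minimal illustration, assuming the raw request bytes are already available; the function name `extract_user_agent` and the sample request (host, path) are hypothetical and not part of the embodiment, while the UA value is the example string quoted above.

```python
def extract_user_agent(raw_request: bytes):
    """Return the User-Agent header value of a raw HTTP/1.x request, or None."""
    # The header block ends at the first empty line; header names are case-insensitive.
    head = raw_request.split(b"\r\n\r\n", 1)[0].decode("iso-8859-1")
    for line in head.split("\r\n")[1:]:  # skip the request line
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() == "user-agent":
            return value.strip()
    return None


# Hypothetical request carrying the example UA string from the description.
sample_request = (
    b"GET /index.html HTTP/1.1\r\n"
    b"Host: example.com\r\n"
    b"User-Agent: AppStore/2.0iOS/7.1.2model/iPhone3,1build/11D257(4;dt:27)\r\n"
    b"\r\n"
)
first_ua = extract_user_agent(sample_request)
```

Splitting each header line on the first colon only keeps values that themselves contain a colon, such as `dt:27` here, intact.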
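In the embodiment of the present invention, the feature information of a terminal device is information that can identify the terminal device, for example the serial number and name of the feature information. Each piece of feature information has a corresponding UA string, which is the feature string that indicates that feature information. The UA feature library file stores each UA string together with its corresponding feature information in a one-to-one mapping, which makes lookup in the subsequent steps easier.
Optionally, in step 12 the first UA string is taken from the collected request message of the first terminal device (which may be a data-domain message); the UA feature library file is then searched according to the first UA string to obtain the first feature information, which is the feature information of the first terminal device.
Optionally, to describe a terminal device fully, the feature information of the terminal device in the method of the embodiment includes feature information obtained from multiple dimensions, where
the dimensions include at least one of the terminal device system, the terminal device manufacturer, and the terminal device model. When only one of these three is used, the feature information is obtained from a single dimension, which is also an optional embodiment of the present invention.
The terminal device system generally includes the version number of the terminal system, the terminal device manufacturer generally includes the name of the manufacturer, and the terminal device model generally includes the model of the terminal device.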
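Optionally, in the above embodiment of the present invention, step 12 includes:
Step 121: Collect a Hypertext Transfer Protocol (HTTP) message sent by the first terminal device, and obtain the first UA string of the first terminal device from the HTTP message.
Because the UA is part of the HTTP protocol, the embodiment collects only the HTTP messages sent by the first terminal device and obtains the UA string from an HTTP message (for example, the request (REQ) message). This UA string is the first UA string; it includes the system software, hardware platform, application software, and/or user preferences of the first terminal device.
Note that, because the embodiment obtains feature information from the UA string in HTTP, it is more accurate than the related art.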
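Optionally, in the above embodiment of the present invention, step 13 includes:
Step 131: Apply a multi-pattern matching (AC) algorithm to the first UA string to look up the corresponding first feature information in the UA feature library file.
An optional embodiment of the present invention includes a lightweight multi-pattern matching engine based on the AC algorithm, which recognizes the UA string and produces the recognition result for the device information (that is, the first feature information). Optionally, the AC algorithm is a classic multi-pattern matching algorithm built from three parts: a goto table, a fail table, and an output table, which are the standard tables of the algorithm itself. For a given text of length n and a pattern set P{p1, p2, ..., pm}, it finds all target patterns in the text in O(n) time, independent of the size m of the pattern set.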
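Suppose the pattern set is P{he, she, his, hers}. FIG. 2 shows the goto table for this pattern set and Table 1 shows its output table; the matched pattern strings in the table correspond to the first feature information for the first UA string in the embodiment of the present invention.
output table
State number    Matched pattern strings
2               he
5               she, he
7               his
9               hers
Table 1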
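Suppose the first feature pattern set used for the first UA string is P{iphone, iphone6, iphone4, android}. FIG. 3 shows the goto table for this first feature pattern set and Table 2 shows its output table; the matched pattern strings in the table are the first feature information corresponding to the first UA string in the embodiment of the present invention:
output table
State number    Matched pattern strings
6               iphone
7               iphone4
8               iphone6
15              android
Table 2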
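The AC algorithm matches a text as follows. Initially, i points to the start of the text text[1...j], and state transitions begin from state D[0] of the goto table using text[i]. If a valid transition D[0][text[i]] = p with p != 0 exists, i is incremented by 1 and the automaton moves to state D[p]. If no valid transition exists from the current state D[p], the fail value of that state is examined: if fail[p] is not 0, the automaton moves to D[fail[p]] and checks again whether D[fail[p]][text[i]] equals 0. This repeats until a non-zero transition is found, or until none of the fail states visited has a non-zero transition for the current input text[i]. If indeed no non-zero transition exists, i is incremented by 1 and the automaton returns to D[0] to continue. Each time the automaton enters a state D[p] through a goto transition (fail transitions do not count), output[p] is checked; if it points to pattern strings that can be output, those pattern strings match at the current position and are output.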
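The goto, fail, and output tables and the matching loop described above can be sketched as follows. The tables and the {he, she, his, hers} example identify the AC algorithm as Aho-Corasick; the function names `build_automaton` and `find_all` and the list-of-dicts layout are assumptions made for this illustration, not the data structures of the embodiment. States are numbered in insertion order, so the output table reproduces Table 1.

```python
from collections import deque


def build_automaton(patterns):
    """Build the goto, fail and output tables for a list of pattern strings."""
    goto = [{}]        # goto[state][char] -> next state; state 0 is the root
    output = [set()]   # output[state] -> patterns recognized on entering state
    for pattern in patterns:
        state = 0
        for ch in pattern:
            if ch not in goto[state]:
                goto.append({})
                output.append(set())
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        output[state].add(pattern)

    fail = [0] * len(goto)
    queue = deque(goto[0].values())  # depth-1 states fail back to the root
    while queue:                     # breadth-first, so shallower states come first
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            output[nxt] |= output[fail[nxt]]  # inherit matches that end here too
    return goto, fail, output


def find_all(text, automaton):
    """Return (start index, pattern) for every pattern occurrence in text."""
    goto, fail, output = automaton
    state = 0
    matches = []
    for i, ch in enumerate(text):
        while state and ch not in goto[state]:
            state = fail[state]          # follow fail links until a goto exists
        state = goto[state].get(ch, 0)
        for pattern in output[state]:
            matches.append((i - len(pattern) + 1, pattern))
    return matches


automaton = build_automaton(["he", "she", "his", "hers"])
matches = sorted(find_all("ushers", automaton))
```

Each character of the text is consumed once and fail links only move toward the root, which is why the scan stays linear in the text length apart from the work of reporting matches.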
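Continuing the above example, in the above embodiment of the present invention, the method further includes:
Step 14: If the first UA string corresponds to two or more pieces of first feature information and all of them belong to the same dimension, obtain the length of each piece of first feature information.
Step 15: Select the longest first feature information as the feature information of the first terminal device.
Note that steps 14 and 15 handle the case in which one UA string corresponds to several pieces of first feature information: the embodiment selects the longest of them as the feature information of the first terminal. For example, if the matching feature information obtained from the first UA string is iphone3 and iphone3,1build, then iphone3,1build is selected as the feature information of the first terminal device. Optionally, selecting the longest first feature information rests on the principle that a longer match is more precise, and it is only one optional embodiment of the present invention. Other rules, such as selecting the shortest first feature information, may equally suit some application scenarios and also fall within the protection scope of the embodiment, as do other preset selection rules, which are not listed one by one here.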
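Steps 14 and 15 can be sketched as a small selection function. The function name `pick_longest` and the (dimension, feature) pair representation are assumptions for illustration; the text does not say how ties between equally long matches are resolved, so this sketch simply keeps the first one seen.

```python
def pick_longest(matches):
    """Keep, for each dimension, the longest matched feature string.

    matches: iterable of (dimension, feature) pairs.
    """
    best = {}
    for dimension, feature in matches:
        if dimension not in best or len(feature) > len(best[dimension]):
            best[dimension] = feature
    return best


# The example from the description: two brand-dimension matches for one UA string.
selected = pick_longest([
    ("brand", "iphone3"),
    ("brand", "iphone3,1build"),
])
```

Swapping the comparison to `<` would give the shortest-match variant that the description mentions as an alternative rule.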
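The method for extracting feature information provided by the embodiment of the present invention is described in detail below with reference to FIG. 4 and FIG. 5.
FIG. 4 shows the process of constructing the UA feature library file in the embodiment of the present invention.
Assume the file covers the feature information of terminal devices in three dimensions: the device system (system), the terminal device manufacturer (vendor), and the terminal device model (brand). The steps are as follows:
Step 401: Record the system, vendor, and brand feature information of terminal devices in separate tables in advance; each table mainly contains TOKEN_ID (serial number), NAME (recognition result), and PATTERN (UA string) data.
Step 402: Construct the system table, and obtain and fill in the TOKEN_ID, NAME, and PATTERN data of each record in the table.
Step 403: Construct the vendor table, and obtain and fill in the TOKEN_ID, NAME, and PATTERN data of each record in the table.
Step 404: Construct the brand table, and obtain and fill in the TOKEN_ID, NAME, and PATTERN data of each record in the table.
Step 405: Generate the UA feature library file from the system, vendor, and brand tables. The process ends.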
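Steps 401 to 405 can be sketched as below. The description does not specify the format of the UA feature library file, so JSON is an assumption here, as are the function name `build_feature_library`, the `dimension` key, and the sample rows; only the TOKEN_ID, NAME, and PATTERN columns come from the text.

```python
import json


def build_feature_library(system_rows, vendor_rows, brand_rows):
    """Merge three tables of (TOKEN_ID, NAME, PATTERN) rows into one record list."""
    library = []
    tables = (("system", system_rows), ("vendor", vendor_rows), ("brand", brand_rows))
    for dimension, rows in tables:
        for token_id, name, pattern in rows:
            library.append({
                "dimension": dimension,  # which table the record came from
                "TOKEN_ID": token_id,    # serial number
                "NAME": name,            # recognition result
                "PATTERN": pattern,      # UA string fragment to match
            })
    return library


# Hypothetical sample rows, made up for this sketch.
library = build_feature_library(
    system_rows=[(1, "iOS 7.1.2", "iOS/7.1.2")],
    vendor_rows=[(1, "Apple", "iPhone")],
    brand_rows=[(1, "iPhone3,1", "iPhone3,1")],
)
# Serialized form that could be written to the library file.
library_text = json.dumps(library, ensure_ascii=False, indent=2)
```

Keeping the dimension on every record lets the later lookup apply the longest-match rule within one dimension only.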
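FIG. 5 shows the process of extracting the feature information of a terminal device in the embodiment of the present invention.
Assume the file covers the feature information of terminal devices in three dimensions: the device system (system), the terminal device manufacturer (vendor), and the terminal device model (brand). When the device feature library covers the terminal device feature information contained in the UA string, the steps are as follows:
Step 501: Load the UA feature library file that was built.
Step 502: Collect an HTTP message, and obtain the first UA string from the HTTP message (for example, the request (REQ) message). Here the HTTP message is one kind of request message.
Step 503: Apply the AC algorithm to the first UA string to look up the corresponding first feature information in the UA feature library file.
Step 504: Iterate over the first feature information found for the first UA string and compute the length of each piece; while iterating, the method may keep a record of the ID and length of the longest piece found so far.
Step 505: Select the longest first feature information as the feature information of the first terminal device.
Step 506: Return the version number of the device system of the terminal device according to the feature information of the first terminal device.
Step 507: Return the terminal device model of the terminal device, for example the brand, according to the feature information of the first terminal device.
Step 508: Return the terminal device manufacturer of the terminal device according to the feature information of the first terminal device. The process ends.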
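The flow of steps 501 to 508 can be sketched end to end as follows. To keep the block self-contained, a plain substring test stands in for the AC lookup of step 503; that substitution, the function name `extract_features`, and the library entries are assumptions for illustration only.

```python
# Hypothetical library entries: (dimension, TOKEN_ID, NAME, PATTERN).
LIBRARY = [
    ("system", 1, "iOS 7.1.2", "iOS/7.1.2"),
    ("vendor", 1, "Apple", "iPhone"),
    ("brand", 1, "iPhone3", "iPhone3"),
    ("brand", 2, "iPhone3,1", "iPhone3,1"),
]


def extract_features(ua, library):
    """Return {dimension: NAME}, keeping the longest matching PATTERN per dimension."""
    best = {}  # dimension -> (pattern length, NAME)
    for dimension, _token_id, name, pattern in library:
        if pattern in ua:  # substring test used here in place of the AC lookup
            if dimension not in best or len(pattern) > best[dimension][0]:
                best[dimension] = (len(pattern), name)
    return {dimension: name for dimension, (_, name) in best.items()}


features = extract_features(
    "AppStore/2.0iOS/7.1.2model/iPhone3,1build/11D257(4;dt:27)", LIBRARY
)
```

With these entries both brand patterns match the UA string, and the longer one, `iPhone3,1`, is the one returned for the brand dimension.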
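Note that the above extraction method of the embodiment of the present invention may be carried out by a server or a collection and analysis device at the operator.
The embodiment of the present invention further provides a computer storage medium storing computer-executable instructions, which are used to execute the above method for extracting feature information of a terminal device.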
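As shown in FIG. 6, the embodiment of the present invention further provides an apparatus for extracting feature information of a terminal device, including:
a construction module 61, configured to construct, according to the feature information of terminal devices and the user agent (UA) string corresponding to each piece of feature information, a UA feature library file that identifies the mapping between the feature information and the UA strings;
a collection module 62, configured to collect a request message sent by the first terminal device and obtain the first UA string of the first terminal device from the request message;
a determining module 63, configured to search the UA feature library file, according to the first UA string, for the first feature information corresponding to the first UA string, and to use the first feature information found as the feature information of the first terminal device.
Optionally, in the above embodiment of the present invention, the feature information of the terminal device includes feature information obtained from multiple dimensions;
the dimensions include at least one of the terminal device system, the terminal device manufacturer, and the terminal device model.
Optionally, in the above embodiment of the present invention, the collection module 62 is configured to:
collect a Hypertext Transfer Protocol (HTTP) message sent by the first terminal device, treat the HTTP message as the request message, and obtain the first UA string of the first terminal device from the HTTP message.
Optionally, in the above embodiment of the present invention, the determining module 63 is configured to:
apply a multi-pattern matching (AC) algorithm to the first UA string to look up the corresponding first feature information in the UA feature library file.
Optionally, in the above embodiment of the present invention, the extraction apparatus further includes:
an obtaining module, configured to obtain the length of each piece of first feature information if the first UA string corresponds to two or more pieces of first feature information and all of them belong to the same dimension;
an information determining module, configured to select the longest first feature information as the feature information of the first terminal device.
Note that the extraction apparatus provided by the above embodiment applies the extraction method described above, so all embodiments of the extraction method apply to the apparatus and achieve the same or similar benefits.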
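Those of ordinary skill in the art will understand that all or some of the steps of the above method may be completed by a program instructing the relevant hardware (for example, a processor), and that the program may be stored in a computer-readable storage medium such as a read-only memory, a magnetic disk, or an optical disc. Optionally, all or some of the steps of the above embodiments may also be implemented with one or more integrated circuits. Accordingly, each module or unit in the above embodiments may be implemented in hardware, for example as an integrated circuit that performs the corresponding function, or as a software function module, for example as programs or instructions stored in a memory and executed by a processor. The present invention is not limited to any specific combination of hardware and software.
Although the embodiments disclosed in the present application are as described above, they are given only to aid understanding of the present application and are not intended to limit it, as with the optional implementations in the embodiments of the present invention. Any person skilled in the art to which the present application pertains may make modifications and changes in the form and details of implementation without departing from the spirit and scope disclosed herein, but the scope of patent protection of the present application remains subject to the scope defined by the appended claims.
Industrial Applicability
The above technical solution achieves accurate extraction of the feature information of a terminal device.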

Claims (10)

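  1. A method for extracting feature information of a terminal device, the method including:
    constructing, according to the feature information of terminal devices and the user agent (UA) string corresponding to each piece of feature information, a UA feature library file that identifies the mapping between the feature information and the UA strings;
    collecting a request message sent by a first terminal device, and obtaining the first UA string of the first terminal device from the request message;
    searching the UA feature library file, according to the first UA string, for the first feature information corresponding to the first UA string, and using the first feature information found as the feature information of the first terminal device.
  2. The extraction method according to claim 1, wherein
    the feature information of the terminal device includes feature information obtained from multiple dimensions;
    the dimensions include at least one of the terminal device system, the terminal device manufacturer, and the terminal device model.
  3. The extraction method according to claim 1, wherein the request message includes a Hypertext Transfer Protocol (HTTP) message.
  4. The extraction method according to claim 3, wherein searching the UA feature library file for the first feature information corresponding to the first UA string includes:
    applying a multi-pattern matching (AC) algorithm to the first UA string to look up the corresponding first feature information in the constructed UA feature library file.
  5. The extraction method according to claim 4, further including:
    if the first UA string corresponds to two or more pieces of first feature information and all of them belong to the same dimension, obtaining the length of each piece of first feature information;
    selecting the longest first feature information as the feature information of the first terminal device.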
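  6. An apparatus for extracting feature information of a terminal device, the extraction apparatus including:
    a construction module, configured to construct, according to the feature information of terminal devices and the user agent (UA) string corresponding to each piece of feature information, a UA feature library file that identifies the mapping between the feature information and the UA strings;
    a collection module, configured to collect a request message sent by a first terminal device and obtain the first UA string of the first terminal device from the request message;
    a determining module, configured to search the UA feature library file, according to the first UA string, for the first feature information corresponding to the first UA string, and to use the first feature information found as the feature information of the first terminal device.
  7. The extraction apparatus according to claim 6, wherein the feature information of the terminal device includes feature information obtained from multiple dimensions;
    the dimensions include at least one of the terminal device system, the terminal device manufacturer, and the terminal device model.
  8. The extraction apparatus according to claim 6, wherein the collection module is configured to:
    collect a Hypertext Transfer Protocol (HTTP) message sent by the first terminal device, treat the HTTP message as the request message, and obtain the first UA string of the first terminal device from the HTTP message.
  9. The extraction apparatus according to claim 8, wherein the determining module is configured to:
    apply a multi-pattern matching (AC) algorithm to the first UA string to look up the corresponding first feature information in the UA feature library file, and use the first feature information found as the feature information of the first terminal device.
  10. The extraction apparatus according to claim 9, further including:
    an obtaining module, configured to obtain the length of each piece of first feature information if the first UA string corresponds to two or more pieces of first feature information and all of them belong to the same dimension;
    an information determining module, configured to select the longest first feature information as the feature information of the first terminal device.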
PCT/CN2016/085592 2015-07-02 2016-06-13 Method and apparatus for extracting feature information of a terminal device WO2017000761A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201510379950.4 2015-07-02
CN201510379950.4A CN106330520A (zh) 2015-07-02 2015-07-02 Method and apparatus for extracting feature information of a terminal device

Publications (1)

Publication Number Publication Date
WO2017000761A1 true WO2017000761A1 (zh) 2017-01-05

Family

ID=57607871

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2016/085592 WO2017000761A1 (zh) 2015-07-02 2016-06-13 Method and apparatus for extracting feature information of a terminal device

Country Status (2)

Country Link
CN (1) CN106330520A (zh)
WO (1) WO2017000761A1 (zh)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
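CN109660507A (zh) * 2018-10-16 2019-04-19 深圳壹账通智能科技有限公司 Method, apparatus and device for communicating with a user terminal, and readable storage medium
CN113507471A (zh) * 2021-07-12 2021-10-15 深圳市共进电子股份有限公司 Method and apparatus for obtaining terminal system type, router, and storage medium
CN114143385A (zh) * 2021-11-24 2022-03-04 广东电网有限责任公司 Method, apparatus, device and medium for identifying network traffic data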

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11429671B2 (en) 2017-05-25 2022-08-30 Microsoft Technology Licensing, Llc Parser for parsing a user agent string
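CN107368532A (zh) * 2017-06-14 2017-11-21 上海斐讯数据通信技术有限公司 User agent field information processing method and system
CN107562835B (zh) * 2017-08-23 2020-03-27 Oppo广东移动通信有限公司 File search method and apparatus, mobile terminal, and computer-readable storage medium
CN109951354B (zh) * 2019-03-12 2021-08-10 北京奇虎科技有限公司 Terminal device identification method, system, and storage medium
CN109905292B (zh) * 2019-03-12 2021-08-10 北京奇虎科技有限公司 Terminal device identification method, system, and storage medium
CN109905293B (zh) * 2019-03-12 2021-06-08 北京奇虎科技有限公司 Terminal device identification method, system, and storage medium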

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110124319A1 (en) * 2009-11-25 2011-05-26 Nokia Corporation Method and apparatus for ensuring transport of user agent information
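CN102722585A (zh) * 2012-06-08 2012-10-10 亿赞普(北京)科技有限公司 Browser type identification method, apparatus, and system
CN102932775A (zh) * 2012-11-16 2013-02-13 广州市通联技术发展有限公司 Method and apparatus for terminal identification using IMEI combined with UA
CN103974232A (zh) * 2013-01-24 2014-08-06 中国电信股份有限公司 WiFi user terminal identification method and system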

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
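CN109660507A (zh) * 2018-10-16 2019-04-19 深圳壹账通智能科技有限公司 Method, apparatus and device for communicating with a user terminal, and readable storage medium
CN109660507B (zh) * 2018-10-16 2022-05-17 深圳壹账通智能科技有限公司 Method, apparatus and device for communicating with a user terminal, and readable storage medium
CN113507471A (zh) * 2021-07-12 2021-10-15 深圳市共进电子股份有限公司 Method and apparatus for obtaining terminal system type, router, and storage medium
CN114143385A (zh) * 2021-11-24 2022-03-04 广东电网有限责任公司 Method, apparatus, device and medium for identifying network traffic data
CN114143385B (zh) * 2021-11-24 2024-01-05 广东电网有限责任公司 Method, apparatus, device and medium for identifying network traffic data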

Also Published As

Publication number Publication date
CN106330520A (zh) 2017-01-11

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 16817125

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 16817125

Country of ref document: EP

Kind code of ref document: A1