CN111753023B - Method and device for determining type of internet private line - Google Patents

Info

Publication number
CN111753023B
Authority
CN
China
Prior art keywords
line
internet
type
private line
internet private
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010583164.7A
Other languages
Chinese (zh)
Other versions
CN111753023A (en
Inventor
班瑞
李彤
马季春
白海龙
邹雨佳
陈泉霖
郝宇飞
王鹏
王佳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China United Network Communications Group Co Ltd
China Information Technology Designing and Consulting Institute Co Ltd
Original Assignee
China United Network Communications Group Co Ltd
China Information Technology Designing and Consulting Institute Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China United Network Communications Group Co Ltd, China Information Technology Designing and Consulting Institute Co Ltd
Priority to CN202010583164.7A
Publication of CN111753023A
Application granted
Publication of CN111753023B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/284 Relational databases
    • G06F16/285 Clustering or classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/29 Geographical information databases

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Remote Sensing (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiments of the present application provide a method and a device for determining the type of an internet private line, which relate to the field of communications and can determine the type of an internet private line more accurately. The method comprises: acquiring classification related features of an internet private line to be classified, wherein the classification related features comprise private line user features, private line network features and private line geographic features; the private line user features indicate the usage habits of the user corresponding to the internet private line; the private line network features indicate the network performance of the internet private line; the private line geographic features indicate the geographic information points (POIs) of the internet private line; and determining the type of the internet private line to be classified by using a real-time private line classification model according to the classification related features of the internet private line to be classified.

Description

Method and device for determining type of internet private line
Technical Field
The present invention relates to the field of communications, and in particular, to a method and apparatus for determining a type of an internet private line.
Background
At present, the type of an internet private line is identified mainly by associating the static resource information of the private line. This identification method depends entirely on the accuracy of the static data: the more accurate the static data, the more accurate the identification. In practice, however, the static data of a private line often deviates considerably from the actual service of the private line, so this identification method has several disadvantages: (1) Incomplete or inaccurate information in the private line static resource information table causes identification failures or misidentification. (2) The classification information in the private line static resource information table is sparse and fixed, and cannot adapt to changing requirements. (3) The workload of maintaining the private line static resource information table is large.
Disclosure of Invention
The embodiment of the invention provides a method and a device for determining the type of an internet private line, which can more accurately determine the type of the internet private line.
In order to achieve the above purpose, the embodiment of the present invention adopts the following technical scheme:
In a first aspect, a method for determining the type of an internet private line is provided, comprising: acquiring classification related features of an internet private line to be classified, wherein the classification related features comprise private line user features, private line network features and private line geographic features; the private line user features indicate the usage habits of the user corresponding to the internet private line; the private line network features indicate the network performance of the internet private line; the private line geographic features indicate the geographic information points (POIs) of the internet private line; and determining the type of the internet private line to be classified by using a real-time private line classification model according to the classification related features of the internet private line to be classified.
In a second aspect, an internet private line type determining device is provided, comprising an acquisition module and a classification module. The acquisition module is configured to acquire classification related features of an internet private line to be classified, wherein the classification related features comprise private line user features, private line network features and private line geographic features; the private line user features indicate the usage habits of the user corresponding to the internet private line; the private line network features indicate the network performance of the internet private line; the private line geographic features indicate the geographic information points (POIs) of the internet private line. The classification module is configured to determine the type of the internet private line to be classified by using a private line classification model according to the classification related features acquired by the acquisition module.
In a third aspect, an internet private line type determining apparatus is provided, including a memory, a processor, a bus, and a communication interface; the memory is used for storing computer execution instructions, and the processor is connected with the memory through a bus; when the internet private line type determining apparatus is operated, the processor executes computer-executable instructions stored in the memory to cause the internet private line type determining apparatus to perform the internet private line type determining method as provided in the first aspect.
In a fourth aspect, there is provided a computer-readable storage medium comprising computer-executable instructions which, when run on a computer, cause the computer to perform the internet private line type determination method as provided in the first aspect.
In the method and device for determining the type of an internet private line provided by the embodiments of the present application, the method comprises: acquiring classification related features of an internet private line to be classified, wherein the classification related features comprise private line user features, private line network features and private line geographic features; the private line user features indicate the usage habits of the user corresponding to the internet private line; the private line network features indicate the network performance of the internet private line; the private line geographic features indicate the geographic information points (POIs) of the internet private line; and determining the type of the internet private line to be classified by using a real-time private line classification model according to the classification related features of the internet private line to be classified.
In the technical solution provided by the embodiments of the present application, the classification related features of the internet private line to be classified are obtained first, and the type of the internet private line to be classified is then determined by a pre-trained private line classification model. The private line user features, private line network features and private line geographic features included in the classification related features are closely related to the private line type and are generated during the use of the internet private line, so they can accurately reflect the type characteristics of the internet private line. In addition, because the type is determined by a trained private line classification model, the technical solution provided by the embodiments of the present application is more accurate than the prior-art approach of judging the type from private line static resource information, and is applicable to determining various types of internet private lines.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described, it being obvious that the drawings in the following description are only some embodiments of the present invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is a schematic diagram of a system architecture according to an embodiment of the present application;
Fig. 2 is a schematic flow chart of a method for determining internet private line type according to an embodiment of the present application;
Fig. 3 is a second flow chart of a method for determining internet private line type according to an embodiment of the present application;
Fig. 4 is a flowchart illustrating a method for training a private line classification model according to an embodiment of the present application;
Fig. 5 is a second flowchart of a method for training a private line classification model according to an embodiment of the present application;
Fig. 6 is a flowchart of a method for training a private line classification model according to an embodiment of the present application;
Fig. 7 is a flowchart of a method for training a private line classification model according to an embodiment of the present application;
Fig. 8 is a schematic diagram of a model parameter adjusting effect according to an embodiment of the present application;
Fig. 9 is a flowchart of a method for training a private line classification model according to an embodiment of the present application;
Fig. 10 is a flowchart of a method for determining an internet private line type according to an embodiment of the present application;
Fig. 11 is a schematic structural diagram of an internet private line type determining device according to an embodiment of the present application;
Fig. 12 is a schematic structural diagram of another internet private line type determining device according to an embodiment of the present application.
Detailed Description
The following description of the technical solutions in the embodiments of the present application will be made clearly and completely with reference to the drawings in the embodiments of the present application, and it is apparent that the described embodiments are only some embodiments of the present invention, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
It should be noted that, in the embodiments of the present application, words such as "exemplary" or "such as" are used to mean serving as an example, instance, or illustration. Any embodiment or design described herein as "exemplary" or "for example" should not be construed as preferred or advantageous over other embodiments or designs. Rather, the use of words such as "exemplary" or "such as" is intended to present related concepts in a concrete fashion.
It should be noted that, in the embodiments of the present application, the terms "of", "relevant" and "corresponding" may sometimes be used interchangeably; when the distinction is not emphasized, the meaning they express is the same.
In order to clearly describe the technical solutions of the embodiments of the present application, in the embodiments of the present invention, the terms "first", "second", and the like are used to distinguish the same item or similar items having substantially the same function and effect, and those skilled in the art will understand that the terms "first", "second", and the like are not limited in number and execution order.
At present, the type classification of an internet private line mainly depends on the static resource information of the internet private line. Because the static resource information often deviates considerably from the actual service condition of the internet private line, the classification is not accurate enough. Moreover, the private line static resource information is generally stored in a private line static resource information table, which must be maintained continuously, and the maintenance workload grows as the number of internet private lines increases.
In view of the above problems, the present application provides a method for determining the type of an internet private line, which can determine the type of the internet private line more accurately. The method provided by the present application is applied to the system architecture shown in Fig. 1, which comprises an internet private line type determining device 01 and internet private lines 02 (02-1, 02-2).
One end of the internet private line 02 is connected to the backbone network 04, and the other end is connected to the user 03 corresponding to the internet private line 02 (a single user terminal, or a user group formed by multiple user terminals (03-1 and 03-2)). The internet private line 02 is mainly used to provide the user 03 with a high-speed, stable network dedicated to the user 03.
The internet private line type determining device 01 is used for acquiring ticket data of the internet private line from the internet private line 02 and determining the type of the internet private line according to the ticket data. In practice, the internet private line type determining device 01 may be a service system of an operator, a device set up independently by the operator, or any other feasible device, as long as it can obtain the ticket data of the internet private line.
Based on the above system architecture, referring to Fig. 2, an embodiment of the present application provides an internet private line type determining method, which may be implemented by the internet private line type determining device 01 in Fig. 1 and specifically includes steps 101-102:
101. Obtaining the classification related features of the internet private line to be classified.
The classification related features comprise private line user features, private line network features and private line geographic features; the private line user features indicate the usage habits of the user corresponding to the internet private line; the private line network features indicate the network performance of the internet private line; the private line geographic features indicate a geographic information point (point of information, POI) of the internet private line.
Optionally, referring to Fig. 3, step 101 specifically includes:
101. Obtaining the classification related features of the internet private line to be classified from the ticket data (including at least call detail records, CDR) of the internet private line to be classified by using the deep packet inspection (DPI) technology.
Optionally, since most IPv6 (Internet Protocol version 6) data is test data, the ticket data obtained in step 101 may be limited to data related to IPv4 (Internet Protocol version 4).
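The IPv4-only restriction described above can be sketched as a simple filter. This is a minimal illustration rather than the patented implementation: the ticket record layout and the field name "ip" are hypothetical, and only the standard-library `ipaddress` module is used.

```python
import ipaddress


def is_ipv4(address: str) -> bool:
    """Return True if the string parses as an IPv4 address."""
    try:
        return isinstance(ipaddress.ip_address(address), ipaddress.IPv4Address)
    except ValueError:
        # Not a valid IP address at all (neither IPv4 nor IPv6).
        return False


def keep_ipv4_tickets(tickets):
    """Keep only ticket records whose (hypothetical) "ip" field is IPv4."""
    return [t for t in tickets if is_ipv4(t.get("ip", ""))]


# Hypothetical ticket records; real CDR fields will differ.
tickets = [
    {"ip": "192.0.2.10", "bytes": 1200},
    {"ip": "2001:db8::1", "bytes": 800},
    {"ip": "not-an-ip", "bytes": 50},
]
ipv4_tickets = keep_ipv4_tickets(tickets)
```

Records with an IPv6 or unparseable address are dropped, so only the first record remains in `ipv4_tickets`.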
Illustratively, the classification related features are derived in units of one week, so the ticket data in step 101 may cover one week or multiple weeks; when multiple weeks are involved, an average may be taken. Taking one week as an example, the classification related features may include the following:
(1) IP address.
(2) Weekday hourly traffic and weekend hourly traffic. The weekday hourly traffic is the sum of the traffic in the same hour over the five weekdays; for example, the weekday traffic for 8:00-9:00 is the sum of the 8:00-9:00 traffic from Monday through Friday. The weekend hourly traffic is defined in the same way.
(3) Weekday traffic and weekend traffic.
The weekday traffic is the total traffic of the five weekdays, and the weekend traffic is the total traffic of the two weekend days.
(4) Whole-week traffic.
(5) Weekday hourly traffic as a percentage of the weekday traffic.
(6) Weekend hourly traffic as a percentage of the weekend traffic.
(7) Share of the weekday traffic in the periods 9:00-12:00 and 13:00-19:00 in the weekday traffic.
(8) Share of the weekday traffic in the period 19:00-23:00 in the weekday traffic.
(9) Fluctuation rate $\eta_1$ of the traffic, in units of one hour:
[Formula image BDA0002553760670000051 not reproduced]
where $x_i$ is the sum of the traffic in the i-th hour of each day in the week, and $\bar{x}$ is the average of the hourly traffic sums in one week.
(10) Traffic rate $\eta_2$ of working hours versus non-working hours:
[Formula image BDA0002553760670000061 not reproduced]
where $x_1$ is the sum of the traffic between 9:00 and 19:00 on each day of the week, $x_2$ is the sum of the traffic outside 9:00-19:00 on each day of the week, and $\bar{x}$ is the average of $x_1$ and $x_2$.
(11) Weekday traffic as a share of the whole-week traffic.
(12) Weekend traffic as a share of the whole-week traffic.
(13) Usage of each major application class as a share of the total application usage.
The major application classes are defined according to actual requirements.
(14) Uplink and downlink traffic.
The uplink traffic is the total uplink traffic in one week; the downlink traffic is defined in the same way.
(15) Uplink and downlink duration.
The uplink duration is the sum of the uplink time in one day; the downlink duration is defined in the same way.
(16) Uplink and downlink average rate.
The uplink average rate is the ratio of the uplink traffic to the uplink duration; the downlink average rate is defined in the same way.
(17) Ratio of the uplink average rate to the downlink average rate.
(18) Number of uplink and downlink packets.
The number of uplink packets is the number of uplink data packets in one week; the number of downlink packets is defined in the same way.
(19) Number of uplink and downlink retransmission packets.
The number of uplink retransmission packets is the number of retransmitted uplink data packets in one week; the number of downlink retransmission packets is defined in the same way.
(20) Uplink and downlink retransmission rates.
The uplink retransmission rate is the ratio of the number of uplink retransmission packets to the number of uplink packets; the downlink retransmission rate is defined in the same way.
(21) Total delay of TCP link establishment acknowledgements.
Specifically, the sum of the delays of the TCP (Transmission Control Protocol) link establishment acknowledgements of the internet private line in one week.
(22) Total number of TCP link establishment acknowledgements.
Specifically, the number of TCP link establishment acknowledgements of the internet private line in one week.
(23) Average delay of TCP link establishment acknowledgements.
Specifically, the quotient of the total delay of TCP link establishment acknowledgements and the total number of TCP link establishment acknowledgements.
(24) Total delay from the first transaction request to its first response packet.
Specifically, the sum of the delays from every first transaction request to its first response packet before the TCP link establishment acknowledgement.
(25) Total number of times from the first transaction request to its first response packet.
Specifically, the total number of occurrences of a first transaction request followed by its first response packet before the TCP link establishment acknowledgement.
(26) Average delay from the first transaction request to its first response packet.
Specifically, the ratio of the total delay from the first transaction request to its first response packet to the total number of times from the first transaction request to its first response packet.
(27) Total delay of the first HTTP response packet relative to the first HTTP request packet.
Specifically, the sum of the delays of all HTTP response packets after the TCP link establishment acknowledgement relative to the first HTTP request packet.
(28) Total number of times of the first HTTP response packet relative to the first HTTP request packet.
Specifically, the total number of HTTP response packets after the TCP link establishment acknowledgement relative to the first HTTP request packet.
(29) Average delay of the first HTTP response packet relative to the first HTTP request packet.
Specifically, the ratio of the total delay of the first HTTP response packet relative to the first HTTP request packet to the total number of times of the first HTTP response packet relative to the first HTTP request packet.
(30) Number of tickets of each message transaction type.
(31) Number of tickets of all message transaction types.
(32) Ticket share of each message transaction type.
Specifically, the ratio of the number of tickets of each message transaction type to the total number of tickets of all message transaction types.
(33) Traffic share of each message transaction type.
Specifically, the ratio of the traffic of each message transaction type to the total traffic of all message transaction types.
(34) Ticket share of common URL (uniform resource locator) suffixes (.com.cn, .com, .cn, .gov, .edu, .org, other).
Specifically, the ratio of the number of tickets whose URL ends in each common suffix to the total number of tickets.
(35) Traffic share of common URL suffixes (.com.cn, .com, .cn, .gov, .edu, .org, other).
Specifically, the ratio of the traffic of the tickets whose URL ends in each common suffix to the total traffic.
(36) Terminal type share, in units of one week.
Specifically, the ratio of the number of terminals of each type to the total number of terminals in one week.
(37) Traffic ratio of the mobile side to the desktop side.
(38) Daily terminal type share (PC share, mobile share).
Specifically, the ratio of the number of terminals of each type to the total number of terminals on each day.
(39) Fluctuation rate (variance) of the mobile share, in units of one day.
Specifically, the variance of the daily mobile share over the week.
(40) Ticket share of each HTTP content type (application, text, audio, video, image, message, drawing, java, other).
See the explanation of (34).
(41) Traffic share of each HTTP content type (application, text, audio, video, image, message, drawing, java, other).
See the explanation of (35).
(42) Number of TCP connection successes.
(43) Number of TCP connection failures.
(44) TCP connection success rate.
(45) TCP connection failure rate.
(46) Number of tickets that do not belong to key services of concern.
The key services of concern are set by the internet private line user or the operator.
(47) Number of tickets that belong to key services of concern.
(48) Share of key services of concern.
Specifically, the ratio of the number of key services of concern to the number of all services.
(49) Number of tickets that do not involve high-risk websites.
(50) Number of tickets that involve high-risk websites.
(51) Share of tickets that do not involve high-risk websites.
(52) Share of tickets that involve high-risk websites.
(53) Longitude and latitude 1 (extracted from delivery scenarios such as Meituan, Ele.me, Daily Fresh and goods delivery; the geographic tag that appears most often among the delivery addresses (residential area, CBD business district, school, ...) is used as the geographic location tag of the IP).
(54) Longitude and latitude 2 (the longitudes and latitudes appearing in the URI; abnormal values are rejected, and the average longitude and latitude is taken as the reference geographic location of the IP).
(55) Geographic location POI type generated from longitude and latitude 1 and longitude and latitude 2 (null if there is no geographic location type).
The private line user features include (2)-(13), (30)-(41) and (46)-(52) described above; the private line network features include (14)-(29) and (42)-(45) described above; the private line geographic features include (1) and (53)-(55) described above.
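The grouping of the 55 numbered features into the three categories can be written down as a lookup table. The following is a minimal sketch: the group names "user", "network" and "geographic" and both function names are illustrative assumptions, while the number ranges are taken from the paragraph above.

```python
def build_feature_groups():
    """Map each feature number (1-55) to its category, using the ranges in the text."""
    ranges = {
        "user": [(2, 13), (30, 41), (46, 52)],
        "network": [(14, 29), (42, 45)],
        "geographic": [(1, 1), (53, 55)],
    }
    group_of = {}
    for group, spans in ranges.items():
        for first, last in spans:
            for number in range(first, last + 1):
                group_of[number] = group
    return group_of


FEATURE_GROUP = build_feature_groups()


def split_by_group(features):
    """Split a {feature_number: value} dict into one dict per category."""
    grouped = {"user": {}, "network": {}, "geographic": {}}
    for number, value in features.items():
        grouped[FEATURE_GROUP[number]][number] = value
    return grouped
```

The three sets of ranges are disjoint and together cover every number from 1 to 55, so each feature falls into exactly one category.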
In addition, it should be noted that not all of the above 55 features are obtained directly from the ticket data by using the DPI technology. The directly extracted features may differ only in magnitude, whereas the differences between different types of internet private lines are mainly reflected in proportions; therefore, after the features are directly extracted, further features need to be constructed from them. Specifically, the constructed features may include features (5)-(13), (16), (17), (20), (23), (26), (29), (32), (33)-(41), (44), (45), (48), (51) and (55) described above.
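As an illustration of how some of the constructed (ratio) features could be derived from directly extracted counters, the sketch below computes features (5)-(8), (11), (12), (16), (17) and (20). The input layout (seven lists of 24 hourly values, Monday first), the dictionary keys, the function names and the zero-denominator convention are all assumptions, and the hour ranges are read as half-open intervals (for example 9:00-12:00 covers hours 9, 10 and 11).

```python
def safe_ratio(numerator, denominator):
    """Return numerator / denominator, or 0.0 when the denominator is zero (assumed convention)."""
    return numerator / denominator if denominator else 0.0


def traffic_ratio_features(hourly):
    """hourly: 7 lists (Monday..Sunday), each holding 24 hourly traffic values."""
    weekdays, weekend = hourly[:5], hourly[5:]
    weekday_hourly = [sum(day[h] for day in weekdays) for h in range(24)]  # feature (2)
    weekend_hourly = [sum(day[h] for day in weekend) for h in range(24)]   # feature (2)
    weekday_total = sum(weekday_hourly)                                    # feature (3)
    weekend_total = sum(weekend_hourly)                                    # feature (3)
    week_total = weekday_total + weekend_total                             # feature (4)
    office = sum(weekday_hourly[9:12]) + sum(weekday_hourly[13:19])
    evening = sum(weekday_hourly[19:23])
    return {
        "weekday_hourly_share": [safe_ratio(v, weekday_total) for v in weekday_hourly],  # (5)
        "weekend_hourly_share": [safe_ratio(v, weekend_total) for v in weekend_hourly],  # (6)
        "office_hours_share": safe_ratio(office, weekday_total),                         # (7)
        "evening_share": safe_ratio(evening, weekday_total),                             # (8)
        "weekday_share": safe_ratio(weekday_total, week_total),                          # (11)
        "weekend_share": safe_ratio(weekend_total, week_total),                          # (12)
    }


def link_ratio_features(up_bytes, up_seconds, up_packets, up_retx,
                        down_bytes, down_seconds, down_packets, down_retx):
    """Derive average rates and retransmission rates from raw uplink/downlink counters."""
    up_rate = safe_ratio(up_bytes, up_seconds)        # feature (16)
    down_rate = safe_ratio(down_bytes, down_seconds)  # feature (16)
    return {
        "up_avg_rate": up_rate,
        "down_avg_rate": down_rate,
        "up_down_rate_ratio": safe_ratio(up_rate, down_rate),  # feature (17)
        "up_retx_rate": safe_ratio(up_retx, up_packets),        # feature (20)
        "down_retx_rate": safe_ratio(down_retx, down_packets),  # feature (20)
    }


# Toy input: one unit of traffic in every hour of every day.
uniform_week = [[1.0] * 24 for _ in range(7)]
traffic = traffic_ratio_features(uniform_week)
link = link_ratio_features(1000, 10, 100, 5, 4000, 20, 200, 4)
```

Because these features are ratios, two private lines with very different absolute traffic volumes but the same usage pattern yield the same values, which matches the motivation given above for constructing them.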
Of course, in practice, more or fewer features than those described above may be obtained without specific limitation.
102. Determining the type of the internet private line to be classified by using a private line classification model according to the classification related features of the internet private line to be classified.
Based on the above technical solution, the private line user features, private line network features and private line geographic features included in the classification related features of the internet private line are closely related to the private line type and are generated during the use of the internet private line, so they can accurately reflect the type characteristics of the internet private line. In addition, the type of the internet private line to be classified is determined by a trained private line classification model; therefore, compared with the prior-art approach of judging the type from private line static resource information, the technical solution provided by the embodiments of the present application is more accurate and is applicable to determining various types of internet private lines.
Optionally, referring to Fig. 4, in order to ensure smooth implementation of the technical solution provided in the embodiments of the present application, the method further includes S1-S4 before step 102 (they may also be performed before step 101; this is only an example, and the order is determined according to actual needs):
S1, acquiring historical classification related features of a sample internet private line from historical ticket data of a plurality of sample internet private lines by using a DPI technology.
The historical ticket data are ticket data of a sample internet private line in a first preset time period before the current moment; the historical classification related features comprise special line user features, special line network features and special line geographic features of the sample internet special line in a first preset time period; the first preset time period is at least one week long (in this application, the first preset time period is an integer multiple of one week).
S2, clustering the sample internet private lines by using a preset clustering algorithm according to the historical classification related characteristics of the sample internet private lines so as to obtain k types of sample internet private lines; k is a positive integer.
Optionally, as shown in Fig. 5, taking the k-means clustering algorithm as the preset clustering algorithm as an example, S2 specifically includes S21-S25:
S21, assigning values to the historical classification related features of the sample internet private line.
The same rule should be used for all assignments.
S22, determining feature coordinates of the sample internet private line according to the assigned historical classification related features.
Taking two features, feature 1 and feature 2, as an example, if feature 1 is assigned the value 1 and feature 2 is assigned the value 2, the feature coordinates of the corresponding internet private line are (1, 2).
S23, calculating the feature distances between different sample internet private lines according to the feature coordinates of the sample internet private lines.
For example, if the feature coordinates of sample internet private line A are (1, 1) and the feature coordinates of sample internet private line B are (2, 2), the feature distance between the two is:
[Formula image BDA0002553760670000102 not reproduced]
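Steps S22 and S23 can be sketched as follows. The distance formula itself is not reproduced in the text, so this sketch assumes the Euclidean distance, which is the usual choice for k-means; the function names and feature names are illustrative only.

```python
import math


def feature_coordinates(assigned, order):
    """Turn {feature_name: numeric value} into a coordinate tuple in a fixed feature order."""
    return tuple(float(assigned[name]) for name in order)


def feature_distance(a, b):
    """Euclidean distance between two coordinate tuples (assumed metric)."""
    if len(a) != len(b):
        raise ValueError("coordinates must have the same number of features")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# The two sample private lines from the example in the text.
order = ["feature_1", "feature_2"]
line_a = feature_coordinates({"feature_1": 1, "feature_2": 1}, order)
line_b = feature_coordinates({"feature_1": 2, "feature_2": 2}, order)
distance_ab = feature_distance(line_a, line_b)
```

Under the Euclidean assumption, the distance between (1, 1) and (2, 2) is the square root of 2. Using one fixed feature order for every private line mirrors the requirement in S21 that the same assignment rule be applied throughout.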
S24, determining the value of k according to the silhouette coefficient (contour coefficient) formula.
Specifically, the higher the silhouette coefficient, the better the classification effect achieved by the determined k value; if k is too large, the accuracy of the clustering result is low, and if k is too small, the clustering result becomes ambiguous.
Illustratively, the silhouette coefficient formula is:
$$S(i) = \frac{b(i) - a(i)}{\max\{a(i),\, b(i)\}}$$
wherein S(i) is the silhouette coefficient of sample i; a(i) is the average distance from sample i to the other samples in its own cluster; b(i) is the average distance from sample i to the samples of other clusters.
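The contour (silhouette) coefficient described above can be sketched in Python as follows. The sketch assumes Euclidean feature distances and takes b(i) over the nearest other cluster, which is the conventional definition; the function names and sample coordinates are hypothetical.

```python
import math

def silhouette_coefficients(points, labels):
    """Return S(i) for every sample, given its cluster label."""
    scores = []
    for i, (p, label) in enumerate(zip(points, labels)):
        # Collect distances from sample i to every other sample, per cluster.
        by_cluster = {}
        for j, (q, other) in enumerate(zip(points, labels)):
            if i != j:
                by_cluster.setdefault(other, []).append(math.dist(p, q))
        own = by_cluster.pop(label, [])
        if not own or not by_cluster:
            scores.append(0.0)  # convention for singleton or single clusters
            continue
        a = sum(own) / len(own)                                  # a(i)
        b = min(sum(d) / len(d) for d in by_cluster.values())    # b(i)
        denom = max(a, b)
        scores.append((b - a) / denom if denom else 0.0)
    return scores

def mean_silhouette(points, labels):
    """Average S(i) over all samples; higher means a better clustering."""
    scores = silhouette_coefficients(points, labels)
    return sum(scores) / len(scores)

points = [(0, 0), (0, 1), (10, 10), (10, 11)]
good = mean_silhouette(points, [0, 0, 1, 1])  # compact, well separated
bad = mean_silhouette(points, [0, 1, 0, 1])   # clusters mixed together
print(good, bad)
```

A candidate k can then be evaluated by clustering with that k and comparing the mean coefficient across candidates.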
S25, dividing the sample Internet private lines into k types by using a k-means clustering algorithm according to the k values and the characteristic distances between different sample Internet private lines.
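For illustration, S25 can be sketched with a plain Lloyd-style k-means loop. This is a simplified sketch under stated assumptions (Euclidean distance, random initial centroids drawn from the samples, hypothetical names and data), not the implementation used in the application.

```python
import math
import random

def k_means(points, k, max_iterations=100, seed=0):
    """Plain Lloyd's algorithm; returns (labels, centroids)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # k samples chosen as initial centroids
    labels = None
    for _ in range(max_iterations):
        # Assignment step: each sample joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = tuple(
                    sum(axis) / len(members) for axis in zip(*members)
                )
    return labels, centroids

# Hypothetical feature coordinates of six sample internet private lines.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels, centroids = k_means(points, k=2)
print(labels, centroids)
```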
S3, setting type labels for the k-type sample Internet private lines, wherein the type labels correspond to the k-type sample Internet private lines one by one; the type tag is used to indicate the type of the private line.
Illustratively, optional type tags are shown in Table 1 below.
Figure BDA0002553760670000111
TABLE 1
Optionally, referring to fig. 6, S3 specifically includes two alternative schemes, S31-S32 and S33-S35:
S31, acquiring reference classification related characteristics of the reference internet private line from the ticket data of the reference internet private line of a known type by using the DPI technology.
The reference classification related features comprise special line user features, special line network features and special line geographic features of the reference internet special lines. For example, in practice the operator determines the type of some internet private lines and sets their type tags when the lines are set up; such internet private lines of known type are referred to as reference internet private lines.
S32, setting type labels for the sample Internet private lines according to the reference classification related characteristics of the reference Internet private lines of known types.
Optionally, the type tag of the sample internet private line whose historical classification related features have the highest similarity to the reference classification related features may be set as the type tag corresponding to the type of the reference internet private line. The similarity may be calculated in any feasible manner, which is not particularly limited herein.
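One possible way to realize the similarity-based labeling of S32 is sketched below. Since the similarity measure is left open, cosine similarity is used purely as an assumption; representing each cluster by a centroid, as well as the type names and feature values, are also hypothetical.

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity between two feature vectors (assumed measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def label_clusters(cluster_centroids, reference_lines):
    """Give each cluster the type tag of its most similar reference line.

    cluster_centroids: {cluster_id: feature vector}
    reference_lines:   [(type_tag, feature vector), ...] of known type
    """
    tags = {}
    for cluster_id, centroid in cluster_centroids.items():
        best_tag, _ = max(
            reference_lines,
            key=lambda ref: cosine_similarity(centroid, ref[1]),
        )
        tags[cluster_id] = best_tag
    return tags

cluster_centroids = {0: (0.9, 0.1), 1: (0.2, 0.8)}
reference_lines = [("school", (1.0, 0.0)), ("hotel", (0.0, 1.0))]
cluster_tags = label_clusters(cluster_centroids, reference_lines)
print(cluster_tags)
```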
Based on the technical scheme provided by S31-S32, the type label can be automatically set for the sample Internet private line, and the implementation efficiency of the whole scheme can be further improved.
S33, the clustering result of the sample internet private line and the historical classification related characteristics of the sample internet private line are sent to the classification terminal.
For example, the owner of the classification terminal may be a professional analyst with sufficient knowledge of the sample internet private line.
S34, receiving a type label setting result of the classification terminal on the sample internet private line.
S35, setting a type label for the sample Internet private line according to a type label setting result of the classification terminal for the sample Internet private line.
Based on the technical scheme provided by S33-S35, the knowledge of professionals about internet private lines can be utilized to set type labels for the sample internet private lines more accurately, which can further improve the accuracy of the technical scheme.
S4, taking the type label of the sample internet private line and the historical classification related characteristics of the sample internet private line as training data, and constructing a private line classification model by using a preset machine learning algorithm.
Taking the case where the preset machine learning algorithm is a random forest model training algorithm as an example (in practice, other algorithms may be used; optionally, several algorithms may be tested and the one with the best effect selected as the preset machine learning algorithm), referring to fig. 7, S4 includes S41-S45:
S41, dividing the training data into two parts, one part used as a training set and the other as a test set.
Illustratively, the ratio of the number of samples in the training set to that in the test set may be 7:3; for the purpose of the division, one sample specifically comprises the type tag of an internet private line and its corresponding historical classification related features.
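The 7:3 division of S41 can be sketched as follows. The shuffling step, the fixed random seed and the sample layout (a feature tuple paired with a type tag) are assumptions made for illustration.

```python
import random

def split_training_data(samples, train_ratio=0.7, seed=0):
    """Shuffle the samples and split them into a training and a test set."""
    shuffled = samples[:]                    # keep the caller's list intact
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]

# Each sample: (historical classification related features, type tag).
samples = [((i, i % 3), f"type_{i % 2}") for i in range(10)]
train_set, test_set = split_training_data(samples)
print(len(train_set), len(test_set))
```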
S42, training by using a training set and adopting a random forest model algorithm to obtain an initial special line classification model.
S43, testing and evaluating the initial special line classification model by using the test set, and determining whether the initial special line classification model meets preset indexes.
When the initial special line classification model is determined to accord with the preset index, S44 is executed; and when the initial special line classification model is determined not to accord with the preset index, executing S45.
Specifically, because this is a multi-classification problem, sample imbalance may exist. Therefore, accuracy and the macro average are used to measure the quality of the model: accuracy focuses on the overall correctness, while the macro average also takes into account the recognition accuracy of minority classes.
Where accuracy = number of correctly classified samples / number of all classified samples;
the macro average specifically comprises a comprehensive index F1, a macro precision and a macro recall (the higher the three values, the better the model effect):
$$F1 = \frac{2 \times P_{\text{macro}} \times R_{\text{macro}}}{P_{\text{macro}} + R_{\text{macro}}}$$
$$P_{\text{macro}} = \frac{1}{n}\sum_{i=1}^{n} P_i$$
$$R_{\text{macro}} = \frac{1}{n}\sum_{i=1}^{n} R_i$$
Wherein P_i is the precision for the i-th type of sample internet private line, R_i is the recall for the i-th type of sample internet private line, and n is the number of types that the initial special line classification model can distinguish.
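The evaluation indexes above (accuracy, macro precision, macro recall and the comprehensive index F1 computed from the two macro values) can be sketched as follows. The function name and the example tags are hypothetical, and n is taken here as the number of distinct types appearing in the true or predicted tags.

```python
def evaluate(true_tags, predicted_tags):
    """Return accuracy, macro precision, macro recall and macro F1."""
    pairs = list(zip(true_tags, predicted_tags))
    classes = sorted(set(true_tags) | set(predicted_tags))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)

    precisions, recalls = [], []
    for c in classes:
        true_positive = sum(t == c and p == c for t, p in pairs)
        predicted_c = sum(p == c for p in predicted_tags)
        actual_c = sum(t == c for t in true_tags)
        # P_i and R_i for class c; 0.0 when the denominator is empty.
        precisions.append(true_positive / predicted_c if predicted_c else 0.0)
        recalls.append(true_positive / actual_c if actual_c else 0.0)

    p_macro = sum(precisions) / len(classes)
    r_macro = sum(recalls) / len(classes)
    total = p_macro + r_macro
    f1 = 2 * p_macro * r_macro / total if total else 0.0
    return {"accuracy": accuracy, "p_macro": p_macro,
            "r_macro": r_macro, "f1": f1}

metrics = evaluate(["a", "a", "a", "b"], ["a", "a", "b", "b"])
print(metrics)
```

In this example class "a" has precision 1 and recall 2/3, and class "b" has precision 1/2 and recall 1, so the macro precision is 0.75 and the macro recall is 5/6.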
S44, determining the initial special line classification model as the special line classification model.
S45, adjusting parameters of the initial special line classification model according to the test evaluation result of the test set on the initial special line classification model until the initial special line classification model meets the preset index, and determining the initial special line classification model as the special line classification model.
Illustratively, the parameters that are adjustable in the initial classification model obtained by the random forest model algorithm are shown in table 2 below.
Figure BDA0002553760670000133
TABLE 2
Optionally, if it is determined from the test evaluation result that the initial special line classification model is over-fitted, the parameters are adjusted in the direction that simplifies the model; if it is under-fitted, the parameters are adjusted in the direction that makes the model more complex. For example, the model may be simplified by increasing min_samples_split and made more complex by decreasing it; taking the adjustment of min_samples_split as an example, referring to fig. 8, the optimal value of min_samples_split is 6; the AUC area in the graph is the value of accuracy.
Optionally, because the above classification related features may include as many as 55 features, some of which may not accurately distinguish between different types, or may suffer from problems such as noise and missing values, referring to fig. 9, the method further includes step SA before step S2:
SA, carrying out feature engineering on the historical classification related features of the sample internet private line.
Wherein, the feature engineering includes at least: data preprocessing and feature selection; the data preprocessing at least comprises: dimensionless, missing value processing and continuity feature processing; the feature selection at least comprises: variance filtering and mutual information filtering.
Dimensionless processing can accelerate the convergence of the model and eliminate the influence of excessively large differences between the values of different features; specifically, the values of all features are standardized. Illustratively, feature 1 (the proportion of traffic between 9:00 and 10:00) may have a feature value of 0.1, and feature 2 (the average TCP connection establishment response delay) may have a feature value of 500; standardization transforms each feature so that its mean is 0 and its standard deviation and variance are both 1. Assuming that the mean of feature 1 is 0.05 and its standard deviation is 0.025, 0.1 is standardized to 2; assuming that the mean of feature 2 is 400 and its standard deviation is 100, 500 is standardized to 1. It can be seen that standardization significantly reduces the huge difference between the values of features of different dimensions. Specifically, standardized value of A = (A - mean of A) / standard deviation of A.
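The worked example above can be checked with a short Python sketch. The function names are hypothetical, and the use of the population standard deviation for a whole feature column is an assumption.

```python
import statistics

def standardize(value, mean, std):
    """Standardized value = (value - mean) / standard deviation."""
    return (value - mean) / std

def standardize_column(values):
    """Standardize one feature column to mean 0 and standard deviation 1."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # population standard deviation (assumed)
    if std == 0:
        return [0.0] * len(values)   # constant feature: nothing to scale
    return [(v - mean) / std for v in values]

z_feature_1 = standardize(0.1, mean=0.05, std=0.025)  # traffic proportion
z_feature_2 = standardize(500, mean=400, std=100)     # TCP delay
column = standardize_column([300, 400, 500])
print(z_feature_1, z_feature_2, column)
```

After standardization the two features, originally 0.1 and 500, become 2 and 1, which are on a comparable scale.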
Missing value processing may specifically be filling the missing values with 0.
Continuity feature processing specifically converts continuous features into discrete features (segmentation), mainly for the following three reasons: (1) Algorithms such as decision trees and naive Bayes are developed on the basis of discrete data; to use such algorithms, the data must be discretized. Effective discretization reduces the time and space overhead of the algorithm and improves the system's ability to classify and cluster samples as well as its resistance to noise. (2) Discretized features are easier to understand than continuous features and are closer to a knowledge-level representation. Take salary income as an example: for a monthly salary of 2000 and a monthly salary of 20000, the continuous feature conveys the difference between high and low salary only through the numerical values, whereas converting it into discrete data (low salary and high salary) expresses the intended notions of high and low salary more intuitively. (3) Discretization can effectively overcome hidden defects in the data and make the model results more stable.
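Missing value filling and discretization can be sketched as follows, reusing the salary example. The bin boundaries (5000 and 15000), the level names and the helper names are hypothetical choices made only for illustration.

```python
import bisect

def fill_missing(values, fill_value=0):
    """Replace missing entries (None) with the fill value, 0 by default."""
    return [fill_value if v is None else v for v in values]

def discretize(value, boundaries, levels):
    """Map a continuous value to a discrete level.

    boundaries must be sorted ascending, and levels must contain
    exactly len(boundaries) + 1 entries.
    """
    return levels[bisect.bisect_right(boundaries, value)]

monthly_salaries = fill_missing([2000, None, 20000])
salary_levels = [
    discretize(s, boundaries=[5000, 15000], levels=["low", "medium", "high"])
    for s in monthly_salaries
]
print(monthly_salaries, salary_levels)
```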
Variance filtering removes features whose variance is 0, because a variance of 0 indicates that all values of the feature are identical, so the feature has no discriminating power.
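A minimal sketch of variance filtering is given below; the data layout (one row per sample internet private line, one column per feature) and the feature names are assumptions.

```python
import statistics

def variance_filter(rows, feature_names):
    """Drop every feature column whose variance is 0."""
    columns = list(zip(*rows))
    kept = [
        index for index, column in enumerate(columns)
        if statistics.pvariance(column) > 0
    ]
    filtered_rows = [[row[index] for index in kept] for row in rows]
    kept_names = [feature_names[index] for index in kept]
    return filtered_rows, kept_names

rows = [[1, 5, 0.2], [1, 7, 0.4], [1, 9, 0.1]]
names = ["constant_feature", "feature_a", "feature_b"]  # hypothetical names
filtered_rows, kept_names = variance_filter(rows, names)
print(kept_names, filtered_rows)
```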
Reason for mutual information filtering: the relationship between a feature and the probability of a type tag is not necessarily linear, but may be nonlinear (for example, at first, the larger the feature, the greater the probability that the type tag belongs to a certain class, but later this may no longer be the case), so the mutual information method is needed to filter out noise. Specifically, the mutual information between each feature and the type tag is calculated, and features whose mutual information is 0 (indicating that the feature and the type tag are completely unrelated) are deleted. For the specific calculation of mutual information, reference may be made to the prior art, which is not described herein.
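For discrete (or already discretized) features, the mutual information with the type tag can be estimated from empirical frequencies, as in the sketch below. This is one common estimator chosen as an assumption, since the calculation itself is left to the prior art; the feature names, data and tolerance are hypothetical.

```python
import math
from collections import Counter

def mutual_information(feature_values, type_tags):
    """Empirical mutual information (in nats) of two discrete sequences."""
    n = len(feature_values)
    joint = Counter(zip(feature_values, type_tags))
    feature_counts = Counter(feature_values)
    tag_counts = Counter(type_tags)
    total = 0.0
    for (x, y), count in joint.items():
        # p(x, y) * log( p(x, y) / (p(x) * p(y)) )
        ratio = count * n / (feature_counts[x] * tag_counts[y])
        total += (count / n) * math.log(ratio)
    return total

def mutual_information_filter(features, type_tags, tolerance=1e-12):
    """Keep only the features whose mutual information exceeds zero."""
    return [
        name for name, values in features.items()
        if mutual_information(values, type_tags) > tolerance
    ]

type_tags = ["a", "a", "b", "b"]
features = {"informative": [0, 0, 1, 1], "unrelated": [0, 1, 0, 1]}
kept_features = mutual_information_filter(features, type_tags)
print(kept_features)
```

Here the "informative" feature determines the tag exactly (mutual information ln 2), while the "unrelated" feature is independent of the tag and is removed.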
In addition to the above feature processing, for the time-period-related features among the classification related features (for example, the ratio of the traffic between 9:00 and 12:00 to the traffic of the whole day), internet private lines of different industries may not differ from one another, so such features also need to be merged to a certain degree by means of bin merging.
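The exact merging rule is not specified; one plausible reading is that fine-grained time-period features are combined into coarser periods. The sketch below assumes hourly traffic proportions that are summed into wider bins; the bin names, hour ranges and values are hypothetical.

```python
def merge_time_bins(hourly_share, merged_bins):
    """Sum hourly traffic proportions into coarser time-period features.

    hourly_share: {hour: proportion of the day's traffic in that hour}
    merged_bins:  {bin_name: (start_hour, end_hour)}, end hour exclusive
    """
    return {
        name: sum(hourly_share.get(hour, 0.0) for hour in range(start, end))
        for name, (start, end) in merged_bins.items()
    }

hourly_share = {9: 0.10, 10: 0.15, 11: 0.05, 14: 0.20}
merged_bins = {"09-12": (9, 12), "12-18": (12, 18)}
merged_share = merge_time_bins(hourly_share, merged_bins)
print(merged_share)
```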
It should be noted that if the above feature engineering is performed when training the special line classification model, then after step 101 the same feature engineering needs to be performed on the classification related features of the internet private line to be classified; alternatively, according to the features selected when training the special line classification model, the corresponding part of the classification related features obtained in step 101 is directly selected as the features input into the special line classification model in the subsequent step 102.
Optionally, referring to fig. 10, in order to continuously optimize the special line classification model, steps 103-105 are further included after step 102:
103. Obtaining, at intervals of a second preset time period, the classification related features of a target internet private line from the ticket data of the target internet private line by using the DPI technology.
The target internet private line is an internet private line of a known type which appears in a second preset time period; for example, an operator sets an internet private line with a type tag x for a kindergarten. The type tag x corresponding to the kindergarten type may not exist in the original type tag library, that is, the private line classification model in the step 102 may never determine the type tag of a certain internet private line as the type tag x corresponding to the kindergarten type.
104. Taking the target internet private line as a new sample internet private line, and taking the type of the target internet private line as its corresponding type label, so as to update the training data.
105. Updating the special line classification model by using the preset machine learning algorithm according to the updated training data.
The method and the device for determining the type of the internet private line provided by the embodiment of the application comprise the following steps: acquiring classification related characteristics of the internet private line to be classified; the classification related features comprise special line user features, special line network features and special line geographic features; the special line user characteristics are used for indicating the use habit of the user corresponding to the internet special line; the special line network characteristics are used for indicating the network performance of the Internet special line; the special line geographic features are used for indicating geographic information Points (POIs) of the Internet special line; and determining the type of the internet private line to be classified by utilizing a real-time private line classification model according to the classification related characteristics of the internet private line to be classified. 
According to the technical scheme provided by the embodiment of the application, the classification related characteristics of the internet private line to be classified are first obtained, and the type of the internet private line to be classified is then determined by a private line classification model obtained through pre-training. Because the special line user characteristics, the special line network characteristics and the special line geographic characteristics included in the classification related characteristics of the internet private line in the embodiment of the application are closely related to the private line type and are generated during use of the internet private line, they can accurately reflect the type characteristics of the internet private line. In addition, because the type of the internet private line to be classified is determined by a private line classification model obtained through training, the technical scheme provided by the embodiment of the application is more accurate than the prior-art approach of judging the type from static resource information of the private line, and is applicable to determining the types of various internet private lines.
Referring to fig. 11, an internet private line type determining apparatus 01 as shown in fig. 1 provided in an embodiment of the present application is used to implement the technical solution provided in the foregoing embodiment, where the apparatus specifically includes: an acquisition module 21, a classification module 22 and a modeling module 23.
The acquiring module 21 is configured to acquire classification related features of an internet private line to be classified; the classification related features comprise special line user features, special line network features and special line geographic features; the special line user characteristics are used for indicating the use habit of the user corresponding to the internet special line; the special line network characteristics are used for indicating the network performance of the Internet special line; the special line geographic features are used for indicating geographic information Points (POIs) of the Internet special line;
the classification module 22 is configured to determine a type of the internet private line to be classified according to the classification related feature of the internet private line to be classified acquired by the acquisition module 21 by using the private line classification model.
Optionally, the obtaining module 21 is specifically configured to: and obtaining classification related characteristics of the internet private line to be classified from the ticket data of the internet private line to be classified by using a Deep Packet Inspection (DPI) technology.
Optionally, the modeling module 23 includes an acquisition unit 231, a clustering unit 232, a labeling unit 233, and a training unit 234;
An obtaining unit 231, configured to obtain, by using a DPI technology, a historical classification related feature of a sample internet private line from historical ticket data of a plurality of sample internet private lines; the historical ticket data is ticket data of a sample internet private line in a first preset time period before the current moment; the historical classification related features comprise special line user features, special line network features and special line geographic features of the sample internet special line in a first preset time period; the time length of the first preset time period is at least one week;
a clustering unit 232, configured to cluster the sample internet private lines by using a preset clustering algorithm according to the historical classification related features of the sample internet private lines acquired by the acquisition unit 231, so as to obtain k-class sample internet private lines; k is a positive integer;
the labeling unit 233 is configured to set type labels for the k-type sample internet private lines clustered by the clustering unit 232, where the type labels are in one-to-one correspondence with the k-type sample internet private lines; the type tag is used for indicating the type of the special line;
the training unit 234 is configured to construct a private line classification model by using a preset machine learning algorithm, with the type tag set by the labeling unit 233 on the sample internet private line and the historical classification related features of the sample internet private line acquired by the acquiring unit 231 as training data.
Optionally, the labeling unit 233 is specifically configured to: obtaining reference classification related characteristics of the reference internet private line from the ticket data of the known type of the reference internet private line by using the DPI technology; the reference classification related features comprise special line user features, special line network features and special line geographic features of the reference internet special lines;
and setting type labels for the sample internet private lines according to the reference classification related characteristics of the known type reference internet private lines.
Optionally, the labeling unit 233 is specifically configured to: the clustering result of the clustering unit 232 on the sample internet private line and the historical classification related characteristics of the sample internet private line acquired by the acquisition unit 231 are sent to a classification terminal; receiving a type label setting result of the classification terminal on the sample internet private line; and setting a type label for the sample Internet private line according to a type label setting result of the classification terminal for the sample Internet private line.
Optionally, the modeling module 23 further includes a feature processing unit 235; the feature processing unit 235 is configured to perform feature engineering on the historical classification related features of the sample internet private line before the clustering unit 232 clusters the sample internet private line by using a preset clustering algorithm according to the historical classification related features of the sample internet private line acquired by the acquiring unit 231 to obtain k-class sample internet private lines; the feature engineering at least comprises: data preprocessing and feature selection; the data preprocessing at least comprises: dimensionless, missing value processing and continuity feature processing; the feature selection at least comprises: variance filtering and mutual information filtering.
Optionally, after the classification module 22 determines the type of internet private line to be classified,
the obtaining unit 231 is further configured to obtain classification related features of the target internet private line from ticket data of the target internet private line by using the DPI technology at intervals of a second preset time period; the target internet private line is an internet private line of a known type which appears in a second preset time period;
the training unit 234 is further configured to take the target internet private line acquired by the acquiring unit 231 as a new sample internet private line, and take the type of the target internet private line acquired by the acquiring unit 231 as a corresponding type tag thereof, so as to update training data;
the training unit 234 is further configured to update the private line classification model with a preset machine learning algorithm according to the updated training data.
The device for determining the type of the internet private line provided by the embodiment of the application comprises: the acquisition module is used for acquiring the classification related characteristics of the internet special line to be classified; the classification related features comprise special line user features, special line network features and special line geographic features; the special line user characteristics are used for indicating the use habit of the user corresponding to the internet special line; the special line network characteristics are used for indicating the network performance of the Internet special line; the special line geographic features are used for indicating geographic information Points (POIs) of the Internet special line; the classification module is used for determining the type of the internet special line to be classified by utilizing the special line classification model according to the classification related characteristics of the internet special line to be classified, which are acquired by the acquisition module. 
Therefore, according to the technical scheme provided by the embodiment of the application, the classification related characteristics of the internet private line to be classified can first be obtained, and the type of the internet private line to be classified can then be determined by a private line classification model obtained through pre-training. Because the special line user characteristics, the special line network characteristics and the special line geographic characteristics included in the classification related characteristics of the internet private line in the embodiment of the application are closely related to the private line type and are generated during use of the internet private line, they can accurately reflect the type characteristics of the internet private line. In addition, because the type of the internet private line to be classified is determined by a private line classification model obtained through training, the technical scheme provided by the embodiment of the application is more accurate than the prior-art approach of judging the type from static resource information of the private line, and is applicable to determining the types of various internet private lines.
Referring to fig. 12, another apparatus for determining a type of internet private line is provided in the embodiment of the present application, including a memory 41, a processor 42, a bus 43 and a communication interface 44; the memory 41 is used for storing computer-executable instructions, and the processor 42 is connected with the memory 41 through the bus 43; when the internet private line type determining apparatus is operated, the processor 42 executes computer-executable instructions stored in the memory 41 to cause the internet private line type determining apparatus to perform the internet private line type determining method as provided in the above-described embodiment.
In a particular implementation, as one embodiment, the processor 42 (42-1 and 42-2) may include one or more CPUs, such as CPU0 and CPU1 shown in FIG. 12. And as one example the internet private line type determining means may comprise a plurality of processors 42, such as processor 42-1 and processor 42-2 shown in fig. 12. Each of these processors 42 may be a Single-core processor (Single-CPU) or a Multi-core processor (Multi-CPU). The processor 42 herein may refer to one or more devices, circuits, and/or processing cores for processing data (e.g., computer program instructions).
The memory 41 may be, but is not limited to, a read-only memory (Read-Only Memory, ROM) or other type of static storage device that can store static information and instructions, a random access memory (Random Access Memory, RAM) or other type of dynamic storage device that can store information and instructions, an electrically erasable programmable read-only memory (Electrically Erasable Programmable Read-Only Memory, EEPROM), a compact disc read-only memory (Compact Disc Read-Only Memory, CD-ROM) or other optical disc storage (including compact disc, laser disc, optical disc, digital versatile disc, blu-ray disc, etc.), magnetic disk storage media or other magnetic storage devices, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer. The memory 41 may be stand-alone and coupled to the processor 42 via the bus 43. The memory 41 may also be integrated with the processor 42.
In a specific implementation, the memory 41 is used for storing data in the application and computer-executable instructions corresponding to the software program that implements the application. The processor 42 may perform various functions of the device by running or executing the software program stored in the memory 41 and invoking the data stored in the memory 41.
The communication interface 44 uses any transceiver-like device for communicating with other devices or communication networks, such as a control system, a radio access network (Radio Access Network, RAN), a wireless local area network (Wireless Local Area Networks, WLAN), etc. The communication interface 44 may include a receiving unit to implement a receiving function and a transmitting unit to implement a transmitting function.
Bus 43 may be an industry standard architecture (Industry Standard Architecture, ISA) bus, a peripheral component interconnect (Peripheral Component Interconnect, PCI) bus, or an extended industry standard architecture (Extended Industry Standard Architecture, EISA) bus, among others. The bus 43 may be classified into an address bus, a data bus, a control bus, and the like. For ease of illustration, only one thick line is shown in fig. 12, but this does not mean that there is only one bus or only one type of bus.
The embodiment of the application also provides a computer readable storage medium, which includes computer-executable instructions that, when executed on a computer, cause the computer to perform the method for determining the type of internet private line provided in the above embodiment.
The embodiment of the application also provides a computer program which can be directly loaded into a memory and contains software codes, and the computer program can realize the internet private line type determining method provided by the embodiment after being loaded and executed by a computer.
Those skilled in the art will appreciate that in one or more of the examples described above, the functions described in the present invention may be implemented in hardware, software, firmware, or any combination thereof. When implemented in software, these functions may be stored on or transmitted over as one or more instructions or code on a computer-readable medium. Computer-readable media includes both computer-readable storage media and communication media including any medium that facilitates transfer of a computer program from one place to another. A storage media may be any available media that can be accessed by a general purpose or special purpose computer.
From the foregoing description of the embodiments, it will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-described division of functional modules is illustrated, and in practical application, the above-described functional allocation may be implemented by different functional modules according to needs, i.e. the internal structure of the apparatus is divided into different functional modules to implement all or part of the functions described above.
In the several embodiments provided in this application, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the above-described embodiments of the apparatus are merely illustrative, and the division of modules or units, for example, is merely a logical function division, and other manners of division are possible when actually implemented. For example, multiple units or components may be combined or may be integrated into another device, or some features may be omitted, or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or units, which may be in electrical, mechanical or other form. The units described as separate parts may or may not be physically separate, and the parts shown as units may be one physical unit or a plurality of physical units, may be located in one place, or may be distributed in a plurality of different places. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in the embodiments of the present invention may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units. The integrated units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a readable storage medium. Based on such understanding, the technical solution of the embodiments of the present application may be essentially or a part contributing to the prior art or all or part of the technical solution may be embodied in the form of a software product stored in a storage medium, including several instructions for causing a device (may be a single-chip microcomputer, a chip or the like) or a processor (processor) to perform all or part of the steps of the method described in the embodiments of the present invention. And the aforementioned storage medium includes: a usb disk, a removable hard disk, a ROM, a RAM, a magnetic disk, or an optical disk, etc.
The foregoing describes only specific embodiments of the present invention, and the protection scope of the present invention is not limited thereto. Any change or substitution readily conceivable by a person skilled in the art within the scope of the present invention shall fall within the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (14)

1. A method for determining the type of an internet private line, characterized by comprising:
acquiring classification related features of an internet private line to be classified; the classification related features comprise private line user features, private line network features and private line geographic features; the private line user features are used for indicating the usage habits of the user corresponding to the internet private line; the private line network features are used for indicating the network performance of the internet private line; the private line geographic features are used for indicating geographic information points (POIs) of the internet private line;
acquiring historical classification related features of a plurality of sample internet private lines from historical ticket data of the sample internet private lines by using a deep packet inspection (DPI) technology; the historical ticket data are ticket data of the sample internet private lines in a first preset time period before the current moment; the historical classification related features comprise private line user features, private line network features and private line geographic features of the sample internet private lines in the first preset time period; the duration of the first preset time period is at least one week;
clustering the sample internet private lines by using a preset clustering algorithm according to the historical classification related features of the sample internet private lines, so as to obtain k types of sample internet private lines; k is a positive integer;
setting type labels for the k types of sample internet private lines, wherein the type labels correspond to the k types of sample internet private lines one to one; each type label is used for indicating the type of a private line;
taking the type labels of the sample internet private lines and the historical classification related features of the sample internet private lines as training data, and constructing a private line classification model by using a preset machine learning algorithm;
and determining the type of the internet private line to be classified by using the private line classification model according to the classification related features of the internet private line to be classified.
2. The method for determining the type of an internet private line according to claim 1, wherein acquiring the classification related features of the internet private line to be classified comprises:
acquiring the classification related features of the internet private line to be classified from ticket data of the internet private line to be classified by using the DPI technology.
3. The method for determining the type of an internet private line according to claim 1, wherein setting type labels for the k types of sample internet private lines, the type labels corresponding to the k types of sample internet private lines one to one, comprises:
acquiring reference classification related features of a reference internet private line of a known type from ticket data of the reference internet private line by using the DPI technology; the reference classification related features comprise private line user features, private line network features and private line geographic features of the reference internet private line;
and setting type labels for the sample internet private lines according to the reference classification related features of the reference internet private line of the known type.
4. The method for determining the type of an internet private line according to claim 1, wherein setting type labels for the k types of sample internet private lines, the type labels corresponding to the k types of sample internet private lines one to one, comprises:
sending the clustering result of the sample internet private lines and the historical classification related features of the sample internet private lines to a classification terminal;
receiving a type label setting result of the classification terminal on the sample internet private line;
and setting a type label for the sample internet private line according to a type label setting result of the classification terminal for the sample internet private line.
5. The method for determining the type of an internet private line according to claim 1, wherein before clustering the sample internet private lines by using the preset clustering algorithm according to the historical classification related features of the sample internet private lines to obtain the k types of sample internet private lines, the method further comprises:
performing feature engineering on the historical classification related features of the sample internet private lines; the feature engineering comprises at least data preprocessing and feature selection; the data preprocessing comprises at least nondimensionalization, missing value processing and continuous feature processing; the feature selection comprises at least variance filtering and mutual information filtering.
6. The method for determining the type of an internet private line according to claim 1, wherein after determining the type of the internet private line to be classified by using the private line classification model according to the classification related features of the internet private line to be classified, the method further comprises:
obtaining classification related features of the target internet private line from ticket data of the target internet private line by using a DPI technology at intervals of a second preset time period; the target internet private line is an internet private line of a known type which appears in the second preset time period;
taking the target internet private line as a new sample internet private line, and taking the type of the target internet private line as a corresponding type label thereof to update the training data;
and updating the special line classification model by using the preset machine learning algorithm according to the updated training data.
7. An internet private line type determining apparatus, comprising an acquisition module, a classification module and a modeling module, wherein the modeling module comprises an acquisition unit, a clustering unit, a labeling unit and a training unit;
the acquisition module is used for acquiring classification related features of an internet private line to be classified; the classification related features comprise private line user features, private line network features and private line geographic features; the private line user features are used for indicating the usage habits of the user corresponding to the internet private line; the private line network features are used for indicating the network performance of the internet private line; the private line geographic features are used for indicating geographic information points (POIs) of the internet private line;
the acquisition unit is used for acquiring historical classification related features of a plurality of sample internet private lines from historical ticket data of the sample internet private lines by using a deep packet inspection (DPI) technology; the historical ticket data are ticket data of the sample internet private lines in a first preset time period before the current moment; the historical classification related features comprise private line user features, private line network features and private line geographic features of the sample internet private lines in the first preset time period; the duration of the first preset time period is at least one week;
the clustering unit is used for clustering the sample internet private lines by using a preset clustering algorithm according to the historical classification related characteristics of the sample internet private lines acquired by the acquisition unit so as to acquire k types of sample internet private lines; k is a positive integer;
the labeling unit is used for setting type labels for the k types of sample internet private lines clustered by the clustering unit, wherein the type labels correspond to the k types of sample internet private lines one to one; each type label is used for indicating the type of a private line;
the training unit is used for taking the type label set by the labeling unit on the sample internet private line and the historical classification related characteristics of the sample internet private line acquired by the acquisition unit as training data, and constructing a private line classification model by using a preset machine learning algorithm;
the classification module is used for determining the type of the internet private line to be classified by using the private line classification model according to the classification related features of the internet private line to be classified, which are acquired by the acquisition module.
8. The internet private line type determining apparatus according to claim 7, wherein the acquisition module is specifically configured to:
acquire the classification related features of the internet private line to be classified from ticket data of the internet private line to be classified by using the DPI technology.
9. The internet private line type determining apparatus according to claim 7, wherein the labeling unit is specifically configured to:
acquire reference classification related features of a reference internet private line of a known type from ticket data of the reference internet private line by using the DPI technology; the reference classification related features comprise private line user features, private line network features and private line geographic features of the reference internet private line;
and set type labels for the sample internet private lines according to the reference classification related features of the reference internet private line of the known type.
10. The internet private line type determining apparatus according to claim 7, wherein the labeling unit is specifically configured to:
send the clustering result of the clustering unit for the sample internet private lines and the historical classification related features of the sample internet private lines acquired by the acquisition unit to a classification terminal;
receiving a type label setting result of the classification terminal on the sample internet private line;
and setting a type label for the sample internet private line according to a type label setting result of the classification terminal for the sample internet private line.
11. The internet private line type determination apparatus according to claim 7, wherein the modeling module further includes a feature processing unit;
the feature processing unit is used for performing feature engineering on the historical classification related features of the sample internet private lines before the clustering unit clusters the sample internet private lines, by using the preset clustering algorithm according to the historical classification related features acquired by the acquisition unit, to obtain the k types of sample internet private lines; the feature engineering comprises at least data preprocessing and feature selection; the data preprocessing comprises at least nondimensionalization, missing value processing and continuous feature processing; the feature selection comprises at least variance filtering and mutual information filtering.
12. The internet private line type determining apparatus according to claim 7, wherein after the classification module determines the type of the internet private line to be classified,
the acquisition unit is further configured to acquire classification related features of a target internet private line from ticket data of the target internet private line by using the DPI technology at intervals of a second preset time period; the target internet private line is an internet private line of a known type which appears in the second preset time period;
the training unit is further configured to take the target internet private line acquired by the acquisition unit as a new sample internet private line, and take the type of the target internet private line as its corresponding type label, so as to update the training data;
the training unit is further configured to update the private line classification model with the preset machine learning algorithm according to the updated training data.
13. An internet private line type determining apparatus, characterized by comprising a memory, a processor, a bus and a communication interface; the memory is used for storing computer-executable instructions, and the processor is connected with the memory through the bus; when the internet private line type determining apparatus runs, the processor executes the computer-executable instructions stored in the memory, so that the internet private line type determining apparatus performs the method for determining the type of an internet private line according to any one of claims 1 to 6.
14. A computer readable storage medium comprising computer executable instructions which, when run on a computer, cause the computer to perform the internet private line type determination method of any one of claims 1-6.
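
As an illustration only, the steps of claims 1, 3 and 5 can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the patented implementation: the claims leave the "preset clustering algorithm" and the "preset machine learning algorithm" open, so k-means with farthest-first seeding and a nearest-centroid classifier are used here as stand-ins; min-max scaling stands in for the nondimensionalization step; and the three feature columns, the sample values, and the type names "office" and "hotel" are hypothetical.

```python
import math

def distance(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def fit_scaler(rows):
    """Per-column minimum and maximum, used to make features dimensionless."""
    cols = list(zip(*rows))
    return [min(c) for c in cols], [max(c) for c in cols]

def scale(row, lo, hi):
    """Min-max scale one feature vector; constant columns map to 0.0."""
    return [(v - l) / (h - l) if h > l else 0.0 for v, l, h in zip(row, lo, hi)]

def nearest(row, points):
    """Index of the point in `points` closest to `row`."""
    return min(range(len(points)), key=lambda i: distance(row, points[i]))

def mean_vec(rows):
    """Column-wise mean of a non-empty list of feature vectors."""
    return [sum(c) / len(c) for c in zip(*rows)]

def kmeans(rows, k, iters=100):
    """Cluster rows into k groups; returns (cluster index per row, centroids)."""
    # Assumption: deterministic farthest-first seeding instead of random starts.
    centroids = [rows[0]]
    while len(centroids) < k:
        centroids.append(
            max(rows, key=lambda r: min(distance(r, c) for c in centroids)))
    for _ in range(iters):
        assign = [nearest(r, centroids) for r in rows]
        updated = []
        for j in range(k):
            members = [r for r, a in zip(rows, assign) if a == j]
            # Keep the old centroid if a cluster ends up empty.
            updated.append(mean_vec(members) if members else centroids[j])
        if updated == centroids:
            break
        centroids = updated
    return [nearest(r, centroids) for r in rows], centroids

def label_clusters(centroids, references):
    """Give each cluster the type of the known-type reference line nearest to it.

    Assumes every cluster has a distinct nearest reference; one-to-one
    correspondence is not enforced in this sketch.
    """
    ref_features = [features for features, _ in references]
    return [references[nearest(c, ref_features)][1] for c in centroids]

def train(rows, labels):
    """Nearest-centroid classifier: one mean feature vector per type label."""
    return {t: mean_vec([r for r, lab in zip(rows, labels) if lab == t])
            for t in set(labels)}

def classify(model, row):
    """Return the type label whose mean vector is closest to `row`."""
    return min(model, key=lambda t: distance(row, model[t]))

# Hypothetical features per line: [daytime traffic share, mean latency (ms), POI code]
samples = [[0.90, 12.0, 1.0], [0.85, 15.0, 1.0], [0.88, 10.0, 1.0],
           [0.30, 40.0, 2.0], [0.25, 45.0, 2.0], [0.35, 42.0, 2.0]]
# Hypothetical reference lines whose types are already known.
references = [([0.90, 11.0, 1.0], "office"), ([0.30, 43.0, 2.0], "hotel")]

lo, hi = fit_scaler(samples)                      # preprocessing
X = [scale(r, lo, hi) for r in samples]
assign, centroids = kmeans(X, k=2)                # clustering into k types
cluster_types = label_clusters(
    centroids, [(scale(f, lo, hi), t) for f, t in references])
labels = [cluster_types[a] for a in assign]       # type label per sample line
model = train(X, labels)                          # classification model
print(classify(model, scale([0.80, 14.0, 1.0], lo, hi)))
```

With these made-up values the two groups are well separated, so the sketch labels the first three sample lines "office" and the last three "hotel". In practice the features would be extracted by DPI from ticket data covering at least one week, and missing value processing and feature selection would precede clustering.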

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010583164.7A CN111753023B (en) 2020-06-23 2020-06-23 Method and device for determining type of internet private line

Publications (2)

Publication Number Publication Date
CN111753023A (en) 2020-10-09
CN111753023B (en) 2023-06-06

Family

ID=72676943

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010583164.7A Active CN111753023B (en) 2020-06-23 2020-06-23 Method and device for determining type of internet private line

Country Status (1)

Country Link
CN (1) CN111753023B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102055674A (en) * 2011-01-17 2011-05-11 工业和信息化部电信传输研究所 Internet protocol (IP) message as well as information processing method and device based on same
CN102546220A (en) * 2010-12-31 2012-07-04 ***通信集团福建有限公司 Key quality indicator (KQI) composition method based on service characteristics
CN106452858A (en) * 2016-09-28 2017-02-22 北京齐尔布莱特科技有限公司 Method and device for identifying network user and computing device
CN107563429A (en) * 2017-07-27 2018-01-09 国家计算机网络与信息安全管理中心 A kind of sorting technique and device of network user colony
CN108021673A (en) * 2017-12-06 2018-05-11 北京拉勾科技有限公司 A kind of user interest model generation method, position recommend method and computing device
CN111131068A (en) * 2018-11-01 2020-05-08 ***通信集团广东有限公司 Internet private line data transmission method and device
CN111125061A (en) * 2019-12-18 2020-05-08 甘肃省卫生健康统计信息中心(西北人口信息中心) Method for standardizing and promoting health medical big data

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10528604B2 (en) * 2018-01-30 2020-01-07 Sunrise Opportunities, LLC Methods and systems for tracking the flow of trucking freight and/or other assets using mobile device geolocation data

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
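A neuro-fuzzy approach for user behaviour classification and prediction; Atta ur Rahman et al.; Journal of Cloud Computing; 1-15 *
Research on construction methods for key-customer private line access; Ou Xiuhui (欧秀惠); China New Telecommunications (中国新通信); Vol. 19 (No. 12); 56-57 *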

Also Published As

Publication number Publication date
CN111753023A (en) 2020-10-09

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant