WO2021169294A1 - Method, apparatus, and storage medium for updating an application recognition model - Google Patents

Method, apparatus, and storage medium for updating an application recognition model

Info

Publication number
WO2021169294A1
Authority
WO
WIPO (PCT)
Prior art keywords
application
data
model
unknown
data flow
Prior art date
Application number
PCT/CN2020/118993
Other languages
English (en)
French (fr)
Inventor
司晓云
胡新宇
薛莉
吴俊�
张亮
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Priority to EP20921538.3A (published as EP4095768A4)
Publication of WO2021169294A1 (zh)
Priority to US17/822,581 (published as US20220414487A1)

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00: Machine learning
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00: Computing arrangements using knowledge-based models
    • G06N5/02: Knowledge representation; Symbolic representation
    • G06N5/022: Knowledge engineering; Knowledge acquisition
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00: Arrangements for program control, e.g. control units
    • G06F9/06: Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46: Multiprogramming arrangements
    • G06F9/50: Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5061: Partitioning or combining of resources
    • G06F9/5066: Algorithms for mapping a plurality of inter-dependent sub-tasks onto a plurality of physical CPUs
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00: Arrangements for program control, e.g. control units
    • G06F9/06: Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46: Multiprogramming arrangements
    • G06F9/50: Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5061: Partitioning or combining of resources
    • G06F9/5072: Grid computing

Definitions

  • This application relates to the technical field of artificial intelligence (AI), and in particular to a method, device and storage medium for updating an application recognition model.
  • Each application performs data communication with its corresponding application server when running, thereby generating multiple data flows.
  • The client device can recognize each data flow through the application recognition model to determine the application category to which the data flow belongs, and then schedule the data flows appropriately according to their application categories.
  • The application recognition model used for application recognition is deployed on the client device after being trained offline using training samples.
  • The application recognition results produced by an application recognition model that is only trained offline are not very accurate.
  • This application provides a method, device, and storage medium for updating an application recognition model, which can be used to improve the ability of the application recognition model to recognize emerging applications and improve the accuracy of the application recognition model to recognize upgraded applications.
  • The technical solution is as follows:
  • A method for updating an application recognition model includes: determining multiple training samples according to the application recognition result of each of multiple data flows, where each application recognition result is obtained by recognizing the corresponding data flow through the application recognition model; training the application recognition model according to the multiple training samples; sending the model data of the trained application recognition model to the server, so that the server obtains jointly updated model data based on the model data received from multiple client devices; receiving the jointly updated model data sent by the server; and obtaining the jointly updated application recognition model according to the jointly updated model data.
  • The client device can determine multiple training samples according to the recognition results of multiple data flows, and then use the training samples to train the application recognition model. After that, the client device can upload the model data of the trained application recognition model to the server, and the server performs a joint update based on the model data uploaded by multiple client devices. The client device may then obtain the jointly updated application recognition model according to the jointly updated model data issued by the server. The model data of the jointly updated application recognition model is thus, in effect, dynamically updated by multiple client devices based on real-time data traffic. In this way, even if an application is updated or new applications emerge, the model data of the application recognition model is still dynamically updated according to the data traffic, so the recognition accuracy of the application recognition model can be better guaranteed.
  • In the first implementation, unknown data flows belonging to unknown applications can be obtained from the multiple data flows according to the application recognition result of each data flow, and multiple training samples can be generated based on the unknown data flows.
  • In the second implementation, target data flows belonging to a target application category can be obtained from the multiple data flows according to the application recognition result of each data flow, and multiple training samples can be generated according to the target data flows, where the target application category is an application category for which the flow characteristics of the corresponding data flows have drifted within a specified time period.
  • The third implementation combines the first and second implementations: the unknown data flows belonging to unknown applications and the target data flows belonging to the target application category are both obtained from the multiple data flows, and multiple training samples are generated according to the unknown data flows and the target data flows.
  • The client device can obtain the unknown data flows belonging to unknown applications among the multiple data flows, and then generate training samples according to the unknown data flows to train the application recognition model. The updated application recognition model can then better identify emerging applications, which enhances its ability to recognize them.
  • The client device can identify an updated application by detecting whether the flow characteristics of its data flows have drifted, and then generate training samples based on the data flows of the updated application to retrain the application recognition model, so that the finally updated application recognition model can better recognize the updated application, which improves the accuracy of application recognition.
  • The process of obtaining unknown data flows belonging to unknown applications from the multiple data flows may be as follows: among the multiple data flows, the data flows whose application recognition results meet the unknown application condition are obtained, and the obtained data flows are regarded as the unknown data flows belonging to unknown applications. The unknown application condition means that the confidence level corresponding to each application category in the application recognition result is less than a reference threshold, or that the application recognition result does not belong to any of multiple specified clusters, where the specified clusters are obtained by clustering the flow characteristics of the data flows of each application category in the original training sample set of the application recognition model.
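The two forms of the unknown application condition can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function names, the dictionary layout of a recognition result, and the use of a Euclidean distance to cluster centroids with a fixed radius are all assumptions made for the example.

```python
import math
from typing import Dict, List, Sequence


def unknown_by_confidence(confidences: Dict[str, float],
                          reference_threshold: float) -> bool:
    """Condition 1: every category's confidence is below the threshold.

    `confidences` maps application category -> confidence (assumed layout).
    """
    return all(value < reference_threshold for value in confidences.values())


def unknown_by_clusters(features: Sequence[float],
                        centroids: List[Sequence[float]],
                        radius: float) -> bool:
    """Condition 2 (assumed form): the flow lies outside every specified cluster.

    Membership is approximated here as "within `radius` of a centroid";
    the source does not say how cluster membership is tested.
    """
    return all(math.dist(features, centroid) > radius for centroid in centroids)


def select_unknown_flows(results: Dict[str, Dict[str, float]],
                         reference_threshold: float) -> List[str]:
    """Return the ids of flows whose recognition result meets condition 1."""
    return [flow_id for flow_id, confidences in results.items()
            if unknown_by_confidence(confidences, reference_threshold)]
```

Either check alone is enough to treat a data flow as unknown, since the source joins the two conditions with "or".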
  • The process of generating training samples based on an unknown data flow may be as follows: acquire the flow characteristics of the unknown data flow; obtain, from the server, the application information of the application to which the unknown data flow belongs according to those flow characteristics; and use the flow characteristics of the unknown data flow as the training data in a first training sample and the application information of the application to which the unknown data flow belongs as the label data in the first training sample, where the first training sample is one of the multiple training samples.
  • The process of obtaining the target data flows belonging to the target application category from the multiple data flows may be as follows: according to the application recognition result of each data flow, determine from the multiple data flows a plurality of known data flows that do not belong to an unknown application within the specified time period; according to the application recognition results of the known data flows, the specified time period, and the identifier of the client device, obtain from the server the feature drift flags corresponding to the application categories included in those application recognition results, where a feature drift flag indicates whether the flow characteristics of the data flows of the corresponding application category have drifted; according to the feature drift flag of each application category, determine from the application categories the target application category whose flow characteristics have drifted; and obtain the target data flows belonging to the target application category from the known data flows.
  • The manner of generating training samples according to a target data flow may be as follows: use the flow characteristics of the target data flow as the training data in a second training sample, and use the application category to which the target data flow belongs, as indicated by its application recognition result, as the label data in the second training sample, where the second training sample is one of the multiple training samples.
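The two kinds of training sample described above differ only in where the label comes from. The sketch below illustrates this under stated assumptions: `TrainingSample`, `lookup_application_info` (standing in for the request to the server), and the choice of the highest-confidence category as "the category indicated by the recognition result" are hypothetical and not taken from the source.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence


@dataclass
class TrainingSample:
    features: List[float]  # flow characteristics used as training data
    label: str             # application information or category used as label data


def first_training_sample(
    features: Sequence[float],
    lookup_application_info: Callable[[Sequence[float]], str],
) -> TrainingSample:
    """Unknown flow: the label is application information returned by the server."""
    return TrainingSample(list(features), lookup_application_info(features))


def second_training_sample(
    features: Sequence[float],
    confidences: Dict[str, float],
) -> TrainingSample:
    """Drifted flow: the label is the category the model itself predicted.

    Taking the highest-confidence category is an assumption of this sketch.
    """
    predicted = max(confidences, key=confidences.get)
    return TrainingSample(list(features), predicted)
```

In the third implementation, both kinds of sample would simply be collected into the same training set.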
  • The model data of the trained application recognition model includes the model parameters of the trained application recognition model, or includes the difference data between the model parameters of the trained application recognition model and the model parameters of the application recognition model before training.
  • The jointly updated model data includes the model parameters of the jointly updated application recognition model, or includes the difference data between the model parameters of the jointly updated application recognition model and the model parameters of the application recognition model before training.
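The two forms of model data (full parameters or difference data) can be illustrated with a small sketch. It assumes, purely for the example, that model parameters are stored as a dictionary of named lists of floats; the source does not specify any parameter layout.

```python
from typing import Dict, List

Params = Dict[str, List[float]]  # assumed layout: parameter name -> values


def parameter_difference(before: Params, after: Params) -> Params:
    """Difference data: trained parameters minus pre-training parameters."""
    return {name: [a - b for a, b in zip(after[name], before[name])]
            for name in after}


def apply_difference(before: Params, difference: Params) -> Params:
    """Rebuild the updated parameters from the pre-training ones plus the difference."""
    return {name: [b + d for b, d in zip(before[name], difference[name])]
            for name in before}
```

Sending difference data only works if the receiver holds the same pre-training parameters, which is the case when every round starts from the jointly updated model.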
  • A method for updating an application recognition model includes: receiving model data of trained application recognition models sent by multiple client devices, where each trained application recognition model is obtained by the corresponding client device training the application recognition model according to multiple training samples, and the multiple training samples are determined by the corresponding client device according to the application recognition result of each of multiple data flows; obtaining jointly updated model data according to the received model data; and sending the jointly updated model data to the multiple client devices, so that the multiple client devices obtain the jointly updated application recognition model according to the jointly updated model data.
  • the server may obtain the jointly updated model data according to the received trained model data sent by multiple client devices. Since the trained model data sent by the client device is dynamically updated according to the real-time data flow, the model data after the joint update is equivalent to the dynamic update of each client device according to the real-time data flow. In this way, even if the application is updated or other emerging applications appear, since the model data of the application recognition model is dynamically updated according to the data traffic corresponding to each client device, the recognition accuracy of the application recognition model can be better guaranteed.
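The source does not state how the server combines the uploaded model data. One common choice, shown here only as an assumed example, is sample-weighted averaging in the style of federated averaging; the function name and the `(params, num_samples)` upload format are hypothetical.

```python
from typing import Dict, List, Tuple

Params = Dict[str, List[float]]  # assumed layout: parameter name -> values


def joint_update(uploads: List[Tuple[Params, int]]) -> Params:
    """Average client parameters, weighting each client by its sample count.

    `uploads` holds one (parameters, number_of_training_samples) pair per
    client device. Weighted averaging is an assumption of this sketch.
    """
    total = sum(count for _, count in uploads)
    if total <= 0:
        raise ValueError("at least one client must report training samples")

    reference = uploads[0][0]
    merged: Params = {}
    for name, values in reference.items():
        merged[name] = [
            sum(params[name][i] * count for params, count in uploads) / total
            for i in range(len(values))
        ]
    return merged
```

The same routine could be applied to difference data instead of full parameters, since averaging is linear.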
  • The flow characteristics of an unknown data flow sent by a first client device may also be received, where the unknown data flow is a data flow that the first client device determines, from multiple data flows, to belong to an unknown application; the application information of the application to which the unknown data flow belongs is obtained according to the flow characteristics of the unknown data flow; and that application information is sent to the first client device, so that the first client device generates a training sample according to the application information of the application to which the unknown data flow belongs.
  • the server may obtain corresponding application information according to the flow characteristics of the unknown data flow sent by the client device, and then feed back the application information to the client device.
  • The client device can then generate training samples based on the unknown data flow to train the application recognition model. The updated application recognition model can better recognize emerging applications, which enhances its ability to recognize them.
  • The application recognition results of multiple known data flows sent by the first client device may also be received, where the known data flows are data flows, determined from the multiple data flows, that do not belong to an unknown application within a specified time period. According to each application category and the confidence level corresponding to each application category included in the application recognition results of the known data flows, the current portrait of the corresponding application category is determined. According to the identifier of the first client device, the most recently determined portrait of each application category corresponding to the specified time period is obtained. According to the current portrait of each application category and the previously obtained portrait of the same application category, the feature drift flag of that application category is determined, where the feature drift flag indicates whether the flow characteristics of the data flows of the corresponding application category have drifted. The feature drift flag of each application category is then sent to the first client device, so that the first client device can determine, according to the feature drift flags, the target application category whose flow characteristics have drifted.
  • The server can detect whether the flow characteristics of the data flows of each application category have drifted by determining the portrait of each application category, and then feed back to the client device a feature drift flag indicating whether drift has occurred.
  • The client device can identify updated and upgraded applications according to the feature drift flags, and then generate training samples from the data flows of those applications to retrain the application recognition model, so that the finally updated application recognition model can better recognize the updated applications, which improves the accuracy of application recognition.
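The portrait comparison on the server can be sketched as below. The source only says that a portrait is determined from the application categories and their confidence levels; treating the portrait as the mean confidence per category and flagging drift when it changes by more than a fixed tolerance are assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def build_portraits(results: Iterable[Tuple[str, float]]) -> Dict[str, float]:
    """Assumed portrait: mean confidence per application category.

    `results` yields (category, confidence) pairs for the known data flows.
    """
    sums: Dict[str, float] = defaultdict(float)
    counts: Dict[str, int] = defaultdict(int)
    for category, confidence in results:
        sums[category] += confidence
        counts[category] += 1
    return {category: sums[category] / counts[category] for category in sums}


def feature_drift_flags(current: Dict[str, float],
                        previous: Dict[str, float],
                        tolerance: float = 0.1) -> Dict[str, bool]:
    """Flag a category as drifted when its portrait moved by more than `tolerance`.

    Categories without an earlier portrait are reported as not drifted,
    which is a choice of this sketch rather than something the source states.
    """
    flags: Dict[str, bool] = {}
    for category, value in current.items():
        if category in previous:
            flags[category] = abs(value - previous[category]) > tolerance
        else:
            flags[category] = False
    return flags
```

The client device would then keep only the known data flows whose category is flagged, and use them as target data flows.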
  • an apparatus for updating an application recognition model has the function of implementing the method and behavior of updating the application recognition model in the first aspect or the second aspect.
  • the apparatus for updating an application recognition model includes at least one module, and the at least one module is used to implement the method for updating an application recognition model provided in the first aspect or the second aspect.
  • An apparatus for updating an application recognition model includes a processor and a memory, where the memory is used to store a program that supports the apparatus in executing the method for updating the application recognition model provided in the first aspect or the second aspect, and to store the data involved in implementing that method.
  • the processor is configured to execute a program stored in the memory.
  • The apparatus may further include a communication bus, which is used to establish a communication connection between the processor and the memory.
  • A computer-readable storage medium stores instructions that, when run on a computer, cause the computer to execute the method for updating an application recognition model described in the first or second aspect above.
  • a computer program product containing instructions, which when running on a computer, causes the computer to execute the method for updating an application recognition model described in the first or second aspect.
  • The client device can determine multiple training samples according to the recognition results of multiple data flows, and then use the training samples to train the application recognition model. After that, the client device can upload the model data of the trained application recognition model to the server, and the server performs a joint update based on the model data uploaded by multiple client devices. The client device may then obtain the jointly updated application recognition model according to the jointly updated model data issued by the server. The jointly updated application recognition model is thus, in effect, dynamically updated based on the characteristics of the data traffic collected by multiple client devices. In this way, even if an application is updated or new applications emerge, the application recognition model is still dynamically updated based on the characteristics of the data traffic collected by each client device, so that its recognition accuracy can be better guaranteed.
  • FIG. 1 is a system architecture diagram involved in a method for updating an application recognition model provided by an embodiment of the present application;
  • FIG. 2 is a diagram of another system architecture involved in the method for updating an application recognition model provided by an embodiment of the present application;
  • FIG. 3 is a schematic structural diagram of a computer device for updating an application recognition model provided by an embodiment of the present application;
  • FIG. 4 is a flowchart of a method for updating an application recognition model provided by an embodiment of the present application;
  • FIG. 5 is a schematic structural diagram of an apparatus for updating an application recognition model provided by an embodiment of the present application;
  • FIG. 6 is a schematic structural diagram of a determining module for generating multiple training samples provided by an embodiment of the present application;
  • FIG. 7 is a schematic structural diagram of another apparatus for updating an application recognition model provided by an embodiment of the present application.
  • Application identification refers to identifying the data traffic of the client device to determine the application category to which the data traffic belongs. Once the application category of the data traffic is known, the client device can use it for other purposes, for example, to schedule each data flow more reasonably, that is, to configure the sending and receiving priority of a data flow according to the application category to which it belongs.
  • Offline-trained AI models can be used to identify data traffic.
  • The recognition accuracy of the AI model depends on whether the training sample set used during offline training effectively describes the traffic distribution that may occur in reality.
  • The training sample set of the originally trained AI model gradually becomes unable to characterize the current distribution of traffic.
  • As a result, the recognition accuracy of the AI model declines.
  • The method for updating the application recognition model provided by the embodiments of this application can be used in the above scenario.
  • After the client device retrains the AI model according to the flow characteristics of the collected data traffic, the model data of the retrained AI model is uploaded to the server, and the server performs a joint update based on the model data uploaded by multiple client devices, so as to update the AI model.
  • FIG. 1 is a system architecture diagram involved in a method for updating an application recognition model provided by an embodiment of the present application.
  • The system includes multiple client devices 101 and a server 102. The multiple client devices 101 can communicate with the server 102.
  • an application recognition model is deployed on each client device 101.
  • multiple applications may be installed on the client device 101, or multiple applications may be installed on the terminal corresponding to the client device 101.
  • When an application is running, it can communicate with its corresponding application server to generate data traffic, which passes through the client device.
  • Each client device can identify the data flow through its own application identification model, so as to obtain the application identification result.
  • Each client device 101 can use the method provided in the embodiments of the present application to collect training samples according to the application recognition results of the data traffic, and then retrain the application recognition model according to the collected training samples. After that, each client device 101 can upload the model data of the retrained application recognition model to the server 102.
  • the server 102 may receive the model data uploaded by each client device, and obtain the jointly updated model data according to the model data uploaded by each client device. The server can then deliver the jointly updated model data to each client device 101.
  • After each client device 101 receives the jointly updated model data issued by the server 102, it can obtain the jointly updated application recognition model according to the jointly updated model data. If the jointly updated application recognition model meets the convergence condition, the client device 101 can subsequently use it to identify data traffic. If the jointly updated application recognition model does not meet the convergence condition, the client device 101 can continue to train the application recognition model and upload the trained model data to the server, and the server can continue to perform joint updates until the jointly updated application recognition model meets the convergence condition. That is, in the embodiments of the present application, the client devices and the server may perform a single round or multiple rounds of updating the application recognition model, which is not limited in the embodiments of the present application.
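The single-round or multi-round procedure described above can be sketched as a simple loop. Everything named here is hypothetical: `local_trainers` stands for each client device's local training, `aggregate` for the server's joint update, and `has_converged` for the convergence condition, which the source does not define. The `max_rounds` cap is also an addition of this sketch.

```python
from typing import Callable, List, Sequence, Tuple

Params = List[float]  # assumed flat parameter vector, for brevity


def run_update_rounds(
    initial: Params,
    local_trainers: Sequence[Callable[[Params], Params]],
    aggregate: Callable[[List[Params]], Params],
    has_converged: Callable[[Params, Params], bool],
    max_rounds: int = 10,
) -> Tuple[Params, int]:
    """Repeat local training and joint update until the model converges.

    Returns the final parameters and the number of rounds performed.
    """
    params = initial
    for round_index in range(1, max_rounds + 1):
        # Each client device trains locally, starting from the shared model.
        uploads = [train(params) for train in local_trainers]
        # The server combines the uploads into jointly updated model data.
        updated = aggregate(uploads)
        converged = has_converged(params, updated)
        params = updated
        if converged:
            return params, round_index
    return params, max_rounds
```

Setting `max_rounds=1` corresponds to the single-round case mentioned in the text.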
  • the client device 101 may be a device that supports local training.
  • The client device 101 may be a mobile phone, a tablet computer, a desktop computer, a notebook computer, a switch, an optical line terminal (OLT), an optical network terminal (ONT), a router, or the like.
  • the embodiment of the application does not limit this.
  • the server 102 may be a device supporting joint learning.
  • the server 102 may be a server used for joint training, or a server cluster used for joint training, or a cloud service platform that supports joint training.
  • the embodiment of the application does not limit this.
  • FIG. 2 is a schematic diagram of another system architecture shown in an embodiment of the present application.
  • The system may include multiple client devices 201, multiple network devices 202, and a server 203.
  • Each client device 201 may correspond to one or more network devices 202, and the one or more network devices 202 corresponding to each client device 201 are network devices in one site.
  • the client device 201 can also be called a site analysis device (also called a site analysis platform), which can be a server, or a server cluster composed of several servers, or a cloud computing service center.
  • the system involved in the model update method includes multiple site networks.
  • the site network can be a core network or an edge network, and the users of each site network can be operators or corporate customers.
  • Different site networks can be different networks divided according to corresponding dimensions, for example, they can be networks of different regions, networks of different operators, networks of different services, and different network domains.
  • Each site network includes one or more network devices, and multiple client devices 201 can have a one-to-one correspondence with multiple site networks.
  • Each client device 201 is used to provide data analysis services for the corresponding site network.
  • Each client device 201 can correspond to one or more network devices 202 in a site network and provide data analysis and other services for them.
  • Each client device 201 can be located inside the corresponding site network, or outside it.
  • Each client device 201 and the server 203 are connected through a wired network or a wireless network.
  • The communication network involved in the embodiments of this application is a second-generation (2G) communication network, a third-generation (3G) communication network, a Long Term Evolution (LTE) communication network, a fifth-generation (5G) communication network, or the like.
  • The server 203 may be a cloud analysis device (also called a cloud analysis platform), which may be a computer, a server, a server cluster composed of several servers, or a cloud computing service center, and which is deployed at the back end of the service network.
  • The network device 202 may be a mobile phone, a tablet computer, a desktop computer, a notebook computer, a switch, an optical line terminal (OLT), an optical network terminal (ONT), a router, or the like.
  • FIG. 3 is a schematic structural diagram of a computer device provided by an embodiment of the present application. Both the client device and the server in FIG. 1 or FIG. 2 can be implemented by the computer device shown in FIG. 3.
  • the computer device includes at least one processor 301, a communication bus 302, a memory 303, and at least one communication interface 304.
  • the processor 301 may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), a graphics processing unit (GPU), or any combination thereof.
  • The processor 301 may include one or more chips, and may include an AI accelerator, such as a neural processing unit (NPU).
  • the communication bus 302 may include a path for transferring information between the above-mentioned components.
  • The memory 303 can be a read-only memory (ROM) or another type of static storage device that can store static information and instructions, a random access memory (RAM) or another type of dynamic storage device that can store information and instructions, an electrically erasable programmable read-only memory (EEPROM), a compact disc read-only memory (CD-ROM) or other optical disc storage (including compact discs, laser discs, optical discs, digital versatile discs, Blu-ray discs, and the like), a magnetic disk storage medium or another magnetic storage device, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer, but is not limited to these.
  • the memory 303 may exist independently, and is connected to the processor 301 through the communication bus 302.
  • the memory 303 may also be integrated with the processor 301.
  • the memory 303 may store computer instructions. When the computer instructions stored in the memory 303 are executed by the processor 301, the method of updating the application recognition model of the present application may be implemented.
  • the memory 303 may also store intermediate data and/or result data generated by the processor during the execution of the foregoing method.
  • The communication interface 304 uses any transceiver-type apparatus to communicate with other devices or communication networks, such as Ethernet, a radio access network (RAN), or a wireless local area network (WLAN).
  • the processor 301 may include one or more CPUs.
  • the computer device may include multiple processors.
  • Each of these processors can be a single-core (single-CPU) processor or a multi-core (multi-CPU) processor.
  • the processor here may refer to one or more devices, circuits, and/or processing cores for processing data (for example, computer program instructions).
  • FIG. 4 is a flowchart of a method for updating an application recognition model provided by an embodiment of the present application. This method can be applied to the system shown in FIG. 1 or FIG. 2. Referring to FIG. 4, the method includes the following steps:
  • Step 401: The client device determines multiple training samples according to the application identification result of each of multiple data flows, where each application identification result is obtained by identifying the corresponding data flow with the application identification model.
  • an application recognition model is deployed on the client device.
  • the application recognition model can be an AI model trained offline, or an AI model after multiple updates.
  • the client device may be the client device in the system shown in FIG. 1 or FIG. 2.
  • When the client device is the client device in FIG. 1 and the client device, or the terminal corresponding to the client device, is running an application, it can exchange data with the application server of that application, thereby generating data traffic that passes through the client device.
  • the client device may send a data request to the application server, and the application server may return application data to the client device.
  • Both the data sent by the client device to the application server and the data returned by the application server to the terminal can be referred to as data traffic passing through the client device.
  • the multiple data flows in this step may refer to these data flows passing through the client device.
  • Alternatively, the client device may correspond to one or more network devices in a site network.
  • the one or more network devices may report the flow characteristics of the data traffic passing through to the client device.
  • the multiple pieces of data traffic in this step may refer to the data traffic passing through one or more network devices in the site network corresponding to the client device.
  • That is, the multiple data flows may be data traffic that passes through the client device, or data traffic that does not pass through the client device but passes through one or more network devices in the site network served by the client device.
  • the client device can obtain the flow characteristics of the data flow, and input the flow characteristics into the application identification model, so as to obtain the application identification result of the data flow through the identification of the application identification model.
  • the application identification result may include the application category to which the data flow may belong and the confidence level corresponding to each application category.
  • the application category can be a specific application name, for example, A application, B application, and so on.
  • the application category may also be an application type, for example, it may be a video application, a game application, a voice application, and so on. That is to say, in the embodiment of the present application, the application identification result may be used to indicate which application the data flow belongs to, or may be used to indicate which type of application the data flow belongs to, which is not limited in the embodiment of the present application.
  • the client device can obtain the application identification result of each data flow through the application identification model.
  • multiple pieces of data traffic may be data traffic of unknown applications, or data traffic of known applications.
  • the client device can generate a first training sample according to the data traffic belonging to an unknown application, and the client device can generate a second training sample according to the data traffic with a large change in traffic characteristics among the data traffic of the known application.
  • After the client device obtains the application recognition results through the application recognition model, it can determine from the application recognition result of each data flow whether that data flow is unknown data traffic belonging to an unknown application, and thereby obtain one or more unknown data flows belonging to unknown applications from the multiple data flows.
  • the client device may detect whether the first data flow meets the unknown application condition according to the application identification result of the first data flow. If the first data flow meets the unknown application condition, it indicates that the first data flow is an unknown data flow belonging to an unknown application.
  • The unknown application condition may be that the confidence of every application category in the application recognition result is less than a specified threshold. That is, if the confidence levels of all the application categories included in the application recognition result of the first data flow are less than the specified threshold, it can be determined that the first data flow meets the unknown application condition, that is, the first data flow is unknown data traffic belonging to an unknown application. Otherwise, it can be determined that the first data flow does not meet the unknown application condition and is known data traffic that does not belong to an unknown application.
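  • The confidence-threshold form of the unknown application condition can be sketched as follows. This is a minimal illustration, not the implementation of the embodiment: the function name, the dictionary layout of the application identification result, and the threshold value 0.6 are assumptions made for the example.

```python
from typing import Mapping


def meets_unknown_condition(result: Mapping[str, float], threshold: float) -> bool:
    """Return True when every application category's confidence is below the threshold."""
    return all(confidence < threshold for confidence in result.values())


# Hypothetical application identification results: application category -> confidence.
known_flow = {"video": 0.92, "game": 0.05, "voice": 0.03}
unknown_flow = {"video": 0.40, "game": 0.35, "voice": 0.25}

THRESHOLD = 0.6  # assumed value; the text only speaks of a "specified threshold"

print(meets_unknown_condition(known_flow, THRESHOLD))    # False
print(meets_unknown_condition(unknown_flow, THRESHOLD))  # True
```

  • A flow is treated as known as soon as one category reaches the threshold, which mirrors the "otherwise" branch described above.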
  • The unknown application condition may also be that the application recognition result does not belong to any of multiple specified clusters, where the multiple specified clusters are obtained by clustering the traffic characteristics of the data flows of each application category in the original training sample set of the application recognition model.
  • That is, in the embodiment of the present application, the client device may cluster the training samples of each application category in the original training sample set used in offline training to obtain multiple clusters, and these clusters are the aforementioned multiple specified clusters. In this way, after identifying the flow characteristics of the first data flow through the application identification model, the client device can compare the identification result with the cluster center and confidence radius of the cluster corresponding to each application category.
  • If the identification result does not fall within the confidence radius of any cluster center, that is, does not belong to any of the multiple clusters, it can be determined that the first data flow meets the unknown application condition, that is, the first data flow is unknown data traffic belonging to an unknown application. Otherwise, it can be determined that the first data flow does not meet the unknown application condition and is known data traffic that does not belong to an unknown application.
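  • The cluster-based form of the unknown application condition can be sketched in the same way. The sketch assumes that the identification result is represented as a numeric vector and that distance to a cluster center is Euclidean; the text does not specify either, and the function name and the sample clusters are hypothetical.

```python
import math
from typing import Iterable, Sequence, Tuple

# (cluster center, confidence radius); both are assumed representations.
Cluster = Tuple[Sequence[float], float]


def meets_unknown_condition_by_cluster(
    point: Sequence[float], clusters: Iterable[Cluster]
) -> bool:
    """Return True when the point lies outside the confidence radius of every cluster."""
    for center, radius in clusters:
        if math.dist(point, center) <= radius:
            return False  # the point belongs to this specified cluster
    return True


# Two hypothetical clusters obtained from the original training sample set.
clusters = [((0.0, 0.0), 1.0), ((5.0, 5.0), 1.5)]

print(meets_unknown_condition_by_cluster((0.3, 0.4), clusters))  # False
print(meets_unknown_condition_by_cluster((3.0, 3.0), clusters))  # True
```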
  • The client device can generate a corresponding first training sample according to each unknown data flow.
  • For each unknown data flow, the client device can obtain the traffic characteristics of the unknown data flow; obtain from the server, according to those traffic characteristics, the application information of the application to which the unknown data flow belongs; and then use the flow characteristics of the unknown data flow as the training data in the first training sample and the application information of the application to which the unknown data flow belongs as the label data in the first training sample.
  • The flow characteristics of the unknown data flow may include the sending and receiving times of the multiple data packets included in the unknown data flow, the 5-tuple of the sent and received packets, the domain name system (DNS) address, the lengths of the sent and received packets, the number of sent and received data packets, the uniform resource locator (URL), and so on.
  • After the client device obtains the flow characteristics of the unknown data flow, it can obtain from the server the application information of the application to which the unknown data flow belongs, according to key traffic information contained in those flow characteristics, such as the DNS address, the 5-tuple of the sent and received packets, or the URL.
  • The client device may send the key traffic information contained in the flow characteristics of the unknown data flow to the server.
  • The server may obtain the application information of the application to which the unknown data flow belongs according to the key traffic information.
  • The server may store a mapping relationship between key traffic information and application information.
  • The client device can send the URL contained in the flow characteristics of the unknown data flow to the server. After receiving the URL, the server can obtain the application information corresponding to the URL from the stored mapping relationship and send it to the client device.
  • The client device can use the received application information as the application information of the application to which the unknown data flow belongs.
  • the application information may include information such as application name and application type.
  • The client device may send the 5-tuple of the sent and received packets to the server, and the server may obtain the IP address from it, obtain the application information corresponding to the IP address from the stored mapping relationship, and send the obtained application information to the client device.
  • The client device can use the application information as the application information of the application to which the unknown data flow belongs.
  • The client device may send the DNS address included in the key traffic information of the unknown data flow to the server, and the server may obtain the application information corresponding to the DNS address from the stored mapping relationship and return it to the client device. After receiving the application information, the client device can use the application information as the application information of the application to which the unknown data flow belongs.
  • The mapping relationship stored in the server may also be a mapping relationship among IP addresses, DNS addresses, URLs, and application information.
  • In this case, the client device can send any one or more of the packet 5-tuple, DNS address, and URL in the flow characteristics of the unknown data flow to the server to obtain the corresponding application information, which is not repeated in this embodiment of the application.
  • the client device can directly send the flow characteristics of the unknown data flow to the server, and the server can obtain the application information of the application to which the unknown data flow belongs according to the key traffic information contained in the flow characteristics.
  • The server can extract key traffic information of the corresponding type from the traffic characteristics according to the type of key traffic information stored in the mapping relationship, and then obtain the corresponding application information from the mapping relationship according to the extracted key traffic information.
  • For example, the server may obtain the URL from the received traffic characteristics, and then obtain the corresponding application information according to the obtained URL.
  • In some cases, the server may be unable to find the corresponding application information in the mapping relationship based on the key traffic information reported by the client device.
  • In this case, the server can display the key traffic information so that a technician can check it and determine the corresponding application information.
  • After the technician inputs the application information, the server can return the application information to the client device.
  • The server also needs to store the key traffic information and the application information input by the technician in the aforementioned mapping relationship for subsequent queries.
  • After receiving from the server the application information of the application to which the unknown data flow belongs, the client device can use the previously acquired flow characteristics of the unknown data flow as training data, and use the received application information as the label data corresponding to the training data.
  • The first training sample corresponding to the unknown data flow is composed of the training data and the label data.
  • The first training sample is stored in the training data storage area of the client device.
  • For each unknown data flow, the client device can use the above method to obtain a corresponding first training sample.
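  • The generation of a first training sample from an unknown data flow can be sketched as follows. The sketch collapses the client-server exchange into a local dictionary lookup; the `FlowFeatures` fields, the mapping keys, the example addresses, and all function names are hypothetical and only illustrate the mapping between key traffic information and application information.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class FlowFeatures:
    """Assumed subset of the flow characteristics of one data flow."""
    url: Optional[str] = None
    dns_address: Optional[str] = None
    server_ip: Optional[str] = None  # taken from the packet 5-tuple
    packet_lengths: Tuple[int, ...] = ()


# Server-side mapping: (kind of key traffic information, value) -> application information.
AppInfo = Dict[str, str]
MAPPING: Dict[Tuple[str, str], AppInfo] = {
    ("url", "https://video.example.com/play"): {"name": "A application", "type": "video"},
    ("ip", "203.0.113.7"): {"name": "B application", "type": "game"},
}


def lookup_application_info(features: FlowFeatures) -> Optional[AppInfo]:
    """Try URL, then DNS address, then IP address against the stored mapping."""
    candidates = (
        ("url", features.url),
        ("dns", features.dns_address),
        ("ip", features.server_ip),
    )
    for kind, value in candidates:
        if value is not None and (kind, value) in MAPPING:
            return MAPPING[(kind, value)]
    return None  # in the text, a technician would then supply the information


def build_first_training_sample(features: FlowFeatures) -> Optional[dict]:
    """Pair the flow characteristics (training data) with the application information (label)."""
    info = lookup_application_info(features)
    if info is None:
        return None
    return {"training_data": features, "label_data": info}


sample = build_first_training_sample(
    FlowFeatures(server_ip="203.0.113.7", packet_lengths=(80, 96, 120))
)
print(sample["label_data"])  # {'name': 'B application', 'type': 'game'}
```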
  • The client device can also determine, from the multiple data flows and according to the application recognition result of each data flow, the target data traffic belonging to a target application category, where the target application category refers to an application category whose corresponding data traffic exhibits feature drift in its flow characteristics within a specified time period. After that, a second training sample can be generated according to the target data traffic belonging to the target application category.
  • The client device can determine, from the multiple data flows and according to the application recognition result of each data flow, multiple known data flows within a specified time period that do not belong to unknown applications; obtain from the server, according to the application recognition results of the multiple known data flows, the specified time period, and the identification of the client device, the feature drift values corresponding to the multiple application categories included in those application recognition results; determine, among the multiple application categories, the target application category whose feature drift value is equal to the reference drift value; and obtain the data traffic belonging to the target application category from the multiple known data flows.
  • For each data flow, the client device can detect whether the application recognition result of the data flow meets the unknown application condition. If it does not, the data flow can be determined to be known data traffic that does not belong to an unknown application. For how to detect whether the application identification result of a data flow meets the unknown application condition, refer to the foregoing description; details are not repeated here.
  • The client device may store the application identification results of data flows by time period. Based on this, when it determines that a data flow is known data traffic, the client device can, according to the receiving or sending time of that known data flow, add its application identification result to the identification result set of the specified time period that contains that time. In this way, the client device can obtain multiple known data flows within a specified time period.
  • For example, suppose that one of the multiple data flows is a known data flow whose receiving or sending time is 19:30, and that the client device stores an identification result set for the time period 19:00-21:00. The client device can then add the identification result of the known data flow to the identification result set for this time period. In this way, the client device can determine all the known data flows in this time period, and then obtain the application identification results of the known data flows in this time period.
  • the client device can determine the known data flow in the corresponding time period through the above method.
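  • Storing application identification results by time period can be sketched as follows, using the 19:30 example above. The half-open interval convention, the function names, and the data layout are assumptions; time periods that cross midnight are not handled in this sketch.

```python
from datetime import time
from typing import Dict, List, Optional, Sequence, Tuple

Period = Tuple[time, time]  # assumed half-open interval [start, end)


def find_period(moment: time, periods: Sequence[Period]) -> Optional[Period]:
    """Return the time period that contains the given receiving or sending time."""
    for start, end in periods:
        if start <= moment < end:
            return (start, end)
    return None


def store_result(
    result_sets: Dict[Period, List[dict]], moment: time, result: dict
) -> bool:
    """Add an identification result to the result set of its time period."""
    period = find_period(moment, list(result_sets))
    if period is None:
        return False  # no result set is kept for this time
    result_sets[period].append(result)
    return True


evening: Period = (time(19, 0), time(21, 0))
result_sets: Dict[Period, List[dict]] = {evening: []}

stored = store_result(result_sets, time(19, 30), {"video": 0.9, "game": 0.1})
print(stored, len(result_sets[evening]))  # True 1
```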
  • After obtaining the application identification results of multiple known data flows within a specified time period, the client device can send these application identification results, the specified time period, and the identification of the client device to the server, either one by one, in batches, or all at once.
  • the identification of the client device may be the geographic location of the client device, or the device identification of the client device, etc., which is not limited in the embodiment of the present application.
  • After the server receives the application identification results of the multiple known data flows within the specified time period, the specified time period, and the identification of the client device sent by the client device, it can determine the current portrait of each application category according to the application categories and the confidence level of each application category in those application identification results.
  • The server may obtain, according to the identifier of the client device, the most recently determined portrait of each application category corresponding to the specified time period.
  • According to the current portrait and the most recently determined portrait of each application category, the server determines the feature drift flag corresponding to that application category, where the feature drift flag indicates whether the traffic characteristics of the data traffic of the corresponding application category have drifted.
  • The server then sends the feature drift flag corresponding to each application category to the client device.
  • For each known data flow, the server may use the application category with the highest confidence among the application categories included in its application identification result as the final application category of that known data flow. The multiple application recognition results are then classified according to the final application category of each known data flow. After that, the server may determine the portrait of each application category according to the number of application recognition results corresponding to that final application category and the confidence of the final application category in those application recognition results.
  • The most recently determined portrait of each application category corresponding to the specified time period may refer to the portrait of each application category of the client device within the specified time period in the previous cycle, as determined last time, or to the average of the portraits of each application category of the client device within the specified time period over the several most recent cycles.
  • The server can compare the two portraits of the same application category. If the difference between the two portraits exceeds a preset threshold, it can be determined that the flow characteristics of the data traffic of that application category have drifted, that is, the applications of that category may have been updated or upgraded. In this case, the feature drift flag corresponding to the application category can be set to the first flag. If the difference does not exceed the preset threshold, it can be determined that the flow characteristics of the data traffic of that application category have changed little and have not drifted, that is, the applications of that category have not been updated or upgraded. In this case, the feature drift flag of the application category can be set to a second flag that is different from the first flag.
  • The server can maintain a feature drift field ordered by application category. The feature drift field can include multiple feature drift flag bits, and each feature drift flag bit can correspond to one application category.
  • The default value of the multiple feature drift flag bits may be the same as the second flag, for example, 0.
  • If the flow characteristics of the data traffic of an application category drift, the server may set the feature drift flag bit corresponding to that application category in the feature drift field to the first flag, that is, to the reference drift value, which can be 1. If the flow characteristics of the data traffic of the application category do not drift, the feature drift flag bit corresponding to the application category can be kept at the default value of 0. In this way, the value of the feature drift flag bit corresponding to each application category in the feature drift field can be determined, and the server can then send the feature drift field to the client device, so that the client device can determine which application categories have data traffic whose flow characteristics have drifted.
  • For example, suppose that each application recognition result includes three application categories, video, game, and voice, together with the confidence level corresponding to each application category.
  • The server may use the application category with the highest confidence in each application recognition result as the final application category of that application recognition result.
  • The server can then classify the multiple application recognition results according to their final application categories, thereby obtaining the video application recognition results, the game application recognition results, and the voice application recognition results.
  • For each application category, the server may determine the portrait of the application category according to the confidence of that application category in its application recognition results and the number of those application recognition results.
  • The server can obtain the portraits of each application category that were determined from the application recognition results in the 19:00-21:00 time period of the previous day. After that, the server can compare the two portraits of the same application category, determine whether the difference between them exceeds the preset threshold, and then determine the feature drift flag corresponding to the application category.
  • the server may send the feature drift flag corresponding to each application category to the client device.
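  • The server-side drift check can be sketched as follows. The text does not define the contents of a portrait or how two portraits are compared, so the sketch assumes a portrait made of two numbers (the share of results falling in the category and the mean confidence of the category) and takes the larger of the two absolute differences as the portrait difference. These choices, the threshold 0.2, and all names are assumptions made purely for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Mapping, Tuple

Portrait = Tuple[float, float]  # assumed: (share of results, mean confidence)

FIRST_FLAG = 1   # drift detected (equal to the reference drift value)
SECOND_FLAG = 0  # no drift (default value)


def final_category(result: Mapping[str, float]) -> str:
    """Application category with the highest confidence in one result."""
    return max(result, key=result.get)


def build_portraits(results: List[Mapping[str, float]]) -> Dict[str, Portrait]:
    """Group results by final category and summarise each group."""
    grouped: Dict[str, List[float]] = defaultdict(list)
    for result in results:
        category = final_category(result)
        grouped[category].append(result[category])
    total = len(results)
    return {
        category: (len(confs) / total, sum(confs) / len(confs))
        for category, confs in grouped.items()
    }


def drift_flags(
    current: Mapping[str, Portrait],
    previous: Mapping[str, Portrait],
    threshold: float,
) -> Dict[str, int]:
    """Compare the two portraits of each category against a preset threshold."""
    flags: Dict[str, int] = {}
    for category, (share, mean_conf) in current.items():
        if category not in previous:
            flags[category] = SECOND_FLAG  # assumption: nothing to compare against
            continue
        prev_share, prev_conf = previous[category]
        difference = max(abs(share - prev_share), abs(mean_conf - prev_conf))
        flags[category] = FIRST_FLAG if difference > threshold else SECOND_FLAG
    return flags


# Results for 19:00-21:00 today, and portraits from the same period the day before.
today = [
    {"video": 0.9, "game": 0.05, "voice": 0.05},
    {"video": 0.9, "game": 0.05, "voice": 0.05},
    {"video": 0.1, "game": 0.6, "voice": 0.3},
    {"video": 0.1, "game": 0.6, "voice": 0.3},
]
yesterday = {"video": (0.5, 0.9), "game": (0.5, 0.9)}

flags = drift_flags(build_portraits(today), yesterday, threshold=0.2)
print(flags)  # {'video': 0, 'game': 1}
```

  • In this example the mean confidence of the game category falls from 0.9 to 0.6, which exceeds the assumed threshold, so only the game category receives the first flag.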
  • After receiving the feature drift flag corresponding to each application category, the client device can use the application categories whose feature drift flag is the first flag as target application categories. After that, the client device can determine the final application category of each known data flow based on the application identification results of the multiple known data flows within the specified time period that it reported to the server, and then take the data flows whose final application category is a target application category as the target data traffic belonging to the target application category.
  • The server can also send a feature drift field containing multiple feature drift flag bits to the client device. After receiving the feature drift field, the client device can determine the application categories whose feature drift flag bits have the value of the first flag, and use the determined application categories as the target application categories.
  • After the client device determines the target data traffic, for each target data flow it can obtain the flow characteristics of the target data flow, use those flow characteristics as training data, and use the application recognition result of the target data flow as the label data of the training data, so that a second training sample is formed from the training data and the label data.
  • For each target data flow, the client device can use this method to obtain a corresponding second training sample. After that, the client device may store the obtained second training samples in the training data storage area of the client device.
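  • The client-side handling of the feature drift field and the generation of second training samples can be sketched as follows. The category order, the list representation of the feature drift field, and the flow records are assumptions made for the example.

```python
from typing import Dict, List, Sequence, Set, Tuple

FIRST_FLAG = 1  # value of a feature drift flag bit that signals drift

# Assumed order in which the flag bits of the feature drift field map to categories.
CATEGORY_ORDER = ["video", "game", "voice"]


def target_categories(drift_field: Sequence[int], order: Sequence[str]) -> Set[str]:
    """Application categories whose feature drift flag bit is the first flag."""
    return {category for category, bit in zip(order, drift_field) if bit == FIRST_FLAG}


def build_second_training_samples(
    flows: List[Tuple[dict, Dict[str, float]]], targets: Set[str]
) -> List[dict]:
    """Use flow characteristics as training data and the recognition result as label."""
    samples = []
    for features, result in flows:
        final_category = max(result, key=result.get)
        if final_category in targets:
            samples.append({"training_data": features, "label_data": result})
    return samples


# Known data flows of the specified time period: (flow characteristics, recognition result).
known_flows = [
    ({"packet_lengths": [1200, 1400]}, {"video": 0.8, "game": 0.1, "voice": 0.1}),
    ({"packet_lengths": [80, 90]}, {"video": 0.1, "game": 0.7, "voice": 0.2}),
]

targets = target_categories([0, 1, 0], CATEGORY_ORDER)
samples = build_second_training_samples(known_flows, targets)
print(targets, len(samples))  # {'game'} 1
```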
  • The training data storage area can simultaneously store the first training samples generated by the first implementation and the second training samples generated by the second implementation.
  • The embodiment of the present application may also use only one of the foregoing implementations to generate training samples.
  • In this case, the training data storage area may store either the first training samples generated by the first implementation or the second training samples generated by the second implementation, which is not limited in the embodiment of the present application.
  • Step 402: The client device trains the application recognition model according to the multiple training samples.
  • The client device can generate first training samples based on the data flows of unknown applications, and second training samples based on the known data flows whose features have drifted.
  • The generated first training samples and second training samples can be stored in the training data storage area.
  • The size of the training data storage area can be fixed, so that when the client device detects that the space occupied by the training samples stored in the training data storage area reaches a certain threshold, it can trigger the training of the application recognition model.
  • This threshold is less than or equal to the size of the training data storage area.
  • the client device may also trigger the training of the application recognition model when the user instruction is detected or the specified time is reached, which is not limited in the embodiment of the present application.
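  • The storage-based training trigger can be sketched as follows. The class name, the byte-based accounting, and the example sizes are hypothetical; the sketch only shows a trigger threshold that does not exceed the fixed size of the training data storage area.

```python
class TrainingDataStore:
    """Fixed-size training data storage area with a training trigger threshold."""

    def __init__(self, capacity_bytes: int, trigger_bytes: int) -> None:
        # The threshold must be less than or equal to the size of the storage area.
        if not 0 < trigger_bytes <= capacity_bytes:
            raise ValueError("trigger threshold must be positive and within capacity")
        self.capacity_bytes = capacity_bytes
        self.trigger_bytes = trigger_bytes
        self.used_bytes = 0

    def add_sample(self, sample_bytes: int) -> bool:
        """Store one training sample; return True when training should be triggered."""
        if self.used_bytes + sample_bytes > self.capacity_bytes:
            raise MemoryError("training data storage area is full")
        self.used_bytes += sample_bytes
        return self.used_bytes >= self.trigger_bytes


store = TrainingDataStore(capacity_bytes=1000, trigger_bytes=800)
triggers = [store.add_sample(300) for _ in range(3)]
print(triggers)  # [False, False, True]
```

  • A user instruction or a scheduled time, as mentioned above, could call the same training routine without consulting the store.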
  • the client device may obtain multiple stored training samples from the training data storage area, and train the current application recognition model through the multiple obtained training samples.
  • The client device may perform one round of training on the application recognition model with the multiple acquired training samples, or it may perform multiple rounds of training with them, which is not limited in the embodiment of the present application.
  • Step 403: The client device sends the model data of the trained application recognition model to the server.
  • After the client device locally trains the application recognition model according to the multiple training samples, it can obtain the model parameters of the trained application recognition model, determine the difference data between the model parameters of the application recognition model after training and the model parameters before training, and then upload the difference data to the server as the model data of the trained application recognition model.
  • For example, if the model parameters of the application recognition model before training are [a1, b1, c1, d1...] and the model parameters after training are [a2, b2, c2, d2...], the client device can determine the difference data [a2-a1, b2-b1, c2-c1, d2-d1...] and report the difference data to the server.
  • the client device may also directly report the model parameters of the trained application recognition model as model data to the server.
  • the client device can directly report [a2, b2, c2, d2...] to the server.
  • the client device may also report the complete data (including model parameters and model structure) of the trained application recognition model to the server.
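  • The difference data described above can be computed as in the following sketch, which treats the model parameters as a flat list of numbers. The function name and the example values are hypothetical.

```python
from typing import List, Sequence


def parameter_difference(before: Sequence[float], after: Sequence[float]) -> List[float]:
    """Element-wise difference [a2-a1, b2-b1, ...] between trained and untrained parameters."""
    if len(before) != len(after):
        raise ValueError("parameter vectors must have the same length")
    return [a2 - a1 for a1, a2 in zip(before, after)]


params_before = [1.0, 2.0, 3.0]  # [a1, b1, c1]
params_after = [1.5, 1.0, 3.0]   # [a2, b2, c2]

print(parameter_difference(params_before, params_after))  # [0.5, -1.0, 0.0]
```

  • Reporting only the difference data keeps the upload independent of the absolute parameter values; the other two options simply send `params_after` or the complete model instead.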
  • Step 404: The server receives the model data of the trained application recognition models uploaded by multiple client devices.
  • Step 405: The server obtains jointly updated model data according to the received multiple pieces of model data.
  • The implementation of this step differs depending on the type of model data reported by the client devices.
  • the server may jointly average the difference data reported by each client device, and use the joint average result as the jointly updated model data.
  • the server may jointly update the model parameters of the application recognition model currently deployed on the server according to the difference data reported by each client device, and use the difference data between the model parameters after the joint update and the model parameters before the update as the joint The updated model data.
  • the server may jointly update the model parameters of the application identification model currently deployed on the server according to the difference data reported by each client device, and directly use the jointly updated model parameters as the jointly updated model data.
  • the server may jointly update the model parameters of the application recognition model currently deployed on the server according to the difference data reported by each client device, and use the complete data (including model structure and model parameters) of the application recognition model after the joint update as the joint The updated model data.
  • The server may jointly average the model parameters reported by each client device, and directly use the jointly averaged model parameters as the jointly updated model data.
  • The server may jointly update the application recognition model deployed on itself according to the model parameters reported by each client device, and then use the difference data between the model parameters of the jointly updated application recognition model and the model parameters of the application recognition model before the update, or the model parameters of the jointly updated application recognition model, or the complete data (including the model structure and model parameters) of the jointly updated application recognition model, as the jointly updated model data.
  • The server can determine the jointly updated application recognition model based on the model parameters of the application recognition models reported by each client device.
  • The server may use the complete data of the jointly updated application identification model (including model structure and model parameters) or the model parameters of the jointly updated application identification model as the jointly updated model data.
  • The server can also update the model parameters of the application recognition model deployed on itself based on the joint average result, to obtain the jointly updated application recognition model.
  • Because the jointly updated application identification model is in effect updated based on the traffic data of every client device, its model parameters can reflect the traffic distribution of more client devices.
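  • The joint averaging in step 405 can be sketched as follows for the case in which the client devices report difference data. The sketch assumes an unweighted element-wise mean, since the text does not state whether client contributions are weighted, and the function names and values are hypothetical.

```python
from typing import List, Sequence


def joint_average(vectors: Sequence[Sequence[float]]) -> List[float]:
    """Unweighted element-wise mean of the vectors reported by the client devices."""
    if not vectors:
        raise ValueError("at least one client report is required")
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def apply_difference(params: Sequence[float], difference: Sequence[float]) -> List[float]:
    """Update the server-side model parameters with the averaged difference data."""
    return [p + d for p, d in zip(params, difference)]


# Difference data reported by two client devices.
client_differences = [[0.5, -1.0], [1.5, 1.0]]
server_params = [1.0, 2.0]

averaged = joint_average(client_differences)
updated = apply_difference(server_params, averaged)
print(averaged, updated)  # [1.0, 0.0] [2.0, 2.0]
```

  • Depending on the variant, the server would then send `averaged`, `updated`, or the complete jointly updated model as the jointly updated model data.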
  • Step 406: The server sends the jointly updated model data to the client device.
  • the server may deliver the jointly updated model data to each client device that reports the trained model data.
  • Step 407: The client device receives the jointly updated model data sent by the server.
  • Step 408: The client device obtains the jointly updated application recognition model according to the jointly updated model data.
  • After receiving the jointly updated model data issued by the server, the client device can obtain the jointly updated application identification model according to the jointly updated model data.
  • If the jointly updated model data is difference data, the client device may update the pre-training application recognition model according to the difference data to obtain the jointly updated application recognition model.
  • If the jointly updated model data is model parameters, the client device may update the application recognition model before or after training according to the model parameters to obtain the jointly updated application recognition model.
  • If the jointly updated model data is the complete data of the model, the client device can directly load the model data to obtain the jointly updated application identification model.
  • The client device can detect whether the jointly updated application recognition model meets the convergence condition. If the convergence condition is satisfied, the jointly updated application recognition model can be used as the final updated model.
  • If the convergence condition is not satisfied, the client device may return to step 402, continue to train the jointly updated application recognition model with multiple training samples, and upload the model data obtained after the continued training to the server. The server then performs the joint update again, until the jointly updated application identification model that the client device obtains from the jointly updated model data issued by the server satisfies the convergence condition.
  • the operations performed by the client device can be implemented as a separate embodiment, and the operations performed by the server can also be implemented as a separate embodiment, which is not limited in the embodiment of this application. .
  • the client device can determine multiple training samples according to the recognition results of multiple pieces of data traffic and then use the training samples to train the application recognition model. After that, the client device can upload the model data of the trained application recognition model to the server, and the server performs a joint update based on the model data uploaded by multiple client devices. The client device may then obtain the jointly updated application recognition model according to the jointly updated model data delivered by the server. The jointly updated application recognition model is thus equivalent to one dynamically updated from the real-time data traffic corresponding to multiple client devices. In this way, even if applications are updated or new applications emerge, the recognition accuracy of the application recognition model can be better guaranteed, because the model has been dynamically updated from the data traffic corresponding to each client device.
  • because the client device can obtain the unknown data traffic belonging to unknown applications and generate training samples from it to train the application recognition model, the updated application recognition model can recognize emerging applications better, which strengthens the model's ability to recognize emerging applications.
  • for the data traffic of known applications, the client device can identify updated or upgraded applications by detecting whether the traffic characteristics of the data traffic have drifted, and then generate training samples from the data traffic of those applications to retrain the application recognition model, so that the finally updated application recognition model can recognize the updated or upgraded applications better, which improves the accuracy of application recognition.
  • referring to FIG. 5, an embodiment of the present application provides an apparatus 500 for updating an application recognition model, and the apparatus includes:
  • the determining module 501, configured to execute step 401 in the foregoing embodiment;
  • the training module 502, configured to execute step 402 in the foregoing embodiment;
  • the sending module 503, configured to execute step 403 in the foregoing embodiment;
  • the receiving module 504, configured to execute step 407 in the foregoing embodiment;
  • the update module 505, configured to execute step 408 in the foregoing embodiment.
  • the determining module 501 includes an unknown application detection sub-module 5011 and/or a feature drift detection sub-module 5012, and further includes a generation sub-module 5013;
  • the unknown application detection sub-module 5011 is used to obtain unknown data traffic belonging to an unknown application from the multiple pieces of data traffic according to the application recognition result of each piece of data traffic;
  • the feature drift detection sub-module 5012 is used to obtain the target data traffic belonging to the target application category from the multiple pieces of data traffic according to the application recognition result of each piece of data traffic, where the target application category is an application category whose corresponding data traffic exhibits drift in its traffic characteristics within the specified time period;
  • the generation sub-module 5013 is used to generate multiple training samples according to the acquired data traffic.
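  • the unknown application detection sub-module 5011 is specifically configured to:
  • obtain, from the multiple pieces of data traffic, the data traffic whose application recognition result meets the unknown application condition, and take it as unknown data traffic belonging to an unknown application. The unknown application condition means that the confidence of every application category in the application recognition result is less than the reference threshold; alternatively, it means that the application recognition result does not belong to any of multiple specified clusters, where the multiple specified clusters are obtained by clustering the traffic characteristics of the data traffic of each application category in the original training sample set of the application recognition model.
  • the generation sub-module 5013 is specifically configured to:
  • obtain the traffic characteristics of the unknown data traffic; obtain, from the server and according to those traffic characteristics, the application information of the application to which the unknown data traffic belongs; and use the traffic characteristics as the training data of a first training sample and the application information as the label data of the first training sample, where the first training sample is one of the multiple training samples.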
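  • the feature drift detection sub-module 5012 is specifically configured to:
  • determine, from the multiple pieces of data traffic and according to the application recognition result of each piece, multiple pieces of known data traffic that do not belong to unknown applications within the specified time period;
  • obtain from the server, according to the application recognition results of the multiple pieces of known data traffic, the specified time period, and the identifier of the client device, the feature drift flags corresponding to the application categories included in those application recognition results, where a feature drift flag indicates whether the traffic characteristics of the data traffic of the corresponding application category have drifted;
  • determine, according to the feature drift flag of each application category, the target application category whose traffic characteristics have drifted, and obtain from the multiple pieces of known data traffic the target data traffic belonging to the target application category;
  • the generation sub-module 5013 is specifically configured to:
  • use the traffic characteristics of the target data traffic as the training data of a second training sample, and use the application category indicated by the application recognition result of the target data traffic as the label data of the second training sample, where the second training sample is one of the multiple training samples.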
  • the model data of the trained application recognition model includes the model parameters of the trained application recognition model, or includes the difference data between the model parameters of the trained application recognition model and the model parameters of the application recognition model before training.
  • the jointly updated model data includes the model parameters of the jointly updated application recognition model, or includes the difference data between the model parameters of the jointly updated application recognition model and the model parameters of the application recognition model before training.
  • the client device can determine multiple training samples according to the recognition results of multiple pieces of data traffic and then use the training samples to train the application recognition model. After that, the client device can upload the model data of the trained application recognition model to the server, and the server performs a joint update based on the model data uploaded by multiple client devices. The client device may then obtain the jointly updated application recognition model according to the jointly updated model data delivered by the server. The jointly updated application recognition model is thus equivalent to one dynamically updated from the characteristics of the real-time data traffic collected by multiple client devices. In this way, even if applications are updated or new applications emerge, the recognition accuracy of the application recognition model can be better guaranteed, because the model has been dynamically updated from the characteristics of the data traffic collected by each client device.
  • FIG. 7 is an apparatus 700 for updating an application recognition model provided by an embodiment of the present application, and the apparatus 700 includes:
  • the receiving module 701 is configured to execute step 404 in the foregoing embodiment
  • the first acquisition module 702 is configured to execute step 405 in the foregoing embodiment
  • the sending module 703 is configured to execute step 406 in the foregoing embodiment.
  • the apparatus further includes a second acquisition module (not shown in the figure);
  • the receiving module is further configured to receive the traffic characteristics of unknown data traffic sent by the first client device, where the unknown data traffic is data traffic that the first client device has determined, from multiple pieces of data traffic, to belong to an unknown application;
  • the second acquisition module is used to obtain the application information of the application to which the unknown data traffic belongs according to the traffic characteristics of the unknown data traffic;
  • the sending module is further configured to send the application information of the application to which the unknown data traffic belongs to the first client device, so that the first client device generates training samples according to that application information.
  • the apparatus further includes a third acquisition module and a determining module (not shown in the figure);
  • the receiving module is also used to receive the application recognition results of multiple pieces of known data traffic sent by the first client device, the specified time period, and the identifier of the first client device, where the multiple pieces of known data traffic are the data traffic, determined from the multiple pieces of data traffic, that does not belong to unknown applications within the specified time period;
  • the determining module is used to determine the current profile of each application category according to each application category included in the application recognition results of the multiple pieces of known data traffic and the confidence corresponding to each application category;
  • the third acquisition module is configured to obtain, according to the identifier of the first client device, the most recently determined profile of each application category for the specified time period;
  • the determining module is also used to determine the feature drift flag of each application category according to the current profile of that category and its most recently determined profile, where the feature drift flag indicates whether the traffic characteristics of the data traffic of the corresponding application category have drifted;
  • the sending module is also used to send the feature drift flag of each application category to the first client device, so that the first client device obtains the data traffic belonging to the target application category according to the feature drift flags and generates training samples from that data traffic, where the target application category is an application category whose traffic characteristics have drifted.
  • the server may obtain the jointly updated model data according to the trained model data received from multiple client devices. Because the trained model data sent by each client device is dynamically updated from the characteristics of the real-time data traffic it collects, the jointly updated model data is equivalent to data dynamically updated from the characteristics of the real-time data traffic collected by every client device. In this way, even if applications are updated or new applications emerge, the recognition accuracy of the application recognition model can be better guaranteed, because the model data of the application recognition model has been dynamically updated from the characteristics of the data traffic collected by each client device.
  • when the apparatus for updating an application recognition model provided in the above embodiment updates the application recognition model, the division into the above functional modules is used only as an example for illustration. The above functions can be assigned to different functional modules as required; that is, the internal structure of the device is divided into different functional modules to complete all or part of the functions described above.
  • the apparatus for updating an application recognition model provided by the above embodiment belongs to the same concept as the method embodiment for updating an application recognition model. For the specific implementation process, refer to the method embodiment; details are not repeated here.
  • the computer program product includes one or more computer instructions.
  • the computer may be a general-purpose computer, a special-purpose computer, a computer network, or other programmable devices.
  • the computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another computer-readable storage medium.
  • the computer instructions may be transmitted from a website, computer, server, or data center.
  • the computer-readable storage medium may be any available medium that can be accessed by a computer or a data storage device such as a server or a data center integrated with one or more available media.
  • the usable medium may be a magnetic medium (for example, a floppy disk, hard disk, or tape), an optical medium (for example, a Digital Versatile Disc (DVD)), or a semiconductor medium (for example, a Solid State Disk (SSD)), and so on.
  • the program can be stored in a computer-readable storage medium.
  • the storage medium mentioned can be a read-only memory, a magnetic disk or an optical disk, etc.

Abstract

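A method, an apparatus, and a storage medium for updating an application recognition model. A client device can determine multiple training samples according to the recognition results of multiple pieces of data traffic and use the training samples to train the application recognition model. The client device can then upload the model data of the trained application recognition model to a server, and the server performs a joint update based on the model data uploaded by multiple client devices. The client device can then obtain the jointly updated application recognition model according to the jointly updated model data delivered by the server. The jointly updated application recognition model is thus equivalent to one updated from the characteristics of the real-time data traffic collected by multiple client devices, so the recognition accuracy of the application recognition model can be better guaranteed even if applications are updated or new applications emerge.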

Description

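Method, apparatus, and storage medium for updating an application recognition model
This application claims priority to Chinese Patent Application No. 202010132251.0, filed on February 29, 2020 and entitled "Method, apparatus, and storage medium for updating an application recognition model", which is incorporated herein by reference in its entirety.
Technical Field
This application relates to the field of artificial intelligence (AI) technologies, and in particular to a method, an apparatus, and a storage medium for updating an application recognition model.
Background
Currently, each running application exchanges data with its application server, which produces multiple pieces of data traffic. A client device can use an application recognition model to recognize data traffic, determine the application category to which the data traffic belongs, and then schedule the data traffic appropriately according to that category. At present, the application recognition model used for application recognition is trained offline on training samples and then deployed on the client device. However, because applications are updated frequently and new applications keep emerging, the recognition results produced by an offline-trained application recognition model are not very accurate.
Summary
This application provides a method, an apparatus, and a storage medium for updating an application recognition model, which can improve the model's ability to recognize emerging applications and its accuracy in recognizing upgraded applications. The technical solutions are as follows: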
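According to a first aspect, a method for updating an application recognition model is provided. The method includes: determining multiple training samples according to the application recognition result of each of multiple pieces of data traffic, where the application recognition result is obtained by recognizing the corresponding data traffic with an application recognition model; training the application recognition model according to the multiple training samples; sending the model data of the trained application recognition model to a server, so that the server obtains jointly updated model data according to the model data received from multiple client devices; receiving the jointly updated model data sent by the server; and obtaining the jointly updated application recognition model according to the jointly updated model data.
In the embodiments of this application, a client device can determine multiple training samples according to the recognition results of multiple pieces of data traffic and use the training samples to train the application recognition model. The client device can then upload the model data of the trained application recognition model to the server, and the server performs a joint update based on the model data uploaded by multiple client devices. The client device can then obtain the jointly updated application recognition model according to the jointly updated model data delivered by the server. The model data of the jointly updated application recognition model is thus equivalent to data dynamically updated by multiple client devices from real-time data traffic. Therefore, even if applications are updated or new applications emerge, the recognition accuracy of the application recognition model can be better guaranteed, because its model data has been dynamically updated from the data traffic.
Optionally, the multiple training samples can be determined from the application recognition result of each piece of data traffic in three ways. In the first implementation, unknown data traffic belonging to unknown applications is obtained from the multiple pieces of data traffic according to the application recognition result of each piece, and the multiple training samples are generated from the unknown data traffic. In the second implementation, target data traffic belonging to a target application category is obtained from the multiple pieces of data traffic according to the application recognition result of each piece, and the multiple training samples are generated from the target data traffic, where the target application category is an application category whose corresponding data traffic exhibits drift in its traffic characteristics within a specified time period. The third implementation combines the first two: unknown data traffic belonging to unknown applications and target data traffic belonging to the target application category are both obtained from the multiple pieces of data traffic, and the multiple training samples are generated from the unknown data traffic and the target data traffic.
In the embodiments of this application, the client device can obtain the unknown data traffic belonging to unknown applications among the multiple pieces of data traffic and generate training samples from it to train the application recognition model. The updated application recognition model can therefore recognize emerging applications better, which strengthens the model's ability to recognize emerging applications.
In addition, for the data traffic of known applications, the client device can identify updated or upgraded applications by detecting whether the traffic characteristics of the data traffic have drifted, and then generate training samples from the data traffic of those applications to retrain the application recognition model. The finally updated application recognition model can therefore recognize the updated or upgraded applications better, which improves the accuracy of application recognition.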
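Optionally, in the first implementation of generating training samples, the unknown data traffic belonging to unknown applications can be obtained from the multiple pieces of data traffic as follows: data traffic whose application recognition result meets an unknown application condition is obtained from the multiple pieces of data traffic and taken as unknown data traffic belonging to an unknown application. The unknown application condition means that the confidence of every application category in the application recognition result is less than a reference threshold; alternatively, it means that the application recognition result does not belong to any of multiple specified clusters, where the multiple specified clusters are obtained by clustering the traffic characteristics of the data traffic of each application category in the original training sample set of the application recognition model.
In the embodiments of this application, whether a piece of data traffic belongs to an unknown application can be judged by checking whether its recognition result meets the unknown application condition. The unknown data traffic is determined in this way, and training samples are then generated from it to train the application recognition model.
Optionally, in the first implementation of generating training samples, a training sample can be generated from the unknown data traffic as follows: obtaining the traffic characteristics of the unknown data traffic; obtaining, from the server and according to those traffic characteristics, the application information of the application to which the unknown data traffic belongs; and using the traffic characteristics of the unknown data traffic as the training data of a first training sample and the application information of the application to which the unknown data traffic belongs as the label data of the first training sample, where the first training sample is one of the multiple training samples.
Optionally, in the second implementation of generating training samples, the target data traffic belonging to the target application category can be obtained from the multiple pieces of data traffic as follows: determining, from the multiple pieces of data traffic and according to the application recognition result of each piece, multiple pieces of known data traffic that do not belong to unknown applications within the specified time period; obtaining from the server, according to the application recognition results of the multiple pieces of known data traffic, the specified time period, and the identifier of the client device, the feature drift flag corresponding to each of the application categories included in those application recognition results, where a feature drift flag indicates whether the traffic characteristics of the data traffic of the corresponding application category have drifted; determining, from the application categories and according to the feature drift flag of each category, the target application category whose traffic characteristics have drifted; and obtaining, from the multiple pieces of known data traffic, the target data traffic belonging to the target application category.
Optionally, in the second implementation of generating training samples, a training sample can be generated from the target data traffic as follows: using the traffic characteristics of the target data traffic as the training data of a second training sample, and using the application category to which the target data traffic belongs, as indicated by its application recognition result, as the label data of the second training sample, where the second training sample is one of the multiple training samples.
Optionally, the model data of the trained application recognition model includes the model parameters of the trained application recognition model, or includes the difference data between the model parameters of the trained application recognition model and the model parameters of the application recognition model before training.
Optionally, the jointly updated model data includes the model parameters of the jointly updated application recognition model, or includes the difference data between the model parameters of the jointly updated application recognition model and the model parameters of the application recognition model before training.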
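According to a second aspect, a method for updating an application recognition model is provided. The method includes: receiving the model data of trained application recognition models sent by multiple client devices, where each trained application recognition model is obtained by the corresponding client device by training the application recognition model with multiple training samples, and the multiple training samples are determined by the corresponding client device according to the application recognition result of each of multiple pieces of data traffic; obtaining jointly updated model data according to the received model data; and sending the jointly updated model data to the multiple client devices, so that the multiple client devices obtain the jointly updated application recognition model according to the jointly updated model data.
In the embodiments of this application, the server can obtain jointly updated model data according to the trained model data received from multiple client devices. Because the trained model data sent by each client device is dynamically updated from real-time data traffic, the jointly updated model data is equivalent to data dynamically updated by all client devices from real-time data traffic. Therefore, even if applications are updated or new applications emerge, the recognition accuracy of the application recognition model can be better guaranteed, because its model data has been dynamically updated from the data traffic corresponding to each client device.
Optionally, before receiving the model data of the trained application recognition models sent by the multiple client devices, the method may further include: receiving the traffic characteristics of unknown data traffic sent by a first client device, where the unknown data traffic is data traffic that the first client device has determined, from multiple pieces of data traffic, to belong to an unknown application; obtaining, according to the traffic characteristics of the unknown data traffic, the application information of the application to which the unknown data traffic belongs; and sending that application information to the first client device, so that the first client device generates a training sample according to it.
In the embodiments of this application, the server can obtain the corresponding application information according to the traffic characteristics of the unknown data traffic sent by the client device and feed that application information back to the client device. The client device can then generate training samples from the unknown data traffic to train the application recognition model, so the updated application recognition model can recognize emerging applications better, which strengthens the model's ability to recognize emerging applications.
Optionally, before receiving the model data of the trained application recognition models sent by the multiple client devices, the method may further include: receiving, from a first client device, the application recognition results of multiple pieces of known data traffic, a specified time period, and the identifier of the first client device, where the multiple pieces of known data traffic are the data traffic, determined from the multiple pieces of data traffic, that does not belong to unknown applications within the specified time period; determining the current profile of each application category according to each application category included in the application recognition results of the multiple pieces of known data traffic and the confidence corresponding to each application category; obtaining, according to the identifier of the first client device, the most recently determined profile of each application category for the specified time period; determining the feature drift flag of each application category according to the current profile of that category and its most recently determined profile, where the feature drift flag indicates whether the traffic characteristics of the data traffic of the corresponding application category have drifted; and sending the feature drift flag of each application category to the first client device, so that the first client device obtains the data traffic belonging to the target application category according to the feature drift flags and generates training samples from that data traffic, where the target application category is an application category whose traffic characteristics have drifted.
In the embodiments of this application, for the data traffic of known applications, the server can detect whether the traffic characteristics of the data traffic of an application category have drifted by determining the profile of each application category, and then feed back to the client device a feature drift flag indicating whether drift has occurred. The client device can identify updated or upgraded applications according to the feature drift flag and generate training samples from their data traffic to retrain the application recognition model. The finally updated application recognition model can therefore recognize the updated or upgraded applications better, which improves the accuracy of application recognition.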
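According to a third aspect, an apparatus for updating an application recognition model is provided. The apparatus has the function of implementing the behavior of the method for updating an application recognition model in the first aspect or the second aspect. The apparatus includes at least one module, and the at least one module is configured to implement the method for updating an application recognition model provided in the first aspect or the second aspect.
According to a fourth aspect, an apparatus for updating an application recognition model is provided. The structure of the apparatus includes a processor and a memory. The memory is configured to store a program that supports the apparatus in executing the method for updating an application recognition model provided in the first aspect or the second aspect, and to store the data involved in implementing that method. The processor is configured to execute the program stored in the memory. The apparatus may further include a communication bus, which is used to establish a communication connection between the processor and the memory.
According to a fifth aspect, a computer-readable storage medium is provided. The computer-readable storage medium stores instructions that, when run on a computer, cause the computer to execute the method for updating an application recognition model described in the first aspect or the second aspect.
According to a sixth aspect, a computer program product containing instructions is provided. When run on a computer, the computer program product causes the computer to execute the method for updating an application recognition model described in the first aspect or the second aspect.
The technical effects obtained by the third, fourth, fifth, and sixth aspects are similar to those obtained by the corresponding technical means in the first aspect or the second aspect, and are not repeated here.
The beneficial effects of the technical solutions provided in this application include at least the following:
In the embodiments of this application, a client device can determine multiple training samples according to the recognition results of multiple pieces of data traffic and use the training samples to train the application recognition model. The client device can then upload the model data of the trained application recognition model to the server, and the server performs a joint update based on the model data uploaded by multiple client devices. The client device can then obtain the jointly updated application recognition model according to the jointly updated model data delivered by the server. The jointly updated application recognition model is thus equivalent to one dynamically updated from the characteristics of the data traffic collected by multiple client devices. Therefore, even if applications are updated or new applications emerge, the recognition accuracy of the application recognition model can be better guaranteed, because the model has been dynamically updated from the characteristics of the data traffic collected by each client device.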
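Brief Description of the Drawings
FIG. 1 is a diagram of a system architecture involved in the method for updating an application recognition model according to an embodiment of this application;
FIG. 2 is a diagram of another system architecture involved in the method for updating an application recognition model according to an embodiment of this application;
FIG. 3 is a schematic structural diagram of a computer device for updating an application recognition model according to an embodiment of this application;
FIG. 4 is a flowchart of a method for updating an application recognition model according to an embodiment of this application;
FIG. 5 is a schematic structural diagram of an apparatus for updating an application recognition model according to an embodiment of this application;
FIG. 6 is a schematic structural diagram of the determining module used to generate multiple training samples according to an embodiment of this application;
FIG. 7 is a schematic structural diagram of another apparatus for updating an application recognition model according to an embodiment of this application.
Detailed Description
To make the objectives, technical solutions, and advantages of this application clearer, the implementations of this application are described in further detail below with reference to the accompanying drawings.
Before the embodiments of this application are explained in detail, the application scenarios involved are introduced first.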
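Application recognition means recognizing the data traffic of a client device to determine the application category to which the data traffic belongs. Once the application category is known, the client device can use it for other purposes, for example to schedule data traffic more sensibly: the sending and receiving priority of a piece of data traffic can be configured according to its application category.
Currently, an AI model trained offline can be used to recognize data traffic. The recognition accuracy of the AI model depends on whether the training sample set used for offline training effectively describes the traffic distributions that can occur in practice. However, as application versions iterate and new applications keep emerging, the training sample set from which the AI model was originally trained gradually ceases to represent the current traffic distribution, and the recognition accuracy of the AI model drops. The method for updating an application recognition model provided in the embodiments of this application can be used in this scenario: the client device retrains the AI model according to the traffic characteristics of the data traffic it collects and uploads the model data of the retrained AI model to the server, and the server performs a joint update according to the model data uploaded by multiple client devices, thereby updating the AI model.
Next, the system architecture involved in the method for updating an application recognition model provided in the embodiments of this application is introduced.
FIG. 1 is a diagram of the system architecture involved in the method for updating an application recognition model according to an embodiment of this application. As shown in FIG. 1, the system includes multiple client devices 101 and a server 102. Each of the client devices 101 can communicate with the server 102.
An application recognition model is deployed on each client device 101. Multiple applications may be installed on the client device 101, or on a terminal corresponding to the client device 101. A running application can exchange data with its application server, which produces data traffic that passes through the client device. Each client device can recognize data traffic with its own application recognition model to obtain an application recognition result. While recognizing data traffic, each client device 101 can use the method provided in the embodiments of this application to collect training samples according to the application recognition results of the data traffic and then retrain the application recognition model with the collected training samples. Each client device 101 can then upload the model data of the retrained application recognition model to the server 102.
The server 102 can receive the model data uploaded by each client device and obtain jointly updated model data according to it. The server can then deliver the jointly updated model data to each client device 101.
After receiving the jointly updated model data delivered by the server 102, each client device 101 can obtain the jointly updated application recognition model according to that data. If the jointly updated application recognition model meets the convergence condition, the client device 101 can subsequently use it to recognize data traffic. If it does not meet the convergence condition, the client device 101 can continue to train the application recognition model and upload the trained model data to the server, and the server can continue to perform joint updates, until the jointly updated application recognition model meets the convergence condition. In other words, in the embodiments of this application, the client devices and the server may update the application recognition model in a single round or in multiple rounds, which is not limited in the embodiments of this application.
It should be noted that the client device 101 may be a device that supports local training. For example, the client device 101 may be a mobile phone, a tablet computer, a desktop computer, a laptop computer, a switch, an optical line terminal (OLT), an optical network terminal (ONT), a router, or the like. This is not limited in the embodiments of this application.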
The server 102 may be a platform that supports joint learning. This is not limited in the embodiments of this application.
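Optionally, FIG. 2 is a schematic diagram of another system architecture according to an embodiment of this application. Referring to FIG. 2, the system may include multiple client devices 201, multiple network devices 202, and a server 203. Each client device 201 may correspond to one or more network devices 202, and the one or more network devices 202 corresponding to a client device 201 are the network devices within one site.
In this implementation, the client device 201 may also be called a site analysis device (or site analysis platform). It may be a single server, a server cluster composed of several servers, or a cloud computing service center. In this application scenario, the system involved in the model update method includes multiple site networks. A site network may be a core network or an edge network, and the user of each site network may be an operator or an enterprise customer. Different site networks may be networks divided along some dimension, for example networks in different regions, networks of different operators, different service networks, or different network domains. Each site network includes one or more network devices. The multiple client devices 201 may correspond one-to-one to the multiple site networks, and each client device 201 provides data analysis services for its site network; that is, each client device 201 may correspond to one or more network devices 202 within one site network and provide them with services such as data analysis. Each client device 201 may be located inside or outside its site network. Each client device 201 is connected to the server 203 through a wired or wireless network. The communication network involved in the embodiments of this application is a second-generation (2nd Generation, 2G) communication network, a third-generation (3rd Generation, 3G) communication network, a Long Term Evolution (LTE) communication network, a fifth-generation (5th Generation, 5G) communication network, or the like.
In addition, the server 203 may be a cloud analysis device (or cloud analysis platform). It may be a computer, a server, a server cluster composed of several servers, or a cloud computing service center, and it is deployed at the back end of the service network.
The network device 202 may be a mobile phone, a tablet computer, a desktop computer, a laptop computer, a switch, an optical line terminal (OLT), an optical network terminal (ONT), a router, or the like. This is not limited in the embodiments of this application.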
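FIG. 3 is a schematic structural diagram of a computer device according to an embodiment of this application. The client devices and the servers in FIG. 1 or FIG. 2 can all be implemented by the computer device shown in FIG. 3. Referring to FIG. 3, the computer device includes at least one processor 301, a communication bus 302, a memory 303, and at least one communication interface 304.
The processor 301 may be a general-purpose central processing unit (CPU), an application-specific integrated circuit (ASIC), a graphics processing unit (GPU), or any combination thereof. The processor 301 may include one or more chips and may include an AI accelerator, for example a neural processing unit (NPU).
The communication bus 302 may include a path for transferring information between the above components.
The memory 303 may be a read-only memory (ROM) or another type of static storage device that can store static information and instructions; a random access memory (RAM) or another type of dynamic storage device that can store information and instructions; an electrically erasable programmable read-only memory (EEPROM); a compact disc read-only memory (CD-ROM) or other optical disc storage (including compact discs, laser discs, optical discs, digital versatile discs, Blu-ray discs, and the like); a magnetic disk storage medium or another magnetic storage device; or any other medium that can carry or store desired program code in the form of instructions or data structures and can be accessed by a computer, but is not limited to these. The memory 303 may exist independently and be connected to the processor 301 through the communication bus 302, or it may be integrated with the processor 301. The memory 303 can store computer instructions; when the computer instructions stored in the memory 303 are executed by the processor 301, the method for updating an application recognition model of this application can be implemented. In addition, the memory 303 may store intermediate data and/or result data generated by the processor while executing the method.
The communication interface 304 uses any transceiver-type apparatus to communicate with other devices or communication networks, such as Ethernet, a radio access network (RAN), or a wireless local area network (WLAN).
In a specific implementation, as an embodiment, the processor 301 may include one or more CPUs.
In a specific implementation, as an embodiment, the computer device may include multiple processors. Each of these processors may be a single-core (single-CPU) processor or a multi-core (multi-CPU) processor. A processor here may refer to one or more devices, circuits, and/or processing cores for processing data (for example, computer program instructions).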
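Next, the method for updating an application recognition model provided in the embodiments of this application is explained in detail.
FIG. 4 is a flowchart of a method for updating an application recognition model according to an embodiment of this application. The method can be applied to the system shown in FIG. 1 or FIG. 2. Referring to FIG. 4, the method includes:
Step 401: The client device determines multiple training samples according to the application recognition result of each of multiple pieces of data traffic, where the application recognition result is obtained by recognizing the corresponding data traffic with the application recognition model.
In the embodiments of this application, an application recognition model is deployed on the client device. The application recognition model may be an AI model trained offline or an AI model that has been updated multiple times. The client device may be the client device in the system shown in FIG. 1 or FIG. 2.
In one possible implementation, when the client device is the client device in FIG. 1, the client device, or the terminal corresponding to the client device, can exchange data with an application's server while running that application, which produces data traffic that passes through the client device. For example, the client device may send a data request to the application server, and the application server may return application data to the client device. Both the data sent by the client device to the application server and the data returned by the application server can be called data traffic passing through the client device. The multiple pieces of data traffic in this step can refer to such data traffic.
Optionally, in another possible implementation, for the client device in FIG. 2, the client device may correspond to one or more network devices within one site network. In this case, when data traffic passes through the one or more network devices, they can report the traffic characteristics of that data traffic to the client device. The multiple pieces of data traffic in this step then refer to the data traffic passing through the one or more network devices within the site network corresponding to the client device.
That is, in the embodiments of this application, the multiple pieces of data traffic may be data traffic that passes through the client device, or data traffic that does not pass through the client device but passes through one or more network devices within the site network served by the client device.
For each of the multiple pieces of data traffic, the client device can obtain its traffic characteristics and input them into the application recognition model, which produces the application recognition result of that data traffic.
The application recognition result may include the application categories to which the data traffic may belong and the confidence corresponding to each category. It should be noted that an application category may be a specific application name, for example application A or application B, or a type of application, for example video, game, or voice applications. That is, in the embodiments of this application, the application recognition result may indicate which application the data traffic belongs to or which type of application it belongs to, which is not limited in the embodiments of this application. The client device can obtain an application recognition result for every piece of data traffic through the application recognition model.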
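It should be noted that, in the embodiments of this application, the multiple pieces of data traffic may include data traffic of unknown applications and data traffic of known applications. From the data traffic of unknown applications, the client device can generate first training samples; from the data traffic of known applications whose traffic characteristics have changed substantially, the client device can generate second training samples.
Next, the implementations corresponding to these two cases are introduced in turn.
(1) First implementation
After obtaining the application recognition result of each piece of data traffic through the application recognition model, the client device can judge from that result whether the data traffic is unknown data traffic belonging to an unknown application, and thereby obtain one or more pieces of unknown data traffic from the multiple pieces of data traffic.
For example, take any one of the multiple pieces of data traffic and, for ease of description, call it the first data traffic. The client device can check, according to the application recognition result of the first data traffic, whether the first data traffic meets the unknown application condition. If it does, the first data traffic is unknown data traffic belonging to an unknown application.
For example, the unknown application condition may be that the confidence of every application category in the application recognition result is less than a specified threshold. That is, if the confidence of every application category included in the application recognition result of the first data traffic is less than the specified threshold, the first data traffic meets the unknown application condition and is unknown data traffic belonging to an unknown application. Otherwise, the first data traffic does not meet the unknown application condition and is known data traffic that does not belong to an unknown application.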
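The confidence-based condition can be sketched in Python as follows. This is a minimal illustration rather than the applicant's implementation: the function name, the dictionary layout of the recognition result, and the threshold value 0.5 are assumptions made for the example.

```python
from typing import Dict


def is_unknown_by_confidence(result: Dict[str, float], threshold: float) -> bool:
    """Return True when every category confidence is below the threshold.

    `result` maps an application category to its confidence (assumed layout).
    """
    return all(confidence < threshold for confidence in result.values())


# Hypothetical recognition results for two pieces of data traffic.
low_confidence = {"video": 0.30, "game": 0.25, "voice": 0.20}
high_confidence = {"video": 0.92, "game": 0.05, "voice": 0.03}

print(is_unknown_by_confidence(low_confidence, threshold=0.5))   # True
print(is_unknown_by_confidence(high_confidence, threshold=0.5))  # False
```

Note that an empty recognition result is also treated as unknown in this sketch, because `all()` over an empty sequence is true; whether that is desirable is not addressed by the source.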
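Optionally, the unknown application condition may also mean that the application recognition result does not belong to any of multiple specified clusters, where the multiple specified clusters are obtained by clustering the traffic characteristics of the data traffic of each application category in the original training sample set of the application recognition model. That is, in the embodiments of this application, the client device can cluster the training samples of each application category in the original training sample set used for offline training to obtain multiple clusters, which are the multiple specified clusters mentioned above. After the application recognition model has recognized the traffic characteristics of the first data traffic, the client device can compare the recognition result with the cluster center and confidence radius of the cluster corresponding to each application category. If the recognition result does not fall within the confidence radius of any cluster center, that is, it does not belong to any of the multiple clusters, the first data traffic meets the unknown application condition and is unknown data traffic belonging to an unknown application. Otherwise, the first data traffic does not meet the unknown application condition and is known data traffic that does not belong to an unknown application.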
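The cluster-based condition can be sketched as a distance test against each cluster center. The source does not specify the distance metric or how the centers and confidence radii are computed, so the Euclidean distance, the tuple layout, and the sample values below are assumptions for illustration only.

```python
import math
from typing import Sequence, Tuple

# (cluster center, confidence radius); both are assumed to be precomputed
# from the original training sample set.
Cluster = Tuple[Sequence[float], float]


def is_unknown_by_clusters(point: Sequence[float], clusters: Sequence[Cluster]) -> bool:
    """Return True when `point` lies outside the confidence radius of every cluster."""
    return not any(math.dist(point, center) <= radius for center, radius in clusters)


# Hypothetical two-dimensional clusters for two application categories.
clusters = [((0.0, 0.0), 1.0), ((10.0, 10.0), 2.0)]

print(is_unknown_by_clusters((0.5, 0.5), clusters))  # False
print(is_unknown_by_clusters((5.0, 5.0), clusters))  # True
```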
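After one or more pieces of unknown data traffic have been determined in this way, the client device can generate a corresponding first training sample from each piece of unknown data traffic, because the unknown data traffic may belong to an unknown application for which the earlier training samples of the application recognition model may contain no relevant samples.
For example, for any piece of unknown data traffic, the client device can obtain the traffic characteristics of the unknown data traffic; obtain, from the server and according to those traffic characteristics, the application information of the application to which the unknown data traffic belongs; and use the traffic characteristics as the training data of the first training sample and the application information as the label data of the first training sample.
The traffic characteristics of the unknown data traffic may include, for the multiple data packets that make up the unknown data traffic, the packet sending and receiving times, the packet 5-tuples, the domain name system (DNS) address, the packet lengths, the number of packets sent and received, the uniform resource locator (URL), and so on.
After obtaining the traffic characteristics of the unknown data traffic, the client device can obtain the application information of the application to which the unknown data traffic belongs from the server according to key traffic information contained in the traffic characteristics, such as the DNS address, the packet 5-tuple, or the URL.
In one possible implementation, the client device can send the key traffic information contained in the traffic characteristics of the unknown data traffic to the server. The server can obtain the application information of the application to which the unknown data traffic belongs according to the key traffic information.
It should be noted that the server may store a mapping between key traffic information and application information. If the server stores a mapping between URLs and application information, that is, if the key traffic information stored in the server is the URL, the client device can send the URL contained in the traffic characteristics of the unknown data traffic to the server. After receiving the URL, the server can obtain the application information corresponding to the URL from the stored mapping and send it to the client device. After receiving the application information, the client device can use it as the application information of the application to which the unknown data traffic belongs. The application information may include the application name, the application type, and other information.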
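Optionally, if the key traffic information in the mapping stored in the server is the IP address, the client device can send the packet 5-tuple to the server. The server can extract the IP address from the 5-tuple, obtain the application information corresponding to the IP address from the stored mapping, and send it to the client device. After receiving the application information, the client device can use it as the application information of the application to which the unknown data traffic belongs.
Optionally, if the key traffic information in the mapping stored in the server is the DNS address, the client device can send the DNS address contained in the traffic characteristics of the unknown data traffic to the server. The server can obtain the application information corresponding to the DNS address from the stored mapping and return it to the client device. After receiving the application information, the client device can use it as the application information of the application to which the unknown data traffic belongs.
Optionally, the mapping stored in the server may also be a mapping among IP addresses, DNS addresses, URLs, and application information. In this case, the client device can send any one or more of the packet 5-tuple, the DNS address, and the URL in the traffic characteristics of the unknown data traffic to the server to obtain the corresponding application information. Details are not repeated here.
In another possible implementation, the client device can send the traffic characteristics of the unknown data traffic directly to the server, and the server can obtain the application information of the application to which the unknown data traffic belongs according to the key traffic information contained in the traffic characteristics. In this case, the server can extract from the traffic characteristics the kind of key traffic information used in its stored mapping and then obtain the corresponding application information from the mapping. For example, if the server stores a mapping between URLs and application information, the server can extract the URL from the received traffic characteristics and obtain the corresponding application information according to it.
It is worth noting that, in some cases, the server may fail to find the corresponding application information in the mapping for the key traffic information reported by the client device. The server can then display the key traffic information and, after a technician has reviewed it, receive the application information that the technician enters for that key traffic information. The server can then return the application information to the client device. At the same time, the server also needs to store the key traffic information together with the application information entered by the technician in the aforementioned mapping table for subsequent queries.
After receiving the application information of the application to which the unknown data traffic belongs from the server, the client device can use the previously obtained traffic characteristics of the unknown data traffic as training data and the received application information as the label data corresponding to that training data. The training data and the label data form the first training sample corresponding to the unknown data traffic. The first training sample is stored in the training data storage area of the client device.
For each detected piece of unknown data traffic that meets the unknown application condition, the client device can process it in the same way to obtain a corresponding first training sample.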
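The lookup and labeling flow of the first implementation might look like the following sketch. The `TrainingSample` class, the `(kind, value)` mapping layout, the feature keys, and the example URL and application name are all hypothetical; the source only states that the server maps key traffic information (URL, DNS address, or IP address) to application information.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class TrainingSample:
    features: Dict[str, object]  # training data: the traffic characteristics
    label: str                   # label data: the application information


# Hypothetical server-side mapping:
# (kind of key traffic information, value) -> application information.
Mapping = Dict[Tuple[str, str], str]


def lookup_app_info(mapping: Mapping, features: Dict[str, object]) -> Optional[str]:
    """Server side: look up application information by URL, DNS address, or IP address."""
    for kind in ("url", "dns", "ip"):
        value = features.get(kind)
        if isinstance(value, str) and (kind, value) in mapping:
            return mapping[(kind, value)]
    return None  # in the source, a technician supplies the information in this case


def make_first_sample(mapping: Mapping, features: Dict[str, object]) -> Optional[TrainingSample]:
    """Client side: pair the traffic characteristics with the returned label."""
    app_info = lookup_app_info(mapping, features)
    return None if app_info is None else TrainingSample(features, app_info)


mapping = {("url", "video.example.com/play"): "ExampleVideoApp"}
flow = {"url": "video.example.com/play", "packet_count": 12}
print(make_first_sample(mapping, flow))
```

In a real deployment the lookup would be a request to the server rather than a local function call; the two are collapsed here only to keep the sketch self-contained.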
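(2) Second implementation
The above describes how data traffic that meets the unknown application condition is processed. Optionally, the multiple pieces of data traffic may also include known data traffic that does not belong to unknown applications. Accordingly, the client device can also determine, from the multiple pieces of data traffic and according to the application recognition result of each piece, the target data traffic belonging to a target application category, where the target application category is an application category whose corresponding data traffic exhibits drift in its traffic characteristics within the specified time period. Second training samples can then be generated from the target data traffic belonging to the target application category.
For example, the client device can first determine, from the multiple pieces of data traffic and according to the application recognition result of each piece, multiple pieces of known data traffic that do not belong to unknown applications within the specified time period. It can then obtain from the server, according to the application recognition results of the multiple pieces of known data traffic, the specified time period, and the identifier of the client device, the feature drift value (feature drift flag) corresponding to each of the application categories included in those application recognition results; determine, from those application categories, the target application category whose feature drift value equals the reference drift value; and obtain, from the multiple pieces of known data traffic, the data traffic belonging to the target application category.
It should be noted that each time the client device receives or sends a piece of data traffic, it can check whether the application recognition result of that data traffic meets the unknown application condition. If it does not, the data traffic can be determined to be known data traffic that does not belong to an unknown application. For how to check whether an application recognition result meets the unknown application condition, refer to the description above; details are not repeated here.
In addition, in the embodiments of this application, the client device can store the application recognition results of data traffic by time period. Accordingly, when a piece of data traffic is determined to be known data traffic, the client device can add its application recognition result to the recognition result set of the specified time period that contains the time at which the data traffic was received or sent. In this way, the client device can obtain the multiple pieces of known data traffic within the specified time period.
For example, suppose one of the multiple pieces of data traffic is known data traffic that was received or sent at 19:30, and the client device stores a recognition result set for the time period 19:00-21:00. The client device can then add the recognition result of this known data traffic to the recognition result set of that time period. In this way, the client device can determine all the known data traffic within the time period and obtain its application recognition results.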
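Storing recognition results by time period can be sketched as below. The list of windows, the half-open interval convention, and the function names are assumptions; the source gives only 19:00-21:00 as an example period.

```python
from collections import defaultdict
from datetime import time
from typing import Dict, List, Optional, Tuple

Window = Tuple[time, time]

# Assumed time periods; the source only gives 19:00-21:00 as an example.
WINDOWS: List[Window] = [(time(19, 0), time(21, 0)), (time(21, 0), time(23, 0))]


def window_for(moment: time) -> Optional[Window]:
    """Return the window [start, end) that contains `moment`, if any."""
    for start, end in WINDOWS:
        if start <= moment < end:
            return (start, end)
    return None


# One recognition result set per time period.
result_sets: Dict[Window, List[dict]] = defaultdict(list)


def store_known_result(moment: time, result: dict) -> None:
    """Add a known flow's recognition result to the set of its time period."""
    window = window_for(moment)
    if window is not None:
        result_sets[window].append(result)


# A known flow received at 19:30 lands in the 19:00-21:00 result set.
store_known_result(time(19, 30), {"video": 0.9, "game": 0.1})
print(len(result_sets[(time(19, 0), time(21, 0))]))
```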
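For each time period, the client device can determine the known data traffic within that period in this way.
After obtaining the application recognition results of the multiple pieces of known data traffic within the specified time period, the client device can send them to the server one by one, in batches, or all at once, together with the specified time period and the identifier of the client device.
The identifier of the client device may be the geographical location of the client device, the device identifier of the client device, or the like, which is not limited in the embodiments of this application.
After receiving the application recognition results of the multiple pieces of known data traffic within the specified time period, the specified time period, and the identifier of the client device, the server can determine the current profile of each application category according to each application category in those application recognition results and the confidence corresponding to each category. The server can then obtain, according to the identifier of the client device, the most recently determined profile of each application category for the specified time period; determine the feature drift flag of each application category according to its current profile and its most recently determined profile, where the feature drift flag indicates whether the traffic characteristics of the data traffic of the corresponding application category have drifted; and send the feature drift flag of each application category to the client device.
For the application recognition result of any piece of known data traffic, the server can take the application category with the highest confidence among the categories in that result as the final application category of the known data traffic. The server classifies the multiple application recognition results by the final application category of each piece of known data traffic. It can then determine the profile of each application category from the number of application recognition results assigned to that final application category and the confidence of that category in those results.
In addition, the most recently determined profile of each application category for the time period may be the profile of each application category determined for the same specified time period of this client device in the previous cycle, or the average profile of each application category determined for that time period of this client device over the previous several cycles.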
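After determining the current profile of each application category and obtaining the most recently determined profile of each category, the server can compare the two profiles of the same application category. If the difference between the two profiles exceeds a preset threshold, the traffic characteristics of the data traffic of that application category have drifted, that is, the applications of that category may have been updated or upgraded; the feature drift flag of that application category can then be set to a first flag. If the difference does not exceed the preset threshold, the traffic characteristics of the data traffic of that application category have changed little and have not drifted, that is, the applications of that category have not been updated or upgraded; the feature drift flag of that application category can then be set to a second flag that differs from the first flag.
It should be noted that, in the embodiments of this application, the server can maintain a feature drift field ordered by application category. The feature drift field may include multiple feature drift flag bits, each corresponding to one application category. The default value of the feature drift flag bits may be the second flag, for example 0. Accordingly, after determining in the above way that the traffic characteristics of the data traffic of an application category have drifted, the server can set the feature drift flag bit of that category in the feature drift field to the first flag (the reference drift value), which may be 1. If the traffic characteristics of the data traffic of an application category have not drifted, the flag bit of that category keeps the default value 0. In this way, the value of the flag bit of each application category in the feature drift field can be determined, and the server can then send the feature drift field to the client device so that the client device can determine which application categories have traffic characteristics that have drifted.
For example, suppose the server receives the application recognition results uploaded by the client device for 19:00-21:00 on the current day, and each application recognition result includes three application categories (video, game, and voice) and the confidence corresponding to each. The server can take the application category with the highest confidence in each application recognition result as the final application category of that result. The server can then divide the application recognition results among the application categories according to their final application categories, which yields the video results, the game results, and the voice results. Next, the server can determine the profile of each application category from the confidence of that category in its application recognition results and the number of results it contains. After determining the profile of each application category, the server can obtain the profiles of each category derived from the application recognition results for 19:00-21:00 on the previous day. The server can then compare the profiles of the same application category, judge whether the difference between them exceeds the preset threshold, and determine the feature drift flag of that application category.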
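The server-side drift check can be illustrated with the sketch below. The source says only that a profile is derived from the number of recognition results per final category and the confidence of that category, and that two profiles are compared against a preset threshold. The concrete profile (share of results, mean confidence), the difference measure (largest absolute change in either component), the handling of categories with no earlier profile, and all names and values are assumptions.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Profile = Tuple[float, float]  # assumed: (share of results, mean confidence)


def build_profiles(results: List[Dict[str, float]]) -> Dict[str, Profile]:
    """Group results by their highest-confidence category and summarize each group."""
    groups: Dict[str, List[float]] = defaultdict(list)
    for result in results:
        final_category = max(result, key=result.get)
        groups[final_category].append(result[final_category])
    total = len(results)
    return {cat: (len(confs) / total, sum(confs) / len(confs))
            for cat, confs in groups.items()}


def drift_flags(current: Dict[str, Profile], previous: Dict[str, Profile],
                threshold: float) -> Dict[str, int]:
    """Return 1 (first flag) for drifted categories and 0 (second flag) otherwise."""
    flags: Dict[str, int] = {}
    for cat, (share, conf) in current.items():
        if cat not in previous:
            flags[cat] = 0  # assumption: no earlier profile means no comparison
            continue
        prev_share, prev_conf = previous[cat]
        difference = max(abs(share - prev_share), abs(conf - prev_conf))
        flags[cat] = 1 if difference > threshold else 0
    return flags


today = [{"video": 0.9, "game": 0.1}, {"video": 0.8, "game": 0.2},
         {"video": 0.3, "game": 0.7}]
yesterday = {"video": (0.66, 0.86), "game": (0.34, 0.95)}
print(drift_flags(build_profiles(today), yesterday, threshold=0.1))
```

With these hypothetical numbers, the mean confidence of the game category falls from 0.95 to 0.7, which exceeds the threshold, while the video category changes only slightly.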
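After determining the feature drift flag of each application category, the server can send the flags to the client device. After receiving them, the client device can take each application category whose feature drift flag is the first flag as a target application category. The client device can then determine the final application category of each piece of known data traffic according to the application recognition results, previously reported to the server, of the multiple pieces of known data traffic within the specified time period, and take the data traffic whose final application category is a target application category as the target data traffic belonging to the target application category.
It should be noted that, as described above, the server can send the client device a feature drift field containing multiple feature drift flag bits. After receiving the feature drift field, the client device can determine the application categories whose flag bits have the value of the first flag and take those categories as target application categories.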
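A feature drift field with one flag bit per application category, in a fixed category order, could be encoded and decoded as in the following sketch. The category list, the list-of-integers representation, and the function names are assumptions; the values 1 and 0 follow the example values given for the first and second flags.

```python
from typing import List, Set

# Assumed fixed ordering of application categories shared by server and client.
CATEGORIES: List[str] = ["video", "game", "voice"]


def encode_drift_field(drifted: Set[str]) -> List[int]:
    """Server side: 1 (first flag) for drifted categories, 0 (default) otherwise."""
    return [1 if category in drifted else 0 for category in CATEGORIES]


def target_categories(field: List[int]) -> List[str]:
    """Client side: categories whose flag bit equals the first flag."""
    return [category for category, bit in zip(CATEGORIES, field) if bit == 1]


field = encode_drift_field({"game"})
print(field)                     # [0, 1, 0]
print(target_categories(field))  # ['game']
```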
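After determining the target data traffic, for each piece of target data traffic the client device can obtain its traffic characteristics, use them as training data, and use the application recognition result of the target data traffic (that is, the application category it indicates) as the label data of that training data. The training data and the label data form a second training sample.
For each piece of target data traffic determined, the client device can obtain a corresponding second training sample in this way. The client device can then store the second training samples in its training data storage area.
It is worth noting that the two implementations can be combined to generate the multiple training samples, in which case the training data storage area stores both the first training samples generated by the first implementation and the second training samples generated by the second implementation. Optionally, either implementation can be used alone, in which case the training data storage area stores the first training samples generated by the first implementation or the second training samples generated by the second implementation. This is not limited in the embodiments of this application.
Step 402: The client device trains the application recognition model according to the multiple training samples.
In the embodiments of this application, the client device can generate first training samples from the data traffic of unknown applications and second training samples from the known data traffic whose characteristics have drifted. The generated first and second training samples can be stored in the training data storage area.
The training data storage area may have a fixed size. When the client device detects that the space occupied by the training samples stored in the training data storage area reaches a certain threshold, training of the application recognition model can be triggered. The threshold is less than or equal to the size of the training data storage area.
Optionally, the client device can also trigger training of the application recognition model when it detects a user instruction or when a specified time is reached, which is not limited in the embodiments of this application.
After training is triggered, the client device can obtain the stored training samples from the training data storage area and train the current application recognition model with them.
The client device can train the application recognition model with the obtained training samples for one round or for multiple rounds. This is not limited in the embodiments of this application.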
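Step 403: The client device sends the model data of the trained application recognition model to the server.
In the embodiments of this application, after training the application recognition model locally with the multiple training samples, the client device can obtain the model parameters of the trained application recognition model, determine the difference data between the model parameters of the trained model and the model parameters of the model before training, and upload the difference data to the server as the model data of the trained application recognition model.
For example, suppose the model parameters of the application recognition model before training are [a1, b1, c1, d1, …] and the model parameters after training are [a2, b2, c2, d2, …]. The client device can then determine the difference data [a2-a1, b2-b1, c2-c1, d2-d1, …] and report it to the server.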
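The difference data in this example is an element-wise subtraction, which can be sketched as follows. Representing the model parameters as a flat list of floats is an assumption made for brevity; real models usually hold parameters in several tensors.

```python
from typing import List


def param_delta(before: List[float], after: List[float]) -> List[float]:
    """Element-wise difference: parameters after training minus parameters before."""
    if len(before) != len(after):
        raise ValueError("parameter lists must have the same length")
    return [a - b for b, a in zip(before, after)]


# [a1, b1] = [1.0, 2.0] before training, [a2, b2] = [1.5, 1.0] after training.
print(param_delta([1.0, 2.0], [1.5, 1.0]))  # [0.5, -1.0]
```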
可选地,在一种可能的实现方式中,客户端设备也可以将训练后的应用识别模型的模型参数作为模型数据直接上报至服务器。例如,假设训练后的应用识别模型的参数为[a2、b2、c2、d2…],则客户端设备可以直接将[a2、b2、c2、d2…]上报至服务器。
可选地,在另一种可能的实现方式中,客户端设备也可以将训练后的应用识别模型的完整数据(包括模型参数和模型结构)上报至服务器。
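Step 404: The server receives the model data of the trained application identification models uploaded by multiple client devices.
Step 405: The server obtains jointly updated model data based on the multiple pieces of received model data.
The implementation of this step depends on the form of the model data reported by the client devices.
When the model data reported by a client device is the difference data between the model parameters of the application identification model before and after training, the server may perform joint averaging on the difference data reported by the client devices and use the result as the jointly updated model data. Alternatively, the server may jointly update the model parameters of the application identification model currently deployed on the server according to the difference data reported by the client devices, and use the difference data between the jointly updated model parameters and the model parameters before the update as the jointly updated model data. Alternatively, the server may perform the same joint update and use the jointly updated model parameters directly as the jointly updated model data. Alternatively, the server may perform the same joint update and use the complete data of the jointly updated application identification model (including the model structure and the model parameters) as the jointly updated model data.
Optionally, when the model data reported by a client device is the model parameters of the trained application identification model, the server may perform joint averaging on the model parameters reported by the client devices and use the averaged model parameters directly as the jointly updated model data. Alternatively, the server may jointly update its own deployed application identification model according to the reported model parameters, and then use one of the following as the jointly updated model data: the difference data between the model parameters of the jointly updated model and those of the model before the update, the model parameters of the jointly updated model, or the complete data of the jointly updated model (including the model structure and the model parameters).
Optionally, when the model data reported by a client device is the complete data of the trained application identification model (including the model parameters and the model structure), the server may jointly update the application identification model according to the model parameters reported by the client devices. The server may then use the complete data of the jointly updated model (including the model structure and the model parameters), or the model parameters of the jointly updated model, as the jointly updated model data.
It should be noted that, in the implementations above, if the server directly performs joint averaging on the model data reported by the client devices, then after obtaining the joint averaging result the server may also update the model parameters of its own deployed application identification model according to that result, thereby obtaining the jointly updated application identification model.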
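Joint averaging on the server side can be sketched as below. The text does not state how the average is weighted, so this example assumes a plain unweighted element-wise mean over flattened parameter (or difference) vectors; `joint_average` and `apply_delta` are hypothetical names.

```python
def joint_average(vectors: list[list[float]]) -> list[float]:
    """Element-wise unweighted mean of the vectors reported by the clients."""
    if not vectors:
        raise ValueError("at least one client report is required")
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def apply_delta(params: list[float], delta: list[float]) -> list[float]:
    """Update the server's deployed parameters with an averaged difference."""
    return [p + d for p, d in zip(params, delta)]
```

When clients report difference data, the server could call `joint_average` on the reports and then `apply_delta` on its deployed parameters. When clients report full parameters, the result of `joint_average` can serve as the jointly updated parameters directly.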
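Because the server's updated model data is obtained by jointly updating the model data reported by the client devices, the jointly updated application identification model is in effect updated from the traffic data of all those client devices. Its model parameters can therefore reflect the traffic distribution of more client devices.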
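Step 406: The server sends the jointly updated model data to the client device.
After obtaining the jointly updated model data, the server may deliver it to each client device that reported trained model data.
Step 407: The client device receives the jointly updated model data sent by the server.
Step 408: The client device obtains the jointly updated application identification model based on the jointly updated model data.
After receiving the jointly updated model data delivered by the server, the client device may obtain the jointly updated application identification model from that model data.
If the jointly updated model data is difference data, the client device may update the pre-training application identification model according to the difference data to obtain the jointly updated application identification model.
Optionally, if the jointly updated model data is model parameters, the client device may update the pre-training or post-training application identification model according to those model parameters to obtain the jointly updated application identification model.
Optionally, if the jointly updated model data is the complete data of the jointly updated application identification model (including the model structure and the model parameters), the client device may load the model data directly to obtain the jointly updated application identification model.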
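The three cases of step 408 can be sketched as a single dispatch function. The `kind` tags and the dictionary layout of the complete model data are assumptions made for this example; the text only distinguishes difference data, model parameters, and complete model data.

```python
def apply_joint_update(pre_training_params: list[float], kind: str, payload):
    """Return the jointly updated parameters for one of the three payload kinds."""
    if kind == "delta":
        # Difference data is applied on top of the pre-training parameters.
        if len(payload) != len(pre_training_params):
            raise ValueError("difference data has the wrong length")
        return [p + d for p, d in zip(pre_training_params, payload)]
    if kind == "params":
        # Model parameters replace the local parameters.
        return list(payload)
    if kind == "full":
        # Complete model data is loaded as is (structure plus parameters).
        return list(payload["params"])
    raise ValueError(f"unknown model data kind: {kind}")
```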
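After obtaining the jointly updated application identification model, the client device may check whether it satisfies a convergence condition. If it does, the jointly updated application identification model may be taken as the final updated model.
Optionally, if the jointly updated application identification model does not satisfy the convergence condition, the client device may return to step 402, continue training the updated application identification model with multiple training samples, and upload the model data of the further-trained model to the server, which performs the joint update again. This repeats until the jointly updated application identification model that the client device obtains from the jointly updated model data delivered by the server satisfies the convergence condition.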
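The iteration over steps 402 to 408 can be sketched as a loop. The callables stand in for local training, the exchange with the server, and the convergence check, none of which are specified in detail here, and the `max_rounds` guard is an addition for the example rather than part of the described method.

```python
def run_until_converged(model, train_locally, exchange_with_server,
                        is_converged, max_rounds: int = 50):
    """Repeat local training and joint update until the model converges."""
    for round_index in range(1, max_rounds + 1):
        trained = train_locally(model)         # step 402
        model = exchange_with_server(trained)  # steps 403 to 408
        if is_converged(model):
            return model, round_index
    raise RuntimeError("model did not converge within max_rounds")
```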
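It should be noted that, in the embodiment above, the operations performed by the client device may be implemented as a separate embodiment, and the operations performed by the server may also be implemented as a separate embodiment. The embodiments of this application do not limit this.
In the embodiments of this application, the client device may determine multiple training samples based on the identification results of multiple data flows and train the application identification model with those samples. The client device may then upload the model data of the trained application identification model to the server, which performs a joint update based on the model data uploaded by multiple client devices. The client device may then obtain the jointly updated application identification model from the jointly updated model data delivered by the server. The jointly updated application identification model is thus in effect dynamically updated from the real-time data flows of multiple client devices. Even when applications are updated or new applications emerge, the application identification model has been dynamically updated according to the data flows of the client devices, so its identification accuracy can be better maintained.
In addition, because the client device can obtain the unknown data flows that belong to unknown applications and generate training samples from them to train the application identification model, the updated model can better identify emerging applications.
Finally, in the embodiments of this application, for the data flows of known applications, the client device may identify updated or upgraded applications by detecting whether the flow features of the data flows have drifted, and then generate training samples from the data flows of those applications to retrain the application identification model. The final updated application identification model can therefore better identify updated or upgraded applications, which improves the accuracy of application identification.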
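Referring to FIG. 5, an embodiment of this application provides an apparatus 500 for updating an application identification model. The apparatus includes:
a determining module 501, configured to perform step 401 in the foregoing embodiment;
a training module 502, configured to perform step 402 in the foregoing embodiment;
a sending module 503, configured to perform step 403 in the foregoing embodiment;
a receiving module 504, configured to perform step 407 in the foregoing embodiment;
an updating module 505, configured to perform step 408 in the foregoing embodiment.
Optionally, referring to FIG. 6, the determining module 501 includes an unknown application detection submodule 5011 and/or a feature drift detection submodule 5012, and further includes a generation submodule 5013;
the unknown application detection submodule 5011 is configured to obtain, from the multiple data flows and based on the application identification result of each data flow, the unknown data flows that belong to unknown applications;
the feature drift detection submodule 5012 is configured to obtain, from the multiple data flows and based on the application identification result of each data flow, the target data flows that belong to a target application category, where a target application category is an application category whose corresponding data flows show feature drift in their flow features within a specified time period;
the generation submodule 5013 is configured to generate multiple training samples based on the obtained data flows.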
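Optionally, the unknown application detection submodule 5011 is specifically configured to:
obtain, from the multiple data flows, the data flows whose application identification results satisfy an unknown application condition, and take the obtained data flows as unknown data flows belonging to unknown applications, where the unknown application condition is that the confidence corresponding to every application category in the application identification result is less than a reference threshold, or the unknown application condition is that the application identification result does not belong to any of multiple specified clusters, the multiple specified clusters being obtained by clustering the flow features of the data flows of each application category in the original training sample set of the application identification model.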
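The first form of the unknown application condition can be sketched as follows, assuming an application identification result is represented as a mapping from application category to confidence (an illustrative representation, with the hypothetical function name `is_unknown`). The cluster-based form of the condition is not covered by this sketch.

```python
def is_unknown(confidences: dict[str, float], reference_threshold: float) -> bool:
    """True when every category's confidence is below the reference threshold."""
    return all(c < reference_threshold for c in confidences.values())
```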
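Optionally, the generation submodule 5013 is specifically configured to:
obtain the flow features of the unknown data flow;
obtain, from the server and based on the flow features of the unknown data flow, the application information of the application to which the unknown data flow belongs;
use the flow features of the unknown data flow as the training data in a first training sample, and use the application information of the application to which the unknown data flow belongs as the label data in the first training sample, the first training sample being one of the multiple training samples.
Optionally, the feature drift detection submodule 5012 is specifically configured to:
determine, from the multiple data flows and based on the application identification result of each data flow, the multiple known data flows within the specified time period that do not belong to unknown applications;
obtain from the server, based on the application identification results of the multiple known data flows, the specified time period, and the identifier of the client device, the feature drift flags respectively corresponding to the multiple application categories included in the application identification results of the multiple known data flows, where a feature drift flag indicates whether the flow features of the data flows of the corresponding application category have drifted;
determine, from the multiple application categories and based on the feature drift flag corresponding to each application category, the target application category whose data flows have drifted flow features;
obtain, from the multiple known data flows, the target data flows belonging to the target application category.
Optionally, the generation submodule 5013 is specifically configured to:
use the flow features of the target data flow as the training data in a second training sample, and use the application category to which the target data flow belongs, as indicated by its application identification result, as the label data in the second training sample, the second training sample being one of the multiple training samples.
Optionally, the model data of the trained application identification model includes the model parameters of the trained application identification model, or the model data of the trained application identification model includes the difference data between the model parameters of the trained application identification model and the model parameters of the application identification model before training.
Optionally, the jointly updated model data includes the model parameters of the jointly updated application identification model, or the jointly updated model data includes the difference data between the model parameters of the jointly updated application identification model and the model parameters of the application identification model before training.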
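In summary, in the embodiments of this application, the client device may determine multiple training samples based on the identification results of multiple data flows and train the application identification model with those samples. The client device may then upload the model data of the trained application identification model to the server, which performs a joint update based on the model data uploaded by multiple client devices. The client device may then obtain the jointly updated application identification model from the jointly updated model data delivered by the server. The jointly updated application identification model is thus in effect dynamically updated from the features of the real-time data flows collected by multiple client devices. Even when applications are updated or new applications emerge, the application identification model has been dynamically updated according to the features of the data flows collected by the client devices, so its identification accuracy can be better maintained.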
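FIG. 7 shows an apparatus 700 for updating an application identification model provided by an embodiment of this application. The apparatus 700 includes:
a receiving module 701, configured to perform step 404 in the foregoing embodiment;
a first obtaining module 702, configured to perform step 405 in the foregoing embodiment;
a sending module 703, configured to perform step 406 in the foregoing embodiment.
Optionally, the apparatus further includes a second obtaining module (not shown in the figure);
the receiving module is further configured to receive the flow features of an unknown data flow sent by a first client device, the unknown data flow being a data flow that the first client device has determined, from multiple data flows, to belong to an unknown application;
the second obtaining module is configured to obtain, based on the flow features of the unknown data flow, the application information of the application to which the unknown data flow belongs;
the sending module is further configured to send the application information of the application to which the unknown data flow belongs to the first client device, so that the first client device generates a training sample based on that application information.
Optionally, the apparatus further includes a third obtaining module and a determining module (not shown in the figure);
the receiving module is further configured to receive the application identification results of multiple known data flows, a specified time period, and the identifier of the first client device, all sent by the first client device, the multiple known data flows being the data flows, determined from the multiple data flows, that fall within the specified time period and do not belong to unknown applications;
the determining module is configured to determine the current profile of each application category based on each application category included in the application identification results of the multiple known data flows and the confidence corresponding to each application category;
the third obtaining module is configured to obtain, based on the identifier of the first client device, the most recently determined profile of each application category corresponding to the specified time period;
the determining module is further configured to determine the feature drift flag corresponding to each application category based on the current profile of that category and the obtained most recently determined profile of that category, where the feature drift flag indicates whether the flow features of the data flows of the corresponding application category have drifted;
the sending module is further configured to send the feature drift flag corresponding to each application category to the first client device, so that the first client device obtains the data flows belonging to a target application category based on the feature drift flags and generates training samples from those data flows, the target application category being an application category whose data flows have drifted flow features.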
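In the embodiments of this application, the server may obtain jointly updated model data based on the trained model data received from multiple client devices. Because the trained model data sent by each client device is dynamically updated from the features of the real-time data flows it has collected, the jointly updated model data is in effect dynamically updated from the features of the real-time data flows collected by all of the client devices. Even when applications are updated or new applications emerge, the model data of the application identification model has been dynamically updated according to the features of the data flows collected by the client devices, so the identification accuracy of the application identification model can be better maintained.
It should be noted that the division into functional modules used above to describe how the apparatus updates the application identification model is only an example. In practice, the functions may be assigned to different functional modules as required, that is, the internal structure of the device may be divided into different functional modules to perform all or some of the functions described above. In addition, the apparatus for updating an application identification model provided in the foregoing embodiments is based on the same concept as the method embodiments for updating an application identification model. For its specific implementation, see the method embodiments; details are not repeated here.
The foregoing embodiments may be implemented wholly or partly in software, hardware, firmware, or any combination thereof. When implemented in software, they may be implemented wholly or partly in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the procedures or functions described in the embodiments of this application are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another. For example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wired means (for example, coaxial cable, optical fiber, or digital subscriber line (DSL)) or wireless means (for example, infrared, radio, or microwave). The computer-readable storage medium may be any usable medium that a computer can access, or a data storage device such as a server or data center that integrates one or more usable media. The usable medium may be a magnetic medium (for example, a floppy disk, hard disk, or magnetic tape), an optical medium (for example, a digital versatile disc (DVD)), or a semiconductor medium (for example, a solid state disk (SSD)).
A person of ordinary skill in the art will understand that all or some of the steps of the foregoing embodiments may be completed by hardware, or by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium, which may be a read-only memory, a magnetic disk, an optical disc, or the like.
The foregoing describes embodiments provided by this application and is not intended to limit this application. Any modification, equivalent replacement, or improvement made within the spirit and principles of this application shall fall within the scope of protection of this application.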

Claims (23)

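  1. A method for updating an application identification model, applied to a client device, wherein the method comprises:
    determining multiple training samples based on the application identification result of each of multiple data flows, the application identification result being obtained by identifying the corresponding data flow with an application identification model;
    training the application identification model based on the multiple training samples;
    sending model data of the trained application identification model to a server, so that the server obtains jointly updated model data based on the model data received from multiple client devices;
    receiving the jointly updated model data sent by the server;
    obtaining a jointly updated application identification model based on the jointly updated model data.
  2. The method according to claim 1, wherein the determining multiple training samples based on the application identification result of each of multiple data flows comprises:
    obtaining, from the multiple data flows and based on the application identification result of each data flow, unknown data flows belonging to unknown applications, and/or obtaining, from the multiple data flows and based on the application identification result of each data flow, target data flows belonging to a target application category, the target application category being an application category whose corresponding data flows show feature drift in their flow features within a specified time period;
    generating the multiple training samples based on the obtained data flows.
  3. The method according to claim 2, wherein the obtaining, from the multiple data flows and based on the application identification result of each data flow, unknown data flows belonging to unknown applications comprises:
    obtaining, from the multiple data flows, the data flows whose application identification results satisfy an unknown application condition, and taking the obtained data flows as unknown data flows belonging to unknown applications, wherein the unknown application condition is that the confidence corresponding to every application category in the application identification result is less than a reference threshold, or the unknown application condition is that the application identification result does not belong to any of multiple specified clusters, the multiple specified clusters being obtained by clustering the flow features of the data flows of each application category in the original training sample set of the application identification model.
  4. The method according to claim 3, wherein the generating multiple training samples based on the obtained data flows comprises:
    obtaining the flow features of the unknown data flow;
    obtaining, from the server and based on the flow features of the unknown data flow, the application information of the application to which the unknown data flow belongs;
    using the flow features of the unknown data flow as the training data in a first training sample, and using the application information of the application to which the unknown data flow belongs as the label data in the first training sample, the first training sample being one of the multiple training samples.
  5. The method according to claim 2, wherein the determining, from the multiple data flows and based on the application identification result of each data flow, target data flows belonging to a target application category comprises:
    determining, from the multiple data flows and based on the application identification result of each data flow, multiple known data flows within the specified time period that do not belong to unknown applications;
    obtaining from the server, based on the application identification results of the multiple known data flows, the specified time period, and the identifier of the client device, the feature drift flags respectively corresponding to the multiple application categories included in the application identification results of the multiple known data flows, the feature drift flag indicating whether the flow features of the data flows of the corresponding application category have drifted;
    determining, from the multiple application categories and based on the feature drift flag corresponding to each application category, the target application category whose data flows have drifted flow features;
    obtaining, from the multiple known data flows, the target data flows belonging to the target application category.
  6. The method according to claim 5, wherein the generating multiple training samples based on the obtained data flows comprises:
    using the flow features of the target data flow as the training data in a second training sample, and using the application category to which the target data flow belongs, as indicated by the application identification result of the target data flow, as the label data in the second training sample, the second training sample being one of the multiple training samples.
  7. The method according to any one of claims 1 to 6, wherein the model data of the trained application identification model comprises the model parameters of the trained application identification model, or the model data of the trained application identification model comprises the difference data between the model parameters of the trained application identification model and the model parameters of the application identification model before training.
  8. The method according to any one of claims 1 to 6, wherein the jointly updated model data comprises the model parameters of the jointly updated application identification model, or the jointly updated model data comprises the difference data between the model parameters of the jointly updated application identification model and the model parameters of the application identification model before training.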
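  9. A method for updating an application identification model, applied to a server, wherein the method comprises:
    receiving model data of trained application identification models sent by multiple client devices, each trained application identification model being obtained by the corresponding client device training the application identification model based on multiple training samples, the multiple training samples being determined by the corresponding client device based on the application identification result of each of multiple data flows;
    obtaining jointly updated model data based on the multiple pieces of received model data;
    sending the jointly updated model data to the multiple client devices, so that the multiple client devices obtain a jointly updated application identification model based on the jointly updated model data.
  10. The method according to claim 9, wherein the method further comprises:
    receiving the flow features of an unknown data flow sent by a first client device, the unknown data flow being a data flow that the first client device has determined, from the multiple data flows, to belong to an unknown application;
    obtaining, based on the flow features of the unknown data flow, the application information of the application to which the unknown data flow belongs;
    sending the application information of the application to which the unknown data flow belongs to the first client device, so that the first client device generates a training sample based on that application information.
  11. The method according to claim 9 or 10, wherein the method further comprises:
    receiving the application identification results of multiple known data flows, a specified time period, and the identifier of the first client device, all sent by the first client device, the multiple known data flows being the data flows, determined from the multiple data flows, that fall within the specified time period and do not belong to unknown applications;
    determining the current profile of each application category based on each application category included in the application identification results of the multiple known data flows and the confidence corresponding to each application category;
    obtaining, based on the identifier of the first client device, the most recently determined profile of each application category corresponding to the specified time period;
    determining the feature drift flag corresponding to each application category based on the current profile of that category and the obtained most recently determined profile of that category, the feature drift flag indicating whether the flow features of the data flows of the corresponding application category have drifted;
    sending the feature drift flag corresponding to each application category to the first client device, so that the first client device obtains the data flows belonging to a target application category based on the feature drift flags and generates training samples from those data flows, the target application category being an application category whose data flows have drifted flow features.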
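  12. An apparatus for updating an application identification model, wherein the apparatus comprises:
    a determining module, configured to determine multiple training samples based on the application identification result of each of multiple data flows, the application identification result being obtained by identifying the corresponding data flow with an application identification model;
    a training module, configured to train the application identification model based on the multiple training samples;
    a sending module, configured to send model data of the trained application identification model to a server, so that the server obtains jointly updated model data based on the model data received from multiple client devices;
    a receiving module, configured to receive the jointly updated model data sent by the server;
    an updating module, configured to obtain a jointly updated application identification model based on the jointly updated model data.
  13. The apparatus according to claim 12, wherein the determining module comprises:
    an unknown application detection submodule and/or a feature drift detection submodule, the unknown application detection submodule being configured to obtain, from the multiple data flows and based on the application identification result of each data flow, unknown data flows belonging to unknown applications, and the feature drift detection submodule being configured to obtain, from the multiple data flows and based on the application identification result of each data flow, target data flows belonging to a target application category, the target application category being an application category whose corresponding data flows show feature drift in their flow features within a specified time period;
    the determining module further comprises a generation submodule, configured to generate the multiple training samples based on the obtained data flows.
  14. The apparatus according to claim 13, wherein the unknown application detection submodule is specifically configured to:
    obtain, from the multiple data flows, the data flows whose application identification results satisfy an unknown application condition, and take the obtained data flows as unknown data flows belonging to unknown applications, wherein the unknown application condition is that the confidence corresponding to every application category in the application identification result is less than a reference threshold, or the unknown application condition is that the application identification result does not belong to any of multiple specified clusters, the multiple specified clusters being obtained by clustering the flow features of the data flows of each application category in the original training sample set of the application identification model.
  15. The apparatus according to claim 14, wherein the generation submodule is specifically configured to:
    obtain the flow features of the unknown data flow;
    obtain, from the server and based on the flow features of the unknown data flow, the application information of the application to which the unknown data flow belongs;
    use the flow features of the unknown data flow as the training data in a first training sample, and use the application information of the application to which the unknown data flow belongs as the label data in the first training sample, the first training sample being one of the multiple training samples.
  16. The apparatus according to claim 13, wherein the feature drift detection submodule is specifically configured to:
    determine, from the multiple data flows and based on the application identification result of each data flow, multiple known data flows within the specified time period that do not belong to unknown applications;
    obtain from the server, based on the application identification results of the multiple known data flows, the specified time period, and the identifier of the client device, the feature drift flags respectively corresponding to the multiple application categories included in the application identification results of the multiple known data flows, the feature drift flag indicating whether the flow features of the data flows of the corresponding application category have drifted;
    determine, from the multiple application categories and based on the feature drift flag corresponding to each application category, the target application category whose data flows have drifted flow features;
    obtain, from the multiple known data flows, the target data flows belonging to the target application category.
  17. The apparatus according to claim 16, wherein the generation submodule is specifically configured to:
    use the flow features of the target data flow as the training data in a second training sample, and use the application category to which the target data flow belongs, as indicated by the application identification result of the target data flow, as the label data in the second training sample, the second training sample being one of the multiple training samples.
  18. The apparatus according to any one of claims 12 to 17, wherein the model data of the trained application identification model comprises the model parameters of the trained application identification model, or the model data of the trained application identification model comprises the difference data between the model parameters of the trained application identification model and the model parameters of the application identification model before training.
  19. The apparatus according to any one of claims 12 to 17, wherein the jointly updated model data comprises the model parameters of the jointly updated application identification model, or the jointly updated model data comprises the difference data between the model parameters of the jointly updated application identification model and the model parameters of the application identification model before training.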
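  20. An apparatus for updating an application identification model, wherein the apparatus comprises:
    a receiving module, configured to receive model data of trained application identification models sent by multiple client devices, each trained application identification model being obtained by the corresponding client device training the application identification model based on multiple training samples, the multiple training samples being determined by the corresponding client device based on the application identification result of each of multiple data flows;
    a first obtaining module, configured to obtain jointly updated model data based on the multiple pieces of received model data;
    a sending module, configured to send the jointly updated model data to the multiple client devices, so that the multiple client devices obtain a jointly updated application identification model based on the jointly updated model data.
  21. The apparatus according to claim 20, wherein the apparatus further comprises a second obtaining module;
    the receiving module is further configured to receive the flow features of an unknown data flow sent by a first client device, the unknown data flow being a data flow that the first client device has determined, from the multiple data flows, to belong to an unknown application;
    the second obtaining module is configured to obtain, based on the flow features of the unknown data flow, the application information of the application to which the unknown data flow belongs;
    the sending module is further configured to send the application information of the application to which the unknown data flow belongs to the first client device, so that the first client device generates a training sample based on that application information.
  22. The apparatus according to claim 20 or 21, wherein the apparatus further comprises a third obtaining module and a determining module;
    the receiving module is further configured to receive the application identification results of multiple known data flows, a specified time period, and the identifier of the first client device, all sent by the first client device, the multiple known data flows being the data flows, determined from the multiple data flows, that fall within the specified time period and do not belong to unknown applications;
    the determining module is configured to determine the current profile of each application category based on each application category included in the application identification results of the multiple known data flows and the confidence corresponding to each application category;
    the third obtaining module is configured to obtain, based on the identifier of the first client device, the most recently determined profile of each application category corresponding to the specified time period;
    the determining module is further configured to determine the feature drift flag corresponding to each application category based on the current profile of that category and the obtained most recently determined profile of that category, the feature drift flag indicating whether the flow features of the data flows of the corresponding application category have drifted;
    the sending module is further configured to send the feature drift flag corresponding to each application category to the first client device, so that the first client device obtains the data flows belonging to a target application category based on the feature drift flags and generates training samples from those data flows, the target application category being an application category whose data flows have drifted flow features.
  23. A computer-readable storage medium, wherein the computer-readable storage medium stores computer program code, and when the computer program code is executed by a computer device, the computer device performs the method according to any one of claims 1 to 8, or the computer device performs the method according to any one of claims 9 to 11.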
PCT/CN2020/118993 2020-02-29 2020-09-29 Method and apparatus for updating application identification model, and storage medium WO2021169294A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP20921538.3A EP4095768A4 (en) 2020-02-29 2020-09-29 METHOD AND APPARATUS FOR UPDATING AN APPLICATION RECOGNITION MODEL AND STORAGE MEDIA
US17/822,581 US20220414487A1 (en) 2020-02-29 2022-08-26 Method and Apparatus for Updating Application Identification Model, and Storage Medium

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010132251.0A CN113326946A (zh) 2020-02-29 2020-02-29 Method and apparatus for updating application identification model, and storage medium
CN202010132251.0 2020-02-29

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US17/822,581 Continuation US20220414487A1 (en) 2020-02-29 2022-08-26 Method and Apparatus for Updating Application Identification Model, and Storage Medium

Publications (1)

Publication Number Publication Date
WO2021169294A1 true WO2021169294A1 (zh) 2021-09-02

Family

ID=77413230

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/118993 WO2021169294A1 (zh) 2020-02-29 2020-09-29 Method and apparatus for updating application identification model, and storage medium

Country Status (4)

Country Link
US (1) US20220414487A1 (zh)
EP (1) EP4095768A4 (zh)
CN (1) CN113326946A (zh)
WO (1) WO2021169294A1 (zh)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114095076A (zh) * 2021-11-30 2022-02-25 北京中昱光通科技有限公司 一种基于bidi***的olp光线路保护切换监测方法
CN115134687A (zh) * 2022-06-22 2022-09-30 中国信息通信研究院 光接入网的业务识别方法、装置、电子设备及存储介质
WO2024031984A1 (zh) * 2022-08-10 2024-02-15 华为云计算技术有限公司 一种任务处理***、任务处理的方法及装置

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11836205B2 (en) 2022-04-20 2023-12-05 Meta Platforms Technologies, Llc Artificial reality browser configured to trigger an immersive experience
US11755180B1 (en) 2022-06-22 2023-09-12 Meta Platforms Technologies, Llc Browser enabled switching between virtual worlds in artificial reality

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102685016A (zh) * 2012-06-06 2012-09-19 济南大学 互联网流量区分方法
CN105516027A (zh) * 2016-01-12 2016-04-20 北京奇虎科技有限公司 应用识别模型建立方法、流量数据的识别方法及装置
CN108259637A (zh) * 2017-11-30 2018-07-06 湖北大学 一种基于决策树的nat设备识别方法及装置
CN109873774A (zh) * 2019-01-15 2019-06-11 北京邮电大学 一种网络流量识别方法及装置
CN110768875A (zh) * 2019-12-27 2020-02-07 北京安博通科技股份有限公司 一种基于dns学习的应用识别方法及***
CN110782014A (zh) * 2019-10-23 2020-02-11 新华三信息安全技术有限公司 一种神经网络增量学习方法及装置

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180089587A1 (en) * 2016-09-26 2018-03-29 Google Inc. Systems and Methods for Communication Efficient Distributed Mean Estimation
US11170320B2 (en) * 2018-07-19 2021-11-09 Adobe Inc. Updating machine learning models on edge servers

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP4095768A4

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114095076A (zh) * 2021-11-30 2022-02-25 北京中昱光通科技有限公司 一种基于bidi***的olp光线路保护切换监测方法
CN115134687A (zh) * 2022-06-22 2022-09-30 中国信息通信研究院 光接入网的业务识别方法、装置、电子设备及存储介质
CN115134687B (zh) * 2022-06-22 2024-05-07 中国信息通信研究院 光接入网的业务识别方法、装置、电子设备及存储介质
WO2024031984A1 (zh) * 2022-08-10 2024-02-15 华为云计算技术有限公司 一种任务处理***、任务处理的方法及装置

Also Published As

Publication number Publication date
CN113326946A (zh) 2021-08-31
EP4095768A1 (en) 2022-11-30
EP4095768A4 (en) 2023-08-09
US20220414487A1 (en) 2022-12-29

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20921538

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2020921538

Country of ref document: EP

Effective date: 20220822

NENP Non-entry into the national phase

Ref country code: DE