CN111611051A - Method for accelerating first distribution of data packets on NFV platform

Method for accelerating first distribution of data packets on NFV platform

Info

Publication number
CN111611051A
CN111611051A (application CN202010349642.8A)
Authority
CN
China
Prior art keywords
queue
flow
network
vnf
packet receiving
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010349642.8A
Other languages
Chinese (zh)
Other versions
CN111611051B (en)
Inventor
Jian Li (李健)
Hubin Zhang (张沪滨)
Haibing Guan (管海兵)
Jianmin Qian (钱建民)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Jiaotong University
Original Assignee
Shanghai Jiaotong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Jiaotong University
Priority to CN202010349642.8A
Publication of CN111611051A
Application granted
Publication of CN111611051B
Active legal status (current)
Anticipated expiration legal status

Classifications

    • G06F 9/45558: Hypervisor-specific management and integration aspects
    • G06F 2009/45595: Network integration; enabling network access in virtual machine instances
    • G06F 9/546: Message passing systems or structures, e.g. queues
    • H04L 47/2441: Traffic characterised by specific attributes, e.g. priority or QoS, relying on flow classification, e.g. using integrated services [IntServ]
    • H04L 47/50: Queue scheduling
    • Y02D 30/50: Reducing energy consumption in wire-line communication networks, e.g. low power modes or reduced link rate

Abstract

The invention discloses a method for accelerating the first distribution of data packets on an NFV (Network Function Virtualization) platform by utilizing the FDir function of the network card. The FDir function is used to cache, in the network card's PMFT table, the mapping between the five-tuple of an active flow and the hardware receive queue bound to its target VNF, after which the network card can send matching data packets directly to the target VNF without the intervention of the CPU. Because the capacity of the PMFT table is limited, inactive flows are not cached in it; their target VNFs are instead computed by the CPU in combination with the network policy to complete distribution. The liveness of each network flow is counted by a user-definable algorithm, and the PMFT table is updated periodically so as to maximize the acceleration benefit. The invention saves host CPU resources without introducing special hardware, thereby enhancing the overall performance of the NFV platform.

Description

Method for accelerating first distribution of data packets on NFV platform
Technical Field
The invention relates to an NFV platform, and in particular to a technique for the first distribution of data packets.
Background
Network Function Virtualization (NFV) is a technology that builds virtual machines on inexpensive general-purpose servers, replacing conventional hardware network functions such as switches, routers, and firewalls with software programs that perform the corresponding functions. The emergence and development of NFV greatly alleviate the drawbacks of traditional hardware solutions in cloud environments, such as long integration cycles, high deployment costs, and the lack of a unified management mechanism. However, a natural gap exists between general-purpose and special-purpose hardware on specific tasks, and making an NFV platform comparable in performance to a physical hardware cluster is a key challenge for its commercialization. To this end, researchers have proposed various NFV frameworks that attempt to improve overall platform performance, starting from the throughput and latency of a single Virtualized Network Function (VNF), the way a VNF service chain operates, the efficiency of packet distribution, and other factors. In terms of packet distribution, the transfer of a data packet from the host network card to its target VNF is the first distribution after the packet enters the host; its efficiency bounds the processing efficiency of the related service (chain), so it has non-negligible optimization value.
A survey of the prior art shows that "OpenNetVM: A Platform for High Performance Network Service Chains" by Wei Zhang et al., "Metron: NFV Service Chains at the True Speed of the Underlying Hardware" by Georgios P. Katsikas et al., and "Azure Accelerated Networking: SmartNICs in the Public Cloud" by Daniel Firestone et al. all handle the first distribution of data packets from the host to the target VNF efficiently. OpenNetVM dedicates one or more CPU cores to an RX process that handles packet reception from the network card. The RX process maintains a mapping table from five-tuples to target cores; it reads the corresponding fields of each data packet, looks them up in the table to obtain the CPU core where the target VNF resides, and completes delivery. Because the RX process monopolizes its CPU cores without scheduling, and polls for data packets without interrupt overhead, OpenNetVM achieves high first-distribution efficiency. However, the RX process makes the first distribution indiscriminately dependent on CPU resources, so those resources cannot serve the VNFs in the host, which affects the overall performance of the NFV platform. Metron performs the first distribution with a dedicated hardware switch and a hardware network card: the switch maintains the relation between five-tuples and target VNFs and records the CPU core of the target VNF into the data packet as a tag, and the network card then reads the tag and completes direct delivery. Metron's first distribution requires no host CPU participation, so hardware provides the performance guarantee while the CPU is saved; however, the dedicated hardware is very expensive, which is not conducive to large-scale deployment in a cloud environment. Azure Accelerated Networking uses a self-developed SmartNIC based on FPGA technology to distribute packets efficiently for the first time without CPU participation; however, like ordinary FPGA devices, the SmartNIC has a complex structure and is difficult to develop, and no commercial version deployable outside Azure is currently available.
Therefore, those skilled in the art are working to develop a method that can accelerate packet distribution without the expensive TCO of specially customized hardware accelerators, so that it has the potential for wide deployment in cloud environments.
Disclosure of Invention
In view of the above defects in the prior art, the technical problem to be solved by the present invention is how to save CPU resources of a host and enhance the overall performance of an NFV platform without expensive dedicated hardware.
In order to achieve the above purpose, the present invention provides a method for accelerating the first distribution of data packets by using the FDir function of the network card. The framework repurposes the FDir function, originally added to conventional commercial network cards for load balancing, to construct a mapping between the network flow five-tuple (source IP, source port, destination IP, destination port, and protocol type) and the hardware receive queue watched by the target VNF. Data flows are divided into active and inactive flows according to their liveness. Only the mappings from the five-tuples of active flows to hardware receive queues are stored in the PMFT table of the network card; once cached there, the network card can send matching data packets directly to the target VNF without the intervention of the CPU. Because of the capacity limit of the PMFT table, inactive flows are not cached; instead, the CPU computes their target VNFs in combination with the network policy and then completes the distribution. The framework also counts the liveness of each network flow through a user-definable algorithm and periodically updates the PMFT table so as to maximize the acceleration benefit.
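To make the two distribution paths concrete, the sketch below models the decision in Python. It is an illustrative reading of the scheme rather than the patented implementation; every name in it (FiveTuple, pmft, compute_target_vnf, slow_queue) is a hypothetical stand-in for the platform's real data structures.

```python
from collections import namedtuple

# Hypothetical model of the first-distribution decision described above.
FiveTuple = namedtuple("FiveTuple", "src_ip src_port dst_ip dst_port proto")

# Models the NIC's PMFT table: five-tuple -> hardware receive queue bound
# to the target VNF. Flows present here are steered entirely in hardware.
pmft = {}

def first_distribution(five_tuple, compute_target_vnf):
    """Return the receive queue that gets this packet."""
    hw_queue = pmft.get(five_tuple)
    if hw_queue is not None:
        # Active flow: FDir matches the five-tuple in hardware, so the
        # packet reaches the target VNF's bound queue with no CPU work.
        return hw_queue
    # Inactive flow: the packet falls through to the default queue, and a
    # CPU core resolves the target VNF from the network policy instead.
    target_vnf = compute_target_vnf(five_tuple)
    return target_vnf.slow_queue
```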
Further, the working process of the method is handled mainly by two components: a flow management component and a VNF runtime component.
The method comprises the following steps:
Step 1: the flow management component creates a slow general queue in shared memory and binds it to the default queue of the hardware receive queues;
Step 2: the flow management component creates a plurality of slow packet receiving queues and fast packet receiving queues in shared memory and assembles at least one slow packet receiving queue for each target VNF;
Step 3: for each active flow FL known to be received and processed by a target VNF, before that VNF starts, the flow management component ensures that it has at least one fast packet receiving queue assembled, selects one fast packet receiving queue as FQ, and selects an idle hardware receive queue as HQ; the mapping from the five-tuple of FL to HQ is entered into the PMFT table as an entry, and HQ and FQ are bound;
Step 4: when a data packet arrives in the slow general queue, the flow management component queries the currently maintained "network flow five-tuple - target VNF" memory table with the five-tuple as the key; if the lookup succeeds, the packet is delivered to the slow packet receiving queue of the corresponding target VNF; otherwise, the target VNF is computed from the network policy, the result is entered into the memory table, and delivery of the packet is completed;
Step 5: each target VNF traverses all of its assembled fast and slow packet receiving queues to receive packets, through a packet receiving API provided by the VNF runtime component that can also count the flow liveness corresponding to each packet in the fast packet receiving queues;
Step 6: the flow management component observes packet arrivals in the slow general queue and updates the liveness of the flows corresponding to packets in the slow packet receiving queues using the user-definable algorithm; combining the result of step 5, it periodically configures the PMFT table, removing inactive flows and adding active flows.
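Purely as an illustration of the bookkeeping behind steps 1 to 3, the sketch below models queue creation, assembly, and binding in Python. All names here (the VNF dataclass, the pool sizes, the queue identifiers) are hypothetical stand-ins; real binding of hardware receive queues happens in the NIC driver, not in host data structures.

```python
from collections import deque
from dataclasses import dataclass, field

@dataclass
class VNF:
    name: str
    slow_queues: list = field(default_factory=list)
    fast_queues: list = field(default_factory=list)

# Step 1: the slow general queue, bound to the NIC's default receive queue.
slow_general_queue = deque()

# Step 2: pools of pre-allocated slow/fast packet receiving queues.
slow_pool = [deque() for _ in range(64)]
fast_pool = [deque() for _ in range(64)]

pmft = {}         # five-tuple -> hardware queue id (models the PMFT table)
hq_bindings = {}  # hardware queue id (HQ) -> fast packet receiving queue (FQ)

def assemble_vnf(vnf):
    """Step 2: every target VNF gets at least one slow packet receiving queue."""
    vnf.slow_queues.append(slow_pool.pop())

def install_active_flow(five_tuple, vnf, free_hw_queues):
    """Step 3: map an active flow FL to an idle hardware queue HQ and bind
    HQ to one of the VNF's fast packet receiving queues FQ."""
    if not vnf.fast_queues:                  # ensure at least one fast queue
        vnf.fast_queues.append(fast_pool.pop())
    fq = vnf.fast_queues[0]
    hq = free_hw_queues.pop()
    pmft[five_tuple] = hq                    # PMFT entry: five-tuple of FL -> HQ
    hq_bindings[hq] = fq                     # bind HQ and FQ
```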
Further, the slow general queue of step 1 is created when the flow management component starts; its life cycle is the same as that of the flow management component. Each element of the queue is a data packet not cached in the PMFT table, and in general such packets belong to multiple target VNFs on the host.
Further, step 2 pre-allocates a number of fast and slow packet receiving queues at once in a resource-pool manner. Slow packet receiving queues are assembled when the target VNF starts, whereas a fast packet receiving queue is assembled if and only if a cached flow is entered into the PMFT table in step 3 or step 6.
Further, step 3 examines the CPU node where the target VNF is located and, when assembling a fast packet receiving queue, preferentially selects one located in that node's memory. This minimizes the memory access overhead incurred under a NUMA architecture when the target VNF traverses the fast packet receiving queue through the VNF runtime component.
Further, the "network flow five-tuple - target VNF" memory table used in step 4 is created when the flow management component starts. At startup the flow management component also loads into memory the set of network policy entries formulated from the platform topology; taking a five-tuple as input and combining it with this entry set, the target VNF of any data packet can be computed. Because this computation is relatively expensive, the memory table is used to cache its results.
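In other words, the memory table is a memo cache in front of the policy computation. A minimal sketch, assuming a caller-supplied evaluate_policy callable (hypothetical) that performs the expensive topology-based calculation:

```python
# "network flow five-tuple -> target VNF" memory table (slow-path cache).
flow_to_vnf = {}

def lookup_target_vnf(five_tuple, evaluate_policy):
    """Step 4 slow path: return the cached target VNF, computing and
    memoizing it on a miss."""
    vnf = flow_to_vnf.get(five_tuple)
    if vnf is None:
        # Relatively expensive: evaluates the network policy entry set
        # loaded at flow-management-component startup.
        vnf = evaluate_policy(five_tuple)
        flow_to_vnf[five_tuple] = vnf
    return vnf
```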
Further, in step 5 the packet receiving API provided by the VNF runtime component maintains a linked list, sorted by liveness, of the network flows corresponding to the data packets in the fast packet receiving queues. Each time a data packet arrives for a network flow, the list is updated to move that flow to the head; the closer a flow is to the tail, the less active it is considered. In this way the VNF runtime component introduces a lightweight liveness statistics function for the flows that are cached in the PMFT table and distributed bypassing the CPU.
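The liveness-sorted linked list behaves like a move-to-front structure. The following sketch uses Python's OrderedDict as a stand-in; this is an assumption of the illustration, not the patent's data structure:

```python
from collections import OrderedDict

class LivenessList:
    """Move-to-front list of cached flows: a packet arrival moves its flow
    to the head, so the least active flows drift toward the tail."""

    def __init__(self):
        self._flows = OrderedDict()

    def touch(self, five_tuple):
        # Called by the packet receiving API for each fast-queue packet.
        self._flows[five_tuple] = None
        self._flows.move_to_end(five_tuple, last=False)  # move to head

    def least_active(self):
        # Tail of the list: the flow to recommend for PMFT eviction.
        return next(reversed(self._flows), None)
```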
Further, step 6 updates the liveness of uncached network flows with the following default algorithm:

I_g = α·I_g + (1 - α)·I_f    (1)

AL = …    (2)  [the right-hand side of formula (2), which derives the liveness AL from I_g and I_f, survives only as an image in the source]

In formula (1) and formula (2), AL is the liveness of the network flow, I_g is the global packet receiving interval on the slow general queue maintained by the flow management component, I_f is the packet receiving interval of the network flow to which the current data packet belongs, and α is an empirical coefficient between 0 and 1 with a default value of 0.9999. At the same time, the framework allows step 6 to integrate a user-defined algorithm, which suits users who understand the network service characteristics of their platform well or who need customized, personalized statistics.
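As an executable reading of this default algorithm, the sketch below implements formula (1) directly; since formula (2) survives only as an image in the source, the liveness AL returned here is an assumed ratio I_g / I_f (larger for flows whose packets arrive more often than the global average). All class and attribute names are illustrative.

```python
class DefaultLiveness:
    """Sketch of the default liveness statistics for uncached flows."""

    def __init__(self, alpha=0.9999):     # empirical coefficient from (1)
        self.alpha = alpha
        self.i_global = None              # I_g: smoothed global interval
        self.last_arrival = {}            # five-tuple -> previous timestamp

    def on_packet(self, five_tuple, now):
        """Update I_g per formula (1) and return the flow's liveness AL."""
        prev = self.last_arrival.get(five_tuple)
        self.last_arrival[five_tuple] = now
        if prev is None:
            return 0.0                    # no interval measurable yet
        i_f = now - prev                  # I_f: this flow's packet interval
        if self.i_global is None:
            self.i_global = i_f
        else:
            # Formula (1): I_g = alpha * I_g + (1 - alpha) * I_f
            self.i_global = self.alpha * self.i_global + (1 - self.alpha) * i_f
        # Assumed formula (2): AL = I_g / I_f (the original is an image).
        return self.i_global / i_f if i_f > 0 else float("inf")
```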
Further, when step 6 caches an active flow into the PMFT table, it first checks whether the target VNF has an enabled fast packet receiving queue; if not, a fast packet receiving queue pre-allocated by the flow management component is assembled to the target VNF, and if the pre-allocated queues are exhausted, a new batch is created first and then assembled. The flow management component then binds the newly assembled fast packet receiving queue to an idle hardware receive queue. Next, the flow management component checks whether the PMFT table has an idle entry: if so, one idle entry is selected and the mapping between the five-tuple to be cached and the idle queue is recorded in it; otherwise, the inactive flow recommended by the VNF runtime component is removed from the PMFT table first, and the flow to be cached is then recorded.
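The periodic PMFT update of step 6 thus amounts to a capacity-bounded cache replacement. A sketch under stated assumptions (PMFT_CAPACITY and the recommend_victim callable are hypothetical; a real implementation would reprogram FDir filters through the NIC driver):

```python
PMFT_CAPACITY = 8192   # assumed entry count; the real limit is NIC-specific

def cache_active_flow(pmft, five_tuple, hw_queue, recommend_victim):
    """Step 6 sketch: enter an active flow into the PMFT table, evicting the
    inactive flow recommended by the VNF runtime component if necessary."""
    if five_tuple in pmft:
        return
    if len(pmft) >= PMFT_CAPACITY:
        victim = recommend_victim()   # least active cached flow
        pmft.pop(victim, None)        # remove the inactive flow first
    pmft[five_tuple] = hw_queue       # then record the flow to be cached
```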
The invention has the following technical effects:
1. The invention caches active flows using the FDir function of the network card, so that packets of these flows are delivered directly from the network card to the CPU core where the target VNF resides, without CPU intervention, thereby saving CPU resources.
2. The invention needs no separate receiving process, which eliminates the inter-core copy of packets between a receiving process and the target VNF process, reduces the latency of the first distribution, and improves first-distribution performance.
3. The invention uses the PMFT table as a cache of the relation between active flows and their target VNFs and periodically performs cache replacement based on the counted flow liveness. This not only maximizes the acceleration benefit of the FDir function but also ensures that the capacity of the PMFT table does not limit the total number of network flows the platform can serve, preserving flexibility.
4. The FDir function on which the invention relies is available on conventional commercial network cards; no expensive hardware such as dedicated switches or dedicated FPGA network cards needs to be introduced. The resulting TCO is reasonable, giving the invention the potential for wide deployment in cloud environments.
The conception, the specific structure and the technical effects of the present invention will be further described with reference to the accompanying drawings to fully understand the objects, the features and the effects of the present invention.
Drawings
FIG. 1 is a schematic workflow diagram of one embodiment of the present invention;
fig. 2 is a schematic diagram of a platform structure according to an embodiment of the present invention.
Detailed Description
The technical contents of the preferred embodiments of the present invention will be more clearly and easily understood by referring to the drawings attached to the specification. The present invention may be embodied in many different forms of embodiments and the scope of the invention is not limited to the embodiments set forth herein.
In the drawings, structurally identical elements are represented by like reference numerals, and structurally or functionally similar elements are represented by like reference numerals throughout the several views. The size and thickness of each component shown in the drawings are arbitrarily illustrated, and the present invention is not limited to the size and thickness of each component. The thickness of the components may be exaggerated where appropriate in the figures to improve clarity.
The invention repurposes the FDir function, originally added to conventional commercial network cards for load balancing, to construct a mapping between the network flow five-tuple (source IP, source port, destination IP, destination port, and protocol type) and the hardware receive queue watched by the target VNF, caching this relation as key-value pairs in the PMFT table of the network card's FDir component so that the card forwards matching packets directly to the target VNF without CPU intervention. For network flows not cached in the PMFT table, the target VNF is first computed by the CPU and the packet is then forwarded. Considering the capacity limit of the PMFT table, the framework embeds a lightweight statistics layer in the VNF's packet receiving procedure to examine the liveness of cached flows that bypass the CPU, and embeds liveness statistics supporting user-defined algorithms in the CPU forwarding path for uncached flows. It then periodically removes relatively inactive flows from the PMFT table and records relatively active ones, achieving cache replacement so that the FDir function accelerates the most active flows as far as possible.
As shown in fig. 1, the workflow of an embodiment of the present invention mainly comprises two components: a flow management component and a VNF runtime component. The two components perform their respective tasks and interact through the platform resources and the memory data structures built on them, jointly maintaining the operation of the system inside the host. The lines and arrows in the figure represent the interactions between components and resources; the functions of the two components are explained first below.
Flow management component: this component is responsible for two major functions. First, it provides distribution service for packets of uncached flows using a mapping table from five-tuples to the CPU cores where the target VNFs reside, the table entries being obtained by computation over the network policy. Second, it periodically retrieves the least active cached network flows recommended by the VNF runtime component and operates on the PMFT table to complete cache replacement.
VNF runtime component: this component is a lightweight flow liveness statistics layer between the flow set and the VNF. It resides outside the VNF; the VNF polls its own slow and fast packet receiving queues through the APIs the component provides, takes out packets, and processes them according to its service logic. By adding statistics on network flow liveness in the fast packet receiving queues to the API, the VNF runtime component can identify the least active portion of the network flows cached in the PMFT table and recommend them to the flow management component. This solves the problem that flows delivered directly to the VNF, bypassing the receiving process, could otherwise not be counted by that process.
The first distribution flow of the data packet is described in detail below with reference to fig. 1.
Step 1: the flow management component creates a slow general queue in shared memory and binds it to the default hardware receive queue of the network card.
Step 2: the flow management component creates a plurality of slow and fast packet receiving queues in shared memory and assembles at least one slow packet receiving queue for each target VNF.
Step 3: for each active network flow FL known to be received and processed by a certain target VNF, before that VNF starts, the flow management component ensures that the VNF has at least one fast packet receiving queue, selects one as FQ, and selects an idle hardware receive queue of the network card as HQ. The mapping from the five-tuple of FL to HQ is recorded as an entry in the PMFT table, and HQ and FQ are bound.
Step 4: when a data packet arrives in the slow general queue, the flow management component queries the currently maintained "network flow five-tuple - target VNF" mapping table with the five-tuple as the key. If the lookup succeeds, the packet is delivered to the slow packet receiving queue of the corresponding VNF; otherwise the target VNF is computed from the network policy, the result is entered into the mapping table, and delivery of the packet is completed.
Step 5: each VNF traverses all of its assembled fast and slow packet receiving queues to receive packets, through a packet receiving API provided by the VNF runtime component that can count the liveness of the flow corresponding to each packet in the fast packet receiving queues (a sketch of such an API follows after this list).
Step 6: the flow management component observes packet arrivals in the slow general queue and updates the liveness of the flows corresponding to packets in the slow packet receiving queues using an algorithm that supports user definition. Combining the result of step 5, it periodically configures the network card's PMFT table, removing relatively inactive network flows and adding relatively active ones.
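A sketch of such a packet receiving API is given below. The Packet and VNF types and the liveness object are hypothetical stand-ins; the point is that liveness is counted only on the fast queues, whose packets bypassed the CPU and would otherwise be invisible to any software statistics.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Any

@dataclass
class Packet:
    five_tuple: Any

@dataclass
class VNF:
    fast_queues: list = field(default_factory=list)  # deques bound to HQs
    slow_queues: list = field(default_factory=list)  # fed by the CPU slow path

def receive_packets(vnf, liveness):
    """Traverse all fast and slow packet receiving queues of a VNF,
    updating liveness statistics for fast-queue (CPU-bypassing) packets."""
    batch = []
    for fq in vnf.fast_queues:
        while fq:
            pkt = fq.popleft()
            liveness.touch(pkt.five_tuple)  # step-5 statistics hook
            batch.append(pkt)
    for sq in vnf.slow_queues:
        while sq:
            batch.append(sq.popleft())      # slow-path flows are counted by
                                            # the flow management component
    return batch
```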
Preferably, the flow management component creates the slow general queue at startup in step 1; the queue's lifetime is the same as the component's. Each queue element is a data packet of a network flow not cached in the PMFT table, and such packets typically belong to multiple VNFs on the host.
Preferably, step 2 pre-allocates a number of fast and slow packet receiving queues at once in a resource-pool manner. Slow packet receiving queues are assembled when the VNF starts, whereas a fast packet receiving queue is assembled if and only if a cached flow is entered into the PMFT table in step 3 or step 6.
Preferably, step 3 examines the CPU node where the VNF is located and, when a fast queue is assembled, preferentially selects a fast packet receiving queue located in that node's memory. This minimizes the memory access overhead incurred under a NUMA architecture when the VNF traverses the fast packet receiving queue through the VNF runtime component.
Preferably, the "network flow quintuple-target VNF" memory table referred to in step 4 is created when the flow management component is started. Meanwhile, the flow management component loads the network policy entry set formulated based on the platform topology to the memory. The network flow quintuple is used as input, calculation is carried out by combining the network policy set, the target VNF of any data packet can be obtained, and the calculation process is relatively high in cost, so that the calculation result is cached by adopting a memory table of the network flow quintuple-target VNF.
Preferably, in step 5 the packet receiving API provided by the VNF runtime component maintains a linked list, sorted by liveness, of the network flows corresponding to the data packets in the fast packet receiving queues. When a data packet arrives for a network flow, the list is updated to move that flow to the head; flows closer to the tail are considered less active. In this way the VNF runtime component introduces lightweight liveness statistics for the network flows that are cached in the PMFT table and distributed bypassing the CPU.
Preferably, step 6 updates the liveness of uncached network flows with the following default algorithm:

I_g = α·I_g + (1 - α)·I_f    (1)

AL = …    (2)  [the right-hand side of formula (2), which derives the liveness AL from I_g and I_f, survives only as an image in the source]

where AL is the network flow liveness, I_g is the global packet receiving interval on the slow general queue maintained by the flow management component, I_f is the packet receiving interval of the network flow to which the current packet belongs, and α is an empirical coefficient between 0 and 1 with a default value of 0.9999.
At the same time, the NFV framework herein allows step 6 to integrate a user-defined liveness algorithm, which suits users who understand the network service characteristics of their own NFV platform well or who need customized, personalized services.
Preferably, step 6 checks whether the VNF has an enabled fast packet receiving queue when caching an active flow into the PMFT table. If not, a fast packet receiving queue pre-allocated by the flow management component is assembled to the VNF; if the pre-allocated queues are exhausted in the process, the flow management component first creates a new batch and then assembles one to the VNF. Further, the flow management component binds the newly assembled fast packet receiving queue to an idle hardware receive queue. Then the flow management component checks whether the PMFT table has an idle entry: if so, one is selected and the mapping between the five-tuple of the flow to be cached and the hardware receive queue is recorded in it; otherwise the inactive flow recommended by the VNF runtime component is removed from the PMFT table first, and the flow to be cached is then recorded.
As shown in fig. 2, the platform structure of an embodiment of the present invention illustrates one possible deployment based on a computer cluster. Each host is equipped with a network card supporting the FDir function and with the functional components shown in fig. 1. In terms of services, VNF1 and VNF2 form the first service chain; VNF3, VNF4, and VNF6 form the second service chain; VNF5, VNF7, VNF8, VNF10, and others form the third service chain; VNF9 independently constitutes a service; and VNF6 can also provide services to outside users independently.
Taking the second service chain as an example, an external user's request is first sent to the network card of host 1. When the user sends requests frequently, the functional components on host 1 identify the flow as active and record the mapping between its five-tuple and the core where VNF3 resides into the local network card's PMFT table. Similarly, host 2 frequently receives the user's packets from VNF4, likewise identifies the corresponding network flow as active, and records the mapping between the five-tuple and the CPU core where VNF6 resides into its local PMFT table. From then on, the user's use of the second service chain is accelerated by the FDir functions of the network cards of both host 1 and host 2.
In fact, every service in the cluster, whether a chained service composed of multiple VNFs or a service provided by a single VNF, can obtain the acceleration of the present invention as long as its network flow is sufficiently active. On this basis, the invention saves CPU resources and improves the performance of the NFV platform as a whole.
The foregoing detailed description of the preferred embodiments of the invention has been presented. It should be understood that numerous modifications and variations could be devised by those skilled in the art in light of the present teachings without departing from the inventive concepts. Therefore, the technical solutions available to those skilled in the art through logic analysis, reasoning and limited experiments based on the prior art according to the concept of the present invention should be within the scope of protection defined by the claims.

Claims (10)

1. A method for accelerating the first distribution of a data packet on an NFV platform, characterized in that a mapping relation between the five-tuple of a data flow and a hardware receive queue is established, and the receive queue of the target VNF is bound to the hardware receive queue; the mapping relation is stored in the PMFT table of a network card, and the FDir function of the network card is utilized to send the data packet directly to the hardware receive queue without the intervention of a CPU (central processing unit).
2. The method for accelerating the first distribution of data packets on an NFV platform of claim 1, comprising a liveness statistics function for the data flows, the data flows being divided into active flows and inactive flows according to their liveness; only the mapping relation between the five-tuple of an active flow and the hardware receive queue is stored in the PMFT table; and for inactive flows, a CPU computes the target VNF in combination with a network policy before distribution is completed.
3. The method for accelerating the first distribution of data packets on an NFV platform as set forth in claim 2, comprising a flow management component and a VNF runtime component; the method comprises the following steps:
step 1, the flow management component creates a slow general queue in a shared memory and binds it with a default queue of the hardware receiving queues;
step 2, the flow management component creates a plurality of slow packet receiving queues and fast packet receiving queues in the shared memory, and assembles at least one slow packet receiving queue for each target VNF;
step 3, the active flow being denoted FL: for each FL known to be received and processed by a target VNF, before starting that target VNF, the flow management component ensures that at least one fast packet receiving queue is assembled for it, selects one such queue as FQ, selects an idle queue of the hardware receiving queues as HQ, records the mapping from the five-tuple of FL to HQ as an entry in the PMFT table, and binds HQ and FQ;
step 4, when the data packet arrives in the slow general queue, the flow management component queries a currently maintained network flow quintuple-target VNF memory table by using the quintuple as a Key, and if the lookup is successful, delivers the data packet to the slow packet receiving queue of the corresponding target VNF; otherwise, calculating according to the network policy to obtain the target VNF, inputting a calculation result into the network flow quintuple-target VNF memory table, and completing delivery of the data packet;
step 5, each target VNF traverses all the fast packet receiving queues and the slow packet receiving queues assembled by the target VNF for packet receiving through a packet receiving API provided by the VNF runtime component;
step 6, the flow management component observes the arrival of data packets in the slow general queue, updates the flow liveness corresponding to the data packets in the slow packet receiving queues, and, combining the result of step 5, periodically configures the PMFT table, removing inactive flows and adding active flows.
4. The method for accelerating the first distribution of data packets on an NFV platform of claim 3, wherein in step 1 the flow management component creates the slow general queue upon startup, the lifetime of the slow general queue is the same as that of the flow management component, and each element of the slow general queue is one of the data packets not cached in the PMFT table.
5. The method for accelerating the first distribution of data packets on an NFV platform according to claim 3, wherein in step 2 a plurality of the fast packet receiving queues and the slow packet receiving queues are pre-allocated at one time in a resource pool manner; the slow packet receiving queues are assembled when the target VNF is started, and a fast packet receiving queue is assembled if and only if a cached flow is entered into the PMFT table in step 3 or step 6.
6. The method according to claim 3, wherein in step 3, the CPU node where the target VNF is located is examined, and when the fast packet receiving queue is assembled, the fast packet receiving queue located in a memory where the CPU node is located is preferentially selected for assembly.
7. A method for accelerating first time distribution of data packets on NFV platform as claimed in claim 3, wherein in step 4, the network flow quintuple-target VNF memory table is created when the flow management component starts up; and the flow management component loads the network policy entry set formulated based on the platform topology into a memory, calculates by taking the quintuple as input and combining the network policy entry set to obtain the target VNF of any one data packet, and caches the calculation result in the network flow quintuple-target VNF memory table.
8. The method for accelerating the first distribution of data packets on an NFV platform as recited in claim 3, wherein in step 5 the packet receiving API provided by the VNF runtime component maintains a linked list, sorted by liveness, of the network flows corresponding to the data packets in the fast packet receiving queue; when a data packet arrives for a network flow, the linked list is updated to move that network flow to the head of the list, and a network flow is considered less active the closer it is to the tail; in this way, the VNF runtime component introduces a lightweight liveness statistics function for the network flows cached in the PMFT table.
9. The method for accelerating the first distribution of data packets on an NFV platform as claimed in claim 3, wherein in step 6 the liveness of uncached network flows is updated with the following default algorithm:

I_g = α·I_g + (1 - α)·I_f    (1)

AL = …    (2)  [the right-hand side of formula (2), which derives the liveness AL from I_g and I_f, survives only as an image in the source]

in formula (1) and formula (2), AL is the liveness of the network flow, I_g is the global packet receiving interval on the slow general queue maintained by the flow management component, I_f is the packet receiving interval of the network flow to which the current data packet belongs, and α is an empirical coefficient between 0 and 1 with a default value of 0.9999; or, in step 6, a user-defined algorithm is integrated to update the liveness of the uncached network flows.
10. The method for accelerating the first distribution of data packets on an NFV platform according to claim 3, wherein in step 6, when an active flow is cached into the PMFT table, it is examined whether the target VNF has an enabled fast packet receiving queue, and if not, a fast packet receiving queue pre-allocated by the flow management component is assembled to the target VNF; if the pre-allocated fast packet receiving queues are exhausted, a new batch is created first and then assembled to the target VNF; the flow management component binds the newly assembled fast packet receiving queue with an idle queue of the hardware receiving queues; then the flow management component examines whether the PMFT table has an idle entry, and if so, selects one idle entry and records in it the mapping relation between the five-tuple to be cached and the idle queue; otherwise, the inactive flow recommended by the VNF runtime component is removed from the PMFT table, and the flow to be cached is then recorded.
CN202010349642.8A 2020-04-28 2020-04-28 Method for accelerating first distribution of data packets on NFV platform Active CN111611051B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010349642.8A CN111611051B (en) 2020-04-28 2020-04-28 Method for accelerating first distribution of data packets on NFV platform

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010349642.8A CN111611051B (en) 2020-04-28 2020-04-28 Method for accelerating first distribution of data packets on NFV platform

Publications (2)

Publication Number Publication Date
CN111611051A true CN111611051A (en) 2020-09-01
CN111611051B CN111611051B (en) 2022-05-31

Family

ID=72203154

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010349642.8A Active CN111611051B (en) 2020-04-28 2020-04-28 Method for accelerating first distribution of data packets on NFV platform

Country Status (1)

Country Link
CN (1) CN111611051B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112732409A (en) * 2021-01-21 2021-04-30 上海交通大学 Method and device for enabling zero-time-consumption network flow load balancing under VNF architecture

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108062269A * 2017-12-05 2018-05-22 Shanghai Jiaotong University Elastic scaling method and system for computing resources based on DPDK
CN108200092A * 2018-02-08 2018-06-22 CertusNet Information Technology Co., Ltd. Method and system for accelerating packet ACL matching based on NFV technology
US20180375932A1 * 2017-06-21 2018-12-27 Juniper Networks, Inc. Synchronization between virtual network functions and host systems
CN109995606A * 2018-01-02 2019-07-09 *** Communications Co., Ltd. Research Institute Virtualized deep packet inspection (vDPI) flow control method and network element device
CN110311816A * 2019-06-28 2019-10-08 Shanghai Jiaotong University VNF placement method aware of VNF co-location interference in an NFV environment
CN110838980A * 2019-11-22 2020-02-25 Suzhou Inspur Intelligent Technology Co., Ltd. Packet matching system and method based on NFV


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
LI KAI: "Research on Dynamic Load Balancing Technology of Traffic Based on DPDK", China Excellent Master's and Doctoral Theses Full-text Database (Master), Information Science and Technology Series *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112732409A (en) * 2021-01-21 2021-04-30 上海交通大学 Method and device for enabling zero-time-consumption network flow load balancing under VNF architecture
CN112732409B (en) * 2021-01-21 2022-07-22 上海交通大学 Method and device for enabling zero-time-consumption network flow load balancing under VNF architecture

Also Published As

Publication number Publication date
CN111611051B (en) 2022-05-31

Similar Documents

Publication Publication Date Title
CN108809854B (en) Reconfigurable chip architecture for large-flow network processing
CN109547580B (en) Method and device for processing data message
CN108833299B (en) Large-scale network data processing method based on reconfigurable switching chip architecture
US9397960B2 (en) Packet steering
Bhowmik et al. High performance publish/subscribe middleware in software-defined networks
US8923159B2 (en) Processing network traffic
CN111522653A (en) Container-based network function virtualization platform
CN107135268B (en) Distributed task computing method based on information center network
CN102299843B (en) Network data processing method based on graphic processing unit (GPU) and buffer area, and system thereof
CN104519125B (en) Distributed load distribution in order flexible for change in topology
US9292351B2 (en) Distributed fabric architecture in a cloud computing environment
US20200145316A1 (en) Distribution of network-policy configuration, management, and control using model-driven and information-centric networking
EP3559833B1 (en) Best-efforts database functions
Yang et al. Using Trio: Juniper Networks' programmable chipset for emerging in-network applications
CN111901236B (en) Method and system for optimizing openstack cloud network by using dynamic routing
US20120109913A1 (en) Method and system for caching regular expression results
Bhowmik et al. Distributed control plane for software-defined networks: A case study using event-based middleware
CN111611051B (en) Method for accelerating first distribution of data packets on NFV platform
Raumer et al. Performance exploration of software-based packet processing systems
US11012542B2 (en) Data processing method and apparatus
US11552907B2 (en) Efficient packet queueing for computer networks
Bonelli et al. The acceleration of OfSoftSwitch
Zhang et al. Loom: Switch-based cloud load balancer with compressed states
Pan et al. Modeling CCN packet forwarding engine
Iqbal et al. Flow migration on multicore network processors: Load balancing while minimizing packet reordering

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant